In pursuit of software: major programs you won't want to miss

By Steven Struhl | February 1, 1999

Reading time: 30 minutes

Abstract

Software has reached unprecedented levels of product proliferation. This article reviews four statistics software packages: SPSS 9.0, Systat 8.0, AnswerTree 2.0 and DeltaGraph 4.0.

Research Topics: Classification Tree Analysis | Conjoint Analysis/Trade-Off Analysis | Data Analysis | Software-Data Analysis | Software-Data Delivery Tools | Software-Data Tabulation | Software-Survey Design & Analysis
Content Type: Software Review


Editor’s note: Dr. Steven Struhl is vice president and senior methodologist at Total Research, Inc., Chicago.

Software has reached unprecedented levels of product proliferation. Just as we start to know what a software program really can do and what its bugs are, out comes an updated version with its own new capabilities and failings.

Software companies, as a group, definitely have become more clever. They keep the development engines turning at full speed, and keep announcing that they have the latest thing, and then the latest, latest thing - and so forth. Then they just need to wait for the poor user to feel hopelessly out of date. Of course, some hardy souls will keep running software, especially business-related software, that is a version or two behind the newest. But I have met few capable of standing the heat after they fall three or more versions behind.

The result: users buy upgrades, and the software company has a nice, steady stream of revenue.

Now of course, this is just a theory, so all you software manufacturers don’t need to "flame" me (or send me hate mail, as we used to call it in the old days).

Whatever the cause, there is an amazing amount of software available to learn about, let alone review in some coherent fashion. What started as a review of the new SYSTAT product began to seem inadequate as all these other products (SPSS 9.0, AnswerTree 2.0, etc.) hit the market.

For this review, we’re going to make the rash assumption that our readers understand something about SPSS and what it does. We direct any reader who is resolutely plowing into this review in spite of feeling uncertain about what we are reviewing to look at the sidebar discussion, "Statistical programs vs. the spreadsheets."

SPSS history and the new release’s place in the great chain of updates

By the time you read this, SPSS Version 9.0 will have been released. When we last reviewed SPSS in 1997 it had just emerged as Version 7.5. That update in turn had followed quickly after Version 7.0, which marked a major change for the program.

With this earlier (7.0) release, SPSS had made its most dramatic move away from the typewriter-style output that was then standard for statistics programs. With the new SPSS, many tables and other types of results appeared as "Windows objects," and looked good enough to grace any scientific or technical journal. The output used nice proportional type and offered a variety of fancy formats that you could apply, just as you could in a major spreadsheet such as Excel.

With 7.0, SPSS finally gave you a way to find your place in a long string of output. It did this via a feature called the "output navigator," which replaced the sometimes endless-seeming unmarked string of text output that SPSS and all other statistics programs once generated. It also did away with something called a "chart carousel" - which was not the fun ride that its name implied - to the cheers of appreciative users everywhere.

The navigator organized all the types of output that SPSS could produce. Small "books" went into an organized tree-like display window to the left of the display screen, with each book containing all the output from one procedure or method that you ran on your data. Fig. 1 shows you a small section of some SPSS output as it appeared in the navigator. In this, you can see a series of small books to the left and the actual output corresponding to one of those books on the right.

You could scroll down in the left-side window to any book, and so go quickly to the exact portion of the analysis that you needed. The navigator allowed you to label these books yourself, or simply to leave the program’s own non-specific labels (such as "descriptives," or "analysis of variance"). The left window in Fig. 1 includes some output that has had descriptive names added in the left-hand window.

This output even has the flexibility to allow you to move sections of the analysis by dragging the books to different spots in the tree display. You either can edit the output on the spot - which admittedly at times is not that simple in the SPSS format - or save the output to revise later.

Perhaps most important, starting with the version 7.0 series, you also could copy and paste all or any part of the output into Windows-based word processing, spreadsheet, and presentation packages. You had (and have) several options for cutting and pasting, depending on the target program for the output.

Version 7.0 introduced the pivot table as a new form of SPSS output. Version 7.5, which we last reviewed, seemed mostly concerned with getting these new tables to work more flexibly, and otherwise enhancing and cleaning up its earlier series 7 sibling.

Pivot tables probably are familiar to anybody who has spent much time with Excel, Quattro Pro, or Lotus 1-2-3 in those programs’ more recent incarnations. In a pivot table, you can swap rows and columns simply by selecting the appropriate output and the correct command from a menu. If you happen to have a table with "nesting" (headings within headings), you can change rows, columns and nestings within each.

You may not need to do such fancy maneuvering often -- or ever, for that matter -- but now you can. Figs. 2a and 2b recall our original review. They show two pivoted views of the same data. In an attempt to keep the level of detail somewhat sane, the tables show only counts (or the number in each cell). If you were to choose any extra in-cell statistics, these too would get pivoted instantly.
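For readers who like to see the mechanics, here is a minimal sketch of what pivoting a table of counts amounts to. It is written in plain Python and is not how SPSS implements its pivot tables; the respondent records and the names crosstab, pivot and show are invented purely for illustration.

```python
from collections import Counter

def crosstab(records, row_key, col_key):
    """Count the records that fall in each (row value, column value) cell."""
    return Counter((rec[row_key], rec[col_key]) for rec in records)

def pivot(table):
    """Swap rows and columns: cell (r, c) becomes cell (c, r)."""
    return Counter({(c, r): n for (r, c), n in table.items()})

def show(table):
    """Print the table as a simple grid of counts."""
    rows = sorted({r for r, _ in table})
    cols = sorted({c for _, c in table})
    print(" " * 12 + "".join(f"{c:>12}" for c in cols))
    for r in rows:
        print(f"{r:<12}" + "".join(f"{table[(r, c)]:>12}" for c in cols))

# Hypothetical respondents, invented for this example only.
respondents = [
    {"region": "East", "usage": "heavy"},
    {"region": "East", "usage": "heavy"},
    {"region": "East", "usage": "light"},
    {"region": "West", "usage": "light"},
    {"region": "West", "usage": "light"},
]

counts = crosstab(respondents, "region", "usage")
show(counts)          # regions down the side, usage across the top
show(pivot(counts))   # the same counts with rows and columns swapped
```

The point of the sketch is that a pivot changes only the layout. Every count survives the swap, and pivoting twice returns the original table.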

Getting closer to version 9.0: Smart Viewers and other things we missed from version 8.0

As the subheadline mentions, by the time SPSS 9.0 came out, we’d missed our chance to review SPSS 8.0. In that version, SPSS added more flexibility to its graphics, reorganized and expanded the program, and renamed the Navigator the "Viewer."

This last innovation came along with a companion program, now called the Smart Viewer. With this, you could send output from the program directly along to other people - or, in its largest version, have them look at the output on the Web. It is all in its native SPSS-output format, and you can use SPSS Viewer features, like changing tables of numbers directly into charts. (Note that Smart Viewer requires that you pay a fee for each user who receives your output, and the Web server is a major expense - more about this later.)

Smart Viewer is true to its name to the extent that it allows people to look at various views of statistical output but not to go back to your original data and fool around with it. However, anybody armed with the Smart Viewer could push and pull around any output you sent them, pivoting tables, "drilling down into the data," and generally getting themselves completely dizzy.

Your reviewer understands that the idea of distributing and sharing data, even analysis of data, is considered a noble cause in many quarters. Some go so far as calling this practice "empowering your users." In fact, we now have a new mysterious acronym that describes this well-meaning idea, OLAP, or on-line analytical processing.

As many of you already know, though, we need more than good intentions to survive. Smart Viewer does an excellent job of distributing tables in electronic format, and even your reviewer thinks that is a fine idea. At least, it is for those tables that come with statistical testing on them. A surprising amount of SPSS output still comes without statistical testing anyplace on it, in particular from the SPSS Tables module. (This module, by the way, remains as it was in Version 8.0.)

It’s just the thought of giving all and sundry a mass of non-annotated data analysis to "play with" that leaves me less than satisfied. For instance, look at Fig. 3. Which differences are significant at the 95 percent confidence level? Which are not quite significant, but "directional," at the 80 percent to 94.99 percent level?

We have independent samples of 510 and 300 respectively, and the data were analyzed in the form of one large pre-to-post crosstabulation, using the adjusted standardized residual test available in the SPSS crosstabulation procedure.

The correct answers are: significant increase for "buy it all the time," significant decrease for "never heard of the stuff," and "directional" increase for "heard about it, may get around to trying it."
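For the curious, the adjusted standardized residual behind those answers can be sketched in a few lines of Python. This is the textbook formula, not a copy of SPSS's own code. The cell counts below are invented for illustration (only the sample sizes of 510 and 300 come from the example above), and the names adjusted_residuals and label are hypothetical.

```python
import math
from statistics import NormalDist

def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a table of counts.

    For each cell: (observed - expected) divided by
    sqrt(expected * (1 - row share) * (1 - column share)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / math.sqrt(variance))
        result.append(out_row)
    return result

def label(z):
    """Classify a residual using two-sided normal cutoffs."""
    significant = NormalDist().inv_cdf(0.975)  # 95 percent level, about 1.96
    directional = NormalDist().inv_cdf(0.90)   # 80 percent level, about 1.28
    if abs(z) >= significant:
        return "significant"
    if abs(z) >= directional:
        return "directional"
    return "not significant"

# Invented counts: rows are answer categories, columns are pre (n=510) and post (n=300).
answers = ["buy it all the time", "heard about it", "never heard of the stuff"]
observed_counts = [
    [51, 54],
    [204, 126],
    [255, 120],
]

for answer, row in zip(answers, adjusted_residuals(observed_counts)):
    post_z = row[1]  # a positive value means the post wave is above expectation
    print(f"{answer:<28}{post_z:>8.2f}  {label(post_z)}")
```

A recipient who sees only the raw percentages gets none of this, which is exactly the worry about sending out unannotated output.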

As you can see, the procedures for doing the analysis, and their output, are not getting any easier to understand. Whoever analyzes the data needs to include ample instructions about interpretation, and many reminders about staying somewhat reasonable in using the output. This need is increasing all the time.

However, the SPSS Viewer does not yet match any of the major presentation packages, or spreadsheets, or word processors, in making a well-formatted explanatory and interpretive document. Here are the questions that distributing data analyses raises for me: Would the people who need to apply the data understand the unvarnished output from a factor analysis, a discriminant analysis, a one-way analysis of variance, etc.? Or even our adjusted standardized residuals, as were used in the example?

I suppose some utopian community exists somewhere where all involved can interpret things like these. Would somebody please tell me where to find it? I’m ready to pack my things and move there immediately.

Now, without further delay, SPSS 9.0

SPSS 9.0 is a major upgrade of the program, bringing with it a host of new and useful features. It also brings some bad news, in that you will most likely need to buy more program pieces, or modules, than you did in the past to get all this newly added power.

Leading among the good-news items is that SPSS can now do multinomial logit (MNL) - or more pedantically, polytomous logit - and so can be used to solve problems in discrete choice modeling. This new capability is part of a new Regression Models module, which replaces the old Professional Statistics module.
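As a reminder of what a multinomial logit model delivers once it has been estimated, the sketch below turns a set of utilities into predicted choice shares. It shows only the standard share formula (each alternative's exponentiated utility divided by the sum across alternatives). It does not show estimation, and it is not SPSS syntax. The product names, utility values and the function name mnl_shares are invented for illustration.

```python
import math

def mnl_shares(utilities):
    """Convert utilities into multinomial logit choice probabilities."""
    top = max(utilities.values())  # subtracting the maximum avoids overflow
    weights = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical utilities for three products in one choice task.
example_utilities = {"Brand A": 1.2, "Brand B": 0.4, "Brand C": -0.3}

for name, share in mnl_shares(example_utilities).items():
    print(f"{name}: {share:.1%}")
```

The shares always sum to one, and raising one alternative's utility pulls share away from all the others, which is what makes the model useful for discrete choice problems.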

Chief among the bad-news items is that conjoint analysis now has become its own module, separate from the rest of "Categories." If you want to use SPSS to do both conjoint and correspondence analysis, for instance, you will now need to buy two modules. (Actually, this happened in Version 8.0, with anyone who had an earlier version of Categories getting the conjoint module free. Since we missed reporting on the Version 8.0 upgrade, this has to pass for news in this review.)

Disappointingly, but not new to Version 9.0, to do repeated-measures analysis of variance, you must get the Advanced Models module, which replaces Advanced Statistics. As a reminder, you need to use repeated measures analysis of variance where the measurements are done repeatedly over time among the same group. (This requirement of a module in SPSS is quite different from the situation in SYSTAT, where if you want the analysis to be treated as repeated measures, you just check the appropriate box on-screen in the analysis of variance routine.)

Table 1 shows a run-down on what you will find in SPSS 9.0, and which parts include major revamping or new procedures. The newest of the new appear in the table with a check mark (like this ✓). These are only highlights; you can find full details at www.spss.com/software/spss/spss90/specs.htm.

In short, SPSS now provides almost all the data analytical power needed for anybody involved in practical decision-making, whether in market research, marketing, corporate planning or other areas. As always, it does an exemplary job of covering the basics, with exceptionally good data management and transformation capabilities. Its addition of polytomous logit - or as most people prefer, multinomial logit (MNL) - models fills the one major gap it had in its analytical procedures.

Incidentally, don’t feel too bad if some of the terms in the table above are not familiar to you. In the data analysis community, methods gain followings among small groups. For instance, every now and then, you will run across somebody who tells you that two-stage least squares will answer all of humanity’s problems. In response to this, I suggest that you nod your head sagely, and then ask if you can view this great seer’s yacht. The quality of the response to this question should prove how much real-world application this "unique method" really has.

SPSS also sells a host of companion products that we will not have a chance to review here. These include but are not limited to:

allCLEAR for process analysis;
AMOS for path analysis and structural equation models;
MapInfo for the geographical display of data;
Neural Connection for neural networks;
SPSS Diamond for visual exploration of complex data relationships;
Trial Run for generation of experimental designs.

As this suggests, SPSS is moving far beyond the basics and into some specialized areas. It will be quite interesting to see how SPSS responds to changing tastes in market research’s data analysis community. SPSS, for instance, does not yet have modules for Latent Class Analysis or Gibbs Sampling. By the way, both of these methods are gaining some adherents as ways to extend multinomial logit analysis, either to get individual-level data or to group people based on their responses. (At least, these methods are heating up some interest among the approximately 0.017 percent of the data analytical community that cares about such things.) Modules or procedures for doing either in a program like SPSS likely would be all that’s needed to gain either procedure much wider acceptance.

Finally, as is always the case with SPSS, release 9.0 seems solid and remarkably free of bugs or operating problems. Here, though, you should note that many bugs are specific to certain combinations of hardware and software that may exist on a system. As such, it is not possible for your reviewer to know where you may have a potential problem area on your computer. SPSS takes a very serious attitude toward any problems you may report, which places them miles ahead of many other software and hardware companies. One of the nicest things about SPSS is this knowledge that - in the rare instance that the software bogs down on you - you can pick up the phone and get a serious investigation and response from technical support.

SYSTAT 8.0 for Windows

For many years, SYSTAT and SPSS competed with each other. Then, not too long ago, they merged. Some people, in particular loyal SYSTAT users, watched with concern. Some wondered whether SYSTAT would even continue as a program in its own right, or whether its "best" features (not necessarily based on their opinions) would simply get folded into SPSS.

Several years later, not only do concerns about the demise of SYSTAT seem premature, but the program is flourishing. This is one of those cases where all the outcomes were happy, at least from the user’s point of view. SPSS has incorporated many good features from SYSTAT, and SYSTAT has now adopted a much more integrated graphical interface, like the one used by SPSS. Whichever program you prefer, each has improved.

Interestingly enough, although SPSS, Inc., now identifies SYSTAT as a "scientific" product, it by no means is a simple duplicate of the flagship SPSS program. Since both SYSTAT and SPSS are full-featured programs, they overlap in most basic areas -- but beyond this, each remains highly distinctive.

SYSTAT has always had a certain reputation - again, at least among those who care - that differentiates it from SPSS. SYSTAT was always more compact, sometimes (but not always) offering a few fewer options than SPSS, and somewhat more idiosyncratic in its look and feel. The SYSTAT manuals had a well-founded reputation for a style that is both clear-headed and (for statistics) lively. SYSTAT also had a reputation for being a step or two ahead of SPSS in its graphical output, and a step or two behind in its other output.

Of the two, SYSTAT clung more closely to its "command-line"-based heritage. For those of you who can’t - or won’t - remember this, way back in the bad old days, statistics programs expected a lot of typed commands, and gave their poor users little on-line help in getting this right. SPSS, starting as it did before 1970, went through a long time like this.

Way back then, you simply:

referred to the several-thousand page reference manual on command syntax;
wrote everything down a few times;
submitted a batch job; and then
waited for the thousands of error messages due to your forgetting a comma or period someplace.

Incidentally, when you got the sacred error sheets from the technicians running the real big computer, invariably only the first error would have any meaning. All the rest would be completely inaccurate because the first error threw the program off entirely. (Don’t complain about this story or I’ll tell you about having to use punched cards to submit data and analysis.)

Beyond this, SYSTAT always has had its own set of features distinct from those of SPSS. Even in the procedures they had in common, SYSTAT always had something slightly different to offer.

All of these differences persist, at least to some extent, in the current versions. For these reasons, I’ve always seen these programs as complementary, rather than as strict replacements for each other. If you get both programs, you should find your data-analytical needs extremely well covered. Procedures that SYSTAT has that are not in SPSS include the following, in no particular order:

Bootstrapping of error estimates. This procedure is really an add-on to many other procedures in SYSTAT, and allows you to get accurate readings on the error in estimates where this was not formerly possible.

Most notably, bootstrapping can provide the standard errors of coefficients from multinomial logistic regression. This makes bootstrapping a very useful option for analyzing discrete choice modeling problems. (For the more statistically inclined readers, bootstrapping works better than the - perhaps - more familiar Wald tests for the coefficients of nonlinear models.)

You also can use bootstrapping to estimate the standard errors of medians, the standard errors of Spearman correlations, and the standard errors of regression coefficients where predictors are highly intercorrelated.

Bootstrapping as implemented by SYSTAT actually includes three related estimating procedures, more correctly called jackknife, simple replacement, and (finally) bootstrap. These procedures determine errors empirically, with calculations based on drawing many subsamples or subsets from the data set.

Since you run bootstrapping procedures hundreds or thousands of times to get the required estimates, plan to leave a little extra time for it. You probably will want to wait until just before lunch -- or better, a few minutes before quitting time -- to turn the computer loose on this type of problem. (A few years from now, though, if computer speeds continue increasing as they have been, you’ll probably get this all done while you have a cup of coffee.)

We should note that SPSS has bootstrapping as a feature in non-linear regression, but this is not a procedure that can be used in many places, unlike the bootstrapping in SYSTAT.

Conjoint analysis that probably differs strongly from what you would expect, and that certainly differs from the conjoint module in SPSS. This is actually a general-purpose modeling program that will fit additive models to data that you cannot measure with more specialized conjoint models.

You can fit trade-off models, for instance, to data that does not come from experimental designs. This program can address the question of whether this type of model could fit, once you have data that was not collected with a standard conjoint procedure. As such, it could work as a useful supplement to the standard conjoint methods more familiar to many of you.

Path analysis with a special and powerful module, called RAMONA. This is part of the main SYSTAT program, not an add-on module, as the product AMOS is for SPSS. (Both RAMONA and AMOS are fancy acronyms, of which we will spare you any explanation.) AMOS has a more graphical, Windows-like interface than RAMONA, but both programs pack ample analytical power. All you need is the faith that you can specify the many "latent," or hidden, relationships required in these models, and you will be ready to go with either program.

Spatial statistics including such esoteric methods as 2-D and 3-D variograms, kriging, and Voronoi tessellations. Now, you may never have heard of these things, but rest assured that SYSTAT here lives up to its reputation, giving you a tremendous amount of graphical display power. Its graphics manual is itself a tour de force, starting with a discussion of cognitive science and graphic design, and taking you through a remarkably informative tour of the many ways in which data can become graphs or charts.
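To make the bootstrapping item in the list above a little more concrete, here is a minimal sketch of the basic idea: resample the data with replacement many times, recompute the statistic each time, and take the spread of those results as the standard error. It is written in plain Python and makes no claim to match SYSTAT's implementation. The function name bootstrap_se, its defaults and the ratings data are all invented for illustration.

```python
import random
import statistics

def bootstrap_se(data, statistic=statistics.median, n_resamples=1000, seed=0):
    """Estimate the standard error of a statistic by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    n = len(data)
    estimates = [
        statistic(rng.choices(data, k=n))  # one resample, same size as the data
        for _ in range(n_resamples)
    ]
    return statistics.stdev(estimates)

# Hypothetical ratings on a 10-point scale, invented for this example.
ratings = [3, 7, 8, 5, 9, 6, 7, 4, 8, 10, 2, 7, 6, 8, 5]

print("median:", statistics.median(ratings))
print("bootstrap standard error of the median:", round(bootstrap_se(ratings), 3))
```

Each resample repeats the whole calculation, which is why a procedure as heavy as a logit model, run a thousand times over, earns its reputation as a lunchtime job.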

New output organizer

One of the best pieces of news about SYSTAT is that it now handles output much more like SPSS. You see all the results organized into a two-window display, with a tree-like panel to the left, showing where you are in the analysis, and a right panel showing the details of the analysis.

As does the SPSS Viewer, the SYSTAT Output Organizer allows you to move quickly anywhere in the output. Selecting any item in the left-hand side of the Output Organizer automatically scrolls the output to the corresponding results in the right window. SYSTAT also offers you the capability to reorder output by dragging Organizer entries to new locations, and to delete output by deleting its entry from the Organizer.

Part of the Output Organizer is a new integrated command window. In SYSTAT up to version 7, users interactively entered commands in the Main window. Command files were created and submitted using the command editor. The command editor also served as a log of the commands used in a SYSTAT session.

True to its command-line heritage, and the tastes of many of its users, SYSTAT by default keeps a window for entering commands open and ready for use. You can unlock and resize the command window to increase the space available for output. The command pane has three tabs, and the contents of any of them can be saved to command files for later editing or submission. In case you are now entirely confused about this set-up, Fig. 4 should give you some idea of how output and commands are organized.

In the command pane, the first tab (Interactive) acts as an interactive command processor. Commands are processed as they are entered, with output generated by "hot commands" (as you can see, statisticians take a hot time as they find it).

The middle tab (Untitled) now serves as the command editor. You can type a series of commands directly on this tab, and then submit the resulting file in "batch" mode. You can submit the entire tab, or select a portion for processing.

The Log tab serves as a record of all commands issued during a SYSTAT session. This tab is kept as "read-only" by SYSTAT, so you cannot type over it and change it inadvertently (or "advertently," for that matter).

Traditionalists will be glad to hear that statistical results can appear as either formatted or unformatted tables. Those who have to have unformatted output (ASCII text) need only type CLASSIC ON in the Command pane or select "Use SYSTAT Classic output style" from the "output" tab on the "options" dialog box. Similarly, typing CLASSIC OFF makes all subsequent output appear in its nicer, default format.

SYSTAT remains a leader in value

The entire SYSTAT program comes as one package, for a single price. It allows you to do nearly everything that you can with SPSS and several add-on modules for that program, at a substantially lower price. Here are the two notable exceptions in SYSTAT’s otherwise formidable armamentarium of methods:

It does not provide the full range of classification tree methods, but rather uses a program called C&RT, which does only two-way splits, or bifurcation. C&RT has many highly sophisticated tree-growing features, but its inability to do many-way splits rules it out as a first choice for your reviewer. If you use only two-way splits, you can miss highly important predictors that appear only when the sample is split into three or more subgroups at once.

However, any classification tree program is far better than none. Therefore, if your budget cannot stand both a full statistics package and the SPSS AnswerTree program (which we discuss next), then by all means consider SYSTAT with its built-in C&RT module.

SYSTAT does not perform conjoint analysis as it is commonly defined in marketing research and related disciplines. (This of course assumes that we can use the term "discipline" in connection with any of these activities. You be the judge.) Anyhow, SYSTAT does not perform the analysis that you may already know and love, which includes the generation of an experimental design, creation of "product profiles" for testing, and analysis of the results. (SYSTAT can design experiments but cannot generate profiles with different numbers of levels for the various attributes tested. As such, it would be only partially useful for generating the types of designs usually used in conjoint analysis.)

Nonetheless, if you do not need SYSTAT to do either of these forms of analysis, it is an extremely powerful piece of software with a remarkable range of features. Its impressive powers are neatly reflected by its main statistics volume, a command reference running to some 1,086 pages.

Again, if your budget can afford it, and you are at all serious about analyzing data, you really need to have both SPSS and SYSTAT up and running on your system.

AnswerTree 2.0: a much-needed classification tree analysis program

SPSS has rushed to the rescue of data analysts everywhere with its release of AnswerTree 2.0, by providing a good and powerful program for classification tree analysis at a price that will not make you faint.

Unfortunately for those unfamiliar with this method, space doesn’t allow us to explain classification tree analysis here. However, if you keep an archive of Quirk’s magazines, you can find an introductory description in the November 1995 issue. Those of you who sinfully threw out this issue (or less sinfully never had it), and have access to the Web, can find the text of this article in the Article Archive at the Quirk’s Web site. The Web address is www.quirks.com and it is a nice place to visit anyhow. (Note: this last comment is really the author’s and not his editor’s!)

I found the development of AnswerTree by SPSS particularly welcome, following closely on the near-demise of the program that used to lead the pack, KnowledgeSeeker. We last reviewed KnowledgeSeeker in 1995, when the software’s bugs were tolerable. In later versions, the bugs got larger and more numerous, actually undermining the integrity of the program.

Since then, it was acquired by a company that focuses on "enterprise-wide solutions" (which is software industry language for "obscenely expensive programs"), rather than on fixing the problems in their software. This company took a program that worked well, if somewhat slowly, and made it speedier, while more awkward, and much less reliable -- and raised its price from $995 to $4,620.

KnowledgeSeeker lives on at the same place, with its new exalted price point, and without the needed bug fixes. If you want the product without bugs, as I understand it you need to move up to the still pricier and more comprehensive KnowledgeStudio (which, I believe, you can have for something like $9,950).

In any event, I think most of you will agree that, after that preamble, it seems quite reasonable of SPSS to charge only $995 for AnswerTree. AnswerTree in fact offers more analytical options than KnowledgeSeeker, and produces very pretty - if inflexibly formatted - displays in the bargain.

A wealth of analytical options

AnswerTree offers four main analytical methods: CHAID, exhaustive CHAID, C&RT (of SYSTAT fame), and QUEST. These last two methods are limited to bifurcation, as discussed earlier with C&RT (just about when your eyelids were getting heavy). This analytical limitation leaves them as interesting, but not first, choices.

Unless exhaustive CHAID blows up your computer (which it has not yet done for your author in AnswerTree’s implementation), it likely will become your method of choice. This exhaustive method does not stop once it finds what appears to be an "optimal" split based on any variable, but goes on and tests all other possible combinations. As a result, it almost always finds more ways to split the sample, and better ways, than standard CHAID.

Several steps from becoming a "killer application"

AnswerTree 2.0 is really an excellent program overall. Its analytical strengths are immediately apparent. The only areas in which it needs more work are getting the tree diagram out of the program, making the diagram more flexible, and creating the accompanying "gains charts."

The tree diagrams are quite handsome, but getting them out of the program in fully usable form is a nuisance. You can copy the tree into another program only as an inherently low-resolution bitmap (.BMP) file. You can take the extra step and export the graphic so that it retains its good appearance, but only in the highly incompatible .EMF format. I tried several programs for opening and editing the tree diagram that emerged, but only PowerPoint had some success in handling the file (i.e., allowing me to open and edit the diagram). PowerPoint chose to keep the exported image life-size, rather than shrink it to the page, which is good, as the file came with a useless "background" that was some 110 by 200 inches at this scale. This mass needed to be deleted, but very carefully, as it was linked with several needed parts of the diagram.

Other programs, all of which tried to get the diagram onto one page, left the tree diagram as a minuscule mass squeezed far into one corner. Trying to expand the diagram to something actually visible left it a horrible mess, with the labeling text either absent or scattered at random locations.

Even with PowerPoint, which almost always accepts and converts any other image, the charts (which can appear at each point in the tree diagram) did not make the transition. I hope SPSS will deliver a solution to this diagram-pasting problem soon.

In fact, those working at SPSS only need look at the old KnowledgeSeeker 3.1 to see how this is done. Incidentally, that older program included many other nice little tricks that SPSS could incorporate here as well. It’s a shame that programmers seem not to look at others’ work to learn how they could improve their own. This phenomenon of not learning from what others have already done is widely known in technical circles as the "NIH" (or "not invented here") effect. That is, "If we didn’t make it, it can’t be good."

Beyond this basic problem in pasting, you cannot set which elements appear at each point in the diagram, except to say that you want the full explanation, or the same with a graph attached. (Since the graphs apparently cannot go into other applications, they seem a moot point.)

Nice as it is to have the full story, as many years of sad experience have taught me, most clients cannot process and understand a complete classification tree diagram. Figs. 5a and 5b give you some idea of what I am talking about, comparing the unvarnished tree output with the approximate level of detail that most client-types can tolerate.

As the diagrams show, the number-one client priority in information is usually "How are we doing?", assuming the analysis is structured to answer that question. They may also ask about one or two other key competitors, but usually want to see them separately before taking a look at all three (or two) combined.

Anyhow, allowing the user to select which categories appear at each node, and which statistics appear along with this, would greatly help this program create useful output.

Another feature of classification tree analysis that comes in quite handy is the "gains" analysis. This shows all final groups, or "terminal nodes," in the tree, and how they compare in (for instance) incidence of some group that we care about greatly. They appear in order, from highest to lowest. The large table shown here is a portion of a gains analysis, in the form that a client might want.

The smaller table shows how the gains analysis comes out of AnswerTree. It is missing the highly important information about who is in each group. This leaves the unfortunate user in the position of needing to provide this information for the program. While the old KnowledgeSeeker did not do a great job formatting this output, it at least put some form of group definition with each group. Your reviewer is sure SPSS can do this as well, and requests that they do.
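For readers who end up rebuilding this table by hand, here is a minimal sketch in Python of what a gains analysis with group definitions might look like. It is not AnswerTree or KnowledgeSeeker output. The node definitions, sizes and counts are made-up illustrations, and the names TerminalNode and gains_table are hypothetical. The sketch simply ranks terminal nodes from highest to lowest incidence and keeps each group's definition next to its numbers.

```python
from dataclasses import dataclass


@dataclass
class TerminalNode:
    definition: str   # who is in the group (hypothetical wording)
    size: int         # respondents in the node
    targets: int      # respondents in the node who belong to the group of interest


def gains_table(nodes):
    """Return rows sorted from highest to lowest incidence, with group definitions."""
    total_size = sum(n.size for n in nodes)
    total_targets = sum(n.targets for n in nodes)
    overall = total_targets / total_size  # incidence across all terminal nodes
    rows = []
    for n in sorted(nodes, key=lambda n: n.targets / n.size, reverse=True):
        incidence = n.targets / n.size
        rows.append({
            "definition": n.definition,
            "size": n.size,
            "incidence_pct": round(100 * incidence, 1),
            # Index of 100 means the node matches the overall incidence.
            "index": round(100 * incidence / overall),
        })
    return rows


# Made-up illustrative nodes; not results from any real analysis.
EXAMPLE_NODES = [
    TerminalNode("Age under 35, heavy users", 120, 60),
    TerminalNode("Age 35+, light users", 300, 30),
    TerminalNode("Age under 35, light users", 180, 45),
]

for row in gains_table(EXAMPLE_NODES):
    print(f"{row['definition']:<28} n={row['size']:>4} "
          f"incidence={row['incidence_pct']:>5}% index={row['index']}")
```

The point of the design is only that the definition travels with every row, so nobody has to look up node numbers against the tree diagram.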

AnswerTree: a summary

SPSS has come very far with release 2.0 of this product. It has top-notch analytical capabilities, as we would expect for products from this software maker.

I really hope that they will work to resolve problems with the tree diagram output quickly. To repeat myself, the default "copy and paste" format for these diagrams, the plain bitmap (.BMP), falls short of acceptability for professional-level presentations. The program needs to have cutting and pasting take place in the true enhanced Windows metafile format, just as the old KnowledgeSeeker program has done for years. Beyond this, it would be helpful if the program allowed choices in the contents that appear in the tree diagram, and if it allowed group descriptions to appear in the gains analysis.

Another feature that would help users is a progress meter of some type. For instance, KnowledgeSeeker displays the name of the variable it is processing in the small pane at the lower left of its program window. Watching these names fly past, generally faster than you can see, makes you understand how fast the program is working. A meter like this can greatly reduce any subjective feeling of slowness in this type of analysis. Here’s a case where AnswerTree could borrow from the main SPSS program, which has long included a case counter, showing just how far it has gotten in the analysis.

DeltaGraph 4.0: a reprise

This program is not new to these pages, but a reminder seems in order, since it is at the top of its class. In short, DeltaGraph provides charting and graphing power of the highest order. It has the best mix of simplicity of use, chart customization, and depth of features of any charting/graphics package your reviewer has yet encountered. DeltaGraph has a long history on the Macintosh, and now, in its Version 4.0 series (the current release is 4.05), has become fully compatible with Windows 98. (It still comes in a Macintosh version.)

Even though other programs, including Excel, Harvard Graphics, and Lotus Freelance -- and SPSS itself, as mentioned -- have strongly improved graphing abilities, DeltaGraph remains several steps ahead of all. It offers over 70 different chart types, and over 200 styles, grouped into "galleries." Just browsing through its offerings may give you new ideas about ways to display data.

In addition to providing many preset graph types, the program allows you to customize nearly anything on a chart, and then save the result in the "gallery." This is definitely the program to have if you need everything "just so," down to the size and placement of the tick marks at the border of the chart.

Among the many fine features of this program, one that I probably like best is its ability to make scatterplot diagrams with labels next to the points. This feature works quite handily with the various types of perceptual maps (whether actually from discriminant analysis, correspondence analysis, multidimensional scaling, or whatever). You just copy in the coordinates and the labels, and the program does nearly all the terrible plotting work you once had to do by hand. You will have to nudge some of the labels if the chart is crowded, but the program makes this type of on-screen editing quite simple.

Other useful charts rarely seen elsewhere include "x-y bars" and "bubble charts." In x-y bars the widths represent one series of numbers, and the heights another. For instance, you can make the widths of the bars represent the sizes of groups being analyzed, and the heights represent market shares among those groups. Unlike a simple bar chart, this can give you a quick visual impression of how two factors vary at one glance. (This even has been client-tested, and elicited no more than 60 seconds of blank stares from any audience.) Using this chart type, for instance, you could quickly show how much of total sales volume goes into each group. (The area of each bar in the chart -- height times width -- would show the proportion of volume accounted for by the group. The bigger the bar, then, the more volume in the group. Most people get this idea.)
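The arithmetic behind the x-y bar is simple enough to sketch in a few lines of Python. This is only an illustration of the height-times-width idea, not anything DeltaGraph itself exposes. The function name volume_shares and the numbers are hypothetical, and it assumes each group member accounts for roughly the same volume, so that group size times share tracks volume.

```python
def volume_shares(groups):
    """groups: list of (name, width, height); returns {name: share of total bar area}."""
    # Area of each bar is width times height.
    areas = {name: width * height for name, width, height in groups}
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


# Hypothetical numbers: width = group size (% of market), height = market share (%).
EXAMPLE_GROUPS = [
    ("Group A", 50, 10),
    ("Group B", 30, 20),
    ("Group C", 20, 45),
]

for name, share in volume_shares(EXAMPLE_GROUPS).items():
    print(f"{name}: {share:.0%} of volume")
```

In this made-up case the smallest group carries the most volume, which is exactly the kind of thing a plain bar chart of shares alone would hide.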

Bubble charts are useful because they can show both an "x-y" position for a point and represent (for instance) its importance, by its size. This can add very nicely to several types of maps.

DeltaGraph also has many extra analytical features, some of which have become expected of a charting package. For instance, it can calculate and plot regression lines and fit various types of curves (power, exponential, logarithmic, etc.), and calculate new data with built-in formulas. More advanced features include the ability to add "error bars" to exact specifications (for instance, at 1.5 standard deviations around plotted points in either or both directions, if you wish), and an editor specifically for equations. DeltaGraph also can handle charts with thousands of data points, if you and your audience can.
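To make the error-bar specification concrete, here is a rough Python sketch of where bars at 1.5 standard deviations would fall around a plotted point. It is an assumption-laden illustration, not DeltaGraph's own calculation: it uses the sample standard deviation (the program may define this differently), and the function name error_bar and the ratings are hypothetical.

```python
from statistics import mean, stdev


def error_bar(values, k=1.5):
    """Return (lower, center, upper) with bars k sample standard deviations from the mean."""
    center = mean(values)
    spread = k * stdev(values)  # assumption: sample (n - 1) standard deviation
    return center - spread, center, center + spread


# Hypothetical ratings behind one plotted point.
EXAMPLE_RATINGS = [2, 4, 4, 4, 5, 5, 7, 9]
low, mid, high = error_bar(EXAMPLE_RATINGS)
print(f"{low:.2f} .. {mid:.2f} .. {high:.2f}")
```

Plotting a bar in one direction only would simply mean using either the lower or the upper value and ignoring the other.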

You can make a sort of a slide show with DeltaGraph alone, but I prefer to use it as a supplement to programs like PowerPoint or Excel, when they do not have enough charting power. Charts from DeltaGraph paste very nicely into these applications, as the good kind of Windows enhanced metafiles, which print at the best resolution your printer can offer. The charts also can be ungrouped, and edited one element at a time, in PowerPoint and several other programs.

DeltaGraph is particularly useful as an adjunct to these programs in part because (unlike them), DeltaGraph does not think it knows better than you when it comes to labeling. On bar charts in particular, DeltaGraph will include all the labels you request, and not skip some to satisfy its own internal sense of aesthetics.

Of course, DeltaGraph can make all sorts of astonishing, and sometimes mind-boggling, charts with 3-D effects. Unfortunately, while these usually seem incredibly interesting in the making, many audiences do not find them much fun, or highly comprehensible. It may take a little experience with a program this powerful to realize that it offers some invitation to overdo your charts.

There are only a few features on your author’s wish list for DeltaGraph. Salient among these is the inclusion of a "recently used file" list on the file menu. Nearly all Windows programs now have this feature, and it certainly can be very handy in opening and editing recent work.

In addition, it would be nice if the menus were reorganized somewhat. As they now stand, you can alter elements of the chart with three menu items, "Options," "Properties," and "Format." Until you have used this program for a while, you may find it unclear which one of these you need to perform the changes you want. Almost always, the item you want to modify has a control someplace, but you may need to search for it.

Also, there is one control that could be added. This one would allow the user to specify the placement of labels on bar charts more closely. Now you have some general options like "inside," "at end" or "outside." The ability to specify labels’ distances from the ends of bars would help. At the least, the program would work better if it made sure labels fell beyond the ends of 3-D bars, when you ask for them to go "outside."

Parts of DeltaGraph’s charting power continue to find their ways into each release of SPSS, but even so, you likely will find this a remarkably versatile and useful piece of software. It packs a tremendous amount of charting power, regardless of price -- and at $295 looks like an exceptional value for the money.

SPSS in particular should be congratulated for recognizing an outstanding program that was NIH (not invented here), and getting this remarkable product when it could. The original maker of this program was a once-Mac-centered outfit called DeltaPoint. Here’s one instance where SPSS found a way to take on board years of clever program development, rather than reinventing it.

A two-minute review

SPSS 9.0 definitely represents a major upgrade of the program. If you have an earlier version, and have been holding off on upgrading, the time is at hand. In particular, if you are considering taking on the analysis of discrete choice modeling (DCM) problems, SPSS now has the tools you need. You may need an add-on module or two more than you ever did, but then, you are getting more functionality in the bargain.

SYSTAT has become a real powerhouse of a program. If your budget is tight, and you want one piece of software that will do nearly everything you could ask, this is your choice. (The only salient gaps in SYSTAT are a lack of conjoint analysis as it is commonly done, and classification tree methods that use two-way splits only.) If you do any heavy lifting in data analysis, you almost certainly will want both SPSS and SYSTAT on your system.

AnswerTree 2.0 represents a major step toward having a comprehensive classification-tree-analysis program, at a price that will not make you ill. As such, it fills an important gap. SPSS still needs to beef up AnswerTree’s limited facilities for copying and pasting tree diagrams into other applications, to make the tree displays more flexible, and to strengthen the output in gains analysis. If they can do that, this will become the best classification tree analysis program of all time.

DeltaGraph 4.0 remains at the top of its class in charting and graphing programs. This program has had a minor enhancement since our last report, but remains the same outstanding software that we reviewed earlier. This program earns the strongest of recommendations.

About SPSS pricing

If you have not yet bought a statistics package, you may be surprised to find that they are much more expensive than an office suite program, a create-a-card program, or even the latest 3-D action game. This is because the statistics program is targeted toward a smaller and more select audience than mass-market software, i.e., you.

In any event, that’s the explanation your reviewer has heard. (The fact that SPSS probably has over a million users worldwide has nothing to do with this discussion.) Whatever the causation, expect to pay about $795 for the SPSS base and some $295 to $395 for each add-on module. You can get discounts for buying several modules at once, and for buying more than one copy of the program. You need to call SPSS to find out about the exact discounts available. AnswerTree, as mentioned earlier, weighs in at $995, which is the same price as SYSTAT.

The Smart Viewer program requires that you pay a fee for each user who gets the viewer with your output. SPSS asks a fee of $195 for the first user, then has steep discounts for subsequent users. SPSS also has a Web server version of the Viewer, which allows you to put SPSS output on the Web for all to see. Be advised, though, that this latter program is a true "enterprise level application," in that it is very expensive. Expect to pay about $20,000 for this.

More next month

That’s about it for SPSS products at the moment. Next month we’ll look at some new items from Microsoft and also look at a data viewing program.