Book title: HIBROWSE for Bibliographic Databases: a study of the application of usability techniques in view-based searching

University of Huddersfield, 1997, ISBN 0-7123-3315-0

Andrew Dillon

This item is not the definitive copy. Please use the following citation when referencing this material: Dillon, A. (1998) Review of Treglown et al (1997) Hibrowse for bibliographic databases: a study of the application of usability techniques in view-based searching. BL Research and Innovation Report #52. Journal of Documentation 54(4), 505-508.

The present volume is a short report from a project funded by the British Library. With only 113 pages of text, and as many again in references and appendices, the reader will not find the volume a lengthy or entertaining read. It has the look and feel of a technical report and is written in a mechanical, matter-of-fact style that smacks of contractual obligation. While one should never judge a book by its cover, the plastic binding only adds to the sense of reading someone’s internal project reports - not an inspiring start.

HIBROWSE seems to have been an attempt to develop a view-based search interface for bibliographic databases and database management systems. The present report details a project which applies ‘methods and techniques from human-computer interaction design’ (quoted from the abstract) to the HIBROWSE system, leading the reader through the design processes and evaluation methods employed. Thus stated, I approached the review with relish. What is truly required is a systematic study of the utility of various evaluation methods in the context of real design processes. Sadly, the present book fails to deliver on its promise.

The report is divided into seven chapters, the first of which provides a brief introduction to the history of databases, a listing of project partners, and a statement of the project’s objectives. The remaining chapters cover the preliminary design of HIBROWSE, the conversion of EMBASE data into ADABAS (a three-page chapter), prototyping the new interface, a first usability evaluation, a second usability evaluation, and the conclusions.
The appendices appear to be little more than copies of the original internal project reports, the questionnaires used in the user testing, and some data output from the automated scoring software for one of the questionnaires used (Kirakowski’s SUMI). As you can gather, there are enough unexplained acronyms and code words here to keep even the most motivated reader challenged.

While project reports are normally quite dull to read, they may contain useful insights, data, and experiences from which the readership might learn. Presumably this is the thinking behind the publication of the present volume. Unfortunately, this work is so deeply flawed it is hard to find anything positive to say.

The field of HCI is a relatively new one (though tracing its origins to the mid-1950s is not difficult) and over the last 20 years it has developed numerous methods and techniques aimed at supporting the design of more usable information technology. While all designers and systems developers claim to be user-centered in their thinking, research has repeatedly demonstrated that shaping technology appropriately for human use is complicated. A variety of usability evaluation methods (user, expert and theory-based) have been proposed and tested, and it is generally agreed that reliable and valid tests of users performing realistic tasks in an appropriate context are the best means of ensuring usability.

As such, any project that seriously examined the utility of the HCI approach to design would likely offer tremendous insights into the important issues and variables in database development that affect user responses and influence designers’ choices. To do this properly would require HCI experts to identify candidate technologies, establish their contexts of use, select appropriate test methods, perform the tests and compare their respective outputs in terms of reliability, validity and, ultimately, utility to the design team. Had this been done, the objectives of this project might have been met and the results would surely have been useful. Unfortunately, while claiming to attempt this, the present report outlines efforts at doing so that fall so far short of ideal that one can hardly imagine any useful interpretations being made of the results.

The gist of the report is that by the time usability was considered worthy of investigation, the major design choices had already been made. Straight away, the sharp reader will notice that the project is never going to be able to deliver fully on its objectives. We will never know how the designers’ views drove initial design or altered over time on the basis of test results. We will not be able to see how various test methods offer differing utility at distinct stages of the development process. All we are told is that by the time usability became an issue, the team identified heuristic evaluation and questionnaires as the best means of testing the interface. After attempting a cognitive walkthrough themselves on a paper version, the team tested seven subjects with the questionnaire method, redesigned the interface, and then ran more users through more formal tests, relying heavily on screen records and further SUMI scores to generate data. Their major conclusion seems to be that users might need more training to exploit the HIBROWSE interface.

That the present project represents a missed opportunity for those in the field of HCI to generate some real knowledge seems largely to result from the authors’ poor understanding of HCI in general and usability evaluation in particular. This harsh conclusion is forced upon the present reviewer by the demonstrable mistakes the authors made. While hardly a precise science, HCI does offer a suite of tools and methods that must be mastered by those who would apply them. HCI practitioners also assume that understanding the context of use for a tool is essential; that evaluators must be trained; that there are trade-offs between various methods of testing (the precise nature of which the present project presumably sought to identify); and that usability is a quality of the user-tool interaction, not a feature or interface style of the tool itself. Usability engineering is predicated on the need to define usability operationally within this context of use and to test accordingly against agreed criteria for effectiveness, efficiency and satisfaction. Sadly, the project team seem unaware of these assumptions or else have chosen to ignore them. Context of use, task analysis, and user analysis are not performed, and usability criteria are never established. As an example of usability engineering, this project could almost serve as a case study in how not to do it!

The writing has more than its share of questionable interpretations and factual errors. For example, as early as page 18 the authors claim that user input to design (a fundamental point of user-centered design) should be regarded with caution. This interpretation is supposedly based on evidence, though none is cited. While it is true that there is more to user-centeredness than just asking users what they would like (and frequently it is true that users cannot reliably answer that question), there are many forms of user input, ranging from the passive (field observations of users in action) to the interactive (users participating in design meetings). Such sweeping statements hardly inspire confidence in the authors’ HCI credentials and seem more likely to be a weak rhetorical defense on their part for avoiding any user analysis at the outset.

From an HCI perspective it is meaningless to talk of usability without describing the users, their tasks and the environments in which they work, as any change in these will impact the measured level of usability and limit the potential for generalisation of test results (see e.g., Shackel, 1991). This is basic HCI. However, citing completely unrelated criticisms of traditional software engineering approaches as supporting evidence, the authors dismiss the need to do realistic task analysis on the grounds that such analyses are not appropriate for information retrieval (IR) tasks. If so, IR must be the only known human activity for which task analysis is considered inappropriate when designing supporting tools. Furthermore, this position contradicts the conclusion drawn in the appendices, where the results of the otherwise insightful Cognitive Walkthrough test are taken to indicate that further task analyses are required. So which do they really believe? They dismiss a cost-effective tool that would aid them in performing such an analysis (the MUSiC context handbook) on the grounds that it “requires structured descriptions of tasks in a form similar to the task models of Card et al 1983” (p. 20). This is simply not true. As a co-developer of said context handbook I can vouch for the fact that there is almost no relationship between the task descriptions it invokes and the GOMS models that they dismiss.

The demonstrable ignorance of relevant findings and their implications for evaluation continues throughout this report. Heuristic evaluations are performed by analysts with no knowledge of the task domain, yet while the authors cite relevant papers, nowhere do they mention the results of those studies showing that such evaluators perform worse than evaluators with task knowledge (Nielsen, 1992). Furthermore, said research also shows that two such evaluators will miss more of the problems than they will find, and that, to get the best out of such an evaluation, at least three evaluators are needed to find approximately 75% of the problems. For completely novice evaluators (and we are told the two employed here had little experience with the method) many more are needed to reach that rate of problem identification. While the Nielsen data are themselves questionable, they do point to concerns we should all have with heuristic methods, and which any reader would reasonably expect the authors of this study to mention and address.
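(To make the arithmetic behind these figures concrete - and the detection rates used here are illustrative assumptions on my part, not figures reported by the authors - the usual sketch in this literature treats each evaluator as independently finding a fixed proportion λ of the problems present, so that n evaluators together find

P(n) = 1 - (1 - λ)^n

of them. With λ of roughly 0.35 for evaluators experienced with the method, three evaluators yield 1 - 0.65^3, or about 73% of the problems; with λ nearer 0.2 for novices, six or seven evaluators are needed to reach the same level.)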

Throughout this report we are reminded that the authors are either not knowledgeable enough about, or not sufficiently concerned with, usability testing methodology to perform evaluations that would meet the stated objectives of the project. Basic details of the user tests are missing. Sampling is opportunistic, not stratified or targeted. Efficiency and effectiveness scores are not related to satisfaction scores. Data are mined for supposedly interesting results. Worst of all, the essence of the project’s objective, to study how the methods and techniques of HCI might work in this context, is never discussed or analysed, even though the introduction claims that this discussion occurs explicitly in chapter seven. It does not occur there or anywhere else. How could it, when the authors avoid performing the very research that would enable such an analysis? The final selection of techniques seems quite openly to have been based on what the authors had to hand, not on any analysis of methodological relevance or of the research problem.

One could continue, but it would be an exercise in fault finding. One suspects that the politics of the project may have been partly to blame for this weak report, and the apologetic tone of confession at the end suggests the authors are not entirely happy with this work themselves. What remains is a disappointing report that offers little value to HCI researchers or database designers. Furthermore, it proves once again that claiming to do HCI is easy; actually doing the work properly is difficult and is best left to specialists. HCI has come a long way in the last 20 years, and it is clear, as Newell and Card (1985) put it, that good intentions and a willingness to serve the user are no longer sufficient to practice the discipline - if nothing else, the present report demonstrates this all too well. The authors confess at the end that more should have been done to involve the users in this process. At least they learned something from all this.

References

Newell, A. and Card, S. (1985) The prospects for psychological science in Human-Computer Interaction. Human-Computer Interaction, 1, 209-242.

Nielsen, J. (1992) Finding usability problems through heuristic evaluation. Proceedings of CHI ’92, ACM, 373-380.

Shackel, B. (1991) Usability - context, framework, definition, design and evaluation. In B. Shackel and S. Richardson (eds.) Human Factors for Informatics Usability. Cambridge: Cambridge University Press, 21-38.