HCI — Human Computer Interaction -(ii)

Evaluation techniques for interactive systems

Dec 27, 2020

This is the second blog in this HCI series. In this article I mainly focus on evaluation techniques for interactive systems. For easier understanding, I have divided the topic into the following subtopics.

  1. What is evaluation
  2. Goals of evaluation
  3. Evaluation through expert analysis
     • Cognitive walkthrough
     • Heuristic evaluation
     • Model-based evaluation
  4. Evaluation through user participation
     • Styles of evaluation
     • Empirical methods: experimental evaluation
     • Observational techniques
     • Query techniques
  5. Evaluation through monitoring physiological responses

1. What is evaluation

Evaluation should not be thought of as a single phase in the design process. Ideally, evaluation should occur throughout the design life cycle, with the results of the evaluation feeding back into modifications to the design. Clearly, it is not usually possible to perform extensive experimental testing continuously throughout the design, but analytic and informal techniques can and should be used. Such techniques help to ensure that the design is assessed continually. This has the advantage that problems can be ironed out before considerable effort and resources have been expended on the implementation itself: it is much easier to change a design in the early stages of development than in the later stages.

We can make a broad distinction between evaluation by the designer or a usability expert, without direct involvement by users, and evaluation that studies actual use of the system. The former is particularly useful for assessing early designs and prototypes; the latter normally requires a working prototype or implementation. However, this is a broad distinction and, in practice, the user may be involved in assessing early design ideas (for example, through focus groups), and expert-based analysis can be performed on completed systems as a cheap and quick usability assessment. We will consider evaluation techniques under two broad headings: expert analysis and user participation.

2. Goals of evaluation

Evaluation has three main goals.

  • Assess the extent and accessibility of the system’s functionality: the design of the system should enable users to perform their intended tasks more easily. This includes not only making the appropriate functionality available within the system, but making it clearly reachable by the user in terms of the actions that the user needs to take to perform the task. It also involves matching the use of the system to the user’s expectations of the task.
  • Assess users’ experience of the interaction: this includes considering aspects such as how easy the system is to learn, its usability and the user’s satisfaction with it. It may also include the user’s enjoyment and emotional response, particularly in the case of systems that are aimed at leisure or entertainment.
  • Identify any specific problems with the system: these may be aspects of the design which, when used in their intended context, cause unexpected results or confusion amongst users. This is related to both the functionality and usability of the design. However, it is specifically concerned with identifying trouble spots which can then be rectified.

3. Evaluation through expert analysis

Evaluation should occur throughout the design process. In particular, the first evaluation of a system should ideally be performed before any implementation work has started. We will consider three approaches to expert analysis: cognitive walkthrough, heuristic evaluation and the use of models.

  • Cognitive walkthrough —

The origin of the cognitive walkthrough approach to evaluation is the code walkthrough familiar in software engineering. Walkthroughs require a detailed review of a sequence of actions. In the code walkthrough, the sequence represents a segment of the program code that is stepped through by the reviewers to check certain characteristics. In the cognitive walkthrough, the sequence of actions refers to the steps that an interface will require a user to perform in order to accomplish some task. To do a walkthrough you need four things:

  1. A specification or prototype of the system. It doesn’t have to be complete, but it should be fairly detailed. Details such as the location and wording for a menu can make a big difference.
  2. A description of the task the user is to perform on the system. This should be a representative task that most users will want to do.
  3. A complete, written list of the actions needed to complete the task with the proposed system.
  4. An indication of who the users are and what kind of experience and knowledge the evaluators can assume about them.

Given this information, the evaluators step through the action sequence to critique the system and tell a believable story about its usability. To do this, the evaluators try to answer the following four questions for each step in the action sequence:

  1. Is the effect of the action the same as the user’s goal at that point?
  2. Will users see that the action is available?
  3. Once users have found the correct action, will they know it is the one they need?
  4. After the action is taken, will users understand the feedback they get?
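
The four questions above lend themselves to a simple record-keeping structure. The following is only a minimal sketch of how an evaluator might log a walkthrough: the names `StepReview` and `walkthrough_report`, and the example task of programming a recorder, are assumptions made for illustration and are not part of any standard tool.

```python
from dataclasses import dataclass

# The four walkthrough questions, in the order listed above.
QUESTIONS = (
    "Is the effect of the action the same as the user's goal at that point?",
    "Will users see that the action is available?",
    "Once users have found the correct action, will they know it is the one they need?",
    "After the action is taken, will users understand the feedback they get?",
)


@dataclass
class StepReview:
    """One action in the sequence plus the evaluator's yes/no answers (hypothetical structure)."""
    action: str
    answers: tuple  # four booleans, one per question in QUESTIONS
    notes: str = ""

    def __post_init__(self):
        if len(self.answers) != len(QUESTIONS):
            raise ValueError("expected one answer per walkthrough question")

    def problems(self):
        # A "no" answer marks a potential usability problem for this step.
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


def walkthrough_report(reviews):
    """Map each action that failed at least one question to the questions it failed."""
    return {r.action: r.problems() for r in reviews if r.problems()}


# Illustrative (made-up) action sequence for a "set a timed recording" task.
example_reviews = [
    StepReview("Press the 'timed record' button", (True, True, True, True)),
    StepReview("Enter the start time", (True, True, False, True),
               notes="The field label does not say which time is expected."),
]
example_report = walkthrough_report(example_reviews)
```

Here `example_report` contains only the second action, together with the third question, which is the kind of "believable story" about a trouble spot that the walkthrough is meant to produce.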
  • Heuristic evaluation —

A heuristic is a guideline or general principle or rule of thumb that can guide a design decision or be used to critique a decision that has already been made. Heuristic evaluation, developed by Jakob Nielsen and Rolf Molich, is a method for structuring the critique of a system using a set of relatively simple and general heuristics. Heuristic evaluation can be performed on a design specification so it is useful for evaluating early design. But it can also be used on prototypes, storyboards and fully functioning systems. It is therefore a flexible, relatively cheap approach. Hence it is often considered a discount usability technique.

  • The use of models —

A third expert-based approach is the use of models. Certain cognitive and design models provide a means of combining design specification and evaluation into the same framework. Similarly, lower-level modeling techniques such as the keystroke-level model provide predictions of the time users will take to perform low-level physical tasks. Design methodologies, such as design rationale, also have a role to play in evaluation at the design stage. Design rationale provides a framework in which design options can be evaluated. By examining the criteria that are associated with each option in the design, and the evidence that is provided to support these criteria, informed judgments can be made in the design. Dialog models can also be used to evaluate dialog sequences for problems such as unreachable states, circular dialogs and complexity. Models such as state transition networks are useful for evaluating dialog designs prior to implementation.
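
To make the idea of checking a dialog model concrete, here is a minimal sketch that looks for unreachable states in a state transition network. The dictionary representation, the function names and the example dialog are all assumptions chosen for illustration; a real dialog notation would carry more detail.

```python
from collections import deque


def reachable_states(transitions, start):
    """Breadth-first search over the network, returning every state reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for target in transitions.get(state, {}).values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unreachable_states(transitions, start):
    """Return the states that no sequence of user actions can reach from start."""
    all_states = set(transitions)
    for actions in transitions.values():
        all_states.update(actions.values())
    return all_states - reachable_states(transitions, start)


# Hypothetical dialog: each state maps user actions to the state they lead to.
dialog = {
    "main_menu": {"open_settings": "settings", "start_export": "export"},
    "settings": {"back": "main_menu"},
    "export": {"confirm": "done", "cancel": "main_menu"},
    "done": {"ok": "main_menu"},
    "help": {"close": "main_menu"},  # no transition leads into this state
}

orphaned = unreachable_states(dialog, "main_menu")
```

In this made-up dialog, `orphaned` is `{"help"}`: the help screen exists in the design but nothing leads to it, which is exactly the sort of problem that can be caught before any implementation work.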

4. Evaluation through user participation

The techniques we have considered so far concentrate on evaluating a design or system through analysis by the designer, or an expert evaluator, rather than testing with actual users. The techniques in this section involve users directly; for convenience, we discuss them one at a time.

  • Styles of evaluation —

There are two main styles of evaluation: studies performed under laboratory conditions, and studies conducted in the work environment, or ‘in the field’.

Laboratory studies: In the first type of evaluation studies, users are taken out of their normal work environment to take part in controlled tests, often in a specialist usability laboratory.

Field studies: The second type of evaluation takes the designer or evaluator out into the user’s work environment in order to observe the system in action.

  • Empirical methods: experimental evaluation —

One of the most powerful methods of evaluating a design or an aspect of a design is to use a controlled experiment. This provides empirical evidence to support a particular claim or hypothesis. It can be used to study a wide range of different issues at different levels of detail. The main factors in any experiment are the participants chosen, the variables tested and manipulated, and the hypothesis tested.

Participants :- The choice of participants is vital to the success of any experiment. In evaluation experiments, participants should be chosen to match the expected user population as closely as possible.

Variables :- Experiments manipulate and measure variables under controlled conditions, in order to test the hypothesis. There are two main types of variable: those that are ‘manipulated’ or changed (the independent variables) and those that are measured (the dependent variables).

Hypotheses :- A hypothesis is a prediction of the outcome of an experiment. It is framed in terms of the independent and dependent variables, stating that a variation in the independent variable will cause a difference in the dependent variable.
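
As a rough illustration of how the independent variable, the dependent variable and the hypothesis fit together, the sketch below compares task completion times (the dependent variable) between two interface designs (the independent variable). The choice of Welch's t-test is an assumption made for this example, not something prescribed here, and the timing data are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    Each sample holds the dependent-variable measurements for one level
    of the independent variable.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Invented completion times in seconds; not real measurements.
old_design_times = [12.1, 14.3, 11.8, 13.5, 12.9]
new_design_times = [10.2, 11.1, 9.8, 10.9, 10.5]

t_stat, dof = welch_t(old_design_times, new_design_times)
```

The resulting statistic would then be compared against the t distribution with `dof` degrees of freedom to decide whether the difference between designs is likely to be due to chance.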

Experimental design

In order to produce reliable and generalizable results, an experiment must be carefully designed. We have already looked at a number of the factors that the experimenter must consider in the design, namely the participants, the independent and dependent variables, and the hypothesis. The remaining step is to choose the experimental method, that is, how participants are assigned to the conditions being compared.

  • Observational techniques —

A popular way to gather information about actual use of a system is to observe users interacting with it. Usually they are asked to complete a set of predetermined tasks, although, if observation is being carried out in their place of work, they may be observed going about their normal duties.

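Think aloud and cooperative evaluation :- Think aloud is a form of observation where users are asked to talk through what they are doing as they are being observed; for example, describing what they believe is happening, why they take an action, and what they are trying to do. Cooperative evaluation is a variation in which the user is encouraged to act as a collaborator in the evaluation rather than simply as an experimental participant.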

Protocol analysis :- Protocol analysis covers the methods used to record user actions for later analysis. The following methods can be used to record what users do:

  • Paper and pencil
  • Audio recording
  • Video recording
  • Computer logging

In practice, a mixture of these methods is most commonly used.

Automatic protocol analysis tools :- Analyzing protocols, whether video, audio or system logs, is time consuming and tedious by hand. It is made harder if there is more than one stream of data to synchronize. One solution to this problem is to provide automatic analysis tools to support the task. These offer a means of editing and annotating video, audio and system logs and synchronizing these for detailed analysis.
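
The synchronization problem can be sketched in a few lines. The example below merges two time-ordered streams into a single timeline; the record layout `(timestamp, source, event)`, the function name `merge_streams` and the sample records are assumptions for illustration, and the sketch further assumes that both streams were stamped against the same clock.

```python
import heapq


def merge_streams(*streams):
    """Merge several streams of (timestamp, source, event) records into one timeline.

    Each input stream must already be sorted by timestamp, which is the
    natural order for a log written as events occur.
    """
    return list(heapq.merge(*streams, key=lambda record: record[0]))


# Invented records: seconds since the start of the session, source, event.
system_log = [
    (0.0, "system", "dialog opened"),
    (2.4, "system", "button 'Save' clicked"),
    (5.1, "system", "error message shown"),
]
audio_notes = [
    (1.2, "audio", "user asks where the save option is"),
    (5.6, "audio", "user asks what the message means"),
]

timeline = merge_streams(system_log, audio_notes)
for timestamp, source, event in timeline:
    print(f"{timestamp:5.1f}s  {source:<6}  {event}")
```

Reading the merged timeline, an analyst can line up what the user said with what the system did at roughly the same moment, which is the core of what dedicated analysis tools automate.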

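Post-task walkthroughs :- Often data obtained via direct observation lack interpretation. We have the basic actions that were performed, but little knowledge as to why. Even where the participant has been encouraged to think aloud through the task, the information may be at the wrong level. A post-task walkthrough addresses this by playing a transcript or recording of the session back to the participant, who is invited to comment on it or answer the evaluator’s questions.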

  • Query techniques —

Another set of evaluation techniques relies on asking the user about the interface directly. Query techniques can be useful in eliciting detail of the user’s view of a system. They embody the philosophy that states that the best way to find out how a system meets user requirements is to ‘ask the user’. They can be used in evaluation and more widely to collect information about user requirements and tasks. There are two main types of query technique:

Interviews :- Interviewing users about their experience with an interactive system provides a direct and structured way of gathering information.

Questionnaires :- An alternative method of querying the user is to administer a questionnaire. This is clearly less flexible than the interview technique, since questions are fixed in advance, and it is likely that the questions will be less probing.

5. Evaluation through monitoring physiological responses

One of the problems with most evaluation techniques is that we are reliant on observation and on users telling us what they are doing and how they are feeling. Monitoring physiological responses offers a more objective alternative; the two approaches used most often are eye tracking and physiological measurement.

Physiological measurements :- Emotional response is closely tied to physiological changes. These include changes in heart rate, breathing and skin secretions. Measuring these physiological responses may therefore be useful in determining a user’s emotional response to an interface. Physiological measurement involves attaching various probes and sensors to the user. These measure a number of factors:

  • Heart activity: indicated by blood pressure, volume and pulse. These may respond to stress or anger.
  • Activity of the sweat glands: indicated by skin resistance or galvanic skin response (GSR). These are thought to indicate levels of arousal and mental effort.
  • Electrical activity in muscle: measured by the electromyogram (EMG). These appear to reflect involvement in a task.
  • Electrical activity in the brain: measured by the electroencephalogram (EEG).

This is the end of the second blog article of this series. For more details and examples, you can refer to the book mentioned below. The next article is the final blog of this series and covers Universal Design for Interactive Systems.

First blog article of this series: Design rules for interactive systems

Next blog article of this series: Universal Design for Interactive Systems


Chapter 9





Undergraduate student of Software Engineering, University of Kelaniya.