User Performance Evaluation

Description of the method:

User Performance Evaluation is a method in which a participant performs a task using the actual documentation so that potential problems and errors can be detected by measuring the user's effectiveness, efficiency, and satisfaction.  User testing may be performed on the actual aircraft or on component parts on a bench.  A think-aloud protocol is used to find errors and problem areas in the User Performance Evaluation of technical documentation.

The User Performance Evaluation typically has five stages: 1) Planning and preparation; 2) Introduction; 3) The actual evaluation; 4) Debriefing; and 5) Data analysis and results reporting. The Tools and Templates section contains tools to help you develop each of the stages: the Usability Evaluation Planning Template, the User Testing Script, the User Performance Evaluation Guidelines for Facilitator and Note Takers, and the Background Questionnaire and Satisfaction Questionnaire, which collect each individual user's information.

Development Lifecycle Stage:  This method is most effective in the middle to later stages of development, when the aircraft is available and the documentation has already gone through internal proofing and heuristic evaluations or cognitive walkthroughs to detect errors.

Number of Users Required:  The number of users to be tested will likely depend on the complexity and safety criticality of the task.  It is preferable to test a diverse sample of users, e.g. users who differ in experience level and/or familiarity with the aircraft tasks being evaluated.  In most usability testing, it is advisable to have at least five participants.  However, limits on the availability of maintenance technicians, the aircraft, and time may not allow for five sessions.

Type of Users:  The more diverse the types of users you test, the more your findings will generalize to the different types of users of the documentation.  Differences to be considered are the users' experience level, familiarity with the procedure, and job responsibility, e.g. maintenance technician or engineer.  Although the engineer is not the end user of the documentation, a Co-Discovery method of User Performance Evaluation may be advantageous because issues discovered during the evaluation can be worked out while both the maintenance technician and the engineer are present.

Evaluator Skills Required to Use the Method:  This method requires a great deal of creativity and skill from the evaluator; therefore, the best case would be a specialist in evaluative testing.  Many times the evaluator will not know exactly which areas will be problematic; therefore, the evaluator must be alert to the technicians’ actions and expressions as well as their dialogue.  This ability will aid the evaluator in probing the maintenance technicians’ thoughts and concerns during the session.

Although experience in user testing is advisable, planning and conducting this method of evaluation may be accomplished with a careful review of the publications available on user testing.  We have also included a planning template and user evaluation script to guide your use of this evaluation method.

Number of Evaluators Required:  Depending upon the complexity of the procedure, one or two evaluators are necessary.  If the procedure is complex (in number of steps or criticality), two evaluators are advisable: one to probe and one to video record or take notes.

Advantages of method: 

  • Provides in-depth information as to problem areas and errors in the documentation.

  • Finds the most severe types of errors that are not likely found by other evaluation methods.

  • Provides the benefit of watching actual users experiencing problems.

Disadvantages of Method: 

  • It is more expensive in time, personnel, and aircraft resources; however, it may save resources by correcting errors earlier in the process.

  • Testing is always an artificial situation rather than the actual working situation; the very act of conducting an evaluation can affect the results.

  • Does not usually find the typographical or grammar errors in the documentation.  Errors in technical values may also be more difficult to discover using this evaluation method.

  • May be difficult to recruit the targeted user groups.

Level (or amount) of User and Evaluator Interaction:  High interaction, in that the evaluator must probe and seek to understand the user's concerns, perhaps even working out possible solutions to the problem area.

Data Recording Method(s):  Videotaping the session is advised so that you have a record for re-evaluating your findings.  Portions of the video recording will also convey your findings more clearly when you present them to others.

Data may also be recorded by a skilled note taker.  Portions of the test may need further clarification following the evaluation and the note taker can mark those areas for probing in the debriefing session.

Total Testing Time Required:  Testing time required for this method will range from weeks to months depending upon the number of participants to be tested.

Testing Time Per User:  Depending upon the complexity of the task, user performance evaluation will take approximately 50% longer than the procedure would "normally" take when using participants unfamiliar with the task.  This additional time is due to using a think-aloud protocol, having others observe the user's performance, and the debriefing session following the evaluation.  Less time would be required for the experienced user.
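As a rough planning aid, the 50% figure above can be turned into a session-length estimate. The following is a minimal sketch, not part of the method itself: the function name, the `overhead` parameter, and the 60-minute example are hypothetical, and the default of 0.5 simply encodes the "approximately 50% longer" estimate for participants unfamiliar with the task.

```python
def estimated_session_minutes(nominal_minutes: float, overhead: float = 0.5) -> float:
    """Estimate the length of one evaluation session.

    nominal_minutes: time the procedure would "normally" take (assumed input).
    overhead: fractional extra time for think-aloud, observation, and
        debriefing; 0.5 reflects the approximate 50% figure in the text.
    """
    if nominal_minutes < 0 or overhead < 0:
        raise ValueError("nominal_minutes and overhead must be non-negative")
    return nominal_minutes * (1.0 + overhead)


# Example: a procedure that normally takes 60 minutes.
print(estimated_session_minutes(60))
```

Because less time is required for an experienced user, a smaller `overhead` value could be passed for those sessions; the text does not give a figure for that case, so any such value would be the planner's own assumption.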

Typical Output from Test:  Objective data can be collected for User Performance Evaluations, such as the time users take to complete the task, the time spent recovering from errors, the number of user errors, how frequently the manual was used to solve the user's problem, or the number of times the user had to work around a problem that wasn't covered in the procedure.
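The objective measures listed above could be tallied per session and then summarized across participants. The sketch below is only an illustration under stated assumptions: the `SessionRecord` fields, the `summarize` function, and the sample values are hypothetical and do not come from the Tools and Templates section.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SessionRecord:
    """Objective measures from one participant's session (hypothetical layout)."""
    participant: str
    task_seconds: float       # time taken to complete the task
    recovery_seconds: float   # time spent recovering from errors
    errors: int               # number of user errors
    manual_lookups: int       # times the manual was used to solve a problem
    workarounds: int          # problems not covered in the procedure


def summarize(sessions: list[SessionRecord]) -> dict:
    """Summarize objective measures across all recorded sessions."""
    if not sessions:
        raise ValueError("at least one session is required")
    total_task = sum(s.task_seconds for s in sessions)
    total_recovery = sum(s.recovery_seconds for s in sessions)
    return {
        "participants": len(sessions),
        "mean_task_seconds": mean([s.task_seconds for s in sessions]),
        "total_errors": sum(s.errors for s in sessions),
        "total_manual_lookups": sum(s.manual_lookups for s in sessions),
        "total_workarounds": sum(s.workarounds for s in sessions),
        # Share of total task time spent recovering from errors.
        "recovery_share": total_recovery / total_task if total_task else 0.0,
    }


# Illustrative values only, not real evaluation data.
sessions = [
    SessionRecord("P1", 1200.0, 180.0, 3, 4, 1),
    SessionRecord("P2", 900.0, 60.0, 1, 2, 0),
]
print(summarize(sessions))
```

With the small number of participants typical of this method, such totals and means are descriptive aids for reporting rather than a basis for statistical inference.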

Subjective data can also be collected: the user's comments during the evaluation about the language/terminology used and the sequence of the tasks, the user's comments during the debriefing session following the performance evaluation, and the satisfaction data collected by the questionnaire.

How to Run the Test:  Basic instructions for running the tests can be found in the Tools and Templates section.  The Usability Evaluation Planning Template helps with each of the stages, and the section also contains the User Testing Script, the User Performance Evaluation Guidelines for Facilitator and Note Takers, and the Background Questionnaire and Satisfaction Questionnaire, which collect each individual user's information.

Related Tests:  Co-Discovery is basically the same as User Performance Evaluation but uses two participants rather than a single user.

Required Testing Materials: Audio/video equipment, aircraft or component parts with any needed tools, the written procedure with any illustrations or supporting documentation, a consent form, background questionnaire, and a satisfaction questionnaire.

Cost to Conduct Test:  The cost to conduct User Performance Evaluations is high due to the human and aircraft resources required.  The cost may be reduced by using component parts on the bench rather than the aircraft.

References / Where to Learn More:  See References and Useful Resources.

Type of System that Test Can Be Done On:  User testing can be performed on the actual system or component parts on the bench.

Goals of Testing:  User performance evaluation finds the most severe problems in the documentation, the issues that usually are not discovered using other methods.  Therefore, the goal of user testing is to discover problems in the maintenance documentation for the most complex and/or safety critical components of the aircraft.

Subjective or Objective Test:  The results generated from user performance evaluations are mainly objective but may be biased by the evaluator's perception of the participant's comments.  When the evaluator is not trained in user testing, care should be taken that the participant's comments are clarified.

After the evaluation data have been collected, a satisfaction questionnaire will collect the maintenance technician's subjective assessment of the procedure.

Ease of Learning to Conduct the Test:  In the case of evaluating technical maintenance documentation, user evaluation is principally testing the documentation as written and discovering areas that require more, less, or different text or illustrations.  Therefore, learning how to conduct this type of test requires a moderate amount of reading and no formal research training.  A basic understanding of cognitive processes, the user's action cycle, and careful planning of the User Evaluation are necessary to get valid and reliable results.

Turnaround Time:  Depending upon the availability of the resources necessary to conduct the user testing, User Performance Evaluations may take several weeks to complete; however, the first few sessions will produce results along the way that may help the evaluator focus on certain aspects of the evaluation.  By seeing the different approaches that various technicians take to problem areas, the writer will be able to probe for solutions, test alternatives, and/or add necessary information for clarity where required.

Focus of Evaluation:  The focus of User Performance Evaluation is wide since the user has access to the test article and the full extent of the maintenance task can be evaluated.

Related Statistical Analysis: Results of User Performance Evaluation are qualitative due to the low number of participants.


Human Factors Laboratory, National Institute for Aviation Research at Wichita State University. Research funded by the Federal Aviation Administration.  All rights reserved.
Revised: 11/05/04