Practicing Multiple-Choice Question Design

As I noted in my previous blog, writing test items is a challenge for almost everyone. Even seasoned educators have told me it’s a complicated process.

When test developers design test questions, they should focus on the knowledge that needs to be confirmed. That sounds simple enough, right? Yet as any test developer can confirm, it’s easy to lose sight of the objective once you begin to write the test items.

Good test questions also require solid grammar that avoids confusion for the test taker. Test items can mislead for many reasons, including vague or inconsistent wording.

In this blog, we will examine some sample test questions and how we might improve them.

Example 1

First, let’s assume you are training employees on new purchasing policies and procedures. In this scenario, managers have typically approved all supply requests. Because manager workloads have increased, staff-level employees are now instructed to send purchasing requisitions for orders under $25 directly to the bookkeeper in the controller’s office. The bookkeeper will then input the requisition information, which will flag the purchasing agent to fulfill the order.

You want to ensure that your employees know who should receive purchasing requests for small orders under $25.

Poor:

If you need office supplies, you should:

A. Fill out a purchasing request

B. The bookkeeper gets the form

C. Send your manager the form

D. Email Daffy Duck

This stem is confusing and vague. It does not directly ask whom the employee needs to contact, so Response A might seem appropriate even though it is incorrect. Response B does not flow grammatically from the stem and therefore doesn’t look like the right answer even though it is supposed to be the correct answer. The stem doesn’t provide enough information about the office supplies for the employee to know that they should bypass their manager with this small order, so Response C could arguably be correct. Finally, Daffy Duck is a useless distractor that adds nothing to the validity of the question’s design.

Poor:

You need to order notepads and pencils. Your manager is very busy working on the new scheduling software. How do you go about obtaining your supplies?

A. Email the bookkeeper because the manager is busy

B. Email the manager because all supplies must be approved regardless of cost

C. Email the purchasing agent because the supplies cost less than $25

D. Email a purchasing requisition to the controller

This stem offers more information than is necessary by noting the manager is busy, yet it remains vague on the order’s value. Response A is correct, but not because the manager is busy. While Responses A, B and C are parallel in construction, D is not. Response D also adds wording that could leave a test taker with no good response if the test taker begins to debate sending a simple email vs. sending a purchasing requisition. It creates further confusion because the bookkeeper is located in the controller’s office. This feels like a trick question.

Better but Needs Improvement:

You need to order notepads and pencils totaling less than $25. How do you go about obtaining your supplies?

A. Email a purchasing requisition to your manager

B. Email a purchasing requisition to the purchasing agent

C. Email a purchasing requisition to the bookkeeper

This stem gets straight to the point, but every response repeats the same opening words. Moving that wording into the stem eliminates the redundancy and the extra reading time:

Better:

You need to order notepads and pencils totaling less than $25. To whom should you email your purchasing requisition?

A. Your manager

B. The purchasing agent

C. The bookkeeper

Now the question is clean and simple and tests the knowledge you sought to test.

And note that including four responses is not a requirement when designing multiple-choice questions. Three responses can be just as, if not more, effective.

Example 2

Let’s assume that your clients are located across the globe and your recent client survey revealed that they prefer communications to be conducted via email. You share this information with your staff during a training session on professional communications. However, you also caution your staff that client complaints should be handled with as much personal interaction as possible; therefore, a telephone call should always be the preferred method of communicating with an unhappy client if an in-person visit is not possible.

You want to ensure that your staff knows your clients’ preferred method of contact.

Poor:

Communication:

A. Is best via email

B. Should be face to face

C. Requires us to use text messaging sometimes

D. Is more personal on the telephone

This stem is too vague to be effective. Response A is correct according to the desired knowledge being tested, but Response D is undeniably true. Since the stem is so confusing, how can this test item be a valid illustration of knowledge?

Better:

Our clients prefer that we communicate with them on routine matters via:

A. In-person visit

B. Telephone

C. Email

D. Text message

This stem is clear and to the point. The responses are simple and orderly.

Example 3

Your new workplace injury policy requires all employees to notify Human Resources within 24 hours of an on-the-job injury. You want to be sure your employees know how soon to report the injury.

Poor:

What should you do if you are injured on the job?

A. Seek medical attention immediately.

B. Report the matter to HR within 24 hours.

C. Secure a witness right away.

D. Tell your supervisor.

This stem is too vague, and all responses could be argued as reasonably correct in most circumstances.

Better:

How quickly should you report an on-the-job injury?

A. Within 1 hour

B. Within 24 hours

C. Within 3 months

D. There is no time limit

Again, simpler is better. If you want to make sure they know to whom they should report, add another question. Increasing the number of test questions can be an excellent way of adding validity and reliability to your test.

Example 4

During new hire orientation, your safety presentation includes the locations of the eyewash stations. You want to ensure that your staff know where to go if they need an eyewash station.

Poor:

Where is the emergency eyewash station located?

A. In the human resource office

B. In the employee break room

C. In multiple departments as designated by the safety risk factors associated with tasks that involve hazardous splash conditions

D. There are no eye wash stations in this building

The stem is poorly constructed in that it asks about a singular eyewash station, yet Response C – the correct response – notes plural eyewash stations and therefore doesn’t seem to match with the stem. Furthermore, Response C provides vague information.

Better:

The emergency eyewash stations are located in the:

A. Lab and production areas

B. Shipping and receiving areas

C. Human Resource and Employee Health departments

D. Employee break room and bathrooms

This stem is concise and accurately notes the presence of multiple eyewash stations. The responses are relatively parallel in format and length, and they are specific.

Example 5

For our final example, let’s say your anti-harassment policy allows employees multiple avenues for reporting harassment concerns. Your policy instructs employees to report harassment concerns to either their supervisor, the human resource manager, or the CEO.

You want to ensure that your employees know whom to talk to in the event of a harassment concern.

Poor:

To whom should you report a concern about workplace harassment?

A. Supervisor

B. Human Resource Manager

C. CEO

D. All of the above

While the stem is worded well, the responses are poorly constructed. Choosing one person leaves out the others, but Response D makes it sound like the employee must report to all three rather than one.

Better:

Our policy allows you to report a concern about workplace harassment directly to which of the following?

A. Supervisor

B. Human Resource Manager

C. CEO

D. Any of the above

While the stem in the poor version was good, changing the wording to emphasize the policy’s intent is a good idea in this case. Also, Response D has been reworded to say “Any” rather than “All,” making it clear that the employee may choose one of the three. Normally it’s best to avoid an “any of the above” type of answer, but in special cases it is warranted.

Also note that if you want employees to be able to independently name the people to whom they may report concerns, you should consider a fill-in-the-blank item rather than a multiple-choice item.

Finally, since constructing fair, valid and reliable tests is a challenging task for all test designers, every test should be reviewed by multiple reviewers. Reviewers checking for grammatical consistency and confusing wording do not need to be familiar with the body of knowledge being tested, but reviewers checking the test’s ability to assess the actual knowledge should be familiar with the content.

When designing tests, be sure to maintain your focus on the following questions:
