Do you write multiple-choice questions (MCQs) for the American Board of Radiology, the American College of Radiology, CME meetings, or online/printed CME materials?  Do you create exams for medical students, residents, or other learners?  Perhaps you’ve been writing MCQs for a number of years.  Or maybe you’re just getting started and haven’t had any training in creating effective items.  Either way, you might benefit from reading this post, if only to validate your mastery of the process.

Preamble – a personal perspective

I’ve chaired committees tasked with writing MCQs for high-stakes radiology exams. I’ve taught the art of writing good MCQs both locally and nationally.  I’ve published articles on the topic.  I’ve personally written hundreds of items.  One thing I’ve learned is that writing good MCQs is difficult and time-consuming.  Radiologists often have had little experience or training in item-writing, and, as a group, they are incredibly busy.  It’s no wonder that you see items on exams and CME materials that are poorly written (just ask any resident if you don’t believe me).  

Q. Why should you care about whether MCQs are effective?

A. The MCQ is the most commonly used test item in radiologic graduate medical education and continuing medical education exams.

MCQs – the devil is in the details

MCQs have their critics, and rightly so.  They’re often constructed to measure only rote memorization.  They test recognition (choosing an answer) rather than recall (constructing an answer), they allow for guessing, and, as I’ve already said, they are difficult and time-consuming to construct.

In addition, MCQs are not the most effective way to test everything worth testing. For example, they are not the best way to test a radiologist’s ability to communicate effectively.  An MCQ that asks the learner to recognize benign dermal calcifications on a mammogram does not test the learner’s problem-solving ability or ability to communicate the findings to a patient. Note: A question that provides specific patient information and imaging data and that asks the learner to choose the most appropriate management IS an example of an item that tests problem-solving ability.

When used appropriately, MCQs offer several advantages over other types of test items.  They can be used to assess a broad range of learner knowledge in a short period of time.  Because a large number of MCQs can be developed for a given content area, a broad range of concepts can be tested consistently, which supports test reliability.  If MCQs are drawn from a representative sample of content areas that constitute predetermined learning outcomes, they allow for a high degree of test validity.  It’s worth noting that the criticisms of MCQs are more often attributable to flaws in the construction of the test items than to any inherent weakness of the format.

Note: MCQ guidelines should be viewed not as absolutes but rather as best-practice principles.  In some circumstances, it may be appropriate to deviate from the guidelines.  However, such circumstances should be justified and occur infrequently.

Also note: This post does not cover Bloom’s taxonomy of cognitive learning, which is a hierarchy of knowledge, comprehension, application, analysis, synthesis, and evaluation. An MCQ should be written to test at the same level of learning as the objective it is designed to assess. 

Anatomy of an MCQ

Characteristics of effective MCQs can be described in terms of the overall item, the stem, and the options.  The “item” is the entire unit and consists of a stem and several options.  The “stem” is the question, statement or lead-in to the question.  The possible answers are called “alternatives”, “options”, or “choices.”  The correct option is called the “keyed response.”  The incorrect options are called “foils” or “distractors.”    
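If you keep your items in a script or a spreadsheet export, the vocabulary above maps onto a very small data structure.  Here is a minimal Python sketch of that idea.  The class name MCQItem and the field keyed_index are my own hypothetical labels, not part of any testing standard, and the option text is a placeholder.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MCQItem:
    """One item: a stem plus several options (hypothetical structure)."""
    stem: str
    options: Tuple[str, ...]
    keyed_index: int  # position of the keyed response within options

    def __post_init__(self):
        # The keyed response must be one of the listed options.
        if not 0 <= self.keyed_index < len(self.options):
            raise ValueError("keyed_index must point at one of the options")

    @property
    def keyed_response(self) -> str:
        """The correct option."""
        return self.options[self.keyed_index]

    @property
    def distractors(self) -> Tuple[str, ...]:
        """The incorrect options (foils)."""
        return tuple(o for i, o in enumerate(self.options)
                     if i != self.keyed_index)


# Placeholder content, for illustration only.
item = MCQItem(
    stem="Which of the following is an imaging feature of X?",
    options=("Finding A", "Finding B", "Finding C", "Finding D"),
    keyed_index=2,
)
print(item.keyed_response)
print(item.distractors)
```

Storing the keyed response as a position rather than as a separate string makes it hard to end up with an item whose "correct answer" is not actually among the options.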

Stem

The stem is usually written first and is best written as a complete sentence in the form of a question.  Direct questions (e.g., Which of the following is an imaging feature of lymphangiomyomatosis?) are clearer than sentence completions (e.g., Lymphangiomyomatosis…).  Even a stem that incorporates radiologic images should be accompanied by a complete statement.  Ideally, the item should be answerable without having to read all of the options. 

The stem should include all relevant information and only relevant information, and it should contain as much of the item as possible.  If a phrase can be stated in the stem, it should not be repeated in the options.

The stem should be kept as short as possible.  It should not be used as an opportunity to teach or include statements which are informative but not needed in order to select the correct option.  The purpose of an MCQ is not to teach.  It is a tool used to determine whether the learner achieved a particular objective, and in some cases, to test the effectiveness of instruction.  

Stems should not be tricky or misleading, such that they might deceive the examinee into answering incorrectly.  The level of reading difficulty should be kept low using simple language so that the stem is not a test of the examinee’s reading ability.  

To test application of knowledge, clinical vignettes can provide the basis for the question, beginning with the presenting problem of a patient, followed by the history (duration of signs and symptoms), physical findings, results of diagnostic studies, initial treatment, subsequent findings, etc.  Vignettes do not have to be long to be effective, and should avoid verbosity, extraneous material and “red herrings.” 

The stem should be stated so that only one option can be substantiated and that option should be indisputably correct.  If the correct option provided is not the only possible response, the stem should include the words “of the following.”  When more than one option has some element of truth or accuracy but the keyed response is the best, the stem should ask the examinee to select the “best answer” rather than the “correct answer.” 

Questions should generally be structured to ask for the correct answer and not a “wrong” answer.  Negatively posed questions are recognizable by phrases such as “which is not true” or “all of the following except.”  Negative questions tend to be less effective and more difficult for the examinee to understand.  When negative stems are used, the negative term (e.g., “not”) should be underlined, capitalized or italicized to make sure that it is noticed.  
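For anyone who reviews items in bulk, a check for negative terms that slipped through without emphasis is easy to automate.  The following is a rough Python sketch under a big assumption: the stem is plain text, so capitalization is the only form of emphasis it can detect (underlining and italics are invisible to it).  The function name and the list of negative terms are hypothetical choices of mine.

```python
import re

# Hypothetical list; extend it to suit your own item bank.
NEGATIVE_TERMS = ("not", "except")


def unemphasized_negatives(stem):
    """Return negative terms in the stem that are not fully capitalized."""
    hits = []
    for word in re.findall(r"[A-Za-z]+", stem):
        if word.lower() in NEGATIVE_TERMS and not word.isupper():
            hits.append(word)
    return hits


print(unemphasized_negatives("Which of the following is not true?"))
print(unemphasized_negatives("Which of the following is NOT true?"))
```

A hit is a prompt to consider rewording the stem positively first, and to emphasize the negative term only if the negative form really is needed.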

Absolute terms, such as “always”, “never”, “all”, or “none”, should not be used in the stem or distractors.  Savvy examinees know that few ideas or situations are absolute or universally true.  The terms “may”, “could”, and “can” are cues for the correct answer, as testwise examinees will know that almost anything is possible.  Imprecise terms such as “seldom”, “rarely”, “occasionally”, “sometimes”, “few”, and “many” are not uniformly understood and should be avoided.
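These word lists lend themselves to a quick automated screen.  Here is a minimal Python sketch, assuming plain-text stems and options.  The function name flag_terms and the category labels are hypothetical, and the word lists are simply the ones quoted above.

```python
import re

# Word lists taken from the guidance above; category names are my own.
CATEGORIES = {
    "absolute": ("always", "never", "all", "none"),
    "imprecise": ("seldom", "rarely", "occasionally", "sometimes",
                  "few", "many"),
    "cue": ("may", "could", "can"),
}


def flag_terms(text):
    """Return {category: [terms found]} using whole-word, case-insensitive matching."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    found = {}
    for name, terms in CATEGORIES.items():
        hits = [t for t in terms if t in words]
        if hits:
            found[name] = hits
    return found


print(flag_terms("Which finding is always seen and may rarely calcify?"))
```

Expect false positives: a simple word match cannot tell a problematic “all” from a harmless one, so treat each flag as something for a human reviewer to look at, not as a verdict.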

Eponyms, acronyms or abbreviations without definition should be avoided.  Consider the acronyms AAA, RCA, TBI, and BBB.  Examinees may be unfamiliar with these terms, and each term has more than one meaning.  In such cases, the item becomes a test of whether the examinee understands the meaning of a term, or the item is faulty because a term can be interpreted in more than one way.

Options

The most challenging aspect of creating MCQs is designing plausible distractors.  The ability of an item to discriminate (i.e., separate those who know the material from those who don’t) is founded in the quality and attractiveness of the distractors.  The best distractors are statements that are accurate but do not fully meet the requirements of the problem, and incorrect statements that seem right to the examinee.  Each incorrect option should be plausible but clearly incorrect.  

Implausible, trivial, or nonsense distractors should not be used.  Ideal options represent errors commonly made by examinees.  Distractors are often conceived by asking questions such as, “What do people usually confuse this entity with?”, “What is a common error in interpretation of this finding?” or “What are the common misconceptions in this area?”

Distractors should be related or somehow linked to each other.  That is, they should fall into the same category as the correct option (all diagnoses, tests, treatments, prognoses, disposition alternatives).  For example, all options might be a type of pneumonia or cause of right lower quadrant pain. 

The distractors should appear as similar as possible to the correct answer in terms of grammar, length, and complexity.  There is a common tendency to make the correct answer substantially longer than the distractors. 

The options should not stand out as a result of their phrasing.  Grammatical cues, such as when one or more options don’t follow grammatically from the stem, lead the examinee to the correct option.  If the stem is in past tense, all the options should be in past tense.  If the stem calls for a plural answer, all the options should be plural.  Stem and options should have subject-verb agreement.

Because an item writer tends to pay more attention to the correct option than to the distractors, grammatical errors are more likely to occur in the distractors.  Note:  These problems do not occur when the stem is written as a question. 

Options should not include “none of the above” or “all of the above.”  “None of the above” is problematic in items where judgment is involved and where the options are not absolutely true or false.  It shows only that the examinee knows what is not correct, not that the examinee knows what is correct.  “All of the above” is a problem because the examinee only needs to recognize that two of the options are correct.

Options should be placed in logical order, if there is one.  For example, if the answer is a number, the options should begin with the smallest and proceed to the largest (it is also acceptable to begin with the largest and proceed to the smallest).  If the options are dates, they should be listed in chronological order.  Also, options should be independent and not overlap with each other.  
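Several of the option rules above are mechanical enough to screen for in code.  The sketch below is one possible Python approach, not an established tool.  The function name check_options is hypothetical, and the length_ratio threshold of 1.5 is an arbitrary assumption of mine about what counts as “substantially longer.”

```python
def check_options(options, keyed_index, length_ratio=1.5):
    """Return a list of warnings for a set of options (rough screen only)."""
    warnings = []

    # Rule: avoid "none of the above" and "all of the above".
    lowered = [o.strip().lower() for o in options]
    for phrase in ("none of the above", "all of the above"):
        if phrase in lowered:
            warnings.append(f'avoid "{phrase}"')

    # Rule: the keyed response should not be much longer than the distractors.
    distractor_lengths = [len(o) for i, o in enumerate(options)
                          if i != keyed_index]
    if distractor_lengths:
        mean_len = sum(distractor_lengths) / len(distractor_lengths)
        if len(options[keyed_index]) > length_ratio * mean_len:
            warnings.append("keyed response is much longer than the distractors")

    # Rule: purely numeric options should be in ascending or descending order.
    try:
        values = [float(o) for o in options]
    except ValueError:
        values = None  # not all options are numbers; skip this check
    if values is not None:
        if values != sorted(values) and values != sorted(values, reverse=True):
            warnings.append("numeric options are not in ascending or descending order")

    return warnings


print(check_options(["5", "2", "10", "20"], keyed_index=0))
```

Plausibility, overlap between options, and whether the options all belong to the same category still need a human reviewer; this kind of script only catches the surface-level problems.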

Summary

If you write a lot of MCQs, especially for high-stakes exams, you can check your items using this “crib sheet”:

Stems

  • Provide a complete statement (preferably in the form of a question)
  • Include only relevant information
  • Contain as much of the item as possible in the stem
  • Keep stems as short as possible
  • Ask for the correct, not the “wrong” answer
  • Avoid absolute terms such as “always”, “never”, “all”, or “none”
  • Avoid imprecise terms such as “seldom”, “rarely”, “occasionally”, “sometimes”, “few”, or “many”
  • Avoid cues such as “may”, “could” or “can”
  • Define eponyms, acronyms, or abbreviations when used

Options

  • Keep options grammatically consistent with the stem
  • Write incorrect options to be plausible but clearly incorrect
  • Link options to each other (all diagnoses, tests, treatments)
  • Write distractors to be similar to the correct option in terms of grammar, length, and complexity
  • Avoid “none of the above” or “all of the above”
  • Place options in logical order (numerical, chronological)
  • Write options to be independent and not overlapping

If you want to watch a video on the topic:

ACR video: https://appcenter.acr.org/lcmsVideos/Tips_for_Writing/story_html5.html

If you want to read more:

RadioGraphics articles:

https://doi.org/10.1148/rg.262055145

https://pubs.rsna.org/doi/10.1148/rg.337125749

Vanderbilt University Center for Teaching: https://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/