Other Flaws: Beyond the Standard Categories

ExamEval provides AI-powered exam analysis for health professions educators. Although the literature describes many well-known item-writing flaws, multiple-choice questions can also contain flaws that fall outside these standard categories.
Because ExamEval uses AI rather than a strict rule-based system to detect item-writing flaws, it often flags flaws that do not fit a specific category. This mirrors what happens during a peer review of an exam: a faculty reviewer may find item-writing flaws unique to the topic or context of the course.
AI-powered analysis can identify nuanced and context-specific flaws that rule-based systems miss. This allows for a more comprehensive and human-like review of exam questions, ultimately leading to higher-quality assessments.
The Advantage of AI-Powered Detection
Traditional rule-based systems for detecting exam flaws are limited to identifying known patterns and explicit violations of established guidelines. While these systems excel at catching common issues like negative stems or "all of the above" options, they often miss more nuanced problems that experienced educators would readily identify.
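To make the contrast concrete, here is a minimal sketch of the kind of pattern matching a rule-based checker might perform. The patterns, the function name `find_rule_based_flaws`, and the example item are illustrative assumptions, not ExamEval's actual rules.

```python
import re

# Illustrative patterns for two classic flaws; examples of the kind of
# fixed rules described above, not ExamEval's actual rule set.
NEGATIVE_STEM = re.compile(r"\b(NOT|EXCEPT|LEAST)\b")  # capitalized negative words

def find_rule_based_flaws(stem: str, options: list[str]) -> list[str]:
    """Flag flaws that simple, explicit pattern rules can catch."""
    flaws = []
    if NEGATIVE_STEM.search(stem):
        flaws.append("negative stem")
    if any(opt.strip().lower() == "all of the above" for opt in options):
        flaws.append('"all of the above" option')
    return flaws

# Example: both rules fire on this item.
print(find_rule_based_flaws(
    "Which of the following is NOT a beta blocker?",
    ["Metoprolol", "Atenolol", "Lisinopril", "All of the above"],
))
# -> ['negative stem', '"all of the above" option']
```

A checker like this only finds what its rules anticipate; anything outside the pattern list passes silently.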
AI, by contrast, can be trained on vast datasets of exam questions and expert feedback, allowing it to learn the subtle characteristics of high-quality items. This enables it to identify a wide range of flaws (see the sketch after this list), including:
- Contextual Ambiguity: Questions that are technically correct but confusing in the context of the course material.
- Subtle Bias: Language or scenarios that may unintentionally favor one group of students over another.
- Implausible Distractors: Answer choices that are so obviously incorrect that they do not effectively assess student knowledge.
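The sketch below shows how an AI-based reviewer can be asked for open-ended critique rather than checklist matches. `call_llm` is a placeholder for any chat-model client, and the prompt wording and function names are assumptions for illustration, not ExamEval's published interface.

```python
# Hypothetical sketch of open-ended AI review. `call_llm` stands in for
# any chat-model client; nothing here is ExamEval's actual prompt or API.

REVIEW_PROMPT = """\
You are an experienced health professions educator peer-reviewing a
multiple-choice question. List any item-writing flaws you find, including
problems that do not fit standard categories (contextual ambiguity, subtle
bias, implausible distractors, and so on). Name each flaw and explain it
in one sentence.

Stem: {stem}
Options: {options}
"""

def review_item(stem: str, options: list[str], call_llm) -> str:
    """Ask a language model for free-text critique instead of rule matches."""
    prompt = REVIEW_PROMPT.format(stem=stem, options="; ".join(options))
    return call_llm(prompt)
```

Because the model returns free-text findings rather than category labels, flaws like those listed above can surface even when no predefined rule exists for them.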
By going beyond a simple checklist of rules, AI-powered tools like ExamEval can provide a more holistic and insightful analysis of exam quality, helping educators create fairer and more effective assessments. This approach mirrors the comprehensive review process that expert faculty members provide when evaluating exam questions.
Common Types of Context-Specific Flaws
Practical Application Issues
In health professions education, questions must not only test knowledge but also reflect realistic clinical scenarios:
- Unrealistic clinical scenarios: Situations that wouldn't occur in actual practice
- Outdated practices: Questions reflecting superseded clinical guidelines
- Missing critical context: Scenarios lacking essential information for safe practice
Cognitive Load Problems
Some questions may be structurally sound but create unnecessary cognitive burden:
- Information overload: Stems containing excessive extraneous details
- Ambiguous terminology: Using terms with multiple meanings in the field
- Unclear relationships: Poor logical flow between stem and options