Table 3 The validation inferences validity framework

| Validity inference | Definition (assumptions)ᵃ | Examples of evidence |
| --- | --- | --- |
| Scoring | The score or written narrative from a given observation adequately captures key aspects of performance | Procedures for creating and empirically evaluating item wording, response options, scoring options; Rater selection and training |
| Generalization | The total score or synthesis of narratives reflects performance across the test domain | Sampling strategy (e.g., test blueprint) and sample size; Internal consistency reliability; Interrater reliability |
| Extrapolation | The total score or synthesis in a test setting reflects meaningful performance in a real life setting | Authenticity of context; Correlation with tests measuring similar constructs, especially in real-life context; Correlation (or lack thereof) with tests measuring different constructs; Expert-novice comparisons; Factor analysis |
| Implications/decisions | Measured performance constitutes a rational basis for meaningful decisions and actions | See Table 2, “Consequences” |
See Kane [10] and Cook et al [12] for further details and examples.
ᵃ Each of the inferences reflects assumptions about the creation and use of assessment results.
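
As an illustration of one evidence source listed under Generalization, internal consistency reliability is commonly summarized with Cronbach's alpha. The sketch below is not taken from the source; it assumes item-level scores are available as one row per examinee, and the function name `cronbach_alpha` and the sample data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate Cronbach's alpha from item-level scores.

    scores: list of rows, one per examinee; each row lists that
    examinee's score on every item (hypothetical input layout).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 examinees")

    # Sample variance of each item across examinees.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]

    # Sample variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    # alpha = k / (k - 1) * (1 - sum of item variances / total-score variance)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 4 examinees scored on 3 checklist items.
example_scores = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(example_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

A high alpha would count as only one piece of Generalization evidence; it says nothing about the Scoring, Extrapolation, or Implications inferences, which need their own evidence as listed in the table.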