IRT Model Fit Software
Item response theory (IRT) is a collection of statistical models and methods used for questionnaire
development, evaluation, and scoring. IRT models describe, in probabilistic terms, the
relationship between a person's response to a survey question and his or her standing on
the construct being measured by the survey. Such constructs include any latent
(i.e., unobservable) variable, such as depression, fatigue, or pain, for which
multiple survey items are needed to estimate a person's level. One of the
fundamental assumptions for application of IRT methods is that the IRT models fit the
data.
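As a rough illustration of what "in probabilistic terms" means for a multi-category survey item, the sketch below computes response-category probabilities under Samejima's graded response model, one common polytomous IRT model. The function name, the parameter names, and the numeric values are hypothetical and chosen only for illustration; they are not taken from any of the software discussed on this page.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- the person's level on the latent construct
    a          -- item discrimination (slope); illustrative value only
    thresholds -- ordered category thresholds b_1 < ... < b_(K-1)
    """
    # Cumulative curves P(response >= k), bracketed by 1 (lowest category
    # or higher) and 0 (above the highest category).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical four-category item answered by a person slightly above average.
probs = grm_category_probs(theta=0.5, a=1.5, thresholds=[-1.0, 0.0, 1.2])
for category, p in enumerate(probs):
    print(f"category {category}: {p:.3f}")
```

The probabilities across categories sum to one, and raising theta shifts probability toward the higher categories, which is the relationship between standing on the construct and item response that the model is meant to capture.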
A range of indices has been created for examining the fit of various IRT models to
item response data, mostly for dichotomous response data. However, no single fit index
has been universally accepted or routinely applied in educational, psychological, and
health outcomes measurement. The performance of these indices varies, depending on sample
size, model type, number of items, and properties of the items in the data. The lack of a
standardized set of fit indices that can be applied to a range of IRT models estimated
from various IRT software programs has limited the acceptability of these powerful
measurement tools for application in health outcomes research.
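To make the idea of a fit index concrete, here is a minimal sketch of a generic Pearson-type comparison of observed and model-expected category counts for a single item. It is an assumption-laden illustration only: the function name, the grouping of respondents, and the example numbers are hypothetical, and it omits the grouping rules, cell collapsing, and degrees-of-freedom calculations that published item-fit statistics specify.

```python
def pearson_item_fit(observed, expected_probs):
    """Generic Pearson chi-square for one item (illustrative sketch).

    observed[g][k]       -- number of respondents in group g choosing category k
    expected_probs[g][k] -- model-implied probability of category k in group g
    """
    x2 = 0.0
    for obs_row, prob_row in zip(observed, expected_probs):
        n_group = sum(obs_row)  # respondents in this group
        for obs, prob in zip(obs_row, prob_row):
            expected = n_group * prob  # model-expected count for this cell
            x2 += (obs - expected) ** 2 / expected
    return x2

# Hypothetical data: two respondent groups, three response categories.
observed = [[28, 16, 6], [12, 22, 16]]
expected_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
print(pearson_item_fit(observed, expected_probs))
```

Larger values indicate a bigger gap between what respondents actually chose and what the fitted model predicts. How such a value behaves depends on sample size, the number of groups, and the sparseness of the cells, which is one reason different indices can perform differently on the same data.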
The Outcomes Research Branch contracted with QualityMetric, Inc. to create a SAS program (a
compiled macro) that produces a range of indices for testing the fit of IRT models to
polytomous response data. The program reads in both IRT model parameter estimates provided
by various IRT model software programs (e.g., MULTILOG, PARSCALE, WINSTEPS) and the
individual response patterns. It returns a range of fit statistics, including extensions
of the S-X² and the S-G² tests for polytomous items and the X²* statistic. To visualize
misfit, the program provides observed-expected plots of fit. The program and its
accompanying documentation may be downloaded here: