Constructing a validity argument for the Objective Structured Assessment of Technical Skills (OSATS): a systematic review of validity evidence

Rose Hatala, David Allan Cook, Ryan Brydges, Richard Hawkins

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane’s framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane’s framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.

Original language: English (US)
Pages (from-to): 1149-1175
Number of pages: 27
Journal: Advances in Health Sciences Education
Volume: 20
Issue number: 5
DOI: 10.1007/s10459-015-9593-1
State: Published - Feb 22 2015

Fingerprint

Program Evaluation
evidence
Research
MEDLINE
Language
Health
Formative Feedback
Generalization (Psychology)
evaluation
health professionals

Keywords

  • Assessment
  • OSATS
  • Systematic review
  • Validity argument

ASJC Scopus subject areas

  • Medicine(all)
  • Education

Cite this

Constructing a validity argument for the Objective Structured Assessment of Technical Skills (OSATS) : a systematic review of validity evidence. / Hatala, Rose; Cook, David Allan; Brydges, Ryan; Hawkins, Richard.

In: Advances in Health Sciences Education, Vol. 20, No. 5, 22.02.2015, p. 1149-1175.

@article{c6bdef753f43471bac3e95eba2f8a467,
title = "Constructing a validity argument for the Objective Structured Assessment of Technical Skills (OSATS): a systematic review of validity evidence",
abstract = "In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane’s framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane’s framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.",
keywords = "Assessment, OSATS, Systematic review, Validity argument",
author = "Hatala, Rose and Cook, {David Allan} and Brydges, Ryan and Hawkins, Richard",
year = "2015",
month = "2",
day = "22",
doi = "10.1007/s10459-015-9593-1",
language = "English (US)",
volume = "20",
pages = "1149--1175",
journal = "Advances in Health Sciences Education",
issn = "1382-4996",
publisher = "Springer Netherlands",
number = "5"
}

TY - JOUR
T1 - Constructing a validity argument for the Objective Structured Assessment of Technical Skills (OSATS)
T2 - a systematic review of validity evidence
AU - Hatala, Rose
AU - Cook, David Allan
AU - Brydges, Ryan
AU - Hawkins, Richard
PY - 2015/2/22
Y1 - 2015/2/22

N2 - In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane’s framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane’s framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.

AB - In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane’s framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected original research articles in any language evaluating the OSATS as an assessment tool for any health professional. We iteratively and collaboratively extracted validity evidence from included articles to construct and evaluate the validity argument for varied uses of the OSATS. Twenty-nine articles met the inclusion criteria, all focussed on surgical technical skills assessment. We identified three intended uses for the OSATS, namely formative feedback, high-stakes assessment and program evaluation. Following Kane’s framework, four inferences in the validity argument were examined (scoring, generalization, extrapolation, decision). For formative feedback and high-stakes assessment, there was reasonable evidence for scoring and extrapolation. However, for high-stakes assessment there was a dearth of evidence for generalization aside from inter-rater reliability data and an absence of evidence linking multi-station OSATS scores to performance in real clinical settings. For program evaluation, the OSATS validity argument was supported by reasonable generalization and extrapolation evidence. There was a complete lack of evidence regarding implications and decisions based on OSATS scores. In general, validity evidence supported the use of the OSATS for formative feedback. Research to provide support for decisions based on OSATS scores is required if the OSATS is to be used for higher-stakes decisions and program evaluation.

KW - Assessment
KW - OSATS
KW - Systematic review
KW - Validity argument
UR - http://www.scopus.com/inward/record.url?scp=84947019597&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947019597&partnerID=8YFLogxK
U2 - 10.1007/s10459-015-9593-1
DO - 10.1007/s10459-015-9593-1
M3 - Article
VL - 20
SP - 1149
EP - 1175
JO - Advances in Health Sciences Education
JF - Advances in Health Sciences Education
SN - 1382-4996
IS - 5
ER -