Crowdsourcing Assessment of Surgeon Dissection of Renal Artery and Vein during Robotic Partial Nephrectomy

A Novel Approach for Quantitative Assessment of Surgical Performance

Mary K. Powers, Aaron Boonjindasup, Michael Pinsky, Philip Dorsey, Michael Maddox, Li Ming Su, Matthew Gettman, Chandru P. Sundaram, Erik P. Castle, Jason Y. Lee, Benjamin R. Lee

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Introduction: We sought to describe a crowdsourcing methodology for obtaining quantitative performance ratings of surgeons performing renal artery and vein dissection during robotic partial nephrectomy (RPN), and to compare the technical-performance assessments obtained from crowdworkers with those of surgical content experts (CE). Our hypothesis was that the crowd can score performances of renal hilar dissection comparably to surgical CE using the Global Evaluative Assessment of Robotic Skills (GEARS).

Methods: A group of resident and attending robotic surgeons submitted a total of 14 video clips of the hilar dissection portion of RPN. Both crowdworkers and CE rated the videos for technical skill using GEARS, with a minimum of 3 CE and 30 Amazon Mechanical Turk crowdworkers evaluating each video.

Results: Ratings of all videos from all CE were received within 13 days, whereas 548 GEARS ratings from crowdworkers were received within 11.5 hours. Even though CE completed a training module, the internal consistency of CE GEARS ratings across videos remained low (ICC = 0.38). Despite this, crowdworker GEARS ratings were highly correlated with CE ratings at both the video level (R = 0.82, p < 0.001) and the surgeon level (R = 0.84, p < 0.001). Similarly, for the unique surgery-specific assessment question, crowdworker ratings of the renal artery dissection were highly correlated with expert assessments (R = 0.83, p < 0.001).

Conclusions: Crowdsourced performance ratings may serve as an alternative and/or adjunct to surgical experts' ratings and would provide a rapid, scalable solution for triaging technical skills.
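
For readers who want to see the shape of the agreement analysis the abstract reports, here is a minimal Python sketch (NumPy/SciPy) of the video-level comparison: each video's GEARS ratings are averaged within the crowd group and the expert group, and the two sets of per-video means are compared with a Pearson correlation. The rater counts mirror the study design, but every rating value below is a synthetic placeholder generated for illustration, not data from the study.

import numpy as np
from scipy.stats import pearsonr

# Synthetic illustration only: rater counts follow the study design
# (14 videos, >= 30 crowdworkers and >= 3 content experts per video),
# but all rating values are simulated, not the study's data.
rng = np.random.default_rng(0)
n_videos, n_crowd, n_experts = 14, 30, 3

# Latent per-video skill on an arbitrary GEARS-like scale, plus rater noise.
true_skill = rng.uniform(10.0, 25.0, n_videos)
crowd_ratings = true_skill[:, None] + rng.normal(0.0, 2.0, (n_videos, n_crowd))
expert_ratings = true_skill[:, None] + rng.normal(0.0, 2.5, (n_videos, n_experts))

# Collapse to one mean GEARS score per video per rater group.
crowd_means = crowd_ratings.mean(axis=1)
expert_means = expert_ratings.mean(axis=1)

# Video-level agreement between crowd and expert assessments.
r, p = pearsonr(crowd_means, expert_means)
print(f"video-level correlation: R = {r:.2f}, p = {p:.3g}")

The surgeon-level analysis reported in the abstract would proceed the same way after first pooling each surgeon's videos into a single mean score per rater group.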

Original language: English (US)
Pages (from-to): 447-452
Number of pages: 6
Journal: Journal of Endourology
Volume: 30
Issue number: 4
DOI: 10.1089/end.2015.0665
State: Published - Apr 1 2016

Fingerprint

  • Crowdsourcing
  • Renal Veins
  • Robotics
  • Renal Artery
  • Nephrectomy
  • Dissection
  • Surgeons
  • Triage
  • Surgical Instruments

ASJC Scopus subject areas

  • Urology

Cite this

Crowdsourcing Assessment of Surgeon Dissection of Renal Artery and Vein during Robotic Partial Nephrectomy: A Novel Approach for Quantitative Assessment of Surgical Performance. / Powers, Mary K.; Boonjindasup, Aaron; Pinsky, Michael; Dorsey, Philip; Maddox, Michael; Su, Li Ming; Gettman, Matthew; Sundaram, Chandru P.; Castle, Erik P.; Lee, Jason Y.; Lee, Benjamin R.

In: Journal of Endourology, Vol. 30, No. 4, 01.04.2016, p. 447-452. DOI: 10.1089/end.2015.0665.
