Sample size calculations in surgery: Are they done correctly?

Melinda A. Maggard; Jessica B. O'Connell; Jerome H. Liu; David A. Etzioni; Clifford Y. Ko

doi:10.1067/msy.2003.235

Colon and Rectal Surgery

Research output: Contribution to journal › Article › peer-review

45 Scopus citations

Abstract

Background. Randomized controlled trials (RCTs) are considered the gold standard for evidence-based clinical research, but prior work has suggested that there may be poor reporting of sample sizes in the surgical literature. Sample size calculations are essential for planning a study to minimize both type I and type II errors. We hypothesized that sample size calculations may not be performed consistently in surgery studies and, therefore, many studies may be "underpowered." To address this issue, we reviewed RCTs published in the surgical literature to determine how often sample size calculations were reported and to analyze each study's ability to detect varying degrees of differences in outcomes. Methods. A comprehensive MEDLINE search identified RCTs published in Annals of Surgery, Archives of Surgery, and Surgery between 1999 and 2002. Each study was evaluated by two independent reviewers. Sample size calculations were performed to determine whether they had 80% power to detect differences between treatment groups of 50% (large) and 20% (small), with one-sided test, alpha = 0.05. For the underpowered studies, the degree to which sample size would need to be increased was determined. Results. One hundred twenty-seven RCT articles were identified; of these, 48 (38%) reported sample size calculations. Eighty-six (68%) studies reported positive treatment effect, whereas 41 (32%) found negative results. Sixty-three (50%) of the studies were appropriately powered to detect a 50% effect change, whereas 24 (19%) had the power to detect a 20% difference. Of the studies that were underpowered, more than half needed to increase sample size by more than 10-fold. Conclusions. The reporting of sample size calculations was not provided in more than 60% of recently published surgical RCTs. Moreover, only half of studies had sample sizes appropriate to detect large differences between treatment groups.
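As a rough illustration of the kind of calculation the Methods describe, the sketch below estimates the sample size per group needed for 80% power with a one-sided test at alpha = 0.05. The abstract does not state which formula the authors used, what baseline event rates they assumed, or whether the 50% and 20% differences are relative or absolute. The normal-approximation formula for two independent proportions, the hypothetical control-group rate of 0.30, the reading of the differences as relative reductions, and the function name `n_per_group` are all assumptions made here for illustration only.

```python
import math
from statistics import NormalDist


def n_per_group(p_control, relative_change, alpha=0.05, power=0.80):
    """Approximate sample size per arm for comparing two proportions.

    Assumptions (not taken from the abstract): a one-sided z-test with the
    normal approximation, equal group sizes, and a treatment effect expressed
    as a relative reduction from the control-group event rate.
    """
    # Event rate in the treatment arm after the assumed relative reduction.
    p_treat = p_control * (1.0 - relative_change)

    # Standard normal quantiles for the one-sided significance level and power.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the binomial variances of the two arms.
    variance = p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat)

    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2
    return math.ceil(n)


# Hypothetical control-group event rate of 30%.
n_large = n_per_group(0.30, 0.50)  # "large" 50% difference
n_small = n_per_group(0.30, 0.20)  # "small" 20% difference
print(n_large, n_small)
```

Under these assumptions the smaller difference requires roughly seven times as many patients per group as the larger one, which is consistent with the abstract's observation that far fewer studies were powered to detect a 20% difference than a 50% difference.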

Original language	English (US)
Pages (from-to)	275-279
Number of pages	5
Journal	Surgery
Volume	134
Issue number	2
DOIs	https://doi.org/10.1067/msy.2003.235
State	Published - Aug 1 2003

ASJC Scopus subject areas

Surgery

Cite this

@article{e79819ace2ce448ab52b9303bf2e7967,

title = "Sample size calculations in surgery: Are they done correctly?",

abstract = "Background. Randomized controlled trials (RCTs) are considered the gold standard for evidence-based clinical research, but prior work has suggested that there may be poor reporting of sample sizes in the surgical literature. Sample size calculations are essential for planning a study to minimize both type I and type II errors. We hypothesized that sample size calculations may not be performed consistently in surgery studies and, therefore, many studies may be {"}underpowered.{"} To address this issue, we reviewed RCTs published in the surgical literature to determine how often sample size calculations were reported and to analyze each study's ability to detect varying degrees of differences in outcomes. Methods. A comprehensive MEDLINE search identified RCTs published in Annals of Surgery, Archives of Surgery, and Surgery between 1999 and 2002. Each study was evaluated by two independent reviewers. Sample size calculations were performed to determine whether they had 80% power to detect differences between treatment groups of 50% (large) and 20% (small), with one-sided test, alpha = 0.05. For the underpowered studies, the degree to which sample size would need to be increased was determined. Results. One hundred twenty-seven RCT articles were identified; of these, 48 (38%) reported sample size calculations. Eighty-six (68%) studies reported positive treatment effect, whereas 41 (32%) found negative results. Sixty-three (50%) of the studies were appropriately powered to detect a 50% effect change, whereas 24 (19%) had the power to detect a 20% difference. Of the studies that were underpowered, more than half needed to increase sample size by more than 10-fold. Conclusions. The reporting of sample size calculations was not provided in more than 60% of recently published surgical RCTs. Moreover, only half of studies had sample sizes appropriate to detect large differences between treatment groups.",

author = "Maggard, {Melinda A.} and O'Connell, {Jessica B.} and Liu, {Jerome H.} and Etzioni, {David A.} and Ko, {Clifford Y.}",

note = "Funding Information: Support for this study was provided by the Robert Wood Johnson Clinical Scholars Program, UCLA, Los Angeles, Calif.",

year = "2003",

month = aug,

day = "1",

doi = "10.1067/msy.2003.235",

language = "English (US)",

volume = "134",

pages = "275--279",

journal = "Surgery",

issn = "0039-6060",

publisher = "Mosby Inc.",

number = "2",

}

TY - JOUR

T1 - Sample size calculations in surgery

T2 - Are they done correctly?

AU - Maggard, Melinda A.

AU - O'Connell, Jessica B.

AU - Liu, Jerome H.

AU - Etzioni, David A.

AU - Ko, Clifford Y.

N1 - Funding Information: Support for this study was provided by the Robert Wood Johnson Clinical Scholars Program, UCLA, Los Angeles, Calif.

PY - 2003/8/1

Y1 - 2003/8/1

AB - Background. Randomized controlled trials (RCTs) are considered the gold standard for evidence-based clinical research, but prior work has suggested that there may be poor reporting of sample sizes in the surgical literature. Sample size calculations are essential for planning a study to minimize both type I and type II errors. We hypothesized that sample size calculations may not be performed consistently in surgery studies and, therefore, many studies may be "underpowered." To address this issue, we reviewed RCTs published in the surgical literature to determine how often sample size calculations were reported and to analyze each study's ability to detect varying degrees of differences in outcomes. Methods. A comprehensive MEDLINE search identified RCTs published in Annals of Surgery, Archives of Surgery, and Surgery between 1999 and 2002. Each study was evaluated by two independent reviewers. Sample size calculations were performed to determine whether they had 80% power to detect differences between treatment groups of 50% (large) and 20% (small), with one-sided test, alpha = 0.05. For the underpowered studies, the degree to which sample size would need to be increased was determined. Results. One hundred twenty-seven RCT articles were identified; of these, 48 (38%) reported sample size calculations. Eighty-six (68%) studies reported positive treatment effect, whereas 41 (32%) found negative results. Sixty-three (50%) of the studies were appropriately powered to detect a 50% effect change, whereas 24 (19%) had the power to detect a 20% difference. Of the studies that were underpowered, more than half needed to increase sample size by more than 10-fold. Conclusions. The reporting of sample size calculations was not provided in more than 60% of recently published surgical RCTs. Moreover, only half of studies had sample sizes appropriate to detect large differences between treatment groups.

UR - http://www.scopus.com/inward/record.url?scp=0042829184&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0042829184&partnerID=8YFLogxK

U2 - 10.1067/msy.2003.235

DO - 10.1067/msy.2003.235

M3 - Article

C2 - 12947329

AN - SCOPUS:0042829184

SN - 0039-6060

VL - 134

SP - 275

EP - 279

JO - Surgery

JF - Surgery

IS - 2

ER -
