Q- and A-learning methods for estimating optimal dynamic treatment regimes

Phillip Schulte, Anastasios A. Tsiatis, Eric B. Laber, Marie Davidian

Research output: Contribution to journal › Article

61 Citations (Scopus)

Abstract

In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.

Original language: English (US)
Pages (from-to): 640-661
Number of pages: 22
Journal: Statistical Science
Volume: 29
Issue number: 4
DOIs: 10.1214/13-STS450
State: Published - 2014
Externally published: Yes

Fingerprint

Decision Rules
Baseline
Series
Learning
Learning methods
Physicians
Sequential decisions

Keywords

  • Advantage learning
  • Bias-variance trade-off
  • Model misspecification
  • Personalized medicine
  • Potential outcomes
  • Sequential decision-making

ASJC Scopus subject areas

  • Statistics and Probability
  • Mathematics(all)
  • Statistics, Probability and Uncertainty

Cite this

Q- and A-learning methods for estimating optimal dynamic treatment regimes. / Schulte, Phillip; Tsiatis, Anastasios A.; Laber, Eric B.; Davidian, Marie.

In: Statistical Science, Vol. 29, No. 4, 2014, p. 640-661.

Schulte, Phillip ; Tsiatis, Anastasios A. ; Laber, Eric B. ; Davidian, Marie. / Q- and A-learning methods for estimating optimal dynamic treatment regimes. In: Statistical Science. 2014 ; Vol. 29, No. 4. pp. 640-661.
@article{8bb968c90a394995a5576731dcb337fb,
title = "Q- and A-learning methods for estimating optimal dynamic treatment regimes",
abstract = "In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.",
keywords = "Advantage learning, Bias-variance trade-off, Model misspecification, Personalized medicine, Potential outcomes, Sequential decision-making",
author = "Phillip Schulte and Tsiatis, {Anastasios A.} and Laber, {Eric B.} and Marie Davidian",
year = "2014",
doi = "10.1214/13-STS450",
language = "English (US)",
volume = "29",
pages = "640--661",
journal = "Statistical Science",
issn = "0883-4237",
publisher = "Institute of Mathematical Statistics",
number = "4",
}

TY - JOUR
T1 - Q- and A-learning methods for estimating optimal dynamic treatment regimes
AU - Schulte, Phillip
AU - Tsiatis, Anastasios A.
AU - Laber, Eric B.
AU - Davidian, Marie
PY - 2014
Y1 - 2014
N2 - In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.

KW - Advantage learning
KW - Bias-variance trade-off
KW - Model misspecification
KW - Personalized medicine
KW - Potential outcomes
KW - Sequential decision-making
UR - http://www.scopus.com/inward/record.url?scp=84921477485&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921477485&partnerID=8YFLogxK
U2 - 10.1214/13-STS450
DO - 10.1214/13-STS450
M3 - Article
AN - SCOPUS:84921477485
VL - 29
SP - 640
EP - 661
JO - Statistical Science
JF - Statistical Science
SN - 0883-4237
IS - 4
ER -