Does drug-target have a likeness?

H. Xu, Y. Fang, Lixia Yao, Y. Chen, Xin Chen

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Objective: The discovery of new targets that are sufficiently robust to yield marketable therapeutics is an enormous challenge. Conventional target identification approaches are disease-dependent, which require heavy experimental workload and comprehensive domain knowledge. In this work, we propose that a disease-independent property of proteins, "drug-target likeness", can be explored to facilitate the genomic scale target screening in the post-genomic age. Methods: A Support Vector Machine (SVM) classifier was trained to recognize target and non-target protein sequences compiled from the Therapeutic Target Database, DrugBank, and PFam. Protein sequences are encoded by their composition, transition and distribution features of residues and Gaussian kernel function was used in SVM classification. Results: SVM with a fine-tuned kernel width records 66.4 ± 5.1% of sensitivity and 97.2 ± 0.6% of specificity, corresponding to an overall target prediction accuracy of 94.4 ± 0.8%. Conclusions: Though primitive, these results suggest that, similar to the "drug likeness" for small chemicals, their binding partners, drug targets, also display shared features which are reflected in their sequences and can be captured by statistical learning approaches. Further research on how to accurately and interpretably measure the likeness of protein being a drug target is promising. Inspired by the progress of "drug likeness" studies, advances in protein descriptors, statistical learning algorithms and more comprehensive and accurate gold-standard data set from disease biology research may help to further define the "drug-target likeness" property of proteins.
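The three figures in the Results sentence are linked by simple arithmetic, which may explain why the overall accuracy sits so close to the specificity. The sketch below is illustrative only: it assumes the reported accuracy is the class-weighted mean of sensitivity and specificity on the same test data, and it treats the reported means as exact, ignoring the ± spreads. The function names and the example counts are hypothetical; the abstract does not state the data set sizes.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true targets recovered
    specificity = tn / (tn + fp)          # fraction of non-targets rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def implied_target_fraction(sensitivity, specificity, accuracy):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p,
    the fraction of targets among the evaluated proteins."""
    return (specificity - accuracy) / (specificity - sensitivity)


# Mean values quoted in the abstract (as fractions).
p = implied_target_fraction(0.664, 0.972, 0.944)
print(f"implied share of targets in the test data: {p:.3f}")

# Hypothetical counts (NOT from the paper) that reproduce the quoted rates:
# 1000 targets and 10000 non-targets.
sens, spec, acc = rates(tp=664, fn=336, tn=9720, fp=280)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Under these assumptions targets would make up roughly one in eleven of the evaluated proteins, so the accuracy is dominated by the non-target class. Because the quoted values are averages with spreads, this back-calculation is approximate at best.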

Original language: English (US)
Pages (from-to): 360-366
Number of pages: 7
Journal: Methods of Information in Medicine
Volume: 46
Issue number: 3
DOI: https://doi.org/10.1160/ME0425
State: Published - May 30 2007
Externally published: Yes

Fingerprint

Pharmaceutical Preparations
Proteins
Learning
Workload
Research
Gold
Databases
Therapeutics
Support Vector Machine
Datasets

Keywords

  • Drug target
  • Human genome
  • Statistical learning
  • Support vector machine

ASJC Scopus subject areas

  • Health Informatics
  • Advanced and Specialized Nursing
  • Health Information Management

Cite this

Does drug-target have a likeness? / Xu, H.; Fang, Y.; Yao, Lixia; Chen, Y.; Chen, Xin.

In: Methods of Information in Medicine, Vol. 46, No. 3, 30.05.2007, p. 360-366.

Xu, H, Fang, Y, Yao, L, Chen, Y & Chen, X 2007, 'Does drug-target have a likeness?', Methods of Information in Medicine, vol. 46, no. 3, pp. 360-366. https://doi.org/10.1160/ME0425
Xu, H. ; Fang, Y. ; Yao, Lixia ; Chen, Y. ; Chen, Xin. / Does drug-target have a likeness? In: Methods of Information in Medicine. 2007 ; Vol. 46, No. 3. pp. 360-366.
@article{d50047b4e9d84735a73d6d12d2b60ba1,
title = "Does drug-target have a likeness?",
abstract = "Objective: The discovery of new targets that are sufficiently robust to yield marketable therapeutics is an enormous challenge. Conventional target identification approaches are disease-dependent, which require heavy experimental workload and comprehensive domain knowledge. In this work, we propose that a disease-independent property of proteins, {"}drug-target likeness{"}, can be explored to facilitate the genomic scale target screening in the post-genomic age. Methods: A Support Vector Machine (SVM) classifier was trained to recognize target and non-target protein sequences compiled from the Therapeutic Target Database, DrugBank, and PFam. Protein sequences are encoded by their composition, transition and distribution features of residues and Gaussian kernel function was used in SVM classification. Results: SVM with a fine-tuned kernel width records 66.4 ± 5.1{\%} of sensitivity and 97.2 ± 0.6{\%} of specificity, corresponding to an overall target prediction accuracy of 94.4 ± 0.8{\%}. Conclusions: Though primitive, these results suggest that, similar to the {"}drug likeness{"} for small chemicals, their binding partners, drug targets, also display shared features which are reflected in their sequences and can be captured by statistical learning approaches. Further research on how to accurately and interpretably measure the likeness of protein being a drug target is promising. Inspired by the progress of {"}drug likeness{"} studies, advances in protein descriptors, statistical learning algorithms and more comprehensive and accurate gold-standard data set from disease biology research may help to further define the {"}drug-target likeness{"} property of proteins.",
keywords = "Drug target, Human genome, Statistical learning, Support vector machine",
author = "H. Xu and Y. Fang and Lixia Yao and Y. Chen and Xin Chen",
year = "2007",
month = "5",
day = "30",
doi = "10.1160/ME0425",
language = "English (US)",
volume = "46",
pages = "360--366",
journal = "Methods of Information in Medicine",
issn = "0026-1270",
publisher = "Schattauer GmbH",
number = "3",

}

TY - JOUR

T1 - Does drug-target have a likeness?

AU - Xu, H.

AU - Fang, Y.

AU - Yao, Lixia

AU - Chen, Y.

AU - Chen, Xin

PY - 2007/5/30

Y1 - 2007/5/30

AB - Objective: The discovery of new targets that are sufficiently robust to yield marketable therapeutics is an enormous challenge. Conventional target identification approaches are disease-dependent, which require heavy experimental workload and comprehensive domain knowledge. In this work, we propose that a disease-independent property of proteins, "drug-target likeness", can be explored to facilitate the genomic scale target screening in the post-genomic age. Methods: A Support Vector Machine (SVM) classifier was trained to recognize target and non-target protein sequences compiled from the Therapeutic Target Database, DrugBank, and PFam. Protein sequences are encoded by their composition, transition and distribution features of residues and Gaussian kernel function was used in SVM classification. Results: SVM with a fine-tuned kernel width records 66.4 ± 5.1% of sensitivity and 97.2 ± 0.6% of specificity, corresponding to an overall target prediction accuracy of 94.4 ± 0.8%. Conclusions: Though primitive, these results suggest that, similar to the "drug likeness" for small chemicals, their binding partners, drug targets, also display shared features which are reflected in their sequences and can be captured by statistical learning approaches. Further research on how to accurately and interpretably measure the likeness of protein being a drug target is promising. Inspired by the progress of "drug likeness" studies, advances in protein descriptors, statistical learning algorithms and more comprehensive and accurate gold-standard data set from disease biology research may help to further define the "drug-target likeness" property of proteins.

KW - Drug target

KW - Human genome

KW - Statistical learning

KW - Support vector machine

UR - http://www.scopus.com/inward/record.url?scp=34249328535&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34249328535&partnerID=8YFLogxK

U2 - 10.1160/ME0425

DO - 10.1160/ME0425

M3 - Article

C2 - 17492123

AN - SCOPUS:34249328535

VL - 46

SP - 360

EP - 366

JO - Methods of Information in Medicine

JF - Methods of Information in Medicine

SN - 0026-1270

IS - 3

ER -