A rough set theory approach to the analysis of gene expression profiles

Joachim Petit, Nathalie Meurice, José Luis Medina-Franco, Gerald M. Maggiora

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citation


Rough set theory (RST) is a set-based method that is well suited for dealing with a wide variety of discrete data. The goal of this preliminary study is to evaluate the potential suitability of RST for predicting biological endpoints in cells from their associated gene expression profiles. Such studies are the basis for identifying potential new targets that ultimately will be integrated with chemical information in drug-discovery research. In the present work, a small literature dataset was used to assess whether the gene-expression profiles induced by 30 well-known drugs can be used to predict whether human hepatoma HepG2 cells exhibit signs of phospholipidosis after treatment with the drugs. The data in this study are cast in the form of a decision table (DT), whose rows are associated with the 30 drugs and whose columns are associated with the drug-induced expression levels of 17 genes, called condition attributes in RST, plus a column that is associated with the single decision attribute that characterizes whether or not cells exhibit drug-induced phospholipidosis. The gene expression levels provide a means for partitioning the drugs into equivalence classes, called indiscernibility classes in RST, such that no drug in a given class can be distinguished from any other drug in that class on the basis of its drug-induced gene expression levels. One of the strengths of RST is that it provides a systematic, mathematically rigorous method for removing superfluous information. The remaining relevant information can then be expressed in terms of simple, linguistic rules that significantly enhance communication among scientists, especially those not conversant with RST. In this work, the RST approach allowed easy identification of the strongest relationships existing between drug-induced gene-expression profiles and the occurrence or nonoccurrence of phospholipidosis in the HepG2 cells.
This study suggests that RST may be an efficient and effective tool for analyzing gene-expression levels in small datasets. Future studies will examine the suitability of the RST approach to larger and more complex datasets.
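The decision-table and indiscernibility-class ideas described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the six drug names, the three genes `g1` to `g3`, their discretized levels, and the `pl` decision values are invented toy data, not the 30-drug, 17-gene dataset of the chapter, and the helper functions are hypothetical rather than the authors' implementation. The lower and upper approximations it computes are the standard rough-set constructs for the drugs that certainly or possibly belong to the phospholipidosis-positive set.

```python
from collections import defaultdict

# Hypothetical toy decision table (NOT data from the study).
# Rows are drugs; g1..g3 are discretized expression levels (condition
# attributes); "pl" is the decision attribute (phospholipidosis seen or not).
TABLE = {
    "drug_A": {"g1": "up",   "g2": "up",   "g3": "flat", "pl": True},
    "drug_B": {"g1": "up",   "g2": "up",   "g3": "flat", "pl": True},
    "drug_C": {"g1": "flat", "g2": "up",   "g3": "flat", "pl": False},
    "drug_D": {"g1": "flat", "g2": "down", "g3": "flat", "pl": False},
    "drug_E": {"g1": "up",   "g2": "down", "g3": "flat", "pl": True},
    "drug_F": {"g1": "up",   "g2": "down", "g3": "flat", "pl": False},
}
GENES = ["g1", "g2", "g3"]


def indiscernibility_classes(table, attrs):
    """Partition the drugs: two drugs share a class when they agree on attrs."""
    groups = defaultdict(set)
    for name, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(name)
    return {frozenset(members) for members in groups.values()}


def approximations(table, attrs, decision, value):
    """Return (lower, upper) approximations of the set with decision == value."""
    target = {name for name, row in table.items() if row[decision] == value}
    lower, upper = set(), set()
    for cls in indiscernibility_classes(table, attrs):
        if cls <= target:   # whole class inside the target: certain members
            lower |= cls
        if cls & target:    # class overlaps the target: possible members
            upper |= cls
    return lower, upper


classes = indiscernibility_classes(TABLE, GENES)
lower, upper = approximations(TABLE, GENES, "pl", True)

# g3 is constant in this toy table, so dropping it leaves the partition
# unchanged: it is superfluous here. Dropping g2 as well does change it.
g3_is_superfluous = indiscernibility_classes(TABLE, ["g1", "g2"]) == classes
g2_is_superfluous = indiscernibility_classes(TABLE, ["g1"]) == classes

print(sorted(sorted(c) for c in classes))
print("certainly positive:", sorted(lower))
print("possibly positive:", sorted(upper))
```

In this toy table, drug_E and drug_F have identical profiles but opposite outcomes, so they fall in the upper but not the lower approximation. A class that lies wholly inside the lower approximation can be read as a simple rule, for example "if g1 is up and g2 is up, then phospholipidosis is observed".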

Original language: English (US)
Title of host publication: Chemoinformatics for Drug Discovery
Number of pages: 33
ISBN (Electronic): 9781118742785
ISBN (Print): 9781118139103
State: Published - Nov 15 2013


  • Association rules
  • Drug discovery
  • Drug-induced phospholipidosis
  • Gene expression profiles
  • Rough set theory

ASJC Scopus subject areas

  • Chemistry (all)

