CNVpytor: a tool for copy number variation detection and analysis from read depth and allele imbalance in whole-genome sequencing

Milovan Suvakov, Arijit Panda, Colin Diesh, Ian Holmes, Alexej Abyzov

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Detecting copy number variations (CNVs) and copy number alterations (CNAs) based on whole-genome sequencing data is important for personalized genomics and treatment. CNVnator is one of the most popular tools for CNV/CNA discovery and analysis based on read depth. Findings: Herein, we present CNVpytor, an extension of CNVnator developed in Python. CNVpytor inherits the reimplemented core engine of its predecessor and extends its visualization, modularization, performance, and functionality. Additionally, CNVpytor uses B-allele frequency likelihood information from single-nucleotide polymorphism and small indel data as additional evidence for CNVs/CNAs and as primary information for copy number-neutral losses of heterozygosity. Conclusions: CNVpytor is significantly faster than CNVnator, particularly for parsing alignment files (2-20 times faster), and produces intermediate files that are 20-50 times smaller. CNV calls can be filtered using several criteria, annotated, and merged over multiple samples. Its modular architecture allows it to be used in shared and cloud environments such as Google Colab and Jupyter Notebook. Data can be exported into JBrowse, while a lightweight plugin version of CNVpytor for JBrowse enables nearly instant and GUI-assisted analysis of CNVs by any user. The CNVpytor release and source code are available on GitHub at https://github.com/abyzovlab/CNVpytor under the MIT license.
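The two signals named in the abstract, read depth and B-allele frequency, can be illustrated with a minimal Python sketch. This is not CNVpytor's implementation or API: the function names `copy_number_from_depth` and `b_allele_frequency`, the median-based normalization, and the toy depth values are all assumptions made here for illustration only.

```python
from statistics import median


def copy_number_from_depth(bin_depths, ploidy=2):
    """Scale each bin's read depth to the median bin depth.

    Assumption for this sketch: the median bin represents the normal
    (diploid) state, so a bin at half the median depth maps to copy number 1.
    """
    baseline = median(bin_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    return [ploidy * depth / baseline for depth in bin_depths]


def b_allele_frequency(ref_count, alt_count):
    """Fraction of reads at a variant site that support the alternate (B) allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_count / total


# Hypothetical per-bin read depths: two bins look like a heterozygous
# deletion (about half the typical depth) and one like a duplication.
depths = [30, 31, 29, 15, 14, 30, 45]
copy_numbers = copy_number_from_depth(depths)

# A heterozygous site in a normal region is expected near 0.5; a shift
# away from 0.5 is the kind of allele imbalance used as additional evidence.
baf_balanced = b_allele_frequency(ref_count=15, alt_count=15)
baf_imbalanced = b_allele_frequency(ref_count=20, alt_count=10)
```

In this toy setting, a copy number-neutral loss of heterozygosity would show an unchanged depth ratio together with a B-allele frequency shifted away from 0.5, which is why the abstract treats allele frequency as the primary signal for that case.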

Original language: English (US)
Article number: giab074
Journal: GigaScience
Volume: 10
Issue number: 11
State: Published - Nov 1 2021

Keywords

  • Python
  • copy number alterations
  • copy number variations
  • whole-genome sequencing

ASJC Scopus subject areas

  • General Medicine
