Robust Recurrent CNV Detection in the Presence of Inter-Subject Variability

Mustafa Alshawaqfeh, Ahmad Al Kawam, Erchin Serpedin, Aniruddha Datta

Research output: Contribution to journal (Article)


The study of recurrent copy number variations (CNVs) plays an important role in understanding the onset and evolution of complex diseases such as cancer. Array-based comparative genomic hybridization (aCGH) is a widely used microarray-based technology for identifying CNVs. However, because of high noise levels and inter-sample variability, detecting recurrent CNVs from aCGH data remains challenging. This paper proposes a novel method for identifying recurrent CNVs. In the proposed method, the noisy aCGH data is modeled as the superposition of three matrices: a full-rank matrix of weighted piecewise generating signals accounting for the clean aCGH data, a Gaussian noise matrix modeling the inherent experimental errors and other sources of error, and a sparse matrix capturing the sparse inter-sample (sample-specific) variations. We demonstrate that the method accurately separates recurrent CNVs from sample-specific variations and noise in both simulated and real data. The proposed method produced more accurate results than current state-of-the-art methods for recurrent CNV detection and was robust to noise and sample-specific variations.
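
To make the three-matrix model concrete, the following is a minimal sketch of how a toy data matrix could be assembled as clean signal plus sparse sample-specific variation plus Gaussian noise. It is not the authors' implementation and performs no detection. The function name `simulate_acgh`, all parameter values, the probe ranges, and the use of a single generating signal are assumptions chosen for illustration only.

```python
import random


def simulate_acgh(n_samples=6, n_probes=40, noise_sd=0.1, seed=0):
    """Build a toy matrix as clean + sparse + noise (hypothetical helper)."""
    rng = random.Random(seed)

    # One piecewise-constant generating signal: a recurrent gain on probes 10..19.
    # (Assumption: a single signal, a simplification of the model in the abstract.)
    generating = [1.0 if 10 <= j < 20 else 0.0 for j in range(n_probes)]

    # Clean part: each sample carries the generating signal with its own weight.
    weights = [rng.uniform(0.5, 1.0) for _ in range(n_samples)]
    clean = [[w * g for g in generating] for w in weights]

    # Sparse part: one short sample-specific aberration in a single sample.
    sparse = [[0.0] * n_probes for _ in range(n_samples)]
    for j in range(30, 33):
        sparse[2][j] = -1.0

    # Noise part: independent zero-mean Gaussian noise at every probe.
    noise = [[rng.gauss(0.0, noise_sd) for _ in range(n_probes)]
             for _ in range(n_samples)]

    # Observed data: element-wise superposition of the three matrices.
    observed = [[clean[i][j] + sparse[i][j] + noise[i][j]
                 for j in range(n_probes)]
                for i in range(n_samples)]
    return observed, clean, sparse, noise


observed, clean, sparse, noise = simulate_acgh()
```

In this toy setting, the recurrent region is the one shared by every row of `clean`, while the nonzero entries of `sparse` appear in only one sample. A detection method of the kind described would aim to recover the former while discounting the latter and the noise.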

Original language: English
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication status: Accepted/In press - 1 Jan 2018

Keywords



  • Bioinformatics
  • Copy number variation
  • Diseases
  • Fused lasso
  • Genomics
  • Hidden Markov models
  • Inter-subject variability
  • Mathematical model
  • Probes
  • Recurrent copy number variation
  • Sparse matrices

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
