SparseNCA: Sparse Network Component Analysis for Recovering Transcription Factor Activities with Incomplete Prior Information

Amina Noor, Aitzaz Ahmad, Erchin Serpedin

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Network component analysis (NCA) is an important method for inferring transcriptional regulatory networks (TRNs) and recovering transcription factor activities (TFAs) from gene expression data and prior information about the connectivity matrix. Currently available algorithms depend crucially on the completeness of this prior information. However, inaccuracies in the measurement process may leave the available knowledge about the connectivity matrix incomplete. Hence, computationally efficient algorithms are needed that can cope with this possible incompleteness. We present a sparse network component analysis algorithm (sparseNCA), which accounts for the incompleteness when estimating TRNs by imposing an additional sparsity constraint using the ℓ1 norm, which results in greater estimation accuracy. To improve computational efficiency, an iterative re-weighted ℓ2 method is proposed for the NCA problem; it not only promotes sparsity but is also hundreds of times faster than the ℓ1-norm-based solution. The performance of sparseNCA is rigorously compared to that of FastNCA and NINCA using synthetic as well as real data. It is shown that sparseNCA outperforms the existing state-of-the-art algorithms in both estimation accuracy and consistency, with the added advantage of low computational complexity. The advantage of sparseNCA over its predecessors is particularly pronounced when the prior information about the sparsity of the network is incomplete. Subnetwork analysis performed on the E. coli data reaffirms the superior consistency of the proposed algorithm.
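
The abstract mentions replacing an ℓ1 penalty with an iterative re-weighted ℓ2 scheme. The sketch below illustrates that general idea on a plain sparse least-squares problem; it is not the authors' sparseNCA algorithm, and the names `solve`, `reweighted_l2`, `lam`, and `eps`, as well as the toy matrix, are assumptions made for illustration. Each pass solves a ridge-like system whose per-coefficient penalty is the inverse of the previous magnitude, so small coefficients are pushed toward zero.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def reweighted_l2(A, y, lam=0.1, eps=1e-6, iters=50):
    """Sparsity-promoting least squares via repeated weighted ridge solves.

    Illustrative only: each step solves (A^T A + lam * W) x = A^T y,
    where W = diag(1 / (|x_prev| + eps)).
    """
    m, n = len(A), len(A[0])
    # A^T A and A^T y do not change between iterations.
    gram = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
    aty = [sum(A[k][i] * y[k] for k in range(m)) for i in range(n)]
    x = [1.0] * n  # uniform start: the first step is close to plain ridge
    for _ in range(iters):
        # Small coefficients receive large weights and shrink further.
        w = [1.0 / (abs(xi) + eps) for xi in x]
        M = [[gram[i][j] + (lam * w[i] if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        x = solve(M, aty)
    return x


# Toy design with orthonormal columns so the result can be checked by hand.
A = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5],
     [0.5, 0.5, -0.5, -0.5],
     [0.5, -0.5, -0.5, 0.5]]
# y = A @ [1.5, 0, -2, 0] plus a small perturbation along column 1.
y = [-0.23, -0.27, 1.77, 1.73]
x_hat = reweighted_l2(A, y)
print(x_hat)
```

With this orthonormal toy design, ordinary least squares would return a spurious coefficient of about 0.04 in position 1. The re-weighted iteration drives that entry to nearly zero and shrinks the two genuine coefficients by roughly `lam`, to about 1.4 and -1.9, which mirrors the behaviour of an ℓ1 penalty while only ever solving linear systems.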

Original language: English
Pages (from-to): 387-395
Number of pages: 9
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 15
Issue number: 2
DOI: 10.1109/TCBB.2015.2495224
Publication status: Published - 1 Mar 2018

Fingerprint

Network components
Transcription factors
Incomplete Information
Prior Information
Transcription Factor
Algorithm Analysis
Incompleteness
Sparsity
Regulatory Networks
Gene Regulatory Networks
Connectivity
Norm
Synthetic Data
Gene Expression Data
Computational Efficiency
Low Complexity
Escherichia Coli
Gene expression

Keywords

  • incomplete prior
  • Network component analysis
  • transcription factor activity estimation

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

Cite this

@article{32bc1cee193d4fc297a5c2aed64d7a71,
title = "SparseNCA: Sparse Network Component Analysis for Recovering Transcription Factor Activities with Incomplete Prior Information",
abstract = "Network component analysis (NCA) is an important method for inferring transcriptional regulatory networks (TRNs) and recovering transcription factor activities (TFAs) using gene expression data, and the prior information about the connectivity matrix. The algorithms currently available crucially depend on the completeness of this prior information. However, inaccuracies in the measurement process may render incompleteness in the available knowledge about the connectivity matrix. Hence, computationally efficient algorithms are needed to overcome the possible incompleteness in the available data. We present a sparse network component analysis algorithm (sparseNCA), which incorporates the effect of incompleteness in the estimation of TRNs by imposing an additional sparsity constraint using the ℓ1 norm, which results in a greater estimation accuracy. In order to improve the computational efficiency, an iterative re-weighted ℓ2 method is proposed for the NCA problem which not only promotes sparsity but is hundreds of times faster than the ℓ1 norm based solution. The performance of sparseNCA is rigorously compared to that of FastNCA and NINCA using synthetic data as well as real data. It is shown that sparseNCA outperforms the existing state-of-the-art algorithms both in terms of estimation accuracy and consistency with the added advantage of low computational complexity. The performance of sparseNCA compared to its predecessors is particularly pronounced in case of incomplete prior information about the sparsity of the network. Subnetwork analysis is performed on the E.coli data which reiterates the superior consistency of the proposed algorithm.",
keywords = "incomplete prior, Network component analysis, transcription factor activity estimation",
author = "Amina Noor and Aitzaz Ahmad and Erchin Serpedin",
year = "2018",
month = "3",
day = "1",
doi = "10.1109/TCBB.2015.2495224",
language = "English",
volume = "15",
pages = "387--395",
journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",
issn = "1545-5963",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",
}

TY  - JOUR
T1  - SparseNCA
T2  - Sparse Network Component Analysis for Recovering Transcription Factor Activities with Incomplete Prior Information
AU  - Noor, Amina
AU  - Ahmad, Aitzaz
AU  - Serpedin, Erchin
PY  - 2018/3/1
Y1  - 2018/3/1
AB  - Network component analysis (NCA) is an important method for inferring transcriptional regulatory networks (TRNs) and recovering transcription factor activities (TFAs) using gene expression data, and the prior information about the connectivity matrix. The algorithms currently available crucially depend on the completeness of this prior information. However, inaccuracies in the measurement process may render incompleteness in the available knowledge about the connectivity matrix. Hence, computationally efficient algorithms are needed to overcome the possible incompleteness in the available data. We present a sparse network component analysis algorithm (sparseNCA), which incorporates the effect of incompleteness in the estimation of TRNs by imposing an additional sparsity constraint using the ℓ1 norm, which results in a greater estimation accuracy. In order to improve the computational efficiency, an iterative re-weighted ℓ2 method is proposed for the NCA problem which not only promotes sparsity but is hundreds of times faster than the ℓ1 norm based solution. The performance of sparseNCA is rigorously compared to that of FastNCA and NINCA using synthetic data as well as real data. It is shown that sparseNCA outperforms the existing state-of-the-art algorithms both in terms of estimation accuracy and consistency with the added advantage of low computational complexity. The performance of sparseNCA compared to its predecessors is particularly pronounced in case of incomplete prior information about the sparsity of the network. Subnetwork analysis is performed on the E.coli data which reiterates the superior consistency of the proposed algorithm.
KW  - incomplete prior
KW  - Network component analysis
KW  - transcription factor activity estimation
UR  - http://www.scopus.com/inward/record.url?scp=85044959163&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85044959163&partnerID=8YFLogxK
U2  - 10.1109/TCBB.2015.2495224
DO  - 10.1109/TCBB.2015.2495224
M3  - Article
C2  - 26529780
AN  - SCOPUS:85044959163
VL  - 15
SP  - 387
EP  - 395
JO  - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF  - IEEE/ACM Transactions on Computational Biology and Bioinformatics
SN  - 1545-5963
IS  - 2
ER  - 