Computational assignment of cell-cycle stage from single-cell transcriptome data

Antonio Scialdone, Kedar N. Natarajan, Luis Miguel Rodrigues Saraiva, Valentina Proserpio, Sarah A. Teichmann, Oliver Stegle, John C. Marioni, Florian Buettner

Research output: Contribution to journal › Article

104 Citations (Scopus)

Abstract

The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the usage of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that the performance depends strongly on normalisation and the usage of prior knowledge. Only by leveraging prior knowledge in form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation, is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.
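The abstract names two preprocessing choices: restricting the data to cell-cycle annotated genes and applying a rank-based normalisation. The following is a minimal sketch of that idea, not the authors' implementation. The gene subset `MARKERS`, the toy counts, and the function names `average_ranks` and `rank_normalise` are assumptions made for illustration, and scaling the ranks by the number of genes kept is one possible convention among several.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks of `values`, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def rank_normalise(cell: Dict[str, float], marker_genes: Sequence[str]) -> Dict[str, float]:
    """Rank-normalise one cell's expression over the annotated genes only.

    Ranks are divided by the number of genes kept, so values lie in (0, 1].
    """
    kept = [g for g in marker_genes if g in cell]
    ranks = average_ranks([cell[g] for g in kept])
    return {g: r / len(kept) for g, r in zip(kept, ranks)}


# Hypothetical gene subset and toy counts, for illustration only
# (not the annotated gene list used in the paper).
MARKERS = ["Ccne1", "Ccnb1", "Cdk1", "Mcm2"]
cell_a = {"Ccne1": 5.0, "Ccnb1": 120.0, "Cdk1": 80.0, "Mcm2": 5.0, "Actb": 900.0}
cell_b = {g: 10 * v for g, v in cell_a.items()}  # same profile at 10x depth

norm_a = rank_normalise(cell_a, MARKERS)
norm_b = rank_normalise(cell_b, MARKERS)
```

Because only the within-cell ordering of the selected genes is kept, `norm_a` and `norm_b` are identical even though the raw counts differ tenfold. This illustrates why a rank-based transform can make profiles from different protocols or sequencing depths more comparable before a classifier is applied.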

Original language: English
Pages (from-to): 54-61
Number of pages: 8
Journal: Methods
Volume: 85
DOI: https://doi.org/10.1016/j.ymeth.2015.06.021
Publication status: Published - 1 Sep 2015
Externally published: Yes

Fingerprint

Transcriptome
Cell Cycle
Cells
RNA Sequence Analysis
cdc Genes
Principal Component Analysis
Population Characteristics
Learning systems
Classifiers
Genes
RNA
Cell Count
Population

Keywords

  • Cell cycle
  • Computational biology
  • Machine learning
  • Single cell RNA-seq

ASJC Scopus subject areas

  • Molecular Biology
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

Scialdone, A., Natarajan, K. N., Rodrigues Saraiva, L. M., Proserpio, V., Teichmann, S. A., Stegle, O., ... Buettner, F. (2015). Computational assignment of cell-cycle stage from single-cell transcriptome data. Methods, 85, 54-61. https://doi.org/10.1016/j.ymeth.2015.06.021

@article{6584b42b4aec43249dcb9a1406f4608c,
title = "Computational assignment of cell-cycle stage from single-cell transcriptome data",
abstract = "The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the usage of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that the performance depends strongly on normalisation and the usage of prior knowledge. Only by leveraging prior knowledge in form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation, is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.",
keywords = "Cell cycle, Computational biology, Machine learning, Single cell RNA-seq",
author = "Antonio Scialdone and Natarajan, {Kedar N.} and {Rodrigues Saraiva}, {Luis Miguel} and Valentina Proserpio and Teichmann, {Sarah A.} and Oliver Stegle and Marioni, {John C.} and Florian Buettner",
year = "2015",
month = "9",
day = "1",
doi = "10.1016/j.ymeth.2015.06.021",
language = "English",
volume = "85",
pages = "54--61",
journal = "Methods",
issn = "1046-2023",
publisher = "Academic Press Inc.",
}

TY - JOUR
T1 - Computational assignment of cell-cycle stage from single-cell transcriptome data
AU - Scialdone, Antonio
AU - Natarajan, Kedar N.
AU - Rodrigues Saraiva, Luis Miguel
AU - Proserpio, Valentina
AU - Teichmann, Sarah A.
AU - Stegle, Oliver
AU - Marioni, John C.
AU - Buettner, Florian
PY - 2015/9/1
Y1 - 2015/9/1
AB - The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the usage of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that the performance depends strongly on normalisation and the usage of prior knowledge. Only by leveraging prior knowledge in form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation, is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.

KW - Cell cycle
KW - Computational biology
KW - Machine learning
KW - Single cell RNA-seq
UR - http://www.scopus.com/inward/record.url?scp=84939772971&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939772971&partnerID=8YFLogxK
U2 - 10.1016/j.ymeth.2015.06.021
DO - 10.1016/j.ymeth.2015.06.021
M3 - Article
C2 - 26142758
AN - SCOPUS:84939772971
VL - 85
SP - 54
EP - 61
JO - Methods
JF - Methods
SN - 1046-2023
ER -