TCGAbiolinks

An R/Bioconductor package for integrative analysis of TCGA data

Antonio Colaprico, Tiago C. Silva, Catharina Olsen, Luciano Garofano, Claudia Cava, Davide Garolini, Thais S. Sabedot, Tathiane M. Malta, Stefano M. Pagnotta, Isabella Castiglioni, Michele Ceccarelli, Gianluca Bontempi, Houtan Noushmehr

Research output: Contribution to journal, Article

182 Citations (Scopus)

Abstract

The Cancer Genome Atlas (TCGA) research network has made public a large collection of clinical and molecular phenotypes of more than 10 000 tumor patients across 33 different tumor types. Using this cohort, TCGA has published over 20 marker papers detailing the genomic and epigenomic alterations associated with these tumor types. Although many important discoveries have been made by TCGA's research network, opportunities still exist to implement novel methods, thereby elucidating new biological pathways and diagnostic markers. However, mining the TCGA data presents several bioinformatics challenges, such as data retrieval and integration with clinical data and other molecular data types (e.g. RNA and DNA methylation). We developed an R/Bioconductor package called TCGAbiolinks to address these challenges and offer bioinformatics solutions by using a guided workflow to allow users to query, download and perform integrative analyses of TCGA data. We combined methods from computer science and statistics into the pipeline and incorporated methodologies developed in previous TCGA marker studies and in our own group. Using four different TCGA tumor types (Kidney, Brain, Breast and Colon) as examples, we provide case studies to illustrate examples of reproducibility, integrative analysis and utilization of different Bioconductor packages to advance and accelerate novel discoveries.

Original language: English
Pages (from-to): e71
Journal: Nucleic Acids Research
Volume: 44
Issue number: 8
DOIs: https://doi.org/10.1093/nar/gkv1507
Publication status: Published - 5 May 2016

Fingerprint

Atlases
Genome
Neoplasms
Computational Biology
Workflow
Information Storage and Retrieval
DNA Methylation
Research
Epigenomics
Brain Neoplasms
Colon
Breast
RNA
Phenotype
Kidney

ASJC Scopus subject areas

  • Genetics

Cite this

Colaprico, A., Silva, T. C., Olsen, C., Garofano, L., Cava, C., Garolini, D., ... Noushmehr, H. (2016). TCGAbiolinks: An R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Research, 44(8), e71. https://doi.org/10.1093/nar/gkv1507

@article{4ae2eb3d854340b1acbf0cd3405ea2eb,
title = "TCGAbiolinks: An R/Bioconductor package for integrative analysis of TCGA data",
author = "Antonio Colaprico and Silva, {Tiago C.} and Catharina Olsen and Luciano Garofano and Claudia Cava and Davide Garolini and Sabedot, {Thais S.} and Malta, {Tathiane M.} and Pagnotta, {Stefano M.} and Isabella Castiglioni and Michele Ceccarelli and Gianluca Bontempi and Houtan Noushmehr",
year = "2016",
month = "5",
day = "5",
doi = "10.1093/nar/gkv1507",
language = "English",
volume = "44",
pages = "e71",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "8",

}

TY - JOUR

T1 - TCGAbiolinks

T2 - An R/Bioconductor package for integrative analysis of TCGA data

AU - Colaprico, Antonio

AU - Silva, Tiago C.

AU - Olsen, Catharina

AU - Garofano, Luciano

AU - Cava, Claudia

AU - Garolini, Davide

AU - Sabedot, Thais S.

AU - Malta, Tathiane M.

AU - Pagnotta, Stefano M.

AU - Castiglioni, Isabella

AU - Ceccarelli, Michele

AU - Bontempi, Gianluca

AU - Noushmehr, Houtan

PY - 2016/5/5

Y1 - 2016/5/5

UR - http://www.scopus.com/inward/record.url?scp=84966269471&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84966269471&partnerID=8YFLogxK

U2 - 10.1093/nar/gkv1507

DO - 10.1093/nar/gkv1507

M3 - Article

VL - 44

SP - e71

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 8

ER -