Early-stage event prediction for longitudinal data

Mahtab J. Fard, Sanjay Chawla, Chandan K. Reddy

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

5 Citations (Scopus)

Abstract

Predicting event occurrence at an early stage in longitudinal studies is an important problem with high practical value. Unlike standard classification and regression problems, where a domain expert can label the data in a reasonably short period of time, training data in longitudinal studies can be obtained only by waiting for a sufficient number of events to occur. The main objective of this work is to predict future event occurrence for a particular subject using only the data collected at the initial stages of a longitudinal study. In this paper, we propose a novel Early Stage Prediction (ESP) framework for building event prediction models that are trained at the early stages of longitudinal studies. More specifically, we develop two probabilistic algorithms based on Naive Bayes and Tree-Augmented Naive Bayes (TAN), called ESP-NB and ESP-TAN, respectively, for early-stage event prediction. Both modify the posterior probability of event occurrence using extrapolations based on Weibull and Lognormal distributions. The proposed framework is evaluated on a wide range of synthetic and real-world benchmark datasets. Our extensive experiments show that the proposed ESP framework predicts future event occurrences more accurately than the alternative approaches while using only a limited amount of training data.
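The abstract does not give the ESP formulas, so the following is only a rough sketch of the general idea of extrapolating an event prior and plugging it into a Naive Bayes posterior. It is not the authors' ESP-NB algorithm. Everything in it is an assumption made for illustration: the function names, the least-squares Weibull fit, the cutoff times `t_c` and `t_f`, the event times, and the likelihood values. Only the Weibull case is shown.

```python
import math


def weibull_cdf(t, shape, scale):
    """Weibull CDF: probability that the event has occurred by time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def fit_weibull(event_times, n_subjects):
    """Least-squares Weibull fit on the linearised empirical CDF.

    Uses ln(-ln(1 - F(t))) = shape * ln(t) - shape * ln(scale).
    Requires at least two distinct, positive event times.
    """
    times = sorted(event_times)
    xs = [math.log(t) for t in times]
    # Rank-based empirical CDF over all subjects; stays strictly below 1.
    ys = [math.log(-math.log(1.0 - i / (n_subjects + 1)))
          for i in range(1, len(times) + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx                            # slope of the fitted line
    scale = math.exp(mean_x - mean_y / shape)    # recovered from the intercept
    return shape, scale


def naive_bayes_posterior(lik_event, lik_no_event, prior_event):
    """P(event | x) from class-conditional likelihoods and an event prior."""
    numerator = lik_event * prior_event
    return numerator / (numerator + lik_no_event * (1.0 - prior_event))


# Hypothetical study: 100 subjects, events observed before the early cutoff t_c.
n_subjects = 100
early_event_times = [2.0, 3.5, 5.0, 6.0, 7.5, 9.0]
t_c, t_f = 10.0, 40.0  # early cutoff and end of study (illustrative values)

shape, scale = fit_weibull(early_event_times, n_subjects)
prior_early = len(early_event_times) / n_subjects
prior_extrapolated = weibull_cdf(t_f, shape, scale)

# Illustrative likelihoods P(x | event) and P(x | no event) for one subject.
lik_event, lik_no_event = 0.30, 0.10
posterior_early = naive_bayes_posterior(lik_event, lik_no_event, prior_early)
posterior_esp = naive_bayes_posterior(lik_event, lik_no_event, prior_extrapolated)

print(f"prior at t_c: {prior_early:.3f}, extrapolated to t_f: {prior_extrapolated:.3f}")
print(f"posterior with early prior: {posterior_early:.3f}")
print(f"posterior with extrapolated prior: {posterior_esp:.3f}")
```

The design choice worth noting is that only the prior changes between the two posteriors. The feature likelihoods are estimated once from the early data, and the fitted distribution supplies the event probability expected at the later time.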

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 139-151
Number of pages: 13
Volume: 9651
ISBN (Print): 9783319317526
DOIs: https://doi.org/10.1007/978-3-319-31753-3_12
Publication status: Published - 2016
Event: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016 - Auckland, New Zealand
Duration: 19 Apr 2016 to 22 Apr 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9651
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016
Country: New Zealand
City: Auckland
Period: 19/4/16 to 22/4/16

Keywords

  • Longitudinal data
  • Prediction
  • Regression
  • Survival analysis

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Fard, M. J., Chawla, S., & Reddy, C. K. (2016). Early-stage event prediction for longitudinal data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9651, pp. 139-151). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9651). Springer Verlag. https://doi.org/10.1007/978-3-319-31753-3_12

@inproceedings{aa6d0fab75f947b7a45b9cdf1b114fff,
title = "Early-stage event prediction for longitudinal data",
keywords = "Longitudinal data, Prediction, Regression, Survival analysis",
author = "Fard, {Mahtab J.} and Sanjay Chawla and Reddy, {Chandan K.}",
year = "2016",
doi = "10.1007/978-3-319-31753-3_12",
language = "English",
isbn = "9783319317526",
volume = "9651",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "139--151",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Early-stage event prediction for longitudinal data

AU - Fard, Mahtab J.

AU - Chawla, Sanjay

AU - Reddy, Chandan K.

PY - 2016

Y1 - 2016

KW - Longitudinal data

KW - Prediction

KW - Regression

KW - Survival analysis

UR - http://www.scopus.com/inward/record.url?scp=84964058324&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84964058324&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-31753-3_12

DO - 10.1007/978-3-319-31753-3_12

M3 - Conference contribution

SN - 9783319317526

VL - 9651

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 139

EP - 151

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -