Exploiting Financial News and Social Media Opinions for Stock Market Analysis using MCMC Bayesian Inference

Manolis Maragoudakis, Dimitrios Serpanos

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Stock market analysis using Information and Communication Technology methods is a dynamic and volatile domain. In recent years, there has been an increasing focus on the development of modeling tools, especially when the expected outcomes appear to yield significant profits for investors’ portfolios. In the modern globalized economy, the available resources are becoming steadily more plentiful and thus difficult to analyze with standard statistical tools. Thus far, a number of research papers have focused solely on past data from stock and bond prices and other technical indicators. In recent studies, however, prediction also draws on textual information, on the logical assumption that the course of a stock price can also be affected by news articles and perhaps by public opinions posted on various Web 2.0 platforms. Despite recent advances in Natural Language Processing and Data Mining, when data grow in both number of records and number of attributes, numerous mining algorithms face significant difficulties, resulting in poor forecasting ability. The aim of this study is to propose a potential answer to this problem by considering a Markov Chain Monte Carlo Bayesian Inference approach, which estimates conditional probability distributions in structures obtained from a Tree-Augmented Naïve Bayes algorithm. The novelty of this study rests on the observation that technical analysis captures the event and not the cause of a change, while textual data may explain that cause. The paper takes into account a large number of technical indices, accompanied by features extracted by a text mining methodology from financial news articles and opinions posted on different social media platforms. Previous research has demonstrated that, due to the high dimensionality and sparseness of such data, the majority of widespread Data Mining algorithms suffer from either convergence or accuracy problems. Results acquired from the experimental phase, including a virtual trading experiment, are promising. Since it is tedious for a human investor to read all daily news concerning a company and other financial information, a prediction system that could analyze such textual resources and find relations with price movement at future time frames is valuable.
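
The abstract does not specify the sampler or the priors, so the following is only a minimal sketch of the general idea of estimating one conditional probability by MCMC. It assumes a binary class (price up or down), a single fixed parent configuration in the network, a Beta prior, and a random-walk Metropolis sampler. The function names, the counts (30 "up" days out of 50), and the tuning values are hypothetical and are not taken from the paper.

```python
import math
import random


def log_posterior(theta, ups, total, alpha=1.0, beta=1.0):
    """Unnormalised log posterior of theta = P(price up | parent configuration).

    Assumption: Beta(alpha, beta) prior with a binomial likelihood.
    """
    if not 0.0 < theta < 1.0:
        return float("-inf")  # zero posterior mass outside (0, 1)
    return ((ups + alpha - 1.0) * math.log(theta)
            + (total - ups + beta - 1.0) * math.log(1.0 - theta))


def metropolis(ups, total, n_samples=20000, burn_in=2000, step=0.05, seed=0):
    """Random-walk Metropolis sampler for the conditional probability theta."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, ups, total)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        candidate = log_posterior(proposal, ups, total)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples


if __name__ == "__main__":
    # Hypothetical counts: 30 "up" days out of 50 for one parent configuration.
    draws = metropolis(ups=30, total=50)
    estimate = sum(draws) / len(draws)
    exact = (30 + 1.0) / (50 + 2.0)  # analytic Beta posterior mean, for comparison
    print(f"MCMC estimate: {estimate:.3f}, analytic mean: {exact:.3f}")
```

In this toy case the posterior is available in closed form, which makes it a convenient check on the sampler; sampling becomes useful when the parameters of many network nodes are tied together and no closed form exists.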

Original language: English
Journal: Computational Economics
DOI: 10.1007/s10614-015-9492-9
Publication status: Accepted/In press - 25 Feb 2015

Fingerprint

Data mining
Trees (mathematics)
Markov processes
Probability distributions
Profitability
Communication
Processing
Financial markets
Markov chain Monte Carlo
Bayesian inference
Stock market
News media
Market analysis
Social media
News
Industry
Experiments
Resources
Investors
Prediction

Keywords

  • Data mining
  • Hierarchical Bayesian methods
  • Stock return forecasting
  • Trading strategies

ASJC Scopus subject areas

  • Economics, Econometrics and Finance (miscellaneous)
  • Computer Science Applications

Cite this

Exploiting Financial News and Social Media Opinions for Stock Market Analysis using MCMC Bayesian Inference. / Maragoudakis, Manolis; Serpanos, Dimitrios.

In: Computational Economics, 25.02.2015.

@article{9391de0dbc224ddf9f6054fc43f2a7a9,
title = "Exploiting Financial News and Social Media Opinions for Stock Market Analysis using MCMC Bayesian Inference",
keywords = "Data mining, Hierarchical Bayesian methods, Stock return forecasting, Trading strategies",
author = "Manolis Maragoudakis and Dimitrios Serpanos",
year = "2015",
month = "2",
day = "25",
doi = "10.1007/s10614-015-9492-9",
language = "English",
journal = "Computational Economics",
issn = "0921-2736",
publisher = "Springer Netherlands",

}

TY - JOUR

T1 - Exploiting Financial News and Social Media Opinions for Stock Market Analysis using MCMC Bayesian Inference

AU - Maragoudakis, Manolis

AU - Serpanos, Dimitrios

PY - 2015/2/25

Y1 - 2015/2/25

KW - Data mining

KW - Hierarchical Bayesian methods

KW - Stock return forecasting

KW - Trading strategies

UR - http://www.scopus.com/inward/record.url?scp=84923349761&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84923349761&partnerID=8YFLogxK

U2 - 10.1007/s10614-015-9492-9

DO - 10.1007/s10614-015-9492-9

M3 - Article

AN - SCOPUS:84923349761

JO - Computational Economics

JF - Computational Economics

SN - 0921-2736

ER -