Abstract
In this paper we propose new algorithmic techniques for massively data-parallel computation of the Likelihood Ratio Test (LRT) on a large spatial data grid. LRT is the state-of-the-art method for identifying hotspots or anomalous regions in spatially referenced data. LRT is highly adaptable, permitting the use of a large class of statistical distributions to model the data. However, standard sequential implementations of LRT may take several days on modern machines to identify anomalous regions, even for moderately sized spatial grids. This work claims three novel contributions. First, we devise a dynamic program with an O(n²) pre-processing step that allows us to compute the statistic for any given region in O(1), where n is the length of the grid. Second, we propose a scheme to accelerate the likelihood computation of a complement region using a bounding technique. Third, we provide a parallelization strategy for the LRT computation on GPGPUs. In concert, all three contributions result in a speedup of nearly four hundred times, reducing the LRT computation time of large spatial grids from several days to minutes.
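The abstract does not spell out the dynamic program, so the following is only a minimal sketch of one standard way to obtain O(1) region aggregates after a single quadratic pass over an n × n grid: a two-dimensional prefix-sum (summed-area) table. The function names `build_prefix_sums` and `region_sum`, the half-open index convention, and the use of plain cell counts as the aggregated quantity are all assumptions made for illustration; the paper's actual recurrence and statistic may differ.

```python
from typing import List


def build_prefix_sums(grid: List[List[float]]) -> List[List[float]]:
    """Build a (rows+1) x (cols+1) table where prefix[i][j] is the sum of
    all cells grid[r][c] with r < i and c < j.

    This single pass touches every cell once, i.e. O(n^2) for an n x n grid.
    (Hypothetical helper; not taken from the paper.)
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    prefix = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            # Inclusion-exclusion: add the strips above and to the left,
            # then subtract the overlap that was counted twice.
            prefix[i + 1][j + 1] = (
                grid[i][j] + prefix[i][j + 1] + prefix[i + 1][j] - prefix[i][j]
            )
    return prefix


def region_sum(prefix: List[List[float]],
               top: int, left: int, bottom: int, right: int) -> float:
    """Sum of the rectangle with rows top..bottom-1 and columns
    left..right-1 (half-open bounds), using four table lookups: O(1)."""
    return (
        prefix[bottom][right]
        - prefix[top][right]
        - prefix[bottom][left]
        + prefix[top][left]
    )


if __name__ == "__main__":
    counts = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    table = build_prefix_sums(counts)
    total = region_sum(table, 0, 0, 3, 3)
    inside = region_sum(table, 1, 1, 3, 3)
    # The aggregate over the complement region follows by subtraction.
    print(total, inside, total - inside)
```

With a table like this, any per-region statistic that depends only on sums over the region (and, by subtraction from the grid total, over its complement) can be evaluated in constant time per candidate rectangle, which is the property the first contribution describes.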
Original language | English |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Pages | 595-608 |
Number of pages | 14 |
Volume | 7808 LNCS |
DOIs | 10.1007/978-3-642-37401-2_58 |
Publication status | Published - 2013 |
Externally published | Yes |
Event | 15th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2013 - Sydney, NSW Duration: 4 Apr 2013 → 6 Apr 2013 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 7808 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Other
Other | 15th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2013 |
---|---|
City | Sydney, NSW |
Period | 4/4/13 → 6/4/13 |
Keywords
- 1EXP
- GPGPUs
- LRT
- Spatial outlier
- upper-bounding
ASJC Scopus subject areas
- Computer Science(all)
- Theoretical Computer Science
Cite this
A scalable approach for LRT computation in GPGPU environments. / Pang, Linsey Xiaolin; Chawla, Sanjay; Scholz, Bernhard; Wilcox, Georgina.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7808 LNCS 2013. p. 595-608 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7808 LNCS).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - A scalable approach for LRT computation in GPGPU environments
AU - Pang, Linsey Xiaolin
AU - Chawla, Sanjay
AU - Scholz, Bernhard
AU - Wilcox, Georgina
PY - 2013
Y1 - 2013
KW - 1EXP
KW - GPGPUs
KW - LRT
KW - Spatial outlier
KW - upper-bounding
UR - http://www.scopus.com/inward/record.url?scp=84875850803&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84875850803&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37401-2_58
DO - 10.1007/978-3-642-37401-2_58
M3 - Conference contribution
AN - SCOPUS:84875850803
SN - 9783642374005
VL - 7808 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 595
EP - 608
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -