Automated detection of anomalous shipping manifests to identify illicit trade

Antonio Sanfilippo, Satish Chikkagoudar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We describe an approach to analyzing anomalies in trade data based on the identification of cluster outliers. The approach uses unsupervised machine learning methods to discover semantically coherent clusters of shipping records in large collections of trade data. Trade data with cluster annotations are then used as input to a supervised machine learning algorithm to train and evaluate a classification model capable of identifying members of each cluster. The evaluation of this classification model provides an assessment of cluster coherence. Outliers are identified for each cluster by measuring the Euclidean distance from each member of the cluster to the cluster centroid, and then selecting a percentile threshold to identify shipping records with extreme distances from the cluster centroid. We describe a specific application of this approach to a dataset of 2.36M records for containerized shipments, with specific reference to the detection of anomalies potentially related to nuclear smuggling. Results show that this approach succeeds in finding semantically coherent clusters of shipping records, and identifying outliers that may help facilitate the detection of illicit trade.
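The outlier step described in the abstract (distance from each cluster member to its centroid, followed by a percentile cutoff) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature vectors, the precomputed cluster labels, and the 90th-percentile threshold are all assumptions chosen for illustration, and the clustering and classification stages are taken as already done.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def percentile(values, pct):
    """Linear-interpolation percentile (pct in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def cluster_outliers(vectors, labels, pct=95.0):
    """Return sorted indices of records whose Euclidean distance to their
    own cluster centroid exceeds that cluster's pct-th percentile distance."""
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    flagged = []
    for idxs in members.values():
        center = centroid([vectors[i] for i in idxs])
        dists = [math.dist(vectors[i], center) for i in idxs]
        cutoff = percentile(dists, pct)
        flagged.extend(i for i, d in zip(idxs, dists) if d > cutoff)
    return sorted(flagged)


# Hypothetical toy data: two clusters, each containing one distant record.
vectors = [
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [6.0, 6.0],
    [10.0, 10.0], [10.1, 9.9], [9.9, 10.1], [10.0, 10.2], [15.0, 3.0],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
outliers = cluster_outliers(vectors, labels, pct=90.0)
print(outliers)
```

Because the cutoff is computed per cluster, a record is judged only against the spread of its own cluster; a single global threshold would instead favor flagging members of diffuse clusters.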

Original language: English
Title of host publication: 2013 IEEE International Conference on Technologies for Homeland Security, HST 2013
Pages: 529-534
Number of pages: 6
DOIs: https://doi.org/10.1109/THS.2013.6699059
Publication status: Published - 2013
Externally published: Yes
Event: 2013 13th IEEE International Conference on Technologies for Homeland Security, HST 2013 - Waltham, MA
Duration: 12 Nov 2013 to 14 Nov 2013

Other

Other: 2013 13th IEEE International Conference on Technologies for Homeland Security, HST 2013
City: Waltham, MA
Period: 12/11/13 to 14/11/13

Fingerprint

shipping
smuggling
learning method
evaluation
learning

Keywords

  • classification
  • clustering
  • detection of radiological threat materials
  • illicit trafficking
  • nuclear smuggling
  • trade data
  • visual analytics

ASJC Scopus subject areas

  • Public Administration

Cite this

Sanfilippo, A., & Chikkagoudar, S. (2013). Automated detection of anomalous shipping manifests to identify illicit trade. In 2013 IEEE International Conference on Technologies for Homeland Security, HST 2013 (pp. 529-534). [6699059] https://doi.org/10.1109/THS.2013.6699059

@inproceedings{0bee4ca08ce240c4ada287c2036b531e,
title = "Automated detection of anomalous shipping manifests to identify illicit trade",
abstract = "We describe an approach to analyzing anomalies in trade data based on the identification of cluster outliers. The approach uses unsupervised machine learning methods to discover semantically coherent clusters of shipping records in large collections of trade data. Trade data with cluster annotations are then used as input to a supervised machine learning algorithm to train and evaluate a classification model capable of identifying members of each cluster. The evaluation of this classification model provides an assessment of cluster coherence. Outliers are identified for each cluster by measuring the Euclidean distance from each member of the cluster to the cluster centroid, and then selecting a percentile threshold to identify shipping records with extreme distances from the cluster centroid. We describe a specific application of this approach to a dataset of 2.36M records for containerized shipments, with specific reference to the detection of anomalies potentially related to nuclear smuggling. Results show that this approach succeeds in finding semantically coherent clusters of shipping records, and identifying outliers that may help facilitate the detection of illicit trade.",
keywords = "classification, clustering, detection of radiological threat materials, illicit trafficking, nuclear smuggling, trade data, visual analytics",
author = "Antonio Sanfilippo and Satish Chikkagoudar",
year = "2013",
doi = "10.1109/THS.2013.6699059",
language = "English",
isbn = "9781479915354",
pages = "529--534",
booktitle = "2013 IEEE International Conference on Technologies for Homeland Security, HST 2013",

}

TY - GEN

T1 - Automated detection of anomalous shipping manifests to identify illicit trade

AU - Sanfilippo, Antonio

AU - Chikkagoudar, Satish

PY - 2013

Y1 - 2013

N2 - We describe an approach to analyzing anomalies in trade data based on the identification of cluster outliers. The approach uses unsupervised machine learning methods to discover semantically coherent clusters of shipping records in large collections of trade data. Trade data with cluster annotations are then used as input to a supervised machine learning algorithm to train and evaluate a classification model capable of identifying members of each cluster. The evaluation of this classification model provides an assessment of cluster coherence. Outliers are identified for each cluster by measuring the Euclidean distance from each member of the cluster to the cluster centroid, and then selecting a percentile threshold to identify shipping records with extreme distances from the cluster centroid. We describe a specific application of this approach to a dataset of 2.36M records for containerized shipments, with specific reference to the detection of anomalies potentially related to nuclear smuggling. Results show that this approach succeeds in finding semantically coherent clusters of shipping records, and identifying outliers that may help facilitate the detection of illicit trade.

KW - classification

KW - clustering

KW - detection of radiological threat materials

KW - illicit trafficking

KW - nuclear smuggling

KW - trade data

KW - visual analytics

UR - http://www.scopus.com/inward/record.url?scp=84893235998&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893235998&partnerID=8YFLogxK

U2 - 10.1109/THS.2013.6699059

DO - 10.1109/THS.2013.6699059

M3 - Conference contribution

SN - 9781479915354

SP - 529

EP - 534

BT - 2013 IEEE International Conference on Technologies for Homeland Security, HST 2013

ER -