AMAL

High-fidelity, behavior-based automated malware analysis and classification

Aziz Mohaisen, Omar Alrawi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

This paper introduces AMAL, an operational, automated, behavior-based malware analysis and labeling (classification and clustering) system that addresses many limitations and shortcomings of existing academic and industrial systems. AMAL consists of two subsystems, AutoMal and MaLabel. AutoMal provides tools to collect low-granularity behavioral artifacts that characterize malware usage of the file system, memory, network, and registry by running malware samples in virtualized environments. MaLabel, in turn, uses those artifacts to create representative features, uses them to build classifiers trained on manually vetted training samples, and uses those classifiers to classify malware samples into families with similar behavior. AutoMal also enables unsupervised learning by implementing multiple clustering algorithms for sample grouping. An evaluation of both AutoMal and MaLabel on medium-scale (4,000 samples) and large-scale (more than 115,000 samples) datasets, collected and analyzed by AutoMal over 13 months, shows AMAL’s effectiveness in accurately characterizing, classifying, and grouping malware samples. MaLabel achieves a precision of 99.5% and a recall of 99.6% when classifying certain families, and more than 98% precision and recall for unsupervised clustering. Several benchmarks, cost estimates, and measurements highlight and support the merits and features of AMAL.
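
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how per-family precision and recall are conventionally computed from true and predicted labels. It is not taken from the paper: the function name `precision_recall`, the label `family_a`, and the toy data are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Tuple


def precision_recall(
    y_true: Sequence[str], y_pred: Sequence[str], family: str
) -> Tuple[float, float]:
    """Return (precision, recall) for one family, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: samples of the family that were labeled as the family.
    tp = sum(1 for t, p in pairs if t == family and p == family)
    # False positives: other samples that were wrongly labeled as the family.
    fp = sum(1 for t, p in pairs if t != family and p == family)
    # False negatives: samples of the family that were labeled as something else.
    fn = sum(1 for t, p in pairs if t == family and p != family)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels, not data from the paper.
y_true = ["family_a", "family_a", "family_a", "other", "other", "family_a"]
y_pred = ["family_a", "family_a", "other", "family_a", "other", "family_a"]

print(precision_recall(y_true, y_pred, "family_a"))
```

Precision is the share of samples labeled as the family that truly belong to it. Recall is the share of the family's samples that the classifier found.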

Original language: English
Title of host publication: Information Security Applications - 15th International Workshop, WISA 2014, Revised Selected Papers
Publisher: Springer Verlag
Pages: 107-121
Number of pages: 15
Volume: 8909
ISBN (Electronic): 9783319150864
DOIs: https://doi.org/10.1007/978-3-319-15087-1_9
Publication status: Published - 2015
Event: 15th International Workshop on Information Security Applications, WISA 2014 - Korea, Republic of
Duration: 25 Aug 2014 to 27 Aug 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8909
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th International Workshop on Information Security Applications, WISA 2014
Country: Korea, Republic of
Period: 25/8/14 to 27/8/14

Fingerprint

Malware
Fidelity
Classifiers
Grouping
Classifier
Unsupervised learning
Unsupervised Clustering
Computer networks
Clustering algorithms
File System
Unsupervised Learning
Labeling
Training Samples
Granularity
Computer systems
Clustering Algorithm
Subsystem
Data storage equipment
Classify
Clustering

Keywords

  • Automatic analysis
  • Classification
  • Malware

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Mohaisen, A., & Alrawi, O. (2015). AMAL: High-fidelity, behavior-based automated malware analysis and classification. In Information Security Applications - 15th International Workshop, WISA 2014, Revised Selected Papers (Vol. 8909, pp. 107-121). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8909). Springer Verlag. https://doi.org/10.1007/978-3-319-15087-1_9

@inproceedings{181a2099d50e4063a6ebe18f397e73d9,
title = "AMAL: High-fidelity, behavior-based automated malware analysis and classification",
keywords = "Automatic analysis, Classification, Malware",
author = "Aziz Mohaisen and Omar Alrawi",
year = "2015",
doi = "10.1007/978-3-319-15087-1_9",
language = "English",
volume = "8909",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "107--121",
booktitle = "Information Security Applications - 15th International Workshop, WISA 2014, Revised Selected Papers",

}

TY - GEN

T1 - AMAL

T2 - High-fidelity, behavior-based automated malware analysis and classification

AU - Mohaisen, Aziz

AU - Alrawi, Omar

PY - 2015

Y1 - 2015

AB - This paper introduces AMAL, an operational, automated, behavior-based malware analysis and labeling (classification and clustering) system that addresses many limitations and shortcomings of existing academic and industrial systems. AMAL consists of two subsystems, AutoMal and MaLabel. AutoMal provides tools to collect low-granularity behavioral artifacts that characterize malware usage of the file system, memory, network, and registry by running malware samples in virtualized environments. MaLabel, in turn, uses those artifacts to create representative features, uses them to build classifiers trained on manually vetted training samples, and uses those classifiers to classify malware samples into families with similar behavior. AutoMal also enables unsupervised learning by implementing multiple clustering algorithms for sample grouping. An evaluation of both AutoMal and MaLabel on medium-scale (4,000 samples) and large-scale (more than 115,000 samples) datasets, collected and analyzed by AutoMal over 13 months, shows AMAL’s effectiveness in accurately characterizing, classifying, and grouping malware samples. MaLabel achieves a precision of 99.5% and a recall of 99.6% when classifying certain families, and more than 98% precision and recall for unsupervised clustering. Several benchmarks, cost estimates, and measurements highlight and support the merits and features of AMAL.

KW - Automatic analysis

KW - Classification

KW - Malware

UR - http://www.scopus.com/inward/record.url?scp=84922193440&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84922193440&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-15087-1_9

DO - 10.1007/978-3-319-15087-1_9

M3 - Conference contribution

VL - 8909

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 107

EP - 121

BT - Information Security Applications - 15th International Workshop, WISA 2014, Revised Selected Papers

PB - Springer Verlag

ER -