AMAL: High-fidelity, behavior-based automated malware analysis and classification

Aziz Mohaisen, Omar Alrawi, Manar Mohaisen

    Research output: Contribution to journal › Article

    53 Citations (Scopus)

    Abstract

    This paper introduces AMAL, an automated, behavior-based malware analysis and labeling system that addresses shortcomings of existing systems. AMAL consists of two sub-systems, AutoMal and MaLabel. AutoMal provides tools to collect low-granularity behavioral artifacts that characterize malware usage of the file system, memory, network, and registry, and does so by running malware samples in virtualized environments. MaLabel, on the other hand, uses those artifacts to create representative features, uses them to build classifiers trained on manually vetted training samples, and uses those classifiers to classify malware samples into families similar in behavior. AutoMal also enables unsupervised learning by implementing multiple clustering algorithms for grouping samples. An evaluation of both AutoMal and MaLabel on medium-scale (4000 samples) and large-scale (more than 115,000 samples) datasets, collected and analyzed by AutoMal over 13 months, shows AMAL's effectiveness in accurately characterizing, classifying, and grouping malware samples. MaLabel achieves a precision of 99.5% and a recall of 99.6% for the classification of certain families, and more than 98% precision and recall for unsupervised clustering. Several benchmarks, cost estimates, and measurements highlight the merits of AMAL.
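The abstract reports per-family precision and recall, so a minimal sketch of how those two metrics are conventionally computed may help when reading the figures. It is not AMAL's evaluation code: the function name `precision_recall`, the family names, and the six sample labels are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def precision_recall(true_labels: Sequence[str],
                     predicted_labels: Sequence[str],
                     family: str) -> Tuple[float, float]:
    """Return (precision, recall) for one family, treated as the positive class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: samples of the family that were labeled as the family.
    tp = sum(1 for t, p in pairs if t == family and p == family)
    # False positives: other samples that were labeled as the family.
    fp = sum(1 for t, p in pairs if t != family and p == family)
    # False negatives: samples of the family that received another label.
    fn = sum(1 for t, p in pairs if t == family and p != family)
    # Report 0.0 when a denominator is zero (a convention assumed here).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and classifier output; not data from the paper.
truth = ["family_a", "family_a", "family_a", "other", "other", "other"]
predicted = ["family_a", "family_a", "other", "family_a", "other", "other"]
print(precision_recall(truth, predicted, "family_a"))
```

In this toy example, two of the three samples labeled `family_a` truly belong to it, and two of the three true `family_a` samples are found, so both precision and recall are 2/3.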

    Original language: English
    Journal: Computers and Security
    Publication status: Accepted/In press - 15 Dec 2014

    Keywords

    • Automatic analysis
    • Classification
    • Clustering
    • Dynamic analysis
    • Machine learning
    • Malware

    ASJC Scopus subject areas

    • Computer Science (all)
    • Law
