Discovering malicious domains through passive DNS data graph analysis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Citations (Scopus)

Abstract

Malicious domains are key components of a variety of cyber attacks. Several recent techniques have been proposed to identify malicious domains through analysis of DNS data. The general approach is to build classifiers based on DNS-related local domain features. One potential problem is that many local features, e.g., domain name patterns and temporal patterns, tend not to be robust. Attackers could easily alter these features to evade detection without much affecting their attack capabilities. In this paper, we take a complementary approach. Instead of focusing on local features, we propose to discover and analyze global associations among domains. The key challenges are (1) to build meaningful associations among domains; and (2) to use these associations to reason about the potential maliciousness of domains. For the first challenge, we take advantage of the modus operandi of attackers. To avoid detection, malicious domains exhibit dynamic behavior by, for example, frequently changing the malicious domain-IP resolutions and creating new domains. This makes it very likely for attackers to reuse resources. It is indeed commonly observed that over a period of time multiple malicious domains are hosted on the same IPs and multiple IPs host the same malicious domains, which creates intrinsic associations among them. For the second challenge, we develop a graph-based inference technique over associated domains. Our approach is based on the intuition that a domain having strong associations with known malicious domains is likely to be malicious. Carefully established associations enable the discovery of a large set of new malicious domains using a very small set of previously known malicious ones. Our experiments over a public passive DNS database show that the proposed technique can achieve high true positive rates (over 95%) while maintaining low false positive rates (less than 0.5%). Further, even with a small set of known malicious domains (a couple of hundred), our technique can discover a large set of potentially malicious domains (on the scale of up to tens of thousands).
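
The two steps named in the abstract (associating domains that share hosting IPs, then scoring a domain by its associations with known malicious ones) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the edge weighting or the inference rule, so the Jaccard weight, the noisy-OR combination, the function names, and the sample records below are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_association_graph(resolutions):
    """Link two domains when they resolved to at least one common IP.

    resolutions: iterable of (domain, ip) pairs from passive DNS records.
    Returns {domain: {neighbor: weight}}. The weight is the Jaccard
    similarity of the two IP sets (an assumption, not the paper's formula).
    """
    ips_of = defaultdict(set)
    domains_on = defaultdict(set)
    for domain, ip in resolutions:
        ips_of[domain].add(ip)
        domains_on[ip].add(domain)

    graph = defaultdict(dict)
    for hosted in domains_on.values():
        for a, b in combinations(sorted(hosted), 2):
            shared = ips_of[a] & ips_of[b]
            weight = len(shared) / len(ips_of[a] | ips_of[b])
            graph[a][b] = graph[b][a] = weight
    return dict(graph)


def malicious_score(graph, domain, seeds):
    """Noisy-OR over direct associations with known malicious seeds.

    Assumed scoring rule: stronger or more numerous links to seeds
    push the score toward 1; a domain with no such links scores 0.
    """
    miss = 1.0
    for neighbor, weight in graph.get(domain, {}).items():
        if neighbor in seeds:
            miss *= 1.0 - weight
    return 1.0 - miss


# Hypothetical records using documentation-reserved names and addresses.
resolutions = [
    ("bad1.example", "203.0.113.1"), ("bad1.example", "203.0.113.2"),
    ("bad2.example", "203.0.113.2"), ("bad2.example", "203.0.113.3"),
    ("new.example", "203.0.113.1"), ("new.example", "203.0.113.2"),
    ("new.example", "203.0.113.3"),
    ("benign.example", "198.51.100.7"),
]
seeds = {"bad1.example", "bad2.example"}
graph = build_association_graph(resolutions)
suspect = malicious_score(graph, "new.example", seeds)
clean = malicious_score(graph, "benign.example", seeds)
```

Here `new.example` shares IPs with both seeds and scores high, while `benign.example` has no association and scores zero. The sketch does no filtering of shared IPs, whereas the abstract stresses that associations must be carefully established.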

Original language: English
Title of host publication: ASIA CCS 2016 - Proceedings of the 11th ACM Asia Conference on Computer and Communications Security
Publisher: Association for Computing Machinery, Inc
Pages: 663-674
Number of pages: 12
ISBN (Electronic): 9781450342339
DOI: https://doi.org/10.1145/2897845.2897877
Publication status: Published - 30 May 2016
Event: 11th ACM Asia Conference on Computer and Communications Security, ASIA CCS 2016 - Xi'an, China
Duration: 30 May 2016 → 3 Jun 2016

Other

Other: 11th ACM Asia Conference on Computer and Communications Security, ASIA CCS 2016
Country: China
City: Xi'an
Period: 30/5/16 → 3/6/16

Fingerprint

Classifiers
Experiments

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Computer Networks and Communications

Cite this

Khalil, I., Yu, T., & Guan, B. (2016). Discovering malicious domains through passive DNS data graph analysis. In ASIA CCS 2016 - Proceedings of the 11th ACM Asia Conference on Computer and Communications Security (pp. 663-674). Association for Computing Machinery, Inc. https://doi.org/10.1145/2897845.2897877

@inproceedings{ca5a62f1a49e40939a242e107ef6304e,
title = "Discovering malicious domains through passive DNS data graph analysis",
author = "Issa Khalil and Ting Yu and Bei Guan",
year = "2016",
month = "5",
day = "30",
doi = "10.1145/2897845.2897877",
language = "English",
pages = "663--674",
booktitle = "ASIA CCS 2016 - Proceedings of the 11th ACM Asia Conference on Computer and Communications Security",
publisher = "Association for Computing Machinery, Inc",

}
