Lightweight Domain Name Detection Algorithm Based on Morpheme Features
Author:
Affiliation:

Fund Project:

National Natural Science Foundation of China (60973123); State Scientific and Technological Support Plan Project of China (2008BAH37B04); National Basic Research Program of China (973) (2009CB320505)

    Abstract:

    Inspecting the content of DNS packets to detect malicious services is a common approach to network security monitoring. Such work often requires a quasi-real-time ability to find suspects among the huge number of collected domain names, which is costly in processing resources. This work proposes a lightweight algorithm that uses the morpheme features of domain names (roots, affixes, Chinese pinyin spellings, and abbreviations of special nouns) to quickly identify suspects for targeted DPI detection. Compared with algorithms based on n-tuple frequency distribution measurement, the proposed algorithm is shown to resist interference better and to improve detection accuracy by 35.2%, at the cost of a 58.3% increase in memory overhead. Compared with methods based on word features, it reduces computational complexity by 64.8% and memory overhead by 2.6%, with only a 2.5% reduction in accuracy.
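The idea of scoring a domain name by its morphemes can be illustrated with a minimal sketch. This is not the authors' algorithm: the morpheme inventory, the greedy longest-match segmentation, the coverage score, and the 0.5 threshold are all assumptions chosen only to show how roots, affixes, pinyin syllables, and abbreviations might separate readable names from random-looking ones.

```python
# Hypothetical, tiny morpheme inventory (assumption; the paper's lists are not given here).
MORPHEMES = {
    "un", "re", "pre",                                   # prefixes
    "ing", "er", "tion",                                 # suffixes
    "port", "form", "book", "face", "mail", "news",      # roots
    "bai", "du", "tao", "bao", "xin", "wen",             # pinyin syllables
    "bbc", "cnn", "ibm",                                 # abbreviations
}
MAX_LEN = max(len(m) for m in MORPHEMES)


def morpheme_coverage(label: str) -> float:
    """Return the fraction of characters covered by greedy longest-match morphemes."""
    label = label.lower()
    if not label:
        return 0.0
    covered = 0
    i = 0
    while i < len(label):
        # Try the longest candidate first; morphemes here are at least 2 characters.
        for size in range(min(MAX_LEN, len(label) - i), 1, -1):
            if label[i:i + size] in MORPHEMES:
                covered += size
                i += size
                break
        else:
            i += 1  # no morpheme starts here; skip one character
    return covered / len(label)


def is_suspect(domain: str, threshold: float = 0.5) -> bool:
    """Flag a domain whose leftmost label is poorly covered (threshold is an assumption)."""
    label = domain.split(".")[0]
    return morpheme_coverage(label) < threshold
```

Under these assumptions, `is_suspect("facebook.com")` is false because "face" and "book" cover the whole label, while a label with no recognizable morphemes is flagged for closer inspection.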

张维维, 龚俭, 刘茜, 刘尚东, 胡晓艳. Lightweight domain name detection algorithm based on morpheme features. Journal of Software (软件学报), 2016, 27(9): 2348-2364

History
  • Received: October 11, 2014
  • Revised: March 02, 2015
  • Online: September 02, 2016