Lightweight Domain Name Detection Algorithm Based on Morpheme Features

doi:10.13328/j.cnki.jos.004913


Volume 27, Issue 9, 2016, pp. 2348-2364

Authors: ZHANG Wei-Wei, GONG Jian, LIU Qian, LIU Shang-Dong, HU Xiao-Yan
Affiliation: School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Jiangsu Provincial Key Laboratory of Computer Network Technology, Nanjing 210096, China
Fund Project: National Natural Science Foundation of China (60973123); State Scientific and Technological Support Plan Project of China (2008BAH37B04); National Basic Research Program of China (973) (2009CB320505)


Abstract:

Inspecting the content of DNS packets is a common way to detect malicious services in network security monitoring. This task often requires near-real-time identification of suspects among a huge number of collected domain names, which is costly in processing resources. This work proposes a lightweight algorithm that uses the morpheme features of domain names (root, affix, Chinese spelling, and special noun abbreviation) to quickly identify suspect domains for targeted DPI detection. Compared with algorithms based on measuring n-tuple frequency distributions, the proposed algorithm is shown to have stronger resistance to interference and 35.2% higher detection accuracy, at the cost of a 58.3% increase in memory overhead. Compared with methods based on word features, it reduces computational complexity by 64.8% and memory overhead by 2.6%, with only a 2.5% reduction in accuracy.

Key words: network security monitoring; domain name detection; morphemes; string segmentation; C4.5 classifier
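
To make the idea of morpheme features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hypothetical lexicon covering the four morpheme kinds named in the abstract (with "Chinese spelling" read here as pinyin, which is an assumption), segments a domain label by greedy longest match, and derives a few illustrative features. The lexicon entries, the function names `segment` and `morpheme_features`, and the feature set are all invented for illustration; the paper's actual dictionaries and features are not given on this page.

```python
from typing import Dict, List

# Hypothetical toy lexicon (assumption): morpheme -> kind.
# The paper's real root/affix/pinyin/abbreviation dictionaries are not shown here.
MORPHEMES: Dict[str, str] = {
    "micro": "root", "soft": "root", "face": "root", "book": "root",
    "net": "root", "work": "root",
    "un": "affix", "re": "affix", "ing": "affix", "er": "affix",
    "bai": "pinyin", "du": "pinyin", "tao": "pinyin", "bao": "pinyin",
    "www": "abbrev", "bbs": "abbrev", "cn": "abbrev",
}
MAX_LEN = max(len(m) for m in MORPHEMES)


def segment(label: str) -> List[str]:
    """Greedy longest-match segmentation of a domain label.

    Characters that start no known morpheme are emitted as
    single-character tokens.
    """
    label = label.lower()
    tokens: List[str] = []
    i = 0
    while i < len(label):
        # Try the longest candidate first, then shrink.
        for size in range(min(MAX_LEN, len(label) - i), 0, -1):
            piece = label[i:i + size]
            if piece in MORPHEMES:
                tokens.append(piece)
                i += size
                break
        else:
            # No morpheme starts here: keep the raw character.
            tokens.append(label[i])
            i += 1
    return tokens


def morpheme_features(label: str) -> Dict[str, float]:
    """Illustrative features: how much of the label known morphemes explain."""
    tokens = segment(label)
    covered = sum(len(t) for t in tokens if t in MORPHEMES)
    return {
        "length": float(len(label)),
        "token_count": float(len(tokens)),
        "coverage": covered / len(label) if label else 0.0,
        "unmatched_chars": float(len(label) - covered),
    }


if __name__ == "__main__":
    for name in ("facebook", "baidu", "xkqzvw"):
        print(name, segment(name), morpheme_features(name))
```

In this sketch a label made of dictionary morphemes gets high coverage, while a random-looking string falls apart into unmatched single characters. Feature vectors of this kind could then be passed to a decision-tree classifier such as the C4.5 classifier named in the key words; that wiring is left out here because the page does not describe it.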


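Citation: ZHANG Wei-Wei, GONG Jian, LIU Qian, LIU Shang-Dong, HU Xiao-Yan. Lightweight domain name detection algorithm based on morpheme features. Journal of Software, 2016, 27(9): 2348-2364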


History

Received: October 11, 2014
Revised: March 02, 2015
Online: September 02, 2016
