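[Keywords]
[Abstract]
With the rapid development of the Internet, online advertising has become one of its most important business models. While online advertising has promoted the growth of the Internet, it has also brought negative effects such as leaking user information and degrading the web browsing experience. To study online ads systematically, the complete call path of the ad generation process must be obtained. Because a large volume of JavaScript files is loaded into a page, the function call chains are long, and the JavaScript code along the path has been compressed and obfuscated to some degree, it is difficult to obtain the call path of an online ad by static methods. This paper analyzes the process by which dynamic ads are generated, dynamically instruments the relevant code, uses function parameters to pass ad call information, and records the call information inside each iframe. By matching and merging the information from multiple iframes, it generates the complete ad call path and identifies the operation used to insert the ad. Experiments on 21 real websites show that the method can, without much impact on performance, capture information about the dynamic ad loading process that static methods cannot obtain, and can generate the call paths of ad code.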
[Keywords]
[Abstract]
With the rapid development of the Internet, online advertising has become one of its most important business models. Online advertisements (ads) are a main source of revenue for Web applications, but they also have negative effects: they may leak users' private information and degrade browser performance. To study online ads systematically, it is necessary to obtain the complete call path of the ad generation process. However, because the loaded JavaScript files are usually large, the function call paths are long, and the JavaScript code along the path is compressed and obfuscated, it is difficult to obtain the call path of an online ad through static analysis. This study tracks the call path of online ads dynamically: it first instruments the relevant code, then uses function parameters to pass the call information and records the call information inside each iframe, and finally matches and merges the information from multiple iframes to generate a complete call path for the ad generation process. Experiments on 21 real websites show that the proposed method can obtain the dynamic loading information of ads and generate complete call paths, which static methods cannot do, and that the overhead is acceptable.
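The final step, matching and merging per-iframe records into one call path, can be illustrated with a minimal sketch. It is not the authors' implementation: the record schema (`FrameRecord`, `parent_id`, `insert_op`), the function `merge_call_path`, and the sample function names are all hypothetical, and the sketch assumes that each iframe's log already identifies the frame that created it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FrameRecord:
    """Call information recorded inside one iframe (hypothetical schema)."""
    frame_id: str
    parent_id: Optional[str]   # frame whose script created this one; None for the top page
    calls: List[str]           # function names in call order, as logged by instrumentation
    insert_op: Optional[str]   # operation that inserted this frame, e.g. "document.write"


def merge_call_path(records: List[FrameRecord], ad_frame_id: str) -> List[str]:
    """Walk from the ad's frame up to the top page and join the per-frame call chains."""
    by_id: Dict[str, FrameRecord] = {r.frame_id: r for r in records}
    segments: List[List[str]] = []
    seen = set()
    current: Optional[str] = ad_frame_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in frame records at {current!r}")
        seen.add(current)
        record = by_id[current]  # raises KeyError if a frame was never recorded
        segment = list(record.calls)
        if record.insert_op is not None:
            # Mark how the parent inserted this frame.
            segment.insert(0, f"[{record.insert_op} -> {record.frame_id}]")
        segments.append(segment)
        current = record.parent_id
    # Segments were collected innermost-first; reverse them for top-down order.
    path: List[str] = []
    for segment in reversed(segments):
        path.extend(segment)
    return path


# Illustrative data only; these function names are invented for the example.
records = [
    FrameRecord("top", None, ["loadAds", "requestSlot"], None),
    FrameRecord("f1", "top", ["renderSlot", "fetchCreative"], "appendChild"),
    FrameRecord("f2", "f1", ["showCreative"], "document.write"),
]
full_path = merge_call_path(records, "f2")
```

Here `full_path` runs from the top page's `loadAds` down to `showCreative` in the innermost frame, with a bracketed marker at each frame boundary recording the insertion operation. Storing a parent link per frame keeps the merge a simple upward walk. A real recorder would still have to establish that link, which the abstract attributes to call information passed through function parameters.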
[CLC number]
TP311
[Funding]
National Key Basic Research Program of China (973 Program) (2014CB340702); National Natural Science Foundation of China (61272080, 91418202, 61403187)