Encrypted Video Identification Method for HTTP/2 Traffic Multiplexing Features
Authors: Wu Hua, Luo Hao, Zhao Shishun, Liu Songtao, Cheng Guang, Hu Xiaoyan
CLC number: TP393
Funding: National Key Research and Development Program of China (2021YFB3101403)
Abstract:

The rise of video platforms has allowed videos to spread rapidly and reach every aspect of social life. Some of the videos circulating on the network are harmful, so cyberspace security supervision urgently needs to identify harmful videos that are transmitted in encrypted form. Existing methods collect traffic at the main network access points, extract features of the encrypted video traffic, and identify a transmitted harmful video by matching those features against a harmful video database. However, as video transmission protocols have evolved, HTTP/2 with its multiplexing has been deployed at large scale, and traditional traffic analysis methods built on HTTP/1.1 transmission features cannot identify encrypted videos transmitted over HTTP/2. Moreover, most current research targets videos played at a fixed resolution and rarely considers the effect of resolution switching during adaptive streaming. To address these problems, this study analyzes in detail why the lengths of audio and video data are offset when video platforms transmit video over HTTP/2, and proposes a method that corrects the multiplexed encrypted data and restores the lengths of the combined audio and video data units, from which accurate corrected fingerprints of encrypted videos are constructed. Using the corrected fingerprints and a large plaintext fingerprint database, the study then proposes a sliding matching mechanism for corrected fingerprints and an encrypted video identification model based on the hidden Markov model and the Viterbi algorithm. The model uses dynamic programming to handle adaptive resolution switching. On real fingerprint databases on the order of 400,000 fingerprints built from Facebook and Instagram, it achieves identification accuracies of 98.41% and 97.91% for encrypted videos with fixed and adaptive resolutions, respectively. The generality and generalization ability of the method are validated on three further video platforms: Triller, Twitter, and Mango TV. Comparisons with similar work in identification performance, generalization, and time overhead further confirm that the proposed method has high practical value.
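
The abstract does not give the model's parameters, so the following is only a minimal sketch of how Viterbi decoding over a hidden Markov model can absorb resolution switching. Everything specific in it is an assumption made for illustration: the hidden state is taken to be the resolution of each segment, a plaintext fingerprint is assumed to be one list of combined audio and video segment lengths per resolution, the emission score is an unnormalized Gaussian on the relative length error, and the names `viterbi_resolution_path`, `identify`, `p_stay`, and `rel_sigma` are hypothetical.

```python
import math


def viterbi_resolution_path(observed, fingerprint, p_stay=0.9, rel_sigma=0.01):
    """Find the most likely resolution per segment for one candidate video.

    observed:    corrected segment lengths recovered from encrypted traffic.
    fingerprint: fingerprint[r][t] is the plaintext length of segment t at
                 resolution r (assumed layout; all lengths must be positive).
    Returns (log_score, path) where path[t] is a resolution index.
    """
    n_res, n_seg = len(fingerprint), len(observed)
    if n_res == 0 or n_seg == 0:
        raise ValueError("need at least one resolution and one segment")
    if any(len(row) < n_seg for row in fingerprint):
        raise ValueError("fingerprint is shorter than the observation")
    if not 0.0 < p_stay < 1.0:
        raise ValueError("p_stay must lie strictly between 0 and 1")

    log_stay = math.log(p_stay)
    # Switching probability is spread evenly over the other resolutions.
    log_switch = math.log((1.0 - p_stay) / (n_res - 1)) if n_res > 1 else -math.inf

    def log_emit(r, t):
        # Unnormalized Gaussian log-likelihood of the relative length error.
        rel_err = (observed[t] - fingerprint[r][t]) / fingerprint[r][t]
        return -0.5 * (rel_err / rel_sigma) ** 2

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialization with a uniform prior over resolutions.
    score = [-math.log(n_res) + log_emit(r, 0) for r in range(n_res)]
    back = []  # back[t - 1][r] is the best predecessor of state r at step t
    for t in range(1, n_seg):
        new_score, pointers = [], []
        for r in range(n_res):
            prev = max(range(n_res), key=lambda q: score[q] + log_trans(q, r))
            new_score.append(score[prev] + log_trans(prev, r) + log_emit(r, t))
            pointers.append(prev)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state.
    last = max(range(n_res), key=lambda r: score[r])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


def identify(observed, library, **kwargs):
    """Return (video_id, path) of the library entry with the best Viterbi score."""
    best_id, best_score, best_path = None, -math.inf, None
    for video_id, fingerprint in library.items():
        score, path = viterbi_resolution_path(observed, fingerprint, **kwargs)
        if score > best_score:
            best_id, best_score, best_path = video_id, score, path
    return best_id, best_path


if __name__ == "__main__":
    library = {
        "video_a": [[100, 110, 120, 130], [200, 220, 240, 260]],
        "video_b": [[150, 160, 170, 180], [300, 320, 340, 360]],
    }
    # The playback switches from resolution 0 to resolution 1 after two segments.
    print(identify([100, 110, 240, 260], library))
```

Because the decoder scores a whole path rather than one resolution at a time, a video that changes resolution mid-playback can still match a single library entry. This sketch scans every entry linearly and assumes the observation starts at the first segment, so it leaves out the sliding matching mechanism and any indexing a database on the order of 400,000 fingerprints would need.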

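Cite this article

Wu H, Luo H, Zhao SS, Liu ST, Cheng G, Hu XY. Encrypted video identification method for HTTP/2 traffic multiplexing features. Ruan Jian Xue Bao/Journal of Software, ,():1-30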

History
  • Received: 2023-03-15
  • Last revised: 2023-12-21
  • Published online: 2024-11-18