Technique for Continuous Truth Discovery Over Multiple-Source Sensor Data Streams

doi:10.13328/j.cnki.jos.005033


Volume 27, Issue 7, 2016, pp. 1655-1670

Authors: LI Tian-Yi, GU Yu, MA Qian, LI Fang-Fang, YU Ge
Affiliation: School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
Fund Project: National Key Basic Research Program of China (973) (2012CB316201); National Natural Science Foundation of China (61433008, 61472071, 61272179); Fundamental Research Funds for Central Universities (N140404013)


Abstract:

Truth discovery, a method for assessing the validity of conflicting information provided by multiple data sources, has been widely studied in the conventional database community. However, most existing truth discovery solutions are unsuitable for data stream applications, mainly because they rely on iterative processes. This paper studies the problem of continuous truth discovery over a special kind of data stream: sensor data streams. Based on the characteristics of sensor data and its applications, a strategy is proposed that varies the frequency of source reliability assessment to reduce the number of iterative processes and thereby improve the efficiency of truth discovery over multiple-source sensor data streams. First, definitions are given of when the relative errors and cumulative errors are small, together with the necessary conditions on the variation of source reliability between adjacent time points. Next, a probabilistic model is given to predict the probability that these necessary conditions are met. Then, by combining these results, the maximal assessment period of source reliability is derived under the condition that the cumulative prediction error stays below a given threshold at a given confidence level, so as to improve efficiency. The truth discovery problem is thus transformed into an optimization problem. Furthermore, an algorithm, CTF-Stream (continuous truth finding over sensor data streams), is constructed to assess source reliability at a variable frequency. CTF-Stream uses historical data to dynamically determine when source reliability needs to be assessed, and finds the truth at a user-specified accuracy while improving efficiency. Finally, the efficiency and accuracy of the proposed methods are validated through extensive experiments on a real sensor dataset.
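The general idea of re-assessing source reliability only at selected time points can be illustrated with a minimal Python sketch. This is not the CTF-Stream algorithm: the abstract does not give its formulas, so the sketch assumes a fixed assessment period instead of the derived maximal period, and it borrows a logarithmic loss-based weight update as a stand-in for the paper's reliability model. The names `assess_reliability`, `continuous_truth`, `period`, and `window` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def weighted_mean(readings: Sequence[float], weights: Sequence[float]) -> float:
    """Estimate the truth at one time point as a reliability-weighted average."""
    return sum(w * x for w, x in zip(weights, readings)) / sum(weights)


def assess_reliability(window: Sequence[Sequence[float]],
                       iterations: int = 10) -> List[float]:
    """Iteratively estimate one weight per source from recent readings.

    Assumption: weight = -log(source loss / total loss), a common choice in
    iterative truth discovery; the paper's own model may differ.
    Requires at least two sources.
    """
    k = len(window[0])
    weights = [1.0] * k
    for _ in range(iterations):
        truths = [weighted_mean(row, weights) for row in window]
        # Squared deviation of each source from the current truth estimates;
        # the small constant keeps the logarithm finite for perfect sources.
        losses = [sum((row[j] - t) ** 2 for row, t in zip(window, truths)) + 1e-12
                  for j in range(k)]
        total = sum(losses)
        weights = [-math.log(loss / total) for loss in losses]
    return weights


def continuous_truth(stream: Sequence[Sequence[float]],
                     period: int,
                     window: int = 5) -> Tuple[List[float], int]:
    """Process a stream, running the costly iteration only every `period` steps.

    Assumption: `period` is fixed here, whereas the paper determines the
    assessment time dynamically from historical data.
    """
    history: List[List[float]] = []
    truths: List[float] = []
    weights: List[float] = []
    assessments = 0
    for t, row in enumerate(stream):
        history.append(list(row))
        if len(history) > window:
            history.pop(0)
        if t % period == 0:
            weights = assess_reliability(history)  # iterative, expensive step
            assessments += 1
        truths.append(weighted_mean(row, weights))  # single pass, cheap step
    return truths, assessments


if __name__ == "__main__":
    # Three hypothetical sensors: accurate, slightly biased, strongly biased.
    demo = [[20.0 + 0.1 * ((t % 3) - 1), 20.5, 25.0] for t in range(12)]
    estimates, runs = continuous_truth(demo, period=4)
    print(runs, [round(e, 2) for e in estimates])
```

Between assessments, each time point costs only one weighted average, which is where the efficiency gain comes from. A longer period saves more iterations but lets the reused weights drift from the sources' current reliability, which is the trade-off the paper bounds with its error threshold and confidence level.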

Key words: multiple-source; data stream; sensor data; truth discovery; source reliability


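Citation:

Li TY, Gu Y, Ma Q, Li FF, Yu G. Technique for continuous truth discovery over multiple-source sensor data streams. Ruan Jian Xue Bao/Journal of Software, 2016,27(7):1655-1670 (in Chinese).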


History

Received: September 25, 2015
Revised: January 12, 2016
Online: March 24, 2016
