Data Governance Technology

Fund Project:

National Key Research and Development Program of China (2016YFB1000901); National Natural Science Foundation of China (91746209); Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education (IRT17R3)

    Abstract:

    With the pervasiveness of information technology, the amount of data generated by humans is growing at an exponential rate, and data at this scale calls for new management methodologies. Data governance is the management of an organization's (enterprise's or government's) data as a strategic asset: a set of management mechanisms spanning data collection, processing, and application, which aims to improve data quality, enable wide-ranging data sharing, and ultimately maximize the value of data. Big data research and development is now popular in many domains, yet big data governance is still in its infancy, even though an organization's decision-making cannot be separated from sound data governance. This paper first introduces the concepts, development, and necessity of data governance and big data governance. It then analyzes existing data governance technologies (data specification, data cleaning, data exchange, and data integration) and discusses maturity measurement and framework design for data governance. Building on this review, the paper proposes a "HAO governance" model for big data governance, which aims to facilitate HAO Intelligence by combining human intelligence (HI), artificial intelligence (AI), and organizational intelligence (OI), and instantiates the model using public security data governance as an example. Finally, the paper summarizes the challenges and opportunities of data governance.

Citation:

Wu XD, Dong BB, Du XZ, Yang W. Data governance technology. Ruan Jian Xue Bao/Journal of Software, 2019,30(9):2830-2856 (in Chinese).

History
  • Received: December 25, 2018
  • Revised: March 11, 2019
  • Online: May 24, 2019