刘枬, 刘润, 李卓城. Research on Data Quality Evaluation Considering the Duality of Data Dimension and Quality Dimension [J]. 情报工程, 2024, 10(4): 025-035 |
Research on Data Quality Evaluation Considering the Duality of Data Dimension and Quality Dimension |
|
DOI:10.3772/j.issn.2095-915X.2024.04.003 |
Keywords: Quality Evaluation Model; Data Dimension; Data Quality Dimension |
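Funding: Ministry of Education fund project "Research on the Pricing Mechanism, Methods, and Norms of Big Data Assets" (20XJAZH007); Chongqing Social Science Planning Fund project "Research on the Cross-Border Data Interconnection Mechanism of the New Western Land-Sea Corridor" (2023ZDLH11). |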
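Author | Affiliation |
刘枬 | School of Economics and Management, Chongqing Jiaotong University, Chongqing 400074 |
刘润 | School of Economics and Management, Chongqing Jiaotong University, Chongqing 400074 |
李卓城 | Queen Mary School Hainan, Beijing University of Posts and Telecommunications, Beijing 572400 |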
|
Abstract: |
[Objective/Significance] To address the high costs caused by poor relevance and insufficient integrity in data development, this article constructs a data quality evaluation model that considers both data dimensions and quality dimensions. The aim is to evaluate the quality of each data dimension and to analyze its value. [Methods/Process] Drawing on an analysis of the data quality literature, the study first uses the modified Delphi method to construct an index system of data quality dimensions. It then uses the analytic hierarchy process to select indicators and Monte Carlo sampling to calculate Shapley values, yielding a quality evaluation model for the individual data dimensions. In a case study of real estate data, the model is applied to 12 data dimensions. The results show that three dimensions are of poor quality and are recommended for deletion, and modifications are suggested for the other dimensions. [Limitations] The article does not resolve the difficulty of quantifying data quality dimension indicators, nor does it classify data; it proposes only a general evaluation model. [Results/Conclusions] The study integrates data dimensions with data quality dimensions and, based on the data quality dimension index system and Shapley values, constructs a relatively comprehensive data quality evaluation model that supports a more detailed quality evaluation of a database. |
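The abstract names Monte Carlo sampling as the way the Shapley values are computed but gives no detail. The following is a minimal sketch of one common approach, permutation sampling, and is not taken from the paper. The function name `monte_carlo_shapley`, the column names, the weights, and the toy value function `toy_quality_value` are all hypothetical; the paper's actual value function would come from its quality dimension index system.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def monte_carlo_shapley(
    players: Sequence[str],
    value_fn: Callable[[FrozenSet[str]], float],
    n_permutations: int = 1000,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition = set()
        prev_value = value_fn(frozenset(coalition))
        for p in order:
            # Marginal contribution of p when it joins the players before it.
            coalition.add(p)
            cur_value = value_fn(frozenset(coalition))
            totals[p] += cur_value - prev_value
            prev_value = cur_value
    return {p: totals[p] / n_permutations for p in players}


# Hypothetical data dimensions (columns) with made-up standalone scores.
WEIGHTS = {"price": 3, "area": 2, "noise_id": 0}


def toy_quality_value(coalition: FrozenSet[str]) -> float:
    """Assumed value function: summed scores, plus a bonus of 1
    when 'price' and 'area' are both present."""
    value = sum(WEIGHTS[c] for c in coalition)
    if "price" in coalition and "area" in coalition:
        value += 1
    return float(value)


estimates = monte_carlo_shapley(list(WEIGHTS), toy_quality_value, n_permutations=2000)
```

In this toy game the exact Shapley values are 3.5 for `price`, 2.5 for `area`, and 0 for `noise_id`, so the estimates should land close to those figures. A dimension whose estimate stays near zero adds little to any coalition, which is the kind of signal that could motivate a deletion recommendation like the one the abstract reports.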