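Article abstract
李善青 (Li Shanqing). A Data Model of Integration and Representation for Similar Scientific Projects Detection [J]. 情报工程, 2017, 3(5): 053-059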
A Data Model of Integration and Representation for Similar Scientific Projects Detection
Original Chinese title: 一种用于科技项目查重的数据整合及描述模型
  
DOI:10.3772/j.issn.2095-915X.2017.05.007
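Chinese keywords: 数据整合 (data integration), 描述模型 (representation model), 科技项目查重 (similar scientific project detection), Hadoop 架构 (Hadoop architecture)
English keywords: data integration, project representation model, similar scientific project detection, Hadoop architecture
Funding: This work was supported by the National Natural Science Foundation of China project "Research on the Application of Big Data Mining in Similar Scientific Project Detection" (grant no. 71303223).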
Author and affiliation
李善青 (Li Shanqing), Institute of Scientific and Technical Information of China (中国科学技术信息研究所)
Abstract views: 2642
Full-text downloads: 1725
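Abstract (translated from the Chinese):
      Integrating information about the outputs of a scientific project indirectly reflects the project's research content, which can compensate for the difficulty of obtaining project proposals during similar project detection and therefore has significant research value. This paper proposes a data model that integrates information on the outputs related to scientific projects. The model integrates the scientific and technical reports, academic papers, and scientific achievements that a project produces, and extracts their keywords, titles, and abstracts to describe the project's research content accurately. It also gives greater weight to auxiliary information such as the principal investigator and the undertaking institution, thereby providing objective data support for solving the similar project detection problem.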
English abstract:
      Integrating information about research project outputs, which are closely related to the research content, can represent the research content of a project without access to the project proposal. This indirect description method has important research value for similar project detection. This paper proposes a data integration model of research project outputs that precisely represents the research content of a project with keywords, titles, and abstracts extracted from its published reports, papers, and achievements. Information about the principal investigator and the research organization is also introduced and applied to reinforce the similarity calculation. The model provides data support and lays a foundation for similar project detection.
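
The abstract does not give the model's schema, so the following is only a minimal sketch of how such an output-based project record might look. All class names, field names, and sample values (`Output`, `ProjectRecord`, `P-001`, and so on) are assumptions made for illustration, not the paper's actual design. It shows a project described through the keywords, titles, and abstracts of its outputs, together with the principal investigator and organization as auxiliary fields.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Output:
    """One project output (hypothetical schema): a report, paper, or achievement."""
    kind: str  # "report", "paper", or "achievement"
    title: str
    abstract: str
    keywords: list[str] = field(default_factory=list)


@dataclass
class ProjectRecord:
    """A project described through its outputs plus auxiliary information."""
    project_id: str
    principal_investigator: str
    organization: str
    outputs: list[Output] = field(default_factory=list)

    def keywords(self) -> list[str]:
        """Merge keywords from all outputs: lower-cased, de-duplicated, first-seen order."""
        seen: dict[str, None] = {}
        for out in self.outputs:
            for kw in out.keywords:
                seen.setdefault(kw.strip().lower(), None)
        return list(seen)

    def text(self) -> str:
        """Join titles and abstracts as a stand-in for the unavailable proposal text."""
        return "\n".join(f"{o.title}. {o.abstract}" for o in self.outputs)


# Placeholder values only; none of this is data from the paper.
record = ProjectRecord(
    project_id="P-001",
    principal_investigator="Example PI",
    organization="Example Institute",
    outputs=[
        Output("paper", "Title A", "Abstract A", ["Data Integration", "Hadoop"]),
        Output("report", "Title B", "Abstract B", ["hadoop", "project detection"]),
    ],
)
print(record.keywords())  # ['data integration', 'hadoop', 'project detection']
```

A similarity calculation could then compare the merged keywords and text of two records, with the investigator and organization fields available as additional signals; how the paper weights these is not stated in the abstract.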