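Gu Yingzhi, Dong Cheng, Pei Bingbing, Du Yongping. The Recommendation Approach Based on Term Extraction and Graduation Matching[J]. Technology Intelligence Engineering (情报工程), 2018, 4(3): 058-068 |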
The Recommendation Approach Based on Term Extraction and Graduation Matching (Chinese title: 基于术语抽取与分级匹配的项目指南推荐方法) |
|
DOI:10.3772/j.issn.2095-915X.2018.03.008 |
Keywords: term extraction; recommendation technology; scientific literature |
Funding: Special Project on Innovation Methods of the Ministry of Science and Technology (2015IM020500); Beijing Natural Science Foundation (4153058) |
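Author | Affiliation |
Gu Yingzhi | Faculty of Information Technology, Beijing University of Technology |
Dong Cheng | 2. Institute of Scientific and Technical Information of China; 3. 富媒体数字出版内容组织与知识服务重点实验 |
Pei Bingbing | Faculty of Information Technology, Beijing University of Technology |
Du Yongping | Faculty of Information Technology, Beijing University of Technology |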
|
Abstract views: 2430 |
Full-text downloads: 1442 |
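Chinese abstract (translated): |
Information recommendation is an important technology in natural language processing. To recommend project guidelines to researchers more effectively, this paper represents text features with terms and pushes recommendations through graded matching. Candidate terms are extracted with a method that combines part-of-speech rules with syntactic information, and are then filtered with statistical measures such as C-value and SCP (Symmetrical Conditional Probability) to improve extraction quality. The similarity between a guideline and a researcher is represented by graded matching of their terms, which enables personalized guideline recommendation. Experiments on 42 guidelines published in 2017 on the National Science and Technology Management Information System Public Service Platform analyze the term extraction results and evaluate recommendation accuracy. The results show that the C-value+SCP method yields better term extraction quality, and that the accuracy of personalized guideline recommendation reaches up to 80%. |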
English abstract: |
Text recommendation is an important technology in the field of natural language processing. To recommend project guidelines to researchers, this paper uses terms to represent text features and makes recommendations based on graduation (graded) matching. Part-of-speech rules and syntactic information are used to extract candidate terms, which are then filtered by statistical methods such as C-value and SCP (Symmetrical Conditional Probability) to improve extraction quality. Graded matching between the terms of a guideline and those of a researcher measures their similarity and drives the personalized recommendation. The experiments use the 42 project guidelines published by the public service platform in 2017. The results show that the C-value+SCP method achieves better term extraction quality and that the precision of personalized recommendation reaches up to 80%. |
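The two statistical filters named in the abstract can be illustrated with a minimal Python sketch. It follows the commonly cited textbook definitions of C-value and of SCP with fair dispersion, not necessarily the exact variant used in the paper. The function names, the toy counts, and the probability table are all hypothetical, and the abstract does not say how the two scores are combined, so no combination rule is shown.

```python
import math


def is_nested(short, long):
    """True if token tuple `short` occurs contiguously inside the longer `long`."""
    n = len(short)
    if n >= len(long):
        return False
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def c_value(freq):
    """Textbook C-value for each candidate term.

    `freq` maps a candidate term (tuple of tokens) to its corpus frequency.
    A term nested in longer candidates is discounted by the mean frequency
    of those longer candidates. Note that log2(1) = 0, so single-word
    candidates score 0 under this formulation.
    """
    scores = {}
    for a, f_a in freq.items():
        containers = [f_b for b, f_b in freq.items() if is_nested(a, b)]
        if containers:
            f_a = f_a - sum(containers) / len(containers)
        scores[a] = math.log2(len(a)) * f_a
    return scores


def scp(term, prob):
    """SCP with fair dispersion for a multi-word term.

    `prob` maps n-grams (tuples of tokens) to their estimated probabilities
    and must contain `term` plus every prefix and suffix produced by
    splitting it at each internal position.
    """
    n = len(term)
    avp = sum(prob[term[:i]] * prob[term[i:]] for i in range(1, n)) / (n - 1)
    return prob[term] ** 2 / avp


# Toy data (assumed for illustration only, not taken from the paper).
toy_freq = {
    ("term", "extraction"): 5,
    ("automatic", "term", "extraction"): 2,
    ("information", "retrieval"): 3,
}
toy_prob = {
    ("term",): 0.1,
    ("extraction",): 0.05,
    ("term", "extraction"): 0.04,
}

if __name__ == "__main__":
    for term, score in c_value(toy_freq).items():
        print(" ".join(term), round(score, 3))
    print("SCP:", round(scp(("term", "extraction"), toy_prob), 3))
```

C-value favors longer candidates that occur often on their own rather than only inside other candidates, while SCP measures how tightly the words of a candidate cohere. Using both as filters, as the abstract reports, targets candidates that are both frequent as units and internally cohesive.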