Meng Xuyang, Bai Haiyan. Structure-Function Recognition of Literature Abstract and Application in Keyword Extraction [J]. 情报工程, 2022, 8(1): 079-089 |
文献摘要结构功能识别在关键词抽取中的应用 |
Structure-Function Recognition of Literature Abstract and Application in Keyword Extraction |
DOI:10.3772/j.issn.2095-915X.2022.01.008 |
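Chinese keywords: 学术文献 (academic literature); 关键词抽取 (keyword extraction); 结构功能 (structure-function); 分类模型 (classification model) |
English keywords: Academic literature; keyword extraction; structure-function; classification model |
Funding: |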
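Author | Affiliation |
Meng Xuyang (孟旭阳) | Institute of Scientific and Technical Information of China, Beijing 100038 |
Bai Haiyan (白海燕) | Institute of Scientific and Technical Information of China, Beijing 100038 |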
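Chinese abstract (English translation): |
[Purpose/significance] Traditional automatic keyword extraction treats the abstract as a single unit and usually builds features from non-semantic information such as how often a candidate word occurs. It does not account for the semantic differences among the structure-function parts of an academic abstract, such as purpose, method and conclusion. Taking Chinese literature as the research object, this paper examines how the structure-function domain in which a candidate word appears influences keyword extraction. [Method/process] The title and abstract text of each document are divided into 4 structure-function domains. On top of traditional baseline features such as word frequency, word length and word span, BERT-based semantic features and structure-function features are fused, and automatic keyword extraction experiments are run with a classification model under different feature combinations, using Chinese academic literature from library and information science. [Result/conclusion] The experiments show that fusing the structure-function features improves overall keyword extraction performance by 6.82%, demonstrating that the structure-function features obtained by recognizing the structure-function of academic abstracts benefit keyword extraction. |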
English abstract: |
[Purpose/significance] Traditional automatic keyword extraction treats the abstract text as a whole and often constructs features from the frequency of candidate words, ignoring the semantic differences among the structure-function parts of an academic abstract, such as purpose, method and conclusion. This paper takes Chinese literature as the research object and studies how the structure-function domain of a candidate word influences keyword extraction. [Method/process] The document title and abstract are divided into 4 structure-function domains. On top of traditional baseline features such as word frequency, word length and word span, we propose a mixed-feature method that adds BERT-based semantic features of academic text and structure-function features, and we conduct automatic keyword extraction experiments with a classification model, using different feature combinations on Chinese academic literature in the field of library and information science. [Result/conclusion] The experimental results show that keyword extraction performance improves by 6.82% after fusing the structure-function features, which indicates that structure-function recognition of academic abstracts plays a positive role in automatic keyword extraction. |
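The feature fusion described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the four domain names (`title`, `purpose`, `method`, `conclusion`), the span formula, and the function name `candidate_features` are assumptions made for illustration, and the BERT-based semantic features and the classifier itself are left out. The sketch only shows how the baseline features (frequency, length, span) might be concatenated with structure-function indicators for one candidate word.

```python
from collections import Counter

# Hypothetical domain labels: the abstract says the title and abstract are
# split into 4 structure-function domains but does not name them.
DOMAINS = ("title", "purpose", "method", "conclusion")


def candidate_features(candidate, segments):
    """Return baseline + structure-function features for one candidate word.

    segments: list of (domain, tokens) pairs in document order.
    Output layout: [frequency, length, span, one flag per entry of DOMAINS].
    """
    all_tokens = [tok for _, toks in segments for tok in toks]
    positions = [i for i, tok in enumerate(all_tokens) if tok == candidate]
    if not positions:
        raise ValueError(f"candidate {candidate!r} does not occur in the text")

    total = len(all_tokens)
    frequency = len(positions) / total
    length = float(len(candidate))
    # Assumed span definition: distance between first and last occurrence,
    # normalised by document length.
    span = (positions[-1] - positions[0] + 1) / total

    # Structure-function feature: in which domains does the candidate occur?
    domain_counts = Counter(
        domain for domain, toks in segments for tok in toks if tok == candidate
    )
    domain_flags = [1.0 if domain_counts[d] > 0 else 0.0 for d in DOMAINS]

    return [frequency, length, span] + domain_flags


# Toy, already-tokenised document (invented for illustration only).
example_segments = [
    ("title", ["keyword", "extraction", "structure", "function"]),
    ("purpose", ["keyword", "extraction", "ignores", "structure"]),
    ("method", ["bert", "features", "classification", "model"]),
    ("conclusion", ["structure", "function", "features", "help"]),
]
vec = candidate_features("structure", example_segments)
print(vec)  # [0.1875, 9.0, 0.6875, 1.0, 1.0, 0.0, 1.0]
```

In a full pipeline, a vector like this would be computed for every candidate word and passed, together with any semantic embedding, to a binary classifier that decides whether the candidate is a keyword. Dropping the last four entries gives the baseline-only feature set, which is one way to compare feature combinations.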