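Article Abstract
Yang Yana (杨雅娜), Liu Shengqi (刘胜奇). Automatic Term Extraction Based on Advanced TValue and Fieldhood Integration [J]. 情报工程, 2015, 1(5): 025-031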
Automatic Term Extraction Based on Advanced TValue and Fieldhood Integration (基于TValue融合领域度的术语抽取法)
  
DOI:10.3772/j.issn.2095-915X.2015.05.004
Keywords: Term Extraction, Term Recognition, Data Mining, Fieldhood
Funding:
Author  Affiliation
Yang Yana (杨雅娜)  Postal Savings Bank of China
Liu Shengqi (刘胜奇)  China Patent Information Center
Abstract views: 3215
Full-text downloads: 2422
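Chinese abstract (translated):
      This paper proposes the ATValue (Advanced TValue and Fieldhood Integration) term extraction method. To improve the quality of term extraction, it adds fieldhood to the five attributes of TValue. Correlation analysis yields AValue, a combined value of the six attributes, and candidate terms are selected by identifying the word strings whose AValue exceeds the term confidence threshold. Experiments in the energy industry show that the F-score of ATValue is about 2 percentage points higher than that of TValue, because ATValue's fieldhood measures how much each word in a string contributes to the domain.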
English abstract:
      This paper proposes ATValue (Advanced TValue and Fieldhood Integration), an automatic term extraction method. To improve the quality of term extraction, it adds a fieldhood attribute to the five attributes of TValue. After the correlations among the six attributes are analyzed, they are combined by multiplication of probabilities into a single value, AValue. Candidate terms are the strings whose AValue exceeds a predefined confidence threshold. Experiments on term extraction in the energy industry show that the F-score of ATValue is about 2 percentage points higher than that of TValue, because the fieldhood attribute measures how much each word in a string contributes to the domain.
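The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the attribute definitions or the exact combination formula, so the sketch simply assumes six per-string scores in [0, 1] that are multiplied together, a hypothetical threshold of 0.3, made-up scores and strings, and the balanced F-score (harmonic mean of precision and recall) as the evaluation measure. The function names `avalue`, `select_candidates`, and `f_score` are illustrative only.

```python
from math import prod


def avalue(attribute_scores):
    """Combine per-attribute scores (assumed to lie in [0, 1]) by multiplication."""
    return prod(attribute_scores)


def select_candidates(scored_strings, threshold):
    """Keep the strings whose combined value exceeds the confidence threshold."""
    return {
        string
        for string, scores in scored_strings.items()
        if avalue(scores) > threshold
    }


def f_score(predicted, gold):
    """Balanced F-score: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up scores: six values per string, the last one standing in for fieldhood.
scored = {
    "wind turbine": [0.9, 0.8, 0.9, 0.9, 0.8, 0.9],
    "of the": [0.9, 0.2, 0.3, 0.5, 0.4, 0.1],
    "power grid": [0.8, 0.9, 0.7, 0.9, 0.9, 0.8],
}
gold_terms = {"wind turbine", "power grid", "solar cell"}  # hypothetical gold standard

candidates = select_candidates(scored, threshold=0.3)
score = f_score(candidates, gold_terms)
```

With these illustrative numbers, the string with a low fieldhood score falls below the threshold, and the missing gold term lowers recall rather than precision.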