Patent Efficacy Word Recognition Based on the Hidden Markov Model

Zhang Bopei; Du Yongping; Ma Wenjian

Article Abstract

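Zhang Bopei, Du Yongping, Ma Wenjian. Patent Efficacy Word Recognition Based on the Hidden Markov Model [J]. Technology Intelligence Engineering (情报工程), 2015, 1(3): 081-089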


Efficacy Word Recognition Method Based on the Hidden Markov Model

DOI: 10.3772/j.issn.2095-915X.2015.03.011

Keywords: Patent Data, Efficacy Word Recognition, Hidden Markov Model

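Funding: Sub-project of the National Key Technology R&D Program (2013BAH21B02-01); Beijing Natural Science Foundation (4153058); Open Fund of the Shanghai Key Laboratory of Intelligent Information Processing (IIPL-2014-004)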

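Author	Affiliation
Zhang Bopei	College of Computer Science, Beijing University of Technology
Du Yongping	College of Computer Science, Beijing University of Technology
Ma Wenjian	College of Computer Science, Beijing University of Technology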


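Chinese abstract (translated):

As the volume of patent data keeps growing, in-depth mining of that data is becoming increasingly important; the technical efficacy information contained in patents is especially valuable. This paper proposes a method for recognizing patent efficacy words based on the hidden Markov model. Candidate efficacy words are first selected through lexical and syntactic analysis. On that basis, an efficacy word recognition algorithm is designed that uses the hidden Markov model combined with features of the improvements described in patent inventions, and it is applied to filter the candidates. The method is validated on patent datasets from different fields, such as new energy vehicles, with precision and recall as the evaluation criteria. The experimental results show that the method effectively improves both precision and recall.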

English abstract:

As patent data continues to grow, patent data mining is becoming increasingly important, especially because the technical efficacy information contained in patent data is of high value. We propose a method for recognizing efficacy words based on the hidden Markov model. Candidate efficacy words are first selected through lexical and syntactic analysis. A recognition algorithm is then designed by combining the hidden Markov model with features of the patent data. We run experiments in different patent fields, using precision and recall as the evaluation metrics. The experimental results show that our method achieves better performance.
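The filtering step described in the abstract can be illustrated with a minimal Viterbi decoding sketch for a two-state hidden Markov model. This is not the authors' implementation. The state names (`E` for efficacy word, `O` for other), the use of coarse part-of-speech tags as observations, and every probability value below are hypothetical and chosen only for illustration; the abstract does not give the actual features or parameters.

```python
import math

# Hypothetical hidden states: "E" = efficacy word, "O" = other word.
STATES = ("E", "O")

# Toy parameters, invented for illustration only (not taken from the paper).
START = {"E": 0.3, "O": 0.7}
TRANS = {
    "E": {"E": 0.4, "O": 0.6},
    "O": {"E": 0.3, "O": 0.7},
}
# Assumed observations: coarse part-of-speech tags of candidate words.
EMIT = {
    "E": {"verb": 0.6, "adj": 0.3, "noun": 0.1},
    "O": {"verb": 0.1, "adj": 0.1, "noun": 0.8},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for the observed tags."""
    if not observations:
        return []
    # scores[t][s]: best log-probability of any path ending in state s at step t.
    first = observations[0]
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][first]) for s in STATES}]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = (
                prev[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores.append(current)
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


print(viterbi(["noun", "verb", "noun"]))  # ['O', 'E', 'O']
```

With these toy parameters, the middle word of a noun, verb, noun sequence is labeled as an efficacy word. In a real system the transition and emission probabilities would be estimated from annotated patent text instead of being set by hand.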
