Application of Neural Language Model in Statistical Machine Translation

Jiajun Zhang; Chengqing Zong

Article Abstract

Jiajun Zhang, Chengqing Zong. Application of Neural Language Model in Statistical Machine Translation [J]. 情报工程, 2017, 3(3): 021-028

DOI: 10.3772/j.issn.2095-915X.2017.03.004

Keywords: statistical machine translation, neural language model, word-based language model, phrase-based language model

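Funding: This work was supported by the National Natural Science Foundation of China under the "Cognitive Computing of Visual and Auditory Information" program, key project "Semantic Computing Methods for Chinese Text Understanding" (91520204), and by the National Natural Science Foundation of China general project "Research on Weakly Supervised Neural Translation Models" (61673380).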

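Author	Affiliation
Jiajun Zhang	National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Chengqing Zong	CAS Center for Excellence in Brain Science and Intelligence Technology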

Abstract:

Neural machine translation (NMT) has dominated machine translation research over the past two years. However, statistical machine translation (SMT) remains highly competitive in many application scenarios, especially in specialized domains. How to use deep learning to improve existing SMT systems has therefore become a key question for researchers. Because the language model is one of the core modules of SMT, this paper investigates the use of neural language models in SMT. We explore both word-based and phrase-based neural language models and evaluate them on Chinese-to-English and Chinese-to-Japanese translation tasks. The experiments show that neural language models significantly improve the translation quality of SMT.
