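Educational Knowledge Graph Construction and Intelligent Question Answering Based on Large Language Models and Hybrid Retrieval-Augmented Generation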

Liu Jieping (刘杰平); Xia Lei (夏磊)

Article Abstract

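Liu Jieping, Xia Lei. Educational Knowledge Graph Construction and Intelligent Question Answering Based on Large Language Models and Hybrid Retrieval-Augmented Generation[J]. 情报工程, 2026, (1): 029-039.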

Educational Knowledge Graph Construction and Intelligent Question Answering Based on Large Language Models and Hybrid Retrieval-Augmented Generation

DOI：

Keywords: large language model (LLM); knowledge graph; Graph RAG; Hybrid RAG

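Funding: China Association of Higher Education, Higher Education Scientific Research Planning Project "Research on Knowledge Graph Construction and an Intelligent Question Answering Platform for the Big Data Major" (22SZH0305); Association of Fundamental Computing Education in Chinese Universities, Teaching Research Project "Research on the Application of Knowledge Graph Construction Integrated with Large Language Models in Talent Cultivation" (2024-AFCEC-219).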

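Author	Affiliation
Liu Jieping	School of Intelligent Science and Engineering, Chengdu Neusoft University, Chengdu 611844
Xia Lei	School of Intelligent Science and Engineering, Chengdu Neusoft University, Chengdu 611844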

Abstract:

[Objective/Significance] Educational knowledge graphs play a key role in helping learners understand the structure of knowledge and plan learning paths. Constructing them, however, involves technically difficult steps such as named entity recognition (NER) and relation extraction (RE). This study proposes a framework that combines large language models (LLMs) with hybrid retrieval-augmented generation (Hybrid RAG) to improve both the construction of educational knowledge graphs and intelligent question answering. [Methods/Process] The framework integrates multi-source educational data in line with the Outcome-Based Education (OBE) paradigm and fine-tunes an LLM to improve performance on the NER and RE tasks. Hybrid RAG is then used to further improve the accuracy of the question answering system. [Results/Conclusions] The study makes three contributions. First, it proposes a new, OBE-based perspective on knowledge graph construction. Second, it builds the Text2RDF dataset, which improves LLM performance on NER and RE. Third, it combines the strengths of Vector RAG and Graph RAG, improving the model's performance on fact retrieval, multi-hop reasoning, and information integration. Experimental results show that the LoRA fine-tuned model achieves significantly higher F1 scores on the NER and RE tasks, and that Hybrid RAG is more accurate across a range of question answering scenarios.
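The combination of Vector RAG and Graph RAG mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not describe. Everything here is an assumption made for illustration: the toy passages and prerequisite triples, the relation name `has_prerequisite`, the function names, the use of bag-of-words cosine similarity in place of a dense embedding model, and an in-memory triple list in place of a graph database.

```python
import math
from collections import Counter

# Hypothetical toy data for illustration only (not taken from the paper).
PASSAGES = [
    "Gradient descent minimizes a loss function by following its negative gradient.",
    "Backpropagation computes gradients of the loss with respect to network weights.",
    "A relational database stores data in tables linked by keys.",
]
TRIPLES = [
    ("Neural Networks", "has_prerequisite", "Backpropagation"),
    ("Backpropagation", "has_prerequisite", "Gradient Descent"),
    ("Gradient Descent", "has_prerequisite", "Calculus"),
]


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    return [t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)  # a Counter returns 0 for missing keys
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_retrieve(question, k=1):
    """Vector-style retrieval: rank passages by similarity to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        PASSAGES, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def graph_retrieve(question, max_hops=2):
    """Graph-style retrieval: follow outgoing edges from entities named in the question."""
    q = question.lower()
    entities = {e for s, _, o in TRIPLES for e in (s, o)}
    frontier = {e for e in entities if e.lower() in q}
    found = []
    for _ in range(max_hops):
        step = [t for t in TRIPLES if t[0] in frontier and t not in found]
        found.extend(step)
        frontier = {o for _, _, o in step}
    return found


def hybrid_context(question, k=1, max_hops=2):
    """Merge both retrieval results into one context string for an LLM prompt."""
    lines = ["Passages:"]
    lines += [f"- {p}" for p in vector_retrieve(question, k)]
    lines.append("Graph facts:")
    lines += [f"- {s} {r} {o}" for s, r, o in graph_retrieve(question, max_hops)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(hybrid_context("What should I learn before Backpropagation?"))
```

In this sketch the passage lookup supplies descriptive text for a single concept, while the two-hop graph walk surfaces the prerequisite chain that no single passage states, which is the kind of multi-hop question the abstract says benefits from the hybrid approach.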
