Intelligent diagnosis model for laying hen diseases integrating text and knowledge graph

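    • Summary: Diagnosing laying hen diseases from a single text description analyzes associated information incompletely and does not supply complete disease knowledge, which leads to low accuracy on complex diseases. To address this, the study proposes a method that uses the bidirectional encoder representation from transformers (BERT) pre-trained model to fuse a knowledge graph of typical laying hen diseases with text, and combines it with a bidirectional long short-term memory (BiLSTM) neural network to build the BERT-LHDKG (BERT-laying hens disease knowledge graph) diagnosis model. The model diagnoses 38 typical laying hen diseases, including Mycoplasma synoviae, Newcastle disease, and infectious coryza. By introducing triple vectors that represent the knowledge graph, the model draws on both disease text and knowledge graph data when analyzing the condition of affected hens. By extending the Embedding structure of BERT, the text feature vectors and the triple vectors are added together inside BERT to form fused vectors, which helps the model extract more useful features for disease analysis and diagnosis. In the performance comparison experiments, BERT-LHDKG reached a macro-precision of 94.27%, a macro-recall of 94.12%, and a macro-F1 of 94.01%. Compared with deep learning models such as TextCNN, BERT combined with CNN (convolutional neural networks), and ERNIE combined with BiLSTM, macro-precision improved by 10.02, 2.64, and 2.18 percentage points, macro-recall by 10.28, 2.29, and 2.29 percentage points, and macro-F1 by 10.66, 2.51, and 2.19 percentage points, respectively. For the viral, bacterial, toxic, and metabolic diseases that commonly occur in laying hen farming, the macro-F1 of BERT-LHDKG was 96.43%, 95.57%, 96.72%, and 98.24%, respectively, better than every compared model. The results indicate that integrating the knowledge graph lets the model link entities and relations in the disease text to the corresponding entities in the knowledge graph, which enriches the semantic information of the text, improves the model's ability to understand the text as a whole, and in turn improves the accuracy and robustness of diagnosis, offering a new approach to intelligent diagnosis of livestock and poultry diseases. In addition, the laying hen disease diagnosis Web system built on BERT-LHDKG uses human-machine dialogue to make remote diagnosis more flexible for farmers.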
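    The fusion step described above, adding text feature vectors and triple vectors element-wise to form fused vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding tables are random stand-ins for learned weights, and the names `make_table`, `fuse_embeddings`, and the convention that triple index 0 means "no linked triple" are assumptions made only for illustration. A real BERT embedding layer also adds position and segment embeddings and applies layer normalization, which are omitted here.

```python
import random


def make_table(num_rows, dim, seed):
    """Build a toy embedding table of random floats (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def fuse_embeddings(token_ids, triple_ids, token_table, triple_table):
    """Add each token's text vector to the vector of its linked triple.

    triple_ids[i] is the (hypothetical) index of the knowledge-graph triple
    linked to token i, or 0 when the token has no linked triple.
    """
    if len(token_ids) != len(triple_ids):
        raise ValueError("token_ids and triple_ids must have the same length")
    fused = []
    for tok, tri in zip(token_ids, triple_ids):
        text_vec = token_table[tok]
        triple_vec = triple_table[tri]
        # Element-wise sum: the fused vector keeps the text vector's dimension.
        fused.append([t + k for t, k in zip(text_vec, triple_vec)])
    return fused


DIM = 4  # toy dimension; BERT-base uses 768
token_table = make_table(10, DIM, seed=0)
triple_table = make_table(5, DIM, seed=1)
triple_table[0] = [0.0] * DIM  # assumed convention: row 0 = no linked triple

token_ids = [3, 7, 2]   # three tokens of a symptom description
triple_ids = [0, 2, 0]  # only the second token is linked to a triple
fused = fuse_embeddings(token_ids, triple_ids, token_table, triple_table)
```

    Because the two vectors are summed rather than concatenated, the fused sequence has the same shape as the ordinary text embedding, so the downstream encoder layers need no change in this sketch.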

       

      Abstract: Frequent disease outbreaks pose a serious threat to the healthy development of the laying hen industry. Previous studies have mainly applied artificial intelligence technologies to the intelligent diagnosis of laying hen diseases. However, models that rely on a single type of data or knowledge are not accurate enough for complex laying hen diseases, and they cover only a small number of diagnosable diseases. Knowledge graphs, with their graph structure, are expected to make large amounts of associated data readily available. Furthermore, diagnosis from a single text description analyzes associated information incompletely and lacks complete knowledge of laying hen diseases. This study proposes a method that uses Bidirectional Encoder Representation from Transformers (BERT) to integrate a knowledge graph of typical laying hen diseases with text, combined with a Bidirectional Long Short-Term Memory (BiLSTM) network, to construct the BERT-LHDKG (BERT-Laying Hens Disease Knowledge Graph) diagnostic model. The model was used for intelligent diagnosis of 38 typical laying hen diseases, such as Mycoplasma synoviae, Newcastle disease, and infectious coryza. Triple vectors were introduced to represent the knowledge graph, so that disease text and knowledge graph data were combined in a more complete analysis of disease onset in laying hens. The text feature vector and the triple vector were added together inside the BERT model to form a fusion vector, from which more useful features were extracted for disease analysis and diagnosis. The performance comparison test showed that the BERT-LHDKG diagnostic model reached a macro-precision of 94.27%, a macro-recall of 94.12%, and a macro-F1 of 94.01%. Compared with deep learning models such as TextCNN, the BERT model combined with CNN (convolutional neural networks), and the ERNIE model combined with BiLSTM, macro-precision improved by 10.02, 2.64, and 2.18 percentage points, macro-recall by 10.28, 2.29, and 2.29 percentage points, and macro-F1 by 10.66, 2.51, and 2.19 percentage points, respectively. Better performance was also achieved on the four categories of disease that are prone to occur in laying hen farming: for viral, bacterial, toxic, and metabolic diseases, the macro-F1 values of the BERT-LHDKG diagnostic model were 96.43%, 95.57%, 96.72%, and 98.24%, respectively. The ablation experiments showed that removing any layer from the BERT-LHDKG structure reduced diagnostic accuracy. In particular, macro-precision, macro-recall, and macro-F1 decreased by 5.14, 4.98, and 5.46 percentage points, respectively, after the laying hen disease knowledge graph was removed. Integrating the knowledge graph therefore links the entities and relationships in the disease text to the corresponding entities in the knowledge graph, which enriches the semantic information of the text and helps the model understand its content fully, thereby improving the accuracy and robustness of disease diagnosis. In addition, a laying hen disease diagnosis web system was developed on the basis of the BERT-LHDKG diagnostic model; its human-machine dialogue makes remote diagnosis of laying hen diseases more flexible.
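      The macro-averaged scores reported above can be reproduced in form, though not in value, with a short sketch. It assumes the common convention in which precision, recall, and F1 are computed per disease class and then averaged with equal weight per class; the abstract does not state which definition was used. The function name `macro_scores` and the toy labels are hypothetical.

```python
def macro_scores(y_true, y_pred):
    """Return (macro-precision, macro-recall, macro-F1).

    Assumed convention: each metric is computed per class, then the
    per-class values are averaged with equal weight.
    """
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with three hypothetical class labels (not data from the study).
y_true = ["ND", "ND", "IC", "MS"]
y_pred = ["ND", "IC", "IC", "MS"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

      Under this convention macro-F1 can fall below both macro-precision and macro-recall, as it does in the reported figures (94.01% against 94.27% and 94.12%); a harmonic mean of the two macro values would necessarily lie between them.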

       
