Incorporating relational contexts and paths for tea-planting knowledge graph completion
Abstract
Tea planting has developed rapidly in China in recent years, and with the growth of the internet it is increasingly important to integrate the massive and complex body of tea-planting knowledge in a structured manner. Because the sources of tea-planting knowledge are diverse and scattered, that knowledge remains fragmented. Knowledge graphs can greatly improve the manageability and accessibility of such data, and a complete tea-planting knowledge graph would help promote the informatization and modernization of the entire tea industry; however, no such complete graph is yet available. In this study, a systematic, structured knowledge graph of tea planting was constructed from organized knowledge resources in the field. To predict the missing relationships between entities in this graph, a tea-planting knowledge graph completion (TPKGC) model that incorporates relational contexts and relational paths was proposed. The model consists of three main parts: a relational context layer, a relational path layer, and a fusion output layer. In the relational context layer, the TeaConAggr (tea context aggregate) module first aggregates the relational context of each entity at every hop; a multi-layer relational message-passing mechanism then summarizes these hop-level contexts into the relational context of an entity pair. In the relational path layer, a relational path aggregation module aggregates the relational paths between entities, and a relational path learning module learns features of those paths. In the fusion output layer, multiplicative attention followed by a Softmax function relates the relational context representation from the context layer to the relational path features from the path layer, yielding a weight for each path.
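As a rough illustration of how multiplicative attention and Softmax can turn a context vector and a set of path features into per-path weights, the sketch below scores each path with a plain dot product. The function names, the toy vectors, and the omission of a learned weight matrix are assumptions made for illustration only; the abstract does not specify these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def path_weights(context, path_features):
    """Score each path feature against the context vector with a dot
    product (the simplest form of multiplicative attention; a learned
    matrix between the two vectors is assumed away here), then
    normalise the scores with softmax."""
    scores = [
        sum(c * p for c, p in zip(context, feat))
        for feat in path_features
    ]
    return softmax(scores)


# Toy inputs with made-up values (hypothetical, not taken from the paper).
context = [1.0, 0.0]
paths = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights = path_weights(context, paths)
print(weights)
```

Paths whose features align more closely with the context vector receive larger weights, and the weights always sum to one.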
Subsequently, these weights were applied to the relational path features to obtain an aggregated representation of the relational paths. Finally, additive attention was used to combine the relational context representation with the aggregated relational path representation, producing a predicted ranking of the relationships between each entity pair and thereby completing the missing relationships in the knowledge graph. Data for constructing the tea-planting knowledge graph were collected from two major types of sources: books and the internet. Sixteen entity types and 16 relationship types were defined to represent the various entities and their complex relationships, and the TPKGData dataset was developed for the tea-planting knowledge graph. Experimental results showed that the model achieved 85.40%, 80.95%, and 90.08% on the Mean Reciprocal Rank (MRR), Hits@1, and Hits@3 metrics, respectively, which were 2.56, 2.63, and 4.25 percentage points higher than those of the Shallom model, confirming its effectiveness in predicting missing relationships between entities in the tea-planting knowledge graph. To further verify generalization, comparative experiments were also conducted on the publicly available FB15K-237 and WN18RR datasets, on which the TPKGC model likewise achieved the best performance. The model therefore performed well on both the domain-specific dataset and the public benchmarks, with stable results across different knowledge graph datasets. Overall, the TPKGC model demonstrated excellent performance and generalization potential in relationship completion for the tea-planting knowledge graph, and the findings can provide guidance for the construction of knowledge graphs.
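For readers unfamiliar with the reported metrics, the following minimal sketch shows how Mean Reciprocal Rank and Hits@k are conventionally computed from the rank of the correct relationship for each test entity pair. The helper names and the example ranks are hypothetical and are not taken from the paper's experiments.

```python
def rank_of_true(scores, true_index):
    """Rank (1 = best) of the true relationship among candidate scores.
    Ties are resolved optimistically: only strictly higher scores count."""
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test cases."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test cases whose true relationship is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Made-up ranks for four test entity pairs (illustrative only).
ranks = [1, 2, 1, 4]
print(mean_reciprocal_rank(ranks))  # (1 + 0.5 + 1 + 0.25) / 4
print(hits_at_k(ranks, 1))
print(hits_at_k(ranks, 3))
```

Both metrics lie between 0 and 1 and are usually reported as percentages, as in the figures above.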