Joint extraction model for entity relationships using reinforcement learning in low resource scenarios
Abstract
Annotating entities and relationships is costly in joint entity-relationship extraction, and the annotations are strongly tied to specific professional fields. Improving the extraction performance of the model in low-resource scenarios is therefore crucial. In this study, a reinforcement learning-based joint extraction model was proposed for entity relationships. The model comprised two modules, entity recognition and relationship extraction, and jointly extracted entities and relationships in low-resource scenarios with improved extraction efficiency and generalization. A feature extractor converted the text into semantically richer feature representations, which were shared by the entity recognition and relationship extraction modules. Entity recognition used a conditional random field (CRF) for sequence labeling. The output entity label sequence was traversed to locate the entity boundaries, and the features of each entity were then combined into an entity embedding vector. Because low-resource scenarios offer limited labeled data and abundant unlabeled data, reinforcement learning was used to train the relationship extraction module. For labeled data, the input of the relationship extraction module consisted of sentence features, entity embeddings, and relationships; for unlabeled data, it consisted of sentence features and entity embeddings. The model was trained so that the pseudo labels generated for the unlabeled data followed the gradient direction of the labeled data, maximizing the similarity between the average gradients of the two. This increased the diversity and richness of the training data, which improved generalization and lowered the risk of overfitting. The generated pseudo labels also reduced the dependence on large amounts of labeled data and lowered annotation costs.
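The gradient-matching idea above can be illustrated with a minimal sketch. It is not the authors' implementation: a binary logistic model stands in for the relationship extraction module, cosine similarity is assumed as the similarity measure, and the names `logistic_grad`, `mean_gradient`, and `gradient_reward` are hypothetical. The reward is high when the average gradient induced by the pseudo labels points in the same direction as the average gradient on the labeled data.

```python
import math
from typing import List, Sequence

Vector = List[float]


def logistic_grad(w: Sequence[float], x: Sequence[float], y: int) -> Vector:
    """Gradient of the binary cross-entropy loss w.r.t. w for one example."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
    return [(p - y) * xi for xi in x]


def mean_gradient(w: Sequence[float], xs: Sequence[Sequence[float]],
                  ys: Sequence[int]) -> Vector:
    """Average the per-example gradients over a batch."""
    grads = [logistic_grad(w, x, y) for x, y in zip(xs, ys)]
    return [sum(col) / len(grads) for col in zip(*grads)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def gradient_reward(w, labeled_x, labeled_y, unlabeled_x, pseudo_y) -> float:
    """Assumed reward: similarity of labeled and pseudo-labeled mean gradients."""
    g_labeled = mean_gradient(w, labeled_x, labeled_y)
    g_pseudo = mean_gradient(w, unlabeled_x, pseudo_y)
    return cosine_similarity(g_labeled, g_pseudo)


# Toy data (hypothetical): two labeled and two unlabeled examples.
w = [0.0, 0.0]
labeled_x, labeled_y = [[1.0, 0.0], [0.0, 1.0]], [1, 0]
unlabeled_x = [[2.0, 0.0], [0.0, 2.0]]

good = gradient_reward(w, labeled_x, labeled_y, unlabeled_x, [1, 0])
bad = gradient_reward(w, labeled_x, labeled_y, unlabeled_x, [0, 1])
print(good > bad)
```

In a reinforcement learning setup, a reward of this kind could be fed back to whatever component proposes the pseudo labels, so that label assignments whose gradients agree with the labeled data are preferred.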
More importantly, gradient simulation also balanced the sample distribution across relationship categories, especially in datasets with imbalanced relationship categories. The effectiveness of the model was verified by comparison with mainstream low-resource relationship extraction models on the apple cultivation corpus (ATC). The results showed that the F1 score of the model was 88.71% when the proportion of labeled data reached 30%, significantly outperforming the other baselines. In addition, the model effectively extracted entity relationships from the public dataset TACRED in low-resource scenarios. The proportion of unlabeled data was then varied on the ATC and TACRED datasets: with the labeled data fixed at 10%, F1 was measured with 10%, 30%, 50%, 70%, and 90% unlabeled data. Adding unlabeled data for training improved performance, and the best F1 performance was achieved consistently across the different proportions of unlabeled data. The effectiveness of the gradient simulation module was verified through ablation experiments. Without the gradient simulation module, the relationship extraction model was essentially the same as the Self-Trained BERT model, which showed an average F1 decrease of 6.12% across the different proportions of labeled data. The improved performance of the relationship extraction module was therefore attributed to the gradient simulation module, which improved the quality of the pseudo labels. Finally, principal component analysis was used to visualize the gradient descent direction of the relationship extraction module on the labeled and pseudo-labeled data, as an indication of the quality of the pseudo-labeled data. Although the optimization direction on the pseudo-labeled data initially fluctuated greatly, with the gradient simulation module it gradually approached the ideal local minimum.
These results further demonstrated that the gradient simulation module generated high-quality pseudo labels. The proposed model can therefore effectively extract entity relationships in low-resource scenarios, with strong generalization and performance.
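The entity recognition step described in the abstract, in which the label sequence is traversed to locate entity boundaries and entity features are turned into embedding vectors, can be sketched as follows. The abstract does not specify the tagging scheme or how the features are combined, so this sketch assumes BIO tags and mean pooling; the function names and the entity types `PEST` and `CROP` are hypothetical.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def decode_bio_spans(tags: List[str]) -> List[Span]:
    """Traverse a BIO tag sequence and return the entity spans it encodes."""
    spans: List[Span] = []
    start: Optional[int] = None
    etype: Optional[str] = None
    for i, tag in enumerate(tags):
        # An I- tag of the same type simply extends the open span.
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue
        # Any other tag closes the open span, if there is one.
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        # B- opens a new span; a stray I- is leniently treated like B-.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, etype = i, tag[2:]
    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tags), etype))
    return spans


def mean_pool(features: List[List[float]], start: int, end: int) -> List[float]:
    """Average the token feature vectors in [start, end) into one embedding."""
    rows = features[start:end]
    return [sum(col) / len(rows) for col in zip(*rows)]


# Toy example (hypothetical tags and 2-dimensional token features).
tags = ["B-PEST", "I-PEST", "O", "B-CROP", "O"]
features = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 6.0], [0.0, 0.0]]

spans = decode_bio_spans(tags)
embeddings = [mean_pool(features, s, e) for s, e, _ in spans]
print(spans)
print(embeddings)
```

Mean pooling is used here only because it yields a fixed-size vector for entities of any length; concatenating boundary-token features would be an equally plausible reading of the abstract.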