Ren Yuan, Yu Hong, Yang He, Liu Jusheng, Yang Huining, Sun Zhetao, Zhang Sijia, Liu Mingjian, Sun Hua. Recognition of quantitative indicator of fishery standard using attention mechanism and the BERT+BiLSTM+CRF model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(10): 135-141. DOI: 10.11975/j.issn.1002-6819.2021.10.016

    Recognition of quantitative indicator of fishery standard using attention mechanism and the BERT+BiLSTM+CRF model

    • Abstract: Fishery information services support data analysis, feature extraction, and fishing forecasting, and they matter for raising overall production capacity and modernizing fishery management. The commonly used keyword matching does not draw on the content of fishery standards, so it cannot meet the demand for accurate service in current fishery information systems. Identifying the quantitative indicators in fishery standards has therefore become one of the most important tasks in the information service, and accurate recognition is a prerequisite for extracting them automatically. Combining the attention mechanism with the BERT+BiLSTM+CRF (Bidirectional Encoder Representations from Transformers + Bi-directional Long Short-Term Memory + Conditional Random Field) model, this study proposes a high-accuracy method for recognizing quantitative indicators in fishery standards, as an alternative to commonly used entity recognition methods. First, the quantitative indicators were divided into four entity types: indicator name, indicator value, unit, and qualifier. This division addressed the difficulty of identifying fishery standard quantitative indicator entities. Position information was found to have a significant impact on the recognition of indicator names and other entities, and vector representations were also used to improve the recognition of indicator names. Second, the BiLSTM model was used to learn the semantic features of long sequences in the quantitative indicator text of fishery standards, and the attention mechanism was integrated to counter the dilution of semantics over long sequences. Finally, the sequence tags were obtained through the CRF layer. In the tests, the model that fuses the attention mechanism with BERT+BiLSTM+CRF reached a precision of 94.51%, a recall of 96.37%, and an F1 value of 95.43%. Compared with the attention-fused BiLSTM+CRF named entity recognition model, precision, recall, and F1 value increased by 2.78, 6.73, and 4.65 percentage points, respectively. The model combines word vectors, position vectors, and sentence features for better recognition. The BERT model is pre-trained with a self-attention mechanism and uses bidirectional encoders in its Transformer layers, which gives it a better memory of text context. Compared with the BERT+BiLSTM+CRF model, precision, recall, and F1 value increased by 1.62, 0.25, and 0.97 percentage points, respectively, indicating that the attention mechanism gave greater weight to the target entities in the long short-term memory network. Weighting the features in this way lets the model identify quantitative indicators more accurately. The proposed model can be expected to identify fishery standard quantitative indicators more accurately, in particular indicator names, indicator values, units, and qualifiers. The study can provide data support for accurate information services based on standard content. The method can also offer new ideas for recognizing quantitative indicator named entities in the agricultural, medical, and biological fields.
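The three headline figures can be checked against one another. The sketch below assumes that the first reported rate (94.51%) is precision in the usual entity-recognition sense and that the F1 value is the harmonic mean of precision and recall; the abstract does not state how the scores were averaged, so this is only a consistency check, and the function name `f1_score` is illustrative rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit in, same unit out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
reported_precision = 94.51
reported_recall = 96.37
reported_f1 = 95.43

# Assumption: F1 is computed directly from the two reported rates.
computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.2f}%, reported F1 = {reported_f1:.2f}%")
```

Under these assumptions the computed value agrees with the reported F1 to two decimal places, which supports reading the first rate as precision.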