Abstract
Grant support (GS), which records the funding agencies and contract numbers behind a publication, is an important piece of information in the MEDLINE database. For funding organizations, GS plays a crucial role in tracking the outcomes of their funding. In this paper, we present GrantExtractor, a pipeline system that automatically extracts funding information from biomedical literature. GrantExtractor is a novel solution to the practical problem of GS information extraction, which involves both named entity recognition and relation extraction. Our approach integrates several modern machine learning techniques. First, a sentence classifier identifies the funding sentences in an article. Grant number and agency entities are then extracted from these sentences by a bi-directional LSTM with a CRF layer (BiLSTM-CRF), together with pattern matching. After a multi-class model removes noisy numbers, each grant number is matched with its corresponding agency. Experimental results on benchmark datasets show that GrantExtractor clearly outperformed all baseline methods. In addition, GrantExtractor won first place in Task 5C of the 2017 BioASQ challenge, achieving a Micro-recall of 0.9526 on 22,610 articles. This is about 33% higher in relative terms than 0.7174, the highest score of the "BioASQ Filtering" baseline provided by the National Library of Medicine (NLM). Moreover, GrantExtractor achieved a Micro F-measure as high as 0.90 in the task of extracting grant pairs.
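The stages named in the abstract (funding-sentence detection, entity extraction, and grant-agency pairing) can be illustrated with a minimal rule-based sketch. It is not the authors' implementation: the keyword cues, the agency list, the grant-number pattern, and the nearest-mention pairing rule are all assumptions that stand in for the trained sentence classifier, the BiLSTM-CRF tagger, and the matching step, and the multi-class noise filter is omitted entirely.

```python
import re

# Hypothetical cue phrases; stands in for the trained sentence classifier.
FUNDING_CUES = ("supported by", "funded by", "grant")

# Hypothetical agency lexicon; stands in for the BiLSTM-CRF entity tagger.
AGENCIES = ("NIH", "NSF", "Wellcome Trust")

# Illustrative pattern: a token of at least five characters drawn from
# uppercase letters, digits, '/' and '-', containing at least one digit.
GRANT_RE = re.compile(r"\b(?=[A-Z0-9/-]*\d)[A-Z0-9][A-Z0-9/-]{4,}\b")


def is_funding_sentence(sentence):
    """Return True if the sentence contains any funding cue phrase."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in FUNDING_CUES)


def find_agencies(sentence):
    """Return (start_offset, agency_name) for every lexicon hit."""
    hits = []
    for name in AGENCIES:
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", sentence):
            hits.append((m.start(), name))
    return sorted(hits)


def extract_pairs(sentence):
    """Pair each candidate grant number with the closest agency mention."""
    if not is_funding_sentence(sentence):
        return []
    agencies = find_agencies(sentence)
    if not agencies:
        return []
    pairs = []
    for m in GRANT_RE.finditer(sentence):
        # Assumed pairing rule: smallest character distance between starts.
        _, nearest = min(agencies, key=lambda hit: abs(hit[0] - m.start()))
        pairs.append((m.group(), nearest))
    return pairs
```

For example, calling `extract_pairs` on "This work was supported by NIH grant R01GM123456 and by the Wellcome Trust (098051)." pairs each number with the agency mentioned closest to it. A sentence without any funding cue is skipped before pattern matching runs, which is why stray numbers in ordinary text are not reported.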
Original language | English |
---|---|
Title of host publication | 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) |
Publisher | IEEE Xplore |
Pages | 333-340 |
Number of pages | 8 |
ISBN (Electronic) | 9781538654880 |
ISBN (Print) | 9781538654897 (Print on demand) |
Publication status | Published - 24 Jan 2019 |
Event | 2018 International Conference on Bioinformatics and Biomedicine: BIBM 2018, NH Collection Madrid Eurobuilding, Madrid, Spain. Duration: 03 Dec 2018 → 06 Dec 2018. Website: http://orienta.ugr.es/bibm2018/. Proceedings: https://ieeexplore.ieee.org/xpl/conhome/8609864/proceeding |
Conference
Conference | 2018 International Conference on Bioinformatics and Biomedicine |
---|---|
Country/Territory | Spain |
City | Madrid |
Period | 03/12/18 → 06/12/18 |