Item title:

Improving biomedical named entity recognition with syntactic information.

Authors:
Tian Y; University of Washington, Seattle, USA.
Shen W; Hunan University, Changsha, China.
Song Y; The Chinese University of Hong Kong, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen, China.
Xia F; University of Washington, Seattle, USA.
He M; Hunan University, Changsha, China.
Li K; Hunan University, Changsha, China.
Source:
BMC bioinformatics [BMC Bioinformatics] 2020 Nov 25; Vol. 21 (1), pp. 539. Date of Electronic Publication: 2020 Nov 25.
Publication type:
Journal Article
Language:
English
Imprint Name(s):
Original Publication: [London] : BioMed Central, 2000-
MeSH Terms:
Biomedical Research*
Data Mining*
Semantics*
Benchmarking ; Databases as Topic ; Deep Learning ; Statistics as Topic
Grant Information:
UDF01001809 The Chinese University of Hong Kong (Shenzhen)
Contributed Indexing:
Keywords: Key-value memory networks; Named entity recognition; Neural networks; Syntactic information; Text mining
Entry Date(s):
Date Created: 20201126 Date Completed: 20201218 Latest Revision: 20201218
Update Code:
20240105
PubMed Central ID:
PMC7687711
DOI:
10.1186/s12859-020-03834-6
PMID:
33238875
Academic journal
Background: Biomedical named entity recognition (BioNER) is an important task for understanding biomedical texts, and it is challenging because large-scale labeled training data and domain knowledge are scarce. To address this challenge, in addition to using powerful encoders (e.g., biLSTM and BioBERT), one possible method is to leverage extra knowledge that is easy to obtain. Previous studies have shown that auto-processed syntactic information can be a useful resource to improve model performance, but their approaches are limited to directly concatenating the embeddings of syntactic information to the input word embeddings. As a result, the syntactic information is used inflexibly, and inaccurate syntactic information may hurt model performance.
Results: In this paper, we propose BIOKMNER, a BioNER model that uses key-value memory networks (KVMN) to incorporate auto-processed syntactic information. We evaluate BIOKMNER on six English biomedical datasets, where our method with KVMN outperforms the strong baseline from the previous study, namely BioBERT, on all datasets. Specifically, the F1 scores of our best performing model are 85.29% on BC2GM, 77.83% on JNLPBA, 94.22% on BC5CDR-chemical, 90.08% on NCBI-disease, 89.24% on LINNAEUS, and 76.33% on Species-800, with state-of-the-art performance on four of them (i.e., BC2GM, BC5CDR-chemical, NCBI-disease, and Species-800).
Conclusion: The experimental results on six English benchmark datasets demonstrate that auto-processed syntactic information can be a useful resource for BioNER and our method with KVMN can appropriately leverage such information to improve model performance.
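Illustration (not part of the published abstract): the abstract contrasts plain concatenation of syntactic embeddings with a key-value memory read, but gives no formulas. The sketch below shows one generic way a key-value memory read can work, so that weakly matching (possibly inaccurate) syntactic entries contribute little. Everything in it is an assumption made for illustration: the function names, the choice of context-word embeddings as keys and syntactic-label embeddings as values, the dot-product scoring, and the additive combination with the encoder state. It is not the authors' implementation of BIOKMNER.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def kv_memory_read(hidden, keys, values):
    """Enrich one token's encoder state with a weighted read from memory.

    hidden: encoder state of the token (e.g. from a biLSTM or BioBERT).
    keys:   assumed embeddings of context words attached to the token.
    values: assumed embeddings of the syntactic labels of those words,
            with the same dimension as `hidden`.
    """
    if not keys:
        # No syntactic context available: fall back to the encoder state.
        return list(hidden)
    # Score each memory slot by how well its key matches the token state.
    weights = softmax([dot(hidden, key) for key in keys])
    # Weighted sum of the values; low-scoring slots contribute little.
    read = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(hidden))
    ]
    # Assumed combination: add the memory read to the encoder state.
    return [h + r for h, r in zip(hidden, read)]
```

In contrast, direct concatenation would append every syntactic embedding to the input with equal standing, whether or not it is relevant to the token. With the weighted read, a slot whose key scores far below the others receives a weight close to zero.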