-
Title:
-
Next generation community assessment of biomedical entity recognition web servers: metrics, performance, interoperability aspects of BeCalm.
-
Authors:
-
Pérez-Pérez M; Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.; The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.; SING Research Group, Galicia Sur Health Research Institute (ISS Galicia Sur), SERGAS-UVIGO, Vigo, Spain.
Pérez-Rodríguez G; Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.; The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.; SING Research Group, Galicia Sur Health Research Institute (ISS Galicia Sur), SERGAS-UVIGO, Vigo, Spain.
Blanco-Míguez A; Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.; The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.; SING Research Group, Galicia Sur Health Research Institute (ISS Galicia Sur), SERGAS-UVIGO, Vigo, Spain.; Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias (IPLA), Consejo Superior de Investigaciones Científicas (CSIC), Paseo Río Linares S/N 33300, Villaviciosa, Asturias, Spain.
Fdez-Riverola F; Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.; The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.; SING Research Group, Galicia Sur Health Research Institute (ISS Galicia Sur), SERGAS-UVIGO, Vigo, Spain.
Valencia A; Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona 29-31, 08034, Barcelona, Spain.; Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/Baldiri Reixac 10, 08028, Barcelona, Spain.; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, 08010, Barcelona, Spain.; Spanish Bioinformatics Institute INB-ISCIII ES-ELIXIR, 28029, Madrid, Spain.
Krallinger M; Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona 29-31, 08034, Barcelona, Spain.; Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/Baldiri Reixac 10, 08028, Barcelona, Spain.; Biological Text Mining Unit, Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre, C/Melchor Fernández Almagro 3, 28029, Madrid, Spain.
Lourenço A; Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.; The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.; SING Research Group, Galicia Sur Health Research Institute (ISS Galicia Sur), SERGAS-UVIGO, Vigo, Spain.; Centre of Biological Engineering (CEB), University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal.
-
Source:
-
Journal of cheminformatics [J Cheminform] 2019 Jun 24; Vol. 11 (1), pp. 42. Date of Electronic Publication: 2019 Jun 24.
-
Publication Type:
-
Journal Article
-
Language:
-
English
-
Imprint Name(s):
-
Publication: [London, United Kingdom] : BioMed Central
Original Publication: [London] : Chemistry Central Ltd. in association with BioMed Central, 2009-
-
Grant Information:
-
ED431C2018/55-GRC Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia (ES); 654021 (OpenMinTeD) H2020 European Institute of Innovation and Technology; Encomienda MINETAD-CNIO Plan for the Advancement of Language Technology; UID/BIO/04469/2013 Portuguese Foundation for Science and Technology (FCT); COMPETE 2020 (POCI-01-0145-FEDER-006684) Portuguese Foundation for Science and Technology (FCT)
-
Contributed Indexing:
-
Keywords: Annotation server; BeCalm metaserver; BioCreative; Continuous evaluation; Named entity recognition; Patent mining; REST-API; Shared task; TIPS; Text mining
-
Entry Date(s):
-
Date Created: 20190626 Latest Revision: 20201001
-
Update Code:
-
20240104
-
PubMed Central ID:
-
PMC6591930
-
DOI:
-
10.1186/s13321-019-0363-6
-
PMID:
-
31236786
-
Background: Shared tasks and community challenges are key instruments for promoting research and collaboration and for determining the state of the art of biomedical and chemical text mining technologies. Traditionally, such tasks relied on the comparison of automatically generated results against a so-called Gold Standard dataset of manually labelled textual data, without regard to the efficiency and robustness of the underlying implementations. Due to the rapid growth of unstructured data collections, including patent databases and particularly the scientific literature, there is a pressing need to generate, assess and expose robust big data text mining solutions to semantically enrich documents in real time. To address this need, a novel track called "Technical interoperability and performance of annotation servers" was launched under the umbrella of the BioCreative text mining evaluation effort. The aim of this track was to enable the continuous assessment of technical aspects of text annotation web servers, specifically of online biomedical named entity recognition systems of interest for medicinal chemistry applications.
Results: A total of 15 out of 26 registered teams successfully implemented online annotation servers. They returned predictions during a two-month period in predefined formats and were evaluated through the BeCalm evaluation platform, specifically developed for this track. The track encompassed three levels of evaluation, i.e. data format considerations, technical metrics and functional specifications. Participating annotation servers were implemented in seven different programming languages and covered 12 general entity types. The continuous evaluation of server responses accounted for testing periods of low activity and moderate to high activity, encompassing overall 4,092,502 requests from three different document provider settings. The median response time was below 3.74 s, with a median of 10 annotations/document. Most of the servers showed great reliability and stability, being able to process over 100,000 requests in a 5-day period.
Conclusions: The presented track was a novel experimental task that systematically evaluated the technical performance aspects of online entity recognition systems. It raised the interest of a significant number of participants. Future editions of the competition will address the ability to process documents in bulk as well as to annotate full-text documents.