Item title:

The effect of statistical normalization on network propagation scores.

Authors:
Picart-Armada S; B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, 08028, Spain.; Esplugues de Llobregat, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, 08950, Spain.
Thompson WK; Mental Health Center Sct. Hans, 4000 Roskilde, Denmark.; Department of Family Medicine and Public Health, University of California, San Diego, La Jolla, CA, USA.
Buil A; Mental Health Center Sct. Hans, 4000 Roskilde, Denmark.
Perera-Lluna A; B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, 08028, Spain.; Esplugues de Llobregat, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, 08950, Spain.
Source:
Bioinformatics (Oxford, England) [Bioinformatics] 2021 May 05; Vol. 37 (6), pp. 845-852.
Publication type:
Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Language:
English
Imprint Name(s):
Original Publication: Oxford : Oxford University Press, c1998-
MeSH Terms:
Computational Biology*
Protein Interaction Maps*
Diffusion ; Prospective Studies ; Proteins/genetics
Grant Information:
R01 GM104400 United States GM NIGMS NIH HHS
Substance Nomenclature:
0 (Proteins)
Entry Date(s):
Date Created: 20201018 Date Completed: 20210603 Latest Revision: 20230519
Update Code:
20240105
PubMed Central ID:
PMC8097756
DOI:
10.1093/bioinformatics/btaa896
PMID:
33070187
Scientific journal
Motivation: Network diffusion and label propagation are fundamental tools in computational biology, with applications such as gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This raises the question of the statistical properties and the presence of bias of such diffusion processes in each of their applications. In this work, we characterized some common null models behind the permutation analysis and the statistical properties of the diffusion scores. We benchmarked seven diffusion scores on three case studies: synthetic signals on a yeast interactome, simulated differential gene expression on a protein-protein interaction network and prospective gene set prediction on another interaction network. For clarity, all the datasets were based on binary labels, but we also present theoretical results for quantitative labels.
Results: Diffusion scores starting from binary labels were affected by the label codification and exhibited a problem-dependent topological bias that could be removed by statistical normalization. Parametric and non-parametric normalization addressed both points by being codification-independent and by equalizing the bias. We identified and quantified two sources of bias, mean value and variance, that yielded performance differences when normalizing the scores. We provided closed formulae for both and showed how the null covariance is related to the spectral properties of the graph. Although none of the proposed scores systematically outperformed the others, normalization was preferred when the sought positive labels were not aligned with the bias. We conclude that the decision on bias removal should be problem- and data-driven, i.e. based on a quantitative analysis of the bias and its relation to the positive entities.
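Illustrative sketch (not taken from the article or from the diffuBench code): the idea of normalizing a diffusion score by the mean and variance it would have under a label-permutation null can be shown on a toy graph. The sketch assumes a regularized Laplacian kernel K = (I + L)^-1, binary labels coded as 0/1, and a null model that places the same number of positives uniformly at random over all nodes. Under those assumptions the null moments follow from standard finite-population sampling formulae. All function names here are hypothetical.

```python
from itertools import combinations
from math import sqrt


def diffuse(adj, labels, n_iter=200):
    """Solve (I + L) f = labels by Jacobi iteration, with L = D - A."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    f = [0.0] * n
    for _ in range(n_iter):
        f = [(labels[i] + sum(adj[i][j] * f[j] for j in range(n))) / (1.0 + deg[i])
             for i in range(n)]
    return f


def kernel(adj):
    """Assumed kernel K = (I + L)^-1, built column by column from unit label vectors."""
    n = len(adj)
    cols = [diffuse(adj, [1.0 if j == c else 0.0 for j in range(n)]) for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(n)]


def null_moments(K, n_pos):
    """Closed-form null mean and variance of each raw score when n_pos
    positives are placed uniformly at random (sampling without replacement)."""
    n = len(K)
    p = n_pos / n
    means, variances = [], []
    for row in K:
        s1 = sum(row)
        s2 = sum(k * k for k in row)
        means.append(p * s1)
        variances.append(p * (1 - p) * n / (n - 1) * (s2 - s1 * s1 / n))
    return means, variances


def z_scores(adj, labels):
    """Raw diffusion scores standardized by their null mean and variance."""
    K = kernel(adj)
    raw = [sum(k * y for k, y in zip(row, labels)) for row in K]
    means, variances = null_moments(K, sum(labels))
    return [(r - m) / sqrt(v) for r, m, v in zip(raw, means, variances)]


def enumerated_moments(K, n_pos):
    """The same moments by brute force over all placements (small graphs only)."""
    n = len(K)
    samples = [[sum(row[j] for j in pos) for row in K]
               for pos in combinations(range(n), n_pos)]
    m = len(samples)
    means = [sum(s[i] for s in samples) / m for i in range(n)]
    variances = [sum((s[i] - means[i]) ** 2 for s in samples) / m for i in range(n)]
    return means, variances


# Toy path graph 0-1-2-3-4 with positives on nodes 0 and 1.
adj = [[0, 1, 0, 0, 0],
       [1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
labels = [1, 1, 0, 0, 0]
z = z_scores(adj, labels)
```

With this particular kernel every row of K sums to 1, so the null mean is the same for all nodes and only the null variance differs between them. Other kernels or null models would generally show both kinds of bias. The `enumerated_moments` helper is only a sanity check for the closed-form expressions on very small graphs.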
Availability: The code is publicly available at https://github.com/b2slab/diffuBench and the data underlying this article are available at https://github.com/b2slab/retroData.
Supplementary Information: Supplementary data are available at Bioinformatics online.
(© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
