Item title:

An introduction to new robust linear and monotonic correlation coefficients.

Authors:
Tabatabai M; Meharry Medical College, Nashville, TN, 37208, USA.
Bailey S; Meharry Medical College, Nashville, TN, 37208, USA.
Bursac Z; Department of Biostatistics, Florida International University, Miami, FL, 33199, USA.
Tabatabai H; Department of Civil and Environmental Engineering, University of Wisconsin Milwaukee, Milwaukee, WI, 53211, USA.
Wilus D; Meharry Medical College, Nashville, TN, 37208, USA.
Singh KP; Department of Epidemiology and Biostatistics, University of Texas Health Sciences Center at Tyler, Tyler, TX, 75708, USA.
Source:
BMC bioinformatics [BMC Bioinformatics] 2021 Mar 31; Vol. 22 (1), pp. 170. Date of Electronic Publication: 2021 Mar 31.
Publication type:
Journal Article
Language:
English
Imprint Name(s):
Original Publication: [London] : BioMed Central, 2000-
MeSH Terms:
Correlation of Data*
Computer Simulation ; Humans
References:
Ben-Dor A, Shamir R, Yakhini Z. Clustering gene expression patterns. J Comput Biol. 1999;6:281–97. (PMID: 10582567)
Bezuidenhout CN, Domleo RR. A demonstration of correlation graphs to human body dimensions. Sci Res Essays. 2013;9:1273–81.
Fujita A, Takahashi DY, Balardin JB, Sato JR. Correlation between graphs with an application to brain networks analysis. 2015. arXiv:1512.06830 [q-bio, stat]. Accessed 12 Jan 2020.
Iwasaki Y, Kusne AG, Takeuchi I. Comparison of dissimilarity measures for cluster analysis of X-ray diffraction data from combinatorial libraries. NPJ Comput Mater. 2017;3:4.
Jay JJ, Eblen JD, Zhang Y, Benson M, Perkins AD, Saxton AM, et al. A systematic comparison of genome-scale clustering algorithms. BMC Bioinform. 2012;13(Suppl 10):S7.
Lin W-T, Wu Y-C, Cheng A, Chao S-J, Hsu H-M. Engineering properties and correlation analysis of fiber cementitious materials. Materials. 2014;7:7423–35. (PMID: 28788256; PMCID: PMC5512644)
Neto AM, Victorino AC, Fantoni I, Zampieri DE, Ferreira JV, Lima DA. Image processing using Pearson’s correlation coefficient: applications on autonomous robotics. In: 2013 13th international conference on autonomous robot systems. Lisbon, Portugal: IEEE; 2013. p. 1–6. https://doi.org/10.1109/Robotica.2013.6623521 .
Preacher KJ, Zhang Z, Zyphur MJ. Multilevel structural equation models for assessing moderation within and across levels of analysis. Psychol Methods. 2016;21:189–205. (PMID: 26651982)
Snape P, Pszczolkowski S, Zafeiriou S, Tzimiropoulos G, Ledig C, Rueckert D. A robust similarity measure for volumetric image registration with outliers. Image Vis Comput. 2016;52:97–113.
Suzuki Y, Hino H, Kotsugi M, Ono K. Automated estimation of materials parameter from X-ray absorption and electron energy-loss spectra with similarity measures. NPJ Comput Mater. 2019;5:39.
Vlachos M, Gunopulos D, Kollios G. Robust similarity measures for mobile object trajectories. In: Proceedings. 13th international workshop on database and expert systems applications. Aix-en-Provence, France: IEEE Comput. Soc.; 2002. p. 721–6. https://doi.org/10.1109/DEXA.2002.1045983 .
Zhao W, Chellappa R, Phillips PJ, Rosenfeld A. Face recognition: a literature survey. ACM Comput Surv. 2003;35:399–458.
Yellowlees A, Bursa F, Fleetwood KJ, Charlton S, Hirst KJ, Sun R, et al. The appropriateness of robust regression in addressing outliers in an anthrax vaccine potency test. Bioscience. 2016;66:63–72.
Hardin J, Mitani A, Hicks L, VanKoten B. A robust measure of correlation between two genes on a microarray. BMC Bioinform. 2007;8:220.
Mukaka MM. Statistics corner: a guide to appropriate use of correlation coefficient in medical research. Malawi Med J. 2012;24:69–71. (PMID: 23638278; PMCID: PMC3576830)
Gentleman R, Ding B, Dudoit S, Ibrahim J. Distance measures in DNA microarray data analysis. In: Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S, editors. Bioinformatics and computational biology solutions using R and bioconductor. New York: Springer; 2005. p. 189–208. https://doi.org/10.1007/0-387-29362-0_12 .
Jaskowiak PA, Campello RJ, Costa IG. On the selection of appropriate distances for gene expression data clustering. BMC Bioinform. 2014;15:S2.
Guan J, Hsieh F, Koehl P. DCG++: a data-driven metric for geometric pattern recognition. PLoS ONE. 2019;14:e0217838. (PMID: 31170208; PMCID: PMC6553753)
Shevlyakov G, Smirnov P. Robust estimation of the correlation coefficient: an attempt of survey. Aust J Stat. 2011;40:10.
de Winter JCF, Gosling SD, Potter J. Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: a tutorial using simulations and empirical data. Psychol Methods. 2016;21:273–90. (PMID: 27213982)
Shirkhorshidi AS, Aghabozorgi S, Wah TY. A comparison study on similarity and dissimilarity measures in clustering continuous data. PLoS ONE. 2015;10:e0144059. (PMID: 26658987; PMCID: PMC4686108)
Kim J, Fessler JA. Intensity-based image registration using robust correlation coefficients. IEEE Trans Med Imaging. 2004;23:1430–44. (PMID: 15554130)
Mohammad TA, Tsai YS, Ameer S, Chen H-IH, Chiu Y-C, Chen Y. CeL-ID: cell line identification using RNA-seq data. BMC Genomics. 2019;20:81. (PMID: 30712511; PMCID: PMC6360649)
Yona G, Dirks W, Rahman S, Lin DM. Effective similarity measures for expression profiles. Bioinformatics. 2006;22:1616–22. (PMID: 16595558)
Badsha MB, Mollah MNH, Jahan N, Kurata H. Robust complementary hierarchical clustering for gene expression data analysis by β-divergence. J Biosci Bioeng. 2013;116:397–407. (PMID: 23608734)
Moore CS, Wood TJ, Beavis AW, Saunderson JR. Correlation of the clinical and physical image quality in chest radiography for average adults with a computed radiography imaging system. BJR. 2013;86:20130077. (PMID: 23568362; PMCID: PMC3922182)
Wang H, Wang Z, Li X, Gong B, Feng L, Zhou Y. A robust approach based on Weibull distribution for clustering gene expression data. Algorithms Mol Biol. 2011;6:14. (PMID: 21624141; PMCID: PMC3118357)
Ray SS, Bandyopadhyay S, Pal SK. Dynamic range-based distance measure for microarray expressions and a fast gene-ordering algorithm. IEEE Trans Syst Man Cybern B. 2007;37:742–9.
Hasan MN, Rana MM, Begum AA, Rahman M, Mollah MNH. Robust co-clustering to discover toxicogenomic biomarkers and their regulatory doses of chemical compounds using logistic probabilistic hidden variable model. Front Genet. 2018;9:516. (PMID: 30450112; PMCID: PMC6225736)
Spainhour JC, Lim HS, Yi SV, Qiu P. Correlation patterns between DNA methylation and gene expression in the cancer genome atlas. Cancer Inform. 2019;18:117693511982877.
Córdova-Palomera A, Palma-Gudiel H, Forés-Martos J, Tabarés-Seisdedos R, Fañanás L. Epigenetic outlier profiles in depression: a genome-wide DNA methylation analysis of monozygotic twins. PLoS ONE. 2018;13:e0207754. (PMID: 30458022; PMCID: PMC6245788)
Nishimura A, Tabuchi Y, Kikuchi M, Masuda R, Goto K, Iijima T. The amount of fluid given during surgery that leaks into the interstitium correlates with infused fluid volume and varies widely between patients. Anesth Anal. 2016;123:925–32.
Kim JY, Ahn HJ, Kim JK, Kim J, Lee SH, Chae HB. Morphine suppresses lung cancer cell proliferation through the interaction with opioid growth factor receptor: an in vitro and human lung tissue study. Anesth Anal. 2016;123:1429–36.
Bloch KM, Arce GR. Median correlation for the analysis of gene expression data. Signal Process. 2003;83:811–23.
Liu L, Hawkins DM, Ghosh S, Young SS. Robust singular value decomposition analysis of microarray data. Proc Natl Acad Sci USA. 2003;100:13167–72. (PMID: 14581611; PMCID: PMC263735)
Rousseeuw PJ, Hubert M. Robust statistics for outlier detection. WIREs Data Min Knowl Discov. 2011;1:73–9.
Eby W, Li T, Bae S, Singh K. TELBS robust linear regression method. OAMS. 2012:65.
Maronna R, Martin R, Yohai V, Salibián-Barrera M. Robust statistics: theory and methods (with R). 2nd ed. Wiley; 2019. https://www.wiley.com/en-us/Robust+Statistics:+Theory+and+Methods+(with+R),+2nd+Edition-p-9781119214687 . Accessed 23 Jan 2020.
Shevlyakov G, Morgenthaler S, Shurygin A. Redescending M-estimators. J Stat Plan Inference. 2008;138:2906–17.
Rousseeuw PJ, Croux C. Alternatives to the median absolute deviation. J Am Stat Assoc. 1993;88:1273–83.
Croux C, Rousseeuw PJ. Time-efficient algorithms for two highly robust estimators of scale. In: Dodge Y, Whittaker J, editors. Computational Statistics. Heidelberg: Springer; 1992. p. 411–28. https://doi.org/10.1007/978-3-662-26811-7_58 .
Rousseeuw PJ, Leroy AM. Robust regression and outlier detection. Hoboken: Wiley; 2003.
Bonett DG, Wright TA. Sample size requirements for estimating pearson, kendall and spearman correlations. Psychometrika. 2000;65:23–8.
Ruscio J. Constructing confidence intervals for Spearman’s rank correlation with ordinal data: a simulation study comparing analytic and bootstrap methods. J Mod App Stat Meth. 2008;7:416–34.
Bishara AJ, Hittner JB. Confidence intervals for correlations when data are not normal. Behav Res. 2017;49:294–309.
Raymaekers J, Rousseeuw PJ. Fast robust correlation for high-dimensional data. Technometrics. 2019;2019:1–15.
Rousseeuw PJ, Driessen KV. A fast algorithm for the minimum covariance determinant estimator. Technometrics. 1999;41:212–23.
Barak B, Zhang Z, Liu Y, Nir A, Trangle SS, Ennis M, et al. Neuronal deletion of Gtf2i, associated with Williams syndrome, causes behavioral and myelin alterations rescuable by a remyelinating drug. Nat Neurosci. 2019;22:700–8. (PMID: 31011227)
Lalli MA, Jang J, Park J-HC, Wang Y, Guzman E, Zhou H, et al. Haploinsufficiency of BAZ1B contributes to Williams syndrome through transcriptional dysregulation of neurodevelopmental pathways. Hum Mol Genet. 2016;25:1294–306. (PMID: 26755828)
De Cegli R, Iacobacci S, Fedele A, Ballabio A, di Bernardo D. A transcriptomic study of Williams–Beuren syndrome associated genes in mouse embryonic stem cells. Sci Data. 2019;6:262. (PMID: 31695049; PMCID: PMC6834640)
de Torrenté L, Zimmerman S, Suzuki M, Christopeit M, Greally JM, Mar JC. The shape of gene expression distributions matter: how incorporating distribution shape improves the interpretation of cancer transcriptomic data. BMC Bioinform. 2020;21:562.
Tsukumo Y, Tsukahara S, Furuno A, Iemura S, Natsume T, Tomida A. TBL2 is a novel PERK-binding protein that modulates stress-signaling and cell survival during endoplasmic reticulum stress. PLoS ONE. 2014;9:e112761. (PMID: 25393282; PMCID: PMC4231078)
Fisch GS. Genetics and genomics of neurobehavioral disorders. Totowa: Humana Press; 2003.
TBL2 transducin beta like 2 [ Homo sapiens (human) ]. National Center for Biotechnology Information; 2020. https://www.ncbi.nlm.nih.gov/gene/26608?_ga=2.241965378.1379159307.1606244325-79102781.1606244325#bibliography .
Meng X, Lu X, Li Z, Green ED, Massa H, Trask BJ, et al. Complete physical map of the common deletion region in Williams syndrome and identification and characterization of three novel genes. Hum Genet. 1998;103:590–9. (PMID: 9860302)
Capossela S, Muzio L, Bertolo A, Bianchi V, Dati G, Chaabane L, et al. Growth defects and impaired cognitive-behavioral abilities in mice with knockout for Eif4h, a gene located in the mouse homolog of the Williams–Beuren syndrome critical region. Am J Pathol. 2012;180:1121–35. (PMID: 22234171)
Vandeweyer G, Van der Aa N, Reyniers E, Kooy RF. The contribution of CLIP2 haploinsufficiency to the clinical manifestations of the Williams–Beuren syndrome. Am J Hum Genet. 2012;90:1071–8. (PMID: 22608712; PMCID: PMC3370266)
Tsukumo Y, Tsukahara S, Furuno A, Iemura S, Natsume T, Tomida A. The endoplasmic reticulum-localized protein TBL2 interacts with the 60S ribosomal subunit. Biochem Biophys Res Commun. 2015;462:383–8. (PMID: 25976671)
Tsukumo Y, Tsukahara S, Furuno A, Iemura S, Natsume T, Tomida A. TBL2 associates with ATF4 mRNA via its WD40 domain and regulates its translation during ER stress: TBL2 regulates translation of ATF4 during ER stress. J Cell Biochem. 2016;117:500–9. (PMID: 26239904)
Pérez Jurado LA, Wang Y-K, Francke U, Cruces J. TBL2, a novel transducin family member in the WBS deletion: characterization of the complete sequence, genomic structure, transcriptional variants and the mouse ortholog. Cytogenet Genome Res. 1999;86:277–84.
Talwar S, Munson PJ, Barb J, Fiuza C, Cintron AP, Logun C, et al. Gene expression profiles of peripheral blood leukocytes after endotoxin challenge in humans. Physiol Genomics. 2006;25:203–15. (PMID: 16403844)
Grant Information:
G12 MD007586 United States MD NIMHD NIH HHS; U54 MD007586 United States MD NIMHD NIH HHS; MD007586 United States MD NIMHD NIH HHS
Contributed Indexing:
Keywords: Dissimilarity measures; Gene expression; Median correlation; Minimum covariance determinant correlation; Pearson correlation; Quadrant correlation; Spearman correlation; Williams syndrome
Entry Date(s):
Date Created: 20210401 Date Completed: 20210412 Latest Revision: 20210731
Update Code:
20240104
PubMed Central ID:
PMC8011137
DOI:
10.1186/s12859-021-04098-4
PMID:
33789571
Scientific journal
Background: The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al., Robust statistics, 2019). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests are used to identify the genes most significantly associated with Williams Syndrome (WS).
Results: Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05).
Conclusions: Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE, but the TW, TWR, T, S, and P correlations perform comparably. The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available to perform all necessary computations for the proposed methods.
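Illustration (not part of the catalog record): the abstract compares estimators by bias and RMSE under simulation. The sketch below shows what such a comparison can look like for three of the classical measures only (Pearson, Spearman, Quadrant). It does not implement T, TW, or TWR, whose definitions are not given in this record. The contamination scheme, the sample size, the true correlation of 0.7, and all function names are assumptions made for illustration, and the sine transform applied to the quadrant statistic is the usual consistency correction under bivariate normality, which the article may or may not use.

```python
import numpy as np


def pearson(x, y):
    """Classical product-moment correlation."""
    return float(np.corrcoef(x, y)[0, 1])


def ranks(v):
    """Ordinal ranks 1..n; ties are not averaged (the simulated data are continuous)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def quadrant(x, y):
    """Quadrant correlation: mean sign agreement about the medians.

    The sin(pi*q/2) transform is an assumption of this sketch; it makes the
    estimator consistent for rho when the data are bivariate normal.
    """
    q = np.mean(np.sign(x - np.median(x)) * np.sign(y - np.median(y)))
    return float(np.sin(np.pi * q / 2.0))


def simulate(rho, n, n_outliers, rng):
    """Draw n bivariate normal points, then overwrite some with gross outliers.

    The outlier cluster near (6, -6) is a hypothetical contamination scheme
    chosen to oppose a positive trend.
    """
    cov = [[1.0, rho], [rho, 1.0]]
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    if n_outliers > 0:
        xy[:n_outliers] = rng.normal(loc=[6.0, -6.0], scale=0.5, size=(n_outliers, 2))
    return xy[:, 0], xy[:, 1]


def bias_rmse(estimator, rho, n=50, n_outliers=0, reps=500, seed=0):
    """Monte Carlo bias and root mean square error of an estimator of rho."""
    rng = np.random.default_rng(seed)
    est = np.array([estimator(*simulate(rho, n, n_outliers, rng)) for _ in range(reps)])
    bias = float(np.mean(est) - rho)
    rmse = float(np.sqrt(np.mean((est - rho) ** 2)))
    return bias, rmse


if __name__ == "__main__":
    for name, f in [("Pearson", pearson), ("Spearman", spearman), ("Quadrant", quadrant)]:
        for k in (0, 5):
            b, r = bias_rmse(f, rho=0.7, n_outliers=k)
            print(f"{name:9s} outliers={k}: bias={b:+.3f} rmse={r:.3f}")
```

Under this particular contamination, Pearson is expected to degrade far more than the sign-based and rank-based measures, which is the behavior that motivates robust alternatives. The numbers it prints say nothing about the article's own simulation results.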
Erratum in: BMC Bioinformatics. 2021 Jun 15;22(1):328. (PMID: 34130626)