Item title:

Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent.

Authors:
Klosa J; Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.
Simon N; Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA.
Westermark PO; Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.
Liebscher V; Institute of Mathematics and Computer Science, University of Greifswald, 17489, Greifswald, Germany.
Wittenburg D; Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.
Source:
BMC bioinformatics [BMC Bioinformatics] 2020 Sep 15; Vol. 21 (1), pp. 407. Date of Electronic Publication: 2020 Sep 15.
Publication type:
Journal Article
Language:
English
Imprint Name(s):
Original Publication: [London] : BioMed Central, 2000-
MeSH Terms:
Linear Models*
Machine Learning/*standards
Algorithms ; Humans
References:
Cell Metab. 2017 Apr 4;25(4):954-960.e6. (PMID: 28380383)
Comput Math Methods Med. 2017;2017:7691937. (PMID: 28546826)
Genome Biol. 2019 Nov 25;20(1):249. (PMID: 31767039)
Front Genet. 2020 Mar 03;11:155. (PMID: 32194631)
Grant Information:
WI 4450/2-1 German Research Foundation (DFG)
Contributed Indexing:
Keywords: High-dimensional data; Machine learning; Optimization; R package
Entry Date(s):
Date Created: 20200916 Date Completed: 20201019 Latest Revision: 20201019
Update Code:
20240104
PubMed Central ID:
PMC7493359
DOI:
10.1186/s12859-020-03725-w
PMID:
32933477
Scientific journal
Background: Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
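The combination described above (proximal gradient descent, backtracking line search, and warm starts along a penalty grid) can be illustrated for the plain lasso with a minimal Python sketch. This is not the seagull implementation or its API: the function names (`soft_threshold`, `lasso_prox_grad`, `lasso_path`), the 1/(2n) loss scaling, the logarithmic penalty grid and all default parameters are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_prox_grad(X, y, lam, beta0, step=1.0, shrink=0.5,
                    max_iter=1000, tol=1e-8, max_backtracks=50):
    """Minimize 1/(2n) * ||y - X beta||^2 + lam * ||beta||_1 (hypothetical helper)."""
    n = X.shape[0]
    beta = beta0.copy()
    for _ in range(max_iter):
        resid = X @ beta - y
        loss = 0.5 * (resid @ resid) / n
        grad = X.T @ resid / n
        t = step
        # Backtracking line search: shrink t until the quadratic upper
        # bound on the smooth part of the objective holds.
        for _ in range(max_backtracks):
            cand = soft_threshold(beta - t * grad, t * lam)
            diff = cand - beta
            r_new = X @ cand - y
            loss_new = 0.5 * (r_new @ r_new) / n
            if loss_new <= loss + grad @ diff + (diff @ diff) / (2.0 * t):
                break
            t *= shrink
        beta = cand
        if np.max(np.abs(diff)) < tol:
            break
    return beta


def lasso_path(X, y, n_lambda=20, ratio=1e-3):
    """Regularization path on a decreasing grid, using warm starts."""
    n, p = X.shape
    lam_max = np.max(np.abs(X.T @ y)) / n  # smallest penalty with an all-zero solution
    lambdas = lam_max * np.logspace(0.0, np.log10(ratio), n_lambda)
    beta = np.zeros(p)
    path = []
    for lam in lambdas:
        # Warm start: the previous solution initializes the next fit.
        beta = lasso_prox_grad(X, y, lam, beta)
        path.append(beta.copy())
    return lambdas, np.array(path)
```

Starting each fit from the previous solution usually means only a few iterations are needed per grid point, which is the motivation for warm starts in a grid search over the penalty parameter.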
Results: Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R² > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.
Conclusions: The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
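To make the relation between these lasso variants concrete, the following Python sketch shows one common form of the sparse-group lasso proximal step: elementwise soft-thresholding followed by group-wise shrinkage. It is an illustration under stated assumptions, not seagull's code or interface. The function name `prox_sparse_group_lasso`, the mixing parameter `alpha`, the per-feature weights `w_feat`, the optional dictionary `w_group`, and the default group weight (square root of the group size) are all hypothetical choices.

```python
import numpy as np


def prox_sparse_group_lasso(z, groups, t, lam, alpha, w_feat=None, w_group=None):
    """Proximal step for the penalty (hypothetical parametrization)
        t * lam * (alpha * sum_j w_j |b_j| + (1 - alpha) * sum_g w_g ||b_g||_2).
    alpha = 1 gives the (weighted) lasso, alpha = 0 the group lasso.
    """
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)
    if w_feat is None:
        w_feat = np.ones_like(z)

    # Step 1: weighted elementwise soft-thresholding (lasso part).
    out = np.sign(z) * np.maximum(np.abs(z) - t * alpha * lam * w_feat, 0.0)

    # Step 2: shrink each group toward zero as a block (group lasso part).
    for g in np.unique(groups):
        idx = groups == g
        # Assumed default group weight: sqrt(group size).
        wg = np.sqrt(idx.sum()) if w_group is None else w_group[g]
        norm = np.linalg.norm(out[idx])
        thresh = t * (1.0 - alpha) * lam * wg
        scale = max(0.0, 1.0 - thresh / norm) if norm > 0 else 0.0
        out[idx] *= scale
    return out
```

In this parametrization, setting `alpha = 1` and giving different values of `w_feat` to different blocks of features yields a lasso with block-specific penalty factors, which is the idea behind penalty-factor approaches such as IPF-lasso.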