diff --git a/README.md b/README.md
index 9d71f590..bc348b6b 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
 [![R-CMD-check](https://github.com/PLN-team/PLNmodels/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/PLN-team/PLNmodels/actions/workflows/R-CMD-check.yaml)
 [![Coverage
 status](https://codecov.io/gh/pln-team/PLNmodels/branch/master/graph/badge.svg)](https://codecov.io/github/pln-team/PLNmodels?branch=master)
-[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/PLNmodels.png)](https://cran.r-project.org/package=PLNmodels)
+[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/PLNmodels)](https://cran.r-project.org/package=PLNmodels)
 [![Lifecycle:
 stable](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
 [![](https://img.shields.io/github/last-commit/pln-team/PLNmodels.svg)](https://github.com/pln-team/PLNmodels/commits/master)
@@ -18,8 +18,9 @@ stable](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecyc
 > of multivariate problems when count data are at play. This package
 > implements efficient variational algorithms to fit such models,
 > accompanied with a set of functions for visualization and diagnostic.
-> See [this deck of slides](https://pln-team.github.io/slideshow/slides)
-> for a comprehensive introduction.
+> See [all the dedicated
+> vignettes](https://pln-team.github.io/PLNmodels/articles/) for a
+> comprehensive introduction.
 
 **PLNmodels** covers the following models, all built around the
 multivariate Poisson-lognormal distribution and sharing a common
@@ -40,7 +41,7 @@ experimental torch backend):
   of PLN models.
 - **ZIPLN**[^8]: a zero-inflated extension of PLN for data with excess
   zeros, with the same family of covariance structures and an optional
-  sparse (`ZIPLNnetwork`) variant.
+  sparse (`ZIPLNnetwork`[^9]) variant.
 
 ## Installation
 
@@ -56,7 +57,7 @@ remotes::install_github("pln-team/PLNmodels@tag_number") # a specific tagged re
 
 ## Illustration
 
-We illustrate the main models on the `barents` data set[^9]: the
+We illustrate the main models on the `barents` data set[^10]: the
 abundance of 30 fish species observed in 89 sites in the Barents sea,
 along with depth, temperature and geographic coordinates for each site.
 
@@ -272,6 +273,12 @@ table(cluster = myMixture$memberships, zone = barents$zone)
     Statistics and Computing, 35, 2025.
     [doi:10.1007/s11222-025-10729-0](https://doi.org/10.1007/s11222-025-10729-0)
 
-[^9]: Fossheim, M., Nilssen, E. M. and Aschan, M. Fish assemblages in
+[^9]: Tous, J., Chiquet, J., Deacon, A. E., Fontrodona-Eslava, A.,
+    Fraser, D. F. and Magurran, A. E. A JSDM with zero-inflation to
+    improve inference of association networks from count community data
+    with structural zeros. bioRxiv preprint, 2025.
+    [doi:10.1101/2025.07.24.666553](https://doi.org/10.1101/2025.07.24.666553)
+
+[^10]: Fossheim, M., Nilssen, E. M. and Aschan, M. Fish assemblages in
     the Barents Sea. Marine Biology Research, 2(4), 2006.
     [doi:10.1080/17451000600815698](https://doi.org/10.1080/17451000600815698)
diff --git a/README.qmd b/README.qmd
index cf68c0a5..812156f4 100644
--- a/README.qmd
+++ b/README.qmd
@@ -1,12 +1,14 @@
 ---
 title: "PLNmodels: Poisson lognormal models for multivariate count data"
-format: gfm
+format:
+  gfm:
+    default-image-extension: ""
 ---
 
 
 [![R-CMD-check](https://github.com/PLN-team/PLNmodels/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/PLN-team/PLNmodels/actions/workflows/R-CMD-check.yaml)
 [![Coverage status](https://codecov.io/gh/pln-team/PLNmodels/branch/master/graph/badge.svg)](https://codecov.io/github/pln-team/PLNmodels?branch=master)
-[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/PLNmodels)](https://cran.r-project.org/package=PLNmodels)
+[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/PLNmodels)](https://cran.r-project.org/package=PLNmodels)
 [![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
 [![](https://img.shields.io/github/last-commit/pln-team/PLNmodels.svg)](https://github.com/pln-team/PLNmodels/commits/master)
 
@@ -23,7 +25,7 @@ knitr::opts_chunk$set(
 
 ## Description
 
-> The Poisson lognormal model and variants[^1] can be used for a variety of multivariate problems when count data are at play. This package implements efficient variational algorithms to fit such models, accompanied with a set of functions for visualization and diagnostic. See [this deck of slides](https://pln-team.github.io/slideshow/slides) for a comprehensive introduction.
+> The Poisson lognormal model and variants[^1] can be used for a variety of multivariate problems when count data are at play. This package implements efficient variational algorithms to fit such models, accompanied with a set of functions for visualization and diagnostic. See [all the dedicated vignettes](https://pln-team.github.io/PLNmodels/articles/) for a comprehensive introduction.
 
 **PLNmodels** covers the following models, all built around the multivariate Poisson-lognormal distribution and sharing a common formula-based interface (covariates, offsets, weights) and a choice of optimization backends (a fast built-in Newton solver, NLOPT, and an experimental torch backend):
 
@@ -32,7 +34,7 @@ knitr::opts_chunk$set(
 - **PLNLDA**: Poisson lognormal discriminant analysis[^4] for the supervised classification of count data.
 - **PLNnetwork**[^5]: sparse inverse-covariance (network) inference via a graphical-lasso-like penalty[^6].
 - **PLNmixture**: model-based clustering[^7] of count data via a mixture of PLN models.
-- **ZIPLN**[^8]: a zero-inflated extension of PLN for data with excess zeros, with the same family of covariance structures and an optional sparse (`ZIPLNnetwork`) variant.
+- **ZIPLN**[^8]: a zero-inflated extension of PLN for data with excess zeros, with the same family of covariance structures and an optional sparse (`ZIPLNnetwork`[^9]) variant.
 
 [^1]: J. Chiquet, M. Mariadassou and S. Robin: The Poisson-lognormal model as a versatile framework for the joint analysis of species abundances, Frontiers in Ecology and Evolution, 2021. [doi:10.3389/fevo.2021.588292](https://www.frontiersin.org/articles/10.3389/fevo.2021.588292/full)
 [^2]: Aitchison, J. and Ho, C. H. The multivariate Poisson-log normal distribution. Biometrika, 76(4), 1989, 643–653.
@@ -42,6 +44,7 @@ knitr::opts_chunk$set(
 [^6]: Friedman, J., Hastie, T. and Tibshirani, R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 2008.
 [^7]: Fraley, C. and Raftery, A. E. MCLUST: Software for model-based cluster analysis. Journal of Classification, 16(2), 1999.
 [^8]: Batardière, B., Chiquet, J., Gindraud, F. and Mariadassou, M. Zero-inflation in the multivariate Poisson lognormal family. Statistics and Computing, 35, 2025. [doi:10.1007/s11222-025-10729-0](https://doi.org/10.1007/s11222-025-10729-0)
+[^9]: Tous, J., Chiquet, J., Deacon, A. E., Fontrodona-Eslava, A., Fraser, D. F. and Magurran, A. E. A JSDM with zero-inflation to improve inference of association networks from count community data with structural zeros. bioRxiv preprint, 2025. [doi:10.1101/2025.07.24.666553](https://doi.org/10.1101/2025.07.24.666553)
 
 ## Installation
 
@@ -55,9 +58,9 @@ remotes::install_github("pln-team/PLNmodels@tag_number") # a specific tagged re
 
 ## Illustration
 
-We illustrate the main models on the `barents` data set[^9]: the abundance of 30 fish species observed in 89 sites in the Barents sea, along with depth, temperature and geographic coordinates for each site.
+We illustrate the main models on the `barents` data set[^10]: the abundance of 30 fish species observed in 89 sites in the Barents sea, along with depth, temperature and geographic coordinates for each site.
 
-[^9]: Fossheim, M., Nilssen, E. M. and Aschan, M. Fish assemblages in the Barents Sea. Marine Biology Research, 2(4), 2006. [doi:10.1080/17451000600815698](https://doi.org/10.1080/17451000600815698)
+[^10]: Fossheim, M., Nilssen, E. M. and Aschan, M. Fish assemblages in the Barents Sea. Marine Biology Research, 2(4), 2006. [doi:10.1080/17451000600815698](https://doi.org/10.1080/17451000600815698)
 
 ```{r load}
 library(PLNmodels)
diff --git a/vignettes/ZIPLN.Rmd b/vignettes/ZIPLN.Rmd
index c8322382..2c6aeddf 100644
--- a/vignettes/ZIPLN.Rmd
+++ b/vignettes/ZIPLN.Rmd
@@ -50,7 +50,7 @@ mean(microcosm$Abundance == 0)
 
 ### Mathematical background
 
-The zero-inflated PLN model (ZIPLN) combines the Poisson lognormal model [@AiH89] -- see [the PLN vignette](PLN.html) -- with a zero-inflation mechanism: each count $Y_{ij}$ is either a structural zero (with probability $\pi_{ij}$) or drawn from the usual PLN generative process:
+The zero-inflated PLN model (ZIPLN) [@ZIPLN] combines the Poisson lognormal model [@AiH89] -- see [the PLN vignette](PLN.html) -- with a zero-inflation mechanism: each count $Y_{ij}$ is either a structural zero (with probability $\pi_{ij}$) or drawn from the usual PLN generative process:
 \begin{equation}
   \begin{array}{rcl}
   \text{latent space } & \mathbf{Z}_i \sim \mathcal{N}\left({\boldsymbol\mu},\boldsymbol\Sigma\right) & \\
@@ -66,7 +66,7 @@ Just like PLN, ${\boldsymbol\mu}$ generalizes to $\mathbf{o}_i + \mathbf{x}_i^\t
 - `"col"`: one $\pi_j$ per species.
 - covariates: $\text{logit}(\pi_{ij}) = \mathbf{x}_{0,i}^\top\mathbf{B}_{0,j}$, specified with the formula syntax `Y ~ PLN effect | ZI effect` (see below).
 
-`ZIPLNnetwork` further adds a sparsity penalty on $\boldsymbol\Omega = \boldsymbol\Sigma^{-1}$, exactly as `PLNnetwork` does for PLN (see [the PLNnetwork vignette](PLNnetwork.html) and @PLNnetwork), so that both the excess of zeros and the residual dependency structure between taxa are accounted for.
+`ZIPLNnetwork` further adds a sparsity penalty on $\boldsymbol\Omega = \boldsymbol\Sigma^{-1}$, exactly as `PLNnetwork` does for PLN (see [the PLNnetwork vignette](PLNnetwork.html) and @PLNnetwork), so that both the excess of zeros and the residual dependency structure between taxa are accounted for. See @ZIPLNnetwork for an application to species association networks from count data with structural zeros.
 
 ## Analysis of microcosm with ZIPLN
 
diff --git a/vignettes/article/PLNreferences.bib b/vignettes/article/PLNreferences.bib
index a4bd9069..c5c7cc2c 100644
--- a/vignettes/article/PLNreferences.bib
+++ b/vignettes/article/PLNreferences.bib
@@ -44,6 +44,23 @@ @InProceedings{PLNnetwork
   url = {http://proceedings.mlr.press/v97/chiquet19a.html},
 }
 
+@Article{ZIPLN,
+  author = {Batardière, Bastien and Chiquet, Julien and Gindraud, François and Mariadassou, Mahendra},
+  title = {Zero-inflation in the multivariate Poisson lognormal family},
+  journal = {Statistics and Computing},
+  year = {2025},
+  volume = {35},
+  doi = {10.1007/s11222-025-10729-0},
+}
+
+@Unpublished{ZIPLNnetwork,
+  author = {Tous, Jeanne and Chiquet, Julien and Deacon, Amy E. and Fontrodona-Eslava, Ada and Fraser, Douglas F. and Magurran, Anne E.},
+  title = {A JSDM with zero-inflation to improve inference of association networks from count community data with structural zeros},
+  year = {2025},
+  note = {bioRxiv preprint},
+  doi = {10.1101/2025.07.24.666553},
+}
+
 @inproceedings{trichoptera,
   title={Influence des facteurs météorologiques sur les résultats de piégeage lumineux},
   author={Usseglio-Polatera, P. and Auda, Y.},