LIVI is a probabilistic model for single-cell RNA-seq data collected from a large population of individuals. At its core, LIVI builds on variational autoencoders (VAEs), employing structured linear decoders to decompose observed variation in single-cell expression into cell-state variation, donor-driven variation and their interaction. The resulting model has properties that resemble classical factor analysis, where the decoder is a factor loadings matrix instead of a neural network with non-linear activations.
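To make the idea of a structured linear decoder concrete, here is a minimal NumPy sketch. It is not LIVI's implementation: the additive form, the pairwise-product interaction term, and all names (`W_cell`, `W_donor`, `W_inter`, the toy sizes) are assumptions chosen for illustration. The preprint describes the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, chosen only for illustration.
n_cells, n_donors, n_genes = 6, 3, 5
k_cell, k_donor = 2, 2

# Latent variables (random here; a trained encoder would infer them).
z_cell = rng.normal(size=(n_cells, k_cell))     # per-cell cell-state factors
z_donor = rng.normal(size=(n_donors, k_donor))  # per-donor factors
donor_of_cell = np.array([0, 0, 1, 1, 2, 2])    # donor index of each cell

# Linear decoders are plain loadings matrices (factors x genes),
# with no non-linear activation applied afterwards.
W_cell = rng.normal(size=(k_cell, n_genes))
W_donor = rng.normal(size=(k_donor, n_genes))
W_inter = rng.normal(size=(k_cell * k_donor, n_genes))

# Broadcast donor factors to the cells of each donor.
d = z_donor[donor_of_cell]  # shape (n_cells, k_donor)

# ASSUMPTION: the interaction is modelled as all pairwise products of
# cell-state and donor factors. This is an illustrative choice only.
inter = np.einsum("ik,il->ikl", z_cell, d).reshape(n_cells, -1)

cell_term = z_cell @ W_cell    # cell-state variation
donor_term = d @ W_donor       # donor-driven variation
inter_term = inter @ W_inter   # donor x cell-state interaction
decoded = cell_term + donor_term + inter_term
```

Because every term is a latent matrix times a loadings matrix, each gene's loading on each factor can be read directly from the corresponding row of `W_cell`, `W_donor` or `W_inter`, as in factor analysis.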
Once trained, LIVI enables efficient donor-level association testing, while retaining single-cell resolution and interpretation. Because donor latent factors are inferred without information on specific donor-level characteristics, such as SNP genotypes, they can be used as quantitative phenotypes to test for genetic effects without the risk of circularity. Following association testing at the donor level, the discovered effects can be projected back onto single cells via LIVI's latent donor-cell-state interaction model.
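As a rough illustration of treating a donor factor as a quantitative phenotype, the sketch below tests one simulated factor against one simulated SNP dosage vector. It deliberately swaps in a plain permutation test on the Pearson correlation, with no covariates or kinship, so it is not the LMM that LIVI's analysis script runs. The function name `permutation_pvalue` and all simulated data are hypothetical.

```python
import numpy as np


def permutation_pvalue(phenotype, dosage, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Pearson correlation
    between a donor-level phenotype and SNP dosages (0/1/2)."""
    rng = np.random.default_rng(seed)
    phenotype = np.asarray(phenotype, dtype=float)
    dosage = np.asarray(dosage, dtype=float)
    observed = abs(np.corrcoef(phenotype, dosage)[0, 1])
    hits = 0
    for _ in range(n_perm):
        # Shuffling donors breaks any real genotype-phenotype link.
        permuted = rng.permutation(phenotype)
        if abs(np.corrcoef(permuted, dosage)[0, 1]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Simulated data: one value per donor (hypothetical, for illustration).
rng = np.random.default_rng(1)
n_donors = 60
dosage = rng.integers(0, 3, size=n_donors)
factor_linked = 0.8 * dosage + rng.normal(scale=0.5, size=n_donors)
factor_null = rng.normal(size=n_donors)

p_linked = permutation_pvalue(factor_linked, dosage)
p_null = permutation_pvalue(factor_null, dosage)
```

The test operates on one value per donor rather than one per cell, which is what keeps donor-level testing cheap compared with testing every cell.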
Check out our preprint for more details on the model and analyses: Vagiaki et al., 2026
We are working on more comprehensive documentation. In the meantime, if you need assistance using our tool beyond this Quick start guide, feel free to reach out at danai.vagiaki@embl.de
Install dependencies
# clone project
git clone https://github.com/PMBio/LIVI
cd LIVI
# [OPTIONAL] create conda environment
conda create -n LIVIenv python=3.11
conda activate LIVIenv
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
Train model with chosen experiment configuration from configs/experiment/
python src/train.py experiment=experiment_name.yaml
Train model on CPU/GPU
# train on CPU
python src/train.py trainer=cpu
# train on GPU
python src/train.py trainer=gpu
You can override any parameter from the command line like this:
python src/train.py trainer.max_epochs=100 datamodule.batch_size=528
The following performs inference on the gene expression data stored in --adata, using the "best" model checkpoint stored under --model_run_dir. Subsequently, it runs association testing between the inferred donor factors and the SNP genotypes in --genotype_matrix (prefix of the .bed, .bim, .fam PLINK files), while accounting for covariates (e.g. expression PCs) specified under --covariates and population structure specified under --kinship, using a linear mixed model (LMM). Output files are saved under -od.
For a full list of options please run python src/analysis/livi_analysis.py --h.
python src/analysis/livi_analysis.py \
--model_run_dir /path/to/model/checkpoints/ \
--adata /path/to/adata.h5ad \
--celltype_column CELLTYPE_COLUMN \
--individual_column INDIVIDUAL_COLUMN \
--covariates /path/to/association/testing/covariate_file.tsv \
--fdr_threshold FDR \
--genotype_matrix /path/to/PLINK/genotype/matrix --plink \
--method LMM \
--kinship /path/to/Kinship_matrix.tsv \
-od /path/to/output/directory
Examples of downstream interpretation and plotting of association testing results can be found in https://github.com/danaivagiaki/LIVI_analyses
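The --fdr_threshold option in the command above implies a multiple-testing correction over the tested factor-SNP pairs. The sketch below shows the Benjamini-Hochberg procedure as one common way to do this. Whether LIVI uses this exact procedure is an assumption; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by m / rank.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q


# Hypothetical p-values for five factor-SNP pairs.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
qvals = benjamini_hochberg(pvals)
significant = qvals <= 0.05  # pairs passing a 5% FDR threshold
```

With these inputs, only the first two pairs pass the 5% threshold: the third and fourth have raw p-values below 0.05 but adjusted values just above it.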
If you use LIVI in your research, please cite:
@article{vagiaki2026livi,
title = {Mapping trans-eQTLs at single-cell resolution using Latent Interaction Variational Inference},
author = {Vagiaki, Danai and Heinen, Tobias and Saraswat, Manu and Clarke, Brian and Stegle, Oliver},
journal = {bioRxiv},
year = {2026},
doi = {10.64898/2026.02.04.703363},
URL = {https://www.biorxiv.org/content/early/2026/02/06/2026.02.04.703363},
}
This project builds on the Lightning-Hydra-Template.