monaLisa was inspired by her father Homer
to look for enriched motifs in sets (bins) of genomic regions, compared to all other
regions (“binned motif enrichment analysis”).
It uses known motifs representing transcription factor binding preferences,
for example from the JASPAR2020 Bioconductor package. The regions are, for
example, promoters or accessible regions, which are grouped into bins according
to a numerical value assigned to each region, such as the change in expression
or accessibility. The goal of the analysis is to identify transcription
factors that are associated with that numerical value and are thus candidate
drivers of the underlying biological process.
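The idea behind binned motif enrichment can be illustrated with a small base R sketch. It is not monaLisa's implementation: the data are simulated, the names (value, has_motif, n_bins) are made up for this illustration, and a one-sided Fisher's exact test of each bin against all other regions is just one reasonable choice of test.

```r
set.seed(1)

# Simulated toy data (not real measurements): one numerical value per region
# and a flag saying whether a hypothetical motif occurs in that region.
n_regions <- 1000
value <- rnorm(n_regions)
# Make motif occurrences more likely in regions with high values
has_motif <- runif(n_regions) < plogis(-1 + 1.5 * value)

# Group regions into bins holding (nearly) equal numbers of regions
n_bins <- 5
breaks <- quantile(value, probs = seq(0, 1, length.out = n_bins + 1))
bin <- cut(value, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# For each bin, compare the motif frequency inside the bin
# to the motif frequency in all other regions
res <- do.call(rbind, lapply(seq_len(n_bins), function(b) {
  in_bin <- bin == b
  tab <- table(factor(in_bin, levels = c(TRUE, FALSE)),
               factor(has_motif, levels = c(TRUE, FALSE)))
  # One-sided test: is the motif over-represented in this bin?
  ft <- fisher.test(tab, alternative = "greater")
  data.frame(bin = b,
             frac_in_bin = mean(has_motif[in_bin]),
             frac_other = mean(has_motif[!in_bin]),
             p_value = ft$p.value)
}))

# Correct for testing several bins
res$p_adj <- p.adjust(res$p_value, method = "BH")
print(res)
```

Because the simulation makes the motif more frequent in regions with high values, the top bin should show the strongest enrichment.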
In addition to the “binned motif enrichment analysis”, monaLisa can also
address the above question with stability selection (a form of regularized linear
regression that keeps motifs selected consistently across subsamples of the data),
or be used to look for motif matches in sequences.
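Looking for motif matches amounts to sliding a position weight matrix along a sequence and reporting the windows whose score reaches a threshold. The following base R sketch shows the principle on one strand only. The toy motif, the sequence, the helper score_windows and the threshold min_score are all assumptions made for this illustration and are not part of monaLisa.

```r
# Hypothetical toy motif as a position probability matrix:
# rows are bases, columns are motif positions (consensus "AGTA")
ppm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.1, 0.7, 0.1,
                0.1, 0.1, 0.1, 0.7,
                0.7, 0.1, 0.1, 0.1),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Log-odds position weight matrix relative to a uniform background (0.25 per base)
pwm <- log2(ppm / 0.25)

# Score every window of motif width along one strand of a DNA string
score_windows <- function(dna, pwm) {
  bases <- strsplit(dna, "")[[1]]
  w <- ncol(pwm)
  if (length(bases) < w) return(numeric(0))
  starts <- seq_len(length(bases) - w + 1)
  vapply(starts, function(s) {
    window <- bases[s:(s + w - 1)]
    # Pick one matrix entry per position: (row of the observed base, column)
    sum(pwm[cbind(match(window, rownames(pwm)), seq_len(w))])
  }, numeric(1))
}

dna <- "CCAGTACCGT"
scores <- score_windows(dna, pwm)

# Report start positions of windows scoring at or above the threshold
min_score <- 4
hits <- which(scores >= min_score)
print(hits)
```

In this toy sequence the only window reaching the threshold starts at position 3, where the sequence reads AGTA. A real analysis would also scan the reverse complement strand.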
The return value of calcBinnedMotifEnrR (called se here) is a SummarizedExperiment
with motifs in rows and bins in columns, and multiple assays holding the
significance and magnitude of the enrichments.
The inputs for calcBinnedMotifEnrR can be easily obtained using other
Bioconductor packages:
# get sequences ('lmrs' is a GRanges object)
library(Biostrings)
library(BSgenome.Mmusculus.UCSC.mm10)
seqs <- getSeq(BSgenome.Mmusculus.UCSC.mm10, lmrs)
# bin sequences ('deltaMeth' is a numerical vector)
bins <- monaLisa::bin(x = deltaMeth, binmode = "equalN", nElement = 800)
# obtain known motifs from JASPAR
library(JASPAR2020)
library(TFBSTools)
pwms <- getMatrixSet(JASPAR2020, list(matrixtype = "PWM", tax_group = "vertebrates"))
monaLisa: MOtif aNAlysis with Lisa

Current contributors include:

News
monaLisa is available on Bioconductor.
monaLisa is now published in Bioinformatics.

Citation
To cite monaLisa, please use the publication in Bioinformatics or see citation("monaLisa").

Installation
monaLisa can be installed from Bioconductor via the BiocManager package.

Functionality
A minimal monaLisa analysis calls calcBinnedMotifEnrR on the sequences, bins and motifs prepared above. The results can be conveniently visualized.
