Standard set of data-loaders for training and making predictions for DNA sequence-based models.
All dataloaders in kipoiseq.dataloaders decorated with @kipoi_dataloader (SeqIntervalDl and StringSeqIntervalDl) are compatible with Kipoi models and can be used directly when specifying a new model in model.yaml.
from kipoiseq.dataloaders import SeqIntervalDl
dl = SeqIntervalDl.init_example() # use the provided example files
# your own files
dl = SeqIntervalDl("intervals.bed", "genome.fa")
len(dl) # length of the dataset
dl[0] # get one instance; returns a dictionary:
# dict(inputs=<one-hot-encoded-array>,
#      targets=<additional columns in the bed file>,
#      metadata=dict(ranges=GenomicRanges(chr=, start, end)...
data = dl.load_all() # load the whole dataset (named `data` to avoid shadowing the built-in `all`)
# load batches of data in parallel using 8 workers
it = dl.batch_iter(32, num_workers=8)
# each batch is a dictionary with all three keys: inputs, targets, metadata
it = dl.batch_train_iter(32, num_workers=8)
# each batch is a tuple (inputs, targets), which can be used directly with Keras' `model.fit_generator`
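To make the `inputs=<one-hot-encoded-array>` entry above more concrete, here is a minimal pure-Python sketch of one-hot encoding a DNA string. It is an illustration only, not the library's implementation: the function name `one_hot_dna`, the `ACGT` column order, and the choice to encode unknown bases such as `N` as all-zero rows are assumptions made for this example.

```python
def one_hot_dna(seq, alphabet="ACGT"):
    """Encode a DNA string as a list of rows, one row per base.

    Assumption for this sketch: bases outside the alphabet (e.g. "N")
    become an all-zero row. The library's own handling may differ.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in seq.upper():
        row = [0.0] * len(alphabet)
        if base in index:
            row[index[base]] = 1.0
        encoded.append(row)
    return encoded


# Print one row per base so the column layout is easy to inspect.
for base, row in zip("ACGTN", one_hot_dna("ACGTN")):
    print(base, row)
```

The result has shape (sequence length, alphabet size), so a batch of equally long sequences stacks into a three-dimensional array.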
Installation
Optional dependencies:
Getting started
More info:
How to write your own data-loaders
See SeqIntervalDl in kipoiseq/dataloaders/sequence.py. The @kipoi_dataloader decorator and the long yaml doc-string are only required if you want to use dataloaders in Kipoi’s model.yaml files.
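As a rough sketch of what a data-loader looks like without the decorator and yaml doc-string, the class below implements only `__len__` and `__getitem__` and returns the three keys described earlier (inputs, targets, metadata). Everything here is hypothetical and for illustration: the class name `ToyIntervalDataset`, the in-memory `genome` dictionary standing in for a FASTA file, and the plain-dict `ranges` entry are assumptions, not part of kipoiseq.

```python
class ToyIntervalDataset:
    """Hypothetical in-memory stand-in for an interval-based dataset."""

    def __init__(self, intervals, genome):
        # intervals: list of (chrom, start, end) tuples, 0-based, end-exclusive
        # genome: dict mapping chromosome name to its sequence string
        self.intervals = intervals
        self.genome = genome

    def __len__(self):
        return len(self.intervals)

    def __getitem__(self, idx):
        chrom, start, end = self.intervals[idx]
        seq = self.genome[chrom][start:end]
        return {
            "inputs": seq,    # raw string; an encoding step could be applied here
            "targets": None,  # no extra columns in this toy example
            "metadata": {"ranges": {"chr": chrom, "start": start, "end": end}},
        }


genome = {"chr1": "ACGTACGTAC"}
ds = ToyIntervalDataset([("chr1", 0, 4), ("chr1", 4, 10)], genome)
print(len(ds))          # 2
print(ds[1]["inputs"])  # ACGTAC
```

Returning the raw string keeps the sketch short; a sequence transform such as one-hot encoding would normally be applied to `seq` before it is returned.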