Bacterial Annotation by Learned Representation Of Genes (C++ version)
Overview
This is intended as a repository for stable bioconda releases. The main Balrog repo can be found here
Balrog is a prokaryotic gene finder based on a Temporal Convolutional Network. We took a data-driven approach to prokaryotic gene finding, relying on the large and diverse collection of already-sequenced genomes. By training a single, universal model of bacterial genes on protein sequences from many different species, we were able to match the sensitivity of current gene finders while reducing the overall number of gene predictions. Balrog does not need to be refit on any new genome.
Preprint available on bioRxiv here.
Getting started
Conda can be very slow, and we are working on providing precompiled binaries for Unix systems in the near future.