Note: The implementation of gff2bed follows a philosophy of simplicity. It depends only on the Python standard library, and it provides only the parse() and convert() functions. In practice, you will typically use gff2bed in conjunction with other modules such as pandas or pybedtools.
gff2bed
Overview
GFF3 and BED are common formats for storing the coordinates of genomic features such as genes. GFF3 format is more versatile, but BED format is simpler and enjoys a rich ecosystem of utilities such as bedtools. For this reason, it is often convenient to store genomic features in GFF3 format and convert them to BED format for genome arithmetic.
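To make the conversion concrete, here is a minimal standard-library sketch of what turning one GFF3 record into a BED6 record involves. It is an illustration only and should not be read as gff2bed's implementation: the function name gff3_line_to_bed6, its parameters, and the example record are all hypothetical. The key step is the coordinate shift, since GFF3 intervals are 1-based and inclusive while BED intervals are 0-based and half-open.

```python
from typing import Optional, Tuple

# BED6 record: chrom, chromStart, chromEnd, name, score, strand
Bed6 = Tuple[str, int, int, str, int, str]


def gff3_line_to_bed6(
    line: str, feature_type: str = "gene", tag: str = "ID"
) -> Optional[Bed6]:
    """Convert one GFF3 line to a BED6 tuple, or return None if it is skipped.

    Hypothetical helper for illustration; it is not part of gff2bed.
    """
    # Skip blank lines, directives, and comments
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    seqid, _source, ftype, start, end, _score, strand, _phase, attributes = fields
    if ftype != feature_type:
        return None
    # Column 9 holds semicolon-separated key=value pairs
    attrs = dict(
        pair.split("=", 1) for pair in attributes.split(";") if "=" in pair
    )
    # 1-based inclusive start becomes 0-based; the end coordinate is unchanged
    return (seqid, int(start) - 1, int(end), attrs.get(tag, "."), 0, strand)


# Made-up record used only to exercise the helper
example = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=demo"
bed_record = gff3_line_to_bed6(example)
print("\t".join(str(field) for field in bed_record))
```

Only the start coordinate changes: a feature spanning positions 1000 to 2000 in GFF3 is written as 999 to 2000 in BED, and both describe the same 1001 bases.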
This module provides two convenience functions to streamline converting data from GFF3 to BED format for bioinformatics analysis:
parse(), which reads data from a GFF3 file, and
convert(), which converts GFF3-formatted data to BED-formatted data that can be passed on e.g. to pybedtools.
Documentation
See full online documentation at http://salk-tm.gitlab.io/gff2bed
Installation
With conda
gff2bed is available from bioconda, and can be installed with conda.
With pip
gff2bed is available from PyPI, and can be installed with pip.
Tutorial
To follow this tutorial, first ensure that the other modules it uses, pandas and pybedtools, are installed in addition to gff2bed.
This tutorial will involve working with some files on disk, so we’ll make a temporary directory for easy cleanup later.
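One way to set up such a directory is with the standard library's tempfile module. This is a sketch under assumptions: the variable names and the file name example.gff3 are hypothetical and may differ from what the original tutorial used.

```python
import tempfile
from pathlib import Path

# Create a temporary directory; it stays on disk until cleanup() is called
temp_dir = tempfile.TemporaryDirectory()
temp_path = Path(temp_dir.name)

# Hypothetical location for the example GFF3 file used in later steps
gff3_path = temp_path / "example.gff3"
print(temp_path.is_dir())
```

Keeping the TemporaryDirectory object around, rather than using it as a context manager, lets the directory persist across several interactive steps.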
Next, download an example GFF3 file.
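The download step could be sketched with urllib.request from the standard library. The URL of the example file is not given here, so the sketch below does not guess one: the download() helper is hypothetical, and it is exercised on a local file:// URL pointing at a made-up, one-record GFF3 file. To fetch a real file, pass its URL instead.

```python
import tempfile
import urllib.request
from pathlib import Path


def download(url: str, dest: Path) -> Path:
    """Fetch url and save it to dest (hypothetical helper)."""
    urllib.request.urlretrieve(url, str(dest))
    return dest


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in source: a made-up GFF3 file exposed through a file:// URL
    source = Path(tmp) / "source.gff3"
    source.write_text(
        "##gff-version 3\n"
        "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1\n"
    )
    dest = download(source.as_uri(), Path(tmp) / "example.gff3")
    downloaded_text = dest.read_text()

print(downloaded_text.splitlines()[0])
```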
To read the GFF3 file into a pandas data frame without converting to BED, use gff2bed.parse().
To create a data frame of BED-formatted data, pass the stream to gff2bed.convert() before passing it to pd.DataFrame().
You can similarly create a BedTool with pybedtools.
To complete the tutorial, clean up the temporary directory.
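Assuming the temporary directory was created with tempfile.TemporaryDirectory (an assumption, since the original setup code is not shown), cleanup is a single call. The sketch below creates its own directory and placeholder file so that it runs on its own; the names are hypothetical.

```python
import tempfile
from pathlib import Path

# Recreate the setup so this snippet runs by itself
temp_dir = tempfile.TemporaryDirectory()
temp_path = Path(temp_dir.name)
(temp_path / "example.gff3").write_text("##gff-version 3\n")

# Remove the directory together with everything inside it
temp_dir.cleanup()
still_exists = temp_path.exists()
print(still_exists)  # False
```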