Many bioinformatics tasks require converting gene identifiers from one
convention to another, or annotating gene identifiers with gene symbol,
description, position, etc. Sure, biomaRt does this for you, but I got
tired of remembering biomaRt syntax and hammering Ensembl’s servers
every time I needed to do this.
This package has basic annotation information from Ensembl Genes 114
for:
Human build 38 (grch38)
Human build 37 (grch37)
Mouse (grcm38)
Rat (rnor6)
Chicken (galgal5)
Worm (wbcel235)
Fly (bdgp6)
Macaque (mmul801)
Pig (Sscrofa11.1)
Dog (ROS_Cfam_1.0)
Zebrafish (GRCz11)
Each table contains:
ensgene: Ensembl gene ID
entrez: Entrez gene ID
symbol: Gene symbol
chr: Chromosome
start: Start position on the chromosome
end: End position on the chromosome
strand: Strand
biotype: Protein coding, pseudogene, mitochondrial tRNA, etc.
description: Full gene name/description
Additionally, there are tx2gene tables that link Ensembl gene IDs to
Ensembl transcript IDs.
Usage
library(annotables)
Look at the human genes table (note the description column gets cut off
because the table becomes too wide to print nicely):
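A minimal sketch of that first look, assuming the package is installed and that the human build 38 table is exposed under the name `grch38` as listed above. The `glimpse()` call is an addition here (it comes from dplyr, not annotables) and is only one way to see the columns that get cut off in the default print:

```r
library(annotables)
library(dplyr)

# Default tibble print: first rows only, wide columns truncated
grch38

# One row per column, so the description column is visible too
glimpse(grch38)
```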
Look at the human genes-to-transcripts table:
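A short sketch, assuming the human tx2gene table is named `grch38_tx2gene` and that its gene ID column is called `ensgene`, matching the gene tables. The per-gene transcript count is an illustrative extra, not something the package computes for you:

```r
library(annotables)
library(dplyr)

# Print the transcript-to-gene mapping table
grch38_tx2gene

# Count transcripts per gene (assumes the gene ID column is `ensgene`)
tx_per_gene <- grch38_tx2gene %>%
  count(ensgene, name = "n_transcripts", sort = TRUE)

head(tx_per_gene)
```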
Tables are saved in tibble format, pipe-able with dplyr:
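For instance, something along these lines pulls out protein-coding genes on chromosome 1. It assumes that `biotype` uses the Ensembl-style value `"protein_coding"` and that `chr` is stored as character; check both against your installed version before relying on them:

```r
library(annotables)
library(dplyr)

# Keep protein-coding genes on chromosome 1 and a handful of columns
protein_coding_chr1 <- grch38 %>%
  filter(biotype == "protein_coding", chr == "1") %>%
  select(ensgene, symbol, chr, start, end, description)

head(protein_coding_chr1)
```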
Example with DESeq2 results from the airway package, made tidy with biobroom:
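The following is a sketch of how such a workflow might look, not necessarily the exact code behind this example. It assumes DESeq2, airway and biobroom are installed from Bioconductor, that the design `~ cell + dex` is the comparison of interest, and that biobroom's tidied output uses the column names `gene`, `estimate` and `p.adjusted`:

```r
library(DESeq2)
library(airway)
library(biobroom)
library(dplyr)
library(annotables)

# Fit the model on the airway dataset (design formula is an assumption)
data(airway)
dds <- DESeqDataSet(airway, design = ~ cell + dex)
dds <- DESeq(dds)
res <- results(dds)

# Tidy the results: one row per gene, Ensembl IDs in the `gene` column
res_tidy <- tidy(res)

# Take the 20 genes with the smallest adjusted p-values and annotate them
top_annotated <- res_tidy %>%
  arrange(p.adjusted) %>%
  head(20) %>%
  inner_join(grch38, by = c("gene" = "ensgene")) %>%
  select(gene, estimate, p.adjusted, symbol, description)

top_annotated
```

Because this uses `inner_join()`, genes whose Ensembl IDs are absent from `grch38` are dropped; `left_join()` would keep them with missing annotation instead.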