Key Bioconductor and R Packages for Bioinformatics

Top Bioconductor Packages

  • DESeq2: Differential gene expression analysis based on the negative binomial distribution.
  • edgeR: Differential expression analysis of RNA-Seq and other count data.
  • limma: Linear models for microarray data analysis.
  • GenomicRanges: Representation and manipulation of genomic intervals and variables defined along a genome.
  • SummarizedExperiment: Container for storing high-throughput assay data and metadata.
  • SingleCellExperiment: Container for single-cell RNA-Seq data.
  • AnnotationHub: Access to a variety of genome annotation resources.
  • BiocGenerics: S4 generic functions used across Bioconductor packages.
  • Rsamtools: Input/output, manipulation, and analysis of SAM/BAM files.
  • biomaRt: Interface to BioMart databases like Ensembl.
  • tximport: Import and summarize transcript-level estimates for gene-level analysis.
  • VariantAnnotation: Annotation of variants detected by high-throughput sequencing.
  • GenomicFeatures: Representation and manipulation of transcript annotation databases.
  • Biostrings: Efficient manipulation of biological strings.
  • Gviz: Plotting data and annotation information along genomic coordinates.
  • ComplexHeatmap: Making complex, annotated heatmaps.
  • scran: Methods for single-cell RNA-Seq data analysis.
  • scater: Single-cell analysis tools for quality control, normalization, and visualization.
  • IRanges: Infrastructure for representing and manipulating intervals.
  • HDF5Array: HDF5 backend for DelayedArray objects.
  • TCGAbiolinks: Tools for downloading, preparing, and analyzing TCGA data.
  • edgeRun: Unconditional exact test for differential expression in RNA-Seq count data (originally released on CRAN rather than Bioconductor).
  • ChIPseeker: Annotation of peaks in ChIP-Seq data.
  • clusterProfiler: Statistical analysis and visualization of functional profiles for genes and gene clusters.
  • ensembldb: Utilities for working with Ensembl-based annotations.
  • GEOquery: Get data from NCBI Gene Expression Omnibus (GEO).
  • pathview: Pathway-based data integration and visualization.
  • GSEABase: Gene set enrichment data structures and methods.
  • GOstats: Tools for manipulating GO and microarray data.
  • GO.db: A set of annotation maps describing the entire Gene Ontology.
  • WGCNA: Weighted correlation network analysis for gene expression data (distributed through CRAN rather than Bioconductor).
  • maftools: Analysis and visualization of mutation annotation format (MAF) files.
  • DiffBind: Differential binding analysis of ChIP-Seq peak data.
  • BSgenome: Infrastructure for Biostrings-based genome data packages.
  • Rhtslib: High-throughput sequencing library as used by Rsamtools.
  • ShortRead: Import and analyze high-throughput sequencing data.
  • BiocParallel: Bioconductor facilities for parallel evaluation.
  • msigdb: Import Molecular Signatures Database (MSigDB) gene sets.
  • goseq: Gene Ontology analysis for RNA-Seq data.
  • ReactomePA: Pathway enrichment analysis with Reactome Pathway Database.
  • sva: Surrogate variable analysis for removing batch effects and other unwanted variation in high-throughput experiments.
  • AnnotationForge: Tools for building SQLite-based annotation data packages.
  • BioCycData: Access to BioCyc Pathway/Genome Database Collection.
  • biovizBase: Basic graphic utilities for visualization of genomic data.
  • scRNAseq: Collection of public single-cell RNA-Seq datasets.
  • TxDb.Hsapiens.UCSC.hg38.knownGene: Annotation package for TxDb object(s).
  • GenomicAlignments: Representation and manipulation of short genomic alignments.
  • rtracklayer: Extensible framework for interacting with multiple genome browsers.
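
As an illustration of how the interval infrastructure listed above is typically used, the following is a minimal sketch with GenomicRanges and IRanges. The gene names, coordinates, and peak positions are invented for the example, and the code only runs its body if GenomicRanges is already installed.

```r
# Minimal sketch: find which (made-up) genes overlap which (made-up) peaks.
# Assumes GenomicRanges is installed; the block is skipped otherwise.
overlapped <- NULL

if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  # Attaching GenomicRanges also attaches IRanges and S4Vectors.
  suppressPackageStartupMessages(library(GenomicRanges))

  # Three hypothetical genes on two chromosomes.
  genes <- GRanges(
    seqnames = c("chr1", "chr1", "chr2"),
    ranges   = IRanges(start = c(100, 500, 100), end = c(200, 600, 200)),
    strand   = c("+", "-", "+"),
    gene_id  = c("geneA", "geneB", "geneC")
  )

  # Two hypothetical ChIP-Seq peaks (unstranded).
  peaks <- GRanges(
    seqnames = c("chr1", "chr2"),
    ranges   = IRanges(start = c(150, 300), end = c(160, 310))
  )

  # Each hit pairs a peak (query) with a gene (subject).
  hits <- findOverlaps(peaks, genes)

  # Gene identifiers of all overlapped genes.
  overlapped <- genes$gene_id[subjectHits(hits)]
  print(overlapped)
}
```

Only the first peak falls inside a gene here, so a single gene identifier is reported. Containers such as SummarizedExperiment store their row annotations as the same kind of GRanges object, which is why this package appears as a dependency of so many others in the list.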

Top General R Packages

  • ggplot2: Data visualization package based on the Grammar of Graphics.
  • dplyr: Data manipulation functions that simplify complex operations on data frames.
  • tidyr: Tools for converting data to tidy format.
  • stringr: Simple, consistent functions to manipulate strings.
  • lubridate: Functions to work with date-times and timespans.
  • shiny: Web application framework for R.
  • caret: Classification and regression training package.
  • rmarkdown: Dynamic documents for R.
  • knitr: A general-purpose literate programming engine.
  • data.table: Extension of data.frame for fast manipulation of large datasets.
  • xtable: Export tables to LaTeX or HTML.
  • forecast: Tools for forecasting and time series analysis.
  • randomForest: Classification and regression based on a forest of trees using random inputs.
  • survival: Survival analysis, including penalized likelihood.
  • glmnet: Lasso and elastic-net regularized generalized linear models.
  • plotly: Interactive, web-based graphs via plotly’s JavaScript graphing library.
  • sf: Simple features for R, for handling vector data.
  • zoo: S3 infrastructure for regular and irregular time series.
  • tm: Framework for text mining applications within R.
  • lme4: Linear mixed-effects models using ‘Eigen’ and S4.
  • httr: Tools for working with URLs and HTTP.
  • Rcpp: Seamless R and C++ integration.
  • sp: Classes and methods for spatial data.
  • leaflet: Create interactive web maps with the JavaScript ‘Leaflet’ library.
  • MASS: Functions and datasets to support Venables and Ripley’s MASS book.
  • readr: Read rectangular data (csv, tsv, fwf).
  • magrittr: Provides a mechanism for chaining commands with a new forward-pipe operator.
  • haven: Import and export ‘SPSS’, ‘Stata’ and ‘SAS’ files.
  • tibble: Modern re-imagining of data frames.
  • purrr: Functional programming tools.
  • janitor: Simple tools for examining and cleaning dirty data.
  • forcats: Tools for working with categorical variables (factors).
  • sparklyr: R interface for Apache Spark.
  • odbc: Connect to ODBC compatible databases.
  • curl: A Modern and Flexible Web Client for R.
  • jsonlite: A Simple and Robust JSON Parser and Generator for R.
  • xml2: A modern XML package.
  • RCurl: General network (HTTP/FTP/…) client interface for R.
  • highcharter: A wrapper for the ‘Highcharts’ library.
  • DT: A wrapper of the DataTables JavaScript library.
  • flexdashboard: R Markdown Format for Flexible Dashboards.
  • DiagrammeR: Create graph diagrams and flowcharts using R.
  • visNetwork: Network visualization using vis.js library.
  • tmap: Thematic maps.
  • mapview: Interactive viewing of spatial data.
  • dygraphs: Interface to ‘dygraphs’ JavaScript Charting Library.
  • threejs: Interactive 3D scatter plots and globes.
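
Several of the general-purpose packages above are commonly combined in one short pipeline. The sketch below reshapes a small table with tidyr, summarizes it with dplyr, and builds a bar chart with ggplot2. The counts and sample names are invented for illustration, and the body is skipped if any of the three packages is missing.

```r
# Minimal sketch: reshape, summarize, and plot a toy count table.
# Assumes dplyr, tidyr, and ggplot2 are installed; skipped otherwise.
mean_counts <- NULL

if (requireNamespace("dplyr", quietly = TRUE) &&
    requireNamespace("tidyr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  suppressPackageStartupMessages({
    library(dplyr)
    library(tidyr)
    library(ggplot2)
  })

  # Made-up counts for three genes in two control and two treated samples.
  counts <- data.frame(
    gene   = c("geneA", "geneB", "geneC"),
    ctrl_1 = c(10, 200, 35),
    ctrl_2 = c(12, 180, 40),
    trt_1  = c(30, 190, 5),
    trt_2  = c(28, 210, 7)
  )

  # Wide to long: one row per gene and sample.
  long <- pivot_longer(counts, cols = -gene,
                       names_to = "sample", values_to = "count")

  # Derive the condition from the sample name, then average per group.
  mean_counts <- long %>%
    mutate(condition = sub("_.*$", "", sample)) %>%
    group_by(gene, condition) %>%
    summarise(mean_count = mean(count), .groups = "drop")

  # Build the plot object; call print(p) to render it.
  p <- ggplot(mean_counts, aes(x = gene, y = mean_count, fill = condition)) +
    geom_col(position = "dodge") +
    labs(x = "Gene", y = "Mean count", fill = "Condition")
}
```

Keeping the data in long format before summarizing is a design choice rather than a requirement, but it is the shape that both group_by() and ggplot2 aesthetics expect.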

This list covers widely used Bioconductor and general-purpose R packages for bioinformatics, data analysis, and visualization. For documentation, vignettes, and installation instructions for each package, see the Bioconductor website and CRAN.
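
The two repositories use different installers: CRAN packages are installed with install.packages(), while Bioconductor packages are installed with BiocManager::install(). The sketch below shows one way to check which packages are missing and install them. The helper names not_installed and install_missing, and the short package selections, are hypothetical choices for this example. By default it runs as a dry run and installs nothing.

```r
# Minimal sketch: report (and optionally install) missing packages.
# Package selections are examples taken from the lists above.
cran_pkgs <- c("ggplot2", "dplyr", "data.table")
bioc_pkgs <- c("DESeq2", "GenomicRanges", "SummarizedExperiment")

# Hypothetical helper: return the subset of `pkgs` that cannot be loaded.
not_installed <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Hypothetical helper: install what is missing unless dry_run is TRUE.
install_missing <- function(cran = cran_pkgs, bioc = bioc_pkgs,
                            dry_run = TRUE) {
  todo <- list(cran = not_installed(cran), bioc = not_installed(bioc))

  if (!dry_run) {
    if (length(todo$cran) > 0) {
      install.packages(todo$cran)
    }
    if (length(todo$bioc) > 0) {
      # BiocManager itself lives on CRAN.
      if (!requireNamespace("BiocManager", quietly = TRUE)) {
        install.packages("BiocManager")
      }
      BiocManager::install(todo$bioc)
    }
  }

  todo
}

# Dry run: only lists what would be installed.
todo <- install_missing()
print(todo)
```

Calling install_missing(dry_run = FALSE) performs the actual installation. BiocManager selects the Bioconductor release that matches the running R version, which is the main reason to prefer it over installing Bioconductor packages by hand.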
