Author Archives: gene_x

Creating a 3D scatterplot using ggplot2 in R

library(ggplot2)
library(plotly)

# Create some random data
set.seed(123)
x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)

# Perform PCA
data <- data.frame(x, y, z)
pca <- prcomp(data, scale = TRUE)
scores <- as.data.frame(pca$x)

# Create 3D scatterplot of the PCA scores using plotly
# (ggplot2 has no z aesthetic, scale_z_continuous(), or zlim, so the 3D view is rendered with plot_ly)
plot_ly(scores, x = ~PC1, y = ~PC2, z = ~PC3,
        type = "scatter3d", mode = "markers",
        marker = list(size = 3, color = "blue")) %>%
  layout(title = "3D Scatterplot using PCA",
         scene = list(xaxis = list(title = "PC1", range = c(-4, 4)),
                      yaxis = list(title = "PC2", range = c(-4, 4)),
                      zaxis = list(title = "PC3", range = c(-4, 4))))

This code generates some random data, performs principal component analysis (PCA) on it with prcomp(), and plots the resulting PCA scores in a 3D scatterplot. Because ggplot2 only supports 2D plots, the 3D view is drawn with plotly's plot_ly() using the scatter3d trace type. The layout() call sets the plot title as well as the axis titles and axis ranges of the 3D scene.

draw 2D PCA from rld

library(ggplot2)
data <- plotPCA(rld, intgroup=c("condition", "donor"), returnData=TRUE)
percentVar <- round(100 * attr(data, "percentVar"))
ggplot(data, aes(x=PC1, y=PC2, color=condition, shape=donor)) +
      geom_point(size=8) + labs(x = "PC1", y = "PC2") +
      scale_color_manual(values = c("untreated" = "grey",
                                    "mCh d3"="#a6cee3",
                                    "mCh d8"="#1f78b4",
                                    "GFP+mCh d9/12"="cyan",
                                    "GFP d3"="#b2df8a",
                                    "GFP d8"="#33a02c",
                                    "sT d3"="#fb9a99",
                                    "sT d8"="#e31a1c",
                                    "LT d3"="#fdbf6f",
                                    "LT d8"="#ff7f00",
                                    "LTtr d3"="#cab2d6",
                                    "LTtr d8"="#6a3d9a",
                                    "sT+LT d3"="#ffff99",                        
                                    "sT+LTtr d9/12"="#a14a1a")) 
      xlab(paste0("PC1: ",percentVar[1],"% variance")) +
      ylab(paste0("PC2: ",percentVar[2],"% variance")) + theme(axis.text = element_text(face="bold",size = 21), axis.title = element_text(face="bold",size = 21)) + theme(legend.text = element_text(size = 20)) + theme(legend.title = element_text(size = 22)) + guides(color = guide_legend(override.aes = list(size = 10)), shape = guide_legend(override.aes = list(size = 10)), alpha = guide_legend(override.aes = list(size = 10)))

bubble plots in R using ggplot2

library(ggplot2)
png("bubble_plot.png", 3000, 2000)
# mydat is read in from GO_DAVID_Summary_R.csv in the snippet below
ggplot(mydat, aes(y = Term, x = Comparison)) +
  geom_point(aes(color = Regulation, size = Count, alpha = abs(log10(FDR)))) +
  scale_color_manual(values = c("up" = "red", "down" = "blue")) +
  scale_size_continuous(range = c(1, 34)) +
  labs(x = "", y = "", color="Regulation", size="Count", alpha="-log10(FDR)") +
  theme(axis.text.y = element_text(face = "bold")) +
  theme(axis.text.x = element_text(angle = 30, vjust = 0.5)) +
  theme(axis.text = element_text(size = 40)) +
  theme(legend.text = element_text(size = 40)) +
  theme(legend.title = element_text(size = 40)) +
  guides(color = guide_legend(override.aes = list(size = 20)),
         alpha = guide_legend(override.aes = list(size = 20)))
dev.off()

library(ggplot2)
library(dplyr)
library(magrittr)
library(tidyr)
library(forcats)

mydat <- read.csv2("GO_DAVID_Summary_R.csv", sep=",", header=TRUE)
#mydat$GeneRatio <- sapply(mydat$GeneRatio_frac, function(x) eval(parse(text=x)))
mydat$FoldEnrichment <- as.numeric(mydat$FoldEnrichment)
mydat$Comparison <- factor(mydat$Comparison, levels=c("sT 3 dpi","sT 8 dpi","LT 3 dpi","LT 8 dpi","LTtr 3 dpi","LTtr 8 dpi","sT+LT 3 dpi","sT+LTtr 9/12 dpi"))    # solution for point 2
mydat$Term <- factor(mydat$Term, levels=rev(c("Cell proliferation","Negative regulation of cell proliferation","Cell division","Mitotic nuclear division","G1/S transition of mitotic cell cycle","DNA replication initiation","DNA replication","Sister chromatid cohesion","DNA repair","Cellular response to DNA damage stimulus","Transcription, DNA-templated","Regulation of transcription, DNA-templated","Positive regulation of transcription, DNA-templated","Positive regulation of transcription from RNA polymerase II promoter","Positive regulation of gene expression","Negative regulation of transcription from RNA polymerase II promoter","rRNA processing","Protein folding","Inflammatory response","Immune response","Innate immune response","Positive regulation of ERK1 and ERK2 cascade","Chemokine-mediated signaling pathway","Chemotaxis","Cell chemotaxis","Neutrophil chemotaxis","Viral process","Response to virus","Defense response to virus","Cellular response to lipopolysaccharide","Type I interferon signaling pathway","Extracellular matrix organization","Cell adhesion","Nervous system development","Angiogenesis","Apoptotic process")))    # solution for point 2
mydat$Regulation <- factor(mydat$Regulation, levels=c("up","down"))
png("bubble.png", width=1166, height=1067)
ggplot(mydat, aes(y = Term, x = Comparison, size = FoldEnrichment)) +
  geom_point(aes(color = Regulation), alpha = 1.0) +
  labs(x = "", y = "") +
  theme(axis.text.x = element_text(angle = 30, vjust = 0.5)) +
  theme(axis.text = element_text(size = 20)) +
  theme(legend.text = element_text(size = 20)) +
  theme(legend.title = element_text(size = 20)) +
  scale_size(range = c(1, 20)) +
  guides(color = guide_legend(override.aes = list(size = 10)))    # solution for point 1
dev.off()

libraries with a syntax similar to ggplot2 for creating 3D plots in R

If you are looking for a library with a syntax similar to ggplot2 for creating 3D plots in R, two options worth checking out are the ggplot2 extension gg3D and the rayshader package. Here is a brief description of each package:

gg3D: This package extends ggplot2 so that 3D plots can be built with the familiar ggplot() syntax. It adds layers such as axes_3D() and stat_3D() that project the x, y, and z aesthetics onto a 3D coordinate system. It is installed from GitHub with devtools::install_github("AckerDWM/gg3D").

rayshader: This package turns 2D maps and ggplot2 objects into 3D visualizations with realistic shading and lighting effects. It provides functions for creating elevation maps and hillshades, and plot_gg() renders an ordinary ggplot2 plot as a 3D scene.

Here is an example code snippet using the gg3D package to create a 3D scatter plot:

library(ggplot2)
library(gg3D)   # devtools::install_github("AckerDWM/gg3D")

# create a 3D scatter plot
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, z = Petal.Length, color = Species)) +
  axes_3D() +
  stat_3D() +
  theme_void()

This will create a 3D scatter plot of the iris dataset with Sepal.Length, Sepal.Width, and Petal.Length as the x, y, and z coordinates, respectively, and Species as the color variable. The axes_3D() layer draws the 3D axes, stat_3D() projects the points into the 3D coordinate system, and theme_void() removes the default 2D axes so that only the 3D plot is shown.
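The rayshader route mentioned above takes an ordinary 2D ggplot2 object and renders it as a 3D scene. A minimal sketch (assumes rayshader and an rgl-capable graphics device are available; the plot and file name are only examples):

library(ggplot2)
library(rayshader)

# build a normal 2D ggplot first
gg <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Petal.Length)) +
  geom_point(size = 2) +
  scale_color_viridis_c()

# render the ggplot as a 3D scene (opens an rgl window) and save a snapshot of the view
plot_gg(gg, width = 5, height = 5, multicore = TRUE)
render_snapshot("iris_3d.png")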

How to use the TxDb.Hsapiens.UCSC.hg38.knownGene package in R?

To use the TxDb.Hsapiens.UCSC.hg38.knownGene package in R, you will need to follow these steps:


  • Install the TxDb.Hsapiens.UCSC.hg38.knownGene package from Bioconductor using the following code:
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("TxDb.Hsapiens.UCSC.hg38.knownGene")
  • Load the package using the library() function:
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
  • Load the GenomicFeatures package, which is required by TxDb packages:
library(GenomicFeatures)
  • Assign the TxDb object exported by the package to a variable (the package provides the object directly; there is no separate TxDb() constructor to call):
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
  • This TxDb object contains information about gene models and transcripts from the hg38 version of the human genome, based on the knownGene track in the UCSC Genome Browser.
  • You can then use the GenomicFeatures package to retrieve information about genes and transcripts from the TxDb object. For example, you can use the exonsBy() function to retrieve exon coordinates grouped by gene and then subset the result by gene ID (knownGene TxDb objects are keyed by Entrez gene IDs, not Ensembl IDs):
    exons_by_gene <- exonsBy(txdb, by = "gene")
    exons <- exons_by_gene[["675"]]   # 675 = Entrez gene ID of BRCA2

This will retrieve the exon coordinates for BRCA2 (Entrez gene ID 675).

Note that the TxDb objects can be memory-intensive, especially for larger genomes or datasets, so you may need to be careful about the amount of data you load into memory at once. You can also use the saveDb() and loadDb() functions to save and load TxDb objects to/from disk, respectively.
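For example, a minimal sketch of saving the TxDb object to disk and reloading it in a later session (saveDb() and loadDb() come from AnnotationDbi, which is attached automatically with the TxDb package; the file name is just an example):

# write the annotation database to an SQLite file on disk
saveDb(txdb, file = "hg38_knownGene.sqlite")
# reload it later without reloading the whole annotation package
txdb <- loadDb("hg38_knownGene.sqlite")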

update all packages in R

# Load the utils package
library(utils)

# Get information about all installed packages
pkg_info <- packageStatus()

# Filter the information to show only packages that need to be updated
outdated_packages <- subset(pkg_info$inst, Status == "upgrade")

# Display the number of outdated packages
cat("Number of outdated packages:", nrow(outdated_packages), "\n")

pkg_info <- packageStatus()
pkg_info

Number of installed packages:

                                 ok  upgrade  unavailable
  /usr/local/lib/R/site-library  467  27       161
  /usr/lib/R/site-library          0   0         0
  /usr/lib/R/library              16  13         0

Number of available packages (each package counted only once):

                                                installed  not installed
  http://ftp.gwdg.de/pub/misc/cran/src/contrib  506        18771

# make the library directories writable before updating
# (library locations reported above: /usr/local/lib/R/site-library, /usr/lib/R/library)
sudo chown -R jhuang:jhuang /usr/lib/R/library/
sudo chown -R jhuang:jhuang /usr/share/R/doc/html
#update.packages(ask = FALSE)
update.packages(repos='http://cran.rstudio.com/', ask=FALSE, checkBuilt=TRUE)
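If Bioconductor packages are installed as well (as in the TxDb section above), a sketch of updating everything in one go: calling BiocManager::install() with no package argument checks the installed packages and updates the out-of-date ones.

# update out-of-date Bioconductor and CRAN packages together
BiocManager::install(ask = FALSE)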

draw pca plots with ggplot2 (2D) and plotly (3D)

#TODOs: next week
#- try installing gg3D
#- try installing kaleido (the Python backend used by plotly::save_image())

TODO: using python to generate the 3D plot! https://pypi.org/project/plotly/

# -- before pca --
#png("pca.png", 1200, 800)
svg("pca.svg")
plotPCA(rld, intgroup=c("replicates"))
#plotPCA(rld, intgroup = c("replicates", "batch"))
#plotPCA(rld, intgroup = c("replicates", "ids"))
#plotPCA(rld, "batch")
dev.off()

#TODO:adding label in the figure, change Donor I in blue and donor II in orange
#https://loading.io/color/feature/Paired-12/
svg("pca2.svg")
#plotPCA(rld, intgroup=c("replicates"))
#plotPCA(rld, intgroup = c("replicates", "batch"))
plotPCA(rld, intgroup = c("donor"))
#plotPCA(rld, "batch")
dev.off()
#TODO: adding label in the figure
svg("pca3.svg")
plotPCA(rld, intgroup=c("replicates2"))
dev.off()

#https://loading.io/color/feature/Paired-12/
#https://support.bioconductor.org/p/66404/
# -- calculate PC3 from rld --
library(genefilter)
ntop <- 500
rv <- rowVars(assay(rld))
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
mat <- t( assay(rld)[select, ] )
pc <- prcomp(mat)
pc$x[,1:3]

#To my case:
mat <- t( assay(rld) )
pc <- prcomp(mat)
#pca <- yourFavoritePCA( mat )

pc$x[,1:3]
#                          PC1         PC2        PC3
#untreated DI      -27.5379705   1.4478299  -6.389731
#untreated DII     -28.3320463   0.6794066   2.073768
#mCh d3 DII          2.8988953  -6.4372647  10.252829
#sT d3 DII           5.1869876   2.6116282  13.816117
#mCh d8 DII        -20.8047275   1.0708861   3.394721
#sT d8 DII          -4.5144119  19.6230473   8.357902
#mCh d3 DI          -4.5690693  -8.8938297  -7.391567
#sT d3 DI           -7.6326832   5.3781061   2.214181
#mCh d8 DI           0.8536828  -5.0593045 -13.325567
#sT d8 DI            1.9232111  24.8795741  -4.162946
#GFP d3 DII        -12.5042914  -3.3424106  15.207755
#LTtr d3 DII         5.2309178  -9.6124712   8.328132
#GFP d8 DII        -13.0652347  -8.2058086  15.078469
#LTtr d8 DII        13.0678654  -2.0677676   9.188943
#GFP d3 DI         -13.9999251  -1.4988226  -3.335085
#LTtr d3 DI          2.6090782  -9.5753559 -10.022324
#GFP d8 DI         -12.4430571  -6.0670545 -14.725450
#LTtr d8 DI          8.9794396   3.4918629 -14.410118
#LT d8 DII          18.8388058   0.2459081   2.334700
#LT d8 DI           15.2986278  -0.6055500 -11.034778
#GFP+mCh d9/12 DI  -17.3162152   3.2939931  -6.917358
#sT+LTtr d9/12 DI    6.8517730  17.9282911  -6.209778
#GFP+mCh d9/12 DII   2.0874834  -6.7379107   8.810602
#sT+LTtr d9/12 DII  19.3883422  19.6033774   4.314808
#LT d3 DI            6.5376031  -8.5766236  -6.500155
#LT d3 DII          17.8400725 -11.7362896   1.117396
#sT+LT d3 DI        16.6029944  -7.7951798  -5.593658
#sT+LT d3 DII       18.5238521  -4.0422674   5.528193

# vs.
#data
#                     PC1        PC2         group condition donor          name
#untreated DI  -27.537970  1.4478299  untreated:DI untreated    DI  untreated DI
#untreated DII -28.332046  0.6794066 untreated:DII untreated   DII untreated DII
#mCh d3 DII      2.898895 -6.4372647    mCh d3:DII    mCh d3   DII    mCh d3 DII

# -- construct a data structure (merged_df) as above with data and pc --
library(ggplot2)
data <- plotPCA(rld, intgroup=c("condition", "donor"), returnData=TRUE)
#calculate all PCs including PC3 with the following codes
library(genefilter)
ntop <- 500
rv <- rowVars(assay(rld))
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
mat <- t( assay(rld)[select, ] )
pc <- prcomp(mat)
pc$x[,1:3]
df_pc <- data.frame(pc$x[,1:3])

identical(rownames(data), rownames(df_pc)) #-->TRUE
## define the desired order of row names
#desired_order <- rownames(data)
## sort the data frame by the desired order of row names
#df <- df[match(desired_order, rownames(df_pc)), ]

data$PC1 <- NULL
data$PC2 <- NULL
merged_df <- merge(data, df_pc, by = "row.names")
#merged_df <- merged_df[, -1]
row.names(merged_df) <- merged_df$Row.names
merged_df$Row.names <- NULL  # remove the "name" column
merged_df$name <- NULL 
merged_df <- merged_df[, c("PC1","PC2","PC3","group","condition","donor")]

# -- draw 3D with merged_df using plot3D --
#https://stackoverflow.com/questions/45052188/how-to-plot-3d-scatter-diagram-using-ggplot
devtools::install_github("AckerDWM/gg3D")
library("gg3D")
png("pca10.png",800,800)
#svg("pca10.svg",10,10)
#methods(class = "prcomp")
#summary(pc) #--> Proportion of Variance  0.3647 0.1731 0.1515
#percentVar <- round(100 * attr(data, "percentVar"))
percentVar <- c(36,17,15)
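# Alternatively (a sketch), derive the percentages from the prcomp object instead of
# hard-coding them; for the variances above this also rounds to 36, 17 and 15:
#   percentVar <- round(100 * pc$sdev^2 / sum(pc$sdev^2))[1:3]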
#scatterplot3d

#Unfortunately, ggplot does not support 3D plotting. It is designed for creating 2D plots and visualizations in R. However, there are other packages available in R for creating 3D plots, such as plot3D, scatterplot3d, and rgl. These packages can be used to create 3D scatter plots, surface plots, and more complex 3D visualizations. You can install and load these packages in R using the following commands:
install.packages("plot3D")
library(plot3D)
install.packages("scatterplot3d")
library(scatterplot3d)
install.packages("rgl")
library(rgl)
#Once you have loaded these packages, you can create 3D plots using their respective functions. For example, you can create a 3D scatter plot using the plot3D package with the following code:
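# A minimal sketch with plot3D's scatter3D() on the PCA scores in merged_df
# (constructed above); any three numeric vectors work the same way:
library(plot3D)
scatter3D(merged_df$PC1, merged_df$PC2, merged_df$PC3,
          pch = 16, col = "blue", colvar = NULL,
          xlab = "PC1", ylab = "PC2", zlab = "PC3",
          main = "3D scatter plot of PCA scores")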

#https://plotly.com/r/3d-scatter-plots/
library(plotly)
data(mtcars)
png("xxx.png", 1200, 800)
plot_ly(mtcars, x = ~mpg, y = ~wt, z = ~qsec, type = "scatter3d", mode = "markers")
dev.off()

#  zlab(paste0("PC3: ",percentVar[3],"% variance")) + 
#scatterplot3d(merged_df[,c("PC1","PC2","PC3")], pch=16, color="blue", main="3D Scatter Plot")
# labs(x = "PC1", y = "PC2", z = "PC3") +
#axes_3D() + stat_3D() +
colors = "Set1",
#marker = list(symbol = ~shapes[group])
#using the corresponding keywords ("square", "triangle-up", "diamond", etc.). 

labs <- list(x = paste0("PC1: ",percentVar[1],"% variance"), y = paste0("PC2: ",percentVar[2],"% variance"), z = paste0("PC3: ",percentVar[3],"% variance"))
#ggplot(merged_df, aes(x=PC1, y=PC2, z=PC3, color=condition, shape=donor)) +

#https://stackoverflow.com/questions/75452609/update-color-in-different-marker-in-plotly-r
dt <- iris
dt$shape_1 <- c("Yes","No")
dt$color_1 <- c("Medium","Large","Small")

library(plotly)
# plotly::save_image() is used below; it requires the Python kaleido package (see ?save_image)
fig <- plot_ly(dt,
        x=1:nrow(iris),
        y=~Sepal.Length,
        type="scatter",
        mode='markers',
        color=~Species,
        colors = c("#4477AA","#DDCC77","#CC6677"),
        symbol = ~shape_1,
        symbols = c("triangle-up", "circle"),
        size = 20)   # add_surface() removed: a surface trace needs a z matrix and does not apply to this 2D scatter
# save the chart as an SVG file using the kaleido backend
save_image(fig, "chart.svg", width = 800, height = 600, scale = 2)

#        inherit = F,
#        size = ~Sepal.Width, 
#        sizes = c(10, 100) * 10)

plot_ly(dt,
        x = 1:nrow(iris),
        y = ~Sepal.Length,
        type = "scatter",
        mode = "markers",
        color = ~Species,
        colors = c("#4477AA","#DDCC77","#CC6677"),
        size = ~Sepal.Width,
        symbol = ~shape_1,
        symbols = c("triangle-up", "circle"),
        inherit = FALSE,
        sizes = c(10, 100) * 10) %>%
  add_trace(type = "scatter",
            mode = "text",
            text = ~Sepal.Width,
            textposition = "top right",
            color = ~color_1,
            colors = c("black","green","blue"),
            textfont = list(size = 10))

levels(factor(merged_df$condition))

"GFP d3"        "GFP d8"        "GFP+mCh d9/12" "LT d3"        
 [5] "LT d8"         "LTtr d3"       "LTtr d8"       "mCh d3"       
 [9] "mCh d8"        "sT d3"         "sT d8"         "sT+LT d3"     
[13] "sT+LTtr d9/12" "untreated"

merged_df$condition <- factor(merged_df$condition, levels=c("untreated","mCh d3","mCh d8","GFP+mCh d9/12","GFP d3","GFP d8","sT d3","sT d8","LT d3","LT d8","LTtr d3","LTtr d8","sT+LT d3","sT+LTtr d9/12"))
merged_df$donor <- as.character(merged_df$donor)
# Define a list of shapes for each group
shapes <- list("circle", "triangle-up")
# plotly handles axis titles, fonts, and legends via layout(); the ggplot2 layers
# (geom_point, scale_color_manual, theme, guides) do not apply to a plot_ly object,
# so the colour palette is passed directly to plot_ly and the axis labels from
# the `labs` list above are set in layout().
plot_ly(merged_df, x = ~PC1, y = ~PC2, z = ~PC3,
        type = "scatter3d", mode = "markers",
        color = ~condition,
        colors = c("grey","#a6cee3","#1f78b4","cyan","#b2df8a","#33a02c",
                   "#fb9a99","#e31a1c","#fdbf6f","#ff7f00","#cab2d6","#6a3d9a",
                   "#ffff99","#a14a1a"),
        symbol = ~donor, symbols = c("triangle-up", "circle")) %>%
  layout(scene = list(xaxis = list(title = labs$x),
                      yaxis = list(title = labs$y),
                      zaxis = list(title = labs$z)),
         legend = list(font = list(size = 20)))

#axis.title = element_text(face="bold",size = 20)
#p + theme(axis.text.x = element_text(face="bold",size=14), axis.text.y = element_text(face="bold",size=14))
#+ coord_fixed()
#+ theme(
#    # Set the width to 6 inches
#    fig.width = 6,
#    # Set the height to 4 inches
#    fig.height = 4
#  )

svg("pca4.svg")
plotPCA(rld, intgroup=c("days"))
dev.off()

scatter plot with categorical data using ggplot2

load packages

library(tidyverse)
library(palmerpenguins)
library(ggbeeswarm)
library(ggforce)

#remotes::install_github("allisonhorst/palmerpenguins")
# peek at penguins data
#glimpse(penguins)

# create some example data
x <- c(rep("A", 100), rep("B", 100), rep("C", 100))
y <- rnorm(300)

# create a data frame with x, y, and color columns
df <- data.frame(x = x, y = y, color = ifelse(c(1:300) %in% c(5, 10, 15), "Highlighted", "Normal"))

#  geom_point(size = 3) +
# plot the data with points colored by category and highlight
ggplot(df, aes(x = x, y = y, color = color)) +
  scale_color_manual(values = c("Normal" = "black", "Highlighted" = "red")) +
  geom_beeswarm(cex = 1.5) +
  theme_classic()

 #In this script, we use ggplot2 to create a scatter plot with categorical data and highlight some points. We start by creating an example dataset with a categorical variable x and a continuous variable y. We then create a data frame df with x, y, and color columns. The color column is set to "Highlighted" for the points we want to highlight and "Normal" for the rest.

 #We then use ggplot2 to plot the data. We map x, y, and color to the corresponding columns of df inside aes(). Instead of a plain geom_point() layer, geom_beeswarm() from the ggbeeswarm package spreads overlapping points within each category (the cex argument controls the spacing). We use scale_color_manual() to set the colors for the "Normal" and "Highlighted" categories. Finally, we use theme_classic() to set the theme of the plot to a classic theme.
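 #Since ggforce is already loaded above, a sina plot is a drop-in alternative to the beeswarm layer. A minimal sketch (geom_sina() spreads the points according to the local density within each category):

ggplot(df, aes(x = x, y = y, color = color)) +
  scale_color_manual(values = c("Normal" = "black", "Highlighted" = "red")) +
  geom_sina(size = 1.5) +
  theme_classic()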

Install top 24 Python Libraries for Data Science with pip

There are many packages that can be installed using pip, the Python package manager. Some of the commonly used packages that can be installed with pip include:

  • TensorFlow: an open-source machine learning framework developed by Google for building and training machine learning models.
  • NumPy: a library for scientific computing with Python, providing efficient numerical operations for multi-dimensional arrays and matrices.
  • SciPy: a collection of libraries for scientific and technical computing with Python, including tools for optimization, linear algebra, signal processing, and more.
  • Pandas: a library for data manipulation and analysis in Python, providing tools for reading, writing, and manipulating tabular data.
  • Matplotlib: a library for creating visualizations and plots in Python, providing tools for creating various types of charts and graphs.
  • Keras: an open-source neural network library written in Python, designed to enable fast experimentation with deep neural networks.
  • scikit-learn: a library for machine learning in Python, providing tools for data preprocessing, feature extraction, supervised and unsupervised learning, and model evaluation.
  • PyTorch: an open-source machine learning framework developed by Facebook for building and training machine learning models.
  • Scrapy: a framework for web scraping and crawling in Python, providing tools for extracting data from websites and APIs.
  • BeautifulSoup: a library for parsing HTML and XML documents in Python, providing tools for extracting and manipulating data from web pages.
  • LightGBM: a gradient boosting framework that uses tree-based learning algorithms, designed to be efficient and scalable for large-scale machine learning tasks.
  • ELI5: a library for explaining and visualizing machine learning models in Python, providing tools for feature importances, model weights, and more.
  • Theano: a library for numerical computation in Python, designed to allow developers to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays.
  • NuPIC: a machine intelligence platform for building intelligent applications, based on the principles of neuroscience and machine learning.
  • Ramp: a library for building predictive models in Python, designed to simplify the process of building and evaluating machine learning models.
  • Pipenv: a tool for managing Python dependencies and virtual environments, designed to make it easier to manage packages and versions.
  • Bob: a toolbox for machine learning in Python, providing tools for face recognition, speaker recognition, and more.
  • PyBrain: a library for building and training neural networks in Python, designed to be modular and flexible for a wide range of tasks.
  • Caffe2: a deep learning framework developed by Facebook for building and training neural networks, designed to be efficient and scalable for large-scale tasks.
  • Chainer: a Python-based deep learning framework for building and training neural networks, designed to be flexible and scalable for a wide range of tasks.
  • Django: a high-level web framework that provides a structured and scalable way to build web applications in Python. It includes built-in tools for handling tasks such as authentication, URL routing, and database schema migrations.
  • Flask: a lightweight web framework that provides flexibility and simplicity to developers. It allows you to build web applications and APIs in Python with minimal boilerplate code and provides support for extensions to add functionality.
  • Bottle: another lightweight web framework that allows you to build web applications and APIs in Python. It is designed to be simple and easy to use, with minimal dependencies.
  • Requests: a package that provides a simple and easy-to-use interface for sending HTTP requests in Python. It supports various HTTP methods such as GET, POST, PUT, and DELETE, and also allows you to customize headers, cookies, and other request parameters.

For example, to install the NumPy package, you can use the following command:

pip install numpy
pip install plotly==4.10.0

You can use the following command to check which packages are currently installed in your Python environment using pip:

pip list

This command will display a list of all the packages that have been installed using pip, along with their version numbers. If you want to check the version number of a specific package, you can use the following command:

pip show plotly
pip list | grep plotly

How to correct indent errors in Python?

There are several tools and editors that can help you correct indent errors in Python. Here are a few:

  • Integrated Development Environments (IDEs): Popular IDEs such as PyCharm, Visual Studio Code, and Spyder have built-in features to help you identify and correct indentation errors.

  • Text Editors: Text editors such as Sublime Text and Notepad++ can help you identify and correct indentation errors as well. However, you may need to install third-party plugins or packages to get this functionality.

  • Linters: Linters are tools that can check your code for syntax and formatting errors. Some popular Python linters include Flake8, Pylint, and Pyflakes. These tools can help you identify indentation errors as well as other common mistakes.

  • Online Tools: there are also several websites that will check Python code for indentation and other syntax problems, typically online front-ends to linters such as Flake8 or pycodestyle; pasting a snippet into one of these is a quick way to locate an inconsistent indent.

Remember that in Python, indentation is important and errors can lead to unexpected results or even cause your code to fail. It’s important to always pay attention to indentation when writing Python code.