Contents
Literature Seurat Processing Steps
① HDS analysis steps
② Original Literature Data Processing
Loading single-cell Seurat
Converting the proportion of each cell per sample
Comparison of two subgroup ratios
① Proportional Calculation
② Drawing
Differences between subgroups in proportions of single cells
① Add group information
② Cyclic graphing
Literature Seurat Processing Steps
Histone deacetylase-mediated tumor microenvironment characterization and synergistic immunotherapy in gastric cancer (Multi-omics literature study)_acrg Cohort - CSDN Blogs
① HDS analysis steps
Single-cell transcriptome sequencing analysis steps: CellRanger was used to identify cells and construct the Seurat matrix, low-quality cells were filtered according to previous studies, and the remaining data were used for cluster analysis. First, the data were normalized and the 2,000 genes with the highest variance were selected. Principal component analysis (PCA) then reduced the data to 50 principal components, and Harmony was used to remove batch effects between samples. tSNE cluster analysis identified a total of 9 clusters (B cells, CD4+ T cells, CD8+ T cells, NK cells, mast cells, endothelial cells, fibroblasts, myeloid cells, and plasma cells). Then, gene expression matrices were extracted for each cell cluster to identify subpopulations.
The 2,000 most variable genes were used for principal component analysis, and the top 25-30 principal components were used for Harmony batch correction. The Wilcoxon rank-sum test was used to identify differentially expressed genes between subpopulations. Finally, ligand and receptor information from the CellChat library was used to analyze communication between the cell subpopulations.
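The variable-gene selection and PCA steps described above can be sketched in base R. This is only a toy illustration on simulated data, not the pipeline used in the paper: the matrix, the gene and cell names, and the reduced sizes (50 genes and 10 components instead of 2,000 and 50) are assumptions made so the example runs quickly. Harmony batch correction and CellChat are not covered here.

```r
# Toy expression matrix (hypothetical): 200 genes x 60 cells.
set.seed(1)
n_genes <- 200
n_cells <- 60
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))
# Give one gene extra variability by shifting it in half of the cells.
expr["gene1", 31:60] <- expr["gene1", 31:60] + 3

# Step 1: keep the genes with the highest variance across cells.
gene_var <- apply(expr, 1, var)
n_top <- 50
top_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_top)]

# Step 2: PCA on the selected genes (cells as rows), keeping the first components.
pca <- prcomp(t(expr[top_genes, ]), center = TRUE, scale. = TRUE)
n_pcs <- 10
pcs <- pca$x[, seq_len(n_pcs)]
dim(pcs)
```

In a Seurat workflow the same two steps are normally done on the Seurat object itself; the base R version is shown only to make the logic visible.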
③ Seurat process for single-cell learning-pbmc_seurat Deletion of discrete cells - CSDN Blogs
② Original Literature Data Processing
Parallel single-cell and bulk transcriptome analyses reveal key features of the gastric tumor microenvironment | Genome Biology | Full Text
Single-cell sequencing data processing
The 10X droplet-based single-cell RNA sequencing data were processed using CellRanger toolkit (version 3.0.0) provided by 10X Genomics. Gene expression levels are quantified using GRCh38 reference genome (Ensembl 93 annotation). For each cell identified by CellRanger, we calculated the total number of detected genes, total number of UMI counts, and proportion of mitochondrial reads. A set of quality thresholds was applied to filter out low-quality cells, including detection of 200–7500 genes, 500–75,000 UMI counts, and less than 10% mitochondrial reads, resulting in a total cell number of 117,506 post-filter cells that were used for clustering analysis.
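The quality thresholds quoted above can be expressed as a simple filter. The sketch below uses a hypothetical table of per-cell metrics (the column names `n_genes`, `n_umi` and `pct_mito` are made up for illustration) and assumes the gene and UMI ranges are inclusive, which the text does not state explicitly.

```r
# Hypothetical per-cell QC metrics (six example cells).
qc <- data.frame(
  cell     = paste0("cell", 1:6),
  n_genes  = c(150, 800, 3000, 8000, 1200, 2500),
  n_umi    = c(400, 2000, 15000, 90000, 450, 9000),
  pct_mito = c(2, 5, 12, 3, 4, 9.5)
)

# Thresholds from the text: 200-7500 genes, 500-75,000 UMIs, < 10% mitochondrial reads.
keep <- qc$n_genes >= 200 & qc$n_genes <= 7500 &
  qc$n_umi >= 500 & qc$n_umi <= 75000 &
  qc$pct_mito < 10

qc_filtered <- qc[keep, ]
qc_filtered$cell
```

On a Seurat object the same conditions would usually be passed to `subset()` on the corresponding metadata columns.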
Normalization and batch effect correction
Reference: Seurat-SCTransform and harmony integration learning - CSDN Blog
Cells passing quality filter were normalized with SCTransform [84] using the default parameters. Independent component analysis (ICA) was applied on the normalized gene-cell matrix to identify potential batch effects. Out of 128 independent components, an independent component (IC_15) was found to have a highly sample-specific distribution (Additional file 1: Fig. S1b). We further inspected the top weighted genes in this independent component and found this IC populated by a heat-shock protein-related program (Additional file 1: Fig. S1c), potentially derived from enzymatic stimulation during tissue dissociation [85]. The gene expression program driven by IC_15 was then subtracted from the normalized gene-cell matrix to remove this dissociation-derived batch effect.
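The idea of subtracting the program driven by one component from the normalized matrix can be illustrated with a small linear-algebra sketch. Note the assumptions: the paper used ICA, whereas this example uses base R's `svd()` as a stand-in decomposition, and the matrix and the component index `k` are hypothetical. Only the subtraction step is the point here.

```r
# Toy normalized matrix (hypothetical): 20 genes x 8 cells.
set.seed(2)
X <- matrix(rnorm(20 * 8), nrow = 20)

# Stand-in decomposition (SVD, not ICA): X = U diag(d) t(V).
dec <- svd(X)

# Hypothetical index of the component judged to be a batch/dissociation effect.
k <- 3

# Rank-1 expression program driven by component k.
component_k <- dec$d[k] * outer(dec$u[, k], dec$v[, k])

# Remove that program from the matrix.
X_corrected <- X - component_k

# The corrected matrix has no remaining signal along component k.
residual <- max(abs(crossprod(dec$u[, k], X_corrected)))
residual < 1e-8
```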
Loading single-cell Seurat
# Load the single-cell Seurat data for analysis
rm(list = ls())
library(dplyr)
library(readr)
library(BiocParallel)
library(Seurat)
library(sctransform) # load packages
library()

load("") # read the saved data (the code below expects a Seurat object named gcdata)
gcmeta <- read_csv("cell_metadata.csv") # cell metadata, used to look up cell types and tumor samples
HDS <- read.csv("2024surname Nian7moon19Day ICI Calculator.csv", sep = ",", header = TRUE) # HDS scores from the transcriptome
table(gcmeta$Type) # view the annotated cell types
table(gcmeta$Patient, gcmeta$Tissue) # view cell counts per patient in tumor and adjacent (paracancer) tissue

# Extract the single-cell gcdata for the relevant tumor samples
n_last <- 7
HDS$sample <- substr(HDS$X, nchar(HDS$X) - n_last + 1,
                     nchar(HDS$X)) # sample ID = last 7 characters of HDS$X
samp <- intersect(unique(gcmeta$Sample), HDS$sample) # 21 tumor samples with paired single-cell data
gcdata2 <- subset(gcdata, subset = Sample %in% samp) # extract the single-cell matrix of the 21 tumor samples
table(gcdata2$Sample) # cells per sample; these 21 tumor samples are used for downstream analysis
table(gcdata2$Sample, gcdata2$Type) # cell-type counts per sample
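The sample-matching step (take the last 7 characters of each label, then intersect with the single-cell sample IDs) can be checked on a small example. The labels and scores below are hypothetical; only `171012T` appears in the original output.

```r
# Hypothetical score table: each label ends in a 7-character sample ID.
HDS_toy <- data.frame(X = c("GC_171012T", "GC_171019T", "GC_171101T"),
                      score = c(0.2, 0.8, 0.5))

n_last <- 7
HDS_toy$sample <- substr(HDS_toy$X,
                         nchar(HDS_toy$X) - n_last + 1,
                         nchar(HDS_toy$X))

# Hypothetical sample IDs present in the single-cell metadata.
sc_samples <- c("171012T", "171101T", "171205T")

# Samples that have both a score and single-cell data.
samp_toy <- intersect(sc_samples, HDS_toy$sample)
samp_toy
```

`intersect()` keeps the order of its first argument, so the result follows the order of the single-cell sample IDs.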
Converting the proportion of each cell per sample
# Convert to the proportion of each cell type within each sample, then draw the overall stacked bar chart
Cellratio <- prop.table(table(gcdata2$Type, gcdata2$Sample),
                        margin = 2) # margin = 2: proportions are calculated by column, i.e. within each sample
Cellratio <- as.data.frame(Cellratio) # long data frame of proportions for the stacked bar chart
library(ggplot2) # draw the stacked bar chart of cell proportions
colourCount = length(unique(Cellratio$Var1))
p1 <- ggplot(Cellratio) +
  geom_bar(aes(x = Var2, y = Freq, fill = Var1),
           stat = "identity", width = 0.7, size = 0.5, colour = '#222222') +
  theme_classic() +
  labs(x = 'Sample', y = 'Ratio') +
  # coord_flip() + # uncomment to flip the axes
  theme( = element_rect(fill=NA,color="black",
                        size=0.5, linetype="solid"))
p1
dev.off()
head(Cellratio)
         Var1    Var2         Freq
1           B 171012T 0.3945686901
2      CD4+ T 171012T 0.1560170394
3      CD8+ T 171012T 0.1256656017
4 Endothelial 171012T 0.0766773163
5  Epithelial 171012T 0.0005324814
6  Fibroblast 171012T 0.0431309904
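To see what `margin = 2` does, here is a small base R sketch with hypothetical cell-type and sample labels. Each column (sample) of the resulting table sums to 1, which is why the stacked bars all reach the same height.

```r
# Hypothetical per-cell annotations: 8 cells from 2 samples.
cell_type <- c("B", "B", "T", "T", "T", "B", "T", "T")
sample_id <- c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2")

counts <- table(cell_type, sample_id)

# margin = 2 divides each count by its column total (the sample total).
ratio <- prop.table(counts, margin = 2)
colSums(ratio)

# Long format, one row per cell type and sample, as used for plotting.
ratio_long <- as.data.frame(ratio)
ratio_long
```

Here the columns of `ratio_long` are named `cell_type`, `sample_id` and `Freq` because named vectors were passed to `table()`. In the tutorial code the columns are `Var1` and `Var2` because expressions such as `gcdata2$Type` were passed instead.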
Comparison of two subgroup ratios
① Proportional Calculation
② Single Cell Learning - Intergroup and Sample Cell Proportion Analysis_Differences in Cell Percentage Between Single Cell Groups - CSDN Blog
②-II Single Cell Learning - Intergroup and Sample Cell Ratio Analysis (supplement)_Calculating the number of a particular cell in single-cell data - CSDN Blog
library(tidyverse)
library(reshape)
clusdata <- as.data.frame(table(gcdata2$Type, gcdata2$Sample))
# Convert from long to wide format: one row per cell type, one column per sample
clusdata1 <- clusdata %>% pivot_wider(names_from = Var2,
                                      values_from = Freq)
clusdata1 <- as.data.frame(clusdata1)
rownames(clusdata1) <- clusdata1$Var1
clusdata2 <- clusdata1[, -1]

# Sum the counts of each cell type separately for each group
HDS1 <- HDS[order(HDS$),]
HDS2 <- HDS1[HDS1$sample %in% samp, ]
rownames(HDS2) <- HDS2$sample

low <- c(rownames(HDS2)[1:10]) # samples in the low-score group
clusdata2$lowsum <- rowSums(clusdata2[, low])
high <- c(rownames(HDS2)[11:21]) # samples in the high-score group
clusdata2$highsum <- rowSums(clusdata2[, high]) # these sums are then plotted as a stacked bar chart

clus2 <- clusdata2[, c(22, 23)] # columns 22 and 23 are lowsum and highsum (after the 21 sample columns)
clus2$ID <- rownames(clus2)
clus3 <- melt(clus2, = c("ID")) # reshape to long format, keyed by cell type (ID)
② Drawing
p <- ggplot(data = clus3,
            aes(x = ID, y = value, fill = variable)) +
  # geom_bar(stat = "identity", position = "stack") + # show the raw counts
  geom_bar(stat = "identity", position = "fill") + # show proportions: each bar is scaled to a total of 1
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)), # label the y axis as percentages
                     labels = scales::percent_format()) +
  scale_fill_manual(values = c("lowsum" = "#a56cc1", "highsum" = "#769fcd"), # alternative colors: "lowsum"="#98d09d","highsum"="#e77381"
                    limits = c("lowsum", "highsum")) + # limits sets the order of the legend entries
  theme( = element_blank(), # theme settings
        axis.line = element_line(),
         = "top") + # or "bottom"
  labs(title = "single cell", x = NULL, y = "percent") + # title and axis labels
  guides(fill = guide_legend(title = NULL, nrow = 1, byrow = FALSE))
p
dev.off()
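The group sums and the scaling performed by `position = "fill"` can be reproduced by hand. The sketch below uses a hypothetical count matrix with four samples and made-up scores. It splits the samples into a low and a high half by score, sums the counts per cell type, reshapes to long format with base R, and computes the share that each group contributes to a bar.

```r
# Hypothetical cell counts: 2 cell types x 4 samples.
counts <- matrix(c(10,  5, 20, 15,
                   30, 25,  5, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("B", "Myeloid"), c("S1", "S2", "S3", "S4")))

# Hypothetical scores; samples are ordered from low to high score.
score <- c(S1 = 0.9, S2 = 0.1, S3 = 0.4, S4 = 0.7)
ordered_samples <- names(sort(score))
low  <- ordered_samples[1:2]   # lower half of the scores
high <- ordered_samples[3:4]   # upper half of the scores

# Total cells of each type in each group.
sums <- data.frame(ID = rownames(counts),
                   lowsum = rowSums(counts[, low]),
                   highsum = rowSums(counts[, high]))

# Long format: one row per cell type and group.
long <- data.frame(ID = rep(sums$ID, times = 2),
                   variable = rep(c("lowsum", "highsum"), each = nrow(sums)),
                   value = c(sums$lowsum, sums$highsum))

# What position = "fill" displays: each value divided by the total for its cell type.
long$share <- long$value / ave(long$value, long$ID, FUN = sum)
long
```

Because the two groups can contain different numbers of samples and cells, these shares reflect group size as well as composition, which is worth keeping in mind when reading the chart.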
Differences between subgroups in proportions of single cells
① Add group information
# Add the HDS group information
HDS2$group <- c(1:nrow(HDS2))
HDS2$group1 <- ifelse(HDS2$group > 10, "high", "low")
HDS3 <- HDS2[rownames(cellper), ] # reorder to match the rows of cellper
identical(HDS3$sample, rownames(cellper)) # [1] TRUE: check that the sample order matches
cellper$sample <- HDS3$sample
cellper$group <- HDS3$group1
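The object `cellper` is used above but its construction is not shown in this post. A plausible construction, assuming `cellper` is a table of cell-type proportions with one row per sample and one column per cell type, is sketched below on hypothetical data. The names `cellper_toy` and `groups` are illustrative only.

```r
# Hypothetical per-cell annotations: 8 cells from 2 samples.
cell_type <- c("B", "B", "T", "T", "T", "B", "T", "T")
sample_id <- c("S1", "S1", "S1", "S1", "S2", "S2", "S2", "S2")

# Proportion of each cell type within each sample, transposed so samples are rows.
ratio <- prop.table(table(cell_type, sample_id), margin = 2)
cellper_toy <- as.data.frame(unclass(t(ratio)))

# Hypothetical group table, deliberately in a different order from cellper_toy.
groups <- data.frame(sample = c("S2", "S1"),
                     group1 = c("high", "low"),
                     row.names = c("S2", "S1"))

# Reorder by sample ID and verify the alignment before attaching the labels.
groups <- groups[rownames(cellper_toy), ]
stopifnot(identical(groups$sample, rownames(cellper_toy)))
cellper_toy$sample <- groups$sample
cellper_toy$group <- groups$group1
cellper_toy
```

Indexing by row name and then checking with `identical()` guards against silently pairing a sample with the wrong group.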
② Cyclic graphing
### Plotting ###
pplist = list() # empty list to collect one plot per cell type
library(ggplot2)
library(dplyr)
library(ggpubr)
library(cowplot)
sce_groups = c(colnames(cellper)[1:12]) # the 12 cell types
for(group_ in sce_groups){
  cellper_ = cellper %>% select(one_of(c('sample','group',group_))) # select the data for one cell type
  colnames(cellper_) = c('sample','group','percent') # rename the selected columns
  cellper_$percent = as.numeric(cellper_$percent) # make sure the values are numeric
  cellper_ <- cellper_ %>% group_by(group) %>% mutate(upper = quantile(percent, 0.75),
                                                      lower = quantile(percent, 0.25),
                                                      mean = mean(percent),
                                                      median = median(percent)) # upper and lower quartiles, mean and median per group
  print(group_)
  print(cellper_$median)

  pp1 = ggplot(cellper_, aes(x = group, y = percent)) + # ggplot plotting
    geom_jitter(shape = 21, aes(fill = group), width = 0.25) +
    stat_summary(fun = mean, geom = "point", color = "grey60") + # stat_summary adds the group mean
    theme_cowplot() +
    theme( = element_text(size = 10), = element_text(size = 10), = element_text(size = 10),
           = element_text(size = 10), = element_text(size = 10,face = 'plain'), = 'none') +
    labs(title = group_, y = 'Percentage') +
    geom_errorbar(aes(ymin = lower, ymax = upper), col = "grey60", width = 1)

  ### Between-group t-test analysis
  labely = max(cellper_$percent)
  compare_means(percent ~ group, data = cellper_)
  my_comparisons <- list( c("low", "high") )
  pp1 = pp1 + stat_compare_means(comparisons = my_comparisons, size = 3, method = "")
  pplist[[group_]] = pp1
}

# Draw all plots in one grid; the names are the cell types in colnames(cellper)[1:12]
plot_grid(pplist[['B']],
          pplist[['CD4+ T']],
          pplist[['CD8+ T']],
          pplist[['Endothelial']],
          pplist[['Epithelial']],
          pplist[['Fibroblast']],
          pplist[['Glial']],
          pplist[['Innate lymphoid']],
          pplist[['Mast']],
          pplist[['Mural']],
          pplist[['Myeloid']],
          pplist[['Plasma']],
          # nrow = 5, # number of rows
          ncol = 4) # number of columns
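The per-group summaries and the between-group comparison inside the loop can also be computed directly in base R, which is a convenient way to check the numbers behind a plot. The data below are hypothetical proportions for a single cell type. Since the `method` argument is empty in the code above, both a t-test and a Wilcoxon rank-sum test are shown; which one matches the original analysis is not stated here.

```r
# Hypothetical proportions of one cell type in 5 low and 5 high samples.
toy <- data.frame(
  group = rep(c("low", "high"), each = 5),
  percent = c(0.10, 0.12, 0.08, 0.11, 0.09,
              0.20, 0.25, 0.18, 0.22, 0.21)
)

# Quartiles, median and mean per group (the values used for the error bars and points).
stats <- sapply(split(toy$percent, toy$group), function(x) {
  c(lower = unname(quantile(x, 0.25)),
    median = median(x),
    upper = unname(quantile(x, 0.75)),
    mean = mean(x))
})
stats

# Two common choices for comparing the groups.
t_res <- t.test(percent ~ group, data = toy)
w_res <- wilcox.test(percent ~ group, data = toy)
c(t_test = t_res$p.value, wilcoxon = w_res$p.value)
```

With only about ten samples per group, and twelve cell types tested, small p-values should be read with caution. Adjusting for multiple comparisons, for example with `p.adjust()`, is one option.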
Original Literature (the scoring calculation differs somewhat from the paper, so the results here differ somewhat as well)
Literature:
Histone deacetylase-mediated tumor microenvironment characteristics and synergistic immunotherapy in gastric cancer
Parallel single-cell and bulk transcriptome analyses reveal key features of the gastric tumor microenvironment | Genome Biology | Full Text