Author, Wind Chaser
A tissue is an ordered whole in which cells do not exist in isolation but interact frequently: sender cells express ligands, the ligands act on receptors of recipient cells, and the recipient cells change biologically in response. These frequent interactions are what make the cells an ordered whole, and external stimuli such as disease alter both this ordered state and the communication between cells. Many methods have been developed to analyze cellular interactions; among them, CellPhoneDB [1], CellChat [2], and NicheNet [3] are the most established and have greatly facilitated the study of cellular communication.
Introduction to Cellular Communication
Multicellular organisms are "societies" of different types of cells, and they are open "societies" in which each cell must coordinate its behavior with the others. For this purpose, cells need to establish communication links. The growth and development of organisms, differentiation, the formation of tissues and organs, the maintenance of tissue organization, and the coordination of physiological activities all require highly precise and efficient mechanisms of intercellular communication.
Intercellular communication is an extremely complex process. It usually refers to the transmission of information from one cell through a medium to another cell, which then produces a corresponding response. There are two basic concepts in cellular communication: cell signaling and signal transduction. The former emphasizes the generation of signals and their transmission between cells, while the latter concerns the pathway by which a signal is converted after it is received and the outcome of that conversion. Cells have three modes of communication: the first is through chemical signaling molecules, which is the most commonly employed mode in animals and plants; the second is through contact between adhesion molecules on the surfaces of neighboring cells; the third is through adhesion of cells to the extracellular matrix.
The basic process of cellular communication:
① Synthesis of signaling molecules: most cells can synthesize signaling molecules, and endocrine cells are a major source.
② Release of signaling molecules from the signal-generating cell into the surrounding environment: this is a rather complex process, especially for protein-based signaling molecules, which must be synthesized, processed, sorted, and secreted by the endomembrane system before they are finally released from the cell.
③ Transport of signaling molecules to target cells: there are various routes. Hormones are mainly carried to target cells by the blood circulation, while signaling molecules in dense tissues are released into the surrounding environment and act on nearby cells.
④ Recognition and detection of signaling molecules by target cells: mainly through selective recognition and binding by receptor proteins located in the cell membrane or inside the cell.
⑤ Transmembrane transduction: cells convert extracellular signals into intracellular signals.
⑥ Intracellular signals act on effector molecules in a cascade of stepwise amplification, causing a series of changes in cellular metabolism, growth, gene expression, and so on.
After the cell completes its signaling response, the signal is switched off to terminate the response, mainly by modifying, hydrolyzing, or binding the signaling molecules so that their concentration falls. The schematic diagram is as follows:
Fig. 1 Schematic diagram of cellular communication
Communication between cells has therefore become a very important part of single-cell data analysis.
The most commonly used method for single-cell communication analysis: CellPhoneDB [1]
CellPhoneDB [1], a cellular communication analysis method published by Efremova M et al. in 2020, is the most common means of analyzing cellular communication in single-cell data and is cited far more often than other methods. It has been updated to version 3.0, which supports the analysis of communication within spatial microenvironments (niches) from spatial transcriptome data; that analysis is covered in the section on spatial transcriptomes.
Comprehensive database
CellPhoneDB [1] ships with a detailed ligand-receptor database that integrates earlier public databases and adds manual curation to obtain more accurate annotations. Ligands and receptors made of multiple subunits are annotated as complexes. The figure below shows how many secreted proteins, membrane proteins, protein complexes, and receptor-ligand relationships the CellPhoneDB [1] database contains, and which databases they come from. The only drawback is that the database covers human genes only, so gene symbols from mouse or other species must be converted to their human counterparts before analysis.
Fig. 2 Schematic diagram of CellPhoneDB database
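The gene conversion step mentioned above can be sketched in a few lines. This is a minimal illustration, not part of CellPhoneDB: the ortholog table `MOUSE_TO_HUMAN`, the function `convert_to_human`, and the choice to sum counts when several mouse genes map to one human gene are all assumptions made for the example.

```python
# Hypothetical mouse -> human ortholog table, for illustration only.
# A real analysis would load a curated orthology resource instead.
MOUSE_TO_HUMAN = {"Cd74": "CD74", "Mif": "MIF", "Cxcl12": "CXCL12"}


def convert_to_human(counts_by_gene, mapping):
    """Re-key {mouse_gene: [per-cell counts]} to human gene symbols."""
    converted = {}
    for mouse_gene, counts in counts_by_gene.items():
        human_gene = mapping.get(mouse_gene)
        if human_gene is None:
            continue  # no known ortholog: the gene is dropped
        if human_gene in converted:
            # Several mouse genes map to one human gene: sum per cell (assumed policy).
            converted[human_gene] = [
                a + b for a, b in zip(converted[human_gene], counts)
            ]
        else:
            converted[human_gene] = list(counts)
    return converted


mouse_counts = {"Cd74": [3, 0, 1], "Mif": [0, 2, 2], "Gm1234": [5, 5, 5]}
human_counts = convert_to_human(mouse_counts, MOUSE_TO_HUMAN)
print(human_counts)
```

Genes without an ortholog silently disappear in this sketch, so it is worth checking how many ligand and receptor genes survive the conversion before interpreting the results.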
How CellPhoneDB [1] infers cellular communication
CellPhoneDB [1] needs two inputs to analyze cellular communication: an expression matrix and cell annotations. For a given ligand-receptor pair, it calculates the mean expression of the ligand in cluster A and the mean expression of the receptor in cluster B; the average of these two values is MEAN. The cell labels are then randomly shuffled, the ligand mean in "cluster A" and the receptor mean in "cluster B" are recalculated from the new labels, and a new MEAN is obtained. Repeating this many times yields a distribution of means, i.e., the null distribution. Note that the direction described here is cluster A→cluster B; cluster B→cluster A, in which cluster B expresses the ligand and cluster A expresses the receptor, is a separate test. The p-value is the proportion of the null distribution that lies at or beyond the observed MEAN (the definition of a p-value).
So when CellPhoneDB reports a significantly enriched receptor-ligand relationship between two cell types, the inference still rests essentially on receptor expression in one cell type and ligand expression in the other. In addition, if a relationship is ubiquitous (present between all cell types), it cannot be identified as significant. The product of the mean ligand and receptor expression in the two cell types is taken as the strength of communication between the cells.
Figure 3 CellPhoneDB analyzes cellular communication principles
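The label-shuffling test described above can be sketched as follows. This is a toy illustration under stated assumptions, not CellPhoneDB's actual code: the function names, the six-cell dataset, and the number of permutations are invented for the example.

```python
import random
from statistics import mean


def pair_mean(ligand_expr, receptor_expr, labels, sender, receiver):
    """Average of the ligand mean in `sender` and the receptor mean in `receiver`."""
    lig = [x for x, lab in zip(ligand_expr, labels) if lab == sender]
    rec = [x for x, lab in zip(receptor_expr, labels) if lab == receiver]
    return (mean(lig) + mean(rec)) / 2


def permutation_p_value(ligand_expr, receptor_expr, labels, sender, receiver,
                        n_perm=1000, seed=0):
    """Return the observed MEAN and the fraction of shuffles at least as large."""
    rng = random.Random(seed)
    observed = pair_mean(ligand_expr, receptor_expr, labels, sender, receiver)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffling keeps the cluster sizes unchanged
        null_mean = pair_mean(ligand_expr, receptor_expr, shuffled, sender, receiver)
        if null_mean >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy data: the ligand is high in cluster A, the receptor is high in cluster B.
labels = ["A", "A", "A", "B", "B", "B"]
ligand = [5, 6, 7, 0, 0, 1]
receptor = [0, 1, 0, 4, 5, 6]

obs_ab, p_ab = permutation_p_value(ligand, receptor, labels, "A", "B")
obs_ba, p_ba = permutation_p_value(ligand, receptor, labels, "B", "A")
print("A->B", obs_ab, p_ab)
print("B->A", obs_ba, p_ba)
```

The two directions are tested separately, which is why A→B and B→A can give very different p-values for the same pair.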
Considerations for CellPhoneDB [1] analysis
(1) When there are too many cells, the data are down-sampled and only 1/3 of the cells are analyzed.
(2) For a complex with multiple subunits, the expression of the subunit with the lowest expression is used.
(3) A ligand or receptor is included in the analysis only if the percentage of cells expressing it reaches a threshold, which is 10% by default.
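The three considerations above can be written down as small helper functions. This is a sketch under assumptions: the function names and the toy numbers are hypothetical, and only the 1/3 fraction, the lowest-subunit rule, and the 10% default come from the text.

```python
import random


def subsample(cells, fraction=1 / 3, seed=0):
    """Keep a random fraction of the cells (1/3 in the text above)."""
    rng = random.Random(seed)
    k = max(1, round(len(cells) * fraction))
    return rng.sample(cells, k)


def complex_expression(subunit_means):
    """A multi-subunit complex is represented by its lowest-expressed subunit."""
    return min(subunit_means)


def passes_threshold(expr_in_cluster, threshold=0.10):
    """True if at least `threshold` of the cells in the cluster express the gene."""
    expressing = sum(1 for x in expr_in_cluster if x > 0)
    return expressing / len(expr_in_cluster) >= threshold


kept = subsample(list(range(300)))
print(len(kept))                            # number of cells kept
print(complex_expression([2.0, 0.4, 1.1]))  # limiting subunit
print(passes_threshold([0] * 19 + [3]))     # expressed in 5% of cells
print(passes_threshold([0] * 8 + [1, 2]))   # expressed in 20% of cells
```

Taking the minimum means that a complex is only considered expressed when every subunit is expressed.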
Visual presentation of CellPhoneDB [1] results
CellPhoneDB has built-in visualization methods; examples from the literature are shown below:
But there are two problems with this presentation. 1) The heatmap on the right shows the number of interactions between each pair of cell types, but it is symmetrical about the diagonal, meaning that the number of interactions is the same for A-B as for B-A, which is clearly unreasonable. 2) The bubble plot on the left shows specific receptor-ligand pairs across cell-type pairs; dot size indicates significance, and color indicates the mean of the average expression of interacting molecule 1 in cluster 1 and interacting molecule 2 in cluster 2, without stating which molecule is the receptor and which is the ligand.
Both problems stem from the built-in ligand-receptor interaction pairs in CellPhoneDB [1]. The default presentation does not distinguish receptor from ligand: for a pair gene1-gene2, gene1 may be the ligand and gene2 the receptor, or the other way round, so directionality is ignored. Specific analyses should therefore be carried out with directionality in mind.
The most "beautiful" visualization for single-cell communication: CellChat [2]
The impact of outliers and the methodological improvements in CellChat [2]
When analyzing cellular communication, the estimated communication strength is susceptible to outliers. Because single-cell data are sparse, simply multiplying mean values is not robust: a large proportion of zeros and a few abnormally high expression values can significantly distort the inferred ligand-receptor communication strength. To address this, CellChat [2] summarizes expression with a quartile-based average (the trimean), which is calculated as follows:
$$\mathrm{EM} = \tfrac{1}{2} Q_2 + \tfrac{1}{4}\,(Q_1 + Q_3)$$
where Q1, Q2, and Q3 are the first, second, and third quartiles of the expression levels of a signaling gene in a cell group. This approach partly counteracts the effect of outliers.
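To see why a quartile-based average is more robust than the mean, here is a minimal sketch using the trimean, 1/2·Q2 + 1/4·(Q1 + Q3). The function name `trimean` and the toy expression values are assumptions for illustration; the quartiles come from `statistics.quantiles` with the inclusive method, which may differ slightly from the quantile definition CellChat uses internally.

```python
from statistics import mean, quantiles


def trimean(expr):
    """Weighted average of the quartiles: 1/2 * Q2 + 1/4 * (Q1 + Q3)."""
    q1, q2, q3 = quantiles(expr, n=4, method="inclusive")
    return 0.5 * q2 + 0.25 * (q1 + q3)


with_outlier = [1, 2, 3, 4, 100]            # one abnormally high value
sparse = [0, 0, 0, 0, 0, 0, 0, 0, 9, 50]    # expressed in very few cells

print(mean(with_outlier), trimean(with_outlier))
print(mean(sparse), trimean(sparse))
```

In the first case the single extreme value dominates the mean but barely moves the trimean; in the second, a gene expressed in only a few cells gets a trimean of zero even though its mean is well above zero.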
CellChat [2] database features
The authors of CellChat [2] manually curated 2,021 validated molecular interactions to build a new cellular communication reference database, CellChatDB. 1) It accounts not only for multi-subunit receptors but also for other important signaling cofactors: soluble agonists, antagonists, and co-stimulatory and co-inhibitory membrane-bound receptors; 2) 48% of the interactions involve heterodimeric molecular complexes, and 25% were curated from recent literature; 3) each ligand-receptor pair was manually assigned, based on the literature, to one of 229 functionally related signaling pathways; 4) CellChatDB combines signaling molecule interaction information from the KEGG Pathway database with information from recent experimental studies.
Characteristics of the CellChat [2] communication analysis method
(1) Signaling genes that are differentially expressed across all cell groups in a given scRNA-seq dataset are first identified using the Wilcoxon rank-sum test with a significance level of 0.05 (default).
(2) A weighted average of quartiles is used to reduce the effect of noise.
(3) The intercellular communication probability is calculated.
(4) The significance of intercellular communication is assessed with a permutation test, as in CellPhoneDB [1].
Cellular communication is shaped not only by ligands and receptors but also by cofactors that mediate or modulate signaling. CellChat [2] covers this part of the analysis by modeling intercellular communication from the combined interactions of ligands, receptors, and their cofactors. The following figure shows the schematic diagram of the communication analysis of CellChat [2]:
Figure 5 CellChat communication analysis
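One way to picture how ligand, receptor, and cofactor expression can be combined into a single communication strength is a Hill-type saturation function. The sketch below is a simplified illustration in that spirit and should not be read as CellChat's exact formula: the function names, the half-saturation constant `kh = 0.5`, and the way the agonist and antagonist terms enter are assumptions, and any cell-number weighting is left out.

```python
def hill(x, kh=0.5):
    """Saturating response: 0 when x is 0, approaching 1 as x grows."""
    return x / (kh + x)


def communication_strength(ligand, receptor, agonist=0.0, antagonist=0.0, kh=0.5):
    """Combine average expression values into one strength value (simplified)."""
    base = hill(ligand * receptor, kh)     # ligand-receptor term
    boost = 1.0 + hill(agonist, kh)        # agonists increase the strength
    damp = kh / (kh + antagonist)          # antagonists decrease it
    return base * boost * damp


print(communication_strength(1.0, 1.0))
print(communication_strength(1.0, 1.0, agonist=0.5))
print(communication_strength(1.0, 1.0, antagonist=0.5))
print(communication_strength(0.0, 1.0))
```

Because the ligand and receptor enter as a product, the strength is zero whenever either one is not expressed, while the cofactor terms only scale an interaction that already exists.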
Powerful visualization capabilities of CellChat [2]
CellChat [2] improves on CellPhoneDB [1] by taking the directionality of communication into account, so its heatmaps and chord diagrams are drawn with direction, as shown in the following figure:
Fig. 6 Directed chord diagram of cellular communication
The same cell type can both send signals (as a sender) and receive signals (as a receiver). In its analysis of communication networks, CellChat [2] uses the out-degree (the sum of the probabilities of the signals a cell group sends as a sender) and the in-degree (the sum of the probabilities of the signals a cell group receives as a receiver) to infer the strength of different cell groups as senders and receivers during cellular communication. The figure below shows a scatter plot of signal intensity from the communication analysis: the size of each point is proportional to the number of ligands and receptors inferred for that cell group, and the x-axis and y-axis indicate the strength of the cell groups as signal senders and receivers, respectively.
Fig. 7 Scatter plot of CellChat cellular communication intensity
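The out-degree and in-degree described above are simply row sums and column sums of a directed communication matrix. The sketch below uses a hypothetical 3 × 3 probability matrix; the cell group names and values are invented for illustration.

```python
groups = ["Fibroblast", "T cell", "Macrophage"]

# prob[i][j]: communication probability from sender groups[i] to receiver groups[j].
# The matrix is directed, so prob[i][j] need not equal prob[j][i].
prob = [
    [0.00, 0.30, 0.10],
    [0.05, 0.00, 0.20],
    [0.40, 0.10, 0.00],
]

# Out-degree: total probability a group sends (row sum).
out_degree = {g: sum(prob[i]) for i, g in enumerate(groups)}
# In-degree: total probability a group receives (column sum).
in_degree = {g: sum(row[j] for row in prob) for j, g in enumerate(groups)}

for g in groups:
    print(g, round(out_degree[g], 2), round(in_degree[g], 2))
```

Plotting out-degree against in-degree for each group gives the kind of sender/receiver scatter plot described above.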
CellChat [2] can use pattern recognition to predict coordinated responses between cells. The output of this analysis is a set of so-called communication patterns that connect cell groups to signaling pathways. In the figure below, "cell groups" and "signaling" denote cell groups and signaling pathways, respectively, and the thickness of each flow indicates the contribution of a cell group or signaling pathway to each communication pattern. Panel e shows how cells, as signal senders, coordinate with each other in outgoing patterns and with which signaling pathways they coordinate to drive communication; panel f shows, for cells as signal receivers in incoming patterns, where their signals come from and which types of signals they mainly receive.
Fig. 8 Cellular communication pattern recognition
Of course, CellChat's visualization capabilities go far beyond this: violin plots, bubble plots, chord diagrams, and heatmaps are left for you to explore.
CellChat [2] multi-sample analysis strategy
When samples from different conditions are available, differences in communication are often the focus. CellChat's multi-sample (multi-condition) strategy is to run the communication analysis on each sample separately and then compare the communication between cell types across conditions, an approach that has been used in many published studies. The idea is similar to the analysis strategy of NicheNet.
An important common feature of CellPhoneDB [1] and CellChat [2] analyses
When running communication inference on datasets, it was found that both CellChat [2] and CellPhoneDB [1] consistently capture stronger interactions between spatially neighboring cells than between distant cells, in terms of both the number of ligand-receptor pairs and the communication probabilities. This underlies the study of spatial-proximity communication, which is described in the section on spatial transcriptomes.
A single-cell communication method that considers downstream gene changes: NicheNet [3]
As mentioned at the beginning and shown in Fig. 1, the end result of cellular communication is intracellular change in the recipient cells, which regulate their own gene expression in response to the signal. However, both CellPhoneDB [1] and CellChat [2] consider only the expression of ligands and receptors and do not reflect the response of downstream target genes, so whether an inferred interaction actually functions as communication between cells is a question worth pondering.
NicheNet [3] is an R package that predicts ligand-receptor interactions between interacting cells by combining their expression data with prior knowledge of signaling and gene regulatory networks. By applying NicheNet [3] to data from the tumor and immune cell microenvironment, it is possible to infer active ligands and their regulatory effects on the genes of interacting cells.
NicheNet [3] takes human or mouse gene expression data from interacting cells as input and combines it with a prior model built by integrating existing knowledge of signaling pathways. Unlike existing approaches, NicheNet [3] models more than ligand-receptor interactions and also incorporates intracellular signaling. Thus, NicheNet [3] can predict which ligands affect expression in another cell, which target genes are affected by each ligand, and which signaling mediators may be involved.
Figure 9 Schematic diagram of NicheNet communication network
The figure above summarizes the main advantage of the software: it analyzes ligands, receptors, and target genes together.
NicheNet [3] characteristics
(1) Species analyzed: human and mouse (human or mouse gene expression data).
(2) The prior model of NicheNet [3] goes beyond ligand-receptor interactions to also include intracellular signaling (broader prior knowledge).
(3) NicheNet [3] can predict which ligands affect expression in another cell, which target genes are affected by each ligand, and which signaling mediators may be involved (linking ligands to target genes).
NicheNet [3] prior model
NicheNet's prior model expresses how strongly existing knowledge supports that a ligand may regulate the expression of a target gene. To calculate this ligand-target regulatory potential, comprehensive biological knowledge about ligand-to-target signaling pathways was integrated as follows:
- Multiple complementary data sources covering ligand-receptor, signaling, and gene regulatory interactions were collected (cpdb, evex_expression, evex_signaling, kegg, and 19 other sources, complexes included).
- These sources were integrated, based on prior knowledge, into weighted networks that connect ligands, receptors, and target genes. The authors optimized the model extensively to improve the accuracy of this network.
- The regulatory potential score between all ligand and target gene pairs was calculated. To do this, a network propagation approach on the integrated network was used to propagate signals from ligands through receptors, signaling proteins, and transcriptional regulators to end at target genes, with the model architecture shown below:
Figure 10 Interaction networks inferred from ligand receptor, signaling, and gene regulation data sources
Features of the NicheNet [3] cellular interaction algorithm
For a given ligand, a signaling importance score is first calculated for each gene by applying personalized PageRank (PPR, an algorithm that ranks the importance of nodes) to the ligand signaling network, with the ligand of interest as the seed node. Second, a cutoff is applied to each ligand's PPR vector so that only genes that are highly "enriched" in the graph neighborhood of the ligand, compared with the complete graph, receive nonzero importance scores. This yields an n × m matrix of ligand-gene signaling importance scores, where n is the number of ligands of interest and m is the total number of genes across all networks.
Multiplying the n × m ligand-gene signaling importance matrix by the m × m weighted adjacency matrix of the integrated gene regulatory network yields an n × m ligand-target matrix L, where n is the number of ligands considered and m is the number of inferred target genes. Each entry l_ij is a regulatory potential score that corresponds to the confidence that ligand i can regulate the expression of target gene j.
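The two steps above (personalized PageRank with a cutoff, then multiplication by the gene regulatory adjacency matrix) can be sketched on a toy network. This is a simplified illustration, not NicheNet's implementation: the six-gene network, the edge weights, the damping factor of 0.5, the fixed cutoff, and all function names are assumptions, and nodes without outgoing edges simply lose their probability mass.

```python
def personalized_pagerank(adj, seed, damping=0.5, n_iter=100):
    """Power iteration for PPR on a weighted adjacency matrix, restarting at `seed`."""
    m = len(adj)
    trans = []
    for row in adj:  # row-normalize weights into transition probabilities
        total = sum(row)
        trans.append([w / total if total else 0.0 for w in row])
    restart = [1.0 if i == seed else 0.0 for i in range(m)]
    scores = restart[:]
    for _ in range(n_iter):
        walked = [sum(scores[i] * trans[i][j] for i in range(m)) for j in range(m)]
        scores = [(1 - damping) * restart[j] + damping * walked[j] for j in range(m)]
    return scores


def apply_cutoff(scores, cutoff):
    """Zero out genes whose importance score falls below the cutoff."""
    return [s if s >= cutoff else 0.0 for s in scores]


def ligand_target_row(importance, grn):
    """One row of L: the importance vector (1 x m) times the GRN adjacency (m x m)."""
    m = len(grn)
    return [sum(importance[k] * grn[k][j] for k in range(m)) for j in range(m)]


genes = ["LIG", "REC", "KIN", "TF", "TGT1", "TGT2"]
idx = {g: i for i, g in enumerate(genes)}
m = len(genes)

signaling = [[0.0] * m for _ in range(m)]   # ligand-signaling network
signaling[idx["LIG"]][idx["REC"]] = 1.0
signaling[idx["REC"]][idx["KIN"]] = 1.0
signaling[idx["KIN"]][idx["TF"]] = 1.0

grn = [[0.0] * m for _ in range(m)]         # gene regulatory network
grn[idx["TF"]][idx["TGT1"]] = 0.9
grn[idx["TF"]][idx["TGT2"]] = 0.2

importance = apply_cutoff(personalized_pagerank(signaling, idx["LIG"]), cutoff=0.05)
ligand_target = [ligand_target_row(importance, grn)]  # n = 1 ligand here

for gene, score in zip(genes, ligand_target[0]):
    print(gene, round(score, 5))
```

In this toy network the signal decays by half at every step from the ligand, so the transcription factor receives a small importance score, and the target with the stronger regulatory edge ends up with the higher regulatory potential.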
To summarize, the best use of NicheNet is to calculate the differentially expressed genes of the same cell type under different conditions, treat those genes as the target genes of the network, and infer highly active ligands from the model in order to identify effective communication between cells. The figure below shows the results of inferred ligand activity.
Figure 11 Analytical prediction of NicheNet ligand activity
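One simple way to turn regulatory potential scores into a ligand ranking is to ask how well each ligand's scores line up with the set of differentially expressed genes, for example with a Pearson correlation. The sketch below assumes that scoring choice for illustration; the gene names, ligand names, and scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
# 1 if the gene is differentially expressed in the receiver cells, else 0.
is_de = [1, 1, 0, 0, 0, 0]

# Hypothetical regulatory potential scores (rows of a ligand-target matrix).
ligand_target = {
    "LIG_A": [0.9, 0.8, 0.1, 0.2, 0.1, 0.0],
    "LIG_B": [0.1, 0.2, 0.7, 0.1, 0.9, 0.3],
}

activity = {lig: pearson(scores, is_de) for lig, scores in ligand_target.items()}
ranking = sorted(activity.items(), key=lambda item: item[1], reverse=True)
for ligand, score in ranking:
    print(ligand, round(score, 3))
```

A ligand whose predicted targets coincide with the differentially expressed genes scores high; a ligand whose predicted targets are mostly unchanged genes scores low or negative.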
Combining NicheNet [3] with CellPhoneDB [1] or CellChat [2]
Since NicheNet [3] can infer highly active ligands, its results can be used to filter the ligand-receptor pairs reported by CellPhoneDB [1] or CellChat [2], which refines the cell communication analysis. This combined approach has been put to good use in a skin cancer study [4].
Fig. 12 NicheNet combined with CellPhoneDB to analyze the effect of cellular communication
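In practice the combination amounts to a filter: keep only the significant ligand-receptor pairs whose ligand is also among the top-ranked active ligands. The sketch below uses hypothetical outputs from both tools; the data structures, the scores, and the `top_n` parameter are assumptions for illustration.

```python
# Hypothetical ligand activity ranking (ligand, activity score).
nichenet_ranking = [("TGFB1", 0.21), ("IL1B", 0.15), ("WNT5A", 0.03)]

# Hypothetical significant pairs: (ligand, receptor, sender, receiver).
significant_pairs = [
    ("TGFB1", "TGFBR2", "Fibroblast", "Tumor"),
    ("WNT5A", "FZD5", "Fibroblast", "Tumor"),
    ("CXCL12", "CXCR4", "Fibroblast", "T cell"),
]


def prioritize(pairs, ranking, top_n=2):
    """Keep pairs whose ligand is among the top_n most active ligands."""
    ordered = sorted(ranking, key=lambda item: item[1], reverse=True)
    top_ligands = {ligand for ligand, _ in ordered[:top_n]}
    return [pair for pair in pairs if pair[0] in top_ligands]


supported = prioritize(significant_pairs, nichenet_ranking)
print(supported)
```

Pairs that survive the filter are supported both by ligand and receptor expression and by the downstream response of target genes.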
Summary of ligand-receptor analysis methods
Software | Features | Species in database |
---|---|---|
CellPhoneDB | Includes a database of ligands, receptors, and their interactions; takes the subunit composition of ligands and receptors into account; currently a highly cited method for analyzing cellular communication | Human |
CellChat | Includes 229 signaling pathways in three categories (Cell-Cell Contact, ECM-Receptor, and Secreted Signaling), with obvious advantages in visualization | Human, Mouse |
iTALK | Annotates ligand-receptor pairs into 4 broad categories (cytokines, growth factors, immune checkpoints, and others); focuses only on communication between tumor and normal cells | Human |
NicheNet | Integrates databases of receptor-ligand relationships, signaling pathways, transcriptional regulation, and other sources; can directly output ligand-receptor-target gene relationships | Human, Mouse |
RNA-Magnet | Considers the physical distance between cells and the communication between cell types | Mouse |
NATMI | Analyzes the specificity of communication and changes in communication strength | Human, Mouse |
ICELLNET | A global, generic, biologically validated, and easy-to-use framework for profiling cellular communication from single or multiple cell-based transcriptomic profiles | Human, Mouse |
scMLnet | Combines cellular communication with transcription factor (TF) analysis | Human, Mouse |
Closing remarks
Cellular communication is a very important part of single-cell analysis and is of great significance for the study of cell development and disease. In cancer samples especially, interactions within the microenvironment are extremely important for studying tumor formation, tumor progression, and immune cell responses. However, it should not be overlooked that the real communication process runs from ligand to receptor to intracellular signaling to transcription factor (TF) to target gene, and that it is carried out mainly by signaling proteins; ligand-receptor analysis ignores much of this signal transduction process.
The three most widely used methods, CellPhoneDB, CellChat, and NicheNet, have each been optimized for single-cell communication analysis and can meet almost all analysis requirements. At the same time, it should be noted that the study of cellular communication is built on cell annotation, so careful annotation beforehand remains a very important foundation.
References
[1] Efremova M, Vento-Tormo M, Teichmann S, Vento-Tormo R. Inferring cell-cell communication from combined expression of multi-subunit receptor-ligand complexes. Nat Protoc. 2020 Apr;15(4):1484-1506.
[2] Suoqin Jin, Christian F. Guerrero-Juarez, et al. Inference and analysis of cell-cell communication using CellChat. Nature Communications. (2021) 12:1088.
[3] Robin Browaeys, Wouter Saelens, Yvan Saeys. NicheNet: modeling intercellular communication by linking ligands to target genes. Nature Methods. 17, pages 159–162 (2020).
[4] AL Ji, AJ Rubin, K Thrane, et al. Cell, Volume 182, Issue 6, 17 September 2020.
Well, it's been shared, life is good, and it's better with you