Quantitative single-cell transcriptome analysis using cell ranger


In any quantitative analysis of RNA-seq data, the reads are first aligned to the reference genome and then counted by quantification software, as in the classic hisat + StringTie strategy. Single-cell transcriptomes work on the same principle, except that UMI tags are introduced. Quantification has to take into account that reads carrying the same UMI come from the same transcript, so it would be inappropriate to use traditional quantification software directly.
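To make the role of the UMI concrete, here is a minimal sketch on made-up data. The cell barcodes, UMIs, and gene names are hypothetical, and this is only an illustration of the idea, not how cell ranger is implemented. Reads that share the same cell barcode, UMI, and gene are treated as copies of one molecule and counted once.

```shell
#!/usr/bin/env bash
# Toy records, one per aligned read: cell_barcode UMI gene (all values are made up).
reads='AAAC TTGCA GeneA
AAAC TTGCA GeneA
AAAC TTGCA GeneA
AAAC GGATC GeneA
AAAC CCTAG GeneB
TTTG TTGCA GeneA'

# Read count per (cell, gene): every read is counted.
read_counts=$(printf '%s\n' "$reads" \
    | awk '{n[$1" "$3]++} END {for (k in n) print k, n[k]}' | sort)

# UMI count per (cell, gene): identical (cell, UMI, gene) lines are
# collapsed with sort -u first, so each molecule is counted once.
umi_counts=$(printf '%s\n' "$reads" | sort -u \
    | awk '{n[$1" "$3]++} END {for (k in n) print k, n[k]}' | sort)

echo "reads per cell/gene:"; echo "$read_counts"
echo "UMIs per cell/gene:";  echo "$umi_counts"
```

In this toy data, GeneA in cell AAAC is supported by 4 reads but only 2 distinct UMIs, so a UMI-aware count reports 2 where a plain read count would report 4.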

The official cell ranger software provides not only data splitting (demultiplexing) but also quantification and other analyses.

Quantification requires aligning the reads to the reference genome, and the first step of alignment is to index the reference genome. The official website provides human and mouse reference genomes for download at the following URL:

https://support./single-cell-gene-expression/software/downloads/latest

For other species, only a FASTA file of the genome and a GTF file of the transcript annotation are needed to build a custom reference genome with the following steps.

1. Filtering the GTF file

The original GTF file contains many types of genes. The mkgtf subcommand can be used to select the genes of interest, as follows




cellranger mkgtf \
 \
 \
--attribute=gene_biotype:protein_coding

Filtering is controlled by the attribute parameter. In the example above, only records corresponding to protein-coding genes are kept.
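The effect of this filter can be illustrated on a toy annotation. The sketch below applies plain grep to three made-up GTF records. It only mimics the outcome of --attribute=gene_biotype:protein_coding and says nothing about how mkgtf works internally. The gene IDs and coordinates are hypothetical.

```shell
#!/usr/bin/env bash
# Three made-up GTF records with different gene_biotype values.
# Real GTF files are tab-separated; spaces are used here for readability.
gtf='chr1 toy gene 100 900 . + . gene_id "G1"; gene_biotype "protein_coding";
chr1 toy gene 1500 2000 . - . gene_id "G2"; gene_biotype "lincRNA";
chr2 toy gene 300 800 . + . gene_id "G3"; gene_biotype "protein_coding";'

# Keep only records whose attribute column carries gene_biotype "protein_coding".
filtered=$(printf '%s\n' "$gtf" | grep 'gene_biotype "protein_coding"')

echo "$filtered"
```

Only G1 and G3 survive the filter. In mkgtf itself, the --attribute option can be given more than once to keep several biotypes in a single run.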

2. Building the index

The mkref subcommand is used to build the index, as follows




cellranger mkref \
--genome=output_genome \
--nthreads=10 \
--fasta= \
--genes=

The genome parameter specifies the output directory. After the index is built, the directory structure is as follows




.
├── fasta
│   ├── 
│   └── 
├── genes
│   └── 
├── pickle
│   └── 
├── 
└── star

It can be seen that cell ranger builds a STAR index of the genome and uses STAR to align reads to the reference genome.
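Before moving on to quantification, it can be useful to confirm that the reference directory looks complete. The following is a minimal sketch that only checks for the fasta, genes, and star subdirectories shown in the listing above. The function name check_ref_dir is hypothetical and is not part of cell ranger.

```shell
#!/usr/bin/env bash
# check_ref_dir DIR: report whether DIR contains the expected subdirectories.
# Returns 0 if fasta, genes and star are all present, 1 otherwise.
check_ref_dir() {
    local ref_dir=$1 status=0 sub
    for sub in fasta genes star; do
        if [ -d "$ref_dir/$sub" ]; then
            echo "ok: $sub"
        else
            echo "missing: $sub"
            status=1
        fi
    done
    return "$status"
}

# Example call; output_genome is the directory name passed to --genome above.
check_ref_dir output_genome || echo "output_genome does not look like a complete reference"
```

The check looks only at directory names, so it catches an interrupted or misplaced mkref run but does not validate the index files themselves.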

Quantification is performed with the count subcommand, which is used as follows




cellranger count \
--id=sample345 \
--transcriptome=database_path \
--fastqs=fastq_path \
--sample=mysample

The id parameter specifies the name of the output directory, and the transcriptome parameter specifies the directory where the genome index is located. The fastqs parameter specifies the directory containing the sequence files generated by the mkfastq command. The sample parameter specifies the sample to be analyzed, which corresponds to a subdirectory under fastq_path.
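Since a wrong path in any of these parameters only shows up once the run has started, a small pre-flight check can save time. The sketch below assembles the command line from the four parameters described above and prints it instead of running it. The helper name build_count_cmd is hypothetical, and the check is limited to whether the two input directories exist.

```shell
#!/usr/bin/env bash
# build_count_cmd ID TRANSCRIPTOME FASTQS SAMPLE
# Prints the cellranger count command line for the given inputs, or prints an
# error and returns 1 if the reference or FASTQ directory does not exist.
build_count_cmd() {
    local id=$1 transcriptome=$2 fastqs=$3 sample=$4
    if [ ! -d "$transcriptome" ]; then
        echo "error: transcriptome directory not found: $transcriptome" >&2
        return 1
    fi
    if [ ! -d "$fastqs" ]; then
        echo "error: fastq directory not found: $fastqs" >&2
        return 1
    fi
    echo "cellranger count --id=$id --transcriptome=$transcriptome --fastqs=$fastqs --sample=$sample"
}

# Dry run with the placeholder values from the example above.
build_count_cmd sample345 database_path fastq_path mysample || true
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; once the printed line looks right, it can be pasted into the terminal or a job script.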

The count subcommand not only performs quantification but also provides clustering and other analyses. The results are written to a number of output files, and we will explain the output of this command in detail in later articles.
