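Recently some researchers have told us that journals ask for a GEO accession number, such as GSEXXXX, when a paper is published, and asked how to get one. To obtain it, you need to upload your data to the GEO database. Read on for a short introduction to GEO and a step-by-step guide to uploading your data.
GEO, short for Gene Expression Omnibus, was established in 2000. It is a gene expression database created and maintained by the US National Center for Biotechnology Information (NCBI), and it mainly holds high-throughput gene expression data. Alongside the SRA database, GEO is one of the databases currently used for depositing data when submitting a paper.
Accepted data types: raw data or processed data (compliant with the "minimum information about a microarray experiment" (MIAME) standard)
Storage formats: web format, spreadsheet format, XML format, and plain text format
Data upload walkthrough
01# Account registration and login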
(1) Register an account: open the NCBI web page, find Sign in in the top right corner and click it. On the page that opens, click Sign up, choose the type of account you prefer to register with, and follow the prompts to fill in the account name, password, e-mail address, and other information.
You can skip this step if you already have an NCBI account.
(2) Log in: enter your account name and password (use the login channel that matches the account you registered with) and click Log in. Then click the large NCBI icon in the upper left corner to return to the NCBI home page, and click the Submit button to open the data submission page.
02# Go to the GEO data upload page
(1) Click the Submit button on the home page to open the data submission page and select the GEO database. On the GEO data upload page, select the data type you want to upload. To upload high-throughput sequencing data, click the second row under "Data Type" to open the data upload page.
03# Prepare the files
Prepare three types of files as prompted by the web page: 1. metadata spreadsheet, 2. processed data files, and 3. raw data files.
(1) metadata spreadsheet: click "metadata spreadsheet" to download the template, then fill it in.
The downloaded file is an Excel spreadsheet, to be filled in with information about the samples and experiments for the whole study.
The main fields are filled in as follows:
SERIES: this section introduces your experiment. The summary field can be split into several paragraphs or written as a single paragraph, much like the abstract of a research paper.
SERIES section
SAMPLES: this section holds the experimental grouping information and the names of the samples in each group.
SAMPLES section
PROTOCOLS: this section describes how the samples were processed and how the sequencing libraries were built. This information is generally given in the results report from your sequencing service provider, or you can ask the provider for it directly.
PROTOCOLS section
(2) processed data files: one or more files containing data derived from the analysis of your raw files. Processed data is a required part of a GEO submission, and GEO reviews the processed data that submitters upload as a way to check that the conclusions of the related article are sound. For example, an RNA-seq study can upload gene expression files, and a ChIP-seq study can upload WIG, bigWig, bedGraph, and similar files. Because these are intermediate files, there is no fixed format for this part.
(3) raw data files: one or more files, namely the raw files obtained from your sequencing run or microarray. Raw sequencing data is generally in FASTQ format; other formats accepted by the SRA database are also possible (/sra/docs/submitformats/).
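Raw sequencing files tend to be large, so it can help to record each file's size and MD5 checksum before uploading, so that a transfer can be checked afterwards. This is an optional extra, not a step taken from the GEO page. The sketch below assumes Python, and the function names `md5_of_file` and `inventory` are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(folder):
    """Return (file name, size in bytes, MD5) for every regular file in folder."""
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            rows.append((path.name, path.stat().st_size, md5_of_file(path)))
    return rows


# Example usage (the folder name is a placeholder, not a GEO requirement):
# for name, size, checksum in inventory("raw_data_files"):
#     print(f"{name}\t{size}\t{checksum}")
```

Keeping this listing next to the metadata spreadsheet makes it easier to spot a truncated file if GEO reports a problem with the upload.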
04# Data upload
Click on "Uploading your submission" on this page to go to the data upload section:
You are taken to "My GEO Profile". Fill in your basic personal information; once that is done, the page shows the server address, user name, and password required for FTP login. Log in through FileZilla and start uploading your data.
The GEO website recommends the FileZilla software, download link:.
Go to the Project Data Transfer page:
Right-click and create three subfolders under this path: 1. metadata spreadsheet, 2. processed data files, and 3. raw data files. Then upload each set of files into its matching folder. Note that raw data files are very large, so be patient while they upload.
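If you would rather script the transfer than click through FileZilla, the same folder-by-folder upload can be sketched with Python's standard `ftplib`. This is a minimal sketch, not GEO's own tool: the function names, the folder names in `SUBFOLDERS`, and every connection value are placeholders, and the real server address, user name, password, and upload path are the ones shown on your own GEO profile page.

```python
from ftplib import FTP, error_perm
from pathlib import Path

# Hypothetical subfolder names; use whatever names you created on the server.
SUBFOLDERS = ["metadata_spreadsheet", "processed_data_files", "raw_data_files"]


def upload_folder(ftp, local_dir, remote_dir):
    """Upload every regular file in local_dir into remote_dir on the server."""
    try:
        ftp.mkd(remote_dir)  # create the subfolder
    except error_perm:
        pass  # most likely the folder already exists
    uploaded = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file():
            with open(path, "rb") as handle:
                ftp.storbinary(f"STOR {remote_dir}/{path.name}", handle)
            uploaded.append(path.name)
    return uploaded


def upload_submission(host, user, password, remote_base, local_base):
    """Log in, move to the personal upload path, and send the three folders."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(remote_base)
        for name in SUBFOLDERS:
            upload_folder(ftp, Path(local_base) / name, name)


# Example call with placeholder values only:
# upload_submission("ftp.example.org", "my_user", "my_password",
#                   "my_upload_path", "geo_submission")
```

A graphical client such as FileZilla is still the simpler choice for a one-off submission; a script mainly pays off when the raw data runs to many files.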
05# Confirmation of data upload completion
Once the upload is complete, you can click Notify GEO to alert the GEO backend staff that the upload is complete and ready for review.
Clicking Notify GEO opens a form where you fill in the name of the folder you created, the date you expect the data to be made public, any further notes, and so on.
GEO will notify you by email whether the upload succeeded or whether there are problems with the data. Review usually takes about 2 or 3 working days, after which GEO emails you the GSM (sample) and GSE (series, that is, the study as a whole) accession numbers.
Tip: after you submit the data, you will receive an email from GEO at the address you gave when registering with NCBI. If there is a problem with your data, the email will ask you to correct it or upload it again.
Once your files are uploaded, you will receive an email from GEO within 5 working days giving you a GEO accession number of the form GSEXXX. Receiving this email means your data has been uploaded successfully, and you can cite the number in your article.
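When the email lists one GSE number and many GSM numbers, copying them by hand into a manuscript invites typos. As a small convenience, the sketch below pulls them out of pasted text with a regular expression. It assumes only that accessions look like GSE or GSM followed by digits; the function name `find_accessions` and the numbers in the usage example are made up for illustration.

```python
import re

# GSE = series accession, GSM = sample accession, each followed by digits.
ACCESSION_PATTERN = re.compile(r"\b(GSE|GSM)(\d+)\b")


def find_accessions(text):
    """Return the unique GSE and GSM accessions in text, in order of appearance."""
    found = {"GSE": [], "GSM": []}
    for prefix, digits in ACCESSION_PATTERN.findall(text):
        accession = prefix + digits
        if accession not in found[prefix]:
            found[prefix].append(accession)
    return found


# Example usage with invented accession numbers:
# email_body = "Series GSE12345 contains samples GSM100001 and GSM100002."
# print(find_accessions(email_body))
```

Placeholders such as GSEXXXX are not matched, because the pattern requires digits after the prefix.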