Table Browser & SARS-CoV-2 Viral Genome
Transcription
Introduction:
The UCSC Table Browser provides text-based access to the large collection of genome assemblies stored in the Genome Browser Database. It can retrieve the data underlying the Genome Browser tracks for an entire genome, apply filters, intersect tracks, and write the output in formats such as tab-separated text, sequence (FASTA), Gene Transfer Format (GTF), and Browser Extensible Data (BED). We’ll look at the genome of SARS-CoV-2, the virus behind the current worldwide pandemic and one of the most searched-for viruses today.
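To make one of these output formats concrete, here is a minimal Python sketch that reads a single BED line. It relies only on the standard BED layout (tab-separated chrom, start, end, then optional fields such as name, with a 0-based start and an exclusive end). The `BedRecord` type, the `parse_bed_line` helper and the example line are hypothetical and are not taken from real Table Browser output.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def parse_bed_line(line: str) -> BedRecord:
    """Parse the first four tab-separated columns of one BED line."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # The name column is optional in BED, so fall back to an empty string.
    name = fields[3] if len(fields) > 3 else ""
    return BedRecord(chrom, start, end, name)


# Made-up example line (not real SARS-CoV-2 coordinates).
record = parse_bed_line("chrDemo\t99\t250\tgeneA\n")
print(record.name, record.end - record.start)
```

Because the end coordinate is exclusive, the feature length is simply `end - start`.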
Steps:
-
Open the Table Browser from the UCSC Genome Browser home page.
-
Because we are retrieving the SARS-CoV-2 viral genome, we need to set the parameters on the Table Browser interface to match that query.
-
Select ‘Viruses’ for clade and ‘SARS-CoV-2’ for genome. Make sure to select the latest assembly so that you work with the most up-to-date data.
-
The group drop-down menu offers different options depending on the organism. Select ‘Genes and Gene Predictions’ here.
-
Select ‘NCBI Genes’ as the track because its annotations can be experimentally supported (though they can also be predicted), whereas the AUGUSTUS and Genscan Genes tracks contain predicted genes only.
-
The table is automatically set to ‘ncbiGene’ based on the selected track.
-
Region has sub-options such as genome (to retrieve data for the whole genome) and position (to retrieve data for a particular position). We’ll select ‘genome’.
-
The crucial step in the Table Browser is selecting the output format. Several output formats are available, as mentioned above, and the right one depends on what you need. Because we want the gene sequences, we’ll select ‘Sequence’.
-
For the output file, leave the field empty to keep the output in the browser, or enter a name such as ‘SARS-CoV-2-Genes-FASTA’ to download it to your computer.
-
Select summary/statistics to view information about the gene data, such as the number of genes, the number of bases they cover, and more.
-
Select get output to submit the query. This leads to the Sequence Retrieval Region Options, where you choose which parts of each gene to retrieve. Uncheck the Upstream and Downstream options and leave the rest checked. Choose ‘One FASTA record per gene’ so that each gene is written as its own FASTA record.
-
In Sequence Formatting Options, keep the defaults so that you can tell the different sequence regions apart.
-
Select get sequence to download the file.
-
Open a code editor to view the output. You’ll see the sequence of each gene of the viral genome in FASTA format.
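Besides viewing the file in an editor, you can read the downloaded FASTA output with a short script. The Python sketch below assumes only the general FASTA layout (a header line starting with `>` followed by one or more sequence lines). The `read_fasta` helper and the two demo records are hypothetical stand-ins, not actual Table Browser output.

```python
from io import StringIO


def read_fasta(handle):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Made-up two-record example standing in for the downloaded file.
example = StringIO(">geneA demo\nATGTTT\nGTTTTT\n>geneB demo\nATGGCA\n")
genes = read_fasta(example)
for name, seq in genes.items():
    print(name, len(seq))
```

To run it on the real download, replace the `StringIO` example with `open("SARS-CoV-2-Genes-FASTA")`, assuming the file was saved under the name suggested in the steps above. Note that records sharing an identical header would overwrite one another in this simple dictionary-based version.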
Summary:
Because this video focused on the SARS-CoV-2 viral genome, we learned how to use the UCSC Table Browser to retrieve the viral genome and the sequences of its individual genes.