UniProt is a free database of protein sequences and functional information that gives comprehensive access to what is known about proteins. This central resource stores the biological function of proteins together with their annotation data, all derived from the research literature and from genome sequencing projects. It is freely accessible to researchers and scientists and easy to use.
UniProt accepts primary sequences of proteins derived from peptide sequencing experiments. It stores and interconnects large volumes of data from different sources.
The rapid and ongoing growth of predicted protein sequences from high-throughput genome sequencing of many and increasingly diverse organisms, the expansion of large-scale proteomics (e.g. gene expression profiling and protein–protein interactions) and the emergence of structural genomics have combined to give researchers a wealth of data to analyze and use. This created a widely recognized need for a centralized repository of protein sequences with comprehensive coverage and a systematic approach to protein annotation, one that incorporates, integrates and standardizes data from these various sources.
Expert curators have collected detailed, comprehensive annotations from the literature for more than half a million of these proteins.
UniProt is developed with the scientific expertise and extensive bioinformatics infrastructure of EMBL-EBI. The UniProt consortium and its host institutions, EMBL-EBI, SIB and PIR, have committed to the long-term preservation of the UniProt databases. In addition, more than 100 people across the project institutions support this commitment through tasks such as data curation, software development and user support.
Four distinct components have been developed within UniProt for different uses:
1. The UniProt Knowledgebase (UniProtKB)
It is an expertly curated database, a central access point for integrated protein information with cross-references to multiple sources.
2. The UniProt Archive (UniParc)
It is a comprehensive sequence repository that records the history of all protein sequences across different species.
3. UniProt Reference Clusters (UniRef)
It merges closely related sequences based on sequence identity to speed up searches.
4. The UniProt Metagenomic and Environmental Sequences (UniMES) database
It is a repository developed for the rapidly expanding area of metagenomic and environmental data.
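The idea behind UniRef, merging closely related sequences by sequence identity, can be illustrated with a small Python sketch. This is not UniProt's actual clustering pipeline: the function names `identity` and `greedy_cluster` are hypothetical, and the identity measure is a crude ungapped, position-by-position comparison standing in for a real alignment. The threshold parameter plays the role of the identity level (for example 0.9 or 0.5).

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Crude ungapped identity: matching positions divided by the longer length.

    Assumption for illustration only; real clustering tools align sequences first.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Group sequences under the first representative they match at >= threshold."""
    clusters: Dict[str, List[str]] = {}
    # Visit the longest sequences first so each representative is the
    # longest member of its cluster; ties are broken by name.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters
```

Searching only the cluster representatives instead of every member is what makes searches faster: the lower the threshold, the fewer representatives remain.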
Benefits of using UniProt
Finding protein sequences
Inferring evolutionary information
Looking up the names and taxonomy of proteins
Following cross-references to related sites for a protein
Studying biological processes
Studying molecular function
Studying post-translational modifications
Examining interactions with other molecules
Determining location in cells and organisms
Running BLAST searches on sequences
UniProt continues to succeed because of:
Growth of sequences in UniProt
Reference proteomes
Expert curation progress
Automatic annotation progress
An important aspect of UniProt is connecting papers with the relevant entries. This is ongoing work, and both existing and new users should make use of the web-based training, the API and the downloads from the FTP site to get more out of UniProt.
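As an illustration of working with downloaded data, the following Python sketch reads a FASTA file and splits each UniProtKB-style header into its parts. It assumes the usual header layout, `>db|accession|entry_name protein name OS=... OX=... GN=... PE=... SV=...`, where `db` is `sp` or `tr`. The function names `parse_header` and `read_fasta` are hypothetical and not part of any UniProt library.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# Assumed header layout: >db|accession|entry_name protein name OS=... OX=... [GN=...] PE=... SV=...
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+(?P<rest>.*)$"
)
FIELD_RE = re.compile(r"\s(OS|OX|GN|PE|SV)=")


def parse_header(line: str) -> Dict[str, str]:
    """Split one UniProtKB-style FASTA header line into named fields."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError("not a UniProtKB-style header: " + line)
    info = {k: m.group(k) for k in ("db", "accession", "entry_name")}
    # Splitting on the KEY= markers yields: name, key1, value1, key2, value2, ...
    parts = FIELD_RE.split(m.group("rest"))
    info["protein_name"] = parts[0].strip()
    for key, value in zip(parts[1::2], parts[2::2]):
        info[key] = value.strip()
    return info


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[Dict[str, str], str]]:
    """Yield (parsed header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield parse_header(header), "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    # Emit the final record, which has no following header to trigger it.
    if header is not None:
        yield parse_header(header), "".join(chunks)
```

Passing an open file object to `read_fasta` yields one record at a time, so the organism name (`OS`), taxonomy identifier (`OX`) and gene name (`GN`, when present) can be read without loading the whole file into memory.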