1: Classify Your Protein
In this example, we start with the following unidentified protein sequence:
MGRSIRLFATFFLIAMLFLSTEMGPM
TSAEARTCESQSHRFHGTCVRESNCA
SVCQTEGFIGGNCRAFRRRCFCTRNC
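The sequence above is wrapped over three lines. A minimal Python sketch of joining it into a single string and checking it before pasting is shown here; the helper name `clean_sequence` is hypothetical, and whether the input field itself tolerates line breaks is not stated in this walkthrough.

```python
# The three lines of the example sequence, exactly as shown above.
RAW_LINES = [
    "MGRSIRLFATFFLIAMLFLSTEMGPM",
    "TSAEARTCESQSHRFHGTCVRESNCA",
    "SVCQTEGFIGGNCRAFRRRCFCTRNC",
]

# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(lines):
    """Join lines, drop whitespace, uppercase, and reject non-standard letters."""
    sequence = "".join("".join(line.split()) for line in lines).upper()
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"Unexpected characters in sequence: {invalid}")
    return sequence


sequence = clean_sequence(RAW_LINES)
print(len(sequence))  # 78 residues in total (3 lines of 26)
print(sequence)
```

Rejecting unexpected characters up front is a design choice of this sketch; it catches stray digits or punctuation picked up while copying the sequence.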
We then enter the sequence in the "Paste your protein sequence" field of Classify Your Protein.
2: Cluster Card Page
We then submit the sequence, which brings us to the Cluster Card page for the cluster that best
matches the sequence. In this case, it is cluster number 273515.
This cluster contains 74 proteins, plus 1 for the submitted sequence, which is listed as "Hypothetical protein".
3: View Proteins of Cluster
The cluster's proteins can be viewed by clicking on View Proteins of Cluster in the upper-left
corner of the Cluster Card page.
Clicking on the "sort by Escores with your protein" button next to the unidentified sequence
(in this case, "Hypothetical Protein") sorts the other proteins in the cluster according to
their BLAST E-score with the sequence.
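As a rough illustration of what such a sort does, the sketch below orders a list of hits by E-score. The protein names and E-scores are made up for illustration, and ascending order (most significant hit first) is an assumption about how the page presents the result.

```python
# Hypothetical (protein name, E-score) pairs; these values are invented
# for illustration and are not taken from cluster 273515.
hits = [
    ("protein_A", 3e-12),
    ("protein_B", 1e-30),
    ("protein_C", 0.002),
]


def sort_by_escore(pairs):
    """Return the pairs ordered from the lowest to the highest E-score."""
    return sorted(pairs, key=lambda pair: pair[1])


ranked = sort_by_escore(hits)
for name, escore in ranked:
    print(f"{name}\t{escore:.1e}")
```

A lower E-score means the match is less likely to have arisen by chance, so the proteins most similar to the submitted sequence come first.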
4: Keywords Appearances
To obtain the biological characteristics of these proteins, we go to the Get Keywords Appearances section.
In this case, we choose to view InterPro Family keywords.
Here, the first two InterPro Family keywords are the relevant ones.
For example, for Gamma Purothionin, we see that 50 of this cluster's proteins have this keyword, and only 1 other
protein in the database has this keyword.
This suggests that the unidentified sequence may be a Gamma Purothionin-type protein.
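The reasoning behind this conclusion can be made explicit with two simple ratios, sketched below. The names `coverage` and `specificity` are introduced here for illustration and are not ProtoNet terms; the counts are the ones quoted above, with the submitted sequence left out of the cluster size.

```python
def keyword_stats(in_cluster, cluster_size, elsewhere):
    """Return (coverage, specificity) for one keyword.

    coverage    = share of the cluster's proteins that carry the keyword
    specificity = share of all keyword carriers that sit inside the cluster
    """
    coverage = in_cluster / cluster_size
    specificity = in_cluster / (in_cluster + elsewhere)
    return coverage, specificity


# Counts from the walkthrough: 50 of the cluster's 74 proteins carry the
# keyword, and 1 other protein in the database carries it.
coverage, specificity = keyword_stats(in_cluster=50, cluster_size=74, elsewhere=1)
print(f"coverage:    {coverage:.1%}")     # 67.6%
print(f"specificity: {specificity:.1%}")  # 98.0%
```

A keyword that covers most of the cluster and is almost absent outside it is a reasonable candidate annotation for an unidentified member of that cluster.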
5: Cluster Similarity Distribution (Not Supported in Current Version 6.1)
To explore this sequence's biological properties more deeply, we decide to view the Cluster Similarity Distribution by
clicking on its icon at the top of the Cluster Card page.
This opens a new window:

Clicking "proceed" next to the Display similarity distributions section brings us to the similarity matrix for the proteins in
this cluster (including the unidentified protein sequence we added):
6: BLAST
The color in each square of the matrix indicates the BLAST E-score for the pair of proteins appearing in the corresponding row
and column of the square.
Clicking on a specific square opens a window displaying the BLAST results and alignment for the corresponding pair of proteins.
In this example, clicking on the uppermost of the low-E-score squares shows the similarity
for a pair that includes THG_PETIN (ProtoNet ID: P-70672).
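The idea of a color-coded similarity matrix can be sketched as follows. The bin thresholds, color labels, protein names, and E-scores are all hypothetical; the actual thresholds and colors used by the site are not given in this walkthrough.

```python
import math

# Hypothetical bins: (upper bound on E-score, label), checked in order.
COLOR_BINS = [
    (1e-20, "dark"),
    (1e-5, "medium"),
    (math.inf, "light"),
]


def color_for(escore):
    """Return the label of the first bin whose upper bound is >= escore."""
    for upper_bound, label in COLOR_BINS:
        if escore <= upper_bound:
            return label
    raise ValueError("E-score must be a number")  # reached only for NaN


def cell(row, col, escores):
    """Label one square: 'self' on the diagonal, 'none' if the pair has no score."""
    if row == col:
        return "self"
    pair = frozenset((row, col))
    return color_for(escores[pair]) if pair in escores else "none"


def color_matrix(names, escores):
    """Build a symmetric grid of labels from pairwise E-scores."""
    return [[cell(row, col, escores) for col in names] for row in names]


# Made-up pairwise E-scores, keyed by unordered pair.
names = ["query", "protein_A", "protein_B"]
escores = {
    frozenset(("query", "protein_A")): 1e-25,
    frozenset(("protein_A", "protein_B")): 1e-8,
}
grid = color_matrix(names, escores)
for name, row in zip(names, grid):
    print(name, row)
```

Keying the scores by an unordered pair keeps the matrix symmetric, so the square at (row, column) and the one at (column, row) always agree.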
7: Browse Cluster in tree
To confirm some of the conjectures above, we decide to "climb" the ProtoNet tree using the ProtoBrowser option.
We do this by clicking on the Browse Cluster in tree button
at the top of the main Cluster Card page.
This opens a new window of the ProtoBrowser.
To climb the tree, we click on the "up arrow" at the top of the tree.
This moves us up in the ProtoNet tree display.
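Climbing a cluster tree amounts to following child-to-parent links. The sketch below shows this on a toy tree; the `PARENT` map is hypothetical, and treating cluster 286462 as the direct parent of 273515 is an assumption made only for illustration, since the walkthrough does not say how many levels separate them.

```python
# Hypothetical child -> parent links for a toy tree. The direct link between
# these two cluster numbers is assumed for illustration only.
PARENT = {
    "273515": "286462",
    "286462": None,  # treated as the top of this toy tree
}


def climb(cluster_id, parent, steps=1):
    """Move up `steps` levels, stopping early at the top of the tree."""
    current = cluster_id
    for _ in range(steps):
        above = parent.get(current)
        if above is None:
            break
        current = above
    return current


def path_to_top(cluster_id, parent):
    """List the clusters visited when climbing from `cluster_id` to the top."""
    path = [cluster_id]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path


print(climb("273515", PARENT))
print(path_to_top("273515", PARENT))
```

Each step up reaches a larger, more general cluster that contains the previous one.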
8: Cluster Card page for the newly-chosen cluster
Clicking on a cluster will load the Cluster Card page for the newly-chosen cluster.
In this case, we choose Cluster 286462.
On the Cluster Card page for Cluster 286462, we see that there are 153 proteins in the cluster.
9: Keywords Appearances for the newly-chosen cluster
To obtain the biological characteristics of this larger cluster of proteins,
we go to the Get Keywords Appearances section.
In this case, we choose to view InterPro keywords (keywords regarding functional domains and families).
There are 3 relevant InterPro keywords.
However, none of them is assigned to a majority of the cluster's proteins.
Therefore, it is difficult to reach any conclusions regarding this set of proteins.
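The majority test used here can be written down as a short check. In this sketch the keyword names and counts are hypothetical placeholders, since the walkthrough gives the cluster size (153) but not the per-keyword counts.

```python
def majority_keywords(keyword_counts, cluster_size):
    """Return the keywords carried by more than half of the cluster's proteins."""
    return [kw for kw, n in keyword_counts.items() if n > cluster_size / 2]


# Hypothetical counts for three keywords in a cluster of 153 proteins.
counts = {"keyword_1": 60, "keyword_2": 45, "keyword_3": 20}
print(majority_keywords(counts, cluster_size=153))  # []
```

An empty result corresponds to the situation described above: no single keyword characterizes the cluster as a whole.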