D-I-TASSER P46439-D2

D-I-TASSER results for P46439-D2

[Click result.zip to download all results on this page]

Input Sequence in FASTA format

>P46439 (129 residues)
TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFA
GDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLF
GKSATWNSK
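
As a quick sanity check, the FASTA record above can be parsed and its length compared with the 129 residues stated in the header line. This is a minimal sketch; the helper name `parse_fasta` is illustrative and not part of D-I-TASSER.

```python
def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:]            # drop the leading '>'
    sequence = "".join(lines[1:])    # join the wrapped sequence lines
    return header, sequence


FASTA = """>P46439 (129 residues)
TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFA
GDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLF
GKSATWNSK
"""

header, sequence = parse_fasta(FASTA)
print(header, len(sequence))
```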

Predicted Secondary Structure

Sequence	TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK
Prediction	CHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCSSSCCCCCHHHHHHHHHHHHHHHHCHHHHHHCCCHHHHHHHHHHCHHHHHHHCCCCCCCCCCCCCCCCCCCC
Confidence	889999999999999999999998734777489999999996899999999967898355785547899999999999997801443093289999999929789999737656889988987633899
	H:Helix; S:Strand; C:Coil
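
The per-residue Prediction string is easier to read as a list of segments. The sketch below collapses consecutive identical states into runs with 1-based residue positions; `ss_segments` is a hypothetical helper name, and the short input string is a toy example rather than the full prediction.

```python
from itertools import groupby


def ss_segments(prediction):
    """Collapse a per-residue string such as 'CHHHC' into (state, start, end)
    runs, using 1-based inclusive residue positions."""
    segments = []
    position = 1
    for state, run in groupby(prediction):
        length = len(list(run))
        segments.append((state, position, position + length - 1))
        position += length
    return segments


# Toy input; paste the Prediction row above to segment the real result.
print(ss_segments("CHHHHCCSSSC"))
```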

Predicted Solvent Accessibility

Sequence	TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK
Prediction	875430301010100231233203102344256216502640361043015206743111043312000000100330442326106512403401530362540352154762363213243142458
	Values range from 0 (buried residue) to 8 (highly exposed residue)
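
Because each digit is an accessibility level from 0 to 8, exposed positions can be listed with a simple threshold. In this sketch the cutoff of 5 is an arbitrary assumption chosen for illustration; the page does not define a buried/exposed boundary.

```python
def exposed_positions(accessibility, cutoff=5):
    """Return 1-based positions whose predicted accessibility digit is >= cutoff.

    The default cutoff is an illustrative assumption, not a D-I-TASSER setting.
    """
    return [
        index
        for index, digit in enumerate(accessibility, start=1)
        if int(digit) >= cutoff
    ]


# First ten digits of the Prediction row above.
print(exposed_positions("8754303010"))
```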

Predicted Contact, Hydrogen and Distance Map Used in D-I-TASSER simulation

Contact Map

Distance Map

Hydrogen bond networks

D-I-TASSER simulation is guided by the consensus contact map (left figure), distance map (middle figure) and hydrogen bond network (right figure), derived from the confidence scores of AttentionPotential. In all three plots, the axes mark the residue index along the sequence. In the contact map, each dot represents a residue pair with a predicted contact; in the distance map and hydrogen bond network, a color scale represents a distance of 1-20+ angstroms or an angle of 0-180 degrees.
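
The relationship between a distance map and a contact map can be sketched as a threshold on pairwise distances. The 8 angstrom cutoff and the `min_separation` parameter below are assumptions based on common practice; the page does not state the values D-I-TASSER uses, and the matrix is a toy example.

```python
def contacts_from_distances(distances, cutoff=8.0, min_separation=1):
    """Return (i, j) residue pairs, 1-based with i < j, closer than cutoff.

    distances is a square list of lists in angstroms. Pairs nearer than
    min_separation along the sequence are skipped.
    """
    n = len(distances)
    pairs = []
    for i in range(n):
        for j in range(i + min_separation, n):
            if distances[i][j] < cutoff:
                pairs.append((i + 1, j + 1))
    return pairs


# Toy 4-residue distance matrix (angstroms), not taken from this run.
toy = [
    [0.0, 3.8, 6.1, 12.4],
    [3.8, 0.0, 3.8, 9.0],
    [6.1, 3.8, 0.0, 3.8],
    [12.4, 9.0, 3.8, 0.0],
]
print(contacts_from_distances(toy, min_separation=2))
```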

Top 10 threading templates used by D-I-TASSER

Rank	PDB hit	ID1	ID2	Cov	Norm. Zscore	Download alignment

SS	CHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCSSSCCCCCHHHHHHHHHHHHHHHHCHHHHHHCCCHHHHHHHHHHCHHHHHHHCCCCCCCCCCCCCCCCCCCC
Seq	TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK

1b8xA

0.37

0.36

10.83

1.50

DEthreader

GCKERAEISMLEGAVLDIRYGVSRIAYSKDFETLKVDFLSKLPEMLKMFEDRLCHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLDAFPKLVCFKKRIEAIPQIDKYLKSSYIAWP-LQGWQATFDLV

2f3mA

0.82

23.19

1.29

SPARKS-K

TEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK

19gsA

0.28

0.27

8.28

0.53

MapAlign

-QQEAALVDMVNDGVEDLRCKYISLIY-TNYEAGKDDYVKALPGQLKPFETLLGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ----

19gsA

0.27

0.26

8.08

0.36

CEthreader

DQQEAALVDMVNDGVEDLRCKYISLIYTN-YEAGKDDYVKALPGQLKPFETLLSQNTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ----

2f3mA

0.82

23.19

1.26

MUSTER

TEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK

1c72A

0.60

17.33

0.95

HHsearch

TEVEKQRVDVLENHLMDLRMAFARLCYSPDFEKLKPAYLELLPGKLRQLSRFLGSRSWFVGDKLTFVDFLAYDVLDQQRMFVPDCPELQGNLSQFLQRFEALEKISAYMRSGRFMKAPIFWYTALWNNK

1c72A

0.60

17.33

1.94

FFAS-3D

TEVEKQRVDVLENHLMDLRMAFARLCYSPDFEKLKPAYLELLPGKLRQLSRFLGSRSWFVGDKLTFVDFLAYDVLDQQRMFVPDCPELQGNLSQFLQRFEALEKISAYMRSGRFMKAPIFWYTALWNNK

1fheA

0.43

12.51

0.82

EigenThreader

TPEERARISMIEGAAMDLRIGFGRVCYNPKFEEVKEEYVKELPKTLKMWSDFLGDRHYLTGSSVSHVDFMLYETLDSIRYLAPHCLDEFPKLKEFKSRIEALPKIKAYMESKRFIKWPLNGWAASFGAG

2c4jA

0.72

20.47

1.15

CNFpred

SEKEQIREDILENQFMDSRMQLAKLCYDPDFEKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK

2f3mA

0.81

0.80

22.54

1.50

DEthreader

TEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSRFLPRP-VFSKMAVWGN-

(a)	ID1 is the number of template residues identical to query divided by number of aligned residues.
(b)	ID2 is the number of template residues identical to query divided by query sequence length.
(c)	Cov is equal to the number of aligned template residues divided by query sequence length.
(d)	Norm. Zscore is the normalized Z-score of the threading alignments. A Normalized Z-score >1 means a good alignment and is highlighted in bold.
(e)	Download alignment lists the threading program used to identify the template, and provides the 3D structure of aligned regions of threading templates (threading[1-10].pdb.gz).
(f)	Template residues identical to query sequence are highlighted in color.
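
The definitions of ID1, ID2 and Cov in footnotes (a) to (c) can be written out directly. This sketch assumes the template row is anchored to the query, so both strings have the same length and `-` marks a query position with no aligned template residue; `threading_stats` is a hypothetical helper, and the five-residue strings are a toy example.

```python
def threading_stats(query, template_row):
    """Return (ID1, ID2, Cov) for a gapped template row aligned to the query."""
    if len(query) != len(template_row):
        raise ValueError("query and template row must have equal length")
    aligned = sum(1 for t in template_row if t != "-")
    identical = sum(
        1 for q, t in zip(query, template_row) if t != "-" and q == t
    )
    id1 = identical / aligned if aligned else 0.0  # identical / aligned residues
    id2 = identical / len(query)                   # identical / query length
    cov = aligned / len(query)                     # aligned / query length
    return id1, id2, cov


# Toy alignment: 4 aligned positions, 3 of them identical.
print(threading_stats("ACDEF", "AC-EG"))
```

By construction ID2 equals ID1 multiplied by Cov, so the two identities coincide when the template covers the whole query.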

Top 1 final models from D-I-TASSER

Rank^a	Download	Estimated TM-score^b
1	model1.pdb.gz	0.82

(a)	D-I-TASSER simulations generate a large ensemble of structural conformations, called decoys. SPICKER clusters these decoys by pairwise structural similarity, and up to five final models are reported from the five largest clusters. Models are ranked in descending order of cluster size. If the simulations converge well, fewer than five models may be generated, which usually indicates good model quality.
(b)	Model confidence is quantified by the estimated TM-score (eTM-score), calculated from the significance of the threading template alignments, the contact map satisfaction rate, the mean absolute error between the model distances and the AttentionPotential distances, and the convergence of the D-I-TASSER simulations. The eTM-score typically lies in the range [0, 1], and a higher eTM-score signifies higher model confidence.
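
The ranking rule in footnote (a), descending cluster size with at most five models, can be sketched as follows. The cluster identifiers and decoy counts are hypothetical and do not come from this run, and `rank_models` is an illustrative helper rather than part of SPICKER.

```python
def rank_models(cluster_sizes, max_models=5):
    """Return cluster ids ordered by descending size, keeping at most max_models."""
    ordered = sorted(cluster_sizes.items(), key=lambda item: item[1], reverse=True)
    return [cluster_id for cluster_id, _ in ordered[:max_models]]


# Hypothetical decoy counts per cluster.
sizes = {"c1": 120, "c2": 940, "c3": 15, "c4": 310, "c5": 60, "c6": 7}
print(rank_models(sizes))
```

A well-converged run would correspond to a dictionary with a single dominant cluster, in which case only one model is returned.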

Proteins with similar structure

Top 10 structural analogs in PDB (as identified by TM-align)

Rank	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	Download Alignment
1	2f3mA	0.96	0.94	0.822	1.000	model1_2f3mA.pdb.gz
2	1c72A	0.93	1.44	0.605	1.000	model1_1c72A.pdb.gz
3	5an1A	0.92	1.23	0.441	0.985	model1_5an1A.pdb.gz
4	1b8xA	0.92	1.38	0.364	1.000	model1_1b8xA.pdb.gz
5	4q5qA	0.91	1.43	0.354	0.985	model1_4q5qA.pdb.gz
6	3isoA	0.91	1.43	0.349	1.000	model1_3isoA.pdb.gz
7	1fheA	0.90	1.50	0.426	1.000	model1_1fheA.pdb.gz
8	19gsA	0.90	1.28	0.274	0.961	model1_19gsA.pdb.gz
9	4q5nA	0.89	1.64	0.271	1.000	model1_4q5nA.pdb.gz
10	1tu7A	0.86	1.87	0.282	0.961	model1_1tu7A.pdb.gz

(a)	Query structure is shown in cartoon, while the structural analog is displayed using backbone trace.
(b)	Ranking of proteins is based on TM-score of the structural alignment between the query structure and known structures in the PDB library.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(e)	Cov. represents the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by length of the query protein.
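
For readers unfamiliar with the TM-score column, the sketch below evaluates the commonly published TM-score definition for a fixed superposition. The page itself does not give the formula, so the expression for d0 is stated here as an assumption about what TM-align reports; no superposition search is performed, and the function names are illustrative.

```python
def tm_d0(target_length):
    """Distance scale d0 in angstroms for a target of the given length."""
    if target_length < 19:
        # Below this length the expression is not positive; not handled here.
        raise ValueError("this sketch only handles targets of 19 or more residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score from per-pair distances (angstroms) of aligned residues,
    normalised by the target length. Unaligned residues contribute zero."""
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


print(round(tm_d0(129), 2))
print(tm_score([0.0] * 129, 129))
```

Because the sum is divided by the full target length, an alignment that leaves residues uncovered cannot reach a score of 1 even if every aligned pair superimposes perfectly.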

Predicted Gene Ontology (GO) Terms

Molecular Function (MF)

GO term	Cscore^GO	Name
GO:0003824	0.97	catalytic activity
GO:0004364	0.95	glutathione transferase activity
GO:0042803	0.52	protein homodimerization activity

Download full result of the above consensus prediction.

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range from 0 to 1, where a higher value indicates greater confidence in the function predicted from the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Molecular Function. Confidently predicted terms are color coded by Cscore^GO in the following bins: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0]
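
The Cscore^GO bins listed above are half-open except for the last one, which also includes 1.0. A minimal sketch of assigning a score to its bin follows; `cscore_bin` is a hypothetical helper, and scores below 0.4 return `None` because the legend has no bin for them.

```python
BINS = [(0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9), (0.9, 1.0)]


def cscore_bin(score):
    """Return the legend label for a Cscore^GO value, or None if no bin applies."""
    for low, high in BINS:
        is_last = high == 1.0
        if low <= score < high or (is_last and score == high):
            closing = "]" if is_last else ")"
            return f"[{low},{high}{closing}"
    return None


# Scores taken from the Molecular Function table above.
for term, score in [("GO:0003824", 0.97), ("GO:0004364", 0.95), ("GO:0042803", 0.52)]:
    print(term, cscore_bin(score))
```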

Biological Process (BP)

GO term	Cscore^GO	Name
GO:0009987	0.99	cellular process
GO:0008152	0.97	metabolic process
GO:0044237	0.95	cellular metabolic process
GO:0071704	0.90	organic substance metabolic process
GO:0034641	0.88	cellular nitrogen compound metabolic process
GO:0006749	0.87	glutathione metabolic process
GO:0044699	0.70	single-organism process
GO:0044710	0.55	single-organism metabolic process
GO:0006805	0.53	xenobiotic metabolic process

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range from 0 to 1, where a higher value indicates greater confidence in the function predicted from the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Biological Process. Confidently predicted terms are color coded by Cscore^GO in the following bins: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0]

Cellular Component (CC)

GO term	Cscore^GO	Name
GO:0044464	0.99	cell part
GO:0044424	0.97	intracellular part
GO:0044444	0.91	cytoplasmic part
GO:0005829	0.69	cytosol
GO:0005737	0.65	cytoplasm

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range from 0 to 1, where a higher value indicates greater confidence in the function predicted from the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Cellular Component. Confidently predicted terms are color coded by Cscore^GO in the following bins: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0]

Predicted Enzyme Commission (EC) Numbers

Top 5 enzyme homologs in PDB

Rank	Cscore^EC	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	EC Number	Predicted Active Site Residues
1	0.139	2dc5A	0.955	1.01	0.744	1.000	2.5.1.18	NA
2	0.131	1b4pA	0.954	1.04	0.721	1.000	2.5.1.18	NA
3	0.131	13gsA	0.895	1.29	0.274	0.961	2.5.1.18	NA
4	0.128	1c72A	0.929	1.44	0.605	1.000	2.5.1.18	NA
5	0.125	1hnaA	0.935	1.28	0.713	1.000	2.5.1.18	NA

(a)	Cscore^EC is the confidence score for the Enzyme Commission (EC) number prediction. Cscore^EC values range from 0 to 1, where a higher score indicates a more reliable EC number prediction.
(b)	TM-score is a measure of global structural similarity between query and template protein.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(e)	Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
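
Since Cov. is the number of structurally aligned residues divided by the query length, the residue count can be recovered from the rounded value in the table. The sketch assumes the query length of 129 from the input sequence; `aligned_residue_count` is an illustrative helper.

```python
QUERY_LENGTH = 129  # residues in the input sequence


def aligned_residue_count(coverage, query_length=QUERY_LENGTH):
    """Recover the number of aligned residues from a rounded Cov. value."""
    return round(coverage * query_length)


# Cov. values taken from the enzyme homolog table above.
for hit, cov in [("2dc5A", 1.000), ("13gsA", 0.961)]:
    print(hit, aligned_residue_count(cov))
```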

Predicted Ligand Binding Sites

Template proteins with similar binding site:

Rank	Cscore^LB	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	BS-score	Lig. Name	Download Complex	Predicted binding site residues
1	0.18	6gsxB	0.951	1.03	0.729	1.000	0.17	GPS	complex1.pdb.gz	20,21,38,42,73
2	0.12	2f3mC	0.961	0.93	0.822	1.000	0.16	GTD	complex2.pdb.gz	68,72,98
3	0.11	6gsyB	0.938	1.16	0.729	1.000	0.20	GSH	complex3.pdb.gz	61,68,71
4	0.05	1aqxA	0.891	1.33	0.274	0.961	0.18	GTD	complex4.pdb.gz	5,10,11,62,71
5	0.05	1aqxB	0.888	1.36	0.274	0.961	0.18	GTD	complex5.pdb.gz	6,7,62,63

(a)	Cscore^LB is the confidence score of the predicted binding site. Cscore^LB values range from 0 to 1, where a higher score indicates a more reliable ligand-binding site prediction.
(b)	BS-score is a measure of local similarity (sequence & structure) between template binding site and predicted binding site in the query structure. Based on large scale benchmarking analysis, we have observed that a BS-score >1 reflects a significant local match between the predicted and template binding site.
(c)	TM-score is a measure of global structural similarity between query and template protein.
(d)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(e)	IDEN^a is the percentage sequence identity in the structurally aligned region.
(f)	Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
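
The predicted binding site residues are given as position numbers only. The sketch below attaches the amino acid letter to each position, assuming the numbers are 1-based indices into the input sequence shown at the top of the page; that numbering convention is an assumption, and `label_residues` is a hypothetical helper.

```python
SEQUENCE = (
    "TEEEKIRVDILENQVMDNHMELVRLCYDPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFA"
    "GDKITFVDFLAYDVLDMKRIFEPKCLDAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLF"
    "GKSATWNSK"
)


def label_residues(sequence, positions):
    """Return labels such as 'M20' for 1-based positions in sequence."""
    labels = []
    for position in positions:
        if not 1 <= position <= len(sequence):
            raise IndexError(f"position {position} is outside the sequence")
        labels.append(f"{sequence[position - 1]}{position}")
    return labels


# Rank 1 binding site (template 6gsxB, ligand GPS) from the table above.
print(label_residues(SEQUENCE, [20, 21, 38, 42, 73]))
```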

References:
1.	Wei Zheng, Chengxin Zhang, Yang Li, Robin Pearce, Eric W. Bell, Yang Zhang. Folding non-homology proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. In preparation, 2020.