

C-QUARK is a de novo protein structure prediction pipeline. Starting from a query sequence, multiple sequence alignments (MSAs) are generated by DeepMSA, which searches sequentially through two whole-genome sequence databases (UniClust30 and UniRef90) and a metagenome sequence database (Metaclust). Next, contact predictions are generated using ten state-of-the-art contact predictors: NeBcon, ResPRE, DeepPLM, DeepCov, DeepContact, DNCON2, MetaPSICOV2, GREMLIN, CCMpred and FreeContact.

The query sequence is also scanned through a non-redundant set of 6,023 high-resolution PDB structures by gapless threading to create a set of position-specific fragment structures with continuous lengths ranging from 1 to 20 residues. A histogram of distances dij for each residue pair (i and j) of the query is derived from the top 200 fragments at the i-th and j-th positions, counting only fragment pairs that come from the same PDB structure. A histogram with a peak at dij < 9 Å is converted to a distance profile for the residue pair.

The distance profiles and contact-map restraints are combined with the inherent knowledge-based and physical energy terms and used to guide the fragment assembly into full-length models by replica-exchange Monte Carlo (REMC) simulations. For each target, C-QUARK runs five REMC simulations starting from different random numbers. Each simulation uses forty replicas, and the conformations in adjacent replicas are swapped following the Metropolis criterion after each cycle of Monte Carlo (MC) movements. By default, the simulations run for five hundred cycles. Next, “decoy” conformations from the simulation trajectories are clustered by SPICKER to identify the largest clusters, which correspond to the lowest free-energy states. The cluster centroids are further refined by ModRefiner and FASPR to obtain the final models. The flowchart is depicted in Fig. 1.
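
The replica swap described above can be illustrated with a minimal Python sketch of the textbook Metropolis exchange rule, in which a swap between replicas at inverse temperatures β_i and β_j with energies E_i and E_j is accepted with probability min(1, exp[(β_i − β_j)(E_i − E_j)]). This is not the C-QUARK source code: the function names, the geometric temperature ladder, and the temperature range are assumptions made for illustration only.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Textbook Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_adjacent_swaps(conformations, energies, betas, rng):
    """Try one swap per adjacent replica pair, in place; return the number accepted.

    Visiting pairs (0,1), (1,2), ... in order is a simplification chosen for
    this sketch; the order used by C-QUARK is not stated in the text.
    """
    accepted = 0
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            conformations[k], conformations[k + 1] = conformations[k + 1], conformations[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted += 1
    return accepted


def geometric_ladder(t_min, t_max, n_replicas):
    """Hypothetical temperature ladder with a constant ratio between neighbours."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


if __name__ == "__main__":
    rng = random.Random(1)
    temperatures = geometric_ladder(1.0, 10.0, 40)  # forty replicas, assumed range
    betas = [1.0 / t for t in temperatures]
    energies = [rng.uniform(-100.0, 0.0) for _ in betas]  # placeholder energies
    conformations = list(range(len(betas)))  # stand-ins for real structures
    print(attempt_adjacent_swaps(conformations, energies, betas, rng), "swaps accepted")
```

In a full simulation, each swap attempt would follow a cycle of Monte Carlo moves within every replica, so that low-energy conformations found at high temperature can migrate toward the low-temperature replicas.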



Figure 1. Pipeline of C-QUARK.


C-QUARK: Understanding the result output.
    The full modeling result for each job is available as a tarball download at the top of the output webpage. The format of each file in the tarball is described below:
  • seq.txt: Input sequence in FASTA format.
  • model1.pdb - model10.pdb: Top 10 final structure models in PDB format. If the structure modelling has high confidence, there might be fewer than 10 final models.
  • seq.dat.ss: Three-state secondary structure prediction in the following format:
             position (starting from 1), amino acid type, predicted secondary structure type, confidence as random coil, helix, β-strand.
  • turn.txt: β turn prediction in the following format:
    position (starting from 0), unused column, confidence value c (in [-1,1]; the residue is predicted as a β turn when c>0).
  • contact.txt: Distance profile derived from fragments in the following format:
             position i (starting from 0), position j (starting from 0), avg., std., number of fragment pairs, 0.5Å, 1.0Å, 1.5Å, 2.0Å, 2.5Å, 3.0Å, 3.5Å, 4.0Å, 4.5Å, 5.0Å, 5.5Å, 6.0Å, 6.5Å, 7.0Å, 7.5Å, 8.0Å, 8.5Å, 9.0Å, peak position in array, number of fragment pairs in peak, atomic distance for fragment pairs in peak.
  • topdh.topdh: Clustered torsion angle pairs from fragments in the following format:
             position (starting from 0), cluster number.
             amino acid, Cα x, Cα y, Cα z, secondary structure, Cα φ, Cα-Cα, Cα-Cα-Cα, ψ, CN, Cα-CN, ω, N-Cα, CN-Cα, φ, Cα-C, N-Cα-C, position in template, template name, accumulated probability.
  • phi.txt, psi.txt: Predicted φ and ψ backbone torsion angles in the following format:
             position (starting from 1), predicted value t in [-180°,180°]
  • sol.txt: Predicted solvent accessibility in the following format:
             position, unused column, predicted value s (in [-1,1])
  • alldecoy.pdb: All 5000 decoy structures from C-QUARK simulation in multi-model PDB format. Only Cα is included.
  • contact.map: Selected residue contacts used as restraints for C-QUARK simulation. This file is a consensus combination of in-house programs (ResPRE, NeBcon, RICMAP, DeepPLM, DeepPRE2) and third-party predictors (DeepCov, DeepContact, DNCON2, GREMLIN, MetaPSICOV, MetaPSICOV2).
  • cscore.txt: Confidence score (C-score) and estimated TM-score for C-QUARK models. A higher C-score and estimated TM-score usually indicate better model quality. Note that, in general, structure quality estimation is only accurate for the first model. The first model is on average the most reliable and should be preferred unless there is a specific reason to choose another (e.g., biological common sense or experimental data).
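
As an illustration of working with the files listed above, the following Python sketch reads the Cα coordinates of every decoy from a multi-model PDB file such as alldecoy.pdb. It assumes the file follows the standard fixed-column PDB layout, with MODEL/ENDMDL records delimiting each decoy and coordinates in columns 31-54 of each ATOM record. The function name `read_ca_models` is hypothetical and is not part of the C-QUARK package.

```python
def read_ca_models(lines):
    """Return a list of models, each a list of (x, y, z) tuples for CA atoms.

    Assumes standard PDB columns: atom name in columns 13-16 and
    x, y, z in columns 31-38, 39-46 and 47-54 (1-based).
    """
    models = []
    current = []
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            current = []  # start a new decoy
        elif record == "ATOM" and line[12:16].strip() == "CA":
            current.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
        elif record == "ENDMDL":
            models.append(current)  # close the current decoy
            current = []
    if current:
        # Tolerate a file whose last model lacks an ENDMDL record.
        models.append(current)
    return models


if __name__ == "__main__":
    with open("alldecoy.pdb") as handle:
        decoys = read_ca_models(handle)
    print(len(decoys), "decoys read")
```

Because the function accepts any iterable of lines, it can also be applied to an in-memory list of strings or to the individual model files.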

C-QUARK: Advanced Options.
  • Assign contact map: Users can supply their own contact map for the C-QUARK simulation in a two-column format, where each line gives the residue indices of a pair of residues that are in contact (i.e., the Cβ distance between the two residues is <8 Å).
  • Exclude fragments: C-QUARK models are built from small fragments (1-20 residues long) taken from known PDB structures (templates). If "remove fragments from protein sharing >30% sequence identity with target" is chosen, fragments will not be generated from template structures that are highly homologous to the target sequence. In general, excluding homologous templates makes structure prediction harder, so this option is intended only for benchmarking purposes.
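
A two-column contact map of the kind described under "Assign contact map" could be produced from known Cβ coordinates with a sketch like the one below, which applies the <8 Å Cβ distance definition given above. Several details are assumptions, since the page does not specify them: residue indices are written starting from 1, the two columns are separated by a single space, and pairs of sequence neighbours are included unless `min_separation` is raised. The function names are hypothetical.

```python
import math


def contacts_from_cb(cb_coords, cutoff=8.0, min_separation=1):
    """Return (i, j) residue pairs, i < j, whose C-beta atoms are closer than cutoff.

    Indices are 1-based (an assumption; check what the server expects).
    """
    pairs = []
    n = len(cb_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(cb_coords[i], cb_coords[j]) < cutoff:
                pairs.append((i + 1, j + 1))
    return pairs


def format_contact_map(pairs):
    """Render the pairs as two-column text, one residue pair per line."""
    return "".join("{} {}\n".format(i, j) for i, j in pairs)


if __name__ == "__main__":
    # Toy coordinates in Angstroms; real input would come from a structure file.
    coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(format_contact_map(contacts_from_cb(coords)), end="")
```

For glycine residues, which have no Cβ atom, a common convention is to use the Cα coordinates instead; whether C-QUARK does so is not stated here.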



zhanglabzhanggroup.org | +65-6601-1241 | Computing 1, 13 Computing Drive, Singapore 117417