MegaMotifbase is a database of structural motifs from protein structures at the family and superfamily level. Such motifs among structurally aligned proteins are recognized by the conservation of amino acid preference and solvent inaccessibility and are examined for the conservation of other important structural features like secondary structural content, hydrogen bonding pattern and residue packing.
Motif identification for multiple structures
Structural motifs are identified by the presence of at least three consecutive solvent-buried (inaccessible) residues that have high amino acid exchange scores. The conservation of further structural parameters, namely secondary structural content, hydrogen bonding and residue packing (Ooi number; Nishikawa and Ooi, 1986), is also examined among the structurally aligned proteins. A structural feature is considered conserved at an alignment position if it is present in all, or all but one, of the members of the alignment.
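The identification rule can be sketched in a few lines of Python. This is a minimal illustration, not the database's own code. The function names, the input layout (one list of burial flags per alignment position, one exchange score per position) and the `score_cutoff` parameter are assumptions introduced here. The text does not state a numeric cutoff for a "high" exchange score, so it is left to the caller.

```python
from typing import List, Sequence, Tuple


def is_conserved(flags: Sequence[bool]) -> bool:
    """True if the feature is present in all, or all but one, of the members."""
    present = sum(1 for f in flags if f)
    # max(..., 1) avoids calling a feature conserved when no member has it.
    return present >= max(len(flags) - 1, 1)


def find_motifs(buried: Sequence[Sequence[bool]],
                exchange_scores: Sequence[float],
                score_cutoff: float,
                min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) alignment positions of candidate motifs.

    buried[pos] holds one burial flag per aligned protein (hypothetical layout).
    exchange_scores[pos] is the exchange score at that position.
    score_cutoff is an assumed, caller-supplied threshold.
    """
    motifs = []
    run_start = None
    n = len(exchange_scores)
    # Iterate one step past the end so that a run reaching the last
    # position is closed by the same code path as any other run.
    for pos in range(n + 1):
        ok = (pos < n
              and is_conserved(buried[pos])
              and exchange_scores[pos] >= score_cutoff)
        if ok and run_start is None:
            run_start = pos
        elif not ok and run_start is not None:
            if pos - run_start >= min_length:
                motifs.append((run_start, pos - 1))
            run_start = None
    return motifs
```

Positions are zero-based and the returned ranges are inclusive. Runs shorter than `min_length` are discarded, which mirrors the "at least three consecutive residues" requirement.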
Solvent accessibility is measured using the PSA program from the JOY4.0 suite (Mizuguchi et al., 1998). Residues with an accessible surface area of less than 7% are treated as solvent buried (inaccessible). At every alignment position, all possible pairs of proteins and their observed amino acids are scored using a standard 20x20 substitution matrix (Johnson and Overington, 1993) derived from structure-based sequence alignments of homologous protein families. The SSTRUC program, also part of the JOY4.0 suite, is used to assign secondary structure, and the HBOND program from the same suite is used to identify hydrogen bonds. Residue packing is measured by the Ooi number, which counts the residues surrounding the Cα atom of each residue in a protein. A higher Ooi number indicates that the residue is in a well-packed environment.
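As an illustration of the packing measure, the sketch below counts, for each residue, the other Cα atoms that lie within a given radius of its own Cα atom. It is a hedged example rather than the procedure used by the database. The function name is hypothetical, the radius is left as a required parameter because the text does not state which value is used, and excluding the residue itself from the count is an assumption.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def ooi_numbers(ca_coords: Sequence[Coord], radius: float) -> List[int]:
    """Count the neighbouring C-alpha atoms within `radius` of each C-alpha atom.

    ca_coords: one (x, y, z) coordinate per residue, in Angstrom.
    radius:    cutoff distance in Angstrom (assumed parameter, not taken
               from the text).
    The residue itself is not counted (assumption).
    """
    counts = []
    for i, a in enumerate(ca_coords):
        neighbours = sum(
            1
            for j, b in enumerate(ca_coords)
            if j != i and math.dist(a, b) <= radius
        )
        counts.append(neighbours)
    return counts
```

The double loop is quadratic in the number of residues, which is adequate for single domains. A spatial grid or k-d tree would be the usual choice for larger structures.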
References:
Johnson, M. S. and Overington, J. P. (1993) A structural basis for sequence comparisons. An evaluation of scoring methodologies. J. Mol. Biol., 233, 716-738.
Mizuguchi, K. et al. (1998) JOY: protein sequence-structure representation and analysis. Bioinformatics, 14, 617-623.
Nishikawa, K. and Ooi, T. (1986) Radial locations of amino acid residues in a globular protein: correlation with the sequence. J. Biochem. (Tokyo), 100, 1043-1047.
Motif identification for single structure
If there are no structural homologs, sequence homologs are obtained from the SWISSPROT database and aligned using the MALIGN program. Each sequentially conserved region is mapped and then filtered by its structural feature content score. The underlying assumption is that regions that are both highly conserved in sequence and rich in important structural features would also be conserved in those structural features.
Spatial Distance between motifs
The structural motif regions are transformed into a vector representation by the least-squares fit method (Chou et al., 1984; Srinivasan et al., 1991). Spatial distances for all the motifs are calculated and represented in the form of a matrix.
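A rough sketch of the distance matrix is given below. A least-squares line fitted to a set of points passes through their centroid, so the centroid of each motif's Cα coordinates is used here as the reference point on its vector. That choice is an assumption: the text does not say between which points of the motif vectors the distance is measured. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def centroid(points: Sequence[Coord]) -> Coord:
    """Mean position of a motif's C-alpha atoms."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def distance_matrix(motifs: Sequence[Sequence[Coord]]) -> List[List[float]]:
    """Symmetric matrix of distances between motif reference points.

    motifs: one list of C-alpha coordinates per motif.
    The reference point of each motif is its centroid (assumption).
    """
    centres = [centroid(m) for m in motifs]
    return [[math.dist(a, b) for b in centres] for a in centres]
```

For a domain with n motifs the result is an n x n matrix with zeros on the diagonal.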
Average Torsion Angle between the Motifs
The structural motif regions are transformed into a vector representation by the least-squares fit method (Chou et al., 1984; Srinivasan et al., 1991). Virtual torsion angles for all the motifs are calculated and represented in the form of a matrix.
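The sketch below shows one way such a matrix could be computed. It is not the method of the cited references. Each motif is fitted with a least-squares line (the dominant eigenvector of the coordinate scatter matrix, found by power iteration), and the virtual torsion angle between two motifs is taken as the dihedral angle of the two axis directions about the line joining the motif centroids. This definition of the angle, the N-to-C orientation of each axis and all function names are assumptions made for illustration.

```python
import math


def _sub(a, b):
    return [a[k] - b[k] for k in range(3)]


def _dot(a, b):
    return sum(a[k] * b[k] for k in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return [x / norm for x in a]


def fit_axis(points, iterations=100):
    """Least-squares line through the points: (centroid, unit direction).

    Assumes at least two points and that the first and last differ.
    """
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [_sub(p, c) for p in points]
    scatter = [[sum(q[a] * q[b] for q in centered) for b in range(3)]
               for a in range(3)]
    chain = _sub(points[-1], points[0])  # first-to-last direction, start guess
    v = _unit(chain)
    for _ in range(iterations):          # power iteration
        v = _unit([_dot(row, v) for row in scatter])
    if _dot(v, chain) < 0:               # keep the axis pointing first to last
        v = [-x for x in v]
    return c, v


def virtual_torsion(axis_a, axis_b):
    """Signed dihedral angle (degrees) between two axes about the line
    joining their centroids (assumed definition)."""
    (ca, ua), (cb, ub) = axis_a, axis_b
    link = _unit(_sub(cb, ca))
    # Components of each direction perpendicular to the connecting line.
    va = _sub(ua, [_dot(ua, link) * x for x in link])
    vb = _sub(ub, [_dot(ub, link) * x for x in link])
    return math.degrees(math.atan2(_dot(_cross(link, va), vb), _dot(va, vb)))


def torsion_matrix(motifs):
    """Matrix of virtual torsion angles between all pairs of motifs."""
    axes = [fit_axis(m) for m in motifs]
    n = len(axes)
    return [[0.0 if i == j else virtual_torsion(axes[i], axes[j])
             for j in range(n)] for i in range(n)]
```

With this convention parallel motifs give an angle near 0 degrees and antiparallel motifs an angle near 180 degrees. The angle is undefined when an axis lies along the line joining the two centroids, a case the sketch does not handle.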
References:
Chou, K. C., Némethy, G. and Scheraga, H. A. (1984) J. Am. Chem. Soc., 106, 3161-3170.
Srinivasan, N., Sowdhamini, R., Ramakrishnan, C. and Balaram, P. (1991) In Balaram, P. and Ramaseshan, S. (eds), Molecular Conformation and Biological Interactions. Indian Academy of Sciences, Bangalore, pp. 59-73.
View Motifs on the structures
Chime view
1. Click on the link "Chime view".
2. The Chime window displays the 3D structure of the domain, colored by protein chain.
3. Right-click in the Chime window to access the other Chime options.
Rasmol view
* Click on the link "Rasmol".
* Save file to disk.
* Make sure that the file (for example, rasmol.cgi) has been saved properly.
How to view (Linux)
First type rasmol on the command line. The RasMol prompt will appear on the screen. Then type script rasmol.cgi at the prompt:
$ rasmol
RasMol>
RasMol> script rasmol.cgi
Scan motifs for similar sequence search:
The database provides an option to search for similar sequences by scanning the structural motifs with the SCANMOT algorithm. Please click here for more information.
Scan motifs in 3D structure:
Options are provided for scanning multiple structural motifs along with their spatial orientation in a given query protein structure. This could be very useful in protein classification and assignment of family or superfamily relationship to newly solved protein structures with unknown function.
The query structure is assigned grade A, B or C based on two similarity scores: the similarity between the distance matrices of the query and the representative structure(s), and the similarity between their torsion angle matrices.
Grades:
TA=Similarity between the torsion angle matrices of the query and representative structure(s)
SD=Similarity between the distance matrices of the query and representative structure(s)
Condition                                         Grade
if(TA>=80%) and (SD>=80%)                         A
if(TA>=60% and TA<80%) and (SD>=60% and SD<80%)   B
if(TA>=60% and TA<80%) and (SD>=60%)              B
if(TA>=60%) and (SD>=60% and SD<80%)              B
if(TA>=40% and TA<60%) and (SD>=40% and SD<60%)   C
if(TA>=40% and TA<60%) and (SD>=40%)              C
if(TA>=40%) and (SD>=40% and SD<60%)              C
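Taken together, the conditions above amount to grading by the lower of the two scores, which the following sketch makes explicit. The function name is hypothetical, every threshold is read as a percentage, and returning None when either score falls below 40% is an assumption, since the table does not define a grade for that case.

```python
from typing import Optional


def assign_grade(ta: float, sd: float) -> Optional[str]:
    """Grade a query structure from its two similarity scores (in percent).

    ta: similarity between the torsion angle matrices.
    sd: similarity between the distance matrices.
    Returns "A", "B" or "C", or None when no condition in the table
    matches (assumed behaviour for scores below 40%).
    """
    if ta >= 80 and sd >= 80:
        return "A"
    # Both scores are at least 60, but at least one is below 80.
    if ta >= 60 and sd >= 60:
        return "B"
    # Both scores are at least 40, but at least one is below 60.
    if ta >= 40 and sd >= 40:
        return "C"
    return None
```

For example, a query with TA = 85% and SD = 70% falls under grade B, because the lower score decides the grade.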
Search the database:
The database can be browsed efficiently using several search facilities. It may be queried by SCOP superfamily code, SCOP superfamily name, HOMSTRAD family code, HOMSTRAD family name, PDB code or keyword.
Search by SCOP superfamily/HOMSTRAD family code:
To search by superfamily/family code, enter a valid superfamily code (the 5-digit code used in SCOP), select the "Superfamily code" option from the radio buttons, then submit the form. The result page lists the superfamily/family code (as a link) and the superfamily/family name. Follow the link on the code to reach the chosen entry.
Search by superfamily/family name:
To search by name, enter a valid superfamily name (as given in SCOP) or family name (as given in HOMSTRAD), select the "Superfamily name/Family name" option from the radio buttons, then submit the form. The result page lists the superfamily/family code (as a link) and the superfamily/family name. Follow the link on the code to reach the chosen entry.
Search by PDB code:
To search by PDB code, enter a valid PDB code (which starts with a number), select the "PDB Code" option from the radio buttons, then submit the form. The result page lists the superfamily/family code (as a link) and the superfamily/family name. Follow the link on the code to reach the chosen entry.
Search by keyword:
To search by keyword, enter a keyword, select the "Key word" option from the radio buttons, then submit the form. The result page lists the superfamily/family code (as a link) and the superfamily/family name. Follow the link on the code to reach the chosen entry.
Contact:
Dr. R. Sowdhamini (mini@ncbs.res.in)
Dr. Saikat Chakrabarti
Dr. PN. Suganthan
G. Pugalenthi