Protein Alignments organised as Structural Superfamilies

Information about each superfamily:

Protein code : Each superfamily member has a unique 6-character identifier that starts with a number. The link on the code leads to details of the protein in various databases.
    Characters 1-4 : PDB code (starts with a number)
    Character 5 : Chain identifier ("-" indicates that the protein has no chains)
    Character 6 : Domain number as given in SCOP ("-" indicates that the whole protein is considered a single domain)
  For example,
    1. 1fpoa1
       1fpo -----> PDB code
       a -----> Chain A
       1 -----> Domain 1 of 1fpo
    2. 1gh6a-
       1gh6 -----> PDB code
       a -----> Chain A
       - -----> Single domain
Start residue : Number assigned (in the PDB file) to the first residue of the domain region
Start chain : Chain identifier (as mentioned in the PDB file) of the domain region
End residue : Number assigned (in the PDB file) to the last residue of the domain region
End chain : Chain identifier (as mentioned in the PDB file) of the domain region
Protein size : Size of the domain
Protein name : The common name of the protein and/or its short description (as given in the PDB file)
Source : The organism producing the protein
Superfamily code : Unique 5-digit identifier assigned to the superfamily in the SCOP database
Superfamily name : Name of the superfamily as in SCOP
Fold name : Name of the fold as in SCOP
Class name : Name of the class as given in SCOP
Number of members : Total number of domains having less than 40% sequence identity among themselves
Average size : Average size of the domains in a particular superfamily
Resolution :
R-factor :
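The protein code layout described above can be decoded mechanically. The following is a minimal sketch in Python, not part of PASS2 itself; the names ProteinCode and parse_protein_code are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class ProteinCode(NamedTuple):
    """Decoded form of a 6-character PASS2 protein code (illustrative type)."""
    pdb_code: str          # characters 1-4, e.g. "1fpo"
    chain: Optional[str]   # character 5, None when the code has "-"
    domain: Optional[int]  # character 6, None when the code has "-"


def parse_protein_code(code: str) -> ProteinCode:
    """Split a code such as "1fpoa1" into PDB code, chain and domain number."""
    if len(code) != 6:
        raise ValueError(f"expected 6 characters, got {len(code)}: {code!r}")
    pdb_code, chain, domain = code[:4], code[4], code[5]
    if not pdb_code[0].isdigit():
        raise ValueError(f"PDB code must start with a number: {pdb_code!r}")
    return ProteinCode(
        pdb_code=pdb_code,
        chain=None if chain == "-" else chain.upper(),
        # int() raises ValueError if the sixth character is not a digit
        domain=None if domain == "-" else int(domain),
    )


if __name__ == "__main__":
    for example in ("1fpoa1", "1gh6a-"):
        print(example, parse_protein_code(example))
```

The two codes used in the demonstration are the examples given in the field description; a "-" in either position is mapped to None rather than kept as a string, so callers can test for a missing chain or domain directly.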

Predict closest superfamily of your query:

A structure can be associated with its most likely superfamily using this tool. Superfamily assignment is initially done using HMMPFAM with a liberal E-value. To test structural compatibility, the representative member of the superfamily is then aligned and superposed with the uploaded structure.
To find the closest superfamily,
  1. Upload the PDB file
  2. Press the submit button
Output: The output contains the following information
  1. Protein code of the representative member, protein name, source, resolution, superfamily code (the link on the superfamily code leads to the details of the superfamily) and superfamily name
  2. Structural superposition of the query with the representative member using LSQMAN
  3. The superposed coordinates, which can be downloaded using the "Download superposed coordinates" option

Predict closest superfamily of your query sequence based on HMMs:

A query sequence can be associated with its most likely superfamily using this tool. Superfamily assignment is done using HMMPFAM with an E-value of 0.1: the query is matched against the PASS2 HMMs using this program.
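The E-value cutoff mentioned above acts as a simple filter on the list of HMM matches. The sketch below illustrates that idea only; it does not run HMMPFAM or read its output. The hit list is a made-up placeholder, the function name closest_superfamily is hypothetical, and treating a hit exactly at the cutoff as accepted is an assumption.

```python
from typing import List, Optional, Tuple

E_VALUE_CUTOFF = 0.1  # cutoff quoted in the text above

# A hit is (superfamily code, E-value); this structure is assumed for illustration.
Hit = Tuple[str, float]


def closest_superfamily(hits: List[Hit],
                        cutoff: float = E_VALUE_CUTOFF) -> Optional[Hit]:
    """Return the hit with the lowest E-value at or below the cutoff, if any."""
    accepted = [hit for hit in hits if hit[1] <= cutoff]
    if not accepted:
        return None  # no superfamily passes the cutoff
    return min(accepted, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Placeholder values, not real PASS2 results.
    example_hits = [("11111", 3e-12), ("22222", 0.05), ("33333", 2.4)]
    print(closest_superfamily(example_hits))
```

A lower E-value means a more significant match, so the best assignment is the minimum over the accepted hits.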

Align your query:

Sequence Alignment using MALIGN
One can align a sequence with the members of a given superfamily using MALIGN. To align a sequence,
  1. Enter the superfamily code
  2. Paste the sequence in PIR format
  3. Click the "Submit Query" button

Structure Based Alignment using JOY
One can align a sequence with the members of a given superfamily using JOY. The user's sequences are first aligned with the members of the superfamily using MALIGN. JOY4.0 is then used to create the final alignment in JOY format. To align using JOY,
  1. Enter the superfamily code
  2. Paste the sequence in PIR format
  3. Click the "Submit Query" button

Structural Alignment using LSQMAN
One can align and superpose a structure with the representative member of a given superfamily using LSQMAN.
  1. Enter the superfamily code
  2. Upload the PDB file
  3. Click the "Submit form" button
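Both sequence-based options above expect the query in PIR format. As a rough guide, a PIR entry has a header line beginning with ">P1;" followed by an identifier, a description line, and the sequence terminated by "*". The sketch below builds such an entry; the function name to_pir, the default description and the 60-column line width are assumptions for illustration, and the server may have additional requirements not covered here.

```python
def to_pir(identifier: str, sequence: str,
           description: str = "query sequence", width: int = 60) -> str:
    """Format a protein sequence as a PIR entry (header, description, sequence*)."""
    # Remove whitespace and line breaks, and use upper-case one-letter codes.
    cleaned = "".join(sequence.split()).upper()
    if not cleaned.isalpha():
        raise ValueError("sequence may contain letters only")
    terminated = cleaned + "*"  # PIR sequences end with an asterisk
    # Break the sequence into fixed-width lines.
    body = [terminated[i:i + width] for i in range(0, len(terminated), width)]
    return "\n".join([f">P1;{identifier}", description, *body])


if __name__ == "__main__":
    print(to_pir("query", "acde fghik"))
```

The output of this function can be pasted into the sequence box in step 2 of either procedure.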

Phylogeny of the members:

Structure-based and sequence-based phylogenetic trees are provided for superfamilies having more than two members. The sequence-based tree is built from the sequence dissimilarity matrix, while the structure-based tree is built from the RMSD matrix derived from MNYFIT.

Hidden Markov Models based on the Structural Alignments:

Structural alignments of evolutionarily related proteins have been used to build sensitive HMMs for the different superfamilies. hmmbuild has been used to build both global and local models. The basic strengths of profile HMMs are:
  1. The model can be used to search a database and/or parse sequences for the presence of similar domains
  2. Profile HMMs can be used to maintain alignments of huge numbers of sequences, starting from carefully constructed "seed" alignments of a representative set of sequences
Users can download the profiles using the links provided in pass_fs and pass_ls.

Interacting Motifs for PASS2 Members:

Interacting motifs represent equivalent conserved interacting regions of the superfamily members. Conserved regions of proteins have been determined by comparing the proteins to their homologues. Interactions between conserved stretches have been estimated from the electrostatic potential, dipole interactions, van der Waals interactions and hydrophobic interactions. The conserved interacting regions are plotted back onto the structural alignment of the superfamily. Regions sharing similar conserved interactions form the interacting motifs for a superfamily. Conserved motifs at both the protein and the superfamily level have been tabulated and reported. The motifs represent structurally and functionally important regions of a superfamily.
Output
  Protein:1vii--
  0 SXEDF start: 2 end: 7
  1 FGMTR start: 10 end: 15
  2 PXWKQ start: 21 end: 26
  3 KKEKG start: 29 end: 34
where
  1vii-- ---------> Protein code
  0 1 2 3 ---------> Motif number
  SXEDF ---------> Interacting motif
  start ---------> Starting position of the motif in the COMPARER alignment
  end ---------> End position of the motif in the COMPARER alignment
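For users who download the motif output and want to process it further, the layout shown above can be read with a short script. This is a sketch under the assumption that the output follows exactly the pattern of the example (a "Protein:" label followed by entries of the form "number motif start: N end: N"); the names Motif and parse_motif_output are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Motif(NamedTuple):
    number: int    # motif number within the protein
    sequence: str  # interacting motif
    start: int     # start position in the COMPARER alignment
    end: int       # end position in the COMPARER alignment


PROTEIN_RE = re.compile(r"Protein:\s*(\S+)")
MOTIF_RE = re.compile(r"(\d+)\s+([A-Z]+)\s+start:\s*(\d+)\s+end:\s*(\d+)")


def parse_motif_output(text: str) -> Tuple[str, List[Motif]]:
    """Extract the protein code and its motifs from the motif output text."""
    protein = PROTEIN_RE.search(text)
    if protein is None:
        raise ValueError("no 'Protein:' label found")
    motifs = [Motif(int(n), seq, int(start), int(end))
              for n, seq, start, end in MOTIF_RE.findall(text)]
    return protein.group(1), motifs


if __name__ == "__main__":
    example = ("Protein:1vii-- 0 SXEDF start: 2 end: 7 1 FGMTR start: 10 end: 15 "
               "2 PXWKQ start: 21 end: 26 3 KKEKG start: 29 end: 34")
    code, motifs = parse_motif_output(example)
    print(code)
    for motif in motifs:
        print(motif)
```

The regular expression works whether the entries are on one line or on separate lines, since it only relies on whitespace between the fields.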

Improved search facility:

The database can be browsed efficiently using several search options. The database may be queried using the superfamily code, superfamily name, fold name, class name, PDB code and keywords.
  1. The search pattern should contain at least four characters.
  2. The protein code should contain at least four characters.
  3. Select the search option.
Search by superfamily code : To search by superfamily code, enter a valid superfamily code (5-digit code as mentioned in SCOP), select the "Superfamily code" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
Search by superfamily name : To search by superfamily name, enter a valid superfamily name (as mentioned in SCOP), select the "Superfamily name" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
Search by fold name : To search by fold name, enter a valid fold name (as mentioned in SCOP), select the "Fold name" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
Search by class name : To search by class name, enter a valid class name (as mentioned in SCOP), select the "Class name" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
PASS2 contains superfamilies from the following 7 major classes:
  1. All alpha proteins
  2. All beta proteins
  3. Alpha and beta proteins (a/b)
  4. Alpha and beta proteins (a+b)
  5. Multi-domain proteins (alpha and beta)
  6. Membrane and cell surface proteins and peptides
  7. Small proteins

Search by PDB code : To search by PDB code, enter a valid PDB code (starts with a number), select the "PDB Code" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
Search by keyword : To search by keyword, enter a keyword, select the "Key word" option from the radio buttons given, then submit the form. The result page contains the superfamily code (with link) and superfamily name. The chosen entry can be reached using the link provided on the superfamily code.
Search facility in all pages : Every page in the database has a search box. You can search for any word that appears in a PASS2 entry (titles, comments, PDB entries, superfamily code, superfamily name, fold name and class name).
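The input rules stated in this section (at least four characters, a 5-digit superfamily code, a PDB code that starts with a number) can be checked before a query is submitted. The sketch below is illustrative only and is not the server's own validation; the option names in SEARCH_FIELDS and the function validate_query are hypothetical.

```python
import re
from typing import List

# Hypothetical internal names for the search options listed above.
SEARCH_FIELDS = {"superfamily_code", "superfamily_name", "fold_name",
                 "class_name", "pdb_code", "keyword"}
MIN_PATTERN_LENGTH = 4  # "at least four characters"


def validate_query(field: str, pattern: str) -> List[str]:
    """Return a list of problems with the query; an empty list means it is valid."""
    problems = []
    text = pattern.strip()
    if field not in SEARCH_FIELDS:
        problems.append(f"unknown search option: {field}")
    if len(text) < MIN_PATTERN_LENGTH:
        problems.append("search pattern must contain at least four characters")
    if field == "superfamily_code" and not re.fullmatch(r"\d{5}", text):
        problems.append("superfamily code must be a 5-digit number")
    if field == "pdb_code" and not text[:1].isdigit():
        problems.append("PDB code must start with a number")
    return problems


if __name__ == "__main__":
    print(validate_query("pdb_code", "1fpo"))  # valid query
    print(validate_query("keyword", "abc"))    # too short
```

Returning a list of messages rather than a single true/false value lets a form report every problem with the query at once.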

Genome Distribution:

A search for homologues has been performed using PSI-BLAST and Hmmsearch. Both methods are sensitive in identifying distant homologues. The assigned domains have been aligned using ClustalW. The first 10 hits have been aligned with the structural members of PASS2 and are displayed in the JOY format as shown below. One may download the hits obtained from different genomes.

1grj-1 ( 2 ) qaipmtlrgAeklreeldflksvrrpeIiaaIaeArehgdlkenaeyhaa    Structural member

NP-212 ( 2 ) LNKKQKELKYLKEVEIPENSKEIGKARELGDLKENAEYHSA    Genome hit1

NP-218 ( 2 ) RGLVVTAKMLNAKKKELQDLLDVRIPENSREIGRALELGDLRENAEYKAA    Genome hit2

NP-224 ( 2 ) TSESFSRMKAKLQSLVGKEMVDNAKEIEDARSLGDLRENSEYKFA    Genome hit3

NP-296 ( 2 ) TSESFTRMKAKLQSLIGKEMVDNAKEIEDARALGDLRENSEYKFA    Genome hit4

NP-220 ( 2 ) TSDSFTRMKNKLQSLVGKEMVENAKEIEDARALGDLRENSEYKIA    Genome hit5

AAL576 ( 2 ) REVKLTKAGYERLMKQLEQ-ERERLQEATKILQELMESSDDYDDSGLEAA    Genome hit6

AAL576 ( 2 ) REVKLTKAGYERLMQQLER-ERERLQEATKILQELMESSDDYDDSGLEAA    Genome hit7

NP-295 ( 2 ) KQVRLTREGFERLEKALEQ-EQNRLNEATRILQEQMETSADNEDTGLEDA    Genome hit8

NP-296 ( 2 ) QPLPLTPEGLSRLQAALER-EQARREEARRVVQEQME-ANENESLDLAAA    Genome hit9

AAL576 ( 2 ) KPVYLTPEGFRRLQEELNHLKTTKRQEISADFEQALEEGDLRENAGYDEA    Genome hit10

bbbbaaaaaaaaaaaaaaaa aaaaaaaaaaaaa 333 aaaaaa


Your suggestions and requests for further clarification are most welcome.
Please send a mail to mini@ncbs.res.in
PASS2 group : Dr. R. Sowdhamini, Anirban Bhaduri, Pugalenthi G