SMotif
SMotif is a server for identifying sets of structural motifs in protein structures. Motifs shared among structurally aligned members of a protein superfamily are recognized by the conservation of amino acid preference and solvent inaccessibility, and are then examined for the conservation of other features such as secondary structural content, hydrogen bonding and residue packing.
Input 1: Paste Alignment (PIR/FASTA format)
1. The input alignment should be in PIR/FASTA format.
2. The alignment should be a structure-based sequence alignment.
3. The first line should start with >P1; (in PIR format) or > (in FASTA format), followed by the PDB code. For example,
>P1;1h7w or
>1h7w
4. Chain identifiers can optionally be specified in the first line, as shown below.
>P1;1h7w :A: or
>1h7w :A:
If the sequence has more than one chain, the chain identifiers should be separated by commas.
>P1;1gg6 :A,B,C: or
>1gg6 :A,B,C:
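The header rules above can be checked before submission. The following is a minimal sketch, assuming the header consists only of the optional `P1;` prefix, the PDB code, and an optional chain field separated from the code by whitespace. The function name `parse_header` and the regular expression are illustrative assumptions and are not part of SMotif.

```python
import re

# Hypothetical pattern that mirrors the header rules listed above.
# It is an assumption for illustration, not the server's own parser.
HEADER_RE = re.compile(
    r"^>(?P<pir>P1;)?(?P<code>[^\s:]+)(?:\s+:(?P<chains>[^:]*):)?\s*$"
)


def parse_header(line):
    """Return (format, code, chains) for a PIR or FASTA header line."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid header line: {line!r}")
    fmt = "PIR" if match.group("pir") else "FASTA"
    chains_field = match.group("chains")
    chains = []
    if chains_field:
        # Chain identifiers are comma separated, e.g. "A,B,C".
        chains = [c.strip() for c in chains_field.split(",") if c.strip()]
    return fmt, match.group("code"), chains
```

For instance, `parse_header(">P1;1gg6 :A,B,C:")` yields the PIR format tag, the code `1gg6`, and the three chain identifiers, while a header without a chain field yields an empty chain list.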
PIR format:
Example 1
>P1;1vin
sequence
---DIHTYLREMEVKCK-PKVGYMKKQP--------DIT----NSMRAILVDWLVEVGEE
Y--KLQNETLHLAVNYIDRFLSSMSVLRGKLQLVGTAAMLLASKFEEIYPPEVAEFVYIT
D-----DTYTKKQVLRMEHLVLKVLAFDLAAPTINQFLTQYFLHQQ---PA---NCKVES
LAMFLGELSLIDADPYLKYLPSVIAAAAFHLALYTV-TGQS-WPESLVQKT-----GYTL
ETLKPCLLDLHQTYLRAPQHAQQSIREKYKNSKYHGVSLLNPPETLNL*
>P1;1bu2a
sequence
---RVLNNLKLRELLLP-KFTSLWEIQT--------EVT----VDNRTILLTWMHLLCES
F--ELDKSVFPLSVSILDRYLCKKQGTKKTLQKIGAACVLIGSKIRTVKPMTVSKLTYLS
C-----DCFTNLELINQEKDILEALKWDTEAVLATDFLIPLCNALK--IPE-DLWPQLYE
AASTTICKALIQP-NIALLSPGLICAGGLLTTIETDNTNCRPWTCYLEDLS-----S-IL
NFSTNTVRTVKDQVSEAFSLYD--------------------------*
>P1;1jkw
sequence
WTFSSEEQLARLRADANRKFRCKAVANGKVLPNDPVFLEPHEEMTLCKYYEKRLLEFCSV
FKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFN-VSSPQFVGNL
RESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPILENPEILRK
TADDFLNRIALTD-AYLLYTPSQIALTAILSSASRA-GITME--SYLSESLMLKENRTCL
SQLLDIMKSMRNLVKKYE--PPRSEEVAVLKQKLDRCHSA----ELAL*
Example 2
>P1;1csz
structure
GSRRASVGSHEKMPWFHGKISREESEQIVLIGSKTNGKFLIRARD-NNGSYALCLLHEGK
VLHYRIDKDKTGKLSIPEGKKFDTLWQLVEHYSY--KADGL----LRVLTVPCQKIGTQ*
>P1;1aya
structure
-----------MRRWFHPNITGVEAENLLLT-RGVDGSFLARPSKSNPGDFTLSVRRNGA
VTHIKIQNTG-DYYDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELKYPLN-----*
>P1;1mil
structure
------GSQLRGEPWFHGKLSRREAEALLQ----LNGDFLVRESTTTPGQYVLTGLQSGQ
PKHLLLVDPE-GVVRT-KDHRFESVSHLISYHMDNHLPIISA-GSELCLQQPVERKL--*
Example 3
>P1;1ptoa-
structure
---------------------------------DPPATVYRYDS-RPPEDVFQ-NGFTAW
GNNDNVLEHL----------TGRSCQVGSSNSAFVSTSSSRRYTEVYLEHRMQEAVEAER
AGRGTGHFIGYIYEVRADN--NFYGAASSYFEYVDTYG----------------------
DNAGRILAGALATY-----QSEYLAHR--RIPPENIRRV-TRVYHNGITGETTTTEYSNA
RYVSQQTRANPNPYTSRRSVASIVGTLVRMAPVVGACMARQAESSEEAMVLVYYESIAYS
F*
>P1;1ltaa-
structure
----------------------------------NGDRLYRADS-RPPDEIKRSGGLMPR
GHNEYFDRGTQMNINLYDHARGTQTGFVRYDDGYVSTSLSLRSAHLAGQSI---------
---LSGYSTYYIYVIATAP--NMFNVNDVLG-----------------------------
-------VYSPHPY-----EQEVSALG--GIPYSQIYGW-YRVNFGVIDERLHRNREYRD
RYYRNLNIAPAEDGYRLAGFPPDHQAWREEPWIHHAPQGCG-------------------
-*
>P1;1ddt-3
structure
--------------------GADDVVDSSKSFVMENFSSYHGTKPGYVDSIQ--KGIQKP
KSGTQ--------------------GNYDDDWKGFYSTDNKYDAAGYSVDNE--------
--NPLSGKAGGVVKVTYPG--LTKVLALKVDNAETIKKELGLSLTEPLMEQVGTEEFIKR
FGDGASRVVLSLPFAEGSSSVEYINNWEQAKAL-SVELE-INFETRGKRGQDAMYEYMAQ
ACA---------------------------------------------------------
-*
>P1;1dmaa-
structure
FLGDGGDVSFSTRGTQNWTVERLLQAHRQLEERGYVFVGYHGTFLEAAQSIVF-GGVRAA
------------------------------IWRGFYIAGDPALAYGYAQDQEP-------
-DARGRIRNGALLRVYVPRSSLPGFYRTSLTLAAPEAAGEVE--------------RLIG
HPLPLRLDAITGPEEE-GGRLETILGWPLAERT-VVIPSAIPTDPRNVGGDLDPSSIPDK
EQAISALPDYASQPGKPPR-----------------------------------------
-*
FASTA format:
Example 1
>1csz
GSRRASVGSHEKMPWFHGKISREESEQIVLIGSKTNGKFLIRARD-NNGSYALCLLHEGK
VLHYRIDKDKTGKLSIPEGKKFDTLWQLVEHYSY--KADGL----LRVLTVPCQKIGTQ
>1aya
-----------MRRWFHPNITGVEAENLLLT-RGVDGSFLARPSKSNPGDFTLSVRRNGA
VTHIKIQNTG-DYYDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELKYPLN-----
>1mil
------GSQLRGEPWFHGKLSRREAEALLQ----LNGDFLVRESTTTPGQYVLTGLQSGQ
PKHLLLVDPE-GVVRT-KDHRFESVSHLISYHMDNHLPIISA-GSELCLQQPVERKL--
Example 2
>1vin
---DIHTYLREMEVKCK-PKVGYMKKQP--------DIT----NSMRAILVDWLVEVGEE
Y--KLQNETLHLAVNYIDRFLSSMSVLRGKLQLVGTAAMLLASKFEEIYPPEVAEFVYIT
D-----DTYTKKQVLRMEHLVLKVLAFDLAAPTINQFLTQYFLHQQ---PA---NCKVES
LAMFLGELSLIDADPYLKYLPSVIAAAAFHLALYTV-TGQS-WPESLVQKT-----GYTL
ETLKPCLLDLHQTYLRAPQHAQQSIREKYKNSKYHGVSLLNPPETLNL
>1bu2a
---RVLNNLKLRELLLP-KFTSLWEIQT--------EVT----VDNRTILLTWMHLLCES
F--ELDKSVFPLSVSILDRYLCKKQGTKKTLQKIGAACVLIGSKIRTVKPMTVSKLTYLS
C-----DCFTNLELINQEKDILEALKWDTEAVLATDFLIPLCNALK--IPE-DLWPQLYE
AASTTICKALIQP-NIALLSPGLICAGGLLTTIETDNTNCRPWTCYLEDLS-----S-IL
NFSTNTVRTVKDQVSEAFSLYD--------------------------
>1jkw
WTFSSEEQLARLRADANRKFRCKAVANGKVLPNDPVFLEPHEEMTLCKYYEKRLLEFCSV
FKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFN-VSSPQFVGNL
RESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPILENPEILRK
TADDFLNRIALTD-AYLLYTPSQIALTAILSSASRA-GITME--SYLSESLMLKENRTCL
SQLLDIMKSMRNLVKKYE--PPRSEEVAVLKQKLDRCHSA----ELAL
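The PIR and FASTA examples above share a simple record layout: a header line, an extra description line in PIR only, and sequence lines that end with `*` in PIR. As a rough sketch under those assumptions, the code below reads records in either format, reports the aligned lengths (which should all be equal in a multiple alignment), and strips gap characters. The function names are hypothetical and this is not the server's own reader.

```python
def read_records(text):
    """Parse PIR or FASTA text into a list of (header, sequence) pairs."""
    records = []
    for block in text.split(">")[1:]:
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        header, body = lines[0], lines[1:]
        if header.startswith("P1;"):
            header = header[3:]
            body = body[1:]  # PIR: skip the "sequence"/"structure" line
        # Join the sequence lines, drop spaces and the PIR terminator.
        sequence = "".join(body).replace(" ", "").rstrip("*")
        records.append((header, sequence))
    return records


def aligned_lengths(records):
    """Return the set of sequence lengths; one value means a consistent alignment."""
    return {len(sequence) for _, sequence in records}


def strip_gaps(sequence):
    """Remove '-' gap characters to recover the unaligned sequence."""
    return sequence.replace("-", "")
```

A set with more than one length from `aligned_lengths` suggests that a line was lost or a gap was dropped while pasting the alignment.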
Input 2: Paste sequence of your structure(s) in PIR/FASTA format
1. The input sequence(s) should be in PIR/FASTA format.
2. The first line should start with >P1; (in PIR format) or > (in FASTA format), followed by the PDB code. For example,
>P1;1h7w or
>1h7w
3. Chain identifiers can optionally be specified in the first line, as shown below.
>P1;1h7w :A: or
>1h7w :A:
If the sequence has more than one chain, the chain identifiers should be separated by commas.
>P1;1gg6 :A,B,C: or
>1gg6 :A,B,C:
Example 1
>P1;1itha
structure:1itha
GLTAAQIKAIQDHWFLNIKGCLQAAADSIFFKYLTAYPGDLAFFHKFSSVPLYGLRSNPA
YKAQTLTVINYLDKV VDALGGNAGALMKAKVPSHDAMGITPKHFGQLLKLVGGVFQEEF
SADPTTVAAWGDAAGVLVAAMK*
>P1;1ew6b
structure:1ew6b
GFKQDIATIRGDLRTYAQDIFLAFLNKYPDERRYFKNYVGKSDQELKSMAKFGDHTEKVF
NLMMEVADRATDCVP LASDANTLVQMKQHSSLTTGNFEKLFVALVEYMRASGQSFDSQS
WDRFGKNLVSALSSAGMK*
>P1;1ash
structure:1ash
ANKTRELCMKSLEHAKVDTSNEARQDGIDLYKHMFENYPPLRKYFKSREEYTAEDVQNDP
FFAKQGQKILLACHV LCATYDDRETFNAYTRELLDRHARDHVHMPPEVWTDFWKLFEEY
LGKKTTLDEPTKQAWHEIGREFAKEINK*
Example 2
>P1;1csza
structure
GSRRASVGSHEKMPWFHGKISREESEQIVLIGSKTNGKFLIRARDNNGSYALCLLHEGKV
LHYRIDKDKTGKLSIPEGKKFDTLWQLVEHYSYKADGLLRVLTVPCQKIGTQ*
>P1;1ayaa
structure
MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGDFTLSVRRNGAVTHIKIQNTGDY
YDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELKYPLN*
>P1;1mil
structure
GSQLRGEPWFHGKLSRREAEALLQLNGDFLVRESTTTPGQYVLTGLQSGQPKHLLLVDPE
GVVRTKDHRFESVSHLISYHMDNHLPIISAGSELCLQQPVERKL*
Example 3
>P1;1vin
sequence
DIHTYLREMEVKCKPKVGYMKKQPDITNSMRAILVDWLVEVGEEYKLQNETLHLAVNYID
RFLSSMSVLRGKLQLVGTAAMLLASKFEEIYPPEVAEFVYITDDTYTKKQVLRMEHLVLK
VLAFDLAAPTINQFLTQYFLHQQPANCKVESLAMFLGELSLIDADPYLKYLPSVIAAAAF
HLALYTVTGQSWPESLVQKTGYTLETLKPCLLDLHQTYLRAPQHAQQSIREKYKNSKYHG
VSLLNPPETLNL*
>P1;1bu2a
sequence
RVLNNLKLRELLLPKFTSLWEIQTEVTVDNRTILLTWMHLLCESFELDKSVFPLSVSILD
RYLCKKQGTKKTLQKIGAACVLIGSKIRTVKPMTVSKLTYLSCDCFTNLELINQEKDILE
ALKWDTEAVLATDFLIPLCNALKIPEDLWPQLYEAASTTICKALIQPNIALLSPGLICAG
GLLTTIETDNTNCRPWTCYLEDLSSILNFSTNTVRTVKDQVSEAFSLYD*
>P1;1jkw
sequence
WTFSSEEQLARLRADANRKFRCKAVANGKVLPNDPVFLEPHEEMTLCKYYEKRLLEFCSV
FKPAMPRSVVGTACMYFKRFYLNNSVMEYHPRIIMLTCAFLACKVDEFNVSSPQFVGNLR
ESPLGQEKALEQILEYELLLIQQLNFHLIVHNPYRPFEGFLIDLKTRYPILENPEILRKT
ADDFLNRIALTDAYLLYTPSQIALTAILSSASRAGITMESYLSESLMLKENRTCLSQLLD
IMKSMRNLVKKYEPPRSEEVAVLKQKLDRCHSAELAL*
Example 4
>P1;1h7wa1
structure
HCEKLENNFDDIKHTTLGERGALREAMRCLKCADAPCQKSCPTHLDIKSFITSISNKNYY
GAAKMIFSDNPLGLTCGMVCPTSDLCVGGCNLYATEEGSINIGGLQQFASEVFKAMNIPQ
IRNPCLPSQEKMP*
>P1;1l0vb1
structure
MTHFIESLEAIKPYIIGNSRTADQGTNIQTPAQMAKYHQFSGCINCGLCYAACPQFGLNP
EFIGPAAITLAHRYNEDSRDHGKKERMAQLNSQNGVWSCTFVGYCSEVCPKHVDPAAAIQ
QGKVESSKDFLIATLKPR*
>P1;1qlab1
structure
TGNWFNGMSQRVESWIHAQKEHDISKLEERIEPEVAQEVFELDRCIECGCCIAACGTKIM
REDFVGAAGLNRVVRFMIDPHDERTDEDYYELIGDDDGVFGCMTLLACHDVCPKNLPLQS
KIAYLRRKMVSVN*
Single structure:
Users may also submit the sequence of a single structure.
Example 5
>P1;1h7wa1
structure
HCEKLENNFDDIKHTTLGERGALREAMRCLKCADAPCQKSCPTHLDIKSFITSISNKNYY
GAAKMIFSDNPLGLTCGMVCPTSDLCVGGCNLYATEEGSINIGGLQQFASEVFKAMNIPQ
IRNPCLPSQEKMP*
Example 6
>P1;1bro
sequence
PFITVGQENSTSIDLYYEDHGTGQPVVLIHGFPLSGHSWERQSAALLDAGYRVITYDRRGFGQSSQPTTGYDYDT
FAADLNTVLETLDLQDAVLVGFSMGTGEVARYVSSYGTARIAKVAFLASLEPFLLKTDDNPDGAAPQEFFDGIVA
AVKADRYAFYTGFFNDFYNLDENLGTRISEEAVRNSWNTAASGGFFAAAAAPTTWYTDFRADIPRIDVPALILHG
TGDRTLPIENTARVFHKALPSAEYVEVEGAPHGLLWTHAEEVNTALLAFLAK*
Input 4: Enter PDB code, chain identifier
PDBcode Chain Start End
1h7w A 51 183
1l0v B 106 243
1qla B 107 239
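The rows above can be validated before they are entered. This is a minimal sketch, assuming four whitespace-separated fields per row and plain integer residue numbers (PDB insertion codes are not handled). The `Region` type and `parse_regions` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Region(NamedTuple):
    pdb_code: str
    chain: str
    start: int
    end: int


def parse_regions(text):
    """Parse 'PDBcode Chain Start End' rows into Region records."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].lower() == "pdbcode":
            continue  # skip blank lines and the column header
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        code, chain, start, end = fields
        start, end = int(start), int(end)
        if start > end:
            raise ValueError(f"start is after end: {line!r}")
        regions.append(Region(code, chain, start, end))
    return regions
```

Each returned record exposes the residue range, so the number of residues covered by a row can be computed as `end - start + 1`.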
Contact:
Dr. R. Sowdhamini (mini@ncbs.res.in)
Dr. Saikat Chakrabarti
Dr. PN. Suganthan
G. Pugalenthi