GenDiS Downloads

  • Combined SF-HMM and SQ-HMM files for all superfamilies

  • The SF-HMMs are constructed from structure-based sequence alignments of PASS2.4 superfamily members. All PASS2.4 member sequences are also converted into HMMs and used for validation.

  • All sequence homologues identified in GenDiS+

  • This file contains all the validated homologues of 1,961 superfamilies in SCOP v1.75.

  • Taxonomic details file

  • This file lists the true positive sequence accession identifiers by organism, together with their superfamily details. The validated sequences belong to 67,377 of the ~160,000 organisms listed in NCBI Taxonomy.

  • SCOP Domain Architecture details file

  • This file details all SCOP domain architectures (DAs) assigned to the true positive sequences identified in the study. We identified 37,760 DAs, of which 35,811 (94.84%) are multi-domain. Each DA has been assigned a numerical identifier. Newly identified DAs will be added to the list in future updates.

  • Pfam Domain Architecture details file

  • This file includes Pfam domain architecture (DA) details for all true positive sequences. We identified 120,644 DAs comprising 9,853 (60.71%) of the 16,230 families in Pfam 28.0.

  • File containing the domain sequences extracted from the true positive sequences

  • The domain regions of the homologues that match the HMM coordinates of the SF-HMMs and SQ-HMMs have been extracted.

  • List of specific superfamilies

  • This file lists the taxon-specific superfamilies, the superfamilies with 100 or more proteins predicted to have discontinuous domains, and the viral superfamilies with protein domains in cellular organisms, i.e., cases of horizontal transfer from viruses to cellular hosts.
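
For readers who want to reproduce the domain-sequence extraction described above on their own hits, the following is a minimal sketch in Python, not the GenDiS pipeline itself. It assumes that each hit is reported with 1-based, inclusive start and end positions on the homologue sequence, and that a discontinuous domain is given as several such segments. The function names and the toy sequence are hypothetical and are not taken from the GenDiS files.

```python
from typing import Iterable, Tuple


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return the region start..end of `sequence`.

    Assumption: coordinates are 1-based and inclusive, as is common in
    HMM search output. Adjust if your coordinates are 0-based.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(
            f"coordinates {start}-{end} fall outside sequence of length {len(sequence)}"
        )
    # Convert the 1-based inclusive range to a Python slice.
    return sequence[start - 1:end]


def extract_discontinuous_domain(
    sequence: str, segments: Iterable[Tuple[int, int]]
) -> str:
    """Join several segments, in the order given, into one domain sequence."""
    return "".join(extract_domain(sequence, s, e) for s, e in segments)


# Toy homologue sequence (hypothetical, for illustration only).
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"

continuous = extract_domain(toy_sequence, 3, 8)
discontinuous = extract_discontinuous_domain(toy_sequence, [(1, 3), (20, 22)])

print(continuous)
print(discontinuous)
```

Validating the coordinates before slicing is a deliberate choice: Python slices silently truncate out-of-range indices, which would otherwise hide a mismatch between a hit table and the sequence file.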