Search for OR genes in a genome



Input

Mandatory

  • Exonerate alignment file : Max size - 700MB (Sample exonerate alignment) - If a bad gateway error is encountered, try splitting the genome and running exonerate on each part so that each exonerate output file is at most 300 MB.
  • Genome sequence (FASTA) : Max size - 700MB (Sample genome sequence file) - InsectOR output (along with the user-provided annotations) will be shown in a web-embedded genome viewer.
  • Protein query sequence (FASTA) file : Max size - 700MB (Sample query sequence file) - The same OR query sequence file that was used for the exonerate alignment.
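
The genome-splitting workaround mentioned for the exonerate alignment file can be sketched as follows. This is a minimal illustration, not part of InsectOR: the function name split_fasta, the output file names and the max_bytes default are assumptions. It splits only at record boundaries, so a single scaffold larger than max_bytes still ends up whole in one part, and the size of a genome part only indirectly controls the size of the resulting exonerate file.

```python
from pathlib import Path


def split_fasta(genome_path, out_dir, max_bytes=200 * 1024 * 1024):
    """Split a multi-FASTA file into parts of roughly max_bytes each.

    Records are never cut: a new part starts only at a '>' header line.
    Returns the list of part paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    written = 0  # characters written to the current part (about bytes for ASCII)
    with open(genome_path) as fh:
        for line in fh:
            starts_record = line.startswith(">")
            # Open the first part, or roll over once the current part is full
            # and the next record is about to begin.
            if out is None or (starts_record and written >= max_bytes):
                if out is not None:
                    out.close()
                part_path = out_dir / f"genome_part{len(parts) + 1}.fa"
                parts.append(part_path)
                out = open(part_path, "w")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

Each part would then be aligned with exonerate using the same query file and options as for the whole genome. If an exonerate output is still too large, a smaller max_bytes can be tried.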

Optional

  • User-provided gene annotations (GFF format) : Max size - 700MB (Sample gene annotation file) - InsectOR output will be compared with the user-provided annotations.
  • Alignment cluster cutoff - Alignment clusters are identified based on this cutoff. This is the minimum number of alignments needed at a nucleotide position for its inclusion into an alignment cluster. (Default: 1)
  • Predicted protein length completion cutoff - Predicted proteins can be classified as complete or partial based on this cutoff. (Default: 300 amino acids)
  • HMMSEARCH against 7tm_6 - Search for the insect olfactory receptor signature (Pfam 7tm_6 family) as additional validation.
  • TMHMM TMH search - Search for Transmembrane Helices (TMH) using TMHMM as additional validation.
  • HMMTOP TMH search - Search for Transmembrane Helices (TMH) using HMMTOP as additional validation.
  • Phobius TMH search - Search for Transmembrane Helices (TMH) using Phobius as additional validation.
  • (If all three TMH predictors are selected, a consensus TMH prediction will be performed.)
  • MAST motif search - Search for known OR motifs in the predicted proteins.
  • Motif file (PSPM format of MEME) for MAST : Max size - 700MB - Users can provide their own motifs for the MAST search against the predicted OR proteins. (If the MAST option is checked, default: AfOR motifs) (Sample motif file)
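
To make the alignment cluster cutoff concrete, the sketch below computes clusters from alignment intervals on one scaffold and strand. It is an illustration of the stated rule (a nucleotide position joins a cluster when at least "cutoff" alignments cover it), not InsectOR's actual code; the function name alignment_clusters and the 1-based inclusive (start, end) input format are assumptions.

```python
def alignment_clusters(alignments, cutoff=1):
    """Return (start, end) clusters where per-base alignment depth >= cutoff.

    alignments: iterable of (start, end) pairs, 1-based and inclusive,
    all on the same scaffold and strand (an assumption of this sketch).
    """
    if cutoff < 1:
        raise ValueError("cutoff must be at least 1")
    # Sweep line: +1 where an alignment starts, -1 just past where it ends.
    events = {}
    for start, end in alignments:
        events[start] = events.get(start, 0) + 1
        events[end + 1] = events.get(end + 1, 0) - 1
    clusters = []
    depth = 0
    cluster_start = None
    for pos in sorted(events):
        depth += events[pos]
        if depth >= cutoff and cluster_start is None:
            cluster_start = pos  # depth has just reached the cutoff
        elif depth < cutoff and cluster_start is not None:
            clusters.append((cluster_start, pos - 1))  # depth dropped below it
            cluster_start = None
    return clusters


hits = [(1, 10), (5, 15), (30, 40)]
print(alignment_clusters(hits, cutoff=1))  # [(1, 15), (30, 40)]
print(alignment_clusters(hits, cutoff=2))  # [(5, 10)]
```

With the default cutoff of 1, any aligned position is kept and overlapping alignments merge into one cluster. Raising the cutoff keeps only regions supported by several query alignments.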

Output files for download and display

  • Summary file
  • Table providing detailed information on each gene/fragment predicted by the algorithm
  • Protein sequence file of predicted ORs/fragments in FASTA format
  • Protein sequence file of predicted ORs/fragments without any pseudogenizing elements in FASTA format
  • Nucleotide sequence file of predicted OR CDSs in FASTA format
  • Detailed gene annotations in GFF format
  • (Only if a gene annotation file from another resource is provided) Comparison gene annotation file showing genes/gene fragments predicted by this tool along with overlapping user-provided gene annotations. A shorter version of the user-provided GFF file containing only the annotations that overlap those from this tool. This will help with further manual editing of the results.
  • (Only if genome sequence file is provided) Genome viewer (Dalliance) displaying genes/gene fragments predicted by this tool (and those from the user-provided gene annotation file).
  • (Only if the corresponding check-box is checked) TMH predictions by TMHMM, HMMTOP and Phobius. If all three are checked, a consensus TMH prediction from the three and an HTML file comparing the four TMH prediction methods are also provided.
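
This page does not state how the consensus TMH prediction is derived from the three predictors. As an illustration only, one simple approach is a per-residue majority vote. The sketch below assumes each predictor's output has already been converted to a string of equal length in which 'M' marks a residue inside a predicted helix; the function name consensus_tmh, the input encoding and the min_votes parameter are all hypothetical and may differ from what InsectOR does.

```python
def consensus_tmh(predictions, min_votes=2):
    """Per-residue majority vote over equal-length membrane masks.

    predictions: list of strings, one per predictor, where 'M' marks a
    residue predicted to lie in a transmembrane helix (assumed encoding).
    Returns (start, end) consensus helix segments, 1-based and inclusive.
    """
    lengths = {len(p) for p in predictions}
    if len(lengths) != 1:
        raise ValueError("all predictions must cover the same protein length")
    n = lengths.pop()
    segments = []
    seg_start = None
    for i in range(n):
        votes = sum(p[i] == "M" for p in predictions)
        if votes >= min_votes and seg_start is None:
            seg_start = i + 1  # a consensus helix begins here
        elif votes < min_votes and seg_start is not None:
            segments.append((seg_start, i))  # helix ended at the previous residue
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start, n))  # helix runs to the C-terminus
    return segments


tmhmm = "ooMMMMoo"
hmmtop = "oMMMMooo"
phobius = "ooMMMooo"
print(consensus_tmh([tmhmm, hmmtop, phobius]))  # [(3, 5)]
```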