STIF - Genome-scale prediction of plant stress responsive transcription factor binding sites:
Computational transcription factor binding site (TFBS) prediction is a mature domain in bioinformatics. Various algorithms, stand-alone software, web servers, and application programming interfaces (APIs) are available for predicting transcription factor binding sites from sequence information using knowledge-based and motif-based methods. A wide array of TFBS prediction programs, tailored to different biological contexts and needs, aim to find transcription factor binding sites with high accuracy for the reconstruction of gene-regulatory networks. STIF is a rapid, plant-genome-specific algorithm designed to find biotic and abiotic stress-responsive transcription factor binding sites in plant genomes. We provide the STIF algorithm both as a web server and via an API for rapid annotation of upstream and untranslated regions in plant genomes for putative transcription factor binding sites. The STIF program was used to generate several large-scale stress-regulatory datasets for the Arabidopsis and rice genomes (see STIFDB and STIFDB2). Data compiled using the STIF algorithm have been used by plant biologists in various biochemical, genomic, computational and mechanistic studies to understand the biotic and abiotic stress regulome in plant genomes.
STIF - Methodology:
STIF is a Hidden Markov Model (HMM) based probabilistic model designed for the prediction of stress-responsive transcription factor binding sites (TFBS) in the upstream or untranslated regions of plant genes. Briefly, the STIF algorithm employs a consensus sequence (S) of length (L) compiled from the literature; the probabilistic score (P(S)) and the log-odd score are calculated using the following equations.
Probability of consensus:
P(S) = F * T (1)
where
P(S) – probability of consensus
F – frequency (i.e. number of a particular nucleotide / total number in the column)
T – transition probability
Log-odd score for consensus:
(S) = log P(S) - L(AT) log 0.375 + L(GC) log 0.125
To accommodate GC bias in the plant genome, we assigned a higher weighting factor to AT than GC in the log-odd score.
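The two equations above can be sketched in a few lines of Python. This is a minimal illustration, not the STIF implementation. It assumes that P(S) is the product of F * T over all positions of the consensus, that L(AT) and L(GC) are the numbers of A/T and G/C positions in the consensus, and that natural logarithms are used. The function names and the example values are hypothetical. The signs of the two weighting terms follow the log-odd formula exactly as printed above.

```python
import math

# Weighting factors taken from the log-odd formula in the text.
AT_WEIGHT = 0.375
GC_WEIGHT = 0.125


def consensus_probability(freqs, transitions):
    """P(S), assumed here to be the product of F * T over all positions."""
    if len(freqs) != len(transitions):
        raise ValueError("need one frequency and one transition per position")
    p = 1.0
    for f, t in zip(freqs, transitions):
        p *= f * t
    return p


def log_odd_score(consensus, freqs, transitions):
    """Log-odd score, following the formula as printed in the text.

    L(AT) and L(GC) are assumed to be counts of A/T and G/C positions.
    Natural log is an assumption; the text does not state the base.
    """
    seq = consensus.upper()
    l_at = seq.count("A") + seq.count("T")
    l_gc = seq.count("G") + seq.count("C")
    p = consensus_probability(freqs, transitions)
    # math.log raises ValueError if p is 0 (a position with zero frequency).
    return math.log(p) - l_at * math.log(AT_WEIGHT) + l_gc * math.log(GC_WEIGHT)


# Hypothetical 4-mer with made-up per-position frequencies and transitions.
score = log_odd_score("ACGT", [0.9, 0.8, 0.9, 0.7], [1.0, 0.9, 0.9, 0.8])
print(round(score, 3))
```

Keeping the two weights as named constants makes it easy to try other background compositions if the AT/GC bias of a genome differs from the values used here.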
STIF was developed to search for potential binding sites of stress-specific transcription factors, starting from Hidden Markov Models of the nucleotide binding-site patterns of cis-elements that are well known to respond to stress signals in plants. The 19 models of cis-elements, based on abiotic stress transcription factor families, were built as Hidden Markov Models and validated using the jackknife method. We applied our HMM-based search algorithm, STIF, to various datasets and benchmarked it against experimentally verified datasets. STIF takes a simple input of a gene sequence with its upstream region and untranslated regions (UTRs) and provides various statistical estimates and visualization options, grouping results by genes and Plant Ontology (PO) terms. The interface is developed using HTML, CSS and JavaScript.
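To give a feel for what scanning an upstream region for a cis-element involves, the sketch below reports every position where a consensus pattern occurs in a sequence. It is a deliberate simplification: it replaces the HMM-based search described above with plain consensus matching on the forward strand, using IUPAC ambiguity codes. The function name `find_sites` and the example sequence are hypothetical and are not part of STIF.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence, consensus):
    """Return (start, matched_text) for every forward-strand match.

    Positions are 0-based. A lookahead is used so that overlapping
    matches are reported as well.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# Made-up upstream fragment scanned for a short consensus.
upstream = "ttACGTggACGTaa"
for start, site in find_sites(upstream, "ACGT"):
    print(start, site)
```

A probabilistic model such as an HMM additionally scores each candidate site, which plain pattern matching does not do; this sketch only locates candidates.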
If you use the STIF algorithm or data derived from it, please cite us:
Khader Shameer, Oommen K. Mathew, Mahantesha BN Naika and Ramanathan Sowdhamini (2015) Rapid prediction of plant stress responsive transcription factor binding sites from plant genomes using STIF algorithm. Manuscript in preparation.
STIF - References:
- S. Ambika, Susan Mary Varghese, K. Shameer, M. Udayakumar and R. Sowdhamini (2008) STIF: Hidden Markov Model-based search algorithm for the recognition of binding sites of Stress-upregulated TranscrIption Factors and genes in Arabidopsis thaliana. Bioinformation, 2(10):431-437 [PMCID: PMC2561162]
- K. Shameer, S. Ambika, Susan Mary Varghese, N. Karaba, M. Udayakumar and R. Sowdhamini (2009) STIFDB-Arabidopsis Stress Responsive Transcription Factor DataBase. Int J Plant Genomics, 2009:583429 [PMCID: PMC2763139]
- Khader Shameer, Mahantesha BN Naika, Oommen K. Mathew and Ramanathan Sowdhamini (2014) STIF: Automated Plant Phenomic Analysis Using Plant Ontology. Bioinformatics and Biology Insights, 8:209-214. doi: 10.4137/BBI.S19057 [PMID: 25574136]
- Mahantesha Naika, Khader Shameer, Oommen K. Mathew, Ramanjini Gowda and Ramanathan Sowdhamini (2012) STIFDB2: An updated version of plant Stress Responsive TranscrIption Factor DataBase with additional stress signals, stress responsive transcription factor binding sites and stress responsive genes in Arabidopsis and rice. Plant Cell Physiol., 2013 Feb;54(2):e8. doi: 10.1093/pcp/pcs185
- Mahantesha Naika, Khader Shameer and Ramanathan Sowdhamini (2013) Comparative analyses of stress-responsive genes in Arabidopsis thaliana: Insights from genomic data mining, functional enrichment, pathway analysis and phenomics. Mol. BioSyst., 9(7):1888-1908