3dswap-pred - Prediction of 3D domain swapping from protein sequence:
3DSwap-Pred is a web server for predicting the protein structural phenomenon "3D domain swapping" from protein sequence data. It uses a Random Forest based algorithm for the prediction. 3D domain swapping mediates the formation of higher-order oligomers in a variety of proteins with different structural and functional properties, and it plays an important role in functions ranging from oligomerization to pathological conformational diseases. 3D domain swapping can be observed only when a protein structure is solved in the swapped conformation in the oligomeric state, which limits large-scale study of this important structural phenomenon. The 3DSwap-Pred algorithm classifies a given protein sequence as "swapping" or "non-swapping" using a Random Forest classifier. Literature-curated sequences of proteins involved in 3D domain swapping serve as the positive dataset. The negative dataset was derived using a new sequence mining method to improve the accuracy of the algorithm. A set of 126 sequence-based features is used as the input vector for the classifier. On an independent validation dataset of 68 positive and 313 negative sequences, 3DSwap-Pred achieved an accuracy of 63.78% in testing and 62.34% during training.
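To make "sequence-based features" and "accuracy" concrete, the following is a minimal Python sketch. It is an illustration only: the 126 features used by 3DSwap-Pred are not listed here, so amino acid composition (20 values) stands in as one commonly used sequence-derived feature, and the function names amino_acid_composition and accuracy are hypothetical.

```python
# Illustrative sketch only. Amino acid composition is an ASSUMED example of a
# sequence-based feature; it is not confirmed to be among the 126 features
# used by 3DSwap-Pred.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue as a 20-element list."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


def accuracy(predicted, actual):
    """Percentage of predictions that match the known labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


if __name__ == "__main__":
    features = amino_acid_composition("MKTAYIAKQR")
    print(len(features))
    print(accuracy(["swapping", "non-swapping"], ["swapping", "swapping"]))
```

Each sequence becomes a fixed-length numeric vector regardless of its length, which is what allows a classifier to compare proteins of different sizes.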
3dswap-pred - Methodology:
3D domain swapping is a protein structural phenomenon that evolved as a mechanism for oligomeric assembly. A protein structure is reported to be in a swapped conformation when at least two chains of an oligomeric structure share a structural segment between them to form a stable structure.
The 3DSwap-Pred web server predicts 3D domain swapping from protein sequence using a machine learning algorithm based on the ensemble classifier method "Random Forest". Random Forest is an ensemble decision tree classifier that combines two effective machine learning techniques (bagging and the selection of features from a random subspace) into a single method. A random forest is a collection of decision trees, where each tree is grown using a subset of the possible attributes in the input feature vector. Instead of using all features in all trees, Random Forest randomly selects a subset of features to consider for the split at each node when growing a tree, and the final decision is derived by combining the results from all the trees in the forest. It has been shown that combining multiple decision trees produced in randomly selected subspaces can improve generalization accuracy. The 3DSwap-Pred server provides an a priori method for studying 3D domain swapping and supports approaches that scan sequences to identify putative proteins that may be involved in it.
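The two ingredients described above, bagging and random feature subsets at each node, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the 3DSwap-Pred implementation: the function names, the Gini impurity criterion, the depth limit, and the default subset size (square root of the number of features) are all choices made for this example.

```python
# Toy Random Forest sketch (NOT the 3DSwap-Pred code): bootstrap sampling
# (bagging) plus a random feature subset at every node, with majority voting.
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, feature_ids):
    """Best (feature, threshold) among feature_ids, or None if no gain."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(X, y, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree; leaves are class labels, inner nodes are tuples."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    # Random subspace step: only n_sub randomly chosen features are tried.
    split = best_split(X, y, rng.sample(range(len(X[0])), n_sub))
    if split is None:
        return majority
    f, t = split
    lo = [i for i in range(len(y)) if X[i][f] <= t]
    hi = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            grow_tree([X[i] for i in lo], [y[i] for i in lo],
                      n_sub, rng, depth + 1, max_depth),
            grow_tree([X[i] for i in hi], [y[i] for i in hi],
                      n_sub, rng, depth + 1, max_depth))


def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def grow_forest(X, y, n_trees=25, n_sub=None, seed=0):
    """Grow n_trees trees, each on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    n_sub = n_sub or max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # sample with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                n_sub, rng))
    return forest


def predict_forest(forest, row):
    """Combine the trees by majority vote."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]
```

In this sketch each feature vector would be labelled "swapping" or "non-swapping", and predict_forest returns whichever label most trees vote for. A production analysis would normally rely on an established implementation such as the randomForest R package cited in the references.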
3dswap-pred - References:
Bennett MJ, Choe S, Eisenberg D (1994) Domain swapping: entangling alliances between proteins. Proc Natl Acad Sci U S A 91: 3127-3131.
Bennett MJ, Choe S, Eisenberg D (1994) Refined structure of dimeric diphtheria toxin at 2.0 Å resolution. Protein Sci 3: 1444-1463.
Bennett MJ, Schlunegger MP, Eisenberg D (1995) 3D domain swapping: a mechanism for oligomer assembly. Protein Sci 4: 2455-2468.
Bennett MJ, Eisenberg D (2004) The evolving role of 3D domain swapping in proteins. Structure 12: 1339-1341.
Bennett MJ, Sawaya MR, Eisenberg D (2006) Deposition diseases and 3D domain swapping. Structure 14: 811-824.
Ho TK (2002) A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors. Pattern Analysis & Applications 5: 102-112.
Breiman L (2001) Random Forests. Machine Learning 45: 5-32.
Liaw A, Wiener M (2002) Classification and Regression by randomForest. R News 2: 18-22.
Ho TK (1998) The Random Subspace Method for Constructing Decision Forests. IEEE Trans Pattern Anal Mach Intell 20: 832-844.
Mitchell TM (1997) Machine Learning. McGraw-Hill.
McGuffin LJ, Bryson K, Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16: 404-405.
Contact: Prof. R. Sowdhamini (mini@ncbs.res.in)
3dswap-pred - Team:
K. Shameer, G. Pugalenthi & Prof. R. Sowdhamini