Academia.edu

Protein Structure Prediction

4,620 papers
11,810 followers
About this topic
Protein Structure Prediction is the computational and theoretical approach to determining the three-dimensional structure of a protein based on its amino acid sequence. This field employs algorithms and models to predict how proteins fold and interact, which is crucial for understanding their function and role in biological processes.

Key research themes

1. How have machine learning and deep learning approaches advanced protein secondary structure prediction accuracy?

This theme investigates the development and impact of machine learning (including deep learning) methods on predicting protein secondary structure - a crucial intermediate step toward elucidating full protein 3D structures. The focus is on how evolutionary information integration, sophisticated classification and prediction algorithms, and new neural network architectures have incrementally improved the prediction accuracy over classical propensity-based and template methods, with accuracy now approaching theoretical limits. These methods are important because they enable large-scale, cost-effective inference of protein structure from sequence data, which is invaluable given the experimental limitations.

Key finding: Demonstrated that evolutionary information from divergent protein profiles combined with neural networks boosted secondary structure prediction accuracy from ~60% to above 75%, with incremental improvements due to enhanced...
Key finding: Proposed a novel multi-component prediction framework (MCP) which directly processes amino acid sequences without intermediate feature engineering. By combining Support Vector Machines with Fuzzy K-Nearest Neighbor methods...
Key finding: Provided an extensive review summarizing the three generations of secondary structure prediction methods, detailing how the integration of sequence profiles, sophisticated deep neural network architectures, and improved data...
Key finding: Analyzed steady improvements over decades culminating in current three-state accuracy of 82-84%, reaching close to a theoretical limit of 88-90%. Identified the role of large protein databases, template-based incorporation,...
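
The three-state accuracy quoted in these findings is usually reported as Q3: the fraction of residues whose predicted state (helix H, strand E, or coil C) matches the observed one. The sketch below illustrates that calculation only. The function name `q3_accuracy` and the two example strings are hypothetical and are not taken from any of the listed papers.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted state matches the observed state.

    Both arguments are strings over the three-state alphabet H (helix),
    E (strand), and C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up example: 17 residues, 2 of which are mispredicted.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```

Published benchmarks differ in how they reduce the eight DSSP states to three, so Q3 values are only comparable when the same reduction is used.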

2. How can integrating structural and sequence information via advanced neural architectures improve protein tertiary structure and local backbone conformation prediction?

This theme explores approaches that leverage both the linear amino acid sequence and three-dimensional structural information via deep learning and knowledge-based methods to improve prediction of protein tertiary structure or detailed local conformations (referred to as structural alphabets). These approaches address challenges in capturing long-range interactions and conformational flexibility, offering more accurate and interpretable models compared to sequence-only or physics-based simulations. The theme includes innovations on geometric deep learning, principal component analysis for dimension reduction, and structure-informed neural networks improving functional insights and accuracy.

Key finding: Highlighted the complexity in predicting precise per-residue secondary structure, showing that proteins with similar folds can differ up to 12% in secondary structure state per residue. Proposed shifting prediction goals...
Key finding: Developed PB-kPRED, a method predicting local protein backbone conformations as sequences of Protein Blocks (PBs) using a knowledge-based algorithm with pentapeptide fragment databases, achieving ~81% accuracy on these...
Key finding: Presented a novel approach applying PCA to derive a low-dimensional conformational space from a set of low-energy models, allowing efficient sampling and optimization via particle swarm optimization. Showed that as few as 10...
Key finding: Introduced LM-GVP, combining pretrained protein language models and Geometric Vector Perceptron (GVP) graph neural networks to jointly learn from 1D sequences and 3D structures, trained end-to-end for protein property...

3. What is the current landscape and role of AI/deep learning in comprehensive protein structure prediction, including static and dynamic conformations and protein complex modeling?

This theme captures the recent breakthroughs and ongoing challenges in applying AI and deep learning models, especially deep neural networks and large language models, for predicting both static and dynamic protein structures, as well as multimeric protein complexes. It integrates insights from methods predicting inter-residue distances and orientations, template-free and template-based modeling, and conditional structure generation towards capturing conformational ensembles and drug-target interactions. This theme highlights how AI-based predictions facilitate research beyond tertiary structure, including protein dynamics, function, and drug discovery.

Key finding: Presented a combined deep learning and mechanistic modeling framework that uses coevolutionary-derived residue-residue distance predictions to infer protein conformational ensembles. Filtered predicted models by energy and...
Key finding: Evaluated AI-based models (AlphaFold and RoseTTAFold) for predicting the challenging NLRP3 protein, including binding pocket conformation assessment. Found these models produce reliable single-domain info but face limitations...
Key finding: Demonstrated large-scale improvements in protein complex structure prediction using AlphaFold-inspired methods in CASP15-CAPRI, with quality of high-accuracy models rising from 8% to about 40%. The results show AI-based...
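
Several of the methods in this theme predict inter-residue distances. As a rough illustration of that target, the sketch below builds a pairwise distance map from one representative atom per residue and thresholds it into a contact map. The coordinates are invented, the function names are hypothetical, and the 8 Å cutoff is an assumed (though commonly used) convention rather than a value taken from the papers listed here.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of Euclidean distances between all residue pairs.

    coords holds one (x, y, z) tuple per residue, in angstroms.
    """
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, threshold=8.0):
    """Mark residue pairs closer than threshold as contacts (assumed 8 angstrom cutoff)."""
    return [[d < threshold for d in row] for row in dmap]


# Invented coordinates: four residues spaced 3.8 angstroms apart along the x axis.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dmap = distance_map(coords)
contacts = contact_map(dmap)
for row in contacts:
    print("".join("#" if c else "." for c in row))
```

A predictor in this family would output an estimate of such a matrix directly from sequence, and a separate step would then fold a 3D model that satisfies the predicted distances.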

All papers in Protein Structure Prediction

Recently, prediction of a protein's three-dimensional (3D) structure from its sequence has made significant progress thanks to advances in computational techniques and growth in the number of experimental structures. However, selecting...
This practicum aims to introduce the properties of proteins and to understand the chemical changes that occur under various physical and chemical treatments, through several tests, namely the Biuret test, coagulation by alcohol, precipitation by salt,...
The assumption that cellulose degradation and assimilation can only be carried out by heterotrophic organisms was shattered in 2012 when it was discovered that the unicellular green alga, Chlamydomonas reinhardtii (Cr), can utilize cellulose for growth under...
‘Inner mitochondrial membrane peptidase 2 like’ (IMMP2L) is a nuclear-encoded mitochondrial peptidase that has been conserved through evolutionary history, as has its target enzyme, ‘mitochondrial glycerol phosphate dehydrogenase 20...
The assumption that cellulose degradation and assimilation can only be carried out by heterotrophic organisms was shattered in 2012 when it was discovered that the unicellular green alga, Chlamydomonas reinhardtii (Cr), can utilize...
Millimeter-wave radio propagation is highly influenced by the movement of objects (humans); even simple movements can change the entire propagation paths of transmitted waves inside buildings. Ray tracing is one of the...
Recent trends show that it is a major challenge to predict the exact propagation path accurately and efficiently for three-dimensional (3D) indoor environments. Therefore, this study introduces a new 3D ray tracing algorithm based on a...
The experiments show that our approach yields more efficient results than the state-of-the-art method.
Rift Valley fever virus (RVFV) is a bunyavirus endemic to Africa and the Arabian Peninsula that infects humans and livestock. The virus encodes two glycoproteins, Gn and Gc, which represent the major structural antigens and are...
This paper presents a novel learning algorithm for structured classification, where the task is to predict multiple and interacting labels (multilabel) for an input object. The problem of finding a large-margin separation between correct...
High-pressure phase transformations of Ca are studied using the metadynamics method to explore the anharmonic free-energy surface, together with a genetic algorithm structural search method to identify lowest enthalpy structures....
The prediction of Compound-Protein Interactions (CPI) is an essential step in drug-target analysis for developing new drugs. Therefore, there is a strong incentive to develop a faster and more effective method for predicting the interaction...
The first and probably the most important step in predicting the tertiary structure of a protein from its primary structure is to predict as many secondary structures as possible in the protein chain. Secondary structure prediction problem...
Biological membrane channels and pore-forming proteins display a level of sophistication in managing molecular scale transport that is typically unmatched by inorganic analogs, and efforts to develop nanopores that approach the transport...
Since enzymes are essential for the ripening of fruit and have a physiological role in that process, in this study, the most significant factors that contribute to the roles of invertase and cellulase in the ripening process were...
Using information-theoretic concepts, we examine the role of the reference state, a crucial component of empirical potential functions, in protein fold recognition. We derive an information-based connection between the probability...
Managing text-based information is crucial when trying to extract valuable information from documents. Assigning a numerical value to the text-based (unstructured) information is one of the ways to extract value. This research studied the...
Background: There is little evidence on the effectiveness of relaxation techniques for improving the quality of life of older people, and no comparative studies have specifically investigated this population. Hence, the present study was...
This paper introduces a groundbreaking computational paradigm for protein structure prediction through a novel OmegaFold architecture that fundamentally transforms traditional multiple-sequence-alignment-dependent methodologies. The...
Mycobacterium tuberculosis (M.tb) remains a formidable global health threat. The increasing drug resistance among M.tb clinical isolates is exacerbating the current tuberculosis (TB) burden. In this study we focused on identifying novel...
Currently, the data mining and machine learning fields are facing new challenges because of the amount of information that is collected and needs processing. Many sophisticated learning approaches simply cannot cope with large and complex...
Protein secondary structure prediction from amino acid sequence is used to evaluate and improve prediction accuracy as well as to support drug design and the study of cell functionality. Various approaches for predicting protein secondary structure...
Methods to reliably estimate the quality of 3D models of proteins are essential drivers for the wide adoption and serious acceptance of protein structure predictions by life scientists. In this paper, the most successful groups in CASP12...
This supplement extends the VPR (Vortex-Pattern-Resonance) model by applying its informational energy framework to resonant phenomena observed in biological macromolecules and astrophysical signals. We introduce VPR+, a focused extension...
Predicting the effects of mutations on protein function is an important issue in evolutionary biology and biomedical applications. Computational approaches, ranging from graphical models to deep-learning architectures, can capture the...
Six empirical force fields were tested for applicability to calculations for automated carbohydrate database filling. They were probed on eleven disaccharide molecules containing representative structural features from widespread classes...
This paper comprehensively presents the holistic medical perspective that "self-healing power is the truth," challenging the modern overemphasis on immunity.
by Tang N
Genetic defects in 6-pyruvoyl-tetrahydropterin synthase (PTPS) are the most prevalent cause of hyperphenylalaninaemia not due to phenylalanine hydroxylase deficiency (phenylketonuria). PTPS catalyses the second step of tetrahydrobiopterin...
We participated in the fold recognition and homology sections of CASP5 using primarily in-house software. The central feature of our structure prediction strategy involved the ability to generate good sequence-to-structure alignments and...
In this study we investigate the extent to which techniques for homology modeling that were developed for water-soluble proteins are appropriate for membrane proteins as well. To this end we present an assessment of current strategies for...
Background: The TlyA protein has a controversial function as a virulence factor in Mycobacterium tuberculosis (M. tuberculosis). At present, its dual activity as hemolysin and RNA methyltransferase in M. tuberculosis has been indirectly...
Using an information theoretic formalism, we optimize classes of amino acid substitution to be maximally indicative of local protein structure. Our statistically-derived classes are loosely identifiable with the heuristic constructions...
We present a new method for multiple sequence alignment (MSA), which we call MSACSA. The method is based on the direct application of a global optimization method called the conformational space annealing (CSA) to a consistency-based...
We present the sixth report evaluating the performance of methods for predicting the atomic resolution structures of protein complexes offered as targets to the community-wide initiative on the Critical Assessment of Predicted...
A structured folding pathway, which is a time-ordered sequence of folding events, plays an important role in the protein folding process and hence in the conformational search. Pathway prediction thus gives more insight into the folding...
This study presents a unified machine learning-based system for predicting multiple diseases (diabetes, Parkinson's disease, and heart disease) through a single interface. Support Vector Machine (SVM) is used for diabetes and Parkinson's...
Dimethyl gold complexes bonded to partially dehydroxylated MgO powder calcined at 673 K were synthesized by adsorption of Au(CH3)2(acac) (acac is C5H7O2) from n-pentane solution. The synthesis and subsequent decomposition of the complexes...
Bacterial cell division is driven by the divisome, a ring-shaped protein complex organized by the bacterial tubulin homolog FtsZ. Although most of the division proteins in Escherichia coli have been identified, how they assemble into the...
One of the main challenges in bioinformatics is predicting the structures of macromolecules, particularly nucleic acids and proteins. In this study, we propose a hybrid approach integrating K-Nearest Neighbors (KNN), Support Vector...
As for secondary structure prediction, we believe that tertiary structure prediction should also be automatic and hence reproducible. A fully automatic protocol HOM-FOLD for modelling proteins by homology, using a fragment based approach, is...
The computational design and simulation of the properties of proteins requires powerful computers, colour graphics and interactive software. A brief description of such a system will be presented. One of the key requirements in the design...
age' secondary structure prediction for the family. The prediction of Benner et al. was in the event disappointing in some respects 8. But other predictions have been better. The recent publication of tertiary structures for SH2 domains...