So what exactly is "protein folding"?

The current COVID-19 pandemic has thrown up many problems that hackers have been happy to pounce on. From 3D-printed face shields and homemade medical masks to replacements for a full mechanical ventilator, the stream of ideas has been inspiring and heartwarming. At the same time, there have been attempts to make progress in another area: research aimed at fighting the virus itself.

By all appearances, the approach that goes to the very root of the problem has the greatest potential for stopping the current pandemic and getting ahead of future ones. This "know your enemy" approach is the one taken by the Folding@Home computing project. Millions of people have signed up for the project and are donating some of their CPU and GPU processing power, creating the largest distributed supercomputer in history.

But what exactly are all these exaflops used for? Why is it necessary to throw so much computing power at protein folding? What kind of biochemistry is at work here, and why do proteins need to fold at all? Here's a brief overview of protein folding: what it is, how it happens, and why it's important.

To begin with, the most important thing: why do we need proteins?

Proteins are vital structures. They not only provide building material for cells, but also serve as enzyme catalysts for almost all biochemical reactions. Proteins, whether structural or enzymatic, are long chains of amino acids arranged in a specific sequence. The function of a protein is determined by which amino acids sit at which positions in the chain. If, for example, a protein needs to bind to a positively charged molecule, the binding site must be lined with negatively charged amino acids.

To understand how proteins get the structure that determines their function, you need to go over the basics of molecular biology and the information flow in the cell.

The production, or expression, of proteins begins with a process called transcription. During transcription, the double helix of DNA, which contains the genetic information of the cell, partially unwinds, giving an enzyme called RNA polymerase access to the nitrogenous bases of the DNA. The task of RNA polymerase is to make an RNA copy, or transcript, of a gene. This copy of the gene, called messenger RNA (mRNA), is a single-stranded molecule ideal for directing the cell's protein factories, the ribosomes, which carry out the production, or translation, of proteins.
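As a rough illustration of the copying step, here is a minimal Python sketch of transcription. It assumes the template strand is written in the 3'->5' direction so that the mRNA comes out 5'->3', and it ignores everything else RNA polymerase does (promoters, termination, editing). The function name and the example sequence are hypothetical.

```python
# Toy model of transcription: pair each base of a DNA template strand
# with its complementary RNA base. Illustrative only.

# Base-pairing rules, DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the mRNA (5'->3') for a DNA template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_strand.upper())


# Hypothetical 12-base template, written 3'->5'.
print(transcribe("TACGGCAAAATT"))  # AUGCCGUUUUAA
```

The lookup table is the whole point: each template base is paired with its complement, with uracil (U) taking the place of thymine in RNA.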

Ribosomes behave like assembly jigs: they capture the mRNA template and match it against other small pieces of RNA, the transfer RNAs (tRNAs). Each tRNA has two active regions: a three-base section called the anticodon, which must match the corresponding codon on the mRNA, and a site for binding the amino acid specific to that codon. During translation, tRNA molecules in the ribosome randomly try to bind to the mRNA using their anticodons. If the match succeeds, the tRNA molecule adds its amino acid to the previous one, forming the next link in the chain of amino acids encoded by the mRNA.
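The codon-by-codon matching can be sketched in a few lines of Python. This is a deliberately simplified model: it uses the standard genetic code, starts at the first AUG it finds, and stops at the first stop codon, which glosses over how a real ribosome selects its start site. The function name and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, listed with each codon position cycling through
# U, C, A, G (third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA into one-letter amino acid codes.

    Simplification: translation starts at the first AUG and ends at the
    first stop codon or when no complete codon is left.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGCCGUUUUAA"))  # MPF
```

The resulting string of one-letter codes is the primary structure of the protein.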

This sequence of amino acids is the first level of the protein's structural hierarchy, which is why it is called the primary structure. The entire three-dimensional structure of a protein, and with it the protein's function, derives directly from the primary structure and depends on the properties of each amino acid and on their interactions with one another. Without these chemical properties and interactions, polypeptides would remain linear sequences with no three-dimensional structure. You can see this every time food is cooked: heat denatures the three-dimensional structure of its proteins.

Long-range bonds of protein parts

The next level of three-dimensional structure, beyond the primary, was given the clever name of secondary structure. It consists of hydrogen bonds between amino acids that are relatively close together in the chain. These stabilizing interactions come down to two main forms: alpha helices and beta sheets. The alpha helix forms a tightly coiled region of the polypeptide, while the beta sheet forms a smooth, wide region. Both formations have structural and functional properties that depend on the characteristics of their constituent amino acids. For example, if an alpha helix is composed mainly of hydrophilic amino acids, such as arginine or lysine, it will most likely take part in aqueous reactions.

Alpha helices and beta sheets in proteins. Hydrogen bonds are formed during protein expression.

These two structures and their combinations form the next level of protein structure: tertiary structure. Unlike the simple fragments of secondary structure, tertiary structure is governed mainly by hydrophobicity. The cores of most proteins contain highly hydrophobic amino acids, such as alanine or methionine, and water is excluded from them because of the "oily" nature of the side chains. Such structures often appear in transmembrane proteins embedded in the lipid bilayer membrane that surrounds cells. The hydrophobic regions of the protein remain thermodynamically stable inside the fatty part of the membrane, while the hydrophilic regions are exposed to the aqueous environment on either side.
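One common way to put numbers on hydrophobicity is a hydropathy scale. The Python sketch below uses the Kyte-Doolittle scale, where positive values are hydrophobic and negative values hydrophilic, and averages it over a sliding window. The choice of scale, the window size, and the example peptide are assumptions made for illustration, not something the article prescribes.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a sequence; higher means more hydrophobic."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every window of the given size along the chain."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# Hypothetical peptide: charged ends around a hydrophobic middle stretch.
peptide = "KRDEAILLVVAGIFLKRDE"
for position, score in enumerate(hydropathy_profile(peptide, window=5)):
    print(position, round(score, 2))
```

In a profile like this, a long run of strongly positive windows is the kind of signal usually read as a candidate membrane-spanning or buried region.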

The stability of tertiary structures is also provided by long-range bonds between amino acids. A classic example of such a bond is the disulfide bridge, which often forms between two cysteine residues. If you have ever noticed a smell a bit like rotten eggs in a hair salon while a client was getting a perm, that was a partial denaturation of the tertiary structure of the keratin in the hair, brought about by reducing its disulfide bonds with sulfur-containing thiol compounds.

The tertiary structure is stabilized by long-range interactions such as hydrophobicity or disulfide bonds.

Disulfide bonds can form between cysteine residues within one polypeptide chain, or between cysteines from different complete chains. Interactions between different chains form the quaternary level of protein structure. An excellent example of quaternary structure is the hemoglobin in your blood. Each hemoglobin molecule consists of four globin protein subunits (two alpha and two beta chains in adult hemoglobin), each associated with an iron-containing heme group. In hemoglobin the four globins are held together by non-covalent interactions such as hydrogen bonds and salt bridges rather than by disulfide bridges, and the whole molecule can bind up to four oxygen molecules at once and release them as needed.

Modeling structures in search of a cure for disease

Polypeptide chains begin to fold into their final shape during translation, as the growing chain emerges from the ribosome, much like a piece of shape-memory alloy wire can take on complex shapes when heated. However, as always in biology, things are not so simple.

In many cells, transcribed genes undergo extensive editing before translation, significantly altering the basic structure of the protein compared with the raw base sequence of the gene. The translation machinery also often enlists the help of molecular chaperones, proteins that temporarily bind to the nascent polypeptide chain and prevent it from settling into an intermediate form from which it could never reach the final one.

All of this is to say that predicting the final shape of a protein is not a trivial task. For decades, the only way to study the structure of proteins was through physical methods such as X-ray crystallography. It was not until the late 1960s that biophysical chemists began to build computational models of protein folding, mainly concentrating on modeling secondary structure. These methods and their descendants require huge amounts of input data in addition to the primary structure: for example, tables of amino acid bond angles, lists of hydrophobicities, charge states, and even the conservation of structure and function over evolutionary timescales, all in order to guess what the final protein will look like.

Today's computational methods for predicting secondary structure, including those running on the Folding@Home network, work with about 80% accuracy, which is quite good given the complexity of the problem. Data generated by predictive models for proteins such as the SARS-CoV-2 spike protein will be compared with data from physical studies of the virus. As a result, it will be possible to obtain the exact structure of the protein and, possibly, to understand how the virus attaches to human angiotensin-converting enzyme 2 (ACE2) receptors in the respiratory tract, the virus's route into the body. If we can understand this structure, we may be able to find drugs that block binding and prevent infection.
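The article does not say how that accuracy figure is measured. One widely used measure for secondary structure prediction is the per-residue three-state score, often called Q3: the fraction of residues whose predicted state (helix H, strand E, or coil C) matches the experimentally observed one. The Python sketch below assumes that measure; the function name and the two example strings are made up for illustration.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed one.

    Both arguments are strings over H (helix), E (strand), C (coil),
    one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 10-residue example with two mismatched positions.
observed = "CHHHHCEEEC"
predicted = "CHHHCCEEEE"
print(q3_accuracy(predicted, observed))  # 0.8
```

A score of 0.8 on this toy example corresponds to the "about 80%" level mentioned above: eight of ten residues are assigned the right state.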

Protein folding research is so central to our understanding of so many diseases and infections that even once we have used the Folding@Home network to figure out how to beat COVID-19, whose explosive growth we have been watching lately, the network won't sit idle for long. It's a research tool well suited to studying the protein models that underlie dozens of protein misfolding diseases, such as Alzheimer's disease or variant Creutzfeldt-Jakob disease, often inaccurately called mad cow disease. And when the next virus inevitably appears, we will be ready to start fighting it again.

Source: habr.com
