Our last post, ‘Biopharmaceutical Protein Structure – Part 1’, looked briefly at the interplay of the different chemical bonds common to proteins and their respective contributions to maintaining protein structure. We will now apply those principles and explore in greater detail the four levels of protein structure, with particular emphasis on the structural complexity of proteins.
The Structural Complexity of Biologics
Variation in structure from one protein to another is enormous. That said, irrespective of a protein’s origin (whether animal, plant, bacterial or otherwise), all proteins are constructed from a standard set of 20 amino acids (AAs) found in nature. In general, proteins have four levels of structure: ‘primary’, ‘secondary’, ‘tertiary’ and ‘quaternary’. It is worth mentioning, however, that not all proteins possess or exhibit all four levels. While that statement may seem obscure at this point, it will become clearer as each level of protein structure is described in more detail below.
The relatively strong covalent bonds mentioned in our last post (‘Biopharmaceutical Protein Structure – Part 1’) are responsible for ‘tying’ amino acids together in a linear fashion resembling a chain of beads (see ‘Primary structure’ in Figure 1 below). The sequence in which these amino acids are joined, head to tail, is unique for a given protein and is referred to as its ‘Primary Structure’.
At a structural level, AAs appear as branched formations with varying side groups at position ‘R’ (see Figure 2 below). The nature of the chemical group at position ‘R’ determines which of the 20 naturally occurring AAs the one in question is. In other words, the side group at position ‘R’ is unique to each of the 20 standard AAs. Some AAs have a negatively charged group at position R and others a positively charged group. Others have side groups that are attracted to water, making them hydrophilic, and still others have R-groups that are attracted to fatty or oily substances and, by extension, are repelled by water or watery environments. These are described as hydrophobic or lipophilic.
Inclusion of the various AAs in a protein chain means that the protein adopts the characteristics of its constituent amino acids to some degree. The result is that the protein takes on a unique shape based on the interactions of the R-groups that protrude along the length of the chain. These interactions ‘play out’ between the side groups themselves, but also between the side groups and the various components of the external environment.
As an example, if two oppositely charged amino acids are in close enough proximity to interact, the protein chain will be pulled together at their point of interaction, much like a thread with two attracting magnets attached. If, on the other hand, two amino acids come into close proximity but are similarly charged (e.g. two positively charged AAs), they will repel each other.
Now imagine how a protein chain of, say, 200 AAs is likely to behave, with these charged and uncharged groups protruding all along its length, each interacting with one another and with liquids, dissolved salts and other particles in the surrounding environment. Hopefully by now your mind’s eye has created a mental picture, and with it a better appreciation of how and why proteins spontaneously twist into elaborate shapes, taking on highly specific three-dimensional configurations.
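The attraction and repulsion rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the charge table covers four commonly charged residues near neutral pH, every other residue is treated as uncharged, and the function name `charge_interaction` is hypothetical. Real side-chain charges depend on pH and on the local environment.

```python
# Simplified side-chain charges near neutral pH (an illustrative assumption;
# real charges vary with pH and with the residue's surroundings).
SIDE_CHAIN_CHARGE = {
    "D": -1,  # aspartate
    "E": -1,  # glutamate
    "K": +1,  # lysine
    "R": +1,  # arginine
}


def charge_interaction(aa1: str, aa2: str) -> str:
    """Classify the charge-charge interaction between two nearby residues."""
    q1 = SIDE_CHAIN_CHARGE.get(aa1, 0)  # residues not in the table count as uncharged
    q2 = SIDE_CHAIN_CHARGE.get(aa2, 0)
    product = q1 * q2
    if product < 0:
        return "attract"  # opposite charges pull the chain together
    if product > 0:
        return "repel"    # like charges push the chain apart
    return "no charge-charge interaction"


print(charge_interaction("K", "D"))  # lysine and aspartate
print(charge_interaction("K", "R"))  # lysine and arginine
print(charge_interaction("K", "G"))  # lysine and glycine
```

A real protein involves many such pairs at once, along with water, salts and hydrophobic effects, which is why the overall shape cannot be read off from a simple pairwise rule like this one.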
The take-home message here is that a protein’s shape is significantly influenced by its environment, because the environment affects the charged side groups found along the protein chain. Having discussed primary structure, we should not lose sight of the fact that even the relatively strong covalent bonds that maintain the protein backbone are susceptible to harsh environmental conditions.
With proteins varying in size as widely as mentioned in our last article, ‘Biopharmaceutical Protein Structure – Part 1’, and with 20 possible options at each AA site along the length of the chain, the number of possible protein variants is, for practical purposes, limitless. That said, some basic structural formations are repeated throughout the ‘protein universe’.
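To put a number on the counting argument above: with 20 options at each position, a chain of n amino acids has 20 multiplied by itself n times (20^n) possible sequences. The short Python sketch below simply evaluates that expression. The function name `sequence_count` is hypothetical, and the calculation counts sequences only; it says nothing about which of them would fold into a stable, functional protein.

```python
N_AMINO_ACIDS = 20  # the standard set of amino acids


def sequence_count(chain_length: int) -> int:
    """Number of distinct sequences for a chain of the given length."""
    return N_AMINO_ACIDS ** chain_length


for n in (2, 3, 10, 200):
    count = sequence_count(n)
    # Report the size of the number rather than printing hundreds of digits.
    print(f"{n} residues: a {len(str(count))}-digit number of possible sequences")
```

For a chain of 200 amino acids the count is a 261-digit number: finite, but far too large for every variant ever to be made or examined.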
At points along the length of proteins, coiling of the chain is commonly present. These coiled areas are known as alpha-helices (see ‘Secondary structure’ in Figure 1 above). Another common structural formation, which results from folding of the protein chain in an undulating manner resembling hills and valleys, is the beta-pleated sheet (also visible in ‘Secondary structure’ in Figure 1 above). Collectively, the coiled (alpha-helical) and undulating (beta-pleated) patterns described above are referred to as the protein’s secondary structure. As confirmation of the multitude of variations possible in protein structure, one protein may exhibit only one type of secondary structural feature, while another may have a single alpha-helix and a single beta-pleated sheet. Yet others may have multiple occurrences of each. In the end, the number of possible structural combinations from protein to protein is vast, yet a given type of protein consistently demonstrates a fixed set of structural features under similar conditions.
Tertiary structure (see ‘Tertiary structure’ in Figure 1 above) refers to the manner in which the alpha-helical and/or beta-pleated portions of the protein chain mentioned above fold back upon themselves, making the protein’s appearance less linear and chain-like, and more compact or globular. Both secondary and tertiary structure are maintained principally by non-covalent bonds, as described in the preceding paragraphs. Recall, however, that ‘Cysteine Bridges’, which are a special type of covalent bond, may also play a role in secondary and tertiary structure depending on the type and location of the amino acids present in the protein, particularly cysteine.
The final level of protein structure is termed Quaternary Structure (see ‘Quaternary structure’ in Figure 1 above) and refers to the interaction of multiple chains of amino acids (known as polypeptides) with one another. Each of these polypeptides has the propensity to adopt its own primary, secondary and tertiary configuration. These polypeptide chains, however, then go a step further and interact with each other, forming a final functional entity in which quaternary structure is realized. Such a protein could therefore reasonably be described as a ‘multi-unit’ protein.
Described another way, quaternary structure refers to multiple polypeptide chains uniting to form a single functional protein. Logically speaking, therefore, quaternary structure can only be realized in proteins composed of more than one polypeptide (where a polypeptide is a sequence of AAs joined head to tail in a chain). This last point illustrates one reason why not all proteins possess all four levels of structure: a prerequisite for exhibiting quaternary structure is the presence of more than one chain of AAs.
In very general terms, the relationship between protein structure and function can be summed up as follows: ‘The amino acid composition, the sequence in which those amino acids are joined and the environment in which the protein exists all determine the protein’s final shape. The protein’s shape, in turn, largely determines its function’. If we change the protein’s shape, therefore, we also change (and may lose) its original function.
We have taken a very simplistic approach to describing what is a very complex and intricate relationship, especially considering that the integration of non-protein components plays a role in the function of some proteins. For the sake of simplicity, however, this explanation provides a reasonably accurate approximation of the ‘Bonding-structure-function’ relationship.
Reiterating the point as I close, the nature and identity of protein side chains are critical to protein structure. Without the appropriate amino acids being both present and joined in the correct sequence, the right mix of electrostatic and other forces needed to create and maintain an appropriately shaped protein will not be present. This, of course, means that maintaining the structure of these large, complex molecules will be impossible. Given the inextricable link between protein structure and function, altering even a single side chain (or its corresponding AA) can render a protein ineffective or, worse, harmful. Changes to a protein’s environment can likewise easily alter its characteristics and behavior, with consequences that are to be expected but are often hard to predict.
In short, proteins are extremely delicate, so changing the environment in which they exist can result in permanent damage to their structure and, by extension, their function. This is why, unlike conventional medicines, biologics (biopharmaceuticals) allow little, if any, room for deviation from their recommended storage, preparation and administration conditions.
1. Kuhlmann, M. and Covic, A. (2006), ‘The protein science of biosimilars’, Nephrology Dialysis Transplantation, 21 [Suppl 5], pp. v4-v8. DOI: 10.1093/ndt/gfl474