Tag Archives: Molecules & the Basis of Life

Enzymes & their Roles

Quite simply:

Enzyme + Substrate → Enzyme-Substrate complex → Enzyme + Product

  1. The substrate (AKA reagent) binds to the enzyme’s active site to form an Enzyme-Substrate complex.
  2. The enzyme catalyses the conversion of the substrate into the Product.
  3. The Product is released, and the enzyme is ready for more substrate.

Enzymes are:

  • One or more polypeptide chains – so a protein – folded into a structure with an ‘Active Site’.
  • The active site is ‘specific’, which essentially means only a few substrates will fit it and bind to it.

DNA Mutations and Genetic Diseases

As mentioned, chromosomes determine characteristics such as sex (males have two different sex chromosomes, X and Y, whereas females have two X chromosomes). Chromosomal abnormalities, and smaller mutations within genes, can also cause diseases:

  • Down’s Syndrome – Caused by 3 copies of chromosome 21. This is referred to as trisomy.
  • Turner Syndrome (women) – only 1 X chromosome.
  • Klinefelter Syndrome (men) – XXY (an extra X chromosome) rather than XY.
  • Cystic Fibrosis – 3 nucleotides deleted from the CFTR gene (the ΔF508 mutation) – removing the phenylalanine at position 508 of the protein.
  • Sickle Cell Anaemia – A changed to T in the gene for haemoglobin.

Material can be translocated from one chromosome to another, nucleotides can be added or removed, or bases can be substituted. These changes can cause diseases and other genetic problems. Usually their effects are seen during protein synthesis.

– Down’s Syndrome

Down’s Syndrome is a genetic disease caused by an extra copy (which may be complete or partial) of chromosome 21 (trisomy 21). The disease is often associated with lessened cognitive ability & physical development and features a common set of facial characteristics. Further implications of Down’s Syndrome vary greatly from one individual to another. Fertility is another affected function, with very few males able to successfully reproduce and only some females when mating with unaffected males. The incidence of the disease in their children is much higher, at approximately half.

While treatment can improve a sufferer’s quality of life, there is no cure.

Fig 2 - Trisomy 21 Causing Down's Syndrome (Female Karyotype)

It is estimated that 1 in 800-1000 people are born with the disease, with several factors contributing to the likelihood of a child having it. The most notable of these seems to be the age of the mother, with the chance of the disease increasing as a mother gets older.

The Mutation in Down’s Syndrome

There are several ways Down’s Syndrome has been discovered to occur. About 95% of all cases occur via the first route, Trisomy-21.

  1. Trisomy 21 – 95% of cases – An extra chromosome 21 ends up in a gamete through a nondisjunction event (where either homologous chromosomes fail to come apart in meiosis 1, or sister chromatids fail to come apart during meiosis 2 or mitosis) during gamete production in the parent. This gamete then joins with a gamete from the other parent to produce an embryo with 47 chromosomes. The vast majority (~88%) of these events occur in the mother.
  2. Mosaic Down’s Syndrome – 1-2% of cases – Some of the cells in the embryo (and later body) have Trisomy-21 and some are normal. This can occur as Trisomy-21 above followed by a reversion to normal in some cells during cell division in the embryo; or the other way around, where an error during cell division in a normal embryo produces some Trisomy-21 cells.
  3. Robertsonian Translocation – 2-3% of cases – In the karyotype of one of the parents, the long arm of chromosome 21 is attached to another chromosome (often 14), and following normal disjunction during gamete formation there is a high possibility of a child receiving the extra chromosome 21 material. This is also known as familial Down’s syndrome, as it is passed directly down and the parents show a normal phenotype – with this type there is no age effect and males are as likely as females to cause the disease in their offspring.

A final, very rare occurrence is the duplication of a portion of chromosome 21, meaning that there are extra copies of some of the genes. If these are the genes responsible for the effects seen in Down’s syndrome then these effects will be expressed, but otherwise the phenotype will be normal.

– Sickle Cell Anaemia

Sickle cell anaemia affects the red blood cells in the body, producing cells which hold a rigid sickle shape rather than the usual doughnut-like (biconcave disc) shape. As this is a genetic disease based on a recessive allele, offspring may be carriers, suffer from the disease or not carry the allele at all, depending on their parents. Sickle cell disease is caused by inheriting two copies of the sickle allele (HbSS), while people with one normal and one sickle allele (HbAS) have sickle cell trait, which means they are carriers but do not usually show the effects of the disease.
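
The inheritance pattern described above can be sketched as a simple Punnett square. This is only an illustration: the allele letters "A" (normal) and "S" (sickle) and the function name `cross` are my own shorthand, not standard notation from any library.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return the expected genotype fractions for the offspring of two parents.

    Each parent is a two-letter string of alleles, e.g. "AS" for a carrier.
    """
    # Every allele from parent 1 can pair with every allele from parent 2.
    # Sorting makes "SA" and "AS" count as the same genotype.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Two carriers: a quarter unaffected, half carriers, a quarter with the disease.
print(cross("AS", "AS"))
```

With two carrier parents, "SS" (sickle cell disease) comes out at 0.25, while a carrier and a non-carrier can only produce "AA" or "AS" children.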

Because the cells are more rigid than normal and have an unusual shape, many complications can occur within the body. These include blockages of blood vessels, increased destruction of blood cells (and so reduced oxygen capacity), problems with the spleen and a host of other blood & circulation related problems.

A sickled red blood cell sits among normal cells

It is interesting to note that the disease is found at higher levels in areas where Malaria is more common. Being a carrier (so having sickle cell trait rather than sickle cell disease) is a benefit there, as the sickling of blood cells when they are attacked by malaria halts its spread.

Sickle cell disease is caused by a mutation in the haemoglobin gene – where A is changed to T at position 17 in a base substitution (missense). This changes a glutamic acid in the protein (GAG) to a valine (GTG).

– Types of Mutation in DNA

Fig 1 - Showing different types of chromosomal mutation

Wild Type = Normal Sequence of DNA

  • Point Mutations – Single nucleotide changes in the DNA strand which result in different codons.
    • Missense = Resulting in a different amino acid.
    • Nonsense = Resulting in a STOP codon and early termination of the protein chain.
    • Silent = Codon codes for the same amino acid as wild type so the protein is the same.
  • Frameshift Base Insertions or Deletions = One nucleotide (or any number not divisible by 3) added or removed, shifting the reading frame and changing most of the following amino acids.
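
The three kinds of point mutation listed above can be told apart by translating the codon before and after the change. Here is a minimal sketch using the standard genetic code; the function name `classify_point_mutation` is just something I made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ('*' = STOP).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(wild_codon, mutant_codon):
    """Classify a single-codon DNA change as silent, nonsense or missense."""
    wild_aa = CODON_TABLE[wild_codon]
    mutant_aa = CODON_TABLE[mutant_codon]
    if mutant_aa == wild_aa:
        return "silent"      # same amino acid, protein unchanged
    if mutant_aa == "*":
        return "nonsense"    # new STOP codon, chain terminates early
    return "missense"        # a different amino acid is inserted


# The sickle cell change: GAG (glutamic acid) becomes GTG (valine).
print(classify_point_mutation("GAG", "GTG"))  # missense
```

This only handles a change within one codon; frameshifts alter every codon downstream and need the whole sequence.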

Protein Purification

– Protein Identification

Protein purification begins with the need to identify the protein we want to purify! There are several methods that can be used to rapidly identify the protein:

  • Enzyme Assay (by catalytic activity) – with certain enzymes we can use colorimetry to detect a product as a reaction progresses. The more enzyme present, the faster the colour or light absorbance will change. An example is testing for Alcohol Dehydrogenase, which will lead to a change in the levels of NADH and NAD+ as Ethanol is converted to Ethanal. This change can be detected by measuring absorbance at ~340nm, where NADH absorbs but NAD+ does not.
  • SDS-PAGE Electrophoresis (by size) – this method separates protein chains by size using electrophoresis. This method denatures the proteins.
    The sample is run at the same time as a molecular mass marker sample, containing proteins of known mass. The marker sample will provide a scale for the mass of your sample. Once you’ve run the gel you will be able to plot the log of each marker’s molecular mass against the distance it travelled, draw a best fit line and read off the Molecular Mass of your sample protein.
  • Immuno-Assay (by specific antibodies) – Antibodies that fit specifically to the protein you are looking for are added to the sample. When these bind to the target protein they will instigate a colour change or some other noticeable change. The presence and concentration of the target protein can be assessed by the extent of the changes – if it was a colour change then the darker the colour goes, the more target protein must be present.
  • Western Blotting – A combination of electrophoresis and immuno-assay. The proteins are first separated by electrophoresis, then transferred (blotted) onto a membrane and probed with the specific antibody. This will be useful if your sample contains several different proteins and you need to identify the target protein. The band of colour (or change) will show you the correct protein, and then you simply need to calculate the approximate molecular mass using the molecular mass markers.
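
Reading a molecular mass off the marker lane, as described for SDS-PAGE above, amounts to fitting a straight line to log10(mass) against distance travelled. A rough sketch follows, assuming that relationship is linear over the range of the markers; the marker values in the example are invented purely for illustration, and `estimate_mass` is a hypothetical helper name.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def estimate_mass(markers, distance):
    """Estimate a protein's mass from how far its band travelled.

    markers: list of (distance_travelled, known_mass) pairs from the marker lane.
    """
    distances = [d for d, _ in markers]
    log_masses = [math.log10(m) for _, m in markers]
    slope, intercept = fit_line(distances, log_masses)
    return 10 ** (slope * distance + intercept)


# Made-up marker lane: distance in mm, mass in Da.
markers = [(10.0, 100_000), (20.0, 50_000), (30.0, 25_000)]
print(round(estimate_mass(markers, 25.0)))
```

Real gels are only approximately linear, so a sample band should sit between two markers rather than beyond the ends of the scale.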

– Protein Purification

I’ve broken the purification methods here into 8 different headers, each a physical-chemical property or biological activity.

  1. Stability (Heat). Some proteins are more heat tolerant than others and can survive heating while others denature. If your target protein is heat stable above 60°C and your contaminants are not, then simply heating your mixture to 60°C for 30 minutes will denature most of the contaminants. This will leave you with a much higher proportion of your target protein in your mixture.
  2. Solubility (Separate by pI). Proteins are least soluble at the pH equal to their isoelectric point. When helped by the addition of salts to the solution this can lead to their precipitation. As the salt concentration increases, different proteins will precipitate.
  3. Size. Proteins can be separated by Gel Permeation Chromatography (Gel filtration). The proteins are run through a buffered, porous, cross linked resin. While small molecules are able to fit into the pores in the resin, the larger proteins cannot and so travel ahead, with the small molecules lagging behind.
    This is due to a larger volume of buffer being available to the smaller molecules, meaning more buffer must pass down the column for them to elute, compared to the relatively smaller volume of buffer required to elute the larger, excluded proteins.
  4. Density (Centrifuge). By centrifuging the sample in a test tube containing a sucrose density gradient, the centrifugal forces will force the proteins down the tube until they reach the point where the density of the sucrose solution is the same as their own. This level is known as their isopycnic level.
  5. Charge. There are several different methods of purification by charge:
    1. Gel Electrophoresis – Based on movement of a protein through a cross linked gel called polyacrylamide. This is run at a pH where the protein has a charge (not at its pI). The size of the pores can be altered by changing the concentration of cross linking reagent, and the speed at which a protein travels depends on its charge:mass ratio. (This method does not tell us anything about the protein’s molecular weight).
    2. SDS PAGE – This cannot really be used for purification because the SDS detergent (Sodium Dodecylsulphate) used denatures the protein. It unfolds the protein and surrounds it with -ve charged sulphate groups, which means all the proteins have a uniform charge:mass ratio. SDS has a 12 carbon hydrocarbon chain, and then a hydrophilic sulphate group. The SDS molecules surrounding the protein form a micelle.
      The sample is now allowed to run on a gel, from -ve to +ve, and as all the proteins have the same charge:mass ratio, their rate is determined only by their size. The smaller protein molecules move faster and the larger molecules move slower through the gel.
    3. Isoelectric Focusing – Very similar to (1) above, but the gel also contains a pH gradient, along which the proteins move until they reach the point where they have no net charge (their pI).
    4. Ion Exchange Chromatography. Essentially, both column resins and proteins carry charges that depend on pH, and by altering the pH we can hold on to some proteins while others are eluted.
      Diethylaminoethyl-Cellulose (DEAE-Cellulose) has a +ve charge below pH 9.5 whereas CarboxyMethyl-Cellulose (CM-Cellulose) has a -ve charge above pH 3.0. Therefore:
      – Proteins with a +ve charge at pH7 will bind to a column of CM-Cellulose, while
      – Proteins with a -ve charge at pH7 will bind to a column of DEAE-Cellulose.
      We can then alter the pH of the solution to release certain proteins or to pick up others. Another way of disrupting ionic interactions between the column and proteins is to increase the salt concentration.
  6. Hydrophobicity. Proteins nearly always feature hydrophobic areas or side chains and these allow the proteins to bind to resins with hydrophobic groups attached. This means the proteins can be eluted with a gradient of increasing organic solvent (eg. ethanol) in the buffer. The proteins forming the strongest interactions with the resin column will require higher concentrations of ethanol to elute.
  7. Biological Function. If a protein has a high affinity for a substrate (eg. ADH has a high affinity for NAD+) then we can use affinity chromatography. If we immobilise the substrate (eg. NAD+) then the protein will bind to that substrate, immobilising itself – allowing other proteins to run free of the column. By then running free NAD+ through the column, the protein will gradually release the immobilised NAD+ in favour of the free NAD+ and run free of the column.
    This method can purify a protein in one step, and works best if the protein has a high affinity for the bound ligand.
  8. Fusion Proteins. This involves adding a short sequence to the gene for a protein that essentially ‘tags’ the protein. An example would be a tag containing histidine residues, which would bind to metal ions in the column.
    Here, the imidazole rings on the histidine residues stick to the immobilised metal ions, allowing other proteins to elute from the column. Then, like the method above, add free imidazole to release the fusion proteins and then use a protease to cut the tag away. Run the column again and only the tags will bind, allowing the protein of interest to run free.
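
The ion exchange rule in step 5 (a protein is +ve below its pI and -ve above it, and binds the oppositely charged resin) can be written as a small decision function. This is a simplified sketch that uses only the pH limits quoted above for the two resins; `binding_resin` is a made-up name and real binding behaviour also depends on salt concentration.

```python
DEAE_POSITIVE_BELOW_PH = 9.5  # DEAE-Cellulose carries a +ve charge below this pH
CM_NEGATIVE_ABOVE_PH = 3.0    # CM-Cellulose carries a -ve charge above this pH


def binding_resin(protein_pi, buffer_ph):
    """Return the resin a protein should bind at a given pH, or None."""
    if buffer_ph < protein_pi and buffer_ph > CM_NEGATIVE_ABOVE_PH:
        # Protein is net +ve, so it binds the -ve charged resin.
        return "CM-Cellulose"
    if buffer_ph > protein_pi and buffer_ph < DEAE_POSITIVE_BELOW_PH:
        # Protein is net -ve, so it binds the +ve charged resin.
        return "DEAE-Cellulose"
    # At its pI the protein has no net charge, or the resin itself is uncharged.
    return None


print(binding_resin(protein_pi=9.0, buffer_ph=7.0))  # CM-Cellulose
print(binding_resin(protein_pi=5.0, buffer_ph=7.0))  # DEAE-Cellulose
```

Shifting `buffer_ph` past the protein's pI flips the answer, which is the basis for eluting a bound protein by changing the pH.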

Proteins – Quaternary Structure & Overview

The quaternary structure of a protein involves the association of folded polypeptide chains into a mature, active protein.

  • This can be a single polypeptide chain (monomer), 2 chains (dimer), 3 chains (trimer), 4 chains (tetramer) and so on…
  • The associated chains can be identical or different.

Some quaternary structures require additional polypeptide segments (which are removed during production) in order to achieve a working protein state (eg. mature insulin). There are also structures which will refold into their original shape after being unfolded, as the fold is set by the primary structure (the sequence of Amino Acids).

With Insulin, a helper amino acid strand is used to ‘hold’ two sequences in place, allowing the formation of disulphide bridges. I’ve tried to illustrate before and after:

S is the signal chain, while B acts as a support structure during disulphide bridge formation between A and C.

– An Overview

  • Primary Structure – The sequence of Amino acids on a chain.
  • Secondary Structure – The 3D relationship between Amino Acids – leading to α helix, β pleated sheet etc.
  • Tertiary Structure – The 3D relationship between parts of the above structure.
  • Quaternary Structure – The number of and relationship between amino acid chains (separate tertiary structures).

Proteins – Stabilising Forces

There are several different types of forces acting on/within a protein molecule. These include:

  1. Covalent Bonds:
    1. Peptide bonds between Amino Acids (C-N). Can be broken down into individual amino acids by hydrolysis with 6M acid/alkali, or by proteases/proteolytic enzymes.
    2. Disulphide bridges form between cysteine residues to form cystine. (Cysteine has -SH which forms a disulphide bridge -S-S- with another HS-). Bridges are broken down by reduction with β-mercaptoethanol to form cysteines once again.
  2. Non-Covalent Forces/Bonds:
    1. Hydrogen Bonds – these bonds are throughout the protein. The bonds in the middle of the protein structure contribute most to stability as they are furthest away from water (which would disrupt them). These can also be disrupted by heat.
    2. Van Der Waals forces/interactions – short range dipole-dipole (δ+ & δ-) interactions between close atoms. Easily disrupted by heat or denaturing agents.
    3. π-π overlap – π electron clouds delocalised over rings & bonds. Are disrupted by heat.
    4. Electrostatic bonds, Ionic interactions and Salt bridges between residues. All broken by changes in pH or high ionic strength. (Eg, positive residues include Lys, Arg, His while negative residues include Asp, Glu, Tyr & Cys).

– Zwitterions

Zwitterions are amino acids in free solution that are doubly charged. Their net charge will depend on the pH of the solution. Each amino acid has an isoelectric point at which it has no net charge.

Below the isoelectric point (also known as pI), they have a net positive (+ve) charge and above the pI they have a net negative (-ve) charge.

When amino acids become part of a polypeptide/protein, their amino and carboxyl groups are used up in forming peptide bonds, so (apart from the two ends of the chain) only the side chains can carry charges.

Proteins themselves can have isoelectric points – and this will depend on the number and type of different amino acid residues.
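
For an amino acid with no ionisable side chain, the isoelectric point is simply the average of its two pKa values, and the net charge at any pH follows from the Henderson-Hasselbalch relationship. A minimal sketch is below; the glycine pKa values (2.34 and 9.60) are the usual textbook figures rather than anything from this post, and the function names are my own.

```python
def isoelectric_point(pka_carboxyl, pka_amino):
    """pI of an amino acid whose side chain does not ionise."""
    return (pka_carboxyl + pka_amino) / 2


def net_charge(ph, pka_carboxyl, pka_amino):
    """Average net charge per molecule at a given pH."""
    # Fraction of amino groups still protonated (+1 each).
    positive = 1 / (1 + 10 ** (ph - pka_amino))
    # Fraction of carboxyl groups deprotonated (-1 each).
    negative = 1 / (1 + 10 ** (pka_carboxyl - ph))
    return positive - negative


GLYCINE = (2.34, 9.60)  # textbook pKa values, an assumption for this example
pi = isoelectric_point(*GLYCINE)
for ph in (1.0, pi, 12.0):
    print(f"pH {ph:5.2f}: net charge {net_charge(ph, *GLYCINE):+.2f}")
```

The printed charges are positive below the pI, effectively zero at it, and negative above it, matching the description above.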

– Hydrophobic Interactions

This is the prime driving force for protein folding (AKA hydrophobic collapse).

Essentially the protein chain will fold in such a way as to minimise the exposure of its hydrophobic residues to water, burying them inside the molecule. This leads to the residues with hydrophilic (polar) side chains being situated on the outside of the molecule.

Proteins – Tertiary Structures

There are two notable regular structures (strictly speaking these are secondary structure elements) – α (ALPHA) helix and β (BETA) pleated sheet.

α Helix

  • Right handed helix much like that of a DNA helix.
  • Each amino acid side chain (R group) is 100 degrees relative to the last side chain, outside of the helix. This means there are 3.6 residues per turn and 5.4 angstroms per turn/level. On the sketch below, each R stands for a different amino acid side chain.

A couple of alterations:

  • Glycine residues will disrupt the α helix. Glycine has no chiral carbon (its side chain is just an H atom), which makes it very flexible.
  • Proline has a cyclic side chain which restricts the rotation of phi to around -60°. There is also no H atom on its backbone nitrogen, so it cannot donate a Hydrogen bond to another residue.

Amphipathic Helices:

  • Helices can end up with hydrophobic residues on one side and polar (hydrophilic) on the other – essentially giving the helix two faces. The image below illustrates R1, R4, R7 and R8 as hydrophobic, and R2, R3, R5, and R6 as hydrophilic.
  • This means helices can be constructed to generate lipid (hydrophobic) or water (hydrophilic) soluble proteins.
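
The 100 degree step between side chains quoted above is what makes a "helical wheel" work: looking down the axis, each residue sits 100 degrees round from the last, so some residues end up on the same face. A small sketch of that arithmetic follows; the 1.5 Å rise per residue is derived from the 5.4 Å and 3.6 figures above, and `wheel_angles` is my own name.

```python
DEGREES_PER_RESIDUE = 100
PITCH_ANGSTROMS = 5.4  # distance travelled along the axis per turn

RESIDUES_PER_TURN = 360 / DEGREES_PER_RESIDUE
RISE_PER_RESIDUE = PITCH_ANGSTROMS / RESIDUES_PER_TURN


def wheel_angles(n_residues):
    """Angle of each side chain around the helix axis, R1 at 0 degrees."""
    return [(i * DEGREES_PER_RESIDUE) % 360 for i in range(n_residues)]


print(f"{RESIDUES_PER_TURN} residues per turn, {RISE_PER_RESIDUE:.1f} angstroms per residue")
for number, angle in enumerate(wheel_angles(8), start=1):
    print(f"R{number}: {angle} degrees")
```

Residues whose angles are close together lie on the same face of the helix, which is how one face can end up hydrophobic and the other polar.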

– β Pleated Sheet

There are two types of pleated sheet – Parallel and Anti-Parallel.

  • Parallel sheet has successive polypeptide strands in the same direction.
  • Anti-Parallel sheet has successive polypeptide strands in opposite directions.

These strands are typically 5-10 amino acids long, and the pleated sheet is formed when a continuous chain folds so that several of these strands lie side by side.

It has been suggested that the anti-parallel configuration is more stable.

Proteins – Primary & Secondary Structures

As mentioned a couple of posts ago:

  • Proteins are polypeptides made from 20 different monomers.
  • On average contain 100-400 monomers.
  • Each monomer has an approximate molecular mass of 110.

– Monomers –> Polymers. The Primary Structure.

  • Amino Acids form peptide bonds (from the carboxylic acid group on one to the amine group on another). This releases water in a condensation reaction. The location of the peptide bond (C-N) is shown below outlined in RED.
  • When reading a sequence of Amino Acids in a protein, start at the Amino terminus (NH2 end) and read to the Carboxyl terminus at the other (COOH).
  • The sequence of amino acids is known as the primary structure of a protein.

The amino acids in chains and proteins can be post-translationally modified – eg, disulphide bridges can form between cysteine residues.

– The Secondary Structure

Assuming the following:

  1. No rotation occurs round the peptide bond (as it is partly double bonded in nature).
  2. The chain of amino acids form a rhythmical structure – forming a repeating pattern.
  3. That the maximum possible number of Hydrogen bonding interactions is occurring, independent of the type of residue (amino acid).

Now to explain these points:

  1. As mentioned, the C-N bond is partly double bonded and so does not rotate. The bond length of a normal C-N bond is 1.49Å (angstroms), while the length of a normal C=N bond is 1.27Å. The length of the peptide bond is between these, at 1.32Å.
    This is due to the C-N bond resonating between single and double bonded forms, as shown above.
  2. Two rotatable bonds exist on either side of each α carbon. Their rotation angles are called phi and psi. A perfect α helix (covered later) needs phi (Φ) to be about -57 degrees and psi (Ψ) about -47 degrees.
  3. Hydrogen bonds occur between the C=O and H-N of other amino acids. In α helices, each C=O forms a hydrogen bond to the N-H 4 residues ahead in the spiral (directly above).

The attachment of Amino Acids to tRNA – Aminoacylation

  1. First the Amino Acid must be activated. This involves a reaction with ATP (adenosine triphosphate), forming Aminoacyl Adenylate and releasing pyrophosphate.
  2. Once the amino acid has been activated it can be attached to the tRNA. This follows the following scheme:
    Aminoacyl Adenylate + tRNA –> Aminoacyl-tRNA + AMP.
    See the following image from wiley.com, showing the structure of a tRNA molecule with amino acid attached. Note at the bottom is the mRNA strand.

– tRNA

As we know, tRNA is an adapter molecule that carries amino acids in an activated form to ribosomes for protein synthesis.

There is at least 1 tRNA molecule for each of the 20 amino acids.

It adopts a folding structure with internal base pairing and is about 75 nucleotides long.

Translation – RNA –> Proteins

Proteins are polymers of amino acids (polypeptides – ie. monomers joined by peptide bonds), of which there are 20 that occur naturally.

They are synthesised in the cytoplasm on ribosomes which decode the mRNA in the 5′–>3′ direction.

Most proteins contain between 100 and 400 amino acids, and as any of the 20 amino acids can occupy each position there are 20^100 to 20^400 possible sequences.

The average amino acid has a molecular mass of 110, so using this we can estimate the mass of different proteins by multiplying the average mass by the number of amino acids in the protein – eg. a 400 amino acid protein has an estimated molecular weight of 44000.
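
The two estimates above are easy to check. A quick sketch, using the average residue mass of 110 from the text (the function names are my own):

```python
AVERAGE_RESIDUE_MASS = 110  # average molecular mass of one amino acid residue
NUMBER_OF_AMINO_ACIDS = 20


def estimate_protein_mass(n_residues):
    """Rough molecular mass of a protein from its length."""
    return n_residues * AVERAGE_RESIDUE_MASS


def possible_sequences(n_residues):
    """Number of different sequences of a given length."""
    return NUMBER_OF_AMINO_ACIDS ** n_residues


print(estimate_protein_mass(400))         # 44000
print(len(str(possible_sequences(100))))  # number of digits in 20^100
```

Python integers have no size limit, so even 20^400 can be computed exactly, although the number is far larger than the count of proteins actually found in nature.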

It is estimated that there are 10^7 or 10^8 different proteins in nature.

– Amino Acids

The 20 different amino acids are:

Single Letter Code   Short Name   Name
A                    Ala          Alanine
C                    Cys          Cysteine
D                    Asp          Aspartic Acid
E                    Glu          Glutamic Acid
F                    Phe          Phenylalanine
G                    Gly          Glycine
H                    His          Histidine
I                    Ile          Isoleucine
K                    Lys          Lysine
L                    Leu          Leucine
M                    Met          Methionine (START*)
N                    Asn          Asparagine
P                    Pro          Proline
Q                    Gln          Glutamine
R                    Arg          Arginine
S                    Ser          Serine
T                    Thr          Threonine
V                    Val          Valine
W                    Trp          Tryptophan
Y                    Tyr          Tyrosine

*Met (Methionine) is also the start signal in translation. When the codon for Met is read (AUG), translation begins. Met is often removed or altered once translation has been completed. Prokaryotes also mostly start at AUG (which there codes for a modified Met, formylmethionine), but occasionally use alternatives such as GUG, which normally codes for valine.

Each amino acid is coded for by 3 bases – called a triplet. Since there are 4 possible bases at each of the 3 positions, there are 4^3 possible combinations – 64.
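
The count of 64 can be confirmed by simply listing every triplet. A tiny sketch:

```python
from itertools import product

RNA_BASES = "UCAG"

# Every ordered choice of 3 bases, with repeats allowed.
CODONS = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]

print(len(CODONS))  # 64
print(CODONS[:4])   # ['UUU', 'UUC', 'UUA', 'UUG']
```

With 64 triplets and only 20 amino acids (plus stop signals), most amino acids must be coded for by more than one triplet.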

Here’s the triplet codes for each Amino Acid in most cells:

If you’d like the file this screenshot came from: Amino Acid Codes

These codes are almost universal, with the exception of a few types of cell. These include Human Mitochondria, where there are several triplet changes – such as UGA coding for Trp rather than STOP and AUA coding for Met instead of Ile.

– tRNA and Codon Triplets

  • Amino acids are linked to an adapter molecule of tRNA. Each tRNA carries an anticodon which will match a codon on the mRNA.
  • The amino acid is bonded to the 3′ end of the tRNA strand.
  • Essentially, tRNAs come in when their anticodon matches the codon on the mRNA strand and are then removed, leaving behind the amino acid that corresponds to the codon.
  • This is repeated over and over to form a chain of amino acids until a stop codon is reached (UAA, UAG or UGA) and the completed polypeptide chain is released.
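
The steps above boil down to reading the mRNA three bases at a time from the start codon until a stop codon appears. Here is a minimal sketch using the standard genetic code; it ignores ribosomes and tRNA entirely and just does the lookup, and `translate` is my own function name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' = STOP).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA (5' to 3') from the first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUGAAUUUUAAGG"))  # MAEF
```

The output uses the single letter codes from the table above, so MAEF is Met-Ala-Glu-Phe.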

To explain this better I’ve found this animation. This is not my work, rather that of the American Society for Microbiology. I found it on their page here. Click here to watch the translation in bacterial cells video.

– Mutations caused by Errors

  • Wild Type = Normal Sequence
  • Missense = One base changed, resulting in the sequence coding for a different Amino Acid.
  • Nonsense = One or more bases changed, resulting in a STOP codon and early termination of the chain.
  • Silent = One or more base changes but the same amino acid coded for.
  • Frameshift Base Deletion = One base removed, resulting in the change of most of the following amino acids.
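
To see why a single deleted base changes most of the following amino acids, it helps to translate a short sequence before and after the deletion. This is a small sketch with an invented example sequence; `translate_frame` and `delete_base` are hypothetical helper names.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ('*' = STOP).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(mrna):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(mrna, index):
    """Return the sequence with the base at `index` removed."""
    return mrna[:index] + mrna[index + 1:]


wild_type = "AUGGCUGAAUUUUAA"  # made-up sequence: AUG GCU GAA UUU UAA
mutant = delete_base(wild_type, 3)

print(translate_frame(wild_type))  # MAEF
print(translate_frame(mutant))     # MLNF
```

Everything after the deletion is read in a shifted frame, and in this example the original stop codon is lost as well.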

Transcription – DNA –> RNA

RNA is much the same as DNA, except for a few points:

  1. The sugar is ribose rather than deoxyribose – deoxyribose has one fewer OH group – on C2:
  2. The DNA base Thymine is replaced with Uracil (same but without methyl group):
  3. RNA is usually single stranded rather than double stranded like DNA. Instead, it folds into well defined structures (rather than combining two separate strands that can be broken apart by denaturing).

There are several different types of RNA:

  • mRNA – Messenger RNA – template for protein synthesis.
  • rRNA – Ribosomal RNA – major component of ribosomes.
  • tRNA – Transfer RNA – carries activated amino acids to ribosomes.
  • snRNA – participates in RNA splicing.
  • miRNA – binds to mRNA and inhibits translation.
  • siRNA – Small Interfering RNA – binds to mRNA and promotes degradation.

– The Process of Transcription

A strand of RNA is produced from a strand of DNA – much the same as during DNA replication but in this case it is catalysed by RNA polymerase using rNTPs (ribonucleotide triphosphates). No primer is required. The synthesis occurs in the same direction as for DNA replication (5′->3′) and pyrophosphates are still released when the ribonucleotide triphosphates bind to the backbone.

  • ~17 base pairs of DNA duplex are uncovered at a time as the DNA is transcribed into RNA. Of those ~17 base pairs, only 9 are paired with RNA at any one time.
  • The transcription ‘bubble’ moves down the DNA strand 3′–>5′ at a rate of ~50 bases/sec until it reaches a termination sequence.
  • In prokaryotes, transcription AND translation occur at the same time.
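
As a rough illustration of the points above: the template DNA strand is read 3′->5′ and each base is paired with its RNA complement, giving an RNA that reads 5′->3′. The sketch below does just that pairing, plus a time estimate using the ~50 bases/sec figure; it does not model promoters or termination, and the function names are my own.

```python
# DNA template base -> RNA base that pairs with it.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """RNA (written 5'->3') made from a DNA template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def transcription_time(n_bases, bases_per_second=50.0):
    """Seconds needed to transcribe n_bases at a constant rate."""
    return n_bases / bases_per_second


print(transcribe("TACCGA"))      # AUGGCU
print(transcription_time(1000))  # 20.0
```

Note that uracil appears in the RNA wherever the template has adenine, since RNA uses U in place of T.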

– The Control of Transcription

The interactions between RNA polymerase and its promoter can be enhanced by activators or blocked by repressors.

A good example is the lac operon in prokaryotes – in Eukaryotes this is much more complex and may require chromatin remodelling to allow access to genes for transcription.

The Lac Operon controls expression of genes involved in the metabolism of lactose. A regulatory gene leads to the production of a repressor protein, which (in the absence of lactose) will bind to the operator, blocking expression of the later genes. When lactose appears, it binds to the repressor protein and disables it, changing its shape so that it can no longer bind to the operator. This allows expression of the genes further along the strand.

The above diagram shows the events when (a) no lactose is present, and (b) when lactose is present. The diagram below shows what occurs in the Tryptophan operon – you’ll see it is very similar.

– RNA Splicing

Splicing removes non-coding RNA sections from the newly synthesised strand. I mentioned non-coding DNA previously as DNA that has a purely structural role and does not code for any proteins etc. When it is copied into RNA during transcription it has no further use and so is removed by splicing.

  • A non-coding segment is called an INTRON (for intragenic regions). These sites start with GU and end with AG.
  • A coding segment is called an EXON (for regions that will be expressed).
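
Splicing can be sketched as cutting out each intron and joining the exons either side of it, checking the GU...AG boundaries mentioned above along the way. The intron positions here are supplied by hand and the example sequence is invented; real splice site recognition involves much more than these four bases, and `splice` is a hypothetical name.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA and join the remaining exons.

    introns: list of (start, end) positions, 0-based with `end` excluded.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} does not run from GU to AG")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end
    exons.append(pre_mrna[position:])  # final exon
    return "".join(exons)


# Made-up example: exon, intron (GU...AG), exon.
pre_mrna = "AUGGCU" + "GUAAGUCCAG" + "GAAUAA"
print(splice(pre_mrna, [(6, 16)]))  # AUGGCUGAAUAA
```

Leaving a different set of segments in the final strand gives a different mature mRNA from the same starting transcript.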

Because the exons can be joined in different combinations once the non-coding segments are removed (alternative splicing), several proteins can be synthesised from just one gene.

Incorrect splicing is a real risk though, and it is estimated that up to 15% of all genetic diseases are caused by errors and mutations affecting splicing.