2.9 Mutation, genes and proteins

Billions of people throughout the world are busy revising their accustomed way of life, thanks to a mutation in a virus roughly 1/1000 the width of a human hair. It is an extraordinary turn of events that few had imagined possible, despite clear warnings from the science of epidemics.

The immediacy of the situation is bringing science to the centre of public attention.  Many are eagerly following statistics tracking the outbreak and its effects. Equally compelling is the possibility of a vaccine to help the immune system fight off the virus or some kind of drug treatment to prevent it developing in the first place. Science is being observed as it actually works in real time, complete with its inherent uncertainty.

A short stretch of a DNA molecule

The epidemic is also stimulating a deeper and potentially more sustaining interest in science. With science as our ally and scientists our saviours, curiosity about basic concepts is also being aroused. What is a virus, what are antibodies, how do diseases spread and have such diverse consequences for individuals? Over the months these blogs will focus on several of these. In this one, mutation is the starting point. What is it, how does it happen, what are its effects? The word itself simply means change (Latin mutare), but what exactly is it that changes? Why can that have such terrible consequences – as well as beneficial ones sometimes?


It’s the DNA that changes or mutates. DNA is a long and unbelievably thin molecule made up of units strung together in a chain. The total DNA in a human cell is about 2 metres long but only 2.5 billionths of a metre  wide. So compared to a typical reel of cotton thread it is roughly 50 times shorter, but 100,000 times thinner. This extraordinary long thin thing is able to coil up in such a way that it fits inside the nucleus: a compartment inside nearly every cell in your body (red blood cells are an exception).
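The comparison with a reel of cotton can be checked with a little arithmetic. The sketch below takes the DNA figures from the paragraph above. The reel figures (100 metres of thread, a quarter of a millimetre thick) are assumptions chosen for illustration, not measurements.

```python
# DNA figures quoted in the text above.
DNA_LENGTH_M = 2.0         # total DNA in one human cell, in metres
DNA_WIDTH_M = 2.5e-9       # 2.5 billionths of a metre

# Assumed figures for a "typical" reel of cotton thread (illustrative only).
THREAD_LENGTH_M = 100.0    # assumption: 100 m of thread on the reel
THREAD_WIDTH_M = 0.25e-3   # assumption: thread about 0.25 mm thick

# How many times shorter and thinner the DNA is than the thread.
times_shorter = THREAD_LENGTH_M / DNA_LENGTH_M
times_thinner = THREAD_WIDTH_M / DNA_WIDTH_M

print(f"DNA is about {times_shorter:.0f} times shorter than the thread")
print(f"DNA is about {times_thinner:,.0f} times thinner than the thread")
```

With these assumed reel figures the ratios come out at about 50 and about 100,000, matching the comparison in the text.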

That’s how the entire information required to make your body can be replicated in every one of the tens of trillions of cells of your body. The total DNA in each cell is called the genome. It’s the information transmitted every time a new cell is made when an earlier one divides in two. It’s also the information that, when combined from the two sexes, goes to define the make-up of offspring – the next generation. So DNA is fundamental to both how our body works as it develops from second to second, and also how it transmits  characteristics to our descendants.

Mutation

This fine thread of DNA has a remarkable structure – as discovered famously in the 1950s in Cambridge and London by Francis Crick, Rosalind Franklin and James Watson.

The molecule comprises a backbone made of two helices which repeat regularly along its entire length – a bit like the railings and banister of a spiral staircase. Of itself, this is just a structure: the backbone carries no information of use to the body. The extraordinary power of DNA – the information it carries – lies in the units attached to each of its regular repeats, roughly analogous to the steps on a spiral staircase.

These units, known as ‘bases’, are small chemical groups of four kinds. They are known for simplicity by the first letter of their names: A C G T. Each of these bases is attached firmly to one strand of the DNA double helix. Any one of the four types can sit at any point along the long, helical thread. Thus a single strand of a DNA molecule consists of a regular backbone with an immensely long sequence of letters (or bases) strung along it.

In the human genome there are approximately three billion of these bases in a single strand of DNA. In the genome of the virus causing the COVID-19 disease there are only some thirty thousand. A stretch of your DNA might look something like this, symbolically:  ….. G   T   A   C   T   T   G   C   G   A …..

A mutation can simply be the replacement of one of these letters by another. Instead of a G there’s an A or instead of a T, a C. A letter (or base) can also be deleted or added in. It’s odd to think of such a change happening in a biological system – you’d hardly expect our bodies to behave like a fat-fingered person tapping out a message on a mobile phone. But mutations are generally caused by accidental copying errors when DNA is reproduced as a cell divides. Or they can be caused by a powerful hit from radiation – gamma rays, high-frequency X-rays, or a beam of protons, for example. This is why we are normally kept well away from powerful sources of radiation. It is also why such terrible consequences were visited on the people of Hiroshima and Nagasaki in 1945 and their descendants. Not only was the DNA in the main cells of their bodies damaged by radiation, but the DNA in their sperm and egg cells, the information that shaped the bodies of their children, was altered too.
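The three kinds of change described above (a letter replaced, deleted or added) can be sketched as simple operations on a string of letters. This is only an illustration using the sample stretch G T A C T T G C G A from earlier. The function names and the positions chosen are made up for the example.

```python
def substitute(seq: str, pos: int, base: str) -> str:
    """Replace the letter at index pos (counting from 0) with another base."""
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq: str, pos: int) -> str:
    """Remove the letter at index pos."""
    return seq[:pos] + seq[pos + 1:]


def insert(seq: str, pos: int, base: str) -> str:
    """Add an extra letter just before index pos."""
    return seq[:pos] + base + seq[pos:]


original = "GTACTTGCGA"  # the sample stretch shown in the text

print(substitute(original, 0, "A"))  # first G becomes A: ATACTTGCGA
print(delete(original, 3))           # the C is lost:     GTATTGCGA
print(insert(original, 5, "G"))      # an extra G added:  GTACTGTGCGA
```

A substitution leaves the length unchanged, while a deletion or insertion shifts every letter that follows it.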

Genes determine proteins

We talk colloquially of genes as though their role was simply to transmit features from one generation to the next – “he’s got his father’s eyes”. They do this of course through the copy of the genes carried in the highly specialised egg and sperm cells. But almost every other cell carries all the genes too. What use do they make of them, you might ask?

It’s in the trillions of other cells that genes are continuously put to work. Their role is to provide the specific information for the production of all the proteins in our bodies. Proteins are the workhorses and machine tools of the body’s various systems. There are thousands of different types in a human being – approximately 30,000 have so far been identified, but there may be up to 100,000. As workhorses they do things such as moving our muscles (actin and myosin), carrying oxygen around the body (haemoglobin) or reacting to light in the retina (rhodopsin). The jobs they may do as tools (enzymes) include cutting up big molecules in the stomach and releasing energy by ‘burning’ the glucose we eat in the presence of the oxygen we breathe in. Other types of protein are equally important. Some are structural – such as the collagen in your skin or keratin in hair. Others, such as antibodies, are protective, recognising viruses or bacteria and ensuring they are eliminated. Others still act as signallers (technically, hormones) regulating the behaviour of systems, as insulin does for our sugar levels.

For each protein there is a corresponding gene – a short stretch of DNA, located somewhere in the vast long double helix thread. The sequence of bases (or letters) in a stretch of DNA called a gene prescribes which type of protein could be produced from it. For example one particular gene called HBB makes a protein which goes to make up the haemoglobin in your blood.

Using genes to make proteins

If it’s the information embodied in genes that enables the proteins in our bodies to be produced, how does this information get used? Where does this all happen?

The mechanism that reads off the information from the DNA, and produces proteins from it, is buried inside every single cell that has a nucleus (most do) – and there are literally trillions of these in the human body (37 trillion on one estimate[i]). Inside the nucleus, inside every cell, is the complete set of DNA for the whole body – the full information needed to make a person. It’s astounding! What a feat of packaging! But that’s why cloning is possible. From the DNA in a single cell, a whole sheep can, in principle, be created. Hello Dolly.

[i] https://www.smithsonianmag.com/smart-news/there-are-372-trillion-cells-in-your-body-4941473/

The molecules of protein are, like many key molecules in our bodies, long chains, rather like necklaces with beads linked together in one long thread. Each bead or link of the chain is a small molecule, one of a class of molecules called amino acids. These come in twenty varieties, which gives plenty of scope for constructing all the different kinds of protein we need, by linking together different types of “bead” in different sequences.
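To get a feel for the “plenty of scope” that twenty kinds of bead provide, one can count the possible necklaces. With 20 choices at each position, a chain of n beads can be arranged in 20 to the power n ways. The sketch below simply evaluates that formula. The chain lengths are arbitrary examples, not the lengths of real proteins.

```python
AMINO_ACID_KINDS = 20  # the twenty varieties mentioned in the text


def possible_chains(length: int) -> int:
    """Number of distinct sequences of the given length: 20 ** length."""
    return AMINO_ACID_KINDS ** length


# Arbitrary example lengths, chosen only to show how fast the count grows.
for length in (2, 3, 10):
    print(f"{length} beads: {possible_chains(length):,} possible chains")

# For longer chains the number is too big to read, so report its size instead.
digits = len(str(possible_chains(100)))
print(f"100 beads: a number with {digits} digits")
```

Even a chain of three beads allows 8,000 different sequences, and the count grows twentyfold with each extra bead.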

The proteins we need every second of every day are built inside every cell, largely from components we have eaten. That’s why we need to eat protein from  other living things to survive. But we don’t use animal or plant proteins just as they are; we need to build our own human ones. Our digestion systems have evolved to ensure that the protein molecules we consume – in meat, nuts, milk, eggs, beans or whatever – get broken up as soon as we start to eat, from the moment your saliva gets to work on them. The digestive juices in the stomach and intestines cut up the long chain of a protein into its many links – the amino acids of which it is composed. These then get absorbed through the gut lining into the bloodstream and thus find their way to all the cells of your body.

It is here, inside the trillions of cells of which your body is made, that the new proteins, the ones you need, are made. The amino acids that may have originated in a slice of cheese or fillet of fish are re-purposed to make up the proteins you need as Homo sapiens. Turning the amino acid components from the food you eat into the muscle, skin and bone of which you are made is the routine work of the genes. They contain the vital information that is you, the code that turns amino acids into proteins.

You can read how the code is translated and how mutations occur in the Further Information section below.

Mutations originate in the DNA of our genes as a result of copying mistakes when DNA is replicated. These accidental changes lead to the production of altered proteins. As a rule these changes make little difference to the way our proteins work. In some cases, however, they prove deleterious, as in the genetic disorder cystic fibrosis. Very occasionally, the altered protein turns out to be even better at its job than the intended version. In a classic example, a particular kind of protein in a light-coloured moth changed in a way that darkened it. On the darkened surfaces of smoke-covered Victorian trees this moth survived better than its brighter peers and went on to breed more successfully. The moth had adapted to a changed environment – the wellspring of evolution. Mutations enable species to evolve to fit their environment better.


Human beings tend to reproduce less rapidly than moths – every twenty to thirty years rather than every few weeks or months. Mutations that occur in the DNA of egg or sperm cells transfer as each new generation is born. For moths this occurs roughly 50 times more frequently than for humans. So evolution happens faster for moths than humans. They are able to adapt more rapidly to their surroundings than humans  – hence their ability to react to air pollution in nineteenth century Britain, while our bodily systems remain largely as they were for our hunter-gatherer ancestors.

Viruses can reproduce even faster, in hours or days in the right host[i]. New generations may therefore appear many thousands of times faster than for human beings. This means mutations occur much more frequently and hence viruses can adapt to changes in their environment relatively rapidly. This explains how a virus affecting bats but not humans was able to evolve into a new version, fatal to human life, so rapidly.

[i] https://www.quora.com/How-fast-do-viruses-replicate
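The comparison of generation times for humans, moths and viruses is simple division. The rough sketch below takes 25 years as the midpoint of the “twenty to thirty years” quoted for humans. The six months for a moth and one day for a virus are assumptions picked from within the ranges mentioned in the text, purely for illustration.

```python
HUMAN_GENERATION_YEARS = 25.0      # midpoint of "twenty to thirty years"
MOTH_GENERATION_YEARS = 0.5        # assumption: about six months
VIRUS_GENERATION_YEARS = 1 / 365   # assumption: about one day


def generations_per_human_generation(generation_years: float) -> float:
    """How many generations pass in the time humans take to produce one."""
    return HUMAN_GENERATION_YEARS / generation_years


moth_ratio = generations_per_human_generation(MOTH_GENERATION_YEARS)
virus_ratio = generations_per_human_generation(VIRUS_GENERATION_YEARS)

print(f"Moths: about {moth_ratio:.0f} generations per human generation")
print(f"Viruses: about {virus_ratio:,.0f} generations per human generation")
```

On these assumptions moths turn over about 50 times faster than humans and viruses several thousand times faster. A virus that reproduces in hours rather than a day would widen the gap further.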

The current SARS-CoV-2 virus is continuing to mutate as we speak – roughly 2.5 times per month. Mutants are being identified continuously around the world, enabling an international research effort to study the progression of the disease across the globe. If you have the time and energy you can read about this fascinating research in the Further Information section below. If you don’t, now’s the time to take a break!

© Andrew Morris 31st May 2020

Further information

Translating the code

Inside the nucleus of each cell, where the DNA is stored, a copy is made of the code on one of the strands of the DNA double helix. This copy, itself a long thin molecule known as messenger RNA, moves out of the nucleus into the main body of the cell where it enables proteins to be built.

As this animated diagram shows, the trick of translating the letters in the DNA code into physical protein is performed by an ensemble of molecular actors. The star of the show is the cross-shaped molecule called tRNA (for ‘transfer’). This is a double-ended molecule that is able to interpret the genetic code.

Image Courtesy of Biology Discussion Forums

Protein Synthesis

At one end it carries three ‘letters’ (they are three of the ‘bases’ mentioned earlier). At the other end it carries one specific amino acid out of the twenty kinds available. This is how the code gets interpreted – it’s the enigma machine of the cell. Each tRNA molecule pairs a particular triplet of letters at one end with a particular amino acid at the other. There are at least twenty different kinds of tRNA molecule – one or more for each type of amino acid. Floating around in the fluid of the cell are vast numbers of all these molecules.

The code

The remarkable discovery of the late 1950s and 1960s, hard on the heels of the discovery of the double helix structure, was the way information encoded in DNA was interpreted inside our cells. It enables the ‘beads’ for a protein to be strung together into a ‘necklace’. A number of large molecules work in sequence to ‘read’ sections of the code and add specific amino acids to a growing chain that will become a protein.

In the first step, the DNA code is transcribed to make a strand of the very similar RNA molecule (the small difference is that the base T is replaced by a similar one called U). This code is then translated in a way that enables the units that make up a particular protein to be put together.

Each group of three ‘letters’ in the RNA message spells out the code for one specific amino acid (or ‘bead’ in the necklace). The diagram shows, for example, that the letters G C U correspond to an amino acid called alanine.

In the necklace analogy ‘alanine’ might be a glossy red bead, ‘threonine’ a dull yellow one and so on. Each group of three letters is called a codon.
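The two steps described above, copying DNA letters into RNA and then reading the RNA three letters at a time, can be sketched in a few lines. This is a deliberately simplified model. It treats transcription as swapping T for U on the strand being read, whereas in the cell the RNA is built as the complement of the opposite strand. The table holds only four codons from the standard genetic code, and the DNA fragment is invented for the example.

```python
# Partial codon table: just the four codons needed for this example.
# These pairings are taken from the standard genetic code.
CODON_TABLE = {
    "GCU": "alanine",
    "ACU": "threonine",
    "GAG": "glutamate",
    "UAG": "STOP",
}


def transcribe(dna: str) -> str:
    """Simplified transcription: copy the letters, replacing T with U."""
    return dna.replace("T", "U")


def translate(rna: str) -> list[str]:
    """Read the RNA one codon (three letters) at a time until a stop signal."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "GCTACTGAGTAG"   # hypothetical gene fragment, made up for illustration
rna = transcribe(dna)

print(rna)                          # GCUACUGAGUAG
print(" - ".join(translate(rna)))   # alanine - threonine - glutamate
```

Each codon is looked up independently, which is why a change to a single letter alters at most one bead in the necklace, unless it creates or removes a stop signal.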

Each stretch of an RNA molecule was derived from one specific gene. It goes on to enable one specific protein molecule to be made – in this case a protein composed of alanine – threonine – glutamate ……. and so on. What these chemical components are need not concern us here; the key point is that information coded in a strand of DNA – a gene – has been translated, enabling a physical protein molecule to be produced. That’s the amazing story of how your genes enable your body to produce the vital proteins it needs to survive. It also explains why your body resembles that of your ancestors – you have genes in common. They passed them on to you.

Mutation and evolution

With this insight into the intricate mechanism that determines how your body gets built, it’s easy to see how it can all go wrong. Just one letter mis-translated, accidentally substituted, deleted or inserted can spell disaster. In the diagram above, codon 3 (G A G) differs by only one letter from codon 7 (U A G). If G A G were to be accidentally transcribed as U A G, the wrong instruction would be inserted into the emerging protein chain: in place of a glutamate molecule, a signal to stop altogether. The resulting protein would differ from that intended. It might be defective, or effective, or possibly even better at its job than the intended protein.
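The effect of turning G A G into U A G can be played out on a short invented message. The sketch below uses a four-entry codon table from the standard genetic code and a made-up RNA fragment. It changes a single letter and compares the chains produced before and after.

```python
# Partial codon table from the standard genetic code (example codons only).
CODON_TABLE = {
    "GCU": "alanine",
    "ACU": "threonine",
    "GAG": "glutamate",
    "UAG": "STOP",
}


def translate(rna: str) -> list[str]:
    """Read codons three letters at a time, stopping at a stop signal."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical message: alanine, threonine, glutamate, alanine.
normal_rna = "GCUACUGAGGCU"

# Single-letter substitution: the G at index 6 becomes U, so GAG reads UAG.
mutant_rna = normal_rna[:6] + "U" + normal_rna[7:]

print(translate(normal_rna))  # ['alanine', 'threonine', 'glutamate', 'alanine']
print(translate(mutant_rna))  # ['alanine', 'threonine']
```

One changed letter cuts the chain short: everything after the new stop signal is never built.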

This is how mutations affect living systems. By accidental mis-copying, changes are made in the components of our bodies. Strangely these apparent ‘mistakes’ are happening all the time. In most cases, however, the overall effect is neutral – the altered protein is neither better nor worse at its job than the intended one. However, on some occasions it is worse and that can lead to a genetic disorder. An example is the blood disorder sickle cell anaemia, in which an altered amino acid produces defective haemoglobin molecules which damage the oxygen-carrying ability of the blood.

Mutation and the coronavirus

Mutations have a bad name – naturally! But, though they occur regularly as mistakes in the copying process, it is rare for them to prove harmful. Most have a neutral effect, conferring neither advantage nor disadvantage on the next generation. Influenza viruses seem to mutate fast enough to enable them, each year, to evade the antibodies produced against the previous year’s version. Mutations in some bacteria mean they are evolving in ways that enable them to resist antibiotic drugs designed to kill them.

Under normal conditions the human population is rarely overwhelmed by infections. The various defence mechanisms of the immune system usually succeed in combatting unwelcome intruders: viruses, bacteria, parasites and other microbes. The immune system not only protects against familiar bugs but also identifies new mutated versions and adapts accordingly. That’s why we don’t get last year’s flu again.

The mutated coronavirus has caused such devastation because it is not only able to transfer itself easily from one host to another, but can do so many days before anyone is aware of symptoms. A random mutation has resulted in the virus being able to survive and spread through sneezes and coughs before any protective measures have been taken to block its progress.

One interesting benefit from the rapid mutation rate of the SARS-CoV-2 virus is that it enables us to track the progress of the disease across the world. The virus’s genes evolve at a steady rate of about 2.5 mutations a month. The slightly different genes of the various mutants can be analysed in different places at different times. The number of altered bases (or “letters”) in each type of mutant can be used to work out how much time has elapsed since the original outbreak in Wuhan. An evolutionary tree can be built up showing how, when and where one mutant strain diverged from its predecessors. The diagram below, produced by an open-source collaboration of scientists called nextstrain[i], shows when the various mutants appeared in different countries (for country colours, follow the link).

[i] https://nextstrain.org/ncov/global

Evolutionary tree of the virus as it mutates (colours represent countries)
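The dating idea behind such a tree can be illustrated with a back-of-envelope calculation. If the virus picks up about 2.5 mutations a month, dividing the number of altered letters in a sample by 2.5 gives a rough estimate of the months elapsed since the original outbreak. The sketch below is only that simple division, not the method nextstrain actually uses. The sample names and mutation counts are invented.

```python
MUTATIONS_PER_MONTH = 2.5  # the steady rate quoted in the text


def months_since_outbreak(mutation_count: int) -> float:
    """Rough elapsed time implied by a number of accumulated mutations."""
    return mutation_count / MUTATIONS_PER_MONTH


# Hypothetical samples: the names and counts are invented for illustration.
samples = {"sample_A": 5, "sample_B": 10, "sample_C": 15}

for name, count in samples.items():
    months = months_since_outbreak(count)
    print(f"{name}: {count} mutations, roughly {months:.0f} months")
```

Real analyses compare which letters have changed, not just how many, which is what allows the branches of the tree to be drawn.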