
Microbial community sequencing: what is it and why do we use it?

I often find myself trying to explain what exactly I mean by sequencing, particularly when I talk about sequencing whole microbial communities. I’m sure I can’t be the only one with this problem, so I thought I’d try to explain it here.

What exactly is sequencing?

When we talk about sequencing, we usually mean that we're looking at the sequence of an organism's DNA - the order in which the nucleotides (small molecules) occur. There are four nucleotides (also called bases) in DNA: adenine, guanine, cytosine and thymine, and the order in which they are found determines almost everything about that organism. Genes are essentially stretches of DNA that code for proteins, which in turn make up cells and tell the components of cells what to do.

What do we use sequencing for?

We usually use sequencing to look at a specific gene that we're interested in, or to work out what a gene might be doing by comparing it to other known genes. We can use the DNA sequence of that gene to work out what protein it makes, and this can tell us about the organism that it comes from. A bacterial gene is typically around 900 bases long. Sequencing can also be used to get the DNA sequence of a whole organism (its genome - all of the DNA, including all of the genes, within that organism), which for bacteria is somewhere between 0.6 and 8 million bases. Sequencing is now relatively cheap; some companies offer the sequencing of one gene for approximately £3, and a whole genome for approximately £80.
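To make the step from DNA sequence to protein a little more concrete, here is a minimal Python sketch. It assumes the standard genetic code, reads the sequence three bases at a time from the first base, and stops at the first stop codon. The function name translate and the example sequence are made up for illustration; real tools also have to find where a gene starts and deal with ambiguous bases.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence into one-letter amino acid codes.

    Assumes the gene starts at the first base. Stops at the first stop
    codon; codons containing unknown bases are written as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG GCC AAA TAA
print(translate("ATGGCCAAATAA"))  # MAK
```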

Sequencing of individual bacteria

In my work, I mainly look at a gene called the 16S gene (approximately 1500 bases long). Because this gene has evolved relatively slowly, it is highly conserved in closely related species of bacteria, and we can therefore use it to identify these bacteria. Once we have its sequence, we can compare it with a database of known sequences of bacteria. By looking at how different it is from other sequences, we can see how closely related this bacterium is to others. Typically (in bacteria), if this sequence is at least 97% similar to another sequence, we consider the bacteria to be of the same species. Of course, as in all areas of biology, there will always be exceptions to this, but it works quite well as a general rule. If this sequence is not similar enough to class the bacterium we are looking at as the same species as anything else known, then we can still use it to see what it is most similar to. This allows us to make predictions about what job this bacterium might be carrying out, where it might live, and what kind of lifestyle it might have.
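As a rough illustration of the 97% rule, the sketch below counts the positions at which two sequences match and turns that into a percentage. It assumes the two sequences have already been aligned to the same length, which real database searches do for you with an alignment step that is not shown here. The function names are made up for this example.

```python
SPECIES_THRESHOLD = 97.0  # the rule-of-thumb cut-off described above


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def same_species(seq_a: str, seq_b: str, threshold: float = SPECIES_THRESHOLD) -> bool:
    """Apply the general rule: at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold


# Made-up 100-base sequences that differ at a single position.
reference = "ACGT" * 25
query = "ACGT" * 24 + "ACGA"
print(percent_identity(reference, query))  # 99.0
print(same_species(reference, query))      # True
```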

Sequencing of microbial communities

We can also use sequencing to look at all of the members of a microbial community. This is now quite widely used, mainly due to advances in computing that allow us to process large amounts of data. Sequencing whole communities lets us see how they differ between different areas, and how they change over time. Usually, when we sequence whole communities, we use a shorter section of the gene of interest (for example, for the 16S gene, we use a section that is just under 400 bases long). This of course reduces the resolution of the data, but allows us to get more of it.

What do we use this sequencing information for?

For the sequencing that I've been using for my PhD, I got an average of about 10,000 DNA sequences per sample. I then needed to do quite a lot of computational work and data processing - described simply, I removed any sequences that didn't meet the quality criteria, worked out which ones were the same, and then classified them to their closest relatives using databases (as I've described above for individuals). I can then use the number of sequences from each individual to see how abundant it is in the whole community (just as we would if we went out and did a survey of the number of animal species in a jungle, etc.), and we can see how it changes over time, whether it occurs at the same time as another species, whether it only occurs when another species isn't present, and so much more!
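The steps above can be sketched in a few lines of Python. This is only a toy version under stated assumptions, not the actual pipeline I used: each read is assumed to come with a list of per-base quality scores, the quality criterion is assumed to be a minimum mean score of 25 (an arbitrary value chosen for the example), and the classification step is left out because it depends on the reference database. All of the function names and reads are made up.

```python
from collections import Counter


def quality_filter(reads, min_mean_quality=25):
    """Keep sequences whose mean per-base quality meets the threshold.

    `reads` is a list of (sequence, quality_scores) pairs. The threshold
    of 25 is an assumption made for this example.
    """
    kept = []
    for sequence, scores in reads:
        if scores and sum(scores) / len(scores) >= min_mean_quality:
            kept.append(sequence)
    return kept


def dereplicate(sequences):
    """Group identical sequences and count how often each one occurs."""
    return Counter(sequences)


def relative_abundance(counts):
    """Convert counts into percentages of all sequences in the sample."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sequence: 100.0 * n / total for sequence, n in counts.items()}


# Made-up sample: five very short reads, one of them low quality.
sample = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGT", [32, 31, 30, 29]),
    ("ACGT", [28, 30, 30, 30]),
    ("TTGA", [30, 30, 30, 30]),
    ("GGCC", [5, 5, 5, 5]),  # fails the quality filter
]
counts = dereplicate(quality_filter(sample))
print(relative_abundance(counts))  # {'ACGT': 75.0, 'TTGA': 25.0}
```

Running the same calculation on samples taken at different times, or from different surfaces, gives the kind of abundance comparison described in the rest of this post.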

When we put this information about the microbial community together with other information about the environment it was found in, or how we grew it, we can make predictions about where each of the community members might be found, or what they might be doing, even if we haven't been able to culture the individual bacteria. For my PhD, I've been using the sequencing information to try to identify bacteria that might be associated with an ability to degrade plastics. Here we can see the abundance (as a percentage) of different bacterial species growing on either polyethylene (PE) or polystyrene (PS), where each species is a different colour:

We can see that although many of the species occur on both PE and PS, they differ quite drastically in abundance, and in fact, one of them (Marinobacter hydrocarbonoclasticus) makes up almost 50% of the bacterial community on PE (in this case, this is a species that has been previously suggested to be able to degrade plastics). Once we have identified plastic-degrading candidates, we can try to isolate them, or compare them with isolates that we already have, to test whether they are actually able to degrade plastics. If we don't already have isolates of them, then we can look at the kinds of conditions that close relatives need for growth, and use this information to help us obtain them.
