| The paper uses 16S rDNA (16S rRNA gene) sequencing, which is a bit old-fashioned now but was a good method when the paper was published. The steps basically involve: 1. Extract all the DNA from poop, normally using a kit that makes DNA stick to tiny beads. You wash the beads in a series of chemical solutions to separate the DNA from the rest of the sample and purify it. There are a lot of different ways to do this. 2. Amplify a small section of DNA, from the 16S rRNA gene, which is found in all bacteria and archaea and is used as a barcode. This barcode has some regions that change a lot between species and some regions that barely change at all. 3. Sequence the amplified DNA. The DNA sequencer determines the order of nucleotides in each DNA amplicon (an amplicon is an amplified piece of DNA). An example DNA sequence is ACCTGGCT. 4. The DNA sequencer produces millions of DNA sequences in parallel and stores them, along with some metadata (e.g. quality and confidence measurements), in text files (typically FASTQ). 5. When this paper was published, a friendly bioinformatician would have taken the text files and clustered the sequences. Sequences that were at least 97% similar were binned together as a rough approximation of a species. Different taxonomic levels have different cutoffs, but it's all quite vague, and there are better methods now that denoise the sequences using the quality measurements (e.g. DADA2). 6. A count is generated for each bin, and a "representative sequence" for each bin is matched against taxonomic databases to see which species are present. 7. Normal ecological analysis is done on the count data to calculate alpha and beta diversity, or to do other types of analysis. Once you have counts, it doesn't matter that the data come from bacteria instead of sheep or penguins. Newer methods involve sequencing every single bit of DNA in a sample, not just a specific region. This is called (shotgun) metagenomics; it's very hard to do and requires very big computers and big DNA sequencers.
|
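To make the clustering, counting and diversity steps a bit more concrete, here is a toy sketch in Python. It is not what any real pipeline does: it assumes all reads are the same length, measures similarity as the fraction of matching positions (real tools align the sequences first), and bins reads greedily against the first representative that is at least 97% identical. The function names (`identity`, `greedy_cluster`, `shannon`) and the example reads are made up for illustration.

```python
import math


def identity(a: str, b: str) -> float:
    """Fraction of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy example only compares equal-length reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(reads, threshold=0.97):
    """Bin each read with the first representative it matches at >= threshold.

    Returns (representatives, counts), where counts[i] is the number of
    reads assigned to the bin whose representative is representatives[i].
    """
    representatives = []
    counts = []
    for read in reads:
        for i, rep in enumerate(representatives):
            if identity(read, rep) >= threshold:
                counts[i] += 1
                break
        else:
            # No existing bin is similar enough, so this read starts a new one.
            representatives.append(read)
            counts.append(1)
    return representatives, counts


def shannon(counts):
    """Shannon diversity index (natural log), a common alpha diversity measure."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical 100-nucleotide reads, invented for this example.
species_a = "ACGT" * 25
close_to_a = "TT" + species_a[2:]    # 2 mismatches: 98% identical to species_a
far_from_a = "TTTA" + species_a[4:]  # 4 mismatches: 96% identical to species_a
species_b = "TTGCA" * 20

reads = [species_a, close_to_a, species_a, species_b, far_from_a]
representatives, counts = greedy_cluster(reads)
print("bin counts:", counts)
print("Shannon diversity:", shannon(counts))
```

Note that the read with 4 mismatches ends up in its own bin even though it is almost certainly the same organism, which is one reason fixed similarity cutoffs are considered quite rough. The `shannon` function only ever sees a list of counts, so it would work just as well on sheep or penguins.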