In real life, sequencing data is never as clean and clear as the data presented here. Sequencing machines do not read eDNA sequences perfectly, occasionally mistaking a C for a T, for instance (though this typically happens less than 1% of the time). Additionally, there will even be some variability in the barcode among individuals of the same species! How can sequences without 100% identity to a reference still be useful for biodiversity assessment?

Answer:

By using consensus sequences and complementing DNA barcoding data with other sources of information (e.g., morphological and ecological data).

Explanation:

DNA barcoding is a taxonomic method that identifies and classifies species by comparing short, standardized DNA sequences. Commonly used barcoding genes include the 16S ribosomal RNA (rRNA) gene in prokaryotes, the cytochrome c oxidase subunit I (COI) gene in animals, the internal transcribed spacer (ITS) region in fungi, and rbcL, the gene for the large subunit of RuBisCO, in plants. A good barcoding gene should vary little within a species but clearly between species, contain conserved regions so that suitable PCR primers can be designed, and be short (roughly 100 to 1000 bases). The main problem with this identification method is how much confidence a given level of sequence similarity provides. For example, even when two COI sequences are identical, there is reportedly about a 6 percent chance that they belong to different species. Conversely, the amount of intraspecific variability in barcode sequences differs from group to group, even among closely related species, so no single identity cutoff fits every case. These problems, along with sequencing errors in the reads, can be partially overcome by aligning the sequences (including outgroups) to obtain consensus sequences. Moreover, taxonomists reasonably argue that DNA barcoding data should be complemented with morphological and ecological data to achieve accurate species identification.
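
To make the consensus idea concrete, here is a minimal Python sketch. It assumes the reads have already been aligned to the same length, and it uses a simple majority vote per position, which is only one of several ways to build a consensus. The function names, the toy 10-base sequences, and the 97% cutoff are illustrative assumptions, not values taken from the question or from any particular barcoding pipeline.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads."""
    if len({len(r) for r in aligned_reads}) != 1:
        raise ValueError("reads must be non-empty and aligned to the same length")
    # zip(*reads) walks the alignment one column (position) at a time;
    # the most common base in each column goes into the consensus.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_reads))


def percent_identity(a, b):
    """Percentage of matching positions between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy data (hypothetical): three reads of the same barcode, one with a C->T error.
reads = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTATGTAC",
]
reference = "ACGTACGTAC"
THRESHOLD = 97.0  # assumed identity cutoff, chosen only for illustration

cons = consensus(reads)
print("single noisy read:", percent_identity(reads[2], reference), "% identity")
print("consensus:        ", percent_identity(cons, reference), "% identity")
print("consensus passes cutoff:", percent_identity(cons, reference) >= THRESHOLD)
```

In this toy case the single erroneous read falls below the cutoff on its own, while the consensus of the three reads matches the reference, because a random read error is outvoted by the other reads. Real intraspecific variation is not removed this way, which is why the cutoff is set below 100% and why other lines of evidence are still needed.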