Global biodiversity in freshwater and the oceans is declining at high rates. Reliable tools for assessing and monitoring aquatic biodiversity, especially for rare and secretive species, are important for efficient and timely management. Recent advances in DNA sequencing have provided a new tool for species detection from DNA present in the environment. In this study, we tested whether an environmental DNA (eDNA) metabarcoding approach, using water samples, can be used for addressing significant questions in ecology and conservation. Two key aquatic vertebrate groups were targeted: amphibians and bony fish. The reliability of this method was cautiously validated in silico, in vitro and in situ. When compared with traditional surveys or historical data, eDNA metabarcoding showed a much better detection probability overall. For amphibians, the detection probability with eDNA metabarcoding was 0.97 (CI = 0.90–0.99) vs. 0.58 (CI = 0.50–0.63) for traditional surveys. For fish, in 89% of the studied sites, the number of taxa detected using the eDNA metabarcoding approach was higher or identical to the number detected using traditional methods. We argue that the proposed DNA‐based approach has the potential to become the next‐generation tool for ecological studies and standardized biodiversity monitoring in a wide range of aquatic ecosystems. See also the Perspective by Hoffmann, Schubert and Calvignac‐Spencer.
Hybridization among diverging lineages is common in nature. Genomic data provide a special opportunity to characterize the history of hybridization and the genetic basis of speciation. We review existing methods and empirical studies to identify recent advances in the genomics of hybridization, as well as issues that need to be addressed. Notable progress has been made in the development of methods for detecting hybridization and inferring individual ancestries. However, few approaches reconstruct the magnitude and timing of gene flow, estimate the fitness of hybrids or incorporate knowledge of recombination rate. Empirical studies indicate that the genomic consequences of hybridization are complex, including a highly heterogeneous landscape of differentiation. Inferred characteristics of hybridization differ substantially among species groups. Loci showing unusual patterns – which may contribute to reproductive barriers – are usually scattered throughout the genome, with potential enrichment in sex chromosomes and regions of reduced recombination. We caution against the growing trend of interpreting genomic variation in summary statistics across genomes as evidence of differential gene flow. We argue that converting genomic patterns into useful inferences about hybridization will ultimately require models and methods that directly incorporate key ingredients of speciation, including the dynamic nature of gene flow, selection acting in hybrid populations and recombination rate variation.
The rapid expansion of next‐generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome‐partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole‐genome sequencing in ecological and evolutionary genomic studies. High‐throughput targeted capture is one such strategy that involves the parallel enrichment of preselected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, but these approaches have yet to expand broadly across laboratories focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to (i) increase the accessibility of targeted capture to researchers working in nonmodel taxa by discussing capture methods that circumvent the need for a reference genome, (ii) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy and (iii) discuss the future of targeted capture and other genome‐partitioning approaches in the light of the increasing accessibility of whole‐genome sequencing. Given the practical advantages and increasing feasibility of high‐throughput targeted capture, we anticipate an ongoing expansion of capture‐based approaches in evolutionary and ecological research, synergistic with an expansion of whole‐genome sequencing.
DNA barcoding has had a major impact on biodiversity science. The elegant simplicity of establishing massive scale databases for a few barcode loci is continuing to change our understanding of species diversity patterns, and continues to enhance human abilities to distinguish among species. Capitalizing on the developments of next generation sequencing technologies and decreasing costs of genome sequencing, there is now the opportunity for the DNA barcoding concept to be extended to new kinds of genomic data. We illustrate the benefits and capacity to do this, and also note the constraints and barriers to overcome before it is truly scalable. We advocate a twin track approach: (i) continuation and acceleration of global efforts to build the DNA barcode reference library of life on earth using standard DNA barcodes and (ii) active development and application of extended DNA barcodes using genome skimming to augment the standard barcoding approach.
Preserving biodiversity is a global challenge requiring data on species’ distribution and abundance over large geographic and temporal scales. However, traditional methods to survey mobile species’ distribution and abundance in marine environments are often inefficient, environmentally destructive, or resource‐intensive. Metabarcoding of environmental DNA (eDNA) offers a new means to assess biodiversity on much larger scales, but adoption of this approach for surveying whole animal communities in large, dynamic aquatic systems has been slowed by significant unknowns surrounding error rates of detection and relevant spatial resolution of eDNA surveys. Here, we report the results of a 2.5 km eDNA transect surveying the vertebrate fauna present along a gradation of diverse marine habitats associated with a kelp forest ecosystem. Using PCR primers that target the mitochondrial 12S rRNA gene of marine fishes and mammals, we generated eDNA sequence data and compared it to simultaneous visual dive surveys. We find spatial concordance between individual species’ eDNA and visual survey trends, and that eDNA is able to distinguish vertebrate community assemblages from habitats separated by as little as ~60 m. eDNA reliably detected vertebrates with low false‐negative error rates (1/12 taxa) when compared to the surveys, and revealed cryptic species known to occupy the habitats but overlooked by visual methods. This study also presents an explicit accounting of false negatives and positives in metabarcoding data, which illustrates the influence of gene marker selection, replication, contamination, biases impacting eDNA count data and ecology of target species on eDNA detection rates in an open ecosystem.
Hybrid zones have been promoted as windows on the evolutionary process and as laboratories for studying divergence and speciation. Patterns of divergence between hybridizing species can now be characterized on a genomewide scale, and recent genome scans have focused on the presence of ‘islands’ of divergence. Patterns of heterogeneous genomic divergence may reflect differential introgression following secondary contact and provide insights into which genome regions contribute to local adaptation, hybrid unfitness and positive assortative mating. However, heterogeneous genome divergence can also arise in the absence of any gene flow, as a result of variation in selection and recombination across the genome. We suggest that to understand hybrid zone origins and dynamics, it is essential to distinguish between genome regions that are divergent between pure parental populations and regions that show restricted introgression where these populations interact in hybrid zones. The latter, more so than the former, reveal the likely genetic architecture of reproductive isolation. Mosaic hybrid zones, because of their complex structure and multiple contacts, are particularly good subjects for distinguishing primary intergradation from secondary contact. Comparisons among independent hybrid zones or transects that involve the ‘same’ species pair can also help to distinguish between divergence with gene flow and secondary contact. However, data from replicate hybrid zones or replicate transects do not reveal consistent patterns; in a few cases, patterns of introgression are similar across independent transects, but for many taxa, there is a distinct lack of concordance, presumably due to variation in environmental context and/or variation in the genetics of the interacting populations.
Organisms continuously release DNA into their environments via shed cells, excreta, gametes and decaying material. Analysis of this ‘environmental DNA’ (eDNA) is revolutionizing biodiversity monitoring. eDNA outperforms many established survey methods for targeted detection of single species, but few studies have investigated how well eDNA reflects whole communities of organisms in natural environments. We investigated whether eDNA can recover accurate qualitative and quantitative information about fish communities in large lakes, by comparison to the most comprehensive long‐term gill‐net data set available in the UK. Seventy‐eight 2 L water samples were collected along depth profile transects, gill‐net sites and from the shoreline in three large, deep lakes (Windermere, Bassenthwaite Lake and Derwent Water) in the English Lake District. Water samples were assayed by eDNA metabarcoding of the mitochondrial 12S and cytochrome b regions. Fourteen of the 16 species historically recorded in Windermere were detected using eDNA, compared to four species in the most recent gill‐net survey, demonstrating eDNA is extremely sensitive for detecting species. A key question for biodiversity monitoring is whether eDNA can accurately estimate abundance. To test this, we used the number of sequence reads per species and the proportion of sampling sites in which a species was detected with eDNA (i.e. site occupancy) as proxies for abundance. eDNA abundance data consistently correlated with rank abundance estimates from established surveys. These results demonstrate that eDNA metabarcoding can describe fish communities in large lakes, both qualitatively and quantitatively, and has great potential as a complementary tool to established monitoring methods.
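The site-occupancy proxy and its comparison with survey ranks can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`site_occupancy`, `average_ranks`, `spearman`) and the choice of a Spearman rank correlation are assumptions made for illustration, and any input data would be supplied by the reader.

```python
from typing import Dict, List, Sequence


def site_occupancy(detections: Dict[str, List[bool]]) -> Dict[str, float]:
    """Proportion of sampling sites at which each species was detected."""
    return {species: sum(sites) / len(sites) for species, sites in detections.items()}


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1 for the smallest value; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In use, the occupancy values (or read counts) for each species would be passed to `spearman` together with the abundance ranks from the established survey, with species in the same order in both lists.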
Gene flow is a fundamental evolutionary force in adaptation that is especially important to understand as humans are rapidly changing both the natural environment and natural levels of gene flow. Theory proposes a multifaceted role for gene flow in adaptation, but it focuses mainly on the disruptive effect that gene flow has on adaptation when selection is not strong enough to prevent the loss of locally adapted alleles. The role of gene flow in adaptation is now better understood due to the recent development of both genomic models of adaptive evolution and genomic techniques, which both point to the importance of genetic architecture in the origin and maintenance of adaptation with gene flow. In this review, we discuss three main topics on the genomics of adaptation with gene flow. First, we investigate selection on migration and gene flow. Second, we discuss the three potential sources of adaptive variation in relation to the role of gene flow in the origin of adaptation. Third, we explain how local adaptation is maintained despite gene flow: we provide a synthesis of recent genomic models of adaptation, discuss the genomic mechanisms and review empirical studies on the genomics of adaptation with gene flow. Despite predictions on the disruptive effect of gene flow in adaptation, an increasing number of studies show that gene flow can promote adaptation, that local adaptations can be maintained despite high gene flow, and that genetic architecture plays a fundamental role in the origin and maintenance of local adaptation with gene flow.
Vertebrates harbour microbes both internally and externally, and collectively, these microorganisms (the ‘microbiome’) contain genes that outnumber the host's genetic information 10‐fold. The majority of the microorganisms associated with vertebrates are found within the gut, where they influence host physiology, immunity and development. The development of next‐generation sequencing has led to a surge in effort to characterize the microbiomes of various vertebrate hosts, a necessary first step to determine the functional role these communities play in host evolution or ecology. This shift away from a culture‐based microbiological approach, limited in taxonomic breadth, has resulted in the emergence of patterns suggesting a core vertebrate microbiome dominated by members of the bacterial phyla Bacteroidetes, Proteobacteria and Firmicutes. Still, there is a substantial variation in the methodology used to characterize the microbiome, from differences in sample type to issues of sampling captive or wild hosts, and the majority (>90%) of studies have characterized the microbiome of mammals, which represent just 8% of described vertebrate species. Here, we review the state of microbiome studies of nonmammalian vertebrates and provide a synthesis of emerging patterns in the microbiome of those organisms. We highlight the importance of collection methods, and the need for greater taxonomic sampling of natural rather than captive hosts, a shift in approach that is needed to draw ecologically and evolutionarily relevant inferences. Finally, we recommend future directions for vertebrate microbiome research, so that attempts can be made to determine the role that microbial communities play in vertebrate biology and evolution.
Measuring the effects of selection on the genome imposed by human‐altered environments is currently a major goal in ecological genomics. Given the polygenic basis of most phenotypic traits, quantitative genetic theory predicts that selection is expected to cause subtle allelic changes among covarying loci rather than pronounced changes at few loci of large effects. The goal of this study was to test for the occurrence of polygenic selection in both North Atlantic eels (European Eel, Anguilla anguilla and American Eel, A. rostrata), using a method that searches for covariation among loci that would discriminate eels from ‘control’ vs. ‘polluted’ environments and be associated with specific contaminants acting as putative selective agents. RAD‐seq libraries resulted in 23 659 and 14 755 filtered loci for the European and American Eels, respectively. A total of 142 and 141 covarying markers discriminating European and American Eels from ‘control’ vs. ‘polluted’ sampling localities were obtained using the Random Forest algorithm. Distance‐based redundancy analyses (db‐RDAs) were used to assess the relationships between these covarying markers and the concentrations of 34 contaminants measured for each individual eel. PCB 153, 4′4′ DDE and selenium were associated with covarying markers for both species, thus pointing to these contaminants as major selective agents in contaminated sites. Gene enrichment analyses suggested that sterol regulation plays an important role in the differential survival of eels in ‘polluted’ environments. This study illustrates the power of combining methods for detecting signals of polygenic selection and for associating variation of markers with putative selective agents in studies aiming at documenting the dynamics of selection at the genomic level and particularly so in human‐altered environments.
Population differentiation (PD) and ecological association (EA) tests have recently emerged as prominent statistical methods to investigate signatures of local adaptation using population genomic data. Based on statistical models, these genomewide testing procedures have attracted considerable attention as tools to identify loci potentially targeted by natural selection. An important issue with PD and EA tests is that incorrect model specification can generate large numbers of false‐positive associations. Spurious association may indeed arise when shared demographic history, patterns of isolation by distance, cryptic relatedness or genetic background are ignored. Recent work on PD and EA tests has focused largely on improving test corrections for those confounding effects. Despite significant algorithmic improvements, there are still a number of open questions on how to check that false discoveries are under control and implement test corrections, or how to combine statistical tests from multiple genome scan methods. This tutorial study provides a detailed answer to these questions. It clarifies the relationships between traditional methods based on allele frequency differentiation and EA methods and provides a unified framework for their underlying statistical tests. We demonstrate how techniques developed in the area of genomewide association studies, such as inflation factors and linear mixed models, benefit genome scan methods and provide guidelines for good practice while conducting statistical tests in landscape and population genomic applications. Finally, we highlight how the combination of several well‐calibrated statistical tests can increase the power to reject neutrality, improving our ability to infer patterns of local adaptation in large population genomic data sets.
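As an illustration of the inflation-factor idea mentioned above, the following is a minimal sketch of the standard genomic-control calculation from association studies. It assumes each locus has a test statistic expressed as a z-score whose square follows a chi-square distribution with one degree of freedom under the null; the function names are hypothetical and this is not presented as the tutorial's own code.

```python
import math
from statistics import median
from typing import List, Sequence

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(z_scores: Sequence[float]) -> float:
    """Genomic inflation factor: median observed chi-square over its null median."""
    return median([z * z for z in z_scores]) / CHI2_1DF_MEDIAN


def recalibrated_p_values(z_scores: Sequence[float]) -> List[float]:
    """Divide each squared z-score by the inflation factor, then convert to a p-value.

    For a chi-square statistic x with 1 degree of freedom, the upper-tail
    probability is erfc(sqrt(x / 2)).
    """
    lam = inflation_factor(z_scores)
    return [math.erfc(math.sqrt(z * z / lam / 2)) for z in z_scores]
```

An inflation factor close to 1 suggests the test statistics are well calibrated, whereas values well above 1 point to confounding that the model has not absorbed. Rescaling by the factor is only one possible correction; it assumes that most loci are neutral.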
Reference is regularly made to the power of new genomic sequencing approaches. Using powerful technology, however, is not the same as having the necessary power to address a research question with statistical robustness. In the rush to adopt new and improved genomic research methods, limitations of technology and experimental design may be initially neglected. Here, we review these issues with regard to RNA sequencing (RNA‐seq). RNA‐seq adds large‐scale transcriptomics to the toolkit of ecological and evolutionary biologists, enabling differential gene expression (DE) studies in nonmodel species without the need for prior genomic resources. High biological variance is typical of field‐based gene expression studies and means that larger sample sizes are often needed to achieve the same degree of statistical power as clinical studies based on data from cell lines or inbred animal models. Sequencing costs have plummeted, yet RNA‐seq studies still underutilize biological replication. Finite research budgets force a trade‐off between sequencing effort and replication in RNA‐seq experimental design. However, clear guidelines for negotiating this trade‐off, while taking into account study‐specific factors affecting power, are currently lacking. Study designs that prioritize sequencing depth over replication fail to capitalize on the power of RNA‐seq technology for DE inference. Significant recent research effort has gone into developing statistical frameworks and software tools for power analysis and sample size calculation in the context of RNA‐seq DE analysis. We synthesize progress in this area and derive an accessible rule‐of‐thumb guide for designing powerful RNA‐seq experiments relevant in eco‐evolutionary and clinical settings alike.
Genomewide scans for natural selection (GWSS) have become increasingly common over the last 15 years due to increased availability of genome‐scale genetic data. Here, we report a representative survey of GWSS from 1999 to present and find that (i) between 1999 and 2009, 35 of 49 (71%) GWSS focused on human, while from 2010 to present, only 38 of 83 (46%) of GWSS focused on human, indicating increased focus on nonmodel organisms; (ii) the large majority of GWSS incorporate interpopulation or interspecific comparisons using, for example, F_ST, cross‐population extended haplotype homozygosity or the ratio of nonsynonymous to synonymous substitutions; (iii) most GWSS focus on detection of directional selection rather than other modes such as balancing selection; and (iv) in human GWSS, there is a clear shift after 2004 from microsatellite markers to dense SNP data. A survey of GWSS meant to identify loci positively selected in response to severe hypoxic conditions supports an approach to GWSS in which a list of a priori candidate genes based on potential selective pressures is used to filter the list of significant hits a posteriori. We also discuss four frequently ignored determinants of genomic heterogeneity that complicate GWSS: mutation, recombination, selection and the genetic architecture of adaptive traits. We recommend that GWSS methodology should better incorporate aspects of genomewide heterogeneity using empirical estimates of relevant parameters and/or realistic, whole‐chromosome simulations to improve interpretation of GWSS results. Finally, we argue that knowledge of potential selective agents improves interpretation of GWSS results and that new methods focused on correlations between environmental variables and genetic variation can help automate this approach.
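For readers unfamiliar with F_ST, one of the interpopulation statistics named above, here is a minimal sketch of a heterozygosity-based version for a single biallelic locus and two equally weighted populations. The function name `fst_two_populations` is hypothetical, and the surveyed scans use a variety of estimators (many with sample-size corrections) that this sketch does not attempt to reproduce.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """F_ST = (H_T - H_S) / H_T for one biallelic locus in two populations.

    p1 and p2 are the frequencies of the same allele in each population.
    H_T is the expected heterozygosity of the pooled population and H_S is
    the mean expected heterozygosity within populations.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # The locus is fixed for the same allele in both populations.
        return 0.0
    return (h_total - h_within) / h_total
```

Identical allele frequencies give a value of 0 and alternative fixation gives 1; a scan based on this statistic would look for loci whose values are unusually high relative to the genomewide distribution.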
After decades of discussion, there is little consensus on the extent to which hybrids between endangered and nonendangered species should be protected by US law. As increasingly larger, genome‐scale data sets are developed, we can identify individuals and populations with even trace levels of genetic admixture, making the ‘hybrid problem’ all the more difficult. We developed a decision‐tree framework for evaluating hybrid protection, including both the processes that produced hybrids (human‐mediated or natural) and the ecological impact of hybrids on natural ecosystems. We then evaluated our decision tree for four case studies drawn from our own work and briefly discuss several other cases from the literature. Throughout, we highlight the management outcomes that our approach provides and the nuances of hybridization as a conservation problem.
Investigating how environmental features shape the genetic structure of populations is crucial for understanding how they are potentially adapted to their habitats, as well as for sound management. In this study, we assessed the relative importance of spatial distribution, ocean currents and sea surface temperature (SST) on patterns of putatively neutral and adaptive genetic variation among American lobster from 19 locations using population differentiation (PD) approaches combined with environmental association (EA) analyses. First, PD approaches (using bayescan, arlequin and outflank) found 28 outlier SNPs putatively under divergent selection and 9770 neutral SNPs in common. Redundancy analysis revealed that spatial distribution, ocean current‐mediated larval connectivity and SST explained 31.7% of the neutral genetic differentiation, with ocean currents driving the majority of this relationship (21.0%). After removing the influence of spatial distribution, no SST variables were significant for putatively neutral genetic variation, whereas minimum annual SST still had a significant impact and explained 8.1% of the putatively adaptive genetic variation. Second, EA analyses (using Pearson correlation tests, bayescenv and lfmm) jointly identified seven SNPs as candidates for thermal adaptation. Covariation at these SNPs was assessed with a spatial multivariate analysis that highlighted a significant temperature association, after accounting for the influence of spatial distribution. Among the 505 candidate SNPs detected by at least one of the three approaches, we discovered three polymorphisms located in genes previously shown to play a role in thermal adaptation. Our results have implications for the management of the American lobster and provide a foundation on which to predict how this species will cope with climate change.
Recent advances in high‐throughput methods of molecular analyses have led to an explosion of studies generating large‐scale ecological data sets. In particular, notable progress has been made in the field of microbial ecology, where new experimental approaches provided in‐depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high‐throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure.
Microorganisms play a crucial role in the biological decomposition of plant litter in terrestrial ecosystems. Due to the permanently changing litter quality during decomposition, studies of both fungi and bacteria at a fine taxonomic resolution are required during the whole process. Here we investigated microbial community succession in decomposing leaf litter of temperate beech forest using pyrotag sequencing of the bacterial 16S and the fungal internal transcribed spacer (ITS) rRNA genes. Our results reveal that both communities underwent rapid changes. Proteobacteria, Actinobacteria and Bacteroidetes dominated over the entire study period, but their taxonomic composition and abundances changed markedly among sampling dates. The fungal community also changed dynamically as decomposition progressed, with ascomycete fungi being increasingly replaced by basidiomycetes. We found a consistent and highly significant correlation between bacterial richness and fungal richness (R = 0.76, P < 0.001) and community structure (R_Mantel = 0.85, P < 0.001), providing evidence of coupled dynamics in the fungal and bacterial communities. A network analysis highlighted nonrandom co‐occurrences among bacterial and fungal taxa as well as a shift in the cross‐kingdom co‐occurrence pattern of their communities from the early to the later stages of decomposition. During this process, macronutrients, micronutrients, C:N ratio and pH were significantly correlated with the fungal and bacterial communities, while bacterial richness positively correlated with three hydrolytic enzymes important for C, N and P acquisition. Overall, we provide evidence that the complex litter decay is the result of a dynamic cross‐kingdom functional succession.
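The Mantel correlation reported above compares two distance matrices computed over the same samples. The following is a minimal sketch of a one-sided permutation Mantel test, assuming symmetric matrices (for example, bacterial and fungal community dissimilarities); the function names and the default of 999 permutations are illustrative choices, not details taken from the study.

```python
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def upper_triangle(m: Matrix) -> List[float]:
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel(a: Matrix, b: Matrix, permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return the Mantel correlation and a one-sided permutation p-value.

    Rows and columns of `a` are permuted together, which preserves the
    distance structure while breaking the pairing with `b`.
    """
    n = len(a)
    observed = pearson(upper_triangle(a), upper_triangle(b))
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [[a[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), upper_triangle(b)) >= observed:
            hits += 1
    # Count the observed arrangement itself among the permutations.
    return observed, (hits + 1) / (permutations + 1)
```

The one-sided form tests for a positive association only. With few samples the number of distinct permutations is small, which limits how low the p-value can go.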
Natural history collections provide an immense record of biodiversity on Earth. These repositories have traditionally been used to address fundamental questions in biogeography, systematics and conservation. However, they also hold the potential for studying evolution directly. While some of the best direct observations of evolution have come from long‐term field studies or from experimental studies in the laboratory, natural history collections are providing new insights into evolutionary change in natural populations. By comparing phenotypic and genotypic changes in populations through time, natural history collections provide a window into evolutionary processes. Recent studies utilizing this approach have revealed some dramatic instances of phenotypic change over short timescales in response to presumably strong selective pressures. In some instances, evolutionary change can be paired with environmental change, providing a context for potential selective forces. Moreover, in a few cases, the genetic basis of phenotypic change is well understood, allowing for insight into adaptive change at multiple levels. These kinds of studies open the door to a wide range of previously intractable questions by enabling the study of evolution through time, analogous to experimental studies in the laboratory, but amenable to a diversity of species over longer timescales in natural populations.