This function recursively scans the ../eDNA-for-MPAs/data/ directory to locate all CSV files containing "GOTeDNA" in their file name or path. Files are grouped by cruise mission, metadata and data files are paired, naming inconsistencies are corrected, and a unified table of sample-level information is produced.
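The discovery step can be pictured with a minimal sketch in base R. The helper name `find_gotedna_csvs` and the case-sensitive match on "GOTeDNA" are assumptions made for illustration; the actual implementation of `data_eDNA()` may differ.

```r
# Hypothetical helper (not part of the package): list CSV files whose
# file name or folder path contains "GOTeDNA".
find_gotedna_csvs <- function(data_dir = "../eDNA-for-MPAs/data/") {
  # All CSV files below data_dir, searched recursively
  csvs <- list.files(data_dir, pattern = "\\.csv$", recursive = TRUE,
                     full.names = TRUE, ignore.case = TRUE)
  # Keep only paths that mention "GOTeDNA" (assumed to be case-sensitive)
  csvs[grepl("GOTeDNA", csvs, fixed = TRUE)]
}
```

Matching on the full path rather than the base name is what lets a file such as `GOTeDNA/metadata.csv` qualify even though its own name does not contain the keyword.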

Usage

data_eDNA()

Value

A data frame containing:

ID

Sample identifier

date

Sampling date (class Date)

latitude

Sample latitude

longitude

Sample longitude (corrected to negative values)

species_richness

Count of unique species detected in the sample

method

eDNA sampling method inferred from folder structure

location

Sampling location identifier

year

Extracted sampling year
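As a rough illustration of how `species_richness` relates to the underlying detections, the sketch below counts unique species per sample with base R. The long-format `detections` table and its toy values are assumptions; the real data files may be laid out differently.

```r
# Toy detections table (assumed layout: one row per detection)
detections <- data.frame(
  ID = c("S1", "S1", "S1", "S2"),
  Species = c("Gadus morhua", "Gadus morhua", "Clupea harengus", "Gadus morhua"),
  stringsAsFactors = FALSE
)

# Count the distinct species names within each sample ID
richness <- aggregate(Species ~ ID, data = detections,
                      FUN = function(x) length(unique(x)))
names(richness)[2] <- "species_richness"
richness
```

Repeated detections of the same species within a sample count once, so sample S1 has a richness of 2 here and S2 a richness of 1.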

Details

The function performs several formatting corrections, including:

  • resolving inconsistent materialSampleID naming (e.g., X, _, -)

  • handling metadata files with different column name conventions

  • resolving location name inconsistencies (e.g., ESI, SAB)

  • extracting sample dates, coordinates, richness, and method type

  • correcting positive longitudes by flipping sign

The final result is a single cleaned data frame with one row per sample.
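Two of the corrections listed above can be sketched as small helpers. Both function names and the specific ID rules (dropping a leading "X" before a digit, using "-" as the only separator) are assumptions for illustration and are not taken from the function's source.

```r
# Hypothetical helper: bring sample IDs into a single convention.
# Assumed rules: drop a leading "X" before a digit, replace "_" with "-".
normalize_sample_id <- function(id) {
  id <- trimws(as.character(id))
  id <- sub("^X(?=[0-9])", "", id, perl = TRUE)  # "X2021_01" -> "2021_01"
  gsub("_", "-", id, fixed = TRUE)               # "2021_01"  -> "2021-01"
}

# Hypothetical helper: flip the sign of positive longitudes, keep NA as NA.
fix_longitude <- function(lon) {
  ifelse(!is.na(lon) & lon > 0, -lon, lon)
}

normalize_sample_id(c("X2021_01", "2021-01"))
fix_longitude(c(63.5, -63.5, NA))
```

Flipping the sign is only safe when every sample is known to lie in the western hemisphere, which is the situation the longitude correction above implies.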

The function assumes the following:

  • The directory ../eDNA-for-MPAs/data/ exists and contains cruise folders.

  • Metadata files contain either materialSampleID, SampleID, or eDNA_Tube.

  • Species columns in data files may be named Species, species, or V6.

  • Latitude/longitude may be stored as decimalLatitude/decimalLongitude or Lat/Long.
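One way to cope with the alternative column names listed above is to try each candidate in order and use the first one present. The helper `pick_column` and the toy `meta` table are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: return the first candidate column found in df.
pick_column <- function(df, candidates) {
  found <- intersect(candidates, names(df))
  if (length(found) == 0) {
    stop("None of these columns found: ", paste(candidates, collapse = ", "))
  }
  df[[found[1]]]
}

# Toy metadata that uses the SampleID and Lat/Long conventions
meta <- data.frame(SampleID = c("A-1", "A-2"),
                   Lat = c(44.6, 44.7),
                   Long = c(-63.5, -63.4),
                   stringsAsFactors = FALSE)

id  <- pick_column(meta, c("materialSampleID", "SampleID", "eDNA_Tube"))
lat <- pick_column(meta, c("decimalLatitude", "Lat"))
lon <- pick_column(meta, c("decimalLongitude", "Long"))
```

Stopping with an informative error when no candidate matches makes a new, unsupported naming convention visible immediately instead of producing silent NA columns.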

The function prints progress messages (i, j, k, l) and may call browser() when it encounters unexpected or inconsistent input, which pauses execution in an interactive debugging session.

Examples

if (FALSE) { # \dontrun{
  df <- data_eDNA()

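  # Quick map preview
  # The native pipe |> (R >= 4.1) is used because %>% is only available
  # after attaching leaflet or magrittr
  leaflet::leaflet(df) |>
    leaflet::addTiles() |>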
    leaflet::addCircleMarkers(
      lat = ~latitude,
      lng = ~longitude,
      popup = paste("Lat:", df$latitude, "<br>Lon:", df$longitude)
    )
} # }