This function performs spatial quality checks and outputs summary tables and
plots. The checks report the percentage of observations that fall on land,
outside the regulatory zones (spat), or on a zone boundary. If any observation
falls outside the regulatory zones, summary information on the distance from
the nearest zone is provided. spatial_qaqc can also remove observations whose
distance from the nearest zone is greater than or equal to filter_dist.
Usage
spatial_qaqc(
dat,
project,
spat,
lon.dat,
lat.dat,
lon.spat = NULL,
lat.spat = NULL,
id.spat = NULL,
epsg = NULL,
date = NULL,
group = NULL,
filter_dist = NULL
)
Arguments
- dat
Primary data containing information on hauls or trips. The table name in the FishSET database contains the string 'MainDataTable'.
- project
Name of project.
- spat
Spatial data containing information on fishery management or regulatory zones.
sf objects are recommended, but sp objects can be used as well. If using a spatial table read from a csv file, then arguments lon.spat and lat.spat are required. To upload your spatial data to the FishSETFolder see load_spatial.
- lon.dat
Longitude variable in dat.
- lat.dat
Latitude variable in dat.
- lon.spat
Variable or list from spat containing longitude data. Required for spatial tables read from csv files. Leave as NULL if spat is an sf or sp object.
- lat.spat
Variable or list from spat containing latitude data. Required for spatial tables read from csv files. Leave as NULL if spat is an sf or sp object.
- id.spat
Polygon ID column. Required for spatial tables read from csv files. Leave as NULL if spat is an sf or sp object.
- epsg
EPSG number. Manually set the epsg code, which will be applied to spat and dat. If epsg is not specified but is defined for spat, then the spat epsg will be applied to dat. If epsg is not specified and is not defined for spat, then a default epsg value (epsg = 4326) will be applied to spat and dat. See http://spatialreference.org/ to help identify the optimal epsg number.
- date
String, name of date variable. Used to summarize over year. If NULL, the first date column will be used. Returns an error if no date columns can be found.
- group
String, optional. Name of variable to group spatial summary by.
- filter_dist
(Optional) Numeric, distance value to filter primary data by (in meters). Rows containing distance values greater than or equal to filter_dist will be removed from the data. This action will be saved to the filter table.
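The precedence described for the epsg argument can be sketched in base R. This is only an illustration of the documented rule, not FishSET's internal code: resolve_epsg and its spat_epsg parameter are hypothetical names, and treating NA as "not defined" is an assumption.

```r
# Hypothetical helper (not part of FishSET): mirrors the documented
# precedence for choosing an EPSG code.
resolve_epsg <- function(epsg = NULL, spat_epsg = NULL, default = 4326) {
  # Assumption: NULL or NA both mean "no code defined"
  is_missing <- function(x) is.null(x) || is.na(x)

  if (!is_missing(epsg)) return(epsg)            # 1. user-supplied code is applied to spat and dat
  if (!is_missing(spat_epsg)) return(spat_epsg)  # 2. otherwise the code defined for spat is applied to dat
  default                                        # 3. otherwise the default (4326) is applied to both
}

resolve_epsg(epsg = 3338, spat_epsg = 4269)  # user-supplied code is used
resolve_epsg(spat_epsg = 4269)               # code from spat is used
resolve_epsg()                               # default is used
```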
Value
A list of plots and/or dataframes depending on whether spatial data quality issues are detected. The list includes:
- dataset
Primary data. Up to five columns will be added if spatial issues are found. Four are logical: "ON_LAND" (if obs fall on land), "OUTSIDE_ZONE" (if obs occur at sea but outside a zone), "ON_ZONE_BOUNDARY" (if obs occur on a zone boundary), and "EXPECTED_LOC" (whether obs occur at sea, within a zone, and not on a zone boundary). The fifth, "NEAREST_ZONE_DIST_M", gives the distance in meters from the nearest zone and applies only to obs outside a zone or on land.
- spatial_summary
Dataframe containing the percentage of observations that occur at sea and within zones, on land, outside zones but at sea, or on a zone boundary by year and/or group. The total number of observations by year/group is in the "N" column.
- outside_plot
Plot of observations outside regulatory zones.
- land_plot
Plot of observations that fall on land.
- land_out_plot
Plot of observations that occur on land together with observations that are outside the regulatory zones (combines outside_plot and land_plot if both occur).
- boundary_plot
Plot of observations that fall on zone boundary.
- expected_plot
Plot of observations that occur at sea and within zones.
- distance_plot
Histogram of distance from the nearest zone (meters) by year for observations that are outside the regulatory zones.
- distance_freq
Binned frequency table of distance values.
- distance_summary
Dataframe containing the minimum, 1st quartile, median, mean, 3rd quartile, and maximum distance values by year and/or group.
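As a rough illustration of what distance_summary contains, the sketch below computes the same six statistics by year with base R on a toy data frame. The data values and the YEAR column name are made up for the example, and the code is not taken from FishSET; only the NEAREST_ZONE_DIST_M column name comes from the description above.

```r
# Toy data (hypothetical): distances in meters for obs outside zones
obs <- data.frame(
  YEAR = c(2010, 2010, 2010, 2011, 2011),
  NEAREST_ZONE_DIST_M = c(50, 150, 400, 20, 80)
)

# One row of summary statistics per year
dist_by_year <- do.call(rbind, lapply(
  split(obs$NEAREST_ZONE_DIST_M, obs$YEAR),
  function(d) {
    data.frame(
      min    = min(d),
      q1     = quantile(d, 0.25, names = FALSE),
      median = median(d),
      mean   = mean(d),
      q3     = quantile(d, 0.75, names = FALSE),
      max    = max(d)
    )
  }
))

# Row names carry the year labels produced by split()
dist_by_year <- cbind(YEAR = as.integer(rownames(dist_by_year)), dist_by_year)
print(dist_by_year)
```

Grouping by an additional variable would follow the same pattern, splitting on both the year and the group column.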
Examples
if (FALSE) {
  # run spatial checks
  spatial_qaqc("pollockMainDataTable", "pollock", spat = NMFS_AREAS,
               lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT")

  # filter obs by distance
  spat_out <-
    spatial_qaqc(pollockMainDataTable, "pollock", spat = NMFS_AREAS,
                 lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT",
                 filter_dist = 100)

  # extract the filtered primary data from the returned list
  mod.dat <- spat_out$dataset
}
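The effect of filter_dist can be illustrated without FishSET or spatial data. The sketch below applies the documented rule (rows with a distance greater than or equal to filter_dist are removed) to a toy data frame. The HAUL column and the data values are hypothetical, and keeping rows with a missing distance is an assumption, made because the distance column applies only to observations outside a zone or on land.

```r
# Toy data (hypothetical): NA marks obs assumed to be in an expected location
obs <- data.frame(
  HAUL = 1:5,
  NEAREST_ZONE_DIST_M = c(NA, 25, 100, 250, NA)
)

filter_dist <- 100  # meters

# Keep rows with no distance value (assumption) or a distance below the threshold;
# rows at or beyond filter_dist are dropped
keep <- is.na(obs$NEAREST_ZONE_DIST_M) | obs$NEAREST_ZONE_DIST_M < filter_dist
filtered <- obs[keep, ]

print(filtered)
```

Note that an observation exactly 100 m from the nearest zone is removed, since the comparison is "greater than or equal to".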
