hakai-api


Simplified API Documentation

This is a subset of the full documentation that explains basic Hakai API usage. It should be sufficient for building scripts that pull data from the database and run statistical analyses, etc. This information can also be found at http://hakaiinstitute.github.io/hakai-api

Overview


What is the Hakai API

This server is essentially a website that provides JSON data. Its URLs are intended to be requested from a script rather than a web browser, so that data can be loaded into the script without keeping a local copy of it. All requests listed in this document should be made to a Hakai API server (https://hecate.hakai.org/api). Each endpoint listed in this documentation is a location on this website that provides a specific kind of dataset. For example, if you would like chlorophyll data, you would make a request to https://hecate.hakai.org/api/eims/views/output/chlorophyll, since the listed endpoint is /eims/views/output/chlorophyll.

Audience

This document is intended for people building scripts to analyze scientific data. It explains how to request data from the Hakai EIMS database so that downloading data and loading it into scripts that do statistical analysis can be automated.

Required tools

To make requests to these URLs you must use an HTTP request library that submits credentials that allow you to read our private datasets. Hakai IT maintains two libraries that do this, namely hakai-api-client-python and hakai-api-client-r. For instructions on how to install and use them, see their individual documentation.
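
As a rough setup sketch for the Python client (the exact installation command comes from the client's own documentation; pip install hakai-api is only an assumption here), creating a Client object takes care of credentials, and every request made through it is then authenticated:

  # Minimal setup sketch for hakai-api-client-python.
  # Installation is described in the client's own documentation;
  # "pip install hakai-api" is assumed here and may differ.
  from hakai_api import Client

  # Creating the client handles credential setup; see the client's
  # documentation for details on the sign-in process.
  client = Client()

  # Requests made through the client include your credentials automatically.
  response = client.get("https://hecate.hakai.org/api/eims/views/output/chlorophyll")
  print(response.json())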

Getting data

You can filter the data returned from any of these endpoints by adding querystring parameters to your request URL. A querystring is a text string placed after a “?” in your URL to pass parameters to the server. Typically, a querystring consists of one or more key=value pairs joined with “&” symbols. An example would be http://example.com?foo=bar&year=2017, where the URL is http://example.com and the querystring is ?foo=bar&year=2017. In the context of this API, querystrings are used to filter and sort data. You can read about all the possible query strings for this API in the querying data documentation. For a more gentle introduction to data filtering, see the data filtering crash course section.
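
As a quick, self-contained illustration (plain Python, not specific to the Hakai API), the example querystring above could be assembled programmatically like this:

  # Build the example querystring "foo=bar&year=2017" from key=value pairs.
  from urllib.parse import urlencode

  base_url = "http://example.com"
  params = {"foo": "bar", "year": 2017}

  # urlencode joins the key=value pairs with "&" symbols.
  querystring = urlencode(params)          # "foo=bar&year=2017"
  full_url = base_url + "?" + querystring  # "http://example.com?foo=bar&year=2017"
  print(full_url)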

Data filtering crash course

This short tutorial assumes you are using the hakai-api-client-python library, although the instructions should be simple to adapt for the R client as well.

Say you wanted to get all chlorophyll data for the year 2016 that was collected on the KWAK survey. To do this, you could use the following workflow:

  1. Request some data to see what attributes it has.

Just make a request to the chlorophyll data endpoint without any querystring parameters to see what kind of results you get back. The attribute names in the result will be what you use to filter this particular dataset.

  # Example using hakai_api_client_python library
  from hakai_api import Client
  client = Client()
  response = client.get("https://hecate.hakai.org/api/eims/views/output/chlorophyll")

  data = response.json()

  print(data)
  # [
  #   {
  #       "no": "1",
  #       "action": "",
  #       "event_pk": 7065,
  #       "rn": "1",
  #       "date": "2012-05-17",
  #       "work_area": "CALVERT",
  #       "survey": "KWAK",
  #       "sampling_bout": 1,
  #       "site_id": "PRUTH",
  #       "lat": 51.6554,
  #       "long": -128.0913,
  #       "gather_lat": null,
  #       "gather_long": null,
  #       "collection_method": null,
  #       "line_out_depth": 5,
  #       "pressure_transducer_depth": null,
  #       "volume": 250,
  #       "collected": "2012-05-17T17:48:00.000Z",
  #       "preserved": "2012-05-17T07:00:00.000Z",
  #       "analyzed": null,
  #       "lab_technician": null,
  #       "project_specific_id": null,
  #       "hakai_id": "CHL3793",
  #       "is_blank": null,
  #       "is_solid_standard": null,
  #       "filter_size_mm": null,
  #       "filter_type": "20",
  #       "acetone_volume_ml": 10,
  #       "flurometer_serial_no": null,
  #       "calibration": null,
  #       "acid_ratio_correction_factor": null,
  #       "acid_coefficient": null,
  #       "calibration_slope": null,
  #       "before_acid": 87.7,
  #       "after_acid": 46.9,
  #       "acid_flag": null,
  #       "chla": 4.7820864,
  #       "chla_flag": "SVC",
  #       "chla_final": 4.7820864,
  #       "phaeo": 1.869350392,
  #       "phaeo_flag": "SVC",
  #       "phaeo_final": 1.869350392,
  #       "analyzing_lab": "HAKAI",
  #       "row_flag": "Results",
  #       "quality_level": "Principal Investigator",
  #       "comments": "",
  #       "quality_log": "1: Results QC''d by BH; Given new Hakai Ids\r2: Given new Hakai Ids\r3: Results QC''d by BH; Given new Hakai Ids; analyzed pre- 2014-12-05"
  #   },
  #   ...etc.
  # ]
  2. Add some filters to request a subset of all the data.

From the previous printout, we see that the data has keys like no, action, event_pk, etc. We can provide querystring parameters with our request so that the data we get back matches those criteria. Notice that in the following code, we’ve added querystring parameters to do data filtering.

  # Request data where survey is KWAK and the date falls sometime in 2016.
  request = client.get("https://hecate.hakai.org/api/eims/views/output/chlorophyll?survey=KWAK&date>=2016-01-01&date<2017-01-01")
  data = request.json()
  print(data)
  # [
  #   {
  #       "no": "8751",
  #       "action": "",
  #       "event_pk": 30395,
  #       "rn": "1",
  #       "date": "2016-01-16",
  #       "work_area": "CALVERT",
  #       "survey": "KWAK",
  #       "sampling_bout": 1,
  #       "site_id": "KC1",
  #       "lat": 51.6545,
  #       "long": -128.1289,
  #       "gather_lat": null,
  #       "gather_long": null,
  #       "collection_method": null,
  #       "line_out_depth": 0,
  #       "pressure_transducer_depth": null,
  #       "volume": 250,
  #       "collected": "2016-01-17T03:35:13.000Z",
  #       "preserved": "2016-01-16T00:50:30.000Z",
  #       "analyzed": "2016-01-26T18:44:21.000Z",
  #       "lab_technician": "Bryn,Emma",
  #       "project_specific_id": null,
  #       "hakai_id": "CHL4605",
  #       "is_blank": null,
  #       "is_solid_standard": null,
  #       "filter_size_mm": null,
  #       "filter_type": "Bulk GF/F",
  #       "acetone_volume_ml": 10,
  #       "flurometer_serial_no": "720001154",
  #       "calibration": "2015-08-06T07:00:00.000Z",
  #       "acid_ratio_correction_factor": 1.364,
  #       "acid_coefficient": 3.748,
  #       "calibration_slope": 0.0005069,
  #       "before_acid": 109869.3,
  #       "after_acid": 78069.69,
  #       "acid_flag": null,
  #       "chla": 1.33788238698795,
  #       "chla_flag": "AV",
  #       "chla_final": 1.33788238698795,
  #       "phaeo": 3.02402733269205,
  #       "phaeo_flag": "AV",
  #       "phaeo_final": 3.02402733269205,
  #       "analyzing_lab": "HAKAI",
  #       "row_flag": "Results",
  #       "quality_level": "Principal Investigator",
  #       "comments": "",
  #       "quality_log": "1: Results QC''d by BH"
  #   },
  #   ...19 more rows
  # ]
  3. Remove the restriction on the number of records being returned.

In the previous step, if you look at the returned result, you’ll notice there are only 20 rows of data returned even though we expect there to be many more. This is due to a default limit that restricts the API from returning more than 20 records. To remove this limit, add the querystring parameter limit=-1. Please note that you might receive a lot of data when you remove this limit, so you should filter your results in some way to prevent your script from crashing due to receiving too much data at once.

  # Same request as before, with limit=-1 added to remove the default 20-record limit.
  request = client.get("https://hecate.hakai.org/api/eims/views/output/chlorophyll?survey=KWAK&date>=2016-01-01&date<2017-01-01&limit=-1")
  data = request.json() # The data variable now contains all the 2016 KWAK chlorophyll data

  # This will print about 316 records of chlorophyll data from the KWAK survey collected in 2016
  print(data)
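
If the end goal is statistical analysis, a common next step is to load the returned records into a data frame. This is not part of the API itself; the sketch below assumes the pandas package is installed and that the data variable holds the records retrieved above.

  # Sketch: load the records retrieved above into a pandas DataFrame.
  # pandas is an assumption here, not something the API requires.
  import pandas as pd

  df = pd.DataFrame(data)                  # "data" is the list of records from above
  df["date"] = pd.to_datetime(df["date"])  # parse the date strings
  print(df[["date", "site_id", "chla_final"]].describe(include="all"))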

Default limit

By default, all URLs will only return 20 records. The reason for this is that if all records were returned, you might, for example, request a million rows of CTD data at once and cause your script and the API service to crash. Fortunately, the API will restart itself if this happens, but you may need to force close your script to recover. To avoid this, add filters to your request, as in the tutorial above, to reduce the number of records returned before adding the querystring parameter limit=-1 to turn off the default record restriction. For more details about the ways you can filter data, see the querying data documentation.
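
As a small defensive sketch (using the Python client from the tutorial above), a script can notice when the default limit has likely truncated its results by checking whether exactly 20 records came back, and only then narrow the filters and add limit=-1:

  # Sketch: detect the default 20-record limit before requesting everything.
  from hakai_api import Client

  client = Client()
  url = "https://hecate.hakai.org/api/eims/views/output/chlorophyll?survey=KWAK"
  data = client.get(url).json()

  if len(data) == 20:
      # Results were probably truncated by the default limit. Narrow the
      # filters first, then add limit=-1 to fetch the full filtered set.
      data = client.get(url + "&date>=2016-01-01&date<2017-01-01&limit=-1").json()

  print(len(data))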

Endpoints

These URLs are all paths that exist on the API server https://hecate.hakai.org/api. As such, if the given URL is /eims/views/output/akan_cat_ph, make your request to https://hecate.hakai.org/api/eims/views/output/akan_cat_ph.

EIMS data

URL Purpose
/eims/views/output/akan_cat_ph Get json data equivalent to the portal downloaded spreadsheet for akan_cat_ph
/eims/views/output/c13incubations Get json data equivalent to the portal downloaded spreadsheet for c13incubations
/eims/views/output/chlorophyll Get json data equivalent to the portal downloaded spreadsheet for chlorophyll
/eims/views/output/chlorophyll_blanks Get json data equivalent to the portal downloaded spreadsheet for chlorophyll_blanks
/eims/views/output/chlorophyll_solid_standards Get json data equivalent to the portal downloaded spreadsheet for chlorophyll_solid_standards
/eims/views/output/ctd_all Get json data equivalent to the portal downloaded spreadsheet for ctd_all
/eims/views/output/ctd_drops Get json data equivalent to the portal downloaded spreadsheet for ctd_drops
/eims/views/output/do13c Get json data equivalent to the portal downloaded spreadsheet for do13c
/eims/views/output/doc Get json data equivalent to the portal downloaded spreadsheet for doc
/eims/views/output/fgs Get json data equivalent to the portal downloaded spreadsheet for fgs
/eims/views/output/fgs_samples Get json data equivalent to the portal downloaded spreadsheet for fgs_samples
/eims/views/output/fgs_w_poms Get json data equivalent to the portal downloaded spreadsheet for fgs_w_poms
/eims/views/output/filters Get json data equivalent to the portal downloaded spreadsheet for filters
/eims/views/output/hplc Get json data equivalent to the portal downloaded spreadsheet for hplc
/eims/views/output/metadata_all Get json data equivalent to the portal downloaded spreadsheet for metadata_all
/eims/views/output/mg_clams Get json data equivalent to the portal downloaded spreadsheet for mg_clams
/eims/views/output/mg_epiphytes Get json data equivalent to the portal downloaded spreadsheet for mg_epiphytes
/eims/views/output/mg_filtered_epiphytes Get json data equivalent to the portal downloaded spreadsheet for mg_filtered_epiphytes
/eims/views/output/mg_fish Get json data equivalent to the portal downloaded spreadsheet for mg_fish
/eims/views/output/mg_inverts Get json data equivalent to the portal downloaded spreadsheet for mg_inverts
/eims/views/output/mg_macroalgae Get json data equivalent to the portal downloaded spreadsheet for mg_macroalgae
/eims/views/output/mg_mesograzers Get json data equivalent to the portal downloaded spreadsheet for mg_mesograzers
/eims/views/output/mg_seagrass_biomass Get json data equivalent to the portal downloaded spreadsheet for mg_seagrass_biomass
/eims/views/output/mg_seagrass_density Get json data equivalent to the portal downloaded spreadsheet for mg_seagrass_density
/eims/views/output/mg_seagrass_habitat Get json data equivalent to the portal downloaded spreadsheet for mg_seagrass_habitat
/eims/views/output/microbial Get json data equivalent to the portal downloaded spreadsheet for microbial
/eims/views/output/mussels Get json data equivalent to the portal downloaded spreadsheet for mussels
/eims/views/output/nitrates Get json data equivalent to the portal downloaded spreadsheet for nitrates
/eims/views/output/nutrients Get json data equivalent to the portal downloaded spreadsheet for nutrients
/eims/views/output/nutrients_qc Get json data equivalent to the portal downloaded spreadsheet for nutrients_qc
/eims/views/output/o18 Get json data equivalent to the portal downloaded spreadsheet for o18
/eims/views/output/phytoplankton Get json data equivalent to the portal downloaded spreadsheet for phytoplankton
/eims/views/output/pomfas Get json data equivalent to the portal downloaded spreadsheet for pomfas
/eims/views/output/poms Get json data equivalent to the portal downloaded spreadsheet for poms
/eims/views/output/poms_chlorophyll Get json data equivalent to the portal downloaded spreadsheet for poms_chlorophyll
/eims/views/output/poms_doc_do13c_nut_suva_chlorophyll_ysi_ctd Get json data equivalent to the portal downloaded spreadsheet for poms_doc_do13c_nut_suva_chlorophyll_ysi_ctd
/eims/views/output/pon_poc Get json data equivalent to the portal downloaded spreadsheet for pon_poc
/eims/views/output/pop Get json data equivalent to the portal downloaded spreadsheet for pop
/eims/views/output/samples_by_category Get json data equivalent to the portal downloaded spreadsheet for samples_by_category
/eims/views/output/secchi Get json data equivalent to the portal downloaded spreadsheet for secchi
/eims/views/output/soms Get json data equivalent to the portal downloaded spreadsheet for soms
/eims/views/output/staff_gauge Get json data equivalent to the portal downloaded spreadsheet for staff_gauge
/eims/views/output/suva Get json data equivalent to the portal downloaded spreadsheet for suva
/eims/views/output/ysi Get json data equivalent to the portal downloaded spreadsheet for ysi
/eims/views/output/zooplankton_biomass Get json data equivalent to the portal downloaded spreadsheet for zooplankton_biomass
/eims/views/output/zooplankton_isotope Get json data equivalent to the portal downloaded spreadsheet for zooplankton_isotope
/eims/views/output/zooplankton_microscopy Get json data equivalent to the portal downloaded spreadsheet for zooplankton_microscopy
/eims/views/output/zooplankton_total_biomass Get json data equivalent to the portal downloaded spreadsheet for zooplankton_total_biomass
/eims/views/output/zooplankton_tow Get json data equivalent to the portal downloaded spreadsheet for zooplankton_tow

CTD data

URL Purpose
/ctd/views/file/cast Get a list of all CTD casts joined with metadata from the files they were pulled from
/ctd/views/file/cast/data Get a list of all processed CTD data joined with metadata from the cast and file it was pulled from
/ctd/views/file/cast/raw_data Get a list of all unprocessed CTD data joined with metadata from the cast and file it was pulled from
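
For example (a sketch using the same Python client as above), CTD cast metadata can be requested just like the EIMS endpoints; the default 20-record limit described earlier applies here as well:

  # Sketch: request CTD cast metadata; only 20 records come back by default.
  from hakai_api import Client

  client = Client()
  response = client.get("https://hecate.hakai.org/api/ctd/views/file/cast")
  casts = response.json()
  print(len(casts))  # 20, unless filters and limit=-1 are added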

Solo data

URL Purpose
/solo/views/file/cast Get a list of all solo casts joined with metadata from the files they were pulled from


ACO data

HTTP Verb URL Function
GET /aco/aois Get a list of AOIs in GeoJSON format.
GET /aco/aois/:aoi_id(\d+) Get a specific AOI using its unique id.
POST /aco/aois Add a new AOI by submitting a “FeatureCollection” or “MultiPolygon” GeoJSON polygon.
GET /aco/camera_calibration Get a list of camera calibration parameters.
GET /aco/camera_calibration/:pk(\d+) Get a specific camera calibration parameter object using its unique id.
POST /aco/camera_calibration Add new camera calibration data by posting a JSON object with all the required db keys as the object keys.
GET /aco/dces Get a list of all the data collection events (i.e. field report jobs).
GET /aco/dces/:data_collection_event_id(\d+) Get a specific data collection event using an integer unique id.
GET /aco/flights/:flight_id(\d+)/dces Get a list of all data collection events for a specific flight with unique integer flight_id.
POST /aco/flights/:flight_id(\d+)/dces Add a new data collection event with parent flight that has unique id flight_id.
PUT /aco/dces/:data_collection_event_id(\d+) Update an existing data collection event with the specified unique id.
DELETE /aco/dces/:data_collection_event_id(\d+) Delete the data collection event with the specified unique id.
GET /aco/flights Get a list of flights.
GET /aco/flights/:flight_id(\d+) Get a specific flight object with the specified unique id.
POST /aco/flights Add a new flight.
PUT /aco/flights/:flight_id(\d+) Update the flight with the specified unique id.
GET /aco/lever_arm_calibration Get a list of lever arm calibration parameters.
GET /aco/lever_arm_calibration/:lever_arm_calibrations_id(\d+) Get a specific lever arm calibration with the specified unique id.
POST /aco/lever_arm_calibration Add a new lever arm calibration.
GET /aco/persons Get a list of persons.
GET /aco/persons/:person_id(\d+) Get the person with the specified unique id.
POST /aco/persons Add a new person.
GET /aco/phases Get a list of project phases.
GET /aco/phases/:projectphase_id(\d+) Get a project phase object with the specified unique id.
PUT /aco/phases/:projectphase_id(\d+) Update the project phase with the specified unique id.
GET /aco/projects/:project_id(\d+)/phases Get a list of project phases for the project with the specified project_id.
POST /aco/projects/:project_id(\d+)/phases Add a new project phase with parent being the project with project_id.
GET /aco/projects Get a list of projects.
GET /aco/projects/:project_id(\d+) Get a specific project with the specified unique id.
PUT /aco/projects/:project_id(\d+) Update a project with the specified unique id.
POST /aco/projects Add a new project.

Views

The following are provided for convenience and allow accessing multiple database tables at once, after they are joined in the database.

HTTP Verb URL Function
GET /aco/views/flight_time Get a summary of time spent flying, broken down by date.
GET /aco/views/flights Get a list of flights, with some additional columns like pilot name and operator name.
GET /aco/views/flights/:flight_id(\d+) Same as /views/flights, but get a specific flight by id.
GET /aco/views/flights/dces Get a list of flights joined with the associated data collection events.
GET /aco/views/flights/:flight_id(\d+)/dces Get a list of all data collection events for a specific flight and include the flight information.
GET /aco/views/flights/dces/:data_collection_event_id(\d+) Same as /views/flights/:flight_id(\d+)/dces, but get a specific data collection event by id.
GET /aco/views/operators Get a list of operators with name info from associated persons table.
GET /aco/views/operators/:operator_id(\d+) Get the info for a specific operator by id.
GET /aco/views/pilots Get a list of pilots with name info from the associated persons table.
GET /aco/views/pilots/:pilot_id(\d+) Get the info for a specific pilot by id.
GET /aco/views/projects/phases Get a list of project phases joined with the higher level project details.
GET /aco/views/projects/phases/:projectphase_id(\d+) Same as /views/projects/phases, but get just the info for the project phase with the specified id.
GET /aco/views/projects/:project_id(\d+)/phases Same as /views/projects/phases, but get just the project phases for the project with the specified project_id.
GET /aco/views/projects/:project_num(\d{2}_\d{4})/phases/:phase_num(\d+) Get info for the project with project_num (e.g. 21_3008) and phase_num (e.g. 1). The parameter values are not to be confused with the database id fields.
GET /aco/views/projects/phases/:projectphase_num(\d{2}\d{4}\d{2}) Get info for the project with projectphase_num (e.g. 21_3008_01).
POST /aco/views/projects/phases Add a new project and phase, simultaneously. Must submit JSON containing all the fields that the POST /projects and POST /projects/:project_id(\d+)/phases expect.
GET /aco/views/projects/phases/aois Get a list of project, project phase, and aoi data joined together.
GET /aco/views/projects/:project_id(\d+)/phases/aoi Same as /views/projects/phases/aois, but restricted to the rows for the project with the specified project_id.
GET /aco/views/projects/:project_num(\d{2}_\d{4})/phases Get info for a project joined to its project phase data, where the project number is equal to the specified project_num parameter (e.g. 21_3008).
GET /aco/views/projects/:project_num(\d{2}_\d{4})/phases/aois Same as /views/projects/phases/aois, but restricted to the rows matching the specified project_num (e.g. 20_3038).
GET /aco/views/projects/:project_num(\d{2}_\d{4})/phases/:phase_num(\d+)/aois Get info for the project with project_num (e.g. 21_3008) and phase_num (e.g. 1), along with AOI data. The parameter values are not to be confused with the database id fields.
GET /aco/views/projects/phases/aois/:projectphase_num(\d{2}\d{4}\d{2}) Get info for the project with projectphase_num (e.g. 21_3008_01), along with AOI data.
GET /aco/views/projects/phases/:projectphase_id(\d+)/aois Same as /views/projects/phases/aois, but restricted to the rows for the project phase with the specified projectphase_id.
PUT /aco/views/projects/phases/:projectphase_id(\d+)/aois Insert a new AOI into the database (like POST /aois), and update the project phase with projectphase_id to point to this new AOI database record.
GET /aco/views/projects/phases/:projectphase_id(\d+)/aois/dces Get a list of joined project, project phase, aoi, and data collection event data, where the project phase id is equal to the specified projectphase_id param.
GET /aco/views/projects/phases/aois/dces/:data_collection_event_id(\d+) Get a list of joined project, project phase, aoi, and data collection event data, where the data collection event id is equal to the specified data_collection_event_id param.
GET /aco/views/projects/phases/:projectphase_id(\d+)/aois/dces/:data_collection_event_id(\d+) Get a list of joined project, project phase, aoi, and data collection event data, where the data collection event id is equal to the specified data_collection_event_id param and where the project phase id is equal to the specified projectphase_id param.
GET /aco/views/projects/phases/:projectphase_id(\d+)/dces Get a list of joined project, project phase, and data collection event data, where the project phase id is equal to the specified projectphase_id param.
GET /aco/views/projects/phases/dces/:data_collection_event_id(\d+) Get a list of joined project, project phase, and data collection event data, where the data collection event id is equal to the specified data_collection_event_id param.
GET /aco/views/projects/phases/:projectphase_id(\d+)/dces/:data_collection_event_id(\d+) Get a list of joined project, project phase, and data collection event data, where the data collection event id is equal to the specified data_collection_event_id param and where the project phase id is equal to the specified projectphase_id param.
GET /aco/views/projects Get a list of projects, with a nicely formatted project number, e.g. 20_3008.
GET /aco/views/projects/:project_num(\d{2}_\d{4}) Get the data for a specific project with the specified project number, e.g. 21_4016.
GET /aco/views/projects/:project_id(\d+) Get the data for a specific project with the specific project_id unique identifier.