Saturday, September 15, 2012

Where Do the Data Go?

Oceanographers normally collect large amounts of data in the course of their work at sea. In the background on most academic research vessels, sensors run continuously as the ship moves along the trackline, from the time it leaves port to when it returns. They measure meteorological conditions (wind speed and direction, air temperature and barometric pressure, humidity and precipitation, and long- and short-wave solar radiation) and sea surface conditions (seawater temperature, salinity, and fluorescence).

The New Horizon's bridge and, above it, the meteorological sensors. Note the two anemometers on either side, presently measuring winds of 19 knots from a direction of 318 degrees relative to the vessel. After correcting for the ship's speed and heading, this corresponds to a true wind speed of 14 knots out of 21 degrees (i.e., just east of north) [Photo: G. Lawson]
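For readers curious about the correction mentioned in the caption, here is a minimal sketch of how an anemometer reading can be converted to true wind by vector addition. It is not the ship's actual processing software. The function name `true_wind` is made up for illustration, and it assumes the ship travels exactly along its heading (no current or leeway), which a real system would not assume.

```python
import math


def true_wind(apparent_speed_kn, apparent_dir_rel_deg, ship_speed_kn, ship_heading_deg):
    """Return (true_speed_kn, true_from_dir_deg) for one anemometer reading.

    Wind directions are the compass directions the wind blows FROM.
    Assumption: the ship moves exactly along its heading.
    """
    # Apparent wind "from" direction in compass terms (relative angle + heading).
    apparent_from = math.radians((ship_heading_deg + apparent_dir_rel_deg) % 360.0)
    heading = math.radians(ship_heading_deg)

    # Air velocity relative to the ship (east, north). The air moves
    # opposite to the direction it comes from, hence the minus signs.
    a_east = -apparent_speed_kn * math.sin(apparent_from)
    a_north = -apparent_speed_kn * math.cos(apparent_from)

    # Ship velocity over the ground (east, north).
    s_east = ship_speed_kn * math.sin(heading)
    s_north = ship_speed_kn * math.cos(heading)

    # Air velocity over the ground = velocity relative to ship + ship velocity.
    t_east = a_east + s_east
    t_north = a_north + s_north

    speed = math.hypot(t_east, t_north)
    from_dir = math.degrees(math.atan2(-t_east, -t_north)) % 360.0
    return speed, from_dir


if __name__ == "__main__":
    # 19 knots at 318 degrees relative is the reading in the photo; the ship
    # speed and heading below are placeholders, not values from the cruise.
    print(true_wind(19.0, 318.0, ship_speed_kn=10.0, ship_heading_deg=60.0))
```

When the true wind is nearly calm, the returned direction is not meaningful and should be ignored.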

On our cruise, additional data are collected continuously by acoustic transducers attached to the hull of the ship to measure backscattering at various frequencies (an indicator of plankton and nekton living in the water column). A hose mounted on the bow pulls in air to measure the partial pressure of CO2 (pCO2), and water from the uncontaminated seawater line is used to measure pCO2, Dissolved Inorganic Carbon (DIC), and pH continuously. At stations, more data are collected by the instruments deployed over the side of the ship. The CTD/rosette, deployed either to 1000 m with the Video Plankton Recorder attached or to 3000 m without it, collects pressure, temperature, salinity, fluorescence, oxygen, and light transmission data, and the Video Plankton Recorder adds hundreds of gigabytes of video images of plankton. The MOCNESS, towed to 1000 m, measures pressure, temperature, and salinity while collecting zooplankton in 8 depth strata between 1000 m and the surface on the up-portion of the tow. The HammarHead towed body collects broad-band acoustics data as well as pressure, temperature, salinity, and fluorescence at selected depths. The Reeve Net, used to collect animals for live work and other experimental purposes, also has a time-depth recorder to provide a record of the tow.

In the lab on the ship, more experimental data are generated by analyzing water samples from the Niskin bottles on the rosette, which go down open and are closed at specific depths on the way back to the surface. These measurements include pH, alkalinity, nutrients (phosphorus, nitrate, nitrite), pCO2, DIC, and Dissolved Organic Carbon (DOC). There are also the data generated on board from physiological, morphological, and genetic studies of the pteropods. To keep track of all of the data being collected, we keep an electronic event log (E-Log) that records the beginning and end of every over-the-side instrument deployment, including the instrument name, time, ship position, depth of the cast, water depth, station number, transect number, and person responsible. On this cruise we have an iPad that can be taken around the ship to where events are happening and used to log the event via a wireless connection to the main event log server. By the time a cruise ends, the total amount of data can range from hundreds of megabytes to a few terabytes. So what happens to all of these data sets at the end of the cruise, including those that are not produced until samples get back to the laboratory for further analysis?

The electronic event logger. This is a web browser-based application running from a server on the ship that can be accessed by any computer on the ship's network. We use it to keep track of when and where each event (such as an instrument deployment or the ship arriving on station) occurs. This is key to later data analysis.
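To give a feel for what one event-log entry holds, here is a rough sketch of a record with the fields listed above. It is not the E-Log's real schema. The class name, field names, and example values are all invented for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventRecord:
    """One start or end entry for an over-the-side deployment (hypothetical schema)."""
    instrument: str
    action: str            # "start" or "end"
    time_utc: datetime
    latitude_deg: float
    longitude_deg: float
    cast_depth_m: float
    water_depth_m: float
    station: int
    transect: int
    person: str

    def __post_init__(self):
        # Reject anything other than the two event types the log tracks.
        if self.action not in ("start", "end"):
            raise ValueError("action must be 'start' or 'end'")

    def to_row(self) -> dict:
        """Flatten the record to plain values suitable for a text-based log."""
        row = asdict(self)
        row["time_utc"] = self.time_utc.isoformat()
        return row


def deployment_minutes(start: EventRecord, end: EventRecord) -> float:
    """Duration of a deployment, given its matching start and end entries."""
    if (start.instrument, start.station) != (end.instrument, end.station):
        raise ValueError("start and end entries do not belong to the same deployment")
    return (end.time_utc - start.time_utc).total_seconds() / 60.0


if __name__ == "__main__":
    # Placeholder values, not taken from the actual cruise log.
    common = dict(latitude_deg=35.0, longitude_deg=-135.0, cast_depth_m=1000.0,
                  water_depth_m=5000.0, station=1, transect=1, person="A. Scientist")
    start = EventRecord("CTD", "start", datetime(2012, 9, 15, 8, 0, tzinfo=timezone.utc), **common)
    end = EventRecord("CTD", "end", datetime(2012, 9, 15, 9, 30, tzinfo=timezone.utc), **common)
    print(start.to_row())
    print(deployment_minutes(start, end))
```

Recording a start and an end entry for each deployment is what later allows every measurement to be tied back to a time, place, and instrument.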

Gareth Lawson using the iPad to enter a CTD recovery into the E-Log [Photo: P. Wiebe]

The answer is that the research funds come with a requirement for data sharing. Since this cruise has been funded by the biological oceanography section at the National Science Foundation (NSF), the data must be submitted to an official data repository and made publicly available within two years, or sooner if possible. The repository these data will be submitted to is the Biological and Chemical Oceanography Data Management Office (BCO-DMO), located in Woods Hole, MA. The BCO-DMO has a mandate to serve principal investigators funded by the NSF Geosciences Directorate (GEO), Division of Ocean Sciences (OCE) Biological and Chemical Oceanography Programs, and Office of Polar Programs (OPP) Antarctic Sciences (ANT) Organisms & Ecosystems Program. The BCO-DMO manages a repository where marine biogeochemical and ecological data and information developed in the course of scientific research can easily be stored, protected, and disseminated on short and intermediate time-frames. Ultimately the data will be sent to permanent archives like the National Oceanographic Data Center.

The data (and metadata) in the BCO-DMO repository are readily available to anyone with a computer and web browser via the internet. They are available either in a text-based format or in a graphical map-server form. Anyone reading this blog can go to the BCO-DMO web site and locate data from this cruise (once they are submitted) or from other cruises. It is important to remember that if you intend to use other people's data, you should let them know.

Researcher's End Game

When all is said and done
And we are long since gone
What will remain to be distributed
Are the data we contributed
With digital identifiers assigned
And our names clearly defined
Our work will be on-line
Until the end-of-time.
- PHW 16 June 2008

- Peter Wiebe

