Open Science and Cyanobacterial Research at EPA

By: Jeff Hollister, Betty Kreakie, and Bryan Milstead

Algal bloom containing cyanobacteria.

It wasn’t long ago that science always occurred along a well-worn path. Observations led to hypotheses; hypotheses led to data collection; data led to analyses; and analyses led to publications. And along this path, data, hypotheses, and analyses were held close and, more often than not, the only public-facing view of the research was the final publication.

Science has come a long way with this model. However, it was conceived when print was the main medium and most scientific questions could be investigated by a few scientists over a short period of time.

Then came computers. Then came the internet.

Just as in every other aspect of modern life, these advances are greatly affecting science. They have changed who conducts our science, how we share it, and how others interact with scientific information. All of these changes are playing out through the increasing openness of all parts of the scientific process.

This broad shift, known as open science, has been defined as having several components. These components suggest that “open science”:

  • is transparent (and, of course, open)
  • includes all parts of research (data, code, etc.)
  • allows others to repeat the work
  • should be posted on an open and accessible website (while protecting Personally Identifiable Information, etc.)
  • occurs along a gradient (i.e. not just a binary open vs. not open)

At EPA, we are learning how to make our research on cyanobacteria and human health meet those criteria (for more info, join our webinar). We are implementing open science in three ways: (1) making our work available via open access publishing; (2) providing access to the code used in our analyses; and (3) making our data openly available.

Several members of our research group have embraced open access options for publishing their research. For instance, our colleague Elizabeth Hilborn and her co-authors published the results of their study, which examined a group of dialysis patients following exposure to the cyanobacterial toxin microcystin, in one of the pioneering open access journals, PLoS ONE. Also in PLoS ONE, EPA scientist Bryan Milstead and his collaborators published a modeling method that combines the U.S. Geological Survey’s SPARROW model (a modeling tool for interpreting regional water-quality monitoring data), lake depth, lake volume, and EPA National Lakes Assessment data to estimate nutrient concentrations.

As our work progresses, we will continue to choose open access journals. In our experience, publishing this way has allowed our research to reach a larger audience, and it lets us more easily track its impact through readership levels using available tools such as PLoS Article-Level Metrics.

We are also sharing our data. Currently, this is accomplished through supplements added to publications and through sites such as the EPA’s Environmental Dataset Gateway. We plan to expand these efforts through data publications, version-controlled repositories, and the development of Application Programming Interfaces (APIs) that provide access to data for developers and other scientists.

The goal of these efforts, and more (stay tuned for a future post on how coding fits into open science), is to increase the reproducibility of our work (though challenges remain), reach broader audiences, and eventually have a greater impact on our understanding and management of harmful algal blooms.

About the Authors: EPA ecologists Jeff Hollister, Betty Kreakie, and Bryan Milstead study greenwater for a living. If you have questions for them, join the webinar on June 25th or follow the Twitter chat on June 26th using #greenwater.