Information Management
for Breeding Bird Atlases

[Figure: atlas grid representation. Sample block records shown: SOSP CO FL Block 1044; SOSP PR P Block 1321; SOSP PR A Block 1366; SOSP PO X Block 1570]

Document presented at the North American Atlas Committee (NORAC) meeting 22 April 2004

CONTENTS

Start Planning Early

Data Management Goal-setting

Sample goals for an atlas project

Projects should list as many goals as possible

Selecting a Data Management System

Key questions for a project coordinator or data management committee to ask:

Table: A comparison of some of the options for managing Breeding Bird Atlas datasets.

List of online data management systems to consider (links)


START PLANNING EARLY

Q: Why read this?
A: You're going to spend 5 years collecting data and more analyzing and writing. Isn't it worth an ounce of planning?

The best recommendation we can give a new atlas project that is still in the planning stage is to make data management part of the planning from the start. For example, when the steering committee is forming a fundraising group and deciding how to collect abundance data, that is also the time to assign an individual or small group to begin recording goals that will help form a data management strategy. The task for this individual or small group is to explicitly identify, in a written list, as many atlas goals as possible. The people deciding the goals will not necessarily be the ones implementing the data management system, but they should review and recommend options based on project needs. The data management strategy should also assign responsibility for evaluating the selected system throughout the project, to make sure it continues to meet the project's needs in a timely and productive way.


DATA MANAGEMENT GOAL-SETTING

How do you determine a set of goals with which to evaluate and select a data management system? The sample goals below may help get you started. Not every goal will be accomplished within a data management system, but before you start it is important to have a list of what you expect to be able to do, so that you can evaluate progress and results. Your goal list will probably be longer than the one provided below.

Sample data management goals for an atlas project:

Goal: Summarize the best breeding evidence by species and block for publication maps
How to evaluate: Given a species, does the system allow a listing of all blocks and the highest evidence for each?

Goal: Produce a map of species richness and of observer coverage effort
How to evaluate: Does the system allow a count summary by block? Does it allow a summary based on minimum evidence, e.g. PR vs CO?

Goal: Obtain information on when birds nest in the province (nest chronology)
How to evaluate: Does the system allow multiple entries of a confirmed species per block, e.g. so that both a Nest-building and a Nest with Young code could be recorded?

Goal: Produce a list of all data contributors
How to evaluate: Does the system allow attribution of sightings to individual observers?

Goal: Provide rapid response to frequent data requests from state agency
How to evaluate: Does the system allow custom requests by species? Does it provide a data entry mechanism that makes it possible to keep up with data entry tasks (e.g. online data entry)?

Goal: Provide rapid feedback to volunteers, showing their contributed data on a map
How to evaluate: Does the system provide maps, or alternatively does it allow easy export to a format used by a mapping system you have access to?

Goal: Provide online data entry for all participants to spread data entry work
How to evaluate: Does the system have a web-based data form? Is it easy enough to use, and does it check for common errors such as safe date violations? Does it let the user print a verification report?

Goal: Allow participants to enter a new field card for each trip into the field
How to evaluate: Can the system handle data entry and validation to track repeated visits during the season on separate field cards?

Goal: Provide some form of technical support
How to evaluate: Is someone available to train coordinators & answer questions, or to correct errors in case something unanticipated happens?
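The first two evaluation questions in the list above can be illustrated with a minimal sketch. It assumes sightings are stored one record per observation and that evidence categories rank as possible (PO) < probable (PR) < confirmed (CO). The record layout, the function names, and the sample rows (including the AMRO record) are hypothetical and chosen only for illustration; an actual system would do the same work with database queries.

```python
from collections import defaultdict

# Assumed ranking of evidence categories: possible < probable < confirmed.
EVIDENCE_RANK = {"PO": 1, "PR": 2, "CO": 3}

# Hypothetical sighting records: (species, category, breeding code, block).
SIGHTINGS = [
    ("SOSP", "CO", "FL", 1044),
    ("SOSP", "PR", "P", 1321),
    ("SOSP", "PR", "A", 1366),
    ("SOSP", "PO", "X", 1570),
    ("SOSP", "PO", "X", 1044),   # a second, weaker record for the same block
    ("AMRO", "CO", "NY", 1044),  # hypothetical second species
]


def highest_evidence(sightings, species):
    """Return {block: best evidence category} for one species."""
    best = {}
    for sp, category, _code, block in sightings:
        if sp != species:
            continue
        current = best.get(block)
        if current is None or EVIDENCE_RANK[category] > EVIDENCE_RANK[current]:
            best[block] = category
    return best


def richness_by_block(sightings, minimum="PO"):
    """Count distinct species per block at or above a minimum category."""
    species_by_block = defaultdict(set)
    for sp, category, _code, block in sightings:
        if EVIDENCE_RANK[category] >= EVIDENCE_RANK[minimum]:
            species_by_block[block].add(sp)
    return {block: len(names) for block, names in species_by_block.items()}


print(highest_evidence(SIGHTINGS, "SOSP"))
print(richness_by_block(SIGHTINGS, minimum="PR"))
```

Because every observation is kept rather than only the best one, the same records can answer both questions, and the minimum evidence level for the richness count can be changed without re-entering data.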

The primary areas to explore that relate to data management for goal-setting fall into three categories: 1) timeliness of data capture, 2) completeness of data capture with respect to project goals, and 3) availability of data (e.g. to participants, to reviewers). Weighing different aspects of these categories, you may come up with some more detailed information needs, such as those that follow.

Projects should list as many goals as possible:

    From a project coordinator's point of view:

    From a regional coordinator perspective:

    From an atlas participant perspective:

    From a public user's perspective (non-participant)


SELECTING A DATA MANAGEMENT SYSTEM

After creating as exhaustive a list of project goals as possible, it is necessary to prioritize these goals to indicate which are most critical to the project's success. At this stage, identify the goals and tasks for which a data management system could save a lot of time. Projects should also determine the resources and expertise available for data management. Clearly defining goals and accessible resources will help a project evaluate and choose between available data management solutions.

Key questions for a project coordinator or data management oversight team to ask are listed below. References to "the committee" mean the data management committee.

How do I compare the different options for managing my project's data?

Most systems that would be used to run a BBA project involve two basic components: a core "data management system", meaning the underlying software purchased from a vendor (e.g. MS Access, Oracle, Filemaker Pro), and the customizations or additions that make that software useful for managing an atlas, such as table structure, forms for data entry, and reports. If the committee has a list of prioritized goals for the project, it has a sound basis for evaluating any data management system. Bear in mind that no data management system will meet all the goals of a project. The committee should "take a tour" of each existing system under consideration, noting where it does and does not meet the goals. The biggest risk at this point is in comparing a completed solution (e.g. an online system that already exists) to a non-existent system that you plan to build, since it is easy to idealize a future product and underestimate the time and effort required to get there. If you are considering building a new system, also consider the cost of, and options for, modifying an existing system.

Do I have to be a database expert?

Well, you have to be an atlas data expert and be willing to wade through mounds of data in multiple formats - that's just inherent to collecting data. Whether or not you need a database expert depends on how large your project is and how its data will be stored. At a minimum, someone making database decisions needs to be expert enough to verify that data are being stored properly, and that routine maintenance and backup occur.

Do I have to use a database? I am more comfortable with spreadsheets.

Database software provides more sophisticated mechanisms that, if used properly, are better at protecting and enhancing the usefulness of your data, as well as providing tools for reducing error of all types - typos, duplication, out-of-range values, and so on. Spreadsheet software, by contrast, has simpler validation tools, and focuses more on data summary or analysis. Database software tends to require more work to use data analysis tools, but keeping your results in a database does not preclude using spreadsheet analysis tools. We strongly recommend the use of database software of some sort above spreadsheet software. The larger the project, the more this is true. Spreadsheets also have size limitations.
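To make the error-reduction point concrete, here is a minimal sketch using SQLite through Python's standard sqlite3 module. The table layout, column names, dates, and the three-category list (PO, PR, CO) are assumptions for illustration, not the structure of any existing atlas system. The sketch shows a database rejecting a duplicate record, a mistyped evidence category, and a reference to a block that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE block (
    block_id INTEGER PRIMARY KEY
);
CREATE TABLE sighting (
    species    TEXT NOT NULL,
    block_id   INTEGER NOT NULL REFERENCES block(block_id),
    category   TEXT NOT NULL CHECK (category IN ('PO', 'PR', 'CO')),
    code       TEXT NOT NULL,
    visit_date TEXT NOT NULL,
    UNIQUE (species, block_id, code, visit_date)
);
""")
conn.execute("INSERT INTO block (block_id) VALUES (1044)")
conn.commit()


def try_insert(row):
    """Insert one sighting; return None on success or the error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sighting (species, block_id, category, code, visit_date) "
                "VALUES (?, ?, ?, ?, ?)",
                row,
            )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Hypothetical rows: one valid record followed by three common entry errors.
results = [
    try_insert(("SOSP", 1044, "CO", "FL", "2004-06-12")),  # valid
    try_insert(("SOSP", 1044, "CO", "FL", "2004-06-12")),  # exact duplicate
    try_insert(("SOSP", 1044, "C0", "FL", "2004-06-13")),  # typo: zero instead of O
    try_insert(("SOSP", 9999, "PR", "P", "2004-06-13")),   # block not in block table
]
for outcome in results:
    print(outcome or "accepted")
```

A spreadsheet would accept all four rows unless someone had set up and maintained equivalent validation by hand on every sheet.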

What type of database should I use?

The most common types of database tools are called relational database management systems (RDBMS). They range from desktop versions (which some professionals do not consider true RDBMS) designed for one or a few individuals to use, to client-server databases that require more expertise to operate (e.g. Oracle, Microsoft SQL Server, Informix). The more robust client-server systems are designed to handle the typical types of database processes associated with being connected to a website, such as very large datasets and multiple concurrent users. None of these RDBMS tools will help you by themselves; each one will require resources to build your atlas database tools within their framework (see first question).

What are the data fields I have to store?

Every project has to go through a series of steps to evaluate this. Generate a list of questions that you will have as the project proceeds (see sample list attached). The data fields will be needed to answer these questions. Keep in mind that any one question can be answered multiple ways. For example, to ask how many blocks are completed, you might want a count of confirmed species per block, or a count of possible breeders and above per block, or a total number of hours of coverage. So a specific field called "Block_completed?", while it sounds handy, might not be sufficient because a simple yes/no field is not flexible enough to handle a question that could be asked a number of different ways.
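The "Block_completed?" example can be sketched as follows. The sketch assumes each block stores the best evidence category per species plus total coverage hours; the data layout, species codes, and thresholds are hypothetical. The point is that "completed" is computed from stored fields under whatever definition is being asked about, rather than stored as a single yes/no value.

```python
# Assumed ranking of evidence categories: possible < probable < confirmed.
EVIDENCE_RANK = {"PO": 1, "PR": 2, "CO": 3}

# Hypothetical per-block data: best category per species, plus effort hours.
BLOCKS = {
    1044: {"species": {"SOSP": "CO", "AMRO": "CO", "NOCA": "PR"}, "hours": 22.5},
    1321: {"species": {"SOSP": "PR", "AMRO": "PO"}, "hours": 6.0},
}


def count_at_or_above(block, minimum):
    """Number of species in the block at or above a minimum category."""
    rank = EVIDENCE_RANK[minimum]
    return sum(1 for cat in block["species"].values() if EVIDENCE_RANK[cat] >= rank)


def is_complete(block, min_confirmed=0, min_species=0, min_hours=0.0):
    """Apply whichever completion criteria the question calls for."""
    return (
        count_at_or_above(block, "CO") >= min_confirmed
        and count_at_or_above(block, "PO") >= min_species
        and block["hours"] >= min_hours
    )


# The same data answers three differently worded versions of the question.
for block_id, block in BLOCKS.items():
    print(
        block_id,
        is_complete(block, min_confirmed=2),
        is_complete(block, min_species=2),
        is_complete(block, min_hours=20.0),
    )
```

If the project later changes its definition of a completed block, only the thresholds change; no stored data has to be revisited.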

Can you show me a list of fields you store so I can make sure I collect those data?

See the previous question. We believe review of a list of data fields is a distraction from (or at least secondary to) finding out whether a system can answer all of your important questions. If an existing data management system meets your functional need, the database "structure" probably has all the fields you need.

Should I build my own, or use something someone else has built?

See the attached chart for an evaluation of some of your options. This decision must be based on a number of factors, from available expertise to size of project to using tools you're comfortable with.

Do I need a system that works over the Internet, or can I use a desktop database?

You should pick a data management system based on your needs. If you have thousands of volunteers and blocks, consider how much time it will take to enter that data, and weigh that against the cost of setting up an online system that allows most volunteers to enter their own data. Consider the benefits of volunteer feedback that comes with a "thanks for entering your field card online" message versus the postcard that says, "your field card from three years ago has been entered. Please check the data". If you are constructing your own, do you need all of it to work on the Web, or just certain parts? Finally, what do your participants expect?

What are the risks of using one of the online "shared" databases that other projects are using?

There is some loss of control when you use a centralized data management system hosted by someone else, since you do not store data locally. Make sure you are comfortable with the data maintenance and backup procedures used by the centralized system (this is true for any database solution you use). A good general recommendation for any data management system is to clearly identify the one copy of the dataset that is the up-to-date copy where all modifications occur (in the case of an online system, this will clearly be the one connected to the web site), and to make one or more backup copies, kept in different locations, at a regular interval. The interval may vary with season, and you can decide what's best.
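As one illustration of the "single master copy plus dated backups" recommendation, the sketch below assumes the master dataset is an SQLite file, which is purely an assumption for the example; other database products have their own backup tools. The function name, file naming pattern, and directory names are hypothetical.

```python
import sqlite3
import tempfile
from datetime import date
from pathlib import Path


def backup_master(master_path, backup_dir, today=None):
    """Copy the master SQLite file to a dated backup file and return its path."""
    today = today or date.today()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"atlas-{today.isoformat()}.sqlite"
    source = sqlite3.connect(master_path)
    dest = sqlite3.connect(target)
    try:
        source.backup(dest)  # copies the whole database in a consistent state
    finally:
        dest.close()
        source.close()
    return target


if __name__ == "__main__":
    # Demonstration in a throwaway directory with a hypothetical one-row table.
    with tempfile.TemporaryDirectory() as tmp:
        master = Path(tmp) / "master.sqlite"
        conn = sqlite3.connect(master)
        conn.execute("CREATE TABLE sighting (species TEXT, block_id INTEGER)")
        conn.execute("INSERT INTO sighting VALUES ('SOSP', 1044)")
        conn.commit()
        conn.close()
        print(backup_master(master, Path(tmp) / "offsite"))
```

In practice the backup directory would sit on a different machine or drive from the master, and the function would be run on whatever schedule the project chooses, for example more often during the field season.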

What are the benefits of using an online "shared" database that other projects are using?

There will be dramatic savings of time and labor that may be used in other parts of the project, because you would be leveraging the time and costs that others have spent developing a solution. An online system, if properly implemented, can distribute the data entry load and speed up almost every aspect of data exchange and review. Also, if it's being used by others, it has been tested, and there will be people who can answer your questions and with whom you can compare notes and uses of the system. 

I'm pretty good with Access / Filemaker Pro. Why shouldn't I just build my own database?

We cannot recommend any single product or vendor, but we can say it is easy to underestimate the resources needed to develop a data management system 'from scratch'. Available expertise in a particular area can help guide your selection (e.g. of database software), but usually one person's limited experience with a particular product cannot substitute for a careful evaluation by several people with different perspectives as well as discussions with database experts. It is also important to note that facility with constructing tables and table relationships in RDBMS software, while a key step, is only the beginning to setting up a data management system. In other words, one person's familiarity with specific software is a poor basis for a decision by itself. Again, any solution will still have to meet the requirements of providing support to the atlas project.

How do I create a system that allows me to make maps easily from my database?

There are numerous paths to making maps: using a GIS (geographic information system), or having someone laboriously color dots in an image editing program, or creatively modifying an off-the-shelf mapping program intended for making travel or trail maps. You will have tradeoffs between cost, learning curve and effort, so you should evaluate options adopted by other atlas projects. GIS products can be expensive and hard to learn, but output from such a system may be in a format more readily transferred to other systems. Also consider what the mapping needs are in terms of frequency of update and number of maps (weekly updates of hand-colored maps are unlikely), and whether you wish to make progress maps widely available or not. You should find out whether anyone with expertise could lend some of his or her skills if needed.
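Whatever mapping tool is chosen, the database usually needs to hand it a simple table keyed by block. The sketch below writes such a table as CSV, which a GIS or other mapping program could then join to its own layer of block boundaries. The function name, column names, and sample values are hypothetical, and the sketch assumes the mapping tool already knows where each block is.

```python
import csv
import io


def export_for_mapping(best_by_block, species, out):
    """Write one row per block: block ID, species, best evidence category."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["block_id", "species", "category"])
    for block_id in sorted(best_by_block):
        writer.writerow([block_id, species, best_by_block[block_id]])


# Hypothetical summary for one species: {block: best evidence category}.
sosp_summary = {1570: "PO", 1044: "CO", 1366: "PR", 1321: "PR"}

buffer = io.StringIO()  # stands in for a file opened for writing
export_for_mapping(sosp_summary, "SOSP", buffer)
print(buffer.getvalue())
```

Keeping the export this plain means the same file can feed a GIS, a spreadsheet, or a simple script that colors grid squares, so the choice of mapping tool can change without changing the database.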

What are the most common mistakes in setting up an atlas database?

One final hint: don't create your own database.

In case we have not been clear: setting up a new data management system is an expensive proposition, period. You will always find someone who says "I can do it cheaply", but usually that statement is made before a careful consideration of your needs. Generally speaking, you cannot anticipate every need when building a new online system yourself. While not a deal-breaker, this causes people to grossly underestimate the costs of building, maintaining, and upgrading the system throughout the atlas. Do not compare an existing system with an idealized version that meets all of your needs. Instead, talk with a database manager (see the list of online systems, below) about how many of your needs could be met using an online system and which could be met through small modifications; sometimes needs can be met outside the database. Finally, compare this with a realistic version of what you might build yourself, remembering that the time spent building it is time you could be out confirming nesting behavior.


Table: A comparison of some of the options for managing Breeding Bird Atlas projects. This table is intended for use along with the goal-setting and common questions documents.

 

Approach A: A series of data files (e.g. spreadsheets) on coordinator's PC

Pros:
· Easy access and control of files for editing
· Allows for complete customization

Cons:
· High risk of data loss / accidents
· Coordinator has main burden of data entry and maintenance
· Not a serious solution for any but the smallest of projects

Who should use: Not recommended. Very small projects (e.g. county atlases) might consider this, but should look more seriously at solution B, especially if the intent is to share data.

Approach B: Building a custom database using a desktop RDBMS product

Pros:
· Allows for customization
· More robust data control options than solution A
· Does not need to be as "user-friendly" as online solutions, since there are few users

Cons:
· Some learning curve with products
· Novices tend to make database functionality decisions based on abilities with software rather than project needs

Who should use: Acceptable solution for small projects with simpler data management goals. Small projects = hundreds of volunteers and blocks.

Approach C: Adapting an existing database for atlas purposes

Pros:
· Shorter development time than A, B, and D
· May adopt some pre-built data standards in the absence of better guidance

Cons:
· May not meet project needs well; variable capacity for customization

Who should use: Possible solution for projects with limited resources or few options for developing a solution.

Approach D: Creating an online database from scratch

Pros:
· Allows more participants to share data entry load
· Provides better user feedback and generally more timely data entry and review
· Allows complete control over all data processing

Cons:
· More expensive to implement than A or B, because more complex functionality conflicts with user-friendliness when creating the system; requires expertise
· Requires an internet service solution, which may not already exist for the organization managing the atlas

Who should use: Good solution for projects with clearly defined needs and the resources to create their own system.

Approach E: Using an online system such as those provided by Cornell Lab or Patuxent/NBII

Pros:
· Resources saved by leveraging the tens of thousands of dollars spent in development & user testing by others
· Allows use of expert system without database expert on staff
· More visibility and availability of information
· Ability to discuss your project data management strategy with someone outside of your project
· Solutions include mapping components

Cons:
· Somewhat reduced capacity for customization

Who should use: Good solution for small or large projects with few resources to devote to data management. Also suits projects wishing to make their information available immediately to a wider audience or for combination with other projects.

List of online data tools

While not a comprehensive list, these organizations and sites have successful online systems for managing atlases and may be able to host your atlas data:

 


Some details updated, links added, Jan. 2008.

Mark Wimer and Anna L. Ott
USGS Patuxent Wildlife Research Center
12100 Beech Forest Rd. Laurel, MD  20708
For more information e-mail: mwimer@usgs.gov
