Metrics Challenges

Scope is a measure of the amount of material in the archive.


At what granularity should scope be measured, and in what unit? We suggest reporting statistics at both the archive level and the collection level, since this appears to be important to the archive service provider.


What should be counted, and in what unit? What is the equivalent of "titles" or "articles"? We suggest counting the discrete files of digital objects in five types: text, image, application, audio, and video. Within the application type, individual formats could be counted separately if doing so would better characterize the scope of a collection or archive. For example, a collection in which PDF files or Google Earth files make up a significant percentage might be best characterized by reporting the number and size of files in those formats.

Perhaps this decision should be rule-based. For example, if 25% or more of the files in a collection are of one application format, counts for that format should be reported separately.
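The counting rule above could be sketched roughly as follows. This is only an illustration under several assumptions that the notes do not settle: each file is described by a MIME type string and a size in bytes, the 25% share is computed against all files in the collection, and the function name `scope_report` and the `"other"` bucket are hypothetical.

```python
from collections import Counter

# The five types named in the text above.
TOP_LEVEL_TYPES = ("text", "image", "application", "audio", "video")


def scope_report(files, threshold=0.25):
    """Summarize a collection from (mime_type, size_in_bytes) pairs.

    Returns file counts and total bytes per top-level type, plus any
    application format whose share of all files meets the threshold.
    """
    counts, sizes = Counter(), Counter()
    app_counts, app_sizes = Counter(), Counter()

    for mime_type, size in files:
        mime_type = mime_type.lower()
        top = mime_type.partition("/")[0]
        if top not in TOP_LEVEL_TYPES:
            top = "other"  # assumption: keep unrecognized types visible
        counts[top] += 1
        sizes[top] += size
        if top == "application":
            app_counts[mime_type] += 1
            app_sizes[mime_type] += size

    total = sum(counts.values())
    # Assumption: the share is measured against every file in the collection.
    reported_separately = {
        fmt: {"files": n, "bytes": app_sizes[fmt]}
        for fmt, n in app_counts.items()
        if total and n / total >= threshold
    }
    by_type = {t: {"files": counts[t], "bytes": sizes[t]} for t in counts}
    return {
        "total_files": total,
        "by_type": by_type,
        "reported_separately": reported_separately,
    }
```

Running the same function once per collection and once over the whole archive would give both levels of statistics suggested earlier. The threshold is a parameter so that a different rule can be tried without changing the code.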


The URL+timestamp might be considered the unique identifier for discrete objects in a Web archive. Is this the level at which usage ought to be tracked and usage reports generated for institutions?

  • Unique identifiers for materials in the Web archive
    • Needed for tracking usage at the level of identification
    • Needed for identifying versions of the same resource (for example, the same URL captured at different times)
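As a minimal sketch of how a URL+timestamp identifier might support both needs, the example below counts usage per institution at the level of the individual capture and groups captures of the same URL as versions. The names `CaptureId`, `count_usage`, and `group_versions` are hypothetical, the 14-digit `YYYYMMDDhhmmss` UTC timestamp format is an assumption, and no URL normalization is attempted.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureId:
    """Hypothetical identifier for one archived object: URL plus capture time."""
    url: str
    timestamp: str  # assumption: 14 digits, YYYYMMDDhhmmss, in UTC

    @classmethod
    def from_datetime(cls, url, when):
        # Expects a timezone-aware datetime; it is converted to UTC first.
        return cls(url, when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S"))


def count_usage(events):
    """Count requests per (institution, capture) from (institution, CaptureId) pairs."""
    return Counter(events)


def group_versions(captures):
    """Map each URL to the sorted timestamps of its captures (its versions)."""
    versions = defaultdict(set)
    for capture in captures:
        versions[capture.url].add(capture.timestamp)
    return {url: sorted(stamps) for url, stamps in versions.items()}
```

Because the identifier is a frozen dataclass, it is hashable and can be used directly as a counting key. Usage can then be rolled up from capture to URL, collection, or archive if reporting at the capture level proves too fine-grained.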

Content Description

The purpose of describing the content of an archive is to allow a library to assess how broadly all or a portion of the content in a Web archive applies to its needs. Breadth of applicability is a fundamental criterion in the selection of materials for a collection.

What attributes might consistently describe a collection within an archive?

  • Topical areas covered
  • Unique or exclusive content available
  • Dates materials were harvested

Curated vs. non-curated Web archive:

  • Impact of curation on the choice between automated and human-mediated content description