ActiveGraph: An information visualization tool
ActiveGraph was designed to facilitate pattern discovery in data sets. The data sets can consist of numbers or text, such as bibliographic data. This flexibility makes it possible to use ActiveGraph for digital library applications that allow users to:
- Analyze citation statistics
- Collect and annotate papers for individual or group use
- Explore a list of results returned from a search
In the following progression of images, we describe the functionality of ActiveGraph by showing an artificially constructed "picture perfect" data set.
The main part of the display shows a scatter plot of data in two or three dimensions. The controls beneath the plot allow users to assign different aspects of the data to the X, Y, and Z axes, and to the other visual attributes of color, shape, and size. Experimenting with different views of the data helps users perceive relationships in it. The data in these images, for example, can easily be reconfigured to look like this:
In addition to changing the visual features of the data points, a logarithmic transformation has been applied. This aspect of the visualization is controlled by clicking on the axis label in the controls at the bottom of the screen.
When users click on the scatter plot, the point nearest the cursor is selected and indicated by brackets. Detailed information about the selected point appears in the control panel on the right-hand side of the screen. The control panel also has two buttons: "Edit This Item" and "Delete Item". When users click on "Edit This Item", a dialog box appears with controls for changing any of the visible attributes associated with the selected point.
Once users close the dialog box, any changes they made are immediately reflected on the scatter plot.
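The nearest-point selection described above can be sketched in a few lines of Python. This is an illustration only, not ActiveGraph's actual code: the function name `nearest_point`, the use of Euclidean distance in screen coordinates, and the sample coordinates are all assumptions.

```python
import math

def nearest_point(points, cursor):
    """Return the index of the point closest to the cursor, or None if there are no points.

    Assumes points and cursor are (x, y) pairs in the same screen coordinates.
    """
    if not points:
        return None
    cx, cy = cursor
    return min(
        range(len(points)),
        key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy),
    )

# Hypothetical screen positions of three plotted points.
points = [(10, 10), (50, 80), (120, 40)]
selected = nearest_point(points, (48, 70))  # index of the point to bracket
```

A linear scan like this is adequate for small data sets; a larger display might use a spatial index instead.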
In addition to editing the data, users can change the structure of the data by adding, editing, or deleting data fields. For example, if they select "Add Field" from the Edit menu, a dialog box appears that allows them to choose the type of field to add, to name the field, and to configure any additional variables the field might need. The following example shows how to add a field called "Note":
After adding the field, users can write notes about any data point using the "Edit This Item" button:
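In data terms, adding a field extends every record with a new, initially empty attribute that can then be edited per item. The sketch below assumes records are stored as Python dictionaries; the helper `add_field` and the sample records are hypothetical and not taken from ActiveGraph.

```python
def add_field(records, name, default=""):
    """Add a new field to every record, leaving existing values untouched."""
    for record in records:
        record.setdefault(name, default)

# Hypothetical bibliographic records.
records = [
    {"title": "Paper A", "year": 1999},
    {"title": "Paper B", "year": 2003},
]

add_field(records, "Note")                           # structural change: new field
records[0]["Note"] = "Worth sharing with the group"  # per-item edit
```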
The control panel on the right has two tabs, "Selected Item" and "Filters". The "Filters" tab allows users to remove specific points from the display without deleting them from the data set. The following example shows how to filter out the point whose weight is 0. Notice the changes in the scatter plot when it is updated:
Users can select or deselect individual items on the filter list without affecting the rest by holding down Ctrl while clicking on items in the list. They can select ranges of data by clicking on the first item in the range, then holding down Shift and clicking on the last item in the range.
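The selection and filtering behavior described above can be sketched as follows. This is a sketch under assumptions, not ActiveGraph's implementation: the `click` function is hypothetical, Shift is assumed to select the range between the last plain or Ctrl click and the current one, and values selected in the filter list are assumed to be the ones that stay visible.

```python
def click(selected, items, index, anchor=None, ctrl=False, shift=False):
    """Return (new selection, new anchor) after a click on items[index]."""
    if shift and anchor is not None:
        lo, hi = sorted((anchor, index))
        return set(items[lo:hi + 1]), anchor       # select the whole range
    if ctrl:
        return selected ^ {items[index]}, index    # toggle one item only
    return {items[index]}, index                   # plain click: select just this item

weights = [0, 1, 2, 3, 4]  # hypothetical values shown in the filter list

selected, anchor = click(set(), weights, 1)                         # click on 1
selected, anchor = click(selected, weights, 4, anchor, shift=True)  # Shift-click on 4
selected, anchor = click(selected, weights, 2, anchor, ctrl=True)   # Ctrl-click on 2

# Filtering hides points from the display; the data set itself is unchanged.
points = [{"id": i, "weight": w} for i, w in enumerate([0, 3, 2, 0, 4])]
visible = [p for p in points if p["weight"] in selected]
```

Because `visible` is a separate list, clearing the filter restores every point without reloading the data.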
ActiveGraph can handle bibliographic data, including titles and authors, as well as numeric data.
This screen shot shows a data set consisting of papers in the field of information visualization. The X axis is assigned to year of publication and the Y axis and color are assigned to author. Among other things, it immediately becomes apparent that there is a gap in the collection. Indeed, every data point prior to 1999 refers to a paper published in a 1999 review of important papers in the field. Every data point after 2002 refers to a paper published in a journal on information visualization that was first published in 2002.
In this user scenario, ActiveGraph facilitates collaboration and personalization. Researchers who are collaborating in an area can share the results of their reading by writing summaries, categorizing papers according to the group's own criteria, and indicating whether a paper would be of interest to the whole group.
This screen shot shows a data set consisting of citation data for LANL authors.
The color coding here is by subject, in alphabetical order. The blues and purples represent articles in physics, which are particularly important at LANL.
This data set demonstrates a distribution that is common in citation data: a few papers are cited frequently, while most are not. Because of the exponential nature of the distribution, it is helpful to view the data set after applying a logarithmic transformation:
All the data points are still visible, but they are more evenly distributed in the available space, allowing users to see the patterns in previously dense areas more clearly. In this case, it becomes obvious that more papers have received 0 citations than have received 1 citation, 2 citations, or any other single number of citations.
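A logarithmic transformation of this kind can be sketched as below. The text does not say how ActiveGraph handles a value of 0, whose logarithm is undefined, so the sketch assumes the common choice of log10(1 + x), which keeps zero-citation papers on the plot. The citation counts are made up for illustration.

```python
import math
from collections import Counter

def log_scale(value):
    """Map a non-negative count to log10(1 + value) so that 0 remains plottable."""
    return math.log10(1 + value)

# Made-up citation counts: a few heavily cited papers, many uncited ones.
citations = [0, 0, 0, 0, 1, 1, 2, 5, 40, 300]

scaled = [log_scale(c) for c in citations]   # positions on the transformed axis
papers_per_count = Counter(citations)        # how many papers share each count
```

On the transformed axis, the gap between 40 and 300 citations shrinks to less than one unit, which leaves more room for the crowded low end of the range.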