National Cancer Institute   U.S. National Institutes of Health www.cancer.gov
 

caGrid 1.0

The goal of the cancer Biomedical Informatics Grid (caBIG™) is to develop applications and the underlying systems architecture that connect data, tools, scientists, and organizations in an open federated environment. In meeting this goal, caBIG will necessarily bring together data from many diverse sources. The underlying service-oriented infrastructure for caBIG is caGrid. The first public version (0.5) of caGrid was released on September 9, 2005. caGrid 1.0 culminates the development of the federated infrastructure and will more fully support the needs of the cancer research community.

caGrid defines two types of "grid services" that can be registered as nodes on the grid: Data Services and Analytical Services. caGrid provides a standard infrastructure for bioinformaticians to advertise their services through common metadata defined in a Unified Modeling Language (UML) domain information model. Users can access these grid services and data programmatically, subject to locally managed access control policies, using strongly typed data objects in XML format. The caGrid infrastructure also provides strong semantic specification by binding metadata to description logic terminology concepts, which lets users discover new and interesting scientific information through semantically aware searches.


Software and Documentation Links

caGrid 1.0 Installer Instructions
caGrid 1.0 Installer - Install caGrid 1.0
caGrid 1.0 Source - Download caGrid 1.0 Source code
caGrid 1.0 Users Guide
caGrid 1.0 Programmers Guide
caGrid 1.0 Release Notes
NCICB Download Site
caGrid wiki

Project Site

caGrid 1.0 GForge Project Page
caGrid 1.0 GForge File Release Site
caGrid 1.0 GForge Document Release Site

caGrid 1.0 Portal

You can launch the caGrid 1.0 Portal, which is part of the caGrid 1.0 release. It should be your starting point for monitoring and discovering the services available in caGrid.

The tool provides a visual display of the services on the caGrid infrastructure and of the institutions participating in the caBIG program.


caGrid 1.0 Browser - Early Preview

The caGrid 1.0 browser is a web-based application that allows users to discover advertised caBIG grid resources and to query those resources for data of interest.

The tool uses the grid APIs supported by caGrid 1.0 to browse advertised services, to discover services based on metadata and registered objects in the Cancer Data Standards Repository (caDSR) and on concepts from the Enterprise Vocabulary Service (EVS), and to query the deployed services using the caBIG XML query language.

Users can access the browser with their existing NCI user accounts (user name and password). Users who do not have an NCI account can request one through the tool. Account requests are approved in accordance with the caGrid security policies; the Security Working Group will determine the appropriate policies for registering users.


Project Details

The caGrid 1.0 team comprises members from the following organizations:

  • Ohio State University - Biomedical Informatics Department - Provided Overall Technical Leadership
  • University of Chicago/Argonne National Laboratory
  • Duke Comprehensive Cancer Center
  • ScenPro, Inc
  • SemanticBits, LLC
  • Science Applications International Corporation (SAIC)
  • Booz Allen Hamilton - Provided Program Management
  • National Cancer Institute Center for Bioinformatics (NCICB) - Provided Government Oversight

A significant number of enhancements have been incorporated into the caGrid 1.0 infrastructure. A few highlights:

  • Migration of the underlying infrastructure to support services based on the standard Web Services Resource Framework (WSRF) specification
  • Complete overhaul of the federated security infrastructure to satisfy caBIG security needs, incorporating many of the recommendations made in the caBIG™ Security White Paper Technology Evaluation
  • New workflow capabilities that enable orchestration of services using the industry-standard Business Process Execution Language (BPEL)
  • New Federated Query Processing (FQP) capability built in collaboration with the Cancer Translational Research Informatics Platform (caTRIP) project, a caBIG-funded project
  • Performance and scalability improvements to the services through implementation of specifications such as WS-Enumeration in the underlying Globus Toolkit infrastructure
  • Grid-wide object identifier support through integration with The Handle System® service from the Corporation for National Research Initiatives
  • Extensive enhancements to the metadata infrastructure, including standard grid service APIs to the Global Model Exchange (GME), Cancer Data Standards Repository (caDSR), and Enterprise Vocabulary Service (EVS)
  • Tighter integration with NCICB components used by caBIG-funded projects, including the Common Security Module (CSM) and the caCORE Software Development Kit (SDK)
  • Development of an extensive automated system testing framework to validate the various components of the infrastructure

In addition to the highlights above, the caGrid 1.0 infrastructure contains the following tools:

Introduce Toolkit: a service creation toolkit built by the caGrid team. It supports easy development and deployment of caBIG-compatible, grid-enabled data and analytical services. The Introduce toolkit reduces the need for service developers to manage the low-level details of the WSRF specification and of integration with the Globus Toolkit.

Grid Authentication and Authorization of Reliably Distributed Services (GAARDS): provides services and tools for grid-wide administration and security enforcement for services deployed on the caGrid infrastructure. GAARDS consists of the following security components:

  • Dorian: allows for the provision and management of user accounts, providing an integration point between external security domains and the grid.
  • Grid Grouper: provides a group-based authorization solution, wherein grid services and applications enforce authorization policy based upon group memberships defined and managed at the grid level.
  • Grid Trust Services (GTS): provides a mechanism for maintaining and provisioning a federated trust fabric of certificate authorities in caGrid

caGrid 1.0 Portal: provides a visual view of services running on the infrastructure. The portal provides:

  • Geographic map of nodes running on the caGrid infrastructure
  • caBIG participating institution / service provider information
  • Dynamic status updates of grid services

Reference Implementations and Early Adopters

As part of the caGrid 1.0 infrastructure release, the following projects have been working with the caGrid development team and are at various stages of completing their grid enablement:

  • GeneConnect - Extensible informatics platform that integrates diverse data types and supports interoperable analytic tools - Washington University
  • GridImage - Grid application for viewing and evaluating images - Ohio State University
  • caBIO - Cancer Bioinformatic Data Service - NCICB
  • caArray - Microarray Data Services - NCICB
  • caTRIP - Grid application that ties together disparate data resources in a metadata-driven fashion - Duke Comprehensive Cancer Center
  • GenePattern - Analysis workflow tool developed to support multidisciplinary genomic research programs - Broad Institute
  • geWorkbench - Grid-enabled platform for integrated genomics - Columbia University
  • Bioconductor - Analytical service for gene expression and other high-throughput analysis in molecular biology - Fred Hutchinson Cancer Research Center

External Technologies Used by caGrid

caGrid 1.0 leverages the following existing technologies:

  • Globus Toolkit: provides the core grid infrastructure and supports service deployment, service registry, invocation, and secure communication - From the Globus Alliance
  • Mobius GME: provides a grid repository for the XML Schemas of strongly typed objects transferred on caGrid - From Ohio State University
  • Cancer Data Standards Repository (caDSR): provides a repository for Common Data Elements and UML models - From the National Cancer Institute Center for Bioinformatics
  • Enterprise Vocabulary Services (EVS): provides controlled vocabularies - From the National Cancer Institute Center for Bioinformatics
  • ActiveBPEL™: provides an open source workflow engine whose implementation follows the Business Process Execution Language standard - From Active Endpoints, Inc.
  • The Handle System®: a general-purpose distributed information system that provides efficient, extensible, and secure identifier and resolution services for use on caGrid - From the Corporation for National Research Initiatives
  • Grouper: provides the ability to manage group information across integrated applications and repositories - From Internet2

User Information

Subscribe to the caGrid Users Listserv


Contacts

Michael Keller - caBIG Architecture Workspace Lead

Scott Oster - caGrid Lead Architect - Ohio State University

Krishnakant (Avinash) Shanbhag - Director, Core Infrastructure Engineering - NCICB

List of caGrid Team Members


Previous Releases

caGrid 0.5

last modified 08-21-2007 02:38 PM