National Cancer Institute   U.S. National Institutes of Health www.cancer.gov
caBIG® Knowledge Center: A part of the Enterprise Support Network

Incubator Projects

  • WorkflowHelper

This project is a component of the caOS workflow execution engine. Given a concrete workflow element (one which is fully described and mapped onto a physical computational resource), it exposes an API which enables its creation, execution, monitoring and destruction.
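The create, execute, monitor and destroy lifecycle described above can be sketched as a small state machine. This is only an illustration: the class name `WorkflowElement`, its methods, the `State` values and the example URL are hypothetical and are not taken from the WorkflowHelper API, and a local callable stands in for the work done on the remote resource.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    FINISHED = "finished"
    DESTROYED = "destroyed"


class WorkflowElement:
    """Hypothetical stand-in for one concrete workflow element."""

    def __init__(self, name, resource_url, task):
        self.name = name
        self.resource_url = resource_url  # physical resource the element is mapped onto
        self._task = task                 # callable standing in for the remote work
        self.state = State.CREATED
        self.result = None

    def execute(self):
        # An element can only be started once, directly after creation.
        if self.state is not State.CREATED:
            raise RuntimeError(f"cannot execute from state {self.state.value}")
        self.state = State.RUNNING
        self.result = self._task()
        self.state = State.FINISHED

    def status(self):
        # Monitoring is reduced to reporting the current state.
        return self.state.value

    def destroy(self):
        # Release the result and mark the element as unusable.
        self.result = None
        self.state = State.DESTROYED


element = WorkflowElement("align", "https://example.org/services/Align", lambda: 42)
element.execute()
print(element.status(), element.result)
element.destroy()
print(element.status())
```

A real element would run asynchronously on the grid, so monitoring would poll a remote resource rather than read a local attribute.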

  • WorkflowManager

This project is a component of the caOS workflow execution engine. On its backend, it interfaces with the WorkflowHelper using its API to create workflow elements and assemble them to obtain a complete workflow. On its frontend, its API must be given the concrete description of a workflow in order to create, execute, monitor and, eventually, destroy a workflow.
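As a rough sketch of the division of labour described above, the manager below takes a concrete workflow description, asks a helper to create each element, and runs the elements in dependency order. The description format (a dictionary with an optional `after` list), the `create_element` callback and all method names are assumptions made for illustration, not the WorkflowManager API.

```python
from graphlib import TopologicalSorter


class WorkflowManager:
    """Hypothetical manager that assembles elements into one workflow."""

    def __init__(self, create_element):
        # create_element(name, spec) stands in for the WorkflowHelper API and
        # must return a zero-argument callable that runs the element.
        self._create_element = create_element
        self._elements = {}
        self._graph = {}
        self.status = {}

    def create(self, description):
        for name, spec in description.items():
            self._elements[name] = self._create_element(name, spec)
            self._graph[name] = list(spec.get("after", []))
            self.status[name] = "created"

    def execute(self):
        # Run every element after the elements it depends on.
        order = list(TopologicalSorter(self._graph).static_order())
        for name in order:
            self._elements[name]()
            self.status[name] = "finished"
        return order

    def destroy(self):
        self._elements.clear()
        self._graph.clear()
        self.status.clear()


log = []
description = {
    "report": {"after": ["align"]},
    "align": {"after": ["fetch"]},
    "fetch": {},
}
manager = WorkflowManager(lambda name, spec: (lambda: log.append(name)))
manager.create(description)
order = manager.execute()
print(order, manager.status)
manager.destroy()
```

Keeping element creation behind a callback mirrors the backend/frontend split: the manager only knows the workflow's shape, while the helper knows how to run a single element.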

  • BpelConsole

This project is part of the caOS project. Its main functionality is to interpret a workflow description in BPEL format and enable the execution of the described workflow. In order to execute a workflow, the concrete workflow description is built as expected by the WorkflowManager API and that API is used to create the workflow.
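To illustrate the translation step, the sketch below reads a tiny BPEL document and turns its `invoke` activities into a flat description in which each step runs after the previous one. It assumes the WS-BPEL 2.0 executable namespace and handles only a single flat `sequence`; the output format and the function name `to_concrete_description` are hypothetical, and the format BpelConsole actually builds for the WorkflowManager is not described in this page.

```python
import xml.etree.ElementTree as ET

# WS-BPEL 2.0 executable process namespace (assumed; the BPEL version
# supported by BpelConsole is not stated here).
BPEL_NS = "http://docs.oasis-open.org/wsbpel/2.0/process/executable"

SAMPLE_BPEL = f"""<process name="demo" xmlns="{BPEL_NS}">
  <sequence>
    <invoke name="fetch" partnerLink="DataService" operation="query"/>
    <invoke name="align" partnerLink="AlignService" operation="run"/>
  </sequence>
</process>"""


def to_concrete_description(bpel_text):
    """Map each <invoke> to a step that runs after the preceding one."""
    root = ET.fromstring(bpel_text)
    description = {}
    previous = None
    # iter() walks the tree in document order, which matches sequence order
    # for a flat <sequence>; nested flows or branches are not handled.
    for invoke in root.iter(f"{{{BPEL_NS}}}invoke"):
        name = invoke.get("name")
        description[name] = {
            "partner_link": invoke.get("partnerLink"),
            "operation": invoke.get("operation"),
            "after": [previous] if previous else [],
        }
        previous = name
    return description


steps = to_concrete_description(SAMPLE_BPEL)
for name, spec in steps.items():
    print(name, spec)
```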

  • cql2preview

This project is a new version of CQL. This version incorporates feedback from the caGrid community about limitations in the current version of CQL. More details can be found at CQL2. To test-drive a beta 1 release of CQL 2, follow the instructions at CQL2 Tech Preview.

  • Tide

The Tide system is designed to be a BitTorrent-like, grid-based parallel data transfer solution. It borrows some aspects of the overall BitTorrent model but differs in others. The key difference, other than using grid-based protocols to negotiate all transfers, is that the grid does not currently have the massive ad hoc user community found in BitTorrent: the data our user community transfers at this point may be readable only by certain people, by members of a certain group, or by holders of certain credentials. With the set of potential data consumers limited in this way, there is little use, at least early on, for supporting swarm-style data transfer, that is, peer-to-peer transfer between consumers who are consuming the same data at the same time. Judging from our use cases, that scenario is unlikely to occur. Tide therefore attempts to be very simple in the way it stores, publishes, and advertises data replicas. There are also no fixed chunk size requirements, and no requirement that all of the data actually exist anywhere.

The Tide system enables a Tide Descriptor to be created that describes the tide and the currents (chunks) that make up the tide. This Descriptor is then published to a Tide Replica Manager, which manages a list of potential Tide Servers from which these data pieces can be consumed. The pieces can then be consumed in any order and in any proportion from any or all of the potential Tide Servers, by any retrieval algorithm on the client side. This flexibility in storage and consumption allows our user community to design the solutions that best suit the predicted storage and retrieval patterns of their respective communities, while still gaining the essential parallelism of multiple senders and one or more retrievers over the Internet using standard HTTP protocols.
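One possible client-side retrieval algorithm is sketched below: currents of different sizes are fetched in a random order from whichever server holds them, checked against their md5sum, and written into place by offset. The in-memory `servers` dictionary, the field names and the `retrieve` function are hypothetical stand-ins for real Tide Servers reached over HTTP.

```python
import hashlib
import random

data = bytes(range(256)) * 4  # 1024 bytes of sample data

# Currents need not share a size: (offset, length) pairs covering the data.
layout = [(0, 100), (100, 400), (500, 24), (524, 500)]
currents = [
    {"offset": o, "length": n, "md5sum": hashlib.md5(data[o:o + n]).hexdigest()}
    for o, n in layout
]


def hold(*offsets):
    """Build the store of a server holding only the currents at these offsets."""
    return {o: data[o:o + n] for o, n in layout if o in offsets}


# No single server has to hold every current.
servers = {
    "server-a": hold(0, 100),
    "server-b": hold(100, 500, 524),
    "server-c": hold(0, 524),
}


def retrieve(currents, servers, total_length, rng):
    buffer = bytearray(total_length)
    pending = list(currents)
    rng.shuffle(pending)  # pieces may be consumed in any order
    for current in pending:
        offset, length = current["offset"], current["length"]
        holders = [name for name, store in servers.items() if offset in store]
        rng.shuffle(holders)  # ... and from any server that has them
        for name in holders:
            piece = servers[name][offset]
            if hashlib.md5(piece).hexdigest() == current["md5sum"]:
                buffer[offset:offset + length] = piece
                break
        else:
            raise LookupError(f"no valid replica for current at offset {offset}")
    return bytes(buffer)


restored = retrieve(currents, servers, len(data), random.Random(0))
print(restored == data)
```

A real client would issue these requests in parallel; the sequential loop only shows that order and source do not affect the result.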

  • TideReplicaManager

The TideReplicaManager is a service that acts as a registry for Tides. It enables TideDescriptors to be hosted along with a list of the hosts that claim to have the particular Tide's data available. The registry tracks the replicas that may exist, and the Tide client software uses this information for retrieval.
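The registry role can be pictured with a minimal in-memory sketch. The class `TideReplicaRegistry`, its method names and the example host URLs are assumptions for illustration; the actual service is a grid service whose operations are not listed in this page.

```python
class TideReplicaRegistry:
    """Hypothetical in-memory registry of descriptors and replica hosts."""

    def __init__(self):
        self._descriptors = {}
        self._hosts = {}

    def publish(self, tide_id, descriptor):
        # Host the descriptor; replicas are registered separately.
        self._descriptors[tide_id] = descriptor
        self._hosts.setdefault(tide_id, set())

    def add_replica(self, tide_id, host_url):
        # A host claims to have this Tide's data; the claim is not verified.
        if tide_id not in self._descriptors:
            raise KeyError(f"unknown tide: {tide_id}")
        self._hosts[tide_id].add(host_url)

    def remove_replica(self, tide_id, host_url):
        self._hosts[tide_id].discard(host_url)

    def lookup(self, tide_id):
        # Clients use the descriptor plus the host list to plan retrieval.
        return self._descriptors[tide_id], sorted(self._hosts[tide_id])


registry = TideReplicaRegistry()
registry.publish("genome-build", {"name": "genome-build", "length": 1024})
registry.add_replica("genome-build", "https://tide-b.example.org")
registry.add_replica("genome-build", "https://tide-a.example.org")
descriptor, hosts = registry.lookup("genome-build")
print(descriptor["name"], hosts)
```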

  • TideServer

The TideService is the service that maintains a Tide (the descriptor of the data and the actual data pieces, also known as currents). A Tide can be generated from any data. To publish a Tide, one must first create a TideDescriptor. The TideDescriptor contains a TideInformation and a list of Currents. The TideInformation is the metadata about the Tide, such as its name, length, md5sum, and a description. The list of Currents holds the metadata about each data piece, such as its length, its md5sum, and the offset at which the piece sits in the original data. Once this TideDescriptor is created, it can be used to generate a new Tide by asking the TideService to create one. This creation operation returns a TransferServiceContextReference, which can then be used to upload the actual data to the TideService.
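The descriptor structure described above can be sketched with plain data classes. The field names follow the metadata listed in the text, but the Python classes, the `describe` helper and the chunk lengths are hypothetical, and the create and upload steps through a TransferServiceContextReference are not modelled.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TideInformation:
    name: str
    length: int
    md5sum: str
    description: str


@dataclass(frozen=True)
class Current:
    offset: int   # position of the piece in the original data
    length: int
    md5sum: str


@dataclass(frozen=True)
class TideDescriptor:
    information: TideInformation
    currents: tuple


def describe(name, description, data, chunk_lengths):
    """Build a descriptor for data split into pieces of the given lengths."""
    if sum(chunk_lengths) != len(data):
        raise ValueError("chunk lengths must cover the data exactly")
    currents = []
    offset = 0
    for length in chunk_lengths:
        piece = data[offset:offset + length]
        currents.append(Current(offset, length, hashlib.md5(piece).hexdigest()))
        offset += length
    info = TideInformation(name, len(data), hashlib.md5(data).hexdigest(), description)
    return TideDescriptor(info, tuple(currents))


payload = b"hello tide world"
descriptor = describe("greeting", "sample tide", payload, [5, 1, 10])
for current in descriptor.currents:
    print(current.offset, current.length, current.md5sum)
```

Because each Current carries its own offset and checksum, a client can verify and place a piece without knowing anything about the other pieces.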

  • Interfaces

The University of Minnesota has developed Introduce extensions making it easier to develop Grid services. Please find more information on the University of Minnesota pages.

  • Kerberos Security

The Yale Krauthammer Lab has provided a modified version of the LDAPLoginModule from CSM that can be configured to provide only the user information that CSM needs, allowing a Kerberos server to provide login services. Source, a binary distribution, and a presentation are available at the Yale Krauthammer Lab.
