SRU (Search/Retrieval Using URL)

SRU/CQL Standardization in OASIS

OASIS Search Web Services Technical Committee


The OASIS Search Web Services Technical Committee has drafted an Abstract Protocol Definition (APD; see document 1 under Committee Drafts below) that provides the framework for defining "Application Protocol Bindings". These bindings may be static or dynamic. A static binding is a human-readable document, essentially a profile. A dynamic binding is a machine-readable description of a server, written in a description language that the Committee is also developing (described in Annex B of the APD).

The premise behind dynamic bindings is that any search engine, even one that existed before the standard was developed, need only provide a dynamic binding, that is, a self-description. It need make no other changes in order to be accessible. A client will be able to access any search engine that provides a description, provided the client can read and interpret the description, use it to formulate a request (including a query), and interpret the response.

Committee drafts include bindings for SRU 1.2 (document 2) and OpenSearch (document 5), as well as a specification of CQL 1.2 (document 4).

The current phase of work is the development of an SRU 2.0 binding, CQL 2.0, and the description language.


Committee Drafts

The OASIS Search Web Services Technical Committee released five Committee Drafts on June 30, 2008:

  1. Abstract Protocol Definition (APD)
    Full name: Search Web Services - searchRetrieve Operation: Abstract Protocol Definition Version 1.0 - Committee Draft 01, 30 June 2008
    Description: Provides the framework for the definition of "Application Protocol Bindings", including for example SRU 1.2, SRU 2.0, and OpenSearch.
  2. Binding for SRU 1.2
    Full name: Search Web Services - searchRetrieve Operation: Binding for SRU 1.2 Version 1.0 - Committee Draft 01, 30 June 2008
    Description: The SRU 1.2 binding, together with the Auxiliary Binding for HTTP GET, is intended to be fully compatible with the current SRU 1.2 SearchRetrieve Operation specification.
  3. Auxiliary Binding for HTTP GET
    Full name: Search Web Services - searchRetrieve Operation - Binding for SRU 1.2: Auxiliary Binding for HTTP GET - Version 1.0 - Committee Draft 01, 30 June 2008
  4. CQL 1.2
    Full name: Search Web Services - CQL 1.2: The Contextual Query Language Version 1.0 - Committee Draft 01, 30 June 2008
    Description: Intended to be fully compatible with the current CQL 1.2 specification.
  5. Binding for OpenSearch
    Full name: Search Web Services - searchRetrieve Operation: Binding for OpenSearch Version 1.0 - Committee Draft 01, 30 June 2008
    Description: Intended to be fully compatible with the OpenSearch Draft 3 Specification.


Call For Participation

The current phase of work for the Technical Committee is the development of SRU/CQL 2.0 (as mentioned in Annex A of the SRU 1.2 Binding) and the Description Language (as mentioned in Annex B of the Abstract Protocol Definition).

The Committee hopes to involve the SRU Implementors Group in this next phase of work. You can participate either by joining the committee (please see Joining the OASIS Search Web Services Technical Committee) or, if you are unable to join, via discussion on this list.


SRU/CQL 2.0 Proposals

Last Updated: December 3, 2008

The table below shows the current proposals for version 2.0 of SRU and CQL (current as of the above date). These have been proposed by the OASIS Search Web Services Technical Committee.

For some proposals there is not yet consensus on an approach, and more than one approach is listed.

Feature

Description/Approach

Status

Same container

"find 'A' and 'B' within the same container element  'C'"

Example 1: find the name 'jones' and date '1950' in the same author field.

name=jones PROX/container=author date=1950

Example 2: find 'jack' and 'jones' within the same author field.

jack PROX/container=author jones

Pending further discussion.

'window' relation

Find 'A', 'B', 'C', ... within a span of X words.

Example: find 'fries', 'salt', and 'vinegar' all within a span of 5 words in the title.

dc.title window/distance<5/unit=word "fries salt vinegar"

Pending a real-life example.
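
The intended matching behavior of the 'window' relation can be sketched in a few lines of Python. This is only one possible reading of the proposal: it assumes that "within a span of X words" means all terms occur inside some run of at most X consecutive words, that term order does not matter, and that words are separated by whitespace. The function name `within_window` is hypothetical and not part of any specification.

```python
def within_window(text: str, terms: list[str], span: int) -> bool:
    """Return True if every term occurs inside some run of at most `span` words."""
    words = text.lower().split()
    wanted = {term.lower() for term in terms}
    counts = {}  # occurrences of wanted terms inside the current window
    left = 0
    for right, word in enumerate(words):
        if word in wanted:
            counts[word] = counts.get(word, 0) + 1
        # Shrink the window from the left so it never exceeds `span` words.
        while right - left + 1 > span:
            dropped = words[left]
            if dropped in wanted:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    del counts[dropped]
            left += 1
        if len(counts) == len(wanted):
            return True
    return False


title = "fries with salt and vinegar"
print(within_window(title, ["fries", "salt", "vinegar"], 5))
print(within_window(title, ["fries", "salt", "vinegar"], 4))
```

Whether the boundary is inclusive (a span of exactly X words) is an assumption here; the proposal's example pairs `distance<5` with "a span of 5 words", so a real implementation would need the final specification to settle it.
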

Faceted search

See "Faceted Search Results" below.

Pending further discussion.

multiple query types

Request parameter: queryType
Optional. The standard-wide default is "cql", but a server can override it by naming a different default in explain.

For example, if the query type is XQuery, then
...... queryType=xquery&query=[XQuery expression] .....

The list of supported query parameter names is specified by explain.

Approved.
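
As a rough illustration of the queryType proposal, the sketch below builds request URLs with and without the parameter. The endpoint `http://example.org/sru`, the helper name `build_search_url`, and the XQuery expression are all hypothetical; only the parameter names `queryType` and `query` come from the proposal.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.org/sru"


def build_search_url(base_url, query, query_type=None):
    """Build a request URL, adding queryType only when one is given."""
    params = {}
    if query_type is not None:
        # Omitting queryType leaves the choice to the default ("cql", unless
        # the server names a different default in explain).
        params["queryType"] = query_type
    params["query"] = query
    return f"{base_url}?{urlencode(params)}"


cql_url = build_search_url(BASE_URL, "nuthatch AND dc.source=MELVYL")
xquery_url = build_search_url(BASE_URL, "//record[title='nuthatch']", "xquery")

# Decode the query string again to show the parameters a server would see.
print(parse_qs(urlsplit(xquery_url).query))
```

A client that has not read the server's explain record cannot know which default applies, so sending queryType explicitly is the safer choice whenever the query is not CQL.
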

 

Alternative Response Format

Request parameter: responseFormat
Optional. The standard-wide default is SRU, though a server can override it. A 2.0 implementation must support the SRU format. A media type for SRU is to be registered, possibly text/xml+SRU.

Approved.

 

Deprecate 'operation' and 'version' parameters

Eliminate these two parameters as requirements for version 2.0 and higher. However, keep them as optional parameters, for compatibility and interoperability with earlier versions.

Approved.

Result size precision

Response element: resultSizePrecision

Allow the response to indicate or estimate the accuracy of the reported result set size.

The value will be a term from a controlled vocabulary. The vocabulary will include 'exact' and 'unknown', both of which clients would be required to understand, as well as 'minimum' and 'maximum'.

In addition, a value of "more" or "current" might indicate that the process of building the result set continues, and in that case a "maximum" value might be supplied.

Needs further development.
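
To make the proposed vocabulary concrete, here is a minimal sketch of how a client might turn a precision value and a count into a message for the user. The function name `describe_result_size` is hypothetical, and because the proposal still needs further development, any value other than the four terms named above (including the tentative "more" and "current") is simply treated as unknown.

```python
def describe_result_size(precision, result_count=None):
    """Describe a result set size given a resultSizePrecision term."""
    # The count may be absent, for example when the precision is "unknown".
    if result_count is None or precision == "unknown":
        return "an unknown number of records"
    if precision == "exact":
        return f"{result_count} records"
    if precision == "minimum":
        return f"at least {result_count} records"
    if precision == "maximum":
        return f"at most {result_count} records"
    # Unrecognized terms are handled conservatively (an assumption).
    return "an unknown number of records"


print(describe_result_size("exact", 50))
print(describe_result_size("minimum", 50))
print(describe_result_size("unknown"))
```
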


resultCount response element optional

The resultCount response element should be optional, because it should be omitted if the result size precision is "unknown".

resultAnalysis

Response element: resultAnalysis

Bundle various response elements into a single element, resultAnalysis. It would include faceted search results, result size precision, and perhaps subquery analysis (which has not been defined yet, but would be based on the Z39.50 searchResult-1). 

Needs further development.

Sorting

Sort in both CQL and SRU

Sorting is currently a function of CQL, not of the SRU protocol. Before version 1.2 it was a function of SRU, not of CQL; in 1.2 it was removed from SRU and added to CQL. As a result, the protocol itself cannot express how to sort, so unless CQL is assumed as the query language, a client cannot be certain that sorting is supported. This matters because CQL was the only query language usable with SRU before version 2.0, whereas version 2.0 will allow alternative query languages.

Conversely, CQL is intended to be usable with other protocols. If sort were not a function of CQL, then CQL could not be used where sorting is required with a protocol that does not itself support sorting.

Approved.

Comment

To comment on these proposals, or on any other aspect of the TC work:

  1. Subscribe to the comments list by sending a blank message to search-ws-comment-subscribe@lists.oasis-open.org.
  2. Post your comment to search-ws-comment@lists.oasis-open.org.



Faceted Search Results

The proposed approach to faceted search results is based on the following example.

Query:  "nuthatch"   Total records: 50

Facets:

  • source
  • subject
  • author
  • date

 

source

  • Library of Congress Catalog (10)
  • MELVYL (8)

subject

  • Birds (15)
  • Nuthatches (12)
  • Sitta carolinensis (4)

author

  • Deignan, H. G (2)
  • Dunbar, Catherine (1)
  • Audubon, John James (1)
  • Davies, Melvyn (1)
  • Pravosudov, Vladimir V (1)

date

  • 2007 (5)
  • 2006 (3)
  • 1993 (2)
  • 1979 (1)
  • 1903 (1)

The following XML file represents the above example.

<facetedResults>

  <query>
    <queryType>cql</queryType>
    <queryString>nuthatch</queryString>
    <count>50</count>
  </query>

  <facets>

    <!-- first facet: source -->
    <facet>
      <facetType>source</facetType>

      <!-- first source: Library of Congress Catalog -->
      <facetValue>
        <valueString>Library of Congress Catalog</valueString>
        <count>10</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.source="Library of Congress Catalog"</queryString>
          </query>
          <query>
            <queryType>xquery</queryType>
            <queryString> [xquery expression] </queryString>
          </query>
        </queries>
      </facetValue>

      <!-- second source: MELVYL -->
      <facetValue>
        <valueString>MELVYL</valueString>
        <count>8</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.source=MELVYL</queryString>
          </query>
          <query>
            <queryType>xquery</queryType>
            <queryString> [xquery expression] </queryString>
          </query>
        </queries>
      </facetValue>

    </facet>

    <!-- second facet: subject -->
    <facet>
      <facetType>subject</facetType>

      <!-- first subject: Birds -->
      <facetValue>
        <valueString>Birds</valueString>
        <count>15</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.subject=birds</queryString>
          </query>
        </queries>
      </facetValue>

      <!-- second subject: Nuthatches -->
      <facetValue>
        <valueString>Nuthatches</valueString>
        <count>12</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.subject="nuthatches"</queryString>
          </query>
        </queries>
      </facetValue>

      <!-- third subject: Sitta carolinensis -->
      <facetValue>
        <valueString>Sitta carolinensis</valueString>
        <count>4</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.subject="Sitta carolinensis"</queryString>
          </query>
        </queries>
      </facetValue>

    </facet>

    <!-- ADD FACETS FOR AUTHOR AND DATE HERE -->

  </facets>
</facetedResults>
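
A client could read a document shaped like the example above roughly as follows. This is a sketch only: it assumes the element names used in the example, no XML namespace, and a `count` on every facet value. The function name `read_facets` and the shortened `SAMPLE` document are illustrative, since no schema has been published yet.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example: one facet with two values.
SAMPLE = """
<facetedResults>
  <query>
    <queryType>cql</queryType>
    <queryString>nuthatch</queryString>
    <count>50</count>
  </query>
  <facets>
    <facet>
      <facetType>source</facetType>
      <facetValue>
        <valueString>Library of Congress Catalog</valueString>
        <count>10</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.source="Library of Congress Catalog"</queryString>
          </query>
        </queries>
      </facetValue>
      <facetValue>
        <valueString>MELVYL</valueString>
        <count>8</count>
        <queries>
          <query>
            <queryType>cql</queryType>
            <queryString>nuthatch AND dc.source=MELVYL</queryString>
          </query>
        </queries>
      </facetValue>
    </facet>
  </facets>
</facetedResults>
"""


def read_facets(xml_text):
    """Map each facet type to a list of (label, count, queries-by-type) tuples."""
    root = ET.fromstring(xml_text)
    facets = {}
    for facet in root.findall("./facets/facet"):
        values = []
        for value in facet.findall("facetValue"):
            # Each facet value carries ready-made queries, keyed by query type.
            queries = {
                q.findtext("queryType"): q.findtext("queryString").strip()
                for q in value.findall("./queries/query")
            }
            values.append(
                (value.findtext("valueString"), int(value.findtext("count")), queries)
            )
        facets[facet.findtext("facetType")] = values
    return facets


facets = read_facets(SAMPLE)
for label, count, queries in facets["source"]:
    print(f"{label} ({count}): {queries['cql']}")
```

Because each facet value carries its own narrowing query, the client can offer a "refine by" link without constructing any CQL itself.
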

Procedures: The client may request faceted search results according to a particular schema. There will be a schema to represent the example above, but an alternative schema may be requested. The server would advertise via explain all the schemas it supports, as well as which of the following behaviors it follows:

  1. It will always supply faceted search results, whether the client asks for them or not.
  2. It will never supply faceted search results.
  3. It will supply them, but only if asked to (default 'off').
  4. It will supply them unless asked not to (default 'on').
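
The four behaviors listed above amount to a small decision table, sketched below. The policy labels ("always", "never", "default-off", "default-on") and the function name `should_return_facets` are hypothetical; the proposal does not yet say how a server would name these behaviors in explain or how a client would express its preference.

```python
def should_return_facets(server_policy, client_wants=None):
    """Decide whether a response includes faceted results.

    client_wants is True (asked for facets), False (asked for none),
    or None (the request expressed no preference).
    """
    if server_policy == "always":
        return True
    if server_policy == "never":
        return False
    if server_policy == "default-off":
        # Supplied only on explicit request.
        return client_wants is True
    if server_policy == "default-on":
        # Supplied unless explicitly declined.
        return client_wants is not False
    raise ValueError(f"unknown policy: {server_policy!r}")


for policy in ("always", "never", "default-off", "default-on"):
    print(policy, should_return_facets(policy), should_return_facets(policy, True))
```
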