JavaScript and E4X Best Practices for YQL

Paging Results

YQL handles paging of returned data differently depending on how you control paging within an Open Data Table definition. Consider the following example, followed by three paging element scenarios:

select * from table(10,100) where local.filter>4.0

No paging element: If no paging element is provided, YQL assumes you want all available data returned at once. Any "remote" paging information provided on the select (10 being the offset and 100 being the count in our example) is applied to all of the results before they are processed by the remainder of the where clause. In our example above, the first 10 items are discarded and only the next 100 are used, and execute is called only once.
A paging element that only supports a variable number of results: If a paging element is provided that only supports a variable number of results (a single page with variable count), then the execute sub-element will only be called once, with the total number of elements needed in the variable representing the count. The offset will always be 0. In our example, the count will be 110, and the offset 0.
A paging element that supports both offset and count: If a paging element is provided that supports both offset and count, then the execute sub-element will be called for each "page" until it returns fewer results than the paging size. In this case, let's assume the paging size is 10. The execute sub-element will be called up to 10 times and is expected to return 10 items each time. If fewer results are returned, paging will stop.
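
The last two scenarios can be sketched as a small simulation in plain JavaScript. This is not YQL engine code: `SOURCE`, `fetchPage`, `collectPaged`, and `collectSinglePage` are hypothetical names, and the sketch assumes the engine starts paging at the requested offset and trims the leading items itself in the single-page case.

```javascript
// Hypothetical data source holding 1,000 items, numbered 0..999.
const SOURCE = Array.from({ length: 1000 }, (_, i) => i);

// Stand-in for an execute element that receives an offset and a count.
function fetchPage(offset, count) {
  return SOURCE.slice(offset, offset + count);
}

// "Offset and count" scenario: one call per page, stopping when enough
// items are collected or a page comes back shorter than the page size.
function collectPaged(offset, count, pageSize) {
  const items = [];
  let calls = 0;
  while (items.length < count) {
    const page = fetchPage(offset + items.length, pageSize);
    calls += 1;
    items.push(...page);
    if (page.length < pageSize) break; // short page: no more data
  }
  return { items: items.slice(0, count), calls: calls };
}

// "Variable count only" scenario: a single call with offset 0 and a count
// large enough to cover the requested offset plus the requested count.
function collectSinglePage(offset, count) {
  const requested = offset + count;
  const all = fetchPage(0, requested);
  return { items: all.slice(offset), calls: 1, requested: requested };
}

// select * from table(10,100) with an assumed page size of 10
const paged = collectPaged(10, 100, 10);
const single = collectSinglePage(10, 100);
console.log(paged.calls, paged.items.length);       // 10 100
console.log(single.requested, single.items.length); // 110 100
```

Both strategies end up with the same 100 items; they differ only in how many times the data source is called and with which arguments.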

Note

In most cases, paging within the Open Data Table should match the paging capabilities of the underlying data source that the table is using. However, if the execute sub-element is adjusting the number of results coming back from a fully paging Web service or source, then there is usually no way to unify the "offset" of the page as set up in the Open Data Table with the destination's "offset". You may need to declare your Open Data Table as only supporting a variable number of results in this situation.

Including Useful JavaScript Libraries

When writing your execute code, you may find the following JavaScript libraries useful:

OAuth:

Flickr:

MD5, SHA1, Base64, and other Utility Functions:

Using E4X within YQL

ECMAScript for XML (simply referred to as E4X) is a standard extension to JavaScript that provides native XML support. Here are some benefits to using E4X versus other formats, such as JSON:

Preserves all of the information in an XML document, such as namespaces and interleaved text elements. Since most web services return XML this is optimal.
You can use E4X selectors and filters to find and extract parts of XML structure.
The engine on which YQL is created natively supports E4X, allowing E4X-based data manipulation to be faster.
Supports XML literals, namespaces, and qualified names.

To learn more about E4X, refer to these sources online:

E4X Quickstart Guide from WSO2 Oxygen Tank
Processing XML with E4X from Mozilla
AJAX and scripting Web service with E4X by IBM
Introducing E4X by O'Reilly
Popular E4X Bookmarks by delicious
E4X guide by rephrase.net

E4X Techniques

In addition to the resources above, the following table provides a quick list of tips related to using E4X:

E4X Technique	Notes	Code Example
Creating XML literals	-	`var xml = <root>hello</root>;`
Substituting variables	Use curly brackets {} to substitute variables. You can use this for E4X XML literals as well.	`var x = "text"; var y = <item>{x}</item>;`
Adding sub-elements to an element	When adding sub-elements to an element, include the root node for the element.	`item.node+=<subel></subel>;`
	You can add a sub-element to a node in a manner similar to adding sub-elements to an element.	`x.node += <sub></sub>;` The above code results in the following structure: `<node><sub></sub></node>`
	If you try to add a sub-element to a node without including the root node, you will simply append the element and create an XML list.	`x += <sub></sub>;` The above code results in the following structure: `<node></node><sub></sub>`
Assigning variably named elements	Use substitution in order to create an element from a variable.	`var item = <{name}/>;`
Assigning a value to an attribute	-	`item.@["id"]=path[0];`
Getting all the elements within a given element	-	`var hs2 = el..*;`
Getting specific objects within an object anywhere under a node	-	`var hs2 = el..div`
Getting the immediate H3 children of an element	-	`h2 = el.h3;`
Getting an attribute of an element	-	`h3 = el.h3.@id;` or `h3 = el.h3.@["id"];`
Getting elements with a certain attribute	-	`var alltn15divs = d..div.(@['id'] =="tn15content");`
Getting the "class" attribute	Use brackets to surround the "class" attribute.	`className = t.@['class'];`
Getting a class as a string	To get a class as a string, get its text object and then apply `toString()`.	`var classString = className.text().toString();`
Getting the name of a node	Use `localName()` to get the name of a node.	`var nodeName = e4xnode.localName();`

Note

When using E4X, note that you can use XML literals to insert XML "in-line," which means, among other things, you do not need to use quotation marks:

var myXml = <foo />;

E4X and Namespaces

When working with E4X, you should know that E4X objects are namespace aware. This means that you must specify the namespace before you work with E4X objects within that namespace. The following example sets the default namespace:

default xml namespace ='http://www.inktomi.com/';

After you specify a default namespace, all new XML objects will inherit that namespace unless you specify another namespace.

Caution

If you do not specify the namespace, elements will seem to be unavailable within the object as they reside in a different namespace.

Tip

To clear a namespace, simply specify a blank namespace:

default xml namespace ='';

JavaScript Logging and Debugging

To get a better understanding of how your executions are behaving, you can log diagnostic and debugging information using the y.log statement along with the y.getDiagnostics element to keep track of things such as syntax errors or uncaught exceptions.

The following example logs "hello" along with a variable:

y.log("hello");

y.log(somevariable);

Using y.log allows you to get a "dump" of data as it stands so that you can ensure, for example, that the right URLs are being created or responses returned.

The output of y.log goes into the YQL diagnostics element when the table is used in a select.

You can also use the following JavaScript to get the diagnostics that have been created so far:

var e4xObject = y.getDiagnostics();

Executing JavaScript Globally

When the execute element is placed outside of the binding element, it is available globally within an Open Data Table and is usable across each of the table's bindings. For example, in the following Open Data Table, the function arrayConcat is available to be used within the execute element for SELECT statements:
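
The table listing referred to above is not reproduced in this copy. Purely as an illustration of what such a shared helper might look like, here is a minimal sketch in plain JavaScript. The body of `arrayConcat` is an assumption, not the original listing, and the surrounding Open Data Table markup is left out.

```javascript
// Hypothetical body for a helper defined in a global execute element.
// The original listing is unavailable, so this is only an illustration.
function arrayConcat(first, second) {
  // Return a new array and leave both inputs untouched.
  return first.concat(second);
}

// Code in a binding's execute element could then call the shared helper.
const merged = arrayConcat(["a", "b"], ["c"]);
console.log(merged.join(","));
```

Defining the helper once at table level avoids copying it into each binding that needs it.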

Making Asynchronous Calls with JavaScript Execute

The y.rest and y.query functions take callback functions as a parameter. The callback function is called either when y.rest or y.query completes a call or times out. This callback mechanism allows for simultaneously handling and processing multiple REST calls.

The callback functions are just JavaScript functions that are passed the results of the REST call. When the callback function below is passed as a parameter to y.rest, the response is assigned to result.

The code example shows the basic syntax of using y.rest with a callback function. You can see how the callback can be implemented in Examples.

Examples

y.rest

The returned result of y.rest has the two properties timeout and url that allow you to identify calls that have timed out and retain the URI resource. The two properties can also be accessed by the callback function.

A single callback function can be used for multiple requests. The following example uses the same callback to log the URL and status of the response as well as the number of responses received.

The two properties url and timeout of the response help you determine if a call has timed out or associate the response with the original URL. The code snippet below shows you the url and timeout properties and how you could potentially use them:

The callback function can also use the url and timeout properties to take the appropriate action as seen in this code snippet:
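
As a rough sketch of the pattern described in this section, the following plain JavaScript shares one callback across several requests and branches on the `url` and `timeout` properties. Because `y.rest` exists only inside YQL, the sketch uses a hypothetical stand-in, `fakeRest`, which is not the real API and simply invokes the callback right away. The names `onResult`, `received`, and `timedOutUrls` and the example URLs are also assumptions.

```javascript
// Stand-in for y.rest so the sketch runs outside YQL. It is NOT the real
// API: it only hands the callback an object with the `url` and `timeout`
// properties described in this section.
function fakeRest(url, callback, timedOut) {
  callback({ url: url, timeout: timedOut });
}

let received = 0;        // how many times the shared callback has fired
const timedOutUrls = []; // URLs whose calls timed out

// One callback shared by every request.
function onResult(result) {
  received += 1;
  if (result.timeout) {
    timedOutUrls.push(result.url); // remember which call failed
  } else {
    console.log("completed: " + result.url);
  }
}

fakeRest("http://example.com/a", onResult, false);
fakeRest("http://example.com/b", onResult, true);
fakeRest("http://example.com/c", onResult, false);

console.log(received + " callbacks fired, " + timedOutUrls.length + " timed out");
```

Keeping the counters outside the callback is what lets a single function summarize several calls.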

y.query

The syntax and usage for callbacks with y.query is nearly identical to that of y.rest. The callback function is passed as a parameter to y.query. The two properties timeout and query allow you to identify calls that have timed out and reuse the query.

The code example below uses the callback to keep track of the successful calls, logs the results, and saves the first returned response.

This code example uses the timeout and query properties to log the queries that timed out.
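
The behaviour described for `y.query` callbacks can be sketched the same way. In this minimal example, `fakeQuery` is a hypothetical stand-in rather than the real `y.query`, and the `results` property, the `retryQueue` array, and the sample queries are assumptions made for illustration. Only the `timeout` and `query` properties come from the text above.

```javascript
// Stand-in for y.query so the sketch runs outside YQL (NOT the real API).
// The `results` property is an assumption made for this illustration.
function fakeQuery(query, callback, timedOut) {
  callback({
    query: query,
    timeout: timedOut,
    results: timedOut ? null : "rows for: " + query
  });
}

let successes = 0;       // number of calls that completed
let firstResult = null;  // first response that came back
const retryQueue = [];   // queries that timed out and could be reissued

function onQueryDone(result) {
  if (result.timeout) {
    console.log("timed out: " + result.query);
    retryQueue.push(result.query); // keep the query text for reuse
    return;
  }
  successes += 1;
  if (firstResult === null) {
    firstResult = result.results;
  }
}

fakeQuery("select * from tableA", onQueryDone, true);
fakeQuery("select * from tableB", onQueryDone, false);
fakeQuery("select * from tableC", onQueryDone, false);

console.log(successes + " succeeded; first result: " + firstResult);
```

Storing the timed-out query strings makes it possible to reissue them later without rebuilding them.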

Yahoo! Developer Network