This specification introduces Server Timing, which enables the server to communicate performance metrics about the request-response cycle to the user agent, and a JavaScript interface to enable applications to collect, process, and act on these metrics to optimize application delivery.
This is a proposal and may change without notice. Interested parties should bring discussions to the Web Platform Incubator Community Group.
Accurately measuring performance characteristics of web applications is an important aspect of making web applications faster. [[NAVIGATION-TIMING-2]] and [[RESOURCE-TIMING]] provide detailed request timing information for the document and its resources, including the time when the request was initiated and the various milestones involved in negotiating the connection and receiving the response. However, while the user agent can observe the timing data of the request, it has no insight into how or why certain stages of the request-response cycle took as long as they did, e.g. how the request was routed, where the time was spent on the server, and so on.
Server Timing consists of two parts: a definition of the `Server-Timing` header field, which allows the server to communicate performance metrics and descriptions within the response in a well-defined format, and a `PerformanceServerTiming` interface that allows JavaScript to collect, process, and act on these metrics to optimize application delivery.
The Server-Timing header field is used to communicate one or more metrics and descriptions for the given request-response cycle. The ABNF (Augmented Backus-Naur Form) syntax for the Server-Timing header field is as follows:
    Server-Timing        = "Server-Timing" ":" #server-timing-metric
    server-timing-metric = metric [ ";" description ]
    metric               = metric-name [ "=" metric-value ]
    metric-name          = token
    metric-value         = 1*digit [ "." 1*digit ]
    description          = token | quoted-string
See [[!RFC7230]] for definitions of `token`, `digit`, and `quoted-string`.
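The grammar above can be exercised with a small parser. The following is a minimal sketch, not a normative algorithm: the names `splitOutsideQuotes` and `parseServerTiming` are hypothetical, the input is assumed to be the field value only (without the `Server-Timing:` prefix), and silently skipping list elements that do not match the grammar is an assumption rather than behavior defined by this specification.

```javascript
// RFC 7230 "token" characters and "quoted-string", expressed as regex sources.
const TOKEN = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+";
const QUOTED = '"(?:[^"\\\\]|\\\\.)*"';
// metric-name [ "=" metric-value ] [ ";" description ]
const METRIC = new RegExp(
  "^\\s*(" + TOKEN + ")(?:=(\\d+(?:\\.\\d+)?))?(?:;(" + TOKEN + "|" + QUOTED + "))?\\s*$"
);

// Splits a list on commas, ignoring commas inside quoted strings.
function splitOutsideQuotes(value) {
  const parts = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    if (inQuotes && ch === "\\" && i + 1 < value.length) {
      current += ch + value[++i]; // keep an escaped pair intact
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === "," && !inQuotes) {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// Hypothetical helper: returns one { name, value, description } per metric.
function parseServerTiming(fieldValue) {
  const metrics = [];
  for (const part of splitOutsideQuotes(fieldValue)) {
    const match = METRIC.exec(part);
    if (!match) continue; // assumption: ignore empty or malformed elements
    let description = match[3];
    if (description !== undefined && description.startsWith('"')) {
      // Strip the surrounding quotes and undo backslash escapes.
      description = description.slice(1, -1).replace(/\\(.)/g, "$1");
    }
    metrics.push({
      name: match[1],
      value: match[2] === undefined ? undefined : Number(match[2]),
      description: description,
    });
  }
  return metrics;
}
```

For instance, `parseServerTiming('db=53, dc;atl')` yields two metrics: `db` with the numeric value 53 and no description, and `dc` with no value and the description `atl`.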
The PerformanceServerTiming interface participates in the [[!PERFORMANCE-TIMELINE-2]] and extends the following attributes of the PerformanceEntry interface:
Performance Interface
If this method is not called, the user agent SHOULD store at least 150 PerformanceServerTiming resources in the buffer, unless otherwise specified by the user agent.
If the `maxSize` parameter is less than the number of elements currently stored in the buffer, no elements in the buffer are to be removed and the user agent MUST NOT fire the `servertimingbufferfull` event.
`servertimingbufferfull`: an event that bubbles, isn't cancelable, and has no default action, fired at the Performance object [[!NAVIGATION-TIMING-2]].
The user agent MUST process a Server-Timing header field communicated via a trailer field (see [[!RFC7230]] section 4.1.2) using the same algorithm.
Cross-origin resources (i.e. resources that are not same origin) MUST be included as PerformanceServerTiming objects in the Performance Timeline. If the "timing allow check" algorithm, as defined in [[RESOURCE-TIMING]], fails for a cross-origin resource:
The server must return the `Timing-Allow-Origin` HTTP response header, as defined in [[RESOURCE-TIMING]], to allow the user agent to fully expose, to the document origin(s) specified, the values of attributes that would otherwise have been set to zero or the empty string due to the cross-origin restrictions.
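To illustrate how the `Timing-Allow-Origin` header gates access, the following is a simplified sketch of a timing allow check. It is not the normative algorithm, which is defined in [[RESOURCE-TIMING]]; the function name and parameters are hypothetical, and origins are assumed to be already serialized as strings.

```javascript
// Simplified sketch of a "timing allow check"; see [[RESOURCE-TIMING]] for
// the normative definition. All names here are hypothetical.
function timingAllowCheck(resourceOrigin, documentOrigin, timingAllowOrigin) {
  // Same-origin resources always pass.
  if (resourceOrigin === documentOrigin) return true;
  // Without the header, a cross-origin resource fails the check.
  if (typeof timingAllowOrigin !== "string") return false;
  const allowed = timingAllowOrigin.split(",").map((item) => item.trim());
  // Pass on the wildcard or on a case-sensitive match of the document origin.
  return allowed.includes("*") || allowed.includes(documentOrigin);
}
```

Under this sketch, a resource served from a different origin passes only when its response names the requesting document's origin or uses the `*` wildcard.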
The interfaces defined in this specification expose potentially sensitive application and infrastructure information to any web page that has included a resource that advertises server timing metrics. For this reason, access to the `ServerTiming` interface is restricted by the same origin policy by default, as described in . Resource providers can explicitly allow server timing information to be available by adding the `Timing-Allow-Origin` HTTP response header, as defined in [[!RESOURCE-TIMING]], which specifies the domains that are allowed to access the server metrics.
In addition to using the `Timing-Allow-Origin` HTTP response header, the server can also use relevant logic to control which metrics are returned, when, and to whom - e.g. the server may only provide certain metrics to correctly authenticated users and nothing at all to all others.
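As one illustration of such server-side logic, the sketch below emits metrics only for authenticated users and only for metric names on a per-user allow list. Everything here is hypothetical: the `user` object shape (`authenticated`, `allowedMetrics`), the function names, and the policy itself. Descriptions are assumed to be valid tokens, so no quoting is performed.

```javascript
// Hypothetical helper: serializes metrics into a Server-Timing field value.
// Assumes every description is a valid token (no quoting is applied).
function formatServerTiming(metrics) {
  return metrics
    .map((metric) => {
      let out = metric.name;
      if (metric.value !== undefined) out += "=" + metric.value;
      if (metric.description !== undefined) out += ";" + metric.description;
      return out;
    })
    .join(", ");
}

// Hypothetical policy: unauthenticated users get no header at all (null);
// authenticated users get only the metrics named in their allow list.
function serverTimingFor(metrics, user) {
  if (!user || !user.authenticated) return null;
  const visible = metrics.filter((m) => user.allowedMetrics.includes(m.name));
  return visible.length > 0 ? formatServerTiming(visible) : null;
}
```

A server would call `serverTimingFor` while building the response and set the `Server-Timing` header only when the result is not `null`.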
The permanent message header field registry should be updated with the following registrations ([[RFC3864]]):
    > GET /resource HTTP/1.1
    > Host: example.com
    < HTTP/1.1 200 OK
    < Server-Timing: miss, db=53, app=47.2;customView
    < Server-Timing: dc;atl
    < Trailer: Server-Timing
    < (... snip response body ...)
    < Server-Timing: total=123.4
Name | Value | Description |
---|---|---|
miss | | |
dc | | atl |
db | 53 | |
app | 47.2 | customView |
total | 123.4 | |
The above header fields communicate five distinct metrics that illustrate all the possible ways for the server to communicate data to the user agent: metric name only, metric with value, metric with value and description, and metric with description. For example, the above metrics may indicate that for `example.com/resource` fetch:
The application can collect, process, and act on the provided metrics via the provided JavaScript interface:
    var serverMetrics = window.performance.getEntriesByName('https://example.com/resource.jpg');
    for (var i = 0; i < serverMetrics.length; i++) {
      var entry = serverMetrics[i];
      // Only Server Timing entries carry the metric fields used below.
      if (entry.entryType === 'server') {
        // entry.metric, entry.duration, entry.description
      }
    }
Server processing time can be a significant fraction of the total request time. For example, a dynamic response may require one or more database queries, cache lookups, API calls, time to process relevant data and render the response, and so on. Similarly, even a static response can be delayed due to overloaded servers, slow caches, or other reasons.
Today, the user agent developer tools are able to show when the request was initiated, and when the first and last bytes of the response were received. However, there is no visibility into where or how the time was spent on the server, which means that the developer is unable to quickly diagnose if there is a performance bottleneck on the server, and if so, in which component. Today, to answer this question, the developer is required to use different techniques: check the server logs, embed performance data within the response (if possible), use external tools, and so on. This makes identifying and diagnosing performance bottlenecks hard, and in many cases impractical.
Server Timing defines a standard mechanism that enables the server to communicate relevant performance metrics to the client and allows the client to surface them directly in the developer tools, e.g. the requests can be annotated with server-sent metrics to provide insight into where or how the time was spent while generating the response.
In addition to surfacing server-sent performance metrics in the developer tools, a standard JavaScript interface enables analytics tools to automatically collect, process, beacon, and aggregate these metrics for operational and performance analysis.
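An analytics script might aggregate the collected entries before beaconing them. The following is a rough sketch under stated assumptions: entries are assumed to expose `entryType`, `metric`, and `duration` fields with an entry type of `"server"`, and the function names and the summary shape are hypothetical. `navigator.sendBeacon` is used only when the environment provides it.

```javascript
// Hypothetical helper: per-metric count, total, and maximum duration.
function summarizeServerMetrics(entries) {
  const summary = Object.create(null); // no inherited keys to collide with
  for (const entry of entries) {
    if (entry.entryType !== "server") continue; // assumed entry type
    const stats = summary[entry.metric] || { count: 0, total: 0, max: 0 };
    stats.count += 1;
    stats.total += entry.duration;
    stats.max = Math.max(stats.max, entry.duration);
    summary[entry.metric] = stats;
  }
  return summary;
}

// Hypothetical helper: beacons the summary as JSON when sendBeacon exists.
// Returns false when beaconing is unavailable or the data was not queued.
function beaconSummary(url, summary) {
  if (typeof navigator !== "undefined" && typeof navigator.sendBeacon === "function") {
    return navigator.sendBeacon(url, JSON.stringify(summary));
  }
  return false;
}
```

In a page, this could be called as `beaconSummary('/analytics', summarizeServerMetrics(performance.getEntries()))`, where `/analytics` stands in for a collection endpoint chosen by the application.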
Server Timing enables origin servers to communicate performance metrics about where or how time is spent while processing the request. However, the same request and response may also be routed through one or more proxies (e.g. cache servers, load balancers, and so on), each of which may introduce its own delays and may want to provide performance metrics that show where or how the time is spent.
For example, a CDN edge node may want to report which data center was being used, if the resource was available in cache, and how long it took to retrieve the response from cache or from the origin server. Further, the same process may be repeated by other proxies, thus allowing full end-to-end visibility into how the request was routed and where the time was spent.
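A proxy could add its own metrics by appending them to the list it received from upstream. The sketch below is illustrative only: the function name, the `cdn-` metric names, and the metric object shape are hypothetical, and descriptions are assumed to be valid tokens.

```javascript
// Hypothetical helper: appends a proxy's metrics to an upstream Server-Timing
// field value. Pass an empty string when the upstream response had no header.
function appendProxyMetrics(upstreamValue, proxyMetrics) {
  const own = proxyMetrics
    .map((metric) => {
      let out = metric.name;
      if (metric.value !== undefined) out += "=" + metric.value;
      if (metric.description !== undefined) out += ";" + metric.description;
      return out;
    })
    .join(", ");
  if (!upstreamValue) return own;
  return own ? upstreamValue + ", " + own : upstreamValue;
}

// Example: an edge node reporting cache status, data center, and fetch time.
const combined = appendProxyMetrics("db=53", [
  { name: "cdn-cache", description: "hit" },
  { name: "cdn-dc", description: "atl" },
  { name: "cdn-fetch", value: 12 },
]);
```

Prefixing metric names per hop, as with `cdn-` here, is one possible way to keep metrics from different intermediaries distinguishable; the specification text above does not mandate any naming scheme.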
Similarly, when a Service Worker is active, some or all of the navigation and resource requests may be routed through it. Effectively, an active Service Worker is a local proxy that is able to reroute requests, serve cached responses, synthesize responses, and more. As a result, Server Timing enables a Service Worker to report custom performance metrics about how the request was processed: whether it was fetched from the server or served from local cache, the duration of the relevant processing steps, and so on.
This document reuses text from the [[NAVIGATION-TIMING-2]], [[RESOURCE-TIMING]], [[PERFORMANCE-TIMELINE-2]], and [[RFC6797]] specifications as permitted by the licenses of those specifications.