Android 6.0 Marshmallow introduces a new way for users to engage with apps through the assistant.
Users summon the assistant with a long-press on the Home button or by saying
the keyphrase. In
response to the long-press, the system opens a top-level window that displays
contextually relevant actions for the current activity. These potential
actions might include deep links to other apps on the device.
This guide explains how apps can use the Android Assist API to improve the assistant user experience.
Using the Assist API
The example below shows how Google Now integrates with the Android assistant using a feature called Now on Tap.
The assistant overlay window in our example (steps 2 and 3 in Figure 1) is implemented through Now on Tap, which works in concert with platform-level functionality: the system allows the user to select the assistant app (Figure 2), which obtains contextual information from the source app using the Assist API, a part of the platform.

Figure 1. Assistant interaction example with the Now on Tap feature of Google Now
An Android user first configures the assistant and can change system options, such as whether the assistant may use the text and view hierarchy and a screenshot of the current screen (Figure 2).
From there, the assistant receives the information only when the user activates assistance, such as by tapping and holding the Home button (Figure 1, step 1).

Figure 2. Assist & voice input settings (Settings/Apps/Default Apps/Assist & voice input)
Assist API Lifecycle
Going back to our example from Figure 1, the Assist API callbacks are invoked in the source app after step 1 (user long-presses the Home button) and before step 2 (the assistant renders the overlay window). Once the user selects the action to perform (step 3), the assistant executes it, for example by firing an intent with a deep link to the (destination) restaurant app (step 4).
Source App
In most cases, your app does not need to do anything extra to integrate with the assistant if you already follow accessibility best practices. This section describes how to provide additional information to help improve the assistant user experience, as well as scenarios, such as custom Views, that need special handling.
Share Additional Information with the Assistant
In addition to the text and the screenshot, your app can share additional information with the assistant. For example, your music app can choose to pass current album information, so that the assistant can suggest smarter actions tailored to the current activity.
To provide additional information to the assistant, your app provides global application context by registering an app listener and supplies activity-specific information with activity callbacks as shown in Figure 3.

Figure 3. Assist API lifecycle sequence diagram.
To provide global application context, the app creates an implementation of Application.OnProvideAssistDataListener and registers it using registerOnProvideAssistDataListener(android.app.Application.OnProvideAssistDataListener).
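For illustration, here is a minimal sketch of such a registration in a custom Application subclass; the MyApplication class name and the bundle key are hypothetical, and the subclass must also be declared in the manifest's android:name attribute:

import android.app.Activity;
import android.app.Application;
import android.os.Bundle;

public class MyApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();
        // Contribute app-wide context to every assist request.
        registerOnProvideAssistDataListener(new OnProvideAssistDataListener() {
            @Override
            public void onProvideAssistData(Activity activity, Bundle data) {
                // Hypothetical key: global state that applies regardless of
                // which activity is currently in the foreground.
                data.putString("com.example.APP_STATE", "signed_in");
            }
        });
    }
}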
To provide activity-specific contextual information, the activity overrides onProvideAssistData(android.os.Bundle) and onProvideAssistContent(android.app.assist.AssistContent). The two activity methods are called after the optional global callback (registered with registerOnProvideAssistDataListener(android.app.Application.OnProvideAssistDataListener)) is invoked. Because the callbacks execute on the main thread, they should complete promptly. The callbacks are invoked only when the activity is running.
Providing Context
onProvideAssistData(android.os.Bundle) is called when the user requests assistance, to build a full ACTION_ASSIST Intent with all of the context of the current application represented as an instance of AssistStructure. You can override this method to place into the bundle anything you would like to appear in the EXTRA_ASSIST_CONTEXT part of the assist Intent.
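A minimal sketch of such an override in an activity, with a hypothetical extra key and value, might look like this:

@Override
public void onProvideAssistData(Bundle data) {
    super.onProvideAssistData(data);
    // Anything placed in this bundle appears under EXTRA_ASSIST_CONTEXT
    // in the assist Intent. The key and value are hypothetical.
    data.putString("com.example.music.CURRENT_ALBUM", "Album Title");
}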
Describing Content
Your app can implement onProvideAssistContent(android.app.assist.AssistContent)
to improve the assistant user experience by providing references to content
related to the current activity. You can describe the app content using the
common vocabulary defined by Schema.org
through a JSON-LD object. In the example below, a music app provides
structured data to describe the music album the user is currently
looking at.
@Override
public void onProvideAssistContent(AssistContent assistContent) {
    super.onProvideAssistContent(assistContent);
    try {
        // Describe the currently displayed album using Schema.org vocabulary.
        String structuredJson = new JSONObject()
                .put("@type", "MusicRecording")
                .put("@id", "https://example.com/music/recording")
                .put("name", "Album Title")
                .toString();
        assistContent.setStructuredData(structuredJson);
    } catch (JSONException e) {
        // If the JSON cannot be built, the assistant receives no structured data.
    }
}
Custom implementations of onProvideAssistContent(android.app.assist.AssistContent) may also adjust the provided content intent to better reflect the top-level context of the activity, supply the URI of the displayed content, and call setClipData(android.content.ClipData) with additional content of interest that the user is currently viewing.
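For example, the music activity above might also supply a web URI and a content intent in the same override; the URI is hypothetical:

@Override
public void onProvideAssistContent(AssistContent assistContent) {
    super.onProvideAssistContent(assistContent);
    Uri albumUri = Uri.parse("https://example.com/music/album/1"); // hypothetical
    // Tell the assistant where the displayed content lives on the web.
    assistContent.setWebUri(albumUri);
    // Adjust the content intent to reflect the activity's top-level context.
    assistContent.setIntent(new Intent(Intent.ACTION_VIEW, albumUri));
}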
Default Implementation
If neither the onProvideAssistData(android.os.Bundle) nor the onProvideAssistContent(android.app.assist.AssistContent) callback is implemented, the system still passes the automatically collected information to the assistant unless the current window is flagged as secure.
As shown in Figure 3, the system uses the default implementations of onProvideStructure(android.view.ViewStructure)
and onProvideVirtualStructure(android.view.ViewStructure)
to
collect text and view hierarchy information. If your view implements custom text drawing, you should override onProvideStructure(android.view.ViewStructure) to provide the assistant with the text shown to the user by calling setText(java.lang.CharSequence).
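A sketch of such an override, in a hypothetical custom view that draws its own text:

import android.content.Context;
import android.util.AttributeSet;
import android.view.View;
import android.view.ViewStructure;

public class TickerView extends View {
    private CharSequence displayedText = "";

    public TickerView(Context context, AttributeSet attrs) {
        super(context, attrs);
    }

    @Override
    public void onProvideStructure(ViewStructure structure) {
        super.onProvideStructure(structure);
        // Expose the custom-drawn text so the assistant can read it.
        structure.setText(displayedText);
    }
}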
In most cases, implementing accessibility support will enable the
assistant to obtain the information it needs. This includes
providing android:contentDescription
attributes, populating AccessibilityNodeInfo
for custom views, making
sure custom ViewGroups
correctly expose
their children, and following
the best practices described in “Making Applications
Accessible”.
Excluding Views from the Assistant
An activity can exclude the current view from the assistant. This is accomplished
by setting the FLAG_SECURE
layout parameter of the WindowManager and must be done
explicitly for every window created by the activity, including Dialogs. Your
app can also use SurfaceView.setSecure
to exclude a surface from the assistant. There is no
global (app-level) mechanism to exclude all views from the assistant. Note
that FLAG_SECURE
does not cause the Assist API callbacks to stop
firing. The activity that uses FLAG_SECURE can still explicitly
provide information to the assistant using the callbacks described earlier in
this guide.
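For example, an activity can flag its window as secure in onCreate(); the layout name here is hypothetical:

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    // Exclude this window's contents from the assistant (and from screenshots).
    getWindow().addFlags(WindowManager.LayoutParams.FLAG_SECURE);
    setContentView(R.layout.activity_main);
}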
Voice Interactions
Assist API callbacks are also invoked upon keyphrase detection. For more
information, see the voice actions documentation.
Z-order Considerations
The assistant uses a lightweight overlay window displayed on top of the
current activity. Because the user can summon the assistant at any time,
apps should not create permanent system alert windows that interfere with
the overlay window, as shown in Figure 4.

Figure 4. Assist layer Z-order.
If your app uses system alert windows, it must remove them promptly;
leaving them on the screen degrades the user experience.
Destination App
The matching between the current user context and potential actions displayed in the overlay window (shown in step 3 in Figure 1) is specific to the assistant's implementation. However, consider adding deep linking support to your app, since the assistant will typically take advantage of it. For example, Google Now uses deep linking and App Indexing to drive traffic to destination apps.
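As a sketch, a destination activity might handle an incoming deep link like this; the URI structure, layout name, and helper method are hypothetical:

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_restaurant);

    Intent intent = getIntent();
    if (Intent.ACTION_VIEW.equals(intent.getAction()) && intent.getData() != null) {
        // e.g. https://example.com/restaurant/123 -> "123"
        String restaurantId = intent.getData().getLastPathSegment();
        showRestaurant(restaurantId); // hypothetical helper
    }
}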
Implementing Your Own Assistant
Some developers may wish to implement their own assistant. As shown in Figure 2, the Android user can select the active assistant app. The assistant app must provide an implementation of VoiceInteractionSessionService and VoiceInteractionSession, as shown in this example, and it requires the BIND_VOICE_INTERACTION permission. It can then receive the text and view hierarchy, represented as an instance of AssistStructure, in onHandleAssist(). It receives the screenshot through onHandleScreenshot().
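A minimal sketch of such an implementation; the class names are hypothetical, and the service must also be declared in the manifest and protected by the BIND_VOICE_INTERACTION permission:

import android.app.assist.AssistContent;
import android.app.assist.AssistStructure;
import android.content.Context;
import android.graphics.Bitmap;
import android.os.Bundle;
import android.service.voice.VoiceInteractionSession;
import android.service.voice.VoiceInteractionSessionService;

public class MyAssistSessionService extends VoiceInteractionSessionService {
    @Override
    public VoiceInteractionSession onNewSession(Bundle args) {
        return new MyAssistSession(this);
    }
}

class MyAssistSession extends VoiceInteractionSession {
    MyAssistSession(Context context) {
        super(context);
    }

    @Override
    public void onHandleAssist(Bundle data, AssistStructure structure,
            AssistContent content) {
        // structure holds the text and view hierarchy of the source activity;
        // content carries any structured data, intent, or ClipData it provided.
        if (structure != null) {
            for (int i = 0; i < structure.getWindowNodeCount(); i++) {
                AssistStructure.ViewNode root =
                        structure.getWindowNodeAt(i).getRootViewNode();
                // Walk the view tree from root to collect text and build actions.
            }
        }
    }

    @Override
    public void onHandleScreenshot(Bitmap screenshot) {
        // Screenshot of the source activity, available if the user enabled it.
    }
}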