Speakable Schema

In one line

Speakable schema is a structured data markup that identifies webpage text best suited for audio playback via voice assistants. Learn how to implement it.

Definition & overview

Speakable schema is a structured data markup that identifies specific sections of webpage text best suited for audio playback by voice assistants. It enables search engines to deliver direct verbal answers and serves as a critical technical foundation for modern Answer Engine Optimization.

Search marketing teams are adapting to a shift away from the traditional ten blue links toward zero-click searches and direct conversational responses on smart speakers such as Google Home. Implementing Speakable structured data helps brands stay visible for voice queries during this transition. By explicitly marking the most relevant content on a page, developers help Google Assistant and other answer engines extract the information a user requests.

This strategy replaces outdated voice search tactics with precise data structuring. Instead of hoping an algorithm guesses which paragraph matters most, Speakable markup points the text-to-speech engine directly at the intended answer.

How to implement speakable schema

Implementing this markup means adding JSON-LD structured data to the head or body of a webpage. The goal is to isolate the exact HTML elements you want the text-to-speech engine to read aloud.

1. Identify the target content: Locate the specific article summary or critical paragraph that directly answers a user query. While the markup was originally designed for news publishers, B2B and SaaS pages often target executive summaries or technical definitions.
2. Choose a targeting method: Decide whether to identify the text using a cssSelector or an xpath.
3. Write the JSON-LD script: Construct the SpeakableSpecification object and attach it through the speakable property of the page's NewsArticle or WebPage structured data block.
4. Deploy and validate: Push the code to the live page and verify the syntax.

The two targeting methods differ in how they locate content. A cssSelector targets elements by their class or ID attributes. This approach is generally resilient because class names rarely change during minor page updates. An xpath points to the structural location of the content in the HTML document tree. It is precise but breaks easily if a developer adds a new container or alters the page layout, so a cssSelector is generally the safer choice for long-term stability.
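
To make the stability trade-off concrete, here is a minimal Python sketch using only the standard library. The two HTML strings and the function names are hypothetical, and the class lookup is written as an attribute match that stands in for a CSS class selector (a real CSS selector would also match elements carrying several classes). It is an illustration of the failure mode, not how any search engine resolves selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a redesign.
ORIGINAL = """<html><body>
<div><p class="executive-summary">Speakable marks text for audio playback.</p></div>
</body></html>"""

# The same page after a new container is added above the summary.
REDESIGNED = """<html><body>
<div class="banner"><p>Subscribe to our newsletter.</p></div>
<div><p class="executive-summary">Speakable marks text for audio playback.</p></div>
</body></html>"""


def text_by_position(document):
    """Positional lookup, similar in spirit to a structural xpath."""
    node = ET.fromstring(document).find("./body/div[1]/p[1]")
    return node.text if node is not None else None


def text_by_class(document):
    """Class-based lookup, standing in for a cssSelector such as .executive-summary."""
    node = ET.fromstring(document).find(".//p[@class='executive-summary']")
    return node.text if node is not None else None


for label, document in (("original", ORIGINAL), ("redesigned", REDESIGNED)):
    print(label, "| position:", text_by_position(document))
    print(label, "| class:   ", text_by_class(document))
```

In this sketch the positional lookup silently returns the banner text once the new container appears, while the class-based lookup still returns the summary.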

Example

Here's a concrete JSON-LD code snippet demonstrating how to apply the speakable property with a SpeakableSpecification that uses a cssSelector.

{
  "@context": "https://schema.org/",
  "@type": "WebPage",
  "name": "Answer Engine Optimization Guide",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".executive-summary", ".key-takeaways"]
  },
  "url": "https://aloha.digital/aeo-guide"
}
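
If the markup is generated by a template or build step rather than written by hand, the same structure can be produced programmatically. The following is a minimal sketch, assuming Python is available in the publishing pipeline; the helper names build_speakable_jsonld and to_script_tag are hypothetical and not part of any library.

```python
import json


def build_speakable_jsonld(name, url, css_selectors):
    """Return a WebPage JSON-LD dict with a speakable block (illustrative helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": list(css_selectors),
        },
        "url": url,
    }


def to_script_tag(data):
    """Serialize the dict and wrap it in the script tag used for JSON-LD."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a string value can never close the script element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


page = build_speakable_jsonld(
    name="Answer Engine Optimization Guide",
    url="https://aloha.digital/aeo-guide",
    css_selectors=[".executive-summary", ".key-takeaways"],
)
print(to_script_tag(page))
```

The printed script element can be placed in the head or body of the page, as described in the implementation steps.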

Once you draft your code, you must validate the syntax to ensure search engines can parse it correctly.

1. Open the Google Rich Results Test tool or the Schema Markup Validator, as recommended in official developer documentation.
2. Select the "Code" tab instead of the URL tab because you are validating raw markup rather than a live page.
3. Paste your complete JSON-LD snippet into the testing field so the tool can analyze the syntax.
4. Click "Test Code" and review the output for syntax errors or missing required properties.
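
Before pasting the snippet into either tool, a quick local check can catch basic problems such as a trailing comma or a missing selector. The sketch below is a rough pre-check under stated assumptions: check_speakable is a hypothetical helper, the rules it applies are a simplified reading of the markup shown in this article, and it does not replace the official validators.

```python
import json


def check_speakable(raw):
    """Return a list of problems found in a JSON-LD string (empty list means none found)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    if "@context" not in data:
        problems.append("missing @context")
    speakable = data.get("speakable")
    if not isinstance(speakable, dict):
        problems.append("missing speakable object")
        return problems
    if speakable.get("@type") != "SpeakableSpecification":
        problems.append("speakable @type should be SpeakableSpecification")
    # Assumption: at least one targeting method must be present and non-empty.
    if not (speakable.get("cssSelector") or speakable.get("xpath")):
        problems.append("speakable needs a cssSelector or xpath value")
    return problems


snippet = """{
  "@context": "https://schema.org/",
  "@type": "WebPage",
  "name": "Answer Engine Optimization Guide",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".executive-summary", ".key-takeaways"]
  },
  "url": "https://aloha.digital/aeo-guide"
}"""
print(check_speakable(snippet))
```

An empty list only means this simplified check found nothing; the Rich Results Test or Schema Markup Validator remains the reference for what search engines accept.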

Common mistakes

Deploying this structured data requires strict adherence to content guidelines. Teams often run into these practical pitfalls during implementation:

Targeting the wrong HTML elements: Highlighting entire articles or navigation menus instead of concise summaries breaks the read-aloud functionality, frustrating users with massive audio blocks.
Ignoring language limitations: The feature remains in beta and officially supports only English-language queries from users in the U.S. Deploying it for other markets is unlikely to pay off today.
Using legacy formatting: Writing the markup in Microdata instead of JSON-LD makes it harder to maintain and easier to break during template changes. Google recommends JSON-LD for structured data, so it is the safer format for speakable markup.
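
One way to catch oversized targets before deployment is to estimate how long the selected text would take to read aloud. This is a minimal sketch: the speaking rate of 150 words per minute and the 30-second ceiling are assumptions chosen for illustration, not values taken from this article, and the function names are hypothetical.

```python
def estimated_seconds(text, words_per_minute=150):
    """Estimate read-aloud time from a whitespace word count (assumed speaking rate)."""
    word_count = len(text.split())
    return word_count * 60 / words_per_minute


def is_too_long(text, max_seconds=30, words_per_minute=150):
    """Flag text whose estimated read-aloud time exceeds the assumed ceiling."""
    return estimated_seconds(text, words_per_minute) > max_seconds


summary = (
    "Speakable schema identifies the sections of a page best suited "
    "for audio playback by voice assistants."
)
print(round(estimated_seconds(summary), 1), is_too_long(summary))
```

Running the check against each element matched by your selectors gives an early warning when a selector accidentally captures a whole article or a navigation menu.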

Frequently asked questions

What is the difference between JSON and JSON-LD?

JSON is a general-purpose format for storing and exchanging data, while JSON-LD adds context by linking that data to vocabularies such as Schema.org. Google recommends JSON-LD for structured data, including speakable markup, because it explicitly defines content relationships for search engines.

How does speakable work?

Developers add structured data to highlight specific text blocks. Google Assistant then parses this code to locate the elements targeted by the CSS selector or XPath, extracts their text, and uses text-to-speech technology to generate audio playback.
