XML Sitemaps Feature Project Proposal

Note: a follow post was published with more recent information about this project.

While web crawlers usually discover pages from links within the site and from other sites, sitemaps supplement this approach by allowing crawlers to pick up all URLs included in the sitemap and learn about those URLs using the associated metadata.

Today, WordPress core does not generate XML Sitemaps by default, affecting a high number of WordPress websites search engine discoverability. 4 out of the top 15 plugins on WordPress plugin repository currently ship with their own implementation of XML sitemaps, pointing to a universal need for this feature and a great potential to join forces.

This post proposes integration of XML Sitemaps to WordPress Core as a feature project. The proposal was created as a collaboration between Yoast*, Google** and various contributors.

Proposed Solution

In a nutshell, the goal of the proposal is to integrate basic XML Sitemaps in WordPress Core and introduce an XML Sitemaps API to make it fully extendable. Below is a diagram of the proposed XML Sitemaps structure:

XML Sitemaps will be enabled by default making the following content types indexable

  • Homepage
  • Posts page
  • Core Post Types (Pages and Posts)
  • Custom Post Types
  • Core Taxonomies (Tags and Categories)
  • Custom Taxonomies
  • Users (Authors)

Additionally, the robots.txt file exposed by WordPress will reference the sitemap index.

Developers

An XML Sitemaps API will be introduced as part of the integration allowing extensibility. At a high level, below is a list of the ways the XML Sitemaps may be manipulated via the API:

  • Add extra sitemaps and sitemap entries
  • Add extra attributes to sitemap entries
  • Provide a custom XML Stylesheet
  • Exclude a specific post type from the sitemap
  • Exclude a specific post from the sitemap
  • Exclude a specific taxonomy from the sitemap
  • Exclude a specific term from the sitemap
  • Exclude a specific author from the sitemap
  • Exclude a specific authors with a specific role from the sitemap

Non Goals

While the initial XML Sitemaps integration will fulfill search engines minimum requirements and cover most WordPress content types, below is a list of features which will not be included in the initial integration:

  • Image sitemaps
  • Video sitemaps
  • News sitemaps
  • User-facing changes like UI controls to exclude individual posts or pages from the sitemap
  • XML Sitemaps caching mechanisms

i18n

The XML Sitemaps will leverage standard internationalization functionality provided by WordPress core.

Since there are plans by WordPress leadership to officially support multilingual websites in WordPress, the XML Sitemaps will be flexible enough to list localized content in the future as per web development best practices.

Whatā€™s next?

Your thoughts on this proposal would be greatly valued. Please share your feedback, questions or interest in collaboration by commenting on this post. After that we can decide on how to best proceed with this proposed project and set up a meeting on Slack to kick things off.

* @joostdevalk, @omarreiss, @jonoalderson, @herregroen

** @swissspidy @albertomedina @westonruter @flixos90 @tweetythierry

#feature-plugins, #feature-projects, #proposal, #seo, #xml-sitemaps