Webcontent.gov - Your Guide to Managing U.S. Government Websites



Government Web Content Managers Forum Monthly Conference Call

Thursday, February 15, from 11 - 12 ET

Conference Call Topic: Using Sitemaps to Help Search Engines Find Your Content

Speakers:

  • JL Needham, Public Sector Content Partnerships, Google, Inc.
  • Representatives from three federal agencies who've implemented sitemaps: PlainLanguage.gov, Library of Congress, and OSTI/Energy

Read more about the Sitemaps tool at: http://www.sitemaps.org/

Presentation slides:

JL Needham from Google

Sitemaps help search engines crawl your site and find content. Google developed sitemaps but has worked with other search engines to make them an industry-neutral standard that can help improve results on all the major search engines.

JL discussed “Common barriers to search crawling.pdf”

  • Web search is different from site search.
  • More people find government websites via search engines than by any other method.
  • Sitemaps really help with “long tail” searches, helping people find obscure information.
  • Typically, search engines cannot search “behind” web forms, so much data is not accessible when people search. A sitemap helps search engines find that information so it’s accessible to visitors.
  • The four most important tags in an XML sitemap: the URL’s location (loc), last modification date (lastmod), change frequency (changefreq), and priority (priority).
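
To make the four tags concrete, here is a minimal sketch of building a one-entry XML sitemap with Python's standard library. The tag names and namespace come from the sitemaps.org protocol; the example.gov URL, the date, and the build_sitemap helper are assumptions made up for illustration and were not shown on the call.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is required; the other three tags are optional hints.
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical page, not taken from any agency's real sitemap.
example = build_sitemap([
    {
        "loc": "https://www.example.gov/reports/2007/annual.html",
        "lastmod": "2007-02-15",
        "changefreq": "monthly",
        "priority": 0.8,
    }
])
print(example)
```

Only loc is required for each entry; the other three tags are hints that a search engine may or may not act on.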

Q - How do you create this sitemap?

A: Use the Google tool to create the sitemap, place it in your site’s root directory where search engines can find it, then notify the search engines of your sitemap’s URL.
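
The call covered notifying search engines through Google's tools. As a complementary sketch, the sitemaps.org protocol also lets a site announce its sitemap with a "Sitemap:" line in robots.txt. The helper name add_sitemap_line and the example.gov URL below are assumptions for illustration only.

```python
def add_sitemap_line(robots_txt, sitemap_url):
    """Return robots.txt text with a Sitemap line added if it is missing."""
    line = f"Sitemap: {sitemap_url}"
    existing = [l.strip() for l in robots_txt.splitlines()]
    if line in existing:
        # Already announced; leave the file unchanged.
        return robots_txt
    # Make sure the new line starts on its own line.
    separator = "" if robots_txt == "" or robots_txt.endswith("\n") else "\n"
    return f"{robots_txt}{separator}{line}\n"

# Hypothetical robots.txt content for an example site.
robots = "User-agent: *\nDisallow: /private/\n"
updated = add_sitemap_line(robots, "https://www.example.gov/sitemap.xml")
print(updated)
```

Running the helper a second time on its own output makes no further change, so it is safe to include in a scheduled publishing job.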

Real-life examples:

Miriam from PlainLanguage.gov

The site has been online for over ten years and grew without much structure or budget. Volunteers redesigned it, and it now has both static and data-driven content. It gets traffic from all over the world. They used Microsoft Access for their data, but the type of data field they used wasn’t searchable. After a while, they realized search engines were randomly generating pages and finding some of the data, but sitemaps allowed them to supply ALL URLs, even for dynamic content. Miriam developed a script to generate the URL list. It took about a day to build, and now she runs it once a month to get an updated URL list (XML sitemap) of the site, then reposts the list for the search engines to find.
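
Miriam's script was not shown on the call, so the following is only a rough sketch of the general idea: read the record keys from the database and emit one URL per record. It uses an in-memory SQLite database as a stand-in for Microsoft Access, and the table name documents, the view.cfm page, and the id parameter are all hypothetical.

```python
import sqlite3
from urllib.parse import urlencode

BASE_URL = "https://www.example.gov/view.cfm"  # hypothetical dynamic page

def list_dynamic_urls(conn):
    """Build one URL per database record, in a stable order."""
    rows = conn.execute("SELECT id FROM documents ORDER BY id")
    return [f"{BASE_URL}?{urlencode({'id': row[0]})}" for row in rows]

# In-memory stand-in for the site's real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents (id, title) VALUES (?, ?)",
    [(1, "Guidelines"), (2, "Examples"), (3, "Resources")],
)
urls = list_dynamic_urls(conn)
for url in urls:
    print(url)
```

The resulting list can then be written out as the loc entries of an XML sitemap and regenerated on whatever schedule suits the site, monthly in this case.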

Jim from Library of Congress

Their site has a huge amount of data. “American Memory” is a legacy database application with millions of items, all of which are accessed dynamically. The dynamic URLs kept changing, and robots.txt couldn’t keep up with them. Sitemaps helped them create “handles” – persistent URLs pointing to that dynamic data. This open standard opens all this content to all major search engines.

Jeff from OSTI/Energy

The site contains research reports and other technical data, including lots of PDFs. They now have around 2.3 million records. They’ve automated their sitemap so it’s dynamically updated. Since the crawler recognizes the date each item was last changed or updated, they can tell search engines not to re-crawl information that hasn’t changed, making the process much faster. Sitemaps also help them find orphan files.
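
The benefit of the last-modified date can be illustrated with a small sketch. This is not how any particular search engine works internally; it only assumes a crawler that remembers when it last visited and skips entries whose lastmod is not newer. The changed_since helper, URLs, and dates are made up for illustration.

```python
from datetime import date

def changed_since(entries, last_crawl):
    """Return the URLs whose lastmod date is later than last_crawl."""
    return [
        entry["loc"]
        for entry in entries
        if date.fromisoformat(entry["lastmod"]) > last_crawl
    ]

# Hypothetical sitemap entries with ISO-format lastmod dates.
entries = [
    {"loc": "https://www.example.gov/report/1", "lastmod": "2006-11-02"},
    {"loc": "https://www.example.gov/report/2", "lastmod": "2007-02-10"},
]
# Assume the crawler last visited on January 1, 2007.
to_fetch = changed_since(entries, date(2007, 1, 1))
print(to_fetch)
```

With millions of records, skipping unchanged entries this way is what keeps repeat crawls fast.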

JL talked about NCES and how sitemaps help ensure that parents get the most current info.

Learn more: Federal sitemaps wiki

Q & A

Q: Folks would like to see sitemaps from other agencies

A: We’ll send those URLs out later, sharing examples. Webcontent.gov will also host some more info about sitemaps. Please email examples to Sheila Campbell if you’d like to share.

Q: Where does the sitemap file live? How is it used?

A: The sitemap is posted on your site. Use the Sitemap tools to tell Google where to find it. Note that this does NOT replace “regular” site crawling; it only enhances it.

Google hosts a webinar every Thursday at 3 eastern, if you’d like some training or help.

Q: Verification process?

A: Google verifies your website by giving you a “key” so the Google crawler can find you and verify your identity. You can secure your sitemap so only Google can see it, locking down IP addresses, etc., but since this is generally all public content, it’s not usually locked.

Q: Will sitemaps work on intranets?

A: No, this only works for public content. The intent of a sitemap is to make all public content accessible, even if it’s “hidden” in a database.

Q: Is it best to install at highest level of website, or can it be used only on lower “sections” of a large site?

A: Can work either way – depends on what will work best for your agency, and how you want to administer this info.

Q: Why would you use this on sites that aren’t “dynamic?”

A: Sitemaps are a great way to quickly disseminate information—for instance, new press releases might not be crawled for a while, but sitemaps will pull the URL for that new press release so it can be found right away. Google can “ping” your site as often as once per minute to see if anything has been updated, so this can really help get information out immediately.

Q: How does metadata fit in?

A: Metadata doesn’t matter much for sitemaps, but it does help with ranking, so it’s still important.

If you have questions, you can contact JL Needham at sitemap-partners@google.com

Next Forum Call:

March 15.

Reminders:

  • Everyone should have updated firstgov.gov links to usa.gov. Instructions are on webcontent.gov.
  • Sign up for the Gerry McGovern webinar on February 21, 2007. There’s still space available, and it’s free for government employees.
  • Feb 28 – Web Manager University’s Web Analytics Marketplace training. Hear from various vendors and learn how to define customers. Real-life examples from agencies who’ve implemented these solutions.
  • April 24 – 2007 Government Web Content Managers Conference. Sign up now!

 

Page Updated or Reviewed: June 26, 2007
