Mobile Sitemap

I’ve been running the WordPress Mobile plugin for a while, and it works very well. While doing some housekeeping today and upgrading WordPress to the most recent release, I decided to try out the Google Sitemap plugin to generate a sitemap for the blog. What I really wanted, though, was a mobile sitemap. Because the mobile plugin currently switches on the visitor’s user agent (or on being forced into mobile mode with a ‘mobilenow’ parameter), I wasn’t sure if Google was picking up the mobile version. So I applied some quick and vicious hacks to the sitemap plugin to get it to also generate a sitemap with the ‘mobilenow’ parameter turned on, to see if that gets Google to pick up the mobile version. Turns out Google probably was hitting the mobile version already, but we’ll see how the sitemap ends up affecting that.
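For anyone curious what that hack amounts to, here is a rough sketch in Python rather than the plugin’s own PHP. It is not the plugin’s code: the function names, the example URLs and the parameter value of 1 are all assumptions made for illustration. The idea is just to take the list of URLs that would go into the regular sitemap, tack the ‘mobilenow’ parameter onto each one, and write a second sitemap from the result.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from xml.etree import ElementTree as ET

# Namespace of the standard sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def with_mobile_flag(url, param="mobilenow"):
    """Return url with the mobile-forcing parameter appended.

    The value "1" is an assumption; the plugin may only check
    that the parameter is present.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if param not in [key for key, _ in query]:
        query.append((param, "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def build_sitemap(urls):
    """Serialize a list of URLs as a minimal sitemap document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical permalinks standing in for the blog's real ones.
    pages = ["http://example.com/", "http://example.com/2007/05/mobile-sitemap/"]
    print(build_sitemap([with_mobile_flag(page) for page in pages]))
```

The mobile sitemap then gets submitted alongside the regular one, so the crawler is handed URLs that force the mobile rendering no matter what user agent it sends.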

There’s a lot of stuff still quite undefined when it comes to working on mobile web publishing. Search engine optimization and marketing have become a staple on the online side of the world. But with the mobile side all mixed up across transcoding and content adaptation depending on HTTP headers or user agents, the proper behavior for the environment really depends on a lot of contextual factors that aren’t well specified. How would something like the mobile version of a WordPress blog get bootstrapped into Google if it weren’t for manually jamming in sitemaps? On the RSS side of the world that’s done through an alternate-format link tag in the page head. That’s not really appropriate on the mobile end, because there’s no distinct MIME type to hang onto to distinguish the mobile version from the wired web version.

Still, it seems like something of the sort would make the activities involved much more obvious from the outside and open up the environment. Right now a lot of what’s done through server-controlled adaptation really needs to be more transparent and explorable. A lot of the content out on the mobile web is effectively cloaked from the search engines, because the sites might not see the indexers as mobile clients. The alternative of having the engines crawl with user agents that match known devices causes problems for the publishers, because they can’t as easily distinguish spider/crawler hits from real traffic. Currently Google seems to be crawling with a somewhat device-like user agent: Nokia6820/2.0 (4.83) Profile/MIDP-1.0 Configuration/CLDC-1.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html). Not bad; that gets picked up as a mobile device by a lot of the adaptation techniques, which look for the longest known prefix of the user agent string to figure out what the device is. Yahoo crawls with something pretty similar: Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html). What does your site do when you feed it that user agent? Can the indexers see your mobile version?
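To make the longest-matching-prefix idea concrete, here’s a minimal sketch in Python. The device table is made up for illustration (real adaptation setups work from much larger device databases), and the function names are my own. The crawler tokens are the ones from the two user agent strings above; checking for them separately is one way a publisher could still tell spider hits from real handset traffic.

```python
# Hypothetical device table: user agent prefix -> device description.
KNOWN_DEVICES = {
    "Nokia": "generic Nokia handset",
    "Nokia6820": "Nokia 6820",
    "Nokia6682": "Nokia 6682",
}

# Tokens taken from the Google and Yahoo crawler user agents quoted above.
CRAWLER_TOKENS = ("Googlebot-Mobile", "YahooSeeker/M1A1-R2D2")


def match_device(user_agent, known=KNOWN_DEVICES):
    """Return the device whose prefix is the longest one matching user_agent."""
    best = None
    for prefix in known:
        if user_agent.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    return known[best] if best else None


def is_mobile_crawler(user_agent):
    """True if the user agent carries one of the known mobile crawler tokens."""
    return any(token in user_agent for token in CRAWLER_TOKENS)


if __name__ == "__main__":
    google_ua = (
        "Nokia6820/2.0 (4.83) Profile/MIDP-1.0 Configuration/CLDC-1.0 "
        "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
    )
    print(match_device(google_ua), is_mobile_crawler(google_ua))
```

With a table like this, Google’s crawler resolves to the more specific Nokia6820 entry rather than the generic Nokia one, so it is served the mobile version, while the crawler check lets the hit be logged as a spider.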

One Response to “Mobile Sitemap”

  1. Sean Owen Says:

    I agree that getting the mobile web to show itself to a crawl is not necessarily easy. I’ll point out that we (Google) send an Accept header which asks for mobile Internet media types too. While that’s the right way to do it, it doesn’t work everywhere. We also have to use yet another UA to get some cHTML content; many sites look for something like a “DoCoMo” UA.

    (PS for the mobile webmasters, if you’re going to write rules based on UA, do make sure you send your mobile content when you see “Googlebot-Mobile”!)

    Even sites that are trying to look at Accept headers properly have trouble since text/html is the official type for both desktop-oriented HTML markup and cHTML.

    I think this highlights how confused the whole MIME / Internet media type system has always been — some document types have multiple media types (XHTML anyone?), some media types map to multiple document types (e.g. text/html for HTML, cHTML) — but, that’s another story.
