Ripping mobility from the clutches of telecom
Mobile Search, Blended Indexes, Content Ranking
I’ve been wondering a lot lately about search indexes and mobile content, and the reinforcing effect an established web index can have on the growth of any new channels. I’ve been paying particular attention since Google changed to a unified index for web and mobile searches a few weeks ago. It used to be that, as a publisher providing both web and mobile content, the only way to really make sure my stuff got indexed properly was to create a mobile sitemap and submit it to Google as an explicit set of mobile pages. Now I’m not sure what to do.
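For reference, here is a minimal Python sketch of what a mobile sitemap looks like: a regular sitemap where each URL entry also carries an empty mobile marker element. The function name and the example.com URLs are made up for illustration, and the namespaces are the ones Google documented for mobile sitemaps as far as I know, so treat this as a sketch rather than a reference implementation.

```python
import xml.etree.ElementTree as ET

# Namespaces: the standard sitemap protocol plus Google's mobile extension.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MOBILE_NS = "http://www.google.com/schemas/sitemap-mobile/1.0"


def build_mobile_sitemap(urls):
    """Return sitemap XML in which every URL is flagged as a mobile page.

    `urls` is any iterable of absolute URLs (hypothetical ones here).
    """
    # Serialize the sitemap namespace as the default one, and the
    # mobile extension under the "mobile" prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("mobile", MOBILE_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in urls:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = page_url
        # The empty <mobile:mobile/> element is what marks the entry as mobile.
        ET.SubElement(url, f"{{{MOBILE_NS}}}mobile")
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_mobile_sitemap(["http://m.example.com/story"]))
```

Whether a file like this still matters under the unified index is exactly the open question.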
My initial interpretation of the decision was that I should rely on the handheld media type alternate link to let Google know about the mobile version of pages, and rely on Google indexing the main web version in order to find the mobile version. Makes sense, but what do I do about not having the mobile sitemap there to inform Google about the structure of my site? For instance, there’s normally an indexing penalty associated with having duplicate content. Well, my web version and my mobile version are duplicate content. How much does Google use the link from the main web version to the mobile version? Should I deny the Googlebot access to the mobile version in order to make sure that the web version doesn’t get penalized?
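To make the "deny the Googlebot" option concrete, here is a small Python sketch that checks a robots.txt rule set with the standard library's parser. The /m/ path for the mobile pages and the example.com host are assumptions for illustration only, not anyone's real layout. Note that this only says what a well-behaved crawler is allowed to fetch; it doesn't tell you how any particular index will treat the pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of the mobile pages under /m/
# while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /m/

User-agent: *
Disallow:
"""


def can_fetch(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://example.com/m/story"),
        ("Googlebot", "http://example.com/story"),
        ("OtherBot", "http://example.com/m/story"),
    ]:
        print(agent, url, can_fetch(ROBOTS_TXT, agent, url))
```

Disallow rules are prefix matches, so the trailing slash matters: a rule for /m would also cover a path like /mobile-news.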
And of course there’s the problem of mobile-only content. While folks can use the loopback handheld media type link hack to keep Google from transcoding their pages, how does that affect their indexing? Say that my assumption from the paragraph above is true and Google uses the handheld media link to determine when duplicate content is actually a valid duplication. What does it do when the page points to itself? On our pages at Mowser we have a web version and a mobile version. The web version points to the mobile version as an alternate. The mobile version points to itself. That’s so that if the mobile version ends up in outside search indexes the page won’t get double transcoded. However, I don’t have a good understanding of what that does to index ranking.
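The linking arrangement described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Mowser's actual code: the function names and the example.com URLs are hypothetical. The point is that the web page and the mobile page end up emitting the same tag, because the mobile page's "alternate" is itself.

```python
from html import escape


def handheld_alternate(mobile_url: str) -> str:
    """Build the <link> tag that advertises a handheld version of a page."""
    href = escape(mobile_url, quote=True)  # escape &, <, > and quotes for the attribute
    return f'<link rel="alternate" media="handheld" href="{href}" />'


def head_link(is_mobile_page: bool, own_url: str, mobile_url: str) -> str:
    """Pick the link target for a page's <head>.

    Web pages point at their mobile counterpart; mobile pages point back
    at themselves (the "loopback" case), signalling that they are already
    the handheld version.
    """
    target = own_url if is_mobile_page else mobile_url
    return handheld_alternate(target)


if __name__ == "__main__":
    web = head_link(False, "http://example.com/story", "http://m.example.com/story")
    mobile = head_link(True, "http://m.example.com/story", "http://m.example.com/story")
    print(web)
    print(mobile)
```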
The setup really seems to reinforce Google’s position of dominance with respect to search on the whole. By smacking their indexes together, refocusing people on increasing the ranking of their web content, and relying on media linking to find the mobile version, Google makes it really hard for upstarts focused on building better mobile search to get any attention at all from content owners. And with all the uncertainty around the behavior of Google’s index, some of us are actually trying to hide away portions of our mobile content in order to figure out how to get the Google index to behave the way we expect it to. While Google controls the lion’s share of searches online, who would muck around with their content and endanger their position in the Google index in order to increase their ranking in the much lower-utilization mobile-specific indexes? Besides Russ and me, at least. Although I love what the Google folks are doing for mobile in terms of Gmail, and Gmaps, and Android, this whole issue around mobile search and advertising gives me the creeps.
Let me give a concrete example of something we just haven’t figured out yet. We have a directory of feeds at Mowser, organized by topics, rendering summaries and providing an adapted version of the full site if the user clicks through. We figured it would be a nice way to introduce mobile users to existing content without having to go to each of the publishers. We give mobile users content, give blog owners new readers, yay! Right? The problem is Google isn’t picking up the pages for some reason.
Take for instance our mobile technology blogs page, a page very near and dear to me ’cause it includes the greatest website in the entire world (this one). Now do a search for “mobile technology blogs” site restricted to mowser.com. It’s not in there for some reason. It’s not even many links deep from the pages we directly expose in our sitemap. Why does that page not get picked up? Is the sitemap we have there in some way affecting the normal behavior of the Googlebot? It’s not like it isn’t pounding on the site all the time, it’s just not finding stuff. I thought that’s what it was supposed to do.
Obviously, it’s Google’s site and Google’s index and Google’s users so they’re free to do whatever they want. For once I’m not going to bitch and tell them what they should do. But I do think a little extra transparency here would go a long way toward helping us figure out what we should be doing. Even just going through and whacking anything that’s old and outdated from the Google support section might help. Is the mobile sitemap even supposed to be used anymore now that the index is unified? The information available for webmasters seems to be really contradictory now.
This entry was posted by miker on February 13, 2008 at 1:03 am, and is filed under Community, Mowser, ThisIsMobility.

about 2 years ago
I’m really surprised to see all the mowser transcoded sites indexed in google.
You should really block your transcoded pages via robots.txt or put a noindex meta tag on it.
That’s a major duplicate content SEO ding to anyone’s site that gets caught in the Google Mowser index. I would be pretty pissed myself if my site was in there.
about 2 years ago
We have!! The Googlebot is denied /web, but still the content is showing up in there.
about 2 years ago
I’ve agonized over the same issues. I ended up just having both web and mobile Google sitemaps and putting every page in the appropriate sitemap without worrying about duplicate content. I also use the handheld meta tag to point to mobile content from web content and to recursively point from mobile content to itself.
I’m pretty satisfied with the results, 80% of my mobile and my web traffic comes from Google and traffic has been growing steadily. Most of my pages are indexed too.
I think Google’s algorithms should recognize that duplicate content across a full-web site and its mobile counterpart is a good thing and not penalize it.
It sure would be nice if Google did a better job of documenting what they consider best practices for webmasters. I still think the best approach (and something I think Matt Cutts and other Googlers have said) is to ignore Google and design for users.
about 2 years ago
Hey Dennis,
Thanks for the comment. I agree, definitely: when it comes to organizing and presenting your information, design for users and not the indexes. However, when you’re building a content adapting proxy, it requires certain behaviors in order to make sure you’re behaving properly as a citizen of the ecosystem on the whole. That’s the part I’m somewhat unsure about.
- Mike