Ever since I learned that Google is indexing KML and KMZ files for inclusion in their searches (I say searches as in both the regular old Google search and the search feature in Google Earth), I have been battling to get our data in there.
Well, progress has been made on the Google Earth search front. Not on the Google Search front, though. So for my personal sanity, and maybe to benefit others out there, I plan to document my journey here.
Starting with what I had to do server-wise.
First off: I run on a Unix box under Apache and use PHP to bring it all together. My KML output is database driven. I don’t have any static KML files lying around; I generate them on the fly.
The first challenge I came across was this mess, which I considered more of an anomaly, so read it if you want, but chances are it will not affect you. Fixing it allowed me to paste a KML URL from here into Google Maps and display it there. If that was any indication of what the rest of Google sees, then this was pretty important for me.
Next up, headers. I struggled for a long while to get my headers to talk correctly to different browsers. Originally, I just wanted my site to “force download” our KML files to my users, which worked until recently, when IE decided to start downloading empty files.
Of course, the forced-download approach probably was not going to help me with this whole Google KML indexing journey I am on. According to Google, they are looking for an application/vnd.google-earth.kml+xml Content-Type header. OK, this would seem easy enough, but IE still gave me problems. Honestly, I don’t understand why. I admit I am not a “header” master by any means, so this may be way off, but here is the “mess that works” that I ended up with:
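// No-cache headers. Note: header() replaces an earlier header of the same name by default,
// so the "private" line below likely overrides the two Cache-Control lines before it.
header("Cache-Control: no-store, no-cache, must-revalidate");
header("Cache-Control: post-check=0, pre-check=0", false);
header("Cache-control: private");
header("Pragma: no-cache");
// The MIME type Google asks for. The filename parameter is not a standard part of Content-Type.
header('Content-Type: application/vnd.google-earth.kml+xml; filename="myKmlFile.kml"');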
In addition to this, I also added the same MIME types to my .htaccess file, which looks like this:
AddType application/vnd.google-earth.kml+xml .kml
AddType application/vnd.google-earth.kmz .kmz
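If you want to double-check what your server actually sends, something like the following rough sketch might help. It pulls the Content-Type out of a list of raw response header lines, such as the array PHP's get_headers() returns. The function name contentTypeFrom and the constant KML_MIME are my own inventions for illustration, not anything from my real scripts, and the sample URL in the comment is only a guess at a reasonable test target.

```php
<?php
// Hypothetical helper: names here are illustrative assumptions.
const KML_MIME = 'application/vnd.google-earth.kml+xml';

/**
 * Return the bare MIME type from raw header lines, or null if none is present.
 * Parameters after ";" (charset, filename, ...) are dropped.
 * If several Content-Type lines exist (e.g. after redirects), the last one wins.
 */
function contentTypeFrom(array $headerLines): ?string
{
    $found = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (stripos($line, 'Content-Type:') === 0) {
            $value = trim(substr($line, strlen('Content-Type:')));
            $found = strtolower(trim(explode(';', $value)[0]));
        }
    }
    return $found;
}

// Usage against a live URL (needs network access; get_headers() returns false on failure):
// $lines = get_headers('http://www.crankfire.com/kml/tid-20-t-0-w-421.kml');
// var_dump($lines !== false && contentTypeFrom($lines) === KML_MIME);
```

This only tells you what one request returned. It says nothing about how Google's crawler treats the file afterwards.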
I am feeling pretty good at this point. I am sort of showing up in Google Earth, but not in Google Search. Where to go next?
Well, somewhere I read that having a .kml extension is a good thing. I don’t entirely believe that, seeing how Google states they base this sort of thing on headers/MIME types. But still, I am going to do something about it, just in case.
Enter Apache’s fabled Mod_Rewrite. In theory I can turn this:
http://www.crankfire.com/data/dlkml.php?tkn=&tid=20&w=421
Into this, a “file” with a .kml extension:
http://www.crankfire.com/kml/tid-20-t-0-w-421.kml
Apparently these Mod_Rewrites can be a real pain in the ass, but I was lucky enough to find this little site - it’s a Mod_Rewrite wizard that helps you create what you gotta do. A few minutes later I had my Mod_Rewrite rule:
RewriteEngine On
RewriteRule ^tid-([^-]*)-t-([^-]*)-w-([^-]*)\.kml$ /data/dlkml.php?tid=$1&t=$2&w=$3 [L]
Then I added this to its own .htaccess file in its own directory to keep things simple. It works beautifully.
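To see what the rule does without touching Apache, here is a small sketch that applies the same pattern in PHP. It assumes the rule sees the path relative to the directory holding the .htaccess file (so just the file name), which is how per-directory rewrite rules normally behave. The function name rewriteKmlPath and the constant KML_PATTERN are made up for this example.

```php
<?php
// Same regular expression as the RewriteRule, wrapped in PCRE delimiters.
const KML_PATTERN = '#^tid-([^-]*)-t-([^-]*)-w-([^-]*)\.kml$#';

/**
 * Map a pretty .kml file name to the script URL the rule would rewrite it to.
 * Returns null when the name does not match the pattern.
 */
function rewriteKmlPath(string $path): ?string
{
    if (preg_match(KML_PATTERN, $path, $m) !== 1) {
        return null;
    }
    // $m[1], $m[2], $m[3] correspond to $1, $2, $3 in the RewriteRule.
    return sprintf('/data/dlkml.php?tid=%s&t=%s&w=%s', $m[1], $m[2], $m[3]);
}

// Example: the file name from the pretty URL above.
$target = rewriteKmlPath('tid-20-t-0-w-421.kml');
```

One thing this makes obvious: because each group is [^-]*, a value containing a hyphen (a negative number, say) would not match, so the scheme only suits hyphen-free parameters.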
Now, as of the beginning of November 2007, the last thing I have done is create a sitemap file for this site and all of our geographic data. I tried a couple of sitemap generators (online and software based) and was not happy with what they produced. So, in the end, I generated my own. Using my Mod_Rewrite trick above, I generated a nice geographic sitemap to go along with my general sitemap. Take a look:
http://www.crankfire.com/sitemapgeographic.xml
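For anyone who wants to roll their own as well, here is a minimal sketch of the idea, not my actual generator. It builds the pretty .kml URLs and wraps them in a plain sitemaps.org urlset. The function names kmlUrl and buildSitemap and the sample rows are hypothetical, and it leaves out any geo-specific tags, since those are not covered in this post.

```php
<?php
/** Build the pretty URL that the Mod_Rewrite rule understands. */
function kmlUrl(int $tid, int $t, int $w): string
{
    return sprintf('http://www.crankfire.com/kml/tid-%d-t-%d-w-%d.kml', $tid, $t, $w);
}

/** Wrap a list of URLs in a basic sitemaps.org <urlset> document. */
function buildSitemap(array $urls): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // Escape &, <, > and quotes so the URL is safe inside XML.
        $loc  = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <url><loc>' . $loc . '</loc></url>' . "\n";
    }
    return $xml . '</urlset>' . "\n";
}

// Hypothetical rows standing in for a database query: [tid, t, w].
$rows = [[20, 0, 421], [21, 0, 421]];

$urls = [];
foreach ($rows as $row) {
    $urls[] = kmlUrl($row[0], $row[1], $row[2]);
}

$sitemap = buildSitemap($urls);
// file_put_contents('sitemapgeographic.xml', $sitemap); // write it wherever your site serves it
```

Building the file from the same data that drives the KML output means the sitemap cannot drift out of sync with what the rewrite rule can serve.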
So now, at least, I am pretty sure Google should see me, several times over. And naturally, my KML files still don’t show up in the regular Google Search results. So the battle continues.
Next up? Optimizing my KML files. They look like holy hell in Google Earth, if you can find them at all.