Google Sitemaps in Sava
UPDATE/IMPORTANT: the plugin version is now complete and ready to use. (06/16/2009)
Google Sitemaps are a very effective tool for maximizing Google's indexing of your website, and should be part of your site's overall SEO plan. Even if your site is HTML-compliant and otherwise SEO-friendly, they can be a handy tool for identifying problems, and they are especially important if you have a lot of dated or non-priority content. They also help to ensure that pages not found in navigation menus will be indexed by Google.
I've developed a simple function for automatically generating sitemaps in Sava. It will travel through each website in an install and generate a sitemap.xml for each. It uses Sava's slick Class Extension Manager to help define which pages should go into the sitemap, as well as the change frequency and priority (though if you exclude those, the function sets default values). I tried to design it so it would be easy to integrate into a Sava install, and because the developer or plugin docs haven't been released yet I avoided monkeying around in the guts of Sava to get it working.
Instructions are as follows:
1. Insert the function below into the /default/includes/contentRenderer.cfc component.
    <cfset var GoogleXML = XmlNew(true) />
    <cfset var increment = 0 />
    <!--- not strictly correct, but it puts 'contentBean' variable into a local scope --->
    <cfset var contentBean = StructNew() />
    <cfset var siteID = "" />
    <cfset var delim = application.configBean.getFileDelim() />
    <cfset var theDomain = "" />
    <cfset var theFileLocation = "" />
    <cfset var includeInSiteMap = false />
    <!--- keep the query result local to the function as well --->
    <cfset var rsSiteTree = "" />
    <!--- get the list of sites --->
    <cfset var siteStruct = application.settingsManager.getSites() />
    <!--- loop through each site --->
    <cfloop collection="#siteStruct#" item="siteID">
        <!--- restart the counter so each site's entries begin at child 1 of its own urlset --->
        <cfset increment = 0 />
        <!--- puts the sitemap.xml in the root of the site, i.e. www.somesite.com/default/sitemap.xml --->
        <cfset theFileLocation = "#application.configBean.getWebRoot()##delim##siteID##delim#sitemap.xml" />
        <cfset theDomain = "http://#application.settingsManager.getSite(siteID).getDomain()#" />
        <cftry>
            <!--- a fresh root element for every site --->
            <cfset GoogleXML.XMLRoot = XMLElemNew(GoogleXML,"urlset") />
            <cfset GoogleXML['urlset'].XmlAttributes['xmlns'] = "http://www.sitemaps.org/schemas/sitemap/0.9" />
            <!---
                get all of the content items that are part of the "Sitemap" subtype; we will exclude later based upon 'includeInSitemap' attribute
                pages have to be approved, active and displayed
            --->
            <cfquery name="rsSiteTree" datasource="#application.configBean.getDatasource()#" username="#application.configBean.getDBUsername()#" password="#application.configBean.getDBPassword()#">
                SELECT
                    contentID
                FROM
                    tcontent
                WHERE tcontent.siteid = <cfqueryparam cfsqltype="cf_sql_varchar" value="#siteID#" />
                AND tcontent.subtype = 'SiteMap'
                AND tcontent.Approved = 1
                AND tcontent.active = 1
                AND tcontent.display = 1
            </cfquery>
            <cfoutput query="rsSiteTree">
                <cfset contentBean = application.contentManager.getActiveContent(rsSiteTree.contentID,siteID) />
                <!--- include the page when the attribute is missing or set to true --->
                <cfset includeInSiteMap = iif(len(contentBean.getValue('includeInSiteMap')) eq 0 or contentBean.getValue('includeInSiteMap') eq "true",de("true"),de("false")) />
                <cfif includeInSiteMap eq true>
                    <cfset increment = increment + 1 />
                    <!--- one <url> element with its four children per page --->
                    <cfset GoogleXML['urlset'].XmlChildren[increment] = XmlElemNew(GoogleXML,"url") />
                    <cfset arrayappend(GoogleXML['urlset'].XmlChildren[increment].xmlChildren, XmlElemNew(GoogleXML,"loc")) />
                    <cfset arrayappend(GoogleXML['urlset'].XmlChildren[increment].xmlChildren, XmlElemNew(GoogleXML,"lastmod")) />
                    <cfset arrayappend(GoogleXML['urlset'].XmlChildren[increment].xmlChildren, XmlElemNew(GoogleXML,"changefreq")) />
                    <cfset arrayappend(GoogleXML['urlset'].XmlChildren[increment].xmlChildren, XmlElemNew(GoogleXML,"priority")) />
                    <cfset GoogleXML['urlset'].XmlChildren[increment]['loc'].XMLText = "#theDomain##getURLStem(siteID,contentBean.getFilename())#" />
                    <cfset GoogleXML['urlset'].XmlChildren[increment]['lastmod'].XMLText = dateformat(contentBean.getLastUpdate(),"yyyy-mm-dd") />
                    <!--- defaults to 'monthly' if the attribute doesn't exist --->
                    <cfset GoogleXML['urlset'].XmlChildren[increment]['changefreq'].XMLText = iif(len(contentBean.getValue('changefreq')),de(contentBean.getValue('changefreq')),de("monthly")) />
                    <!--- defaults to '0.5' if the attribute doesn't exist --->
                    <cfset GoogleXML['urlset'].XmlChildren[increment]['priority'].XMLText = iif(len(contentBean.getValue('priority')),de(contentBean.getValue('priority')),de("0.5")) />
                </cfif>
            </cfoutput>
            <cffile action="write" file="#theFileLocation#" output="#GoogleXML#" />
            <cfcatch>
                <cfdump var="#cfcatch#" />
            </cfcatch>
        </cftry>
    </cfloop>
</cffunction>
I've placed this function in the contentRenderer.cfc for two reasons: a) because it was an easy place to put it, and b) because it also uses the getURLStem() function, which leads us to the next step:
2. Uncomment the getURLStem function in /default/includes/contentRenderer.cfc. (Note: I'm not actually sure this is strictly required, since there is a getURLStem function in the source contentRenderer.cfc, but we've done this to get SEO URLs.)
3. Log into your Sava administrator and click on Site Settings » Edit Current Site.
4. Click on Add Class Extension, select a base type of Page and call the subtype "Sitemap".
Note: technically you can now skip ahead to #10, because while the next few steps let you set specific values for priority et al., they aren't required.
5. Click on Add Attribute Set and call the new set sitemapsettings.
6. Click on Add New Attribute; set the name to changefreq, label to Change Frequency, input type to SelectBox, default value to monthly, and in the option list add daily^weekly^monthly^yearly. Click Add.
7. Click on Add New Attribute; set the name to priority, label to Priority, input type to SelectBox, default value to 0.5, and in the option list add 0.1^0.2^0.3^0.4^0.5^0.6^0.7^0.8^0.9^1.0. Click Add.
8. Click on Add New Attribute; set the name to includeInSiteMap, label to Include In Site Map, input type to RadioGroup, default value to 1, and in the option list add 0^1; in the option label list add no^yes. Click Add.
9. If you have portals in your website, repeat steps 4-8, choosing Portal as your base type instead of Page.
10. Now hop on over to the Site Manager and edit each page in your website. Simply changing the type from Page to Page / Sitemap will ensure it is added to your sitemap. If you have added the attributes, you can also go to the Extended Attributes tab and change the frequency, priority and include options.
11. Once your pages are updated, add a new page (just a plain page, not a Page / Sitemap) and call it Google Site Maps. Make sure to exclude it from site navigation, and set the page template to "blank.cfm". Paste the code snippet below into the page (remove the asterisk -- Sava is intent upon rendering the tag otherwise).
12. Load your "Google Site Maps" page in a browser. If all has gone well, the pages you have marked as Page / Sitemap should now all be listed in a sitemap.xml document inside the root of every site in Sava (e.g. /default/sitemap.xml). We've got these pages set up as a scheduled task in the CF Administrator for our client sites.
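The scheduled task mentioned in step 12 can also be registered from code rather than through the CF Administrator. What follows is only a rough sketch using the standard cfschedule tag; the task name, the page URL, the start date and time, and the daily interval are all placeholders of mine, not values from our setup.

```cfml
<!--- Sketch only: task name, URL, start date/time and interval are placeholders (assumptions) --->
<cfschedule
    action="update"
    task="Rebuild Google sitemaps"
    operation="HTTPRequest"
    url="http://www.example.com/path-to-your-google-site-maps-page/"
    startDate="01/01/2010"
    startTime="03:00 AM"
    interval="daily"
    requestTimeOut="300" />
```

Running this once creates (or updates) the task; after that the server requests the page on the chosen interval, which rebuilds every site's sitemap.xml.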
The last step is to go to the Google Webmaster Tools page and register your sitemap(s).
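Before registering, it may be worth confirming what actually landed in each file. The snippet below is a rough sketch of such a check, assuming it runs somewhere the Sava application scope is available (a scratch template, for example). The variable names are made up for illustration and "default" is just an example siteID; the file path is built the same way the generator function builds it.

```cfml
<!--- Sketch only: "default" is an example siteID and the check* names are hypothetical --->
<cfset checkSiteID = "default" />
<cfset checkDelim = application.configBean.getFileDelim() />
<!--- same location the generator function writes to --->
<cfset checkFile = application.configBean.getWebRoot() & checkDelim & checkSiteID & checkDelim & "sitemap.xml" />

<cfif fileExists(checkFile)>
    <!--- xmlParse() accepts a file path as well as an XML string --->
    <cfset checkDoc = xmlParse(checkFile) />
    <cfset checkCount = arrayLen(checkDoc.XmlRoot.XmlChildren) />
    <cfoutput>
        <p>#checkCount# URL(s) in #htmlEditFormat(checkFile)#</p>
        <!--- list each entry's loc, changefreq and priority --->
        <cfloop from="1" to="#checkCount#" index="i">
            <cfset checkNode = checkDoc.XmlRoot.XmlChildren[i] />
            #htmlEditFormat(checkNode.loc.XmlText)#
            (#htmlEditFormat(checkNode.changefreq.XmlText)#, #htmlEditFormat(checkNode.priority.XmlText)#)<br />
        </cfloop>
    </cfoutput>
<cfelse>
    <cfoutput><p>No sitemap found at #htmlEditFormat(checkFile)#</p></cfoutput>
</cfif>
```

An empty list or a missing file for a site usually means the generator hit an error for that site, so check the cfdump output on the "Google Site Maps" page.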
Using the Class Extensions wasn't really necessary, since I could have used the "Include in Site Navigation" flag as a general indication of whether or not a page should be included, but I wanted to play around with these before I actually needed to develop a client solution with them. Also, I'm sure there is a way to run the function without having to create an actual page in Sava, but until there are docs ... not nagging, just saying ;) ... I'm sticking to simpler-is-better for the little stuff.
Comments
- Sean Schroeder
Hi Grant,
Great post. It's awesome to see other developers doing cool things with Sava.
One thing to note...the most important reason to add your function to the "local" content renderer is that this allows you to customize Sava without taking you off the upgrade path. The local contentRenderer.cfc overrides the root one, much like later CSS styles override the ones that come before them.
Also, in your example you use "default" for your site but really this should be for whatever site you may have in Sava.
I'm gonna take this for a spin and see how it goes! Nice work!
- January 21, 2009, 4:21 PM
- Grant
@Sean: only the beginning, my friend. Sava's the playground I've been waiting for.
As to the 'default' thing, I know I don't follow your advice on creating a new site, but since I have a fresh install for every website I haven't seen the point ... I'm sure that will come up with a 'gotcha' at some point, though.
The function is designed to do all the sites in a Sava install, however, so you should only have to put it into one (like 'default') for it to do all of the sites you've created. I guess you're saying to put it into the first site you set up, which makes sense.
- January 21, 2009, 4:31 PM
- Hugo Ahlenius
Very cool, and I got it set up without much ado - very good instructions. A bit of a hassle to go through every page and change that, but one can live with that one-time thing.
But! I have galleries on my site, and I would prefer to have them in the sitemap as well. I assume that one can add the gallery in the same way as for a portal - but what about the gallery items?
- January 22, 2009, 1:03 AM
- Grant
@Hugo: I agree that it is a pain to go through and change the page types to use the SubType, but it was the only way to add the fine-grain control for each page. I'm planning to build a menu-builder plugin for finer control over site menus, and I'll integrate sitemaps into that too so it'll get easier once that is available.
As to gallery pages, adding the subtype to those should work fine as far as any page text is concerned, but Google does say that you should not include image urls in sitemaps:
http://www.google.com/support/webmasters/bin/answer.py?answer=34658&ctx=sibling
- January 22, 2009, 1:58 AM
- Sean Schroeder
@Grant: Using the default site isn't necessarily a bad way to go, much like just editing the site.css file rather than extending it with theme.css. We recommend both because they allow for more flexibility, but if you know you're not gonna need it, then by all means, do it however you like. I'm sure you already know this, but I would be remiss if I didn't point it out...when you add a new site in a Sava install, it copies the "default" directory to create a new site. If you want new sites to have all your tweaks and conventions in place, then it's doing you a favor. If you don't, you can always just download the latest version and replace the new site directory's contents with the default one that ships with Sava....an easy fix.
BTW, the plug-in manager is coming along nicely and you can expect to see it (and documentation for it) and a few plug-ins in the not so distant future.
Thanks for being patient with regards to the developer docs. We know it is far from ideal and are doing our best to get them done.
Thanks again for the post!
- January 28, 2009, 12:37 AM
- Nathan Miller
This is great, worked perfectly.
- January 28, 2009, 9:25 PM
- Joel Richards
Thanks for sharing this. It works great. Your documentation is clear and easy to read. I'm excited to see people developing extensions and plugins for SAVA.
- February 24, 2009, 3:43 PM
- Andrew Duvall
Thanks for posting this. I will share two things that slowed down my installation process.
1. I use jQuery in my Sava install, and the shadowbox gave me issues until I went into the blank.cfm template and added the JS includes for jquery and shadowbox_jquery.js:
<*script type="text/javascript" src="#application.configBean.getContext()#/#request.siteid#/js/jquery/jquery.js"><*/script>
<*script type="text/javascript" src="#application.configBean.getContext()#/#request.siteid#/js/adapter/shadowbox-jquery.cfm"><*/script>
2. I admit I read over step #12 too quickly and was confused when running the page didn't make an XML page pop up. After re-reading step 12, I realized that I just execute the page and can then get the sitemap.xml at the root afterwards.
Anyway, I just posted this to help others if they hit a slight snag like I did.
Thanks and good job.
- March 17, 2009, 11:10 AM
- Andy
Yes, in step #12, you must go to the Google Site Maps page in the admin and click on View Content Version to load it up.
- March 29, 2009, 2:25 PM
- Dan
Great job, works well.
One thing to note, and this might not affect anyone, but we have several clients running on the same install of Sava. After setting this up and changing all of their pages, I could only get it to spit out one full sitemap.xml. All the others were blank. After a little tinkering around, I got it fixed.
I needed to reset the increment var inside the loop over the collection. So, adding <cfset increment = 0/> inside the cfloop fixed it and now all sites are creating the maps correctly.
I am not the best coder in the world, so I may have missed something major / the need for that to be outside the loop, but it seems to have fixed my problems.
Again, fantastic work!
- May 17, 2009, 4:26 PM
- Grant
@Dan...
Thanks for the note on the issue ... The new plugin version has been tested on multi-client setups so that won't be a problem any more ... now I just have to get over to the guys at Mura!
- May 23, 2009, 12:28 AM
- Andy
Hi!
Any updated timeframe on getting this plugin to the Mura folks? Thanks!
- May 28, 2009, 6:02 AM
- Grant
@Andy: this week for sure. We just have to test it with the latest changes in Mura.
- May 31, 2009, 3:49 PM
- Phil
Any update on a release?
- June 11, 2009, 2:06 PM
- Grant
@Phil,
We sent the plugin over to the Mura folks last week, but they're pretty busy working on, well, Mura, so I've posted the Sitemaps plugin on our blog until it appears in the Mura App store.
- June 14, 2009, 1:23 PM