XML Sitemaps


john@stinkyink.
23-Jul-2008, 02:54 PM
If you have ever wondered why we bother with generating an XML sitemap, then wonder no more! Have a look at this link for an interesting (well I thought it was interesting - but they all reckon I need to get out more!) anecdote about misusing and then understanding XML Sitemaps (http://blog.stinkyinkshop.co.uk/2008/07/23/sitemap-hell)

jont
23-Jul-2008, 05:23 PM
Scary how a small change can provide disastrous results... funny how it is easier to kill your site than it is to improve it :rolleyes:

combat68
23-Jul-2008, 05:36 PM
Geez, that's scary!

I'm new to all this and working on getting Google to pick up the right pages.

So after a couple of days of putting together robots.txt and an accurate XML sitemap, I'm looking forward to seeing if Google will refresh its indexing where I want it.

Cheers for the info

Rob

Stereo Steve
23-Jul-2008, 06:23 PM
We binned the XML sitemap as it's too easy to get wrong, particularly with the frequency thing. We found google indexed new pages but put them straight into supplemental without indexing the content. They would then stay there for months despite all efforts to link to them from the index or from other sites etc.

Letting google find our new pages naturally from a home page link may take a day or so longer but at least it looks at them properly and most times they go straight to the main index.

I saw no increase in total pages indexed with a sitemap. My advice would be to make sure your internal linking is well structured and not bother. It gives google an excuse to not crawl your site, just index the URL and not bother with the rest.

Buzby
23-Jul-2008, 08:43 PM
John,

Which software do you use to generate your feed?

I use Mole End's mash but it doesn't have any priorities to the urls, just urls.

Kind regards

Jason

grantglendinnin
23-Jul-2008, 09:42 PM
So the moral of the story is....automated is never easy!? I'm looking at the way our site is laid out and now thinking of the nightmare of creating an XML sitemap for the entire site with every product having its own single page:(

john@stinkyink.
24-Jul-2008, 08:44 AM
We were using G-Site crawler, and just accepted the default settings which appeared to be OK. Then Rob wrote his own version which gave us the problem. The learning curve was vicious, but I think that we are better people for a bit of pain!!!!!!!

I'm quite relaxed about it now: once we understood what had happened and how to benefit from it, we could use the knowledge. Not understanding how the Sitemap is being used is the bigger issue, I think. Having set the priorities as we indicated in the Figure, the program now auto-updates at midnight every night and then pings Google to tell it if anything has changed.
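The "update nightly, then tell Google only if something changed" step can be sketched roughly as below. This is not the program described in the post; the function name `publish_if_changed` and the `notify` callback are made up for illustration, and how the search engine actually gets told is deliberately left to the caller.

```python
import hashlib
from pathlib import Path
from typing import Callable


def publish_if_changed(new_xml: str, target: Path, notify: Callable[[str], None]) -> bool:
    """Write new_xml to target and call notify, but only if the content differs."""
    new_bytes = new_xml.encode("utf-8")
    new_digest = hashlib.sha256(new_bytes).hexdigest()
    if target.exists():
        old_digest = hashlib.sha256(target.read_bytes()).hexdigest()
        if old_digest == new_digest:
            return False  # sitemap is identical to last night's, so stay quiet
    target.write_bytes(new_bytes)
    notify(str(target))  # hypothetical hook: resubmit, send an email, write a log line...
    return True
```

A nightly scheduled job would regenerate the XML, pass it to this function, and do nothing further when it returns False.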

Anyway, as Jont said earlier, it is easier to break a website than improve it - but you know us blokes, can't stop tinkering! If it is working fine try to break it!

The thing that scared me was the speed at which it all happened. It was literally dead in hours and, when corrected, fixed in hours, which is not how Google usually works.

Mike Hughes
24-Jul-2008, 09:29 AM
Interesting stuff John. It looks like something I'll leave alone as it's too complex by far.

What are your thoughts on why the change to the 'priority' value affected the serps so much? Do you think Google just doesn't like seeing a value different from its calculations or is that value somehow being used in determining the serps? This kind of implies that there could be scope for adjusting how Google ranks the focus of the website.

Mike

PS. The link to Google Webmaster Guidelines doesn't work.

john@stinkyink.
24-Jul-2008, 09:53 AM
Morning Mike,

I don't think it's complex once you understand that everything is relative to your index page, which is 1. When we dropped the priority of our printer pages to 0.4, Google just re-ranked them (is that a proper word?) as fairly unimportant in the index - we literally just dropped from front page to nowhere. When we put them back to 0.9 we were almost instantly back in their old positions. What I haven't said in the article is that we then (can't resist tinkering) adjusted our product pages to 0.5 to intentionally reduce their importance and increase the importance of the printer pages. That is more difficult to quantify in terms of the difference it has made to our traffic, but we are happy with it at the moment; in fact our Google traffic is increasing at our traditionally quiet time of year.

As we think we understand it, you can't increase the overall importance of your site, but RELATIVE to the importance of your index page you can adjust how Google ranks the rest of your site. That is, I think, where we are at.
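For anyone who has not looked inside one of these files, here is a minimal sketch of a sitemap carrying the priorities described above (index 1.0, printer pages 0.9, product pages 0.5). The URLs are placeholders, not the real site's, and the helper name `build_sitemap` is made up for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, float]]) -> str:
    """Return sitemap XML for a list of (url, priority) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs; only the priority values come from the post above.
pages = [
    ("http://www.example.com/", 1.0),                   # index page
    ("http://www.example.com/printer-page.html", 0.9),  # printer (section) page
    ("http://www.example.com/product-page.html", 0.5),  # individual product page
]
print(build_sitemap(pages))
```

Each `<url>` entry gets a `<loc>` and a `<priority>`; the values only mean something relative to the other pages in the same file.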

I've searched and there is very little information out there about XML sitemaps, so we are offering our experience for what it is worth. If anybody else has any experience of 'Sitemap engineering' (hey, I've invented a new discipline) I would be fascinated to discuss it.

(I've fixed the Google link - thanks)

malbro
24-Jul-2008, 12:32 PM
I found this on Google Webmaster Tools for the Google version of the sitemap generator, which has to run on the web site host and uses the local server logs to get information.

The Sitemap Generator assigns priority to URLs it finds in the logs based on how often each URL is accessed. For instance, a URL that has been accessed 100 times will be given a higher priority than a URL that has been accessed twice. The actual priority assignment is relative and depends on each URL as compared to other URLs in the site.
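The quoted text does not give the actual formula, so the following is only a guess at what "relative priority from access counts" could look like: simple linear scaling against the busiest URL. The function name, the 0.1 lower clamp and the sample hit counts are all assumptions for the sake of the example.

```python
def priorities_from_hits(hits: dict[str, int]) -> dict[str, float]:
    """Scale hit counts so the busiest URL gets 1.0 and the rest a share of it."""
    busiest = max(hits.values(), default=0)
    if busiest == 0:
        # No traffic data at all: fall back to 0.5, the protocol's default priority.
        return {url: 0.5 for url in hits}
    # Clamp at 0.1 so rarely visited pages are not rounded down to 0.0.
    return {url: max(round(count / busiest, 1), 0.1) for url, count in hits.items()}


# Made-up counts echoing the "100 times versus twice" example in the quote.
print(priorities_from_hits({"/busy.html": 100, "/medium.html": 50, "/quiet.html": 2}))
```

With these counts the busy page comes out at 1.0, the medium one at 0.5 and the quiet one at the 0.1 clamp.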

john@stinkyink.
24-Jul-2008, 01:34 PM
That's interesting and bears out what we thought. What was striking about the Google auto generator is that it ranks pages on the amount of traffic, so if you prioritise your pages as we have done, will Google assume that those pages are the most important? That is certainly what we think is happening.

Ghome1971
01-Aug-2008, 08:15 PM
I uploaded a new site map on 28/06 (G Crawler). My business has nose dived over the past few days but I am not sure if it is just the time of year and current market. I have uploaded a mole end site map which is what I had before just in case. I will keep you posted.

Best Wishes

Geraldine

Buzby
01-Aug-2008, 09:51 PM
I uploaded a more in-depth sitemap about a week ago and have seen a large increase in traffic.

I used the mole end feed and then added the extra information using a find and replace option in Dreamweaver, and then edited the priorities individually.

I would not under any circumstances give Google access to my server or logs just as I would not let a competitor look at my order book.

Google can rank my site on content and not how busy my site is.

A restaurant when quiet fills the window area to give the appearance it is busy. They wouldn't hang a sign on the door saying that they are empty, so why actually send that sign to Google by letting them in to view?

Jan
02-Aug-2008, 08:17 AM
That is really interesting John.

I'll be updating the mole end generator to allow additional pages to be defined and to include supplementary pages such as additional info pages. When I do this I will also add the option to define priority, so you can have a default one for most pages but then add a variable to the key ones that assigns a higher one to them. It will be a little while in the making but will be a good addition when it arrives.

Regards,

Darren B
02-Aug-2008, 09:25 AM
Jan, I was thinking about you while reading the thread. Using a variable to set priorities: I like the idea.

John, thanks for an extremely well laid out case study, and even more, thank you for sharing this sort of information.

D

Steve G Griggs
04-Aug-2008, 01:00 PM
Hi John.

Well done on bringing this up. I have never got too involved with sitemaps, but I probably will have to soon, and it is scary :eek: how it affects a website so quickly.
Definitely one to beware of.

john@stinkyink.
04-Aug-2008, 02:01 PM
Hi Steve,

I think once you understand how Google sees the information then you can use it to your benefit; we were just uploading a site map without any thought to what it has the potential to do!!

gabrielcrowe
03-Sep-2008, 04:13 PM
With information from this page in mind, I'm creating (well, already created) a php script for ActSQL, that generates perfect sitemaps, including the priority.

Without any priority data, and lacking manual input, I opted to generate them based on the structure of the catalog.

The code walks back through the sections to count the depth from the root. This number is then used to determine the priority, with a special override for product pages.
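The original script is PHP and is not shown in the thread, so this is only a rough Python sketch of the idea as described: walk up the parent links to count depth, turn depth into a priority, and give product pages a fixed value. The 0.1 step per level, the 0.5 product value and the section names are all invented for illustration.

```python
from typing import Optional


def depth_from_root(section: str, parent_of: dict[str, Optional[str]]) -> int:
    """Walk back up the parent links, counting levels until the root is reached."""
    depth = 0
    while parent_of[section] is not None:
        section = parent_of[section]
        depth += 1
    return depth


def priority_for(section: str, parent_of: dict[str, Optional[str]],
                 is_product: bool = False, step: float = 0.1,
                 product_priority: float = 0.5) -> float:
    """Priority falls by `step` per level; product pages use a fixed override."""
    if is_product:
        return product_priority  # special case: products ignore depth
    depth = depth_from_root(section, parent_of)
    return max(round(1.0 - step * depth, 1), 0.1)


# Hypothetical catalogue: root -> HP -> Photosmart -> C4180
parent_of = {"root": None, "HP": "root", "Photosmart": "HP", "C4180": "Photosmart"}
print(priority_for("C4180", parent_of))                   # three levels below root
print(priority_for("C4180", parent_of, is_product=True))  # product override
```

Changing `step` or `product_priority` is the equivalent of "configuring the sections to different priorities".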

With this in mind, we can configure the sections to different priorities.

Regardless of the depth, what is the rough guide for people who have sites with differing structures? Yours at stinkyink is different from a few others, and not everyone has all major sections leading to a tree of brands, then printers, then inks.

Are products more important? Are the major sections? Or is it more down to whatever gets the most hits in your stats?

malbro
03-Sep-2008, 04:47 PM
I suspect you will get some very different answers to this, but my answer is that products are much more important than sections.

john@stinkyink.
03-Sep-2008, 04:57 PM
Hah, you say that, but we use the final section as our printer page and products are much less important for us because we are optimised within Google as follows:

Manufacturer:Family:Printer:Ink - so for instance search Google for 'hp photosmart c4180 ink' and we are top two and that page is a subsection

malbro
03-Sep-2008, 05:06 PM
As I said I think Gabe will get as many answers to this one as there are possibilities, it depends on what you are selling, how the site is organised, and what you expect your potential customers to search for.

pinbrook
03-Sep-2008, 08:16 PM
With an SPP site I tend to strip all section pages out to leave me with an XML sitemap of SPP pages only. I also favour my SPPs to be:

section
subsection
product
subsection
product

Thus I only have add to cart on the SPPs and lead Google to those pages only.
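Stripping the section pages out amounts to a filter over the page list before the sitemap is written. A minimal sketch, assuming each page is already labelled as a section or a product (the `Page` record and its `kind` field are hypothetical, as are the URLs):

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    kind: str  # hypothetical label: "section" or "product"


def spp_only(pages: list[Page]) -> list[str]:
    """Keep only single-product pages, dropping every section page."""
    return [page.url for page in pages if page.kind == "product"]


pages = [
    Page("http://www.example.com/section.html", "section"),
    Page("http://www.example.com/subsection.html", "section"),
    Page("http://www.example.com/widget.html", "product"),
]
print(spp_only(pages))
```

Only the URLs returned here would then be written into the XML sitemap.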

Darren B
03-Sep-2008, 09:10 PM
im even more confused with bloody xml sitemaps than i already was, i know does not take much but kin ell

Runner
27-Sep-2008, 12:20 PM
Hi all,

Since reading this thread back at the beginning of September, I decided to remove my xml sitemap from google on 3rd Sept. Reason: The number of pages indexed was approx 1/3 of those submitted and was flatlining week on week.

Since removal I can report an increase of pages indexed of 20% and an increase in visits of 17% compared to the period previous to sitemap removal.

This may be coincidence? I will persevere and see what happens....

Duncan Rounding
27-Sep-2008, 12:30 PM
...Since removal I can report an increase of pages indexed of 20% and an increase in visits of 17% compared to the period previous to sitemap removal....

Or perhaps an incorrectly configured sitemap.

Please report back with more details as time goes by.

john@stinkyink.
29-Sep-2008, 08:22 AM
I would be inclined to agree with Duncan; it would be interesting to see how your sitemap was configured. I am no expert though, and I don't think there are many out there who do know the ins and outs of sitemaps. All I have reported is based on our own experience using them.

Runner
29-Sep-2008, 11:02 AM
Just took a look at my last sitemap.

Here are some of the settings that were applied automatically:

PAGE                      PRIORITY
TLD (www.mysite.co.uk)    1.0
index.html                0.8
catalog                   0.8
brochure pages            0.8
Top section               0.8
sub-section1              0.64
sub-section2              0.51
sub-section3              0.51
SPP                       0.51

In all cases the change frequency was set to daily.
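The figures listed above look like a geometric decay: 0.8 multiplied by itself once per level (0.8, 0.64, 0.512), held at 0.51 from then on. That is only an inference from the numbers; the generator's real rule is not stated in the thread. A sketch that reproduces the pattern under that assumption:

```python
def decayed_priority(depth: int, base: float = 0.8, floor: float = 0.51) -> float:
    """Multiply by `base` once per level below the domain root, never going under `floor`."""
    return max(round(base ** depth, 2), floor)


# depth 0 = domain root, 1 = top section, 2 = sub-section1, and so on
for depth in range(5):
    print(depth, decayed_priority(depth))
```

If this guess is right, every page three or more levels down ends up with the same priority, which would explain why the deeper sub-sections and the SPPs are indistinguishable in the list.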

Maybe time to intervene? I was interested in Pinbrook's method of stripping out section pages and just leaving the SPPs in. What do you think?

john@stinkyink.
29-Sep-2008, 11:12 AM
Hi Keith,

Where did you post the sitemap? We've just had a look at your site and also at Google's cache, and they don't list an XML sitemap in the root directory in their cache, which we would more or less expect to still be there.

Runner
01-Oct-2008, 09:10 AM
Hello John,

Thanks for the info via email. I used to ftp my sitemap.xml to the root directory. I see that your site has it in acatalog/.

I have put a modified version, having manually gone through the priorities, in my acatalog/.

You have a link from your page footer. Do you upload the file independently or via Additional files?

Have you found it beneficial to link to it rather than just have it sitting on the server?

I have resubmitted to Google and awaiting results. I will update later.

Regards,

Keith

john@stinkyink.
01-Oct-2008, 09:17 AM
Morning Keith,

The XML Sitemap has to be in your root directory. Ours certainly is. We have linked to it from the bottom of our index page. We upload it independently from Actinic because we generate it independently and every time we update it we also inform Google that we have uploaded a new one in Webmaster tools.
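On the root-versus-acatalog question: as far as the sitemaps.org protocol goes, a sitemap file is only meant to list URLs that sit in its own directory or below, on the same host, which is why the root is the usual recommendation. The check below is a sketch of that rule as I understand it; the function name and the example.com URLs are invented, and search engines may relax the rule in some situations.

```python
import posixpath
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """True if page_url is on the same host and at or below the sitemap's directory."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    base_dir = posixpath.dirname(sitemap.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    return (page.path or "/").startswith(base_dir)


# A sitemap in the root covers the whole site...
print(in_sitemap_scope("http://www.example.com/sitemap.xml",
                       "http://www.example.com/acatalog/product.html"))
# ...but one inside acatalog/ does not cover pages above it.
print(in_sitemap_scope("http://www.example.com/acatalog/sitemap.xml",
                       "http://www.example.com/index.html"))
```

So a sitemap placed in acatalog/ is fine for the catalogue pages themselves, but the home page would fall outside its scope under this reading.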

gabrielcrowe
01-Oct-2008, 09:50 AM
I don't think that its position in the site is important.

It's actually possible to manually upload your sitemap, so it's not even important whether it's there or not. It's only important if you need Google to come and get it regularly, using their scheduled downloads.

I have been working on a set of code that builds the sitemap in Actinic, and uploads this file when you upload your catalog. As of yesterday, it supports change frequency and priority, set at the section level, in the actinic interface.

For those people with less rigid structures (different depths to your sections) this really is the only way we could think of. I did try setting the value of priority relative to the depth of the section, but sometimes that does not work: for example, if you have products directly off a main section, and then a section 4 deep with SPP products in it. The only choice was to set the priority on a per-page basis. Yes, it is more time consuming, but ultimately it gives more control over what is more important to you.
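One way to picture explicit settings replacing the depth rule is a lookup table keyed by section path. The sketch below is not the Actinic code being described; in particular, letting a section inherit from its nearest configured ancestor is my own assumption to keep the table short, and the section names and values are invented.

```python
def section_priority(path: tuple[str, ...],
                     overrides: dict[tuple[str, ...], float],
                     default: float = 0.5) -> float:
    """Use the nearest explicitly configured ancestor, otherwise the default."""
    while path:
        if path in overrides:
            return overrides[path]
        path = path[:-1]  # not configured here, so try the parent section
    return overrides.get((), default)


# Hypothetical settings: one shallow section and one four-deep section set by hand.
overrides = {
    ("Printers",): 0.9,
    ("Printers", "HP", "C4180", "Inks"): 0.5,
}
print(section_priority(("Printers", "HP"), overrides))                   # inherits from Printers
print(section_priority(("Printers", "HP", "C4180", "Inks"), overrides))  # set explicitly
print(section_priority(("Paper",), overrides))                           # nothing set, default
```

Depth plays no part here, so products hanging directly off a main section and products four levels down can be given whatever values suit the site.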

As the sites show changes, I'll certainly let you all know how we're getting on.

Runner
01-Oct-2008, 11:21 AM
Thanks for the replies.

Sorry John, I must have been looking at your actinic sitemap link.

GAViN™©
13-Nov-2008, 10:38 AM
Have read this post with interest, and saw Jan's post about updating the google sitemap generator plugin.

Can anyone confirm if this has been done yet?

GAViN™©
13-Nov-2008, 10:51 AM
Cheers. Maybe a good idea to upload a sitemap on the website.

Can anyone tell me the template file I need to modify in order to make the sitemap.html file easier on the eye? I find the standard sitemap on the website to look a bit naff.

Jan
13-Nov-2008, 10:57 AM
It hasn't been done yet; it's still on the list, though.

Regards,

GAViN™©
13-Nov-2008, 11:47 AM
Thanks Jan.

Early next year? :D

GAViN™©
18-Nov-2008, 04:30 PM
For anyone interested, Jan dropped me an e-mail and informed me this was on the "to do" list, and since this is something she herself requires for her own site, that *might* help push this up to the top of the pile :D

Jan
27-Nov-2008, 08:20 AM
Hi Chaps,

Just a note to let you know that the changes I have been talking about in here for my free sitemap generator program are now available in beta. The changes are:

Additional information pages are now included in the listing
Priority can be specified for each page
Change frequency can be specified for each page
You can define a list of extra pages to include in the sitemap
A couple of bug fixes
Improved interface for user defined variable selection.
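Once priority and change frequency can be set per page, it becomes easier to type in a value the protocol does not allow. The check below is a small sketch and has nothing to do with the generator itself: the allowed change-frequency words and the 0.0 to 1.0 priority range come from the sitemaps.org protocol, while the function name and message wording are made up.

```python
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_entry(url: str, priority: float, changefreq: str) -> list[str]:
    """Return a list of problems found in one sitemap entry (empty if it looks fine)."""
    problems = []
    if not url.startswith(("http://", "https://")):
        problems.append(f"{url}: loc should be a full URL including the protocol")
    if not 0.0 <= priority <= 1.0:
        problems.append(f"{url}: priority {priority} is outside 0.0 to 1.0")
    if changefreq not in VALID_CHANGEFREQ:
        problems.append(f"{url}: changefreq '{changefreq}' is not a recognised value")
    return problems


print(check_entry("http://www.example.com/", 1.0, "daily"))
print(check_entry("www.example.com/page.html", 1.5, "fortnightly"))
```

Running every entry through a check like this before uploading catches the easy mistakes, although it says nothing about whether the chosen priorities are sensible.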

The beta is available here if you would like to try it:

http://www.mole-end.biz/download/ME_FroogleMash251108.exe

Regards,

GAViN™©
27-Nov-2008, 08:36 AM
Great stuff Jan, will be installing this beta version this morning. :)

Pilgrim
20-Dec-2008, 09:24 PM
Thanks John for your thread, which I first read in September when I was looking for XML sitemapping software for one of my sites. I recently made some changes and noticed a drop-off in visitors, so I double-checked Google and found I had lost links etc., and quickly relocated your article. So thanks John, my site is back up in Google, and I gained some extra links to boot!

www.mctooling.com
www.astonlee.com

People remember the quality of your products, long after they have forgotten the price they paid!

Sean Williams
10-Jan-2009, 07:00 PM
Just found this thread - many thanks for sharing this John :)
It's been a couple of months now - have any of you that have changed your xml sitemaps seen any noticeable changes?

Gabe - do you have an update?

I must say that Pinbrook's idea of stripping out everything but the SPPs and Index sounds appealing.

TIA