Sitemap Generator – Exclude Patterns
Sitemap Generator — how to exclude pages from sitemap…
Since version 0.95 of Sitemap Generator you can set “exclude patterns”. Exclude patterns look like this:
In the above example we tell the crawler to avoid any page whose URL contains the text “blog/” or “freeware”. In that case pages like “http://wonderwebware.com/blog/index.html” or “/freeware/index.html” will not be added to the sitemap. The asterisk ( * ) tells the crawler that any text, or no text at all, may appear in that position, so if the pattern is *blog/* then all of the following will be excluded:
Note that you can also write the exclude pattern in a different form, for example:
In this case, however, a link that appears in an HTML page as “/nofollow/program.html” will not be skipped, because the exclude pattern requires the full domain URL to be present in the link. So, to exclude all pages in the /nofollow/ folder, you should use a different pattern:
Now, no matter how the link is written in the HTML page, it will be skipped if it contains the text “nofollow/”.
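
The matching behaviour described above can be sketched in a few lines of Python. This is only an illustration, not Sitemap Generator's actual code: the helper names `wildcard_to_regex` and `is_excluded` are hypothetical, the domain-prefixed pattern `http://wonderwebware.com/nofollow/*` and the sample links are assumed for the example, and the sketch assumes case-sensitive matching in which the asterisk is the only special character.

```python
import re
from typing import Iterable


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn an exclude pattern into a regex; '*' means any text or none."""
    # Escape everything literally, then let each '*' match any run of characters.
    parts = (re.escape(piece) for piece in pattern.split("*"))
    return re.compile(".*".join(parts), re.DOTALL)


def is_excluded(link: str, patterns: Iterable[str]) -> bool:
    """Return True if the link, exactly as written in the page, fits any pattern."""
    return any(wildcard_to_regex(p).fullmatch(link) for p in patterns)


# Assumed patterns, modelled on the text above.
generic = ["*blog/*", "*freeware*"]
domain_only = ["http://wonderwebware.com/nofollow/*"]  # hypothetical full-URL form
folder_any = ["*nofollow/*"]

for link in [
    "http://wonderwebware.com/blog/index.html",
    "/freeware/index.html",
    "/nofollow/program.html",
]:
    print(link,
          is_excluded(link, generic),
          is_excluded(link, domain_only),
          is_excluded(link, folder_any))
```

In this sketch the relative link “/nofollow/program.html” does not fit the domain-prefixed pattern but does fit `*nofollow/*`, which is the reason the wildcard-only form is the safer choice when links may be written either way.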