Sitemap Generator – Exclude Patterns

Sitemap Generator — how to exclude pages from sitemap…

 

Since version 0.95 of Sitemap Generator you can set “exclude patterns”. Exclude patterns look like this:

*blog/*

*freeware*

In the above example we tell the crawler to avoid any page whose URL contains the text “blog/” or “freeware”. Pages like “http://wonderwebware.com/blog/index.html” or “/freeware/index.html” will therefore not be added to the sitemap. The asterisk ( * ) tells the crawler that any text, or nothing at all, may appear in that position, so the pattern *blog/* excludes all of the following:

www.mysite.com/blog/index.html

mysite.com/blog/page.html

/blog/index.html

/blog/

blog/
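The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not Sitemap Generator's actual code: the function names pattern_to_regex and is_excluded are hypothetical, and the sketch assumes that the asterisk is the only special character and that matching is case-sensitive.

```python
import re

def pattern_to_regex(pattern):
    # Assumption: "*" is the only wildcard; every other character is literal.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    # Each "*" becomes ".*", i.e. "anything or nothing".
    return re.compile(".*".join(literal_parts), re.DOTALL)

def is_excluded(url, patterns):
    # A URL is excluded if at least one pattern matches the whole URL text.
    return any(pattern_to_regex(p).fullmatch(url) for p in patterns)

patterns = ["*blog/*", "*freeware*"]
urls = [
    "www.mysite.com/blog/index.html",
    "mysite.com/blog/page.html",
    "/blog/index.html",
    "/blog/",
    "blog/",
    "/about.html",
]
for url in urls:
    print(url, is_excluded(url, patterns))
```

Under this reading every URL in the list above except /about.html is reported as excluded. A URL such as /myblog/post.html would also match *blog/*, because the asterisk may stand for any text.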

Note that you can also write the exclude pattern in a different form, for example:

http://mysite.com/nofollow*

but in this case a link written in the HTML page as “/nofollow/program.html” will not be skipped, because the exclude pattern requires the full domain URL to be present in the link itself. So, to exclude all pages in the /nofollow/ folder, you should use a different pattern:

*nofollow/*

Now, no matter how the link is written in the HTML page, it will be skipped if it contains the text “nofollow/”.
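The difference between the two patterns can be checked with a small Python sketch. It uses the standard library's fnmatchcase as a stand-in for the crawler's matcher, which is an assumption: fnmatch also treats ? and [ ] as special characters, and Sitemap Generator may not. The helper name skipped is hypothetical, and the links are compared exactly as they are written in the page.

```python
from fnmatch import fnmatchcase

# The same page, linked in two different ways inside the HTML.
links = [
    "http://mysite.com/nofollow/program.html",  # absolute link
    "/nofollow/program.html",                   # root-relative link
]

def skipped(link, pattern):
    # Stand-in for the crawler's check: compare the link text as written.
    return fnmatchcase(link, pattern)

full_domain = [skipped(link, "http://mysite.com/nofollow*") for link in links]
wildcard = [skipped(link, "*nofollow/*") for link in links]

print(full_domain)  # [True, False]
print(wildcard)     # [True, True]
```

The full-domain pattern catches only the absolute link, while the pattern with a leading asterisk catches both forms.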
