Session IDs and tracking

> Don't confuse the bot

Updated: October 16th 2015


"Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page."

- from the Google webmaster guidelines 1

Don't confuse Googlebot

The use of session IDs and other tracking methods may cause your site to be indexed improperly, or not at all. Bots like Googlebot may interpret one page with different tracking arguments or session IDs as several different pages, and may not be able to eliminate the redundant URLs. The result can be duplicate copies of your web page in the index.

Most websites do not use session IDs. If you do not know what they are, you probably don't need to worry about this guideline. If your site does not use session IDs or tracking arguments in its URLs, this guideline does not apply to you.

This guideline is one of several guidelines that emphasize that your site needs to be search engine crawler friendly.

Session IDs can cause several problems with how a website is indexed and ranked in search engines. Here are a couple of examples:

Duplicate content

One result of session IDs is duplicate content.

If you have a webpage about dogs, the URL might be...

If you are using session IDs, when a user goes to that page it results in the user seeing a page with a URL of...

This does not really affect the user, but to a search engine crawler it may (and usually does) appear to be a different page.
A search engine crawler might index a thousand different versions of your "dog" page - even though the only difference is the URL and the session ID at the end of it.
This can result in a search engine indexing multiple copies of what is in reality just one page.
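To make the duplication concrete, here is a minimal Python sketch of how several session URLs collapse to one page once the session argument is removed. The `www.example.com/dogs.html` address, the `SESSION_PARAMS` list, and the `strip_session_params` helper are all assumptions made up for illustration. This is not how Googlebot works internally.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; adjust for your own site.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}

def strip_session_params(url: str) -> str:
    """Return the URL with any session ID query arguments removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Three URLs a crawler might collect for the same (hypothetical) page.
crawled = [
    "http://www.example.com/dogs.html",
    "http://www.example.com/dogs.html?PHPSESSID=01elm211k6",
    "http://www.example.com/dogs.html?PHPSESSID=9x8y7z",
]

# Once the session argument is gone, only one distinct page remains.
unique_pages = {strip_session_params(url) for url in crawled}
```

A crawler that cannot perform this kind of cleanup sees three pages here instead of one, which is the problem the guideline warns about.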

If your website has multiple copies of one page, it is not following the guidelines. The duplicates may look like an attempt to manipulate search engine results and can be deemed "spam".

URLs longer than 255 bytes

Session IDs can make a URL longer than 255 bytes. The HTTP/1.1 specification (RFC 2616) cautions against depending on URLs longer than 255 bytes, because some older clients and proxies may not handle them properly.
This can result in a Google crawl error: "URLs not followed / Redirect URL too long".
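A quick way to see whether a session ID pushes a URL past that size is to measure it in bytes. The Python sketch below is illustrative only: the `MAX_URL_BYTES` constant, the helper names, and the example URLs are assumptions, and the 250-character session ID is deliberately exaggerated.

```python
MAX_URL_BYTES = 255  # the limit discussed above

def url_byte_length(url: str) -> int:
    """Length of the URL in bytes when encoded as UTF-8."""
    return len(url.encode("utf-8"))

def exceeds_limit(url: str, limit: int = MAX_URL_BYTES) -> bool:
    """True if the URL is longer than the given byte limit."""
    return url_byte_length(url) > limit

# Hypothetical page URL, with and without a very long session ID.
base_url = "http://www.example.com/dogs.html"
session_url = base_url + "?PHPSESSID=" + "a" * 250

short_ok = not exceeds_limit(base_url)    # the plain URL is well under the limit
long_flagged = exceeds_limit(session_url)  # the session URL is over it
```

Running a check like this over the URLs your site actually generates can show whether session arguments are the cause of overly long URLs.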

Key Points

- The use of session IDs and excessive arguments may confuse bots and cause your site to be poorly indexed.
- Bots may not recognize that different URLs point to the same content, which may result in duplicate content.

by Patrick Sexton