Understanding Google Crawlers
Most website owners understand that Google has “spiders” that crawl the web in search of the latest content. We all want the spiders to find our content in the hopes that our site will rank as well as possible. But how does Google perform this crawling and what can we do to optimize our content to help with rankings? Google has two types of spiders it uses to index content on the web. Having a better understanding of how they work will help us make better content decisions.
The Shallow Crawl
The first spider is fast and doesn’t look very closely at a site. It does a quick scan of a page or two and moves on. While only Google knows the exact method it uses, common sense tells us that if the shallow crawl finds significant changes, it may signal to the deep crawler that the site is worth a closer look. You can imagine that if the shallow crawl finds something new, your site might move higher on the list of sites to deep crawl. Conversely, if the shallow crawl finds nothing new every time, it might mark the site as a relatively low priority.
This makes sense – there are a lot of websites out there and for Google to be effective, it wants to index the latest and greatest content first, so why spend valuable resources on a site that doesn’t seem to have changed?
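To make that reasoning concrete, here is a minimal Python sketch of how a quick scan could feed a deep-crawl priority list. It only illustrates the idea; Google's actual method is not public, and every name here (fingerprint, shallow_visit, the priority score) is a hypothetical invented for the example.

```python
import hashlib
from typing import Dict, Optional

def fingerprint(page_text: str) -> str:
    """Return a stable hash of the page content, used only to detect change."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def update_priority(priority: int, old_fp: Optional[str], new_fp: str) -> int:
    """Raise the score when content is new or changed; lower it (never below 0) otherwise."""
    if old_fp is None or old_fp != new_fp:
        return priority + 1
    return max(0, priority - 1)

# Hypothetical state a shallow crawler might keep per site (an assumption).
last_seen: Dict[str, str] = {}   # site -> fingerprint from the previous visit
priority: Dict[str, int] = {}    # site -> deep-crawl priority score

def shallow_visit(site: str, page_text: str) -> int:
    """Record one quick scan of a site and return its updated priority score."""
    new_fp = fingerprint(page_text)
    priority[site] = update_priority(priority.get(site, 0), last_seen.get(site), new_fp)
    last_seen[site] = new_fp
    return priority[site]
```

In this toy model, a site whose scanned page changes on every visit keeps climbing the list, while a site that never changes drifts back to zero.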
The Deep Crawl
This spider is thorough and spends a lot of time analyzing most if not all of the content on a given site. It looks at all pages, tags, images and more. This is the one that can affect your page ranking, so we want our sites to be well-formatted and presentable to this crawler. We need well-written content, a proper sprinkling of important keywords, relevant links and so on. Google collects all this data, stores it in a giant database, applies its secret sauce and voila! Out pops a page ranking when someone searches for something relevant to your content.
While Google’s “secret sauce” is very complex, evaluating hundreds of different site elements, we as site owners can think of things in more general terms. Google has little interest in you and your business specifically. Its focus is on providing relevant, up-to-date search results to anyone using the Google search engine. To find the latest content as quickly as possible, it uses the Shallow Crawler, which lets it visit as many sites as possible in a relatively short time. But it must also use the Deep Crawler to understand the overall scope of a site in order to provide relevant results.
So we must get Google’s attention with our newest content, and we want it to know that we have new content on a regular basis. If you were the fast Shallow Crawler, where would you look? There are two possibilities here: the home page, and any other page or pages known to change often. We may not know exactly how Google does it, but we can hedge our bets by updating the home page often in addition to any other sub-pages. If you have a blog, be sure to add a “Latest Post” feature on the home page. If you’ve added a new product or service, it’s a good idea to have some mention on the home page as well.
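As a rough sketch of the “Latest Post” idea, the Python below picks the newest posts and formats them as lines for the home page. The post structure (a title and a published date) and the function names are assumptions made for illustration; a real blog platform will have its own way of exposing this.

```python
from datetime import date

def latest_posts(posts, limit=3):
    """Return at most `limit` posts, newest first."""
    return sorted(posts, key=lambda p: p["published"], reverse=True)[:limit]

def render_latest(posts, limit=3):
    """Format the newest posts as plain-text lines for a home page block."""
    lines = [
        f'{p["published"].isoformat()}  {p["title"]}'
        for p in latest_posts(posts, limit)
    ]
    return "\n".join(lines)

# Hypothetical example data, not taken from any real site.
example_posts = [
    {"title": "Spring sale", "published": date(2024, 3, 1)},
    {"title": "New product line", "published": date(2024, 4, 15)},
    {"title": "Welcome", "published": date(2024, 1, 10)},
]

if __name__ == "__main__":
    print(render_latest(example_posts, limit=2))
```

Because the block is rebuilt from the post list, the home page changes every time something new is published, with no extra manual edits.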
When it comes to Google rankings, change is good and frequent change is better.