SEO at a Glance – An introduction, Parsing, Crawling and Indexing

February 2, 2009

So I thought I would do a quick overview of Search Engine Optimisation in no more than 6 posts. This is for my own benefit as well. I will look back at this next year when I come to do the same thing and think: what the hell was I talking about? It's all about video and social … f*ck SEO!!! But for now, I'll continue …

How that bot views your page

The Google bot basically skims over a page doing the following:

a. Strip out the JavaScript
b. Strip out the formatting
c. Strip out the iframes (YES, they don't help bots)
d. Keep those all-important meaningful tags (title, meta, h1-h6, img) etc.
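The steps above can be sketched with Python's standard-library HTML parser. This is my own toy illustration, not what Googlebot actually runs: it skips `script`/`iframe` content, ignores formatting tags like `b`, and keeps text from the "meaningful" tags (plus `img` alt text and `meta` descriptions).

```python
from html.parser import HTMLParser

MEANINGFUL = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}
SKIPPED = {"script", "iframe", "style"}

class PageStripper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside <script>/<iframe>/<style>
        self.tagged = []      # (tag, text) pairs from meaningful tags
        self.text = []        # remaining visible words
        self.current = None   # meaningful tag we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif tag in MEANINGFUL:
            self.current = tag
        elif tag == "img":                     # keep image alt text
            alt = dict(attrs).get("alt")
            if alt:
                self.tagged.append(("img", alt))
        elif tag == "meta":                    # keep description/keywords
            a = dict(attrs)
            if a.get("name") in ("description", "keywords"):
                self.tagged.append(("meta", a.get("content", "")))

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth:
            self.skip_depth -= 1
        elif tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.skip_depth or not data.strip():
            return                             # inside a stripped tag
        if self.current:
            self.tagged.append((self.current, data.strip()))
        else:
            self.text.append(data.strip())

p = PageStripper()
p.feed('<html><head><title>Blue Widgets</title>'
       '<script>var x=1;</script></head>'
       '<body><h1>Buy Blue Widgets</h1>'
       '<iframe src="ad.html">junk</iframe>'
       '<p>Cheap <b>widgets</b> here.</p></body></html>')
print(p.tagged)  # [('title', 'Blue Widgets'), ('h1', 'Buy Blue Widgets')]
print(p.text)    # ['Cheap', 'widgets', 'here.']
```

Notice the script body and iframe content never make it through, while the `<b>` formatting tag is simply ignored and its text kept.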

Your words are stored in individual index buckets. A weighting algorithm is applied to give your page an overall score. Links on the page are also indexed separately. By looking at your links in and links out, Google calculates your all-important PageRank (although it is not important to your keyword rank; that's right, explanation in a future post).
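The "links in and links out" idea can be sketched with a toy power-iteration PageRank over a tiny made-up link graph. The page names and the damping factor of 0.85 are just the textbook defaults, not anything Google has confirmed:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {}
        for p in pages:
            # Rank flowing into p: each linker shares its rank equally
            # among its outgoing links.
            inflow = sum(rank[q] / len(links[q])
                         for q in pages if p in links[q])
            new[p] = (1 - damping) / n + damping * inflow
        rank = new
    return rank

graph = {
    "home":  ["about", "blog"],
    "about": ["home"],
    "blog":  ["home", "about"],
}
ranks = pagerank(graph)
# "home" ends up highest because every other page links to it.
```

The point of the sketch: rank is a property of the link graph, computed independently of any keyword, which is why PageRank and keyword relevance are separate things.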

So a search engine doesn't rank your site. It ranks your web page. If you are using generic page titles throughout your site, then you are missing opportunities to promote your business under different keywords.

Retrieving and Ranking your Page

When a user does a search, the following happens, in lightning-quick time:

a. The top 1000 results are retrieved from Google's index (if you are not in that index, you have zero chance of ranking).
b. Those pages are ranked by relevance / importance.
c. The list is reordered, taking into account a bunch of other factors (user behaviour). If you want to know what they are, I suggest reading some of the white papers …
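Those three stages boil down to retrieve, rank, re-rank. Here is a minimal sketch of that flow; the index layout, score field, and behaviour-boost table are all placeholders I made up for illustration:

```python
def search(query, index, behaviour_boost):
    """Toy two-stage search: retrieve candidates, rank, then reorder."""
    # Stage a: pull candidate pages for the query term (cap at 1000).
    candidates = index.get(query, [])[:1000]
    # Stage b: rank by a relevance/importance score.
    ranked = sorted(candidates, key=lambda page: page["score"],
                    reverse=True)
    # Stage c: reorder using extra signals, e.g. user-behaviour data.
    return sorted(ranked,
                  key=lambda p: p["score"] * behaviour_boost.get(p["url"], 1.0),
                  reverse=True)

index = {"widgets": [{"url": "a", "score": 0.9},
                     {"url": "b", "score": 0.5}]}
results = search("widgets", index, {"b": 2.0})
print([r["url"] for r in results])  # ['b', 'a']
```

Note how a weaker page ("b") can leapfrog a stronger one once the behaviour signal is applied in the final reorder.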

Getting rid of the duplicate content

Google tries its best to filter out duplicate content. This is the same (or almost the same) content appearing at more than one URL. Google's whole mantra revolves around user experience, so if you are producing duplicate content, there is a good chance you will be buried under "Omitted Results". It was often thought you were banished to a supplemental index, but sometime in 2007 Google confirmed there is no supplemental index. So just know: duplicate content should be avoided.
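One common way to spot near-duplicates (Google's actual filter isn't public, so treat this purely as an illustration of the idea) is to break each page into overlapping word "shingles" and compare the sets with Jaccard similarity:

```python
def shingles(text, k=3):
    """Return the set of k-word shingles from a piece of text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Similarity of two sets: shared items over all items."""
    return len(a & b) / len(a | b)

page1 = "cheap blue widgets for sale buy now"
page2 = "cheap blue widgets for sale order today"
sim = jaccard(shingles(page1), shingles(page2))
print(round(sim, 2))  # 0.43 -- high overlap, likely near-duplicates
```

Two pages with a similarity above some threshold would be collapsed to one result, with the rest filed under "Omitted Results".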

How are the 1000 results sorted?

Google basically looks at a number of things when sorting those 1000 results. The essentials are:

a. On page “optimization”
b. Link reputation
c. Authority / PageRank
d. Trust Factors
e. User behavior
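Conceptually the sorting is a weighted blend of those five factors. The weights and per-page signal values below are entirely made up; nobody outside Google knows the real ones:

```python
# Hypothetical weights for the five factors listed above.
WEIGHTS = {
    "on_page": 0.30,
    "link_reputation": 0.25,
    "authority": 0.20,
    "trust": 0.15,
    "user_behaviour": 0.10,
}

def blended_score(signals):
    """Combine per-factor signals (0..1) into one sortable score."""
    return sum(WEIGHTS[k] * signals.get(k, 0.0) for k in WEIGHTS)

page = {"on_page": 0.8, "link_reputation": 0.6, "authority": 0.5,
        "trust": 0.9, "user_behaviour": 0.4}
score = blended_score(page)
print(score)  # 0.665
```

The takeaway is that a strong showing in one factor (say, on-page optimisation) can only take you so far; the final sort rewards pages that do reasonably well across all of them.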

Check out the next post in this series, Keyword Research and Discovery
