Interesting Facts About Google Search Engine
http://someinterestingfacts.net/interesting-facts-about-google-search-engine/
But how does Google build this index, and how does it populate it with accurate and meaningful results data that will be useful to users? Even Google has to start somewhere, so it uses software programs known as spiders, also commonly referred to as crawlers or Googlebots. These useful little crawlers are sent out initially to the most logical places on the web. If you search for Marmite, the first site the spider most likely visited to compile your search results is www.marmite.com, so no rocket science as yet. This first stage of website search is known as the seed level.
After we pass the seed level we start to branch out. The spiders then crawl further outwards, following links from the initial pages they find, and start to weave a web of interconnected websites that share relevance in terms of content. The spiders build up a pattern of pages linked to pages, which must be recursively revisited in order to ensure they still contain content relating to the original search. Pages are revisited based on frequency policies that are set by software that resides on Google's servers. But what we need to remember is that the web is so vast and changeable that no spider will ever capture all the information out there.
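To make the seed-then-branch-out idea concrete, here is a minimal sketch in Python. It is not Google's crawler: the `LINKS` dictionary is a made-up stand-in for the web, the `example` addresses are hypothetical, and the `crawl` function and its `max_pages` limit are names invented for illustration. It simply visits the seed pages first and then follows links outwards, breadth first.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> pages it links to.
LINKS = {
    "www.marmite.com": ["recipes.example/marmite-toast", "shop.example/spreads"],
    "recipes.example/marmite-toast": ["www.marmite.com", "recipes.example/cheese-toastie"],
    "shop.example/spreads": [],
    "recipes.example/cheese-toastie": ["shop.example/spreads"],
    "unlinked.example/page": [],  # nothing links here, so the spider never finds it
}


def crawl(seeds, links, max_pages=100):
    """Visit the seed pages, then follow links outwards, breadth first."""
    seen = set(seeds)      # pages already queued, so none is visited twice
    queue = deque(seeds)   # pages waiting to be visited
    order = []             # pages in the order the spider reached them
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    for page in crawl(["www.marmite.com"], LINKS):
        print(page)
```

The page that nothing links to is never reached, which is a small-scale version of the point above: no spider captures everything. A real crawler would also fetch pages over the network and revisit them on a schedule, which this sketch leaves out.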
So let's start with a search. Say we want to look up toasted cheese sandwiches. We type in those three words and press Return. Google's query processor software then gets to work filtering through its indexes to decide which links to present. But hang on, what's to stop us getting results on cheese making, results on toaster-buying advice and results on the Earl of Sandwich? Well, Google asks questions. More than 200 questions, in fact. You could say that Google's software uses a little artificial intelligence here, because it tries to apply human logic to the vast lumps of raw data that it has to wade through.
To decide which toasted cheese sandwich website to present to us, Google asks whether the words appear in the website's title or URL. Google asks how many times the words appear in the correct order on any given website. Does the page include synonyms for toasted cheese sandwich, such as grilled Cheddar buttie or hot cheesy panini? Discussing the mechanics of how to describe a toasted cheese sandwich might sound silly, but it's all logical to the guys who run Google's data centre.
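As a rough illustration of how a few of those questions could be turned into a score, here is a toy sketch in Python. The `score_page` function, its parameters and every weight in it are assumptions made up for this example; Google's real questions and weightings are not given in the article. It only covers the three questions mentioned above: words in the title or URL, the phrase appearing in the correct order, and synonyms.

```python
def score_page(query, title, url, body, synonyms=()):
    """Toy relevance score built from three of the questions described above.

    All weights are arbitrary illustrative values, not real ranking factors.
    """
    words = query.lower().split()
    phrase = " ".join(words)
    title_l, url_l, body_l = title.lower(), url.lower(), body.lower()

    score = 0.0
    # Question 1: do the query words appear in the title or the URL?
    score += sum(2.0 for w in words if w in title_l)
    score += sum(1.0 for w in words if w in url_l)
    # Question 2: how many times do the words appear in the correct order?
    score += 3.0 * body_l.count(phrase)
    # Question 3: does the page include synonyms for the query?
    score += sum(1.5 for s in synonyms if s.lower() in body_l)
    return score


if __name__ == "__main__":
    synonyms = ("grilled Cheddar buttie", "hot cheesy panini")
    sandwich_page = score_page(
        "toasted cheese sandwich",
        title="The Perfect Toasted Cheese Sandwich",
        url="food.example/toasted-cheese-sandwich",
        body="A toasted cheese sandwich, or grilled Cheddar buttie, needs good bread. "
             "Every toasted cheese sandwich starts with butter.",
        synonyms=synonyms,
    )
    cheese_making_page = score_page(
        "toasted cheese sandwich",
        title="Cheese Making at Home",
        url="dairy.example/cheese-making",
        body="How to make cheese from raw milk.",
        synonyms=synonyms,
    )
    print(sandwich_page > cheese_making_page)
```

With these made-up weights the sandwich page outscores the cheese-making page, because it matches all three words in the title and URL, repeats the phrase in order, and mentions a synonym, whereas the cheese-making page only matches one word.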
more at link above