Comparing Google and MSN Search
(Apr. 7, 2005) Update: Some of the deficiencies mentioned here have been fixed by MSN; I will mark those out as and when I confirm the fixes.
Feb. 23, 2005
Google is the undisputed heavyweight champion of search engines. Even though it was a latecomer to the search arena, superior technology and elegant design have made it numero uno. As expected, Microsoft has responded to this phenomenon by unleashing their own Mothra to Google's Godzilla (you didn't think the wings on MSN Search were a coincidence, did you?). Seeing that MSN Search (hereafter referred to as MSN) has Google squarely in its sights, it is interesting to see how far MSN has come, and how close it is to catching up to (or even surpassing) Google. Having suddenly found myself with a lot of spare time, I decided to put it to good use by objectively comparing the two search engines using the following criteria:
Methodology: I used a P4 2.8 GHz machine with 256 MB of memory, running WinXP and the Firefox browser. I used the same browser for both sites, to avoid any browser bias. The WinXP system was patched with the latest patches from Microsoft. This machine is connected to a Linux box (my main workstation) which is also the firewall, and I used tcpdump on the Linux box to capture the packets flying about, to get accurate size and timing results. There was nothing else of significance running on the WinXP machine; it had been freshly rebooted. Windows firewall was active, but Firefox was allowed unfettered access to the 'net. Also, a few queries were run on both engines to "prime" them, so to speak (resolve DNS queries, cache images, etc.).
Before going any further, please keep in mind the following disclaimers: I have no relationship with Google and/or Microsoft (or MSN); and while it is possible to nitpick problems with this little "study", this is not a PhD dissertation and should be treated just like any other such study you would find on the Internet.
But before we jump into monitoring the network traffic, we must remove the network delay from our calculations; after all, it isn't the search engine's (SE's) fault if I'm sitting in Hicksville USA on the wrong end of a dialup line (with due apologies to those of you who are on the wrong end of a dialup line).
Ping times:
C:\>ping -n 10 search.msn.com
Pinging a134.g.akamai.net [209.18.34.71] with 32 bytes of data:
Approximate round trip times in milli-seconds:
Minimum = 46ms, Maximum = 107ms, Average = 53ms
C:\>ping -n 10 www.google.com
Pinging www.google.akadns.net [64.233.161.99] with 32 bytes of data:
Approximate round trip times in milli-seconds:
Minimum = 47ms, Maximum = 89ms, Average = 55ms
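To put the two ping summaries side by side, here is a minimal Python sketch that parses the summary lines and reports the gap between the average round-trip times. The function name `parse_ping_summary` and the regular expression are my own illustrative assumptions, not part of the original test setup; the two input strings are copied from the ping output above.

```python
import re

# Matches the summary line printed by the Windows ping command,
# e.g. "Minimum = 46ms, Maximum = 107ms, Average = 53ms".
SUMMARY = re.compile(r"Minimum = (\d+)ms, Maximum = (\d+)ms, Average = (\d+)ms")


def parse_ping_summary(line: str) -> dict:
    """Return the min/max/avg round-trip times (in ms) from a ping summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a ping summary line: {line!r}")
    minimum, maximum, average = (int(group) for group in match.groups())
    return {"min": minimum, "max": maximum, "avg": average}


# Summary lines copied from the ping output above.
msn = parse_ping_summary("Minimum = 46ms, Maximum = 107ms, Average = 53ms")
google = parse_ping_summary("Minimum = 47ms, Maximum = 89ms, Average = 55ms")

# Difference between the two average round-trip times, in milliseconds.
gap_ms = abs(msn["avg"] - google["avg"])
print(f"MSN avg: {msn['avg']} ms, Google avg: {google['avg']} ms, gap: {gap_ms} ms")
```

With these numbers the averages differ by only 2 ms, which suggests that network delay affects both engines about equally from this machine.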
Actual tests: First, I queried both the engines with randomly generated pairs of words. The idea was to see how long the engines take for some really rare search terms: usually these resulted in few or no matches. The time taken (the time difference between the sending of the first packet and the sending of the last ACK on delivery of the search results) is shown below, along with the terms.
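The timing rule described above (first packet sent to last ACK sent) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Packet` type, the `query_duration` function and the timestamps in `sample_capture` are hypothetical, and turning real tcpdump output into such records is not shown.

```python
from typing import NamedTuple, Sequence


class Packet(NamedTuple):
    """One captured packet (hypothetical record, not tcpdump's own format)."""
    timestamp: float  # capture time in seconds
    outbound: bool    # True if sent by the browser machine
    is_ack: bool      # True if the TCP ACK flag is set


def query_duration(packets: Sequence[Packet]) -> float:
    """Seconds between the first packet sent and the last ACK sent."""
    sent = [p for p in packets if p.outbound]
    if not sent:
        raise ValueError("no outbound packets in capture")
    acks = [p.timestamp for p in sent if p.is_ack]
    if not acks:
        raise ValueError("no outbound ACK in capture")
    first_sent = min(p.timestamp for p in sent)
    return max(acks) - first_sent


# Made-up timestamps for illustration only; not measured data.
sample_capture = [
    Packet(0.0, True, False),     # request leaves the browser machine
    Packet(0.0625, False, True),  # server acknowledges
    Packet(0.25, False, False),   # results arrive
    Packet(0.5, True, True),      # browser sends the final ACK
]
duration = query_duration(sample_capture)
```

Measuring up to the last outbound ACK means the figure includes the delivery of the complete results page, not just the server's first response.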
Next, we move to the other end of the spectrum and query the engines with really popular terms. Since Ms. Hilton seems to be quite popular these days, her name was the first term used. The others were chosen for similar reasons ("Apache" is a popular server and many sites have the server's name at the bottom; "contact us" is also in many pages, hence "contact" was chosen).
Number of hits
This time, we are looking at how many matches the SE reported. Of course, there's no way to verify this number; the SE can report whatever number it pleases, but I'm hoping the SEers won't resort to such shenanigans.
For this category, I will just refer to the "Paris Hilton" table above (sidetrack: it is quite interesting to see how Google vanquishes MSN in the number of matches for "Paris Hilton". Since she's a recent "pop" on the scene, my guess is that Google's robots are quicker at trawling for information than MSN's).
Not to be discouraged by this amazing bit of insight, I did several queries with terms derived from emails I have received seeking out information. The idea was to see if the same terms, when given to an SE, would result in hits that might have answered the original question. Most often, the sites returned by both the engines were similar and ranked similarly. But there were a few exceptions, and these are given below.
Query: how to scan pictures
For Google, these are the top results:
Scanning Basics 101 - All about digital images
A few scanning tips. by Wayne Fulton. The purpose is to offer some scanning tips and to explain the basics for photos and documents. ... Scanning for Beginners or Basic Scanning Techniques ... I'd even seen it done. Crisp, clean scans that looked as good as the original photos. ... Unless otherwise noted, all photos on this site are displayed as scanned. ... HOW DO I SCAN PICTURES TAKEN WITH FILM CAMERA WITH MY HOME... ... ...SCAN PICTURES TAKEN WITH FILM CAMERA WITH MY HOME... ... how do i scan pictures with a printer/scanner. Your answer will be published for anyone to see and rate. ... Scanning Pictures ... and then select Adobe Photoshop 6.0, as shown in the picture below ... photos, black and white vs ... The preview button is used to scan a preview of the document in the ... Windows XP and Digital Photography: Printing and Scanning Pictures ... How to insert digital photos into Word to customize their printing layout. ... like home printing and online photo sharing to pictures you shot ... Scanning Line Art ...
and for MSN, these are the first 5:
Trend Micro - Free online virus Scan
... Ease your mind and scan your PC for viruses. Scan Now. It's Free! Trend Micro. Mobile Security: The integrated solution provides automatic, real-time scanning to protect wireless devices against ... Computed Tomography (CT) Scan ... scan . The dye makes blood vessels and other structures or organs more visible on the CT scan pictures . The dye may be used to evaluate blood flow, detect tumors, and locate areas of ... BinaryPhotography.com - powered by vBulletin ... What do you scan pictures with? Never 2 6 Printers Discuss regular, color laser, and photo printers Never 1 6 Etc. Tripods, Camera bags, Memory cards, Filters, Batteries, Etc. Never 3 9 Software for manipulating ... Big Lottery Fund Access key list, click here to skip accessibility, alt+0. .home, alt+1, The big lottery fund, alt+l. Newsroom, alt+r. Consultation forum, alt+c. Funding Programmes.alt+p Click here to read access keys ... How To Scan Pictures and Prepare Them for the Web How To Scan Pictures and Prepare Them for the Web by Sheryl Cormicle Knox with adaptations for use at Lapeer County Library by Victor P. Illian Note: This document describes the process of ...
I was indeed quite surprised to see links to an anti-virus site, a CATscan site and a Lottery site (???) in the top 5 for this search.
Another search that returned less-than-stellar results was the question, "do I need more memory" (without the quotes). First, Google reports 25,900,000 matches to MSN's 1,180,365 (a 22x difference). But 4 of the 9 results returned by MSN (yes, it returns 9 on the first page) all point to the same site, kingston.com. While I agree that Kingston are purveyors of some fine memory, I am sure there are other sites which could have answered the question better.
The query for "sony vaio laptop" would seem fairly straightforward;
but this was not to be. On Google, the first result is to Sony's site. However, on MSN, the first two results are to
your run-of-the-mill "get free laptop"
Verdict? Google is better; while most of the searches will fare
equally well on both engines, Google's results seem more relevant in
some cases.
In other words, we are interested in seeing how well the clusters of information are represented in the search results. For a better view of this clustering phenomenon, I would recommend a quick excursion to Clusty the clustering engine.
Query: jaguar. The first page of results on Google has representatives from all of the three distinct sets mentioned above. On the other hand, MSN's results are dominated by links to the car, with a couple of links to companies with "Jaguar" in the name thrown in. Surprisingly, there's no mention of Apple's Jaguar on the first page of results, and the big cat is nowhere to be found.
Query: football. Now, what the North Americans call "football" is known as something else in the rest of the world; and what the rest of the world calls "football" is known as "soccer" in North America. On Google, a search for "football" brings up NFL at #1, FIFA (the world body of soccer) at #2, as well as UEFA (the European governing body) among the top 10. On MSN, there's no mention of FIFA or UEFA in the first page of results.
Query: polo. A search on Google returns links to Ralph Lauren's outfit, as well as to US Polo Association and Water Polo sites in the first page. On MSN, 7 of the first 10 hits are to Ralph Lauren's company, an unwarranted domination.
Verdict? Google looks to be the winner. Its results
are well-balanced, with no single site hogging the limelight, and diverse
subsets are well-represented.
MSN has responded with its own list of features, and here I'll take a quick look at how they compare, without exploring all possibilities exhaustively.
Calculator
Both offer a calculator; you can type in expressions and the SE will return the answer to you. For example, 987*432 returns 426,384 as the answer. However, there are some differences:
Other Convenience Features
Google offers other convenience features like FedEx and UPS parcel tracking; Vehicle ID Number (VIN) lookups; patent numbers, etc. Complete list is here. MSN does not have the full repertoire yet, but these features are trivial to implement, so expect MSN to have them soon (should they choose to do so).
And finally: Google returns up to 1000 matches to a search request, while MSN returns only a maximum of 250. While I pity the person who has to click through that many result pages, it may occasionally be useful.
Verdict? This round also goes to Google. But
it should be easy for MSN to catch up to (and surpass) Google in this department.
One of my biggest fears is that these SEs will become the de facto portals to the information universe, and hence become the single point of failure; if someone can force them to not show some content, then that content will effectively become blacklisted. These SEs could make censorship easier. Google and Yahoo have been reported to be censoring content in China, as per that government's wishes. Some could say it is the price of business: follow the law of the land. Some would disagree: information yearns to be free. This issue is too complicated for me to wrap my nerdy little brain around.
What we can check is whether these SEs are practicing any "self-censorship". What better way to test this than by looking at the mother of all lightning rods: Scientology. A quick peek over at MSN reveals that the results from MSN are all pro-Scientology, without any mention of sites offering opposing viewpoints (and we do know there are quite a few of them). Google's results are much more balanced; the first result is for the Church of Scientology (as expected), but the second is for "operation clambake". About half of Google's results are to sites offering opposing viewpoints to Scientology. I am sure there are others out there who can offer more such examples; I would be more than happy to list them out here if you send me some.
Verdict? MSN may find it hard to balance their
other interests with the role of an unbiased search results provider. Even
though Google is a public company, expect them to offer more resistance to
the control freaks than MSN ("don't be evil"?).
The end result, of course, is that we the consumers win. It's a great time to be alive! Comments, criticisms and large sums of money welcome. :)