I enjoyed that, but I'm not sure about a couple of the claims.
I switched from AltaVista to Google because it gave me better results almost all of the time - if that weren't the case I, and however many millions of others, wouldn't have switched. The 'expensive data centre theory' may have sped up the demise of AV et al, but I don't think it's fair to say Google succeeded by having low costs rather than a superior product.
I'd also like to see data to back up the claim that "The vast majority of users are no longer clicking through pages of Google results". Again, not my experience, but I recognise that I am a datapoint of one. I do note that increasingly Google's own answers (especially maps and images) are providing me with the direction I need in response to a search query, but even then I usually click through.
Edit: Re-read that. At first I thought it meant clicking through to the pages that are returned as results; it may mean not clicking through the pages of Google results (1-infinity below). Still, I thought <5% of people ever clicked through to the second page (most people refined their search if the first results weren't what they wanted) so I'm not sure if anything has changed.
I think Google search is damaged: not yet a product failure, but not yet at the point where it's "no longer" a problem, either.
I was at Inktomi in the late 90s/early 2000s. We were trying to keep up with Google in terms of scale, and we could do everything they could do in terms of link analysis, relevance, etc. Objectively, our results were as good as theirs. They took over in terms of index size because they could add cheap Linux machines easily. We were tied to our expensive Sun hardware; it took us three years to switch to Linux (long story), and by then it was too late.
tl;dr: relevance doesn't matter if you don't have the result the user wants.
Google's build-out has been nothing short of astonishing. I remember when Google indexed the web once a month - and the web was much smaller then. Now my sites get crawled several times a day - and new articles on my news site are in Google News in under an hour. The mind boggles.
The main reason their cache was useful back then was that many pages were long gone and the links were 404s. Inktomi marketing always tried to play up our freshness compared to Google. In 1999 they had some pages in their cache that were 3-5 months old. We were pushing new indexes once a week; no page was older than a month. Seems quaint now.
I find that the cache is useful now because articles get posted to social news sites, the influx of traffic brings down the site, and Google Cache is a handy mirror that usually seems to have the page already.
Ex-Microsoft here. It's been a while since I worked there, but my recollection is the same: blind, automated testing stripped of UI elements showed that Microsoft's search results were at parity with Google's in terms of relevance.
People sometimes forget that all of these large companies have teams of very smart engineers and researchers. Google may be a talent sink, but they're not a talent monopoly.
This is a feature that would be used by 0.00000001% of Google's userbase. It would have the effect of shutting up that tiny subset of Google's users while allowing Google's result to continue to degrade silently. Google may not agree that it's in their long-term interests to hide a problem that may eventually make them competitively vulnerable.
It's also possible that the fast path of "simple search" -> "standard SERP" is so optimized that serving that 0.00000001% of users might involve a headache disproportionate to the payoff.
I think the answer to this question is the same as the reason Google's results have become spammy. Allowing users to exclude specific domains, or even having a "report as content farm" button, is in direct conflict with Google's business model, to a certain degree.
Well, does it have to be an invisible global blacklist? Would it be possible to create personal blacklists and have one for each user? I mean, Google already customizes search results for logged in users, so this wouldn't be too far of a stretch.
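For what it's worth, the mechanics of a personal blacklist don't seem like the hard part: it could just be a post-filter applied to already-ranked results before they're rendered. A minimal sketch in Python - every name here (the blocklist store, the result shape) is hypothetical, not how Google actually does it:

    # Hypothetical per-user domain blocklist applied as a post-filter on
    # already-ranked search results. Not Google's actual implementation.
    from urllib.parse import urlparse

    # Per-user blocklists keyed by some account id (a made-up store).
    BLOCKLISTS = {
        "user_42": {"example-content-farm.com", "spammy-mirror.net"},
    }

    def filter_results(user_id, results):
        """Drop results whose domain is on the user's personal blocklist."""
        blocked = BLOCKLISTS.get(user_id, set())
        kept = []
        for r in results:
            domain = urlparse(r["url"]).netloc.lower()
            if domain.startswith("www."):  # treat www.foo.com as foo.com
                domain = domain[4:]
            if domain not in blocked:
                kept.append(r)
        return kept

    results = [
        {"url": "https://www.example-content-farm.com/some-answer"},
        {"url": "https://stackoverflow.com/questions/123"},
    ]
    print(filter_results("user_42", results))  # only the stackoverflow hit survives

The hard part isn't the filtering; it's where the list lives, and whether Google wants to feed that signal back into ranking globally - which is exactly the business-model tension mentioned above.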
> Would it be possible to create personal blacklists and have one for each user?
I think that's pretty clearly what we were talking about. There are invisible global blacklists already, as should come as no surprise (even if you haven't run into one of the hits omitted thanks to the DMCA).
Maybe Google will charge money for the "premium" search service. Take money from advertisers to show their ads, _and_ take money from users to hide the same ads. I'll hate Google when that day comes.
At least I take comfort in the fact that programming-related searches now return more StackOverflow links than ExpertSexChange links.
How much of Google returning better results than AltaVista, though, came down to Google being able to expand its index with new pages faster? What I remember is exactly what the article claims: that competing search engines couldn't "keep up" with Google's increasing scale.
My experience, or at least my recollection, was quite different. I had no complaints whatsoever about AltaVista not adding pages to the index quickly enough; rather, it was that the results increasingly appeared to be random pages that happened to contain the search term, and you might need to wade through several pages of results to find what you were looking for. With Google, the page you wanted almost uncannily turned up near the top of the list. At the time, it seemed like magic.
I remember Google mostly for having good results, yes, but even more for being FAST. It was just soo fast, when all the other search engines could take several seconds to show results.