collision detection
July 07, 2003
"The" -- a secret code to unlocking Google?

A couple of days ago, I posted about a neat Google hack -- the search results for "weapons of mass destruction". In the comment field for the item, Franco pointed out that when he recently tried to search for the goddess "Tykhe", Google asked him if he really meant to search for the word "the". As Franco sardonically joked: "Yes, I meant to search the entire internet for the word 'the' -- a word which you refuse to search for." And it's true: Whenever you type in a search string with common words like "the" or "and", Google strips them out. Generally, Google won't even allow you to include "the" as a search term.
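To make the stripping behavior concrete, here is a toy sketch in Python. It is only an illustration of the general idea of dropping common words from a query: the `STOP_WORDS` list and the `strip_stop_words` function are made up for this example, and nothing here is claimed to be how Google actually does it.

```python
# Hypothetical list of "common words"; Google's real list is not public here.
STOP_WORDS = {"the", "and", "a", "of"}


def strip_stop_words(query):
    """Return the query's words with the common words removed."""
    words = query.split()
    return [word for word in words if word.lower() not in STOP_WORDS]


# The common words vanish and only the distinctive terms survive.
print(strip_stop_words("the goddess and the onion"))
```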

But here's the weird thing: If you type in only the word "the" as a search, you actually do get results. When I searched for "Tykhe", Google gave me the same response it gave Franco:

Searched the web for Tykhe -- Results 1 - 10 of about 302. Search took 0.05 seconds. Did you mean: The

So I clicked on the "the" search, and discovered it generates 3,680,000,000 results. The top-ranked search results are, in order:

The Onion
The White House
The Economist
NASA
The Guardian
AllTheWeb.com
The Weather Channel
The New York Times
The Washington Post
The Hunger Site

This is really intriguing. Since "the" is the most common word in the English language, it would -- theoretically -- be distributed pretty evenly around the Internet. In that case, when Google searches for "the", it faces a unique situation. It would be very hard for Google's semantic or key-word-matching tools to figure out which web site used the word most frequently, or in the most significant fashion. Most semantic or key-word-matching reasoning is rendered useless. And indeed, look again at the number of results: 3,680,000,000. That's in the same range as the number of pages that Google claims to index -- 3,083,324,652 -- and, oddly, even a bit higher. Thus, the search "the" seems to be returning results for more or less every single page Google knows about.

In this situation, the main trick Google has to fall back on is PageRank: its patented system for determining which sites are important, by counting the number of links that point to them. This would mean, then, that The Onion -- and those other nine sites -- may have more links pointing to them than most other sites on the Net. They are, in effect, the most popular sites on the Net, since PageRank popularity is clearly the main criterion -- if not the only criterion -- that Google is using to place them on the Top 10 list, right?

Well, maybe. Possibly the names of the sites are important, too. Notice that, except for NASA, all the sites have the word "the" in their official web-site title -- and thus probably also in their meta tags, and various other semantically important bits of HTML. That may explain why The Hunger Site appears so high.
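For the curious, the "count the links that point to a site" idea can be sketched in a few lines of Python. This is a deliberately crude stand-in, not Google's algorithm: real PageRank also weighs each link by the importance of the page it comes from, and the page names and link graph below are invented purely for illustration.

```python
from collections import Counter


def inlink_counts(links):
    """Count how many distinct pages link to each target page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):   # count each linking page once
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts


def rank_by_inlinks(links):
    """Order pages by incoming-link count, most linked first (ties by name)."""
    counts = inlink_counts(links)
    pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(pages, key=lambda page: (-counts[page], page))


# Hypothetical link graph: each key is a page, each value lists pages it links to.
links = {
    "blog-a": ["the-onion", "nasa"],
    "blog-b": ["the-onion"],
    "blog-c": ["the-onion", "nasa"],
    "nasa": [],
}

print(rank_by_inlinks(links))
```

In this made-up graph, "the-onion" comes out on top because three pages link to it, with "nasa" second on two links. When every page matches the query equally well, an ordering like this is about the only signal left to sort by.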

Pretty weird, eh?

Posted by Clive Thompson at July 07, 2003 11:01 PM


Comments

As I discovered a while back, if you Google on just the letter "s" you get www.gnu.org as the top result. My theory is that it is because of the "'s" in their slogan, "GNU's Not Unix."

Posted by: Tom at July 8, 2003 10:49 AM

Heh.

Posted by: Clive at July 14, 2003 1:34 PM

Also, you can put a plus sign in front of any word Google removes to force it to stay in the search, e.g. "+the onion" or "+this +or +that".

Posted by: RicMoo at October 11, 2003 6:29 PM

Oh, that's cool!

Posted by: Clive at October 13, 2003 11:14 PM

To address this issue, we turn to the second place to put variables, which is called the Heap. If you think of the Stack as a high-rise apartment building somewhere, with variables as tenants and each level building atop the one before it, then the Heap is the suburban sprawl: every citizen finds a space for herself, each lot is a different size, and the locations can't be readily predicted. For all the simplicity offered by the Stack, the Heap seems positively chaotic, but the reality is that each just obeys its own rules.

Posted by: Cassandra at January 19, 2004 6:51 PM

Each Stack Frame represents a function. The bottom frame is always the main function, and the frames above it are the other functions that main calls. At any given time, the stack can show you the path your code has taken to get to where it is. The top frame represents the function the code is currently executing, and the frame below it is the function that called the current function, and the frame below that represents the function that called the function that called the current function, and so on all the way down to main, which is the starting point of any C program.

Posted by: Edith at January 19, 2004 6:51 PM

The rest of our conversion follows a similar vein. Instead of going through line by line, let's just compare end results: when the transition is complete, the code that used to read:

Posted by: Prudence at January 19, 2004 6:51 PM

This is another function provided for dealing with the heap. After you've created some space in the Heap, it's yours until you let go of it. When your program is done using it, you have to explicitly tell the computer that you don't need it anymore or the computer will save it for your future use (or until your program quits, when it knows you won't be needing the memory anymore). The call to simply tells the computer that you had this space, but you're done and the memory can be freed for use by something else later on.

Posted by: Bellingham at January 19, 2004 6:51 PM

Earlier I mentioned that variables can live in two different places. We're going to examine these two places one at a time, and we're going to start on the more familiar ground, which is called the Stack. Understanding the stack helps us understand the way programs run, and also helps us understand scope a little better.

Posted by: Cornelius at January 19, 2004 6:52 PM

But variables get one benefit people do not

Posted by: Elizabeth at January 19, 2004 6:52 PM

Seth Roby graduated in May of 2003 with a double major in English and Computer Science, the Macintosh part of a three-person Macintosh, Linux, and Windows graduating triumvirate.

Posted by: Prospero at January 19, 2004 6:52 PM

Let's see an example by converting our favoriteNumber variable from a stack variable to a heap variable. The first thing we'll do is find the project we've been working on and open it up in Project Builder. In the file, we'll start right at the top and work our way down. Under the line:

Posted by: Polidore at January 19, 2004 6:52 PM


