February 09, 2003
Why are some blogs popular? And other "power law" mysteries

Check out the chart above: It illustrates the most popular blogs, ranked by their "inbound links" -- the number of people who link to them. As is obvious, a very small number of blogs account for the vast majority of inbound links.
There's a great piece by Clay Shirky today about this phenomenon -- it's called the "power law distribution." It notes that in a field of open competition, we normally expect that a plethora of choices will flatten everyone's popularity. Think of it this way: Say a friend and I set up rival lemonade stands on our street. Say our lemonade was of roughly the same quality, and we had equally good advertising and word-of-mouth. You would expect that we'd each get roughly half the street's business, right? If another equally good rival showed up, they would pose competition -- and the pie would be sliced three ways. Each new competitor flattens everyone's popularity. Entire textbooks of traditional economics are based on this basic concept.
But that's not the way the world works, is it? We all know this. We know that there are a couple of dozen almost-equally-good soft drinks out there, but somehow, Pepsi and Coke and Ginger Ale dominate. Same goes for TV shows, clothes, cars -- you name it. This is because of the cardinal rule of "power law distributions": One person's choice affects another's. Blogs have become the latest example of this, as Shirky notes:
If we assume that any blog chosen by one user is more likely, by even a fractional amount, to be chosen by another user, the system changes dramatically. Alice, the first user, chooses her blogs unaffected by anyone else, but Bob has a slightly higher chance of liking Alice's blogs than the others. When Bob is done, any blog that both he and Alice like has a higher chance of being picked by Carmen, and so on, with a small number of blogs becoming increasingly likely to be chosen in the future because they were chosen in the past.
Think of this positive feedback as a preference premium. The system assumes that later users come into an environment shaped by earlier users; the thousand-and-first user will not be selecting blogs at random, but will rather be affected, even if unconsciously, by the preference premiums built up in the system previously.
Note that this model is absolutely mute as to why one blog might be preferred over another. Perhaps some writing is simply better than average (a preference for quality), perhaps people want the recommendations of others (a preference for marketing), perhaps there is value in reading the same blogs as your friends (a preference for "solidarity goods", things best enjoyed by a group). It could be all three, or some other effect entirely, and it could be different for different readers and different writers. What matters is that any tendency towards agreement in diverse and free systems, however small and for whatever reason, can create power law distributions.
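To get a feel for the "preference premium" Shirky describes, here is a minimal simulation sketch. It is not Shirky's own model: the function names, the `premium` parameter, and the rule that a blog's chance of being picked is proportional to one plus its past picks are all assumptions made for illustration. It is only meant to show how a small tendency toward agreement skews the results, not to reproduce an exact power-law curve.

```python
import random


def simulate_blog_choices(num_blogs=100, num_users=1000, picks_per_user=5,
                          premium=1.0, seed=0):
    """Return how many times each blog was picked after all users have chosen.

    Assumed rule (hypothetical): a blog's chance of being picked is
    proportional to 1 + premium * (number of earlier picks).
    premium=0 means every user chooses uniformly at random.
    """
    rng = random.Random(seed)
    counts = [0] * num_blogs
    for _ in range(num_users):
        # Weights are frozen for this user: they reflect earlier users only.
        weights = [1.0 + premium * c for c in counts]
        chosen = set()
        while len(chosen) < picks_per_user:
            # Draw one blog at a time; duplicates are ignored by the set.
            chosen.add(rng.choices(range(num_blogs), weights=weights)[0])
        for blog in chosen:
            counts[blog] += 1
    return counts


def top_share(counts, top_n=10):
    """Fraction of all picks that went to the top_n most-picked blogs."""
    return sum(sorted(counts, reverse=True)[:top_n]) / sum(counts)


if __name__ == "__main__":
    flat = simulate_blog_choices(premium=0.0)
    skewed = simulate_blog_choices(premium=1.0)
    print(f"top-10 share, no premium:   {top_share(flat):.2f}")
    print(f"top-10 share, with premium: {top_share(skewed):.2f}")
```

With `premium=0.0` every blog ends up with roughly the same number of readers, which is the lemonade-stand picture. Turn the premium on and the early favorites pull ahead, even though no blog in the simulation is "better" than any other.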
This is what's interesting about power-law distributions: They aren't about "merit", at least in the way we normally think about it. It's more like high school: Why is that guy popular? Because he's popular. This is okay when it comes to blogs, of course. The stakes are lower; having a high- or low-rated blog doesn't -- yet, anyway -- matter too much to your livelihood. (Also, even a small audience can be useful. Though Collision Detection probably doesn't have more than a few dozen people regularly linking to it, that's enough to make it the #1 result for a Google search for "Clive Thompson".) Power-law distributions do no real harm online.
But in the real world of work, they're much more troubling. As Robert Frank and Philip J. Cook noted in their superb, superb book The Winner-Take-All Society, power-law networks wind up massively rewarding people for minute, tiny, almost indistinguishable differences in talent. Increasingly, the guy making $40 million a year is only about 2% more qualified than the one making $400,000, and maybe only 5% more than the one making $40,000. Under power-law distributions, the workplace is now resembling the worlds of Olympic or pro sports -- where being a tenth of a second faster than someone else, an amount that would normally be considered vanishingly negligible, separates those who remain amateurs from those who get gazillion-dollar endorsements.
Now for something completely different -- well, almost completely different. If you're interested in this power-law stuff, you might want to check out a feature I wrote last summer for the New York Times Magazine. It was about Richard Wallace, the creator of ALICE -- one of the world's most life-like chatbots.
ALICE "works" well because Wallace had a power-law epiphany: He realized that a tiny number of utterances -- maybe 40,000 -- make up over 95% of everything humans say in everyday conversation. To seem incredibly realistic, a chatbot didn't need to have highly sophisticated artificial intelligence. All it needed was to have preprogrammed responses for those 40,000 everyday utterances:
Wallace had hit upon a theory that makes educated, intelligent people squirm: Maybe conversation simply isn't that complicated. Maybe we just say the same few thousand things to one another, over and over and over again. If Wallace was right, then artificial intelligence didn't need to be particularly intelligent in order to be convincingly lifelike. A.I. researchers had been focused on self-learning ''neural nets'' and mapping out grammar in ''natural language'' programs, but Wallace argued that the reason they had never mastered human conversation wasn't because humans are too complex, but because they are so simple.
''The smarter people are, the more complex they think the human brain is,'' he says. ''It's like anthropocentrism, but on an intellectual level. 'I have a great brain, therefore everybody else does -- and a computer must, too.''' Wallace says with a laugh. ''And unfortunately most people don't.''
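The idea is simple enough to sketch in a few lines. What follows is a toy illustration only, not Wallace's implementation and not how ALICE actually matches input: the table `CANNED_REPLIES`, the `FALLBACK` line, and the functions `normalize` and `reply` are all hypothetical names, and a bot in this style would need tens of thousands of entries rather than three.

```python
import string

# Hypothetical table of everyday utterances and preprogrammed replies.
CANNED_REPLIES = {
    "hello": "Hi there!",
    "how are you": "I'm doing fine, thanks. How are you?",
    "what is your name": "My name is Bot.",
}

# Used for anything outside the table (the rare "long tail" of utterances).
FALLBACK = "Tell me more."


def normalize(utterance):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    no_punct = utterance.lower().translate(
        str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def reply(utterance):
    """Look the utterance up in the table; no 'intelligence' involved."""
    return CANNED_REPLIES.get(normalize(utterance), FALLBACK)


if __name__ == "__main__":
    for line in ["Hello!", "  How ARE you? ", "Explain the power law."]:
        print(f"> {line}\n{reply(line)}")
```

The point of the sketch is the shape of the bet: if a small set of utterances really covers most of what people say, a plain lookup handles most of the conversation, and the fallback only has to paper over the rest.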
(Thanks to Boing Boing for originally pointing out Shirky's piece!
Update: In the comments section for this item, Jeff Liu has posted some extremely cool research related to power-law distributions; go check it out!
Also, I just noticed that the big-ass graphic at the top of this piece has once again stretched my blog template into a slightly wider-than-usual shape. I lack the requisite design kung-fu to fix it, so, there it is.)
Posted by Clive Thompson at February 09, 2003 08:04 PM
In Six Degrees: The Science of a Connected Age, a book I cannot recommend highly enough, there is a passage I'm going to paraphrase because the book is at home now and if I wait until I get home I'm going to forget. (If you see me online after 8:00pm and want the full excerpt, yo me.)
There was a psychology experiment where ten people were shown into an auditorium and shown a series of slides. The slides would have an image on the left, and three images on the right. One of the images on the right would match exactly the image on the left, and it would be immediately apparent. For example, the image on the left would be a line segment, and the images on the right would be line segments, but the ones that didn't match would be twice or three times as long.
For each slide, each person was asked in turn which image on the right (a, b, or c) matched the image on the left. Nine of the ten test subjects were plants and were told to give the same wrong answer. When the tenth person, the real test subject, was asked, a shockingly large percentage (more than zero) would give the wrong answer!
This is how power laws for blogs come about. People are social animals and will gravitate towards blogs that many other people also read, because, hey, if that many other people read it, it must be good.
And in other news, one-click shopping is going to bankrupt me one of these days.
I now have a perverse desire to see how long I can make this response.
Posted by: Jeff Liu at February 12, 2003 12:52 PM
That is insanely cool. And, mother of all coincidences, the author of that book -- Duncan Watts -- is speaking at MIT today! I am supposed to be in a Knight Fellow seminar at the same time, but I am considering sneaking out so that I can attend Watts' talk.
I so totally have to read this book!
Posted by: Clive at February 12, 2003 1:24 PM
Jesus Christ! Clive, when are you gonna get off of the penny ante journalist trip and write a damn book on this stuff? I am getting impatient!
Posted by: Erik at February 17, 2003 1:46 PM
To address this issue, we turn to the second place to put variables, which is called the Heap. If you think of the Stack as a high-rise apartment building somewhere, with variables as tenants and each level building atop the one before it, then the Heap is the suburban sprawl, every citizen finding a space for herself, each lot a different size and in a location that can't be readily predicted. For all the simplicity offered by the Stack, the Heap seems positively chaotic, but the reality is that each just obeys its own rules.
Posted by: Bennett at January 20, 2004 12:23 PM
When a variable is finished with its work, it does not go into retirement, and it is never mentioned again. Variables simply cease to exist, and the thirty-two bits of data that they held are released, so that some other variable may later use them.
Posted by: Ingram at January 20, 2004 12:24 PM
Note the new asterisks whenever we reference favoriteNumber, except for that new line right before the return.
Posted by: Marmaduke at January 20, 2004 12:24 PM
This back and forth is an important concept to understand in C programming, especially on the Mac's RISC architecture. Almost every variable you work with can be represented in 32 bits of memory: thirty-two 1s and 0s define the data that a simple variable can hold. There are exceptions, like on the new 64-bit G5s and in the 128-bit world of AltiVec.
Posted by: Polidore at January 20, 2004 12:24 PM
Being able to understand that basic idea opens up a vast amount of power that can be used and abused, and we're going to look at a few of the better ways to deal with it in this article.
Posted by: Cecily at January 20, 2004 12:24 PM
For this program, it was a bit of overkill. It's a lot of overkill, actually. There's usually no need to store integers in the Heap, unless you're making a whole lot of them. But even in this simpler form, it gives us a little bit more flexibility than we had before, in that we can create and destroy variables as we need, without having to worry about the Stack. It also demonstrates a new variable type, the pointer, which you will use extensively throughout your programming. And it is a pattern that is ubiquitous in Cocoa, so it is a pattern you will need to understand, even though Cocoa makes it much more transparent than it is here.
Posted by: Justinian at January 20, 2004 12:24 PM
We can see an example of this in the code we've written so far. In each function's block, we declare variables that hold our data. When each function ends, the variables within are disposed of, and the space they were using is given back to the computer to use. The variables live in the blocks of conditionals and loops we write, but they don't cascade into functions we call, because those aren't sub-blocks, but different sections of code entirely. Every variable we've written has a well-defined lifetime of one function.
Posted by: Jocatta at January 20, 2004 12:24 PM
When the machine compiles your code, however, it does a little bit of translation. At run time, the computer sees nothing but 1s and 0s, which is all the computer ever sees: a continuous string of binary numbers that it can interpret in various ways.
Posted by: Michael at January 20, 2004 12:24 PM
The rest of our conversion follows a similar vein. Instead of going through line by line, let's just compare end results: when the transition is complete, the code that used to read:
Posted by: Gregory at January 20, 2004 12:24 PM