There’s been a lot of talk around the blogosphere about the interaction between search engines, blogs, and professional news outlets. The current round of hand-wringing was kicked off by Andrew Orlowski’s mean-spirited discussion of "googlewashing" in the Register. Seems that an article by Patrick Tyler in the New York Times described the anti-war protesters around the world in the build-up to the current war in Iraq as a "second superpower"–a rather nice way of thinking about us protesters, especially as our influence was almost universally maligned practically everywhere else. But a few weeks later, James F Moore posted "The Second Superpower Rears its Beautiful Head" to his website, describing net users, in the spirit of Joi Ito’s "emergent democracy", as a "second superpower"–again, a pretty nice way to describe us bloggers and other ‘net enthusiasts, especially as our influence is almost universally maligned practically everywhere else. Within a short time, bloggers’ posts discussing and linking to Moore’s piece had raised its Whuffie far beyond Tyler’s piece, so that any search for "second superpower" invariably returns comments on Moore’s piece instead of on Tyler’s. Says Orlowski:
Although it took millions of people around the world to compel the Gray Lady to describe the anti-war movement as a "Second Superpower", it took only a handful of webloggers to spin the alternative meaning to manufacture sufficient PageRank™ to flood Google with Moore’s alternative, neutered definition.
Indeed, if you were wearing your Google-goggles, and the search engine was your primary view of the world, you would have a hard time believing that the phrase "Second Superpower" ever meant anything else.
To all intents and purposes, the original meaning has been erased. Obliterated, in just seven weeks.
"Noise," they call it. Too much blogging getting in the way of the "real" news, the news that’s "fit to print". Given the reaction of bloggers to Orlowski’s charges–even those who mock his assertions of "googlewashing" and his weak comparison with Orwell’s "Newspeak", even those who rail mightily against Orlowski’s brushing aside of Moore and Ito’s theories as "neutered" and " vague and elusive"–you might be forgiven for laying charges of "self-hating blogger" at their virtual doors. Because in the end, most seem to agree with Orlowski’s basic assertion–that Tyler’s piece and its commentary should be privileged by search engines, and if not for all the "noise"–soon we’ll be calling it "chatter"–it certainly would be.
Consider Doc Searls’ excellent, well-thought-out analysis of the situation ("Maybe it’s about the ratio of linkable to unlinkable pages" and one or two posts every day since). As Searls correctly notes, the NYTimes–along with many other news sites–has a nasty habit of hiding its archives from search engines to protect its for-pay content (anything older than 7 days at the NYTimes) and to force surfers to use its interface to search for articles (they get more ad impressions that way). So it’s no surprise that Moore’s article fares better on search engines than Tyler’s–Tyler’s isn’t on the search engines at all. Searls notes that if the big news outlets want to sequester their stories behind "paywalls" when search engines–particularly Google–provide probably the most-used interface to information on the web, they can hardly complain when information that is publicly searchable becomes better known than the information they have, for all intents and purposes, pulled out of circulation.
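For context, the mechanism by which a site "hides" from search engines is quite simple: a robots.txt file at the site’s root tells crawlers which paths they may index. Something like the following hypothetical example–the paths here are illustrative, not the NYTimes’ actual configuration–is all it takes to pull an entire archive out of Google’s view:

```
# Hypothetical robots.txt for a paywalled news site.
# Current stories stay indexable; the archive and internal
# search pages are declared off-limits to all crawlers.
User-agent: *
Disallow: /archive/
Disallow: /search/
```

Any page behind a Disallow rule simply never enters the index, so it can’t accumulate PageRank no matter how many people would have linked to it.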
But what’s striking is that Searls–a renowned blogger (#91 in the Blogosphere Ecosystem)–sees this relative invisibility of "real" news outlets like the NYTimes in comparison with the blogosphere as a problem to be solved, suggesting:
Here’s a thought. What would happen if the archives of all the print publications out there were open to the Web, linkable by anybody, and crawlable by Google’s bots? Would the density of blogs "above the fold" (on page one) of Google searches go down while hard copy sources go up? I’ll betcha it would.
My point: Maybe this isn’t about "gaming" algorithms, but rather about a situation where one particular type of highly numerous journal has entirely exposed archives while less common (though perhaps on the whole more authoritative) others do not.
It’s time for The New York Times and the other papers to step forward, join the real world and correct the problem. Expose the archives. Give them permanent URLs. Let in the bots. Let their writers, and their reputations, accept the credit they are constantly given and truly deserve.
And he’s not alone. PuddingBowl is glad to read that "Google might finally be doing something about the problem of ‘blog noise.’" Ryan Lowe believes his undeservedly high number of hits from Google is "proof that Google is far from perfect." Virulent Memes points out that while some bloggers might have something of value to say, "the majority (myself included)… use this or that personal publishing system to deposit their neural bilge into the noosphere?" and suggests that Google "tweak the PageRank system, as they regularly do, to mark down blogs a notch or two." Fernando Pereira writes:
…Google’s ranking system stops at the edge of print and so may present a biased view of authority. This may not be evident to the average Google user, or NYT reader, and it is worth saying. The NYT may be creating a problem for itself by locking up its back issues, but the problem still stands in other areas. Influential writing in many areas is not available online, especially older writing. Blogosphere advocates may huff and puff about the shortsightedness of paper distribution, but the central issue is knowledge, not publishing tactics. Users of Google and other search engines need to be aware of the outgoing "links" into the print world and the implicit bias that not following them imposes on knowledge seekers. Google is an amplifying instrument that makes more obvious the edges of knowledge networks.
And so on. These are all, for the most part, "thoughtful critiques" (in Pereira’s words) but they take for granted a) the existence of blog "noise", and b) that blog noise is a problem in need of fixing. But why do they (we?) consider blogs "noise" rather than part of the "signal"? Why do we have an inferiority complex in relation to the NYTimes and other professional outlets?
One reason, I think, is that we all implicitly consider blogging a form of, or at least an extension of, journalism. As such, we consign ourselves to the "poor cousin" role, lacking the institutional resources–funding, access to sources, editorial review, etc.–of professional news outlets. While there are some formalistic similarities–topical stories, temporal arrangement of information–I think we do a disservice to both ourselves and to professional journalists in this comparison. To ourselves because considering ourselves "journalists-lite" denigrates what we actually do, and to journalists because it portrays their training, standards, and professionalism as unimportant to the task at hand.
But writing to different standards than those of professional journalists is no reason we should have to hold our hats in hand and beg for scraps from the Internet table. Google and other search engines are not tools for disseminating news stories. We don’t complain that a search for a book like Our Final Hour: A Scientist’s Warning brings up results from e-commerce sites like Amazon, Shopping2, and Target, rather than a link to the current review in the NYTimes–or, if we do complain, we don’t blame either Google or e-commerce sites for "store noise".
The criticism of Google is, as far as I can see, a kind of elitism: blogs are "noise" because they get in the way of access to the "real" information published by the pros. Google comes in for a hit because of its powerful PageRank system, which provides an indirect assessment of the relevance of a site’s content by looking at the number of sites that link to it, along with their relevance. If Searls links to a story on, say, emergent democracy, that link counts for more than if I link to it, because Google rightly sees Searls’ site as more relevant to the topic of emergent democracy. Because linking to stuff is one of the things that bloggers do best–and because there are simply so many of us–our cumulative evaluations directly contribute to the working of Google’s search engine. But for many, there is a sense that the popular nature of this participation in the flow of information is in-and-of-itself discrediting.
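To make the mechanism concrete, here is a rough sketch of the idea behind PageRank–not Google’s actual implementation, just the textbook iteration: each page’s score is repeatedly redistributed along its outbound links, so a page linked to by many well-ranked pages accumulates a high rank. The toy link graph and names below are purely illustrative.

```python
# Toy PageRank sketch: iteratively redistribute each page's rank
# along its outbound links. (Dangling pages with no outlinks simply
# leak rank here, which is fine for illustration.)
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with uniform rank
    for _ in range(iterations):
        # every page gets a small baseline, per the damping factor
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            # a page's rank is split evenly among the pages it links to
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: three blogs link to "moore", only one to "tyler".
graph = {
    "blog1": ["moore"],
    "blog2": ["moore"],
    "blog3": ["moore", "tyler"],
    "moore": [],
    "tyler": [],
}
ranks = pagerank(graph)
```

Run on this graph, "moore" ends up ranked well above "tyler", and both above the blogs that link to them–which is exactly the dynamic Orlowski is complaining about: a handful of inbound links from ordinary bloggers is enough to decide which page a searcher sees first.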
This, to me, is disheartening. While the NYTimes and other outlets are useful sources of information, they are only useful for the purposes they are designed for. Large commercial news outlets rarely provide useful commentary and analysis, for instance. Their coverage of issues relating to science and technology is particularly hampered both by a lack of specialized knowledge (those journalists studied journalism, after all, not paleontology or physics) and by their rather poor assessment of their readers’ levels of comprehension and interest. And, of course, there’s the Jayson Blair incident, which has given a) the NYTimes a reason to clean house, and b) a free pass to every other newspaper to ignore the Jayson Blairs on their own staffs. In short, they are useful and I agree with Searls that they should be making an effort to improve access, but they are neither the only nor the best source of information on the ‘net, regardless of the quality (or lack thereof) of their news coverage.