Friday, February 17, 2006

Journalism from a software perspective

On Feb. 9, while reading up on the web framework Django, my eye gravitated toward an unfamiliar acronym in this sentence: “Django focuses on automating as much as possible and adhering to the DRY principle.”

So what’s DRY? To programmers, DRY means “Don’t Repeat Yourself,” and the link explaining the principle led eventually to this rather elegant statement: “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”

My study of content management systems returned me to the Django page, but within minutes I found myself drifting back to this simple-but-powerful concept: Express every individual bit of knowledge in a clear and authoritative manner. It whispered in my ear, tugged at my sleeve, told me there was more in play here than geek esoterica.

But duty called, so to move on I scribbled the principle on a piece of sketchpad paper and pinned it to the corkboard behind my desk with this question: How Could We Apply This to 21st Century Journalism?

Because to be blunt, modern journalism – not to mention the larger culture – is in desperate need of some clear and authoritative factual statements.

Where we are
One of the great ironies of modern life is how post-modern conservatives become when the topic turns to the media. The Bible, Adam Smith, warped timber – these are articles of faith, received wisdom. Conservatives don’t generally challenge such statements with much fervor, nor should we expect them to. The goal of conservatism isn’t the questioning of authority but the bolstering of it, typically against the critiques of the offensive, silly or malicious elements in society.

Yet when it comes to discussions of the media, even the most rock-ribbed of John Birchers turn downright existential.

Objectivity? Impossible! The very terms of the discussion render it so.

To 21st century American conservatives, any media claim of objectivity represents an overtly political act. What’s more, they say, the claim of journalistic objectivity is actually a partisan political act because – as any rational person can plainly see – the media is biased in favor of the Democratic Party.

Which leads us, ipso facto, to a surprisingly radical conclusion: Since the post-modern conservative critique has now eliminated even the possibility of journalistic objectivity, and since – as any rational person can plainly see – liberal media bias serves the Democratic Party, then the most essential media reform of the 21st century is the creation of a separate, subjective, distinctly partisan, pro-GOP press to compete with the old-line “mainstream media.”

This idea – that the antidote to a subjective-but-dishonest liberal media is an overtly subjective conservative media – isn’t new. FOX News and The Washington Times predate the current administration, as do numerous conservative press critics. What’s new are concepts that NYU journalism professor Jay Rosen calls “de-certification” and “rollback.” Both are now operative principles in both the Bush White House and the larger conservative movement.

De-certification rejects the notion that journalists have any unique standing to critically examine or publicly challenge the statements of political leaders. It identifies Big Journalism as just another special interest, and treats reporters as special pleaders. De-certification identifies press “spin” (coverage) as just another message in competition with the conservative message. It rejects the idea that reporters can be objective, or that critical coverage can be anything other than partisan.

Rollback is the implementation of de-certification. It is Scott McClellan repeating the same plainly non-responsive statement no matter the question. It is the release of the Dick Cheney hunting accident story to the local paper rather than the Washington press corps. It is President Bush pointing out that nobody elected the person asking him questions.

Though elements of this conservative critique may well be worth a larger discussion, it is their net effect that concerns me. Together, they portend a future in which the mass media will present Americans not just with competing viewpoints, but with competing facts. In the worst-case scenario, these polarized partisan presses will present factual claims that are mutually exclusive.

Which raises the question: Is America benefiting from its now-Balkanized mass media? Would subjectivity in mass media be helpful or harmful? And with critics on both sides of the political spectrum united in the belief that human objectivity is not possible, is there any way that those of us in the journalism business can steer our profession back toward something resembling a common frame of reference?

The stakes are visible in this study: In 2004, FOX News viewers were far more likely to vote for President Bush. These viewers were also far more likely to believe statements about Iraq that were factually untrue, and each of these inaccuracies negated Democratic critiques of the administration’s foreign policies.

It is now clear to me that simply appealing to the good faith of media consumers will never allow us to address this status quo. Reporters, editors and producers will never be able to regain objective credibility across partisan lines by making reforms in the way we report or package the news. Professionalism is good, but it won’t change the basic equation.

Two types of objectivity
Which is why in 2005 I began proposing that an optimistic vision of our future requires that journalists stop thinking about news as a craft and start thinking about news as an informational system.

I was covering science at the time, and you can’t do that very long without recognizing that objectivity wasn’t an impossibility for the biologists I covered – it was just another factor in their experiments. They controlled for it, and then they documented those controls for all to see. Not even Heisenberg’s Uncertainty Principle, the ultimate statement of observer-subjectivity, derails the scientific concept of objectivity.

Why? Because unlike journalistic objectivity, which proposes itself to be an artificial perspective, scientific objectivity is a documented process. A requirement of that process is that it be recorded clearly enough that findings are repeatable for all observers (in the case of laboratory experiments) or clearly controlled for the observer’s subjective perspective (field observation of a single event or series of events). When viewed from a distance, this process of objectivity varies for each individual discipline, but its philosophy is constant: Always be aware of the subjectivity of the observer, use agreed-upon standards, and show your work.

In other words, scientists have created a system of objectivity, and by abiding within its rules, civilization has flourished. Scientific objectivity allows a physicist in Oslo to derive a bit of knowledge that a physicist in Kyoto can apply to a larger experiment. While scientists do test each other’s findings, science does not re-invent wheels. This is why there is only one Uncertainty Principle – Heisenberg’s.

Compare this to modern journalism.

By our standards, if Al Gore took up physics and claimed he had derived an Uncertainty Principle, journalists leaving his press conference would be expected to call the White House for a response. The story announcing the Gore Uncertainty Principle (GUP) would likely point out that the Heritage Foundation has a competing Uncertainty Principle (HFUP), then note in passing that someone named Heisenberg had done similar work in the 1920s. Being journalistically objective, most versions of this story would report each of these claims as limited facts (the fact being that individuals had stated the claims) without attempting to evaluate those claims.

Along the way, we’d quote Gore saying why his GUP reaffirms the principles of participatory democracy, while a Heritage Foundation spokesman would opine about how the GUP gets it entirely backwards: the HFUP clearly proves that President Bush won both Florida in 2000 and Ohio in 2004.

A week later, a major media outlet might attempt to write a follow-up piece critically examining the claims, and if the reporter had any scientific expertise, this new story would likely conclude that Heisenberg’s Uncertainty Principle is the only one that matters, and that the partisan versions of this essential theory of quantum physics are, at best, irrelevant.

This story would be immediately assailed as biased, of course. Conservative viewers, watching their network, would reach one conclusion. Liberals another. And while this echo-chamber effect might be comforting for both groups, it’s hardly the prescription for creating an informed, constructive national debate on any subject.

Rethinking journalism
On December 9, 2005, I left this comment on a particularly contentious PressThink thread:

We need to create some kind of new information tool that helps us manage these situations, so that basic facts can be established and stipulated. If we don't trust the government and we don't trust the media and we don't trust each other, how can we get anywhere? We know how to build websites and blogs and news wires ... but how do (we) build trust in the 21st century?

Five days later I wrote a lengthy post (“21st century trust … the techno-geek way!”) at Xark! trying to answer my own question. And in early January, I actually proposed in another PressThink thread that journalists publicly evaluate their confidence in the factual content they were publishing.

The tricky part is that being explicit about confidence means editors would have to accept greater accountability. If I've overrated my 12-miners-alive story a 7 and it reverses, I look pretty damned stupid. Then again, if I'm systematically underbidding my confidence to prevent being revealed as wrong later, I'm not doing much to build my credibility. You want an incentive for people to be candid and thorough, and I think this might provide it.

To be truly useful, such a system would need to be keyed to something, whether it's a number system or a color code or a bar graph or a slider. Whatever. A 5 rating should mean the same thing to the reader as to the editor. The beauty of the web is that editors don't have to redundantly explain this stuff in print -- rather, they can post the rating and know that anybody who isn't sure what it means can click and find out exactly what it means. And the more specific the better.
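As a rough illustration of a rating that is "keyed to something," here is a minimal Python sketch. The 1-to-10 range, the band definitions and the names `SCALE`, `explain` and `RatedStory` are all assumptions made up for this example; the comment above does not specify a scale. The point is only that the definition behind each number lives in one place, where both editor and reader can look it up.

```python
from dataclasses import dataclass

# Hypothetical scale: the 1-10 range and these definitions are illustrative
# assumptions, not an established newsroom standard.
SCALE = {
    range(1, 4): "Single source, unverified",
    range(4, 7): "Multiple sources, not independently verified",
    range(7, 10): "Independently verified by the newsroom",
    range(10, 11): "Documented in a primary record",
}


def explain(rating: int) -> str:
    """Return the one published definition behind a rating."""
    for band, meaning in SCALE.items():
        if rating in band:
            return meaning
    raise ValueError(f"rating must be between 1 and 10, got {rating}")


@dataclass
class RatedStory:
    headline: str
    rating: int

    def label(self) -> str:
        """Text a reader would see after clicking the rating."""
        return f"{self.headline} [confidence {self.rating}/10: {explain(self.rating)}]"
```

Because `explain` is the only place the definitions live, a 5 cannot mean one thing on the editor's desk and another on the reader's screen.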

Some found the idea interesting. Kinda. Sorta. Most didn’t. But even though some people I respect – namely Paul Lukasik and Steve Lovelady – have rather graciously tried to tell me that I’ve gone quite overboard with such thinking, I have the sneaking suspicion that the problem with most proposed solutions for our current media malaise is that they don’t go far enough.

The other problem is that they’ve thrown out the baby with the bathwater when it comes to objectivity. Fine: Let’s junk journalistic objectivity and its Halfling brethren “news judgment” and “fairness.” But let’s not concede the intellectual ground to competing subjective visions without first exploring the possibilities of a more scientific form of objectivity. Not a particularly enlightened perspective or state of being, but a transparent process.

How it might work
Imagine for a moment that your next word processor came with an annoying “intelligent agent” feature that recognized any declarative statement of fact you ever wrote and then asked you to cite its definitive source. An incredible pain in the ass, yes.

But now imagine that, as a reader, every document you were ever asked to evaluate came to you as rich hypertext, with each summary fact transparently sourced all the way back to its original, definitive expression. Would you treat its claims differently than you would a document that arrived without that kind of depth behind it?

I’d wager you would. Sure, most writers cite sources, even if they don’t expressly name them. But are the sources definitive, or are they just dueling “facts” – on-the-public-record but never actually challenged or verified?

But back to our imagining. Anyone, given unlimited time and resources, could produce dry, boring, factual articles that are nevertheless elaborately festooned with hypertext-footnotes. Someone with zero understanding of how the modern mediascape works might even prescribe this as a solution for what ails us.

Realistically though, most reporters and editors will never have the time or resources to produce such exhaustive fact-check formatting on deadline. Even with modern Web search engines, checking a relatively simple statement back to its “single, unambiguous, authoritative representation” is an exceedingly time-consuming and tedious task. Allow me to demonstrate:

“North Charleston, despite being one of the youngest cities in the state, is also among the largest.”

Now. Time me.

Two minutes: “As a means of bringing government closer to the people, an incorporation referendum was held on April 27, 1971. On June 12, 1972, after a series of legal battles, the South Carolina Supreme Court upheld the referendum results and North Charleston became a city.” (http://www.northcharleston.org/AboutUs/History.aspx)

Four minutes: “Incorporated in 1972, it is South Carolina's youngest city of any size.” (http://www.northcharleston.org/AboutUs/LocationMap.aspx).

Seven minutes: 2000 US Census population (via http://factfinder.census.gov/servlet/GCTTable?_bm=n&_lang=en&mt_name=DEC_2000_PL_U_GCTPL_ST7&format=ST-7&_box_head_nbr=GCT-PL&ds_name=DEC_2000_PL_U&geo_id=04000US45): 79,641

Eight minutes: Charleston, 96,650; Columbia, 116,278.

Nine minutes. None other found.

So there we are: Almost 10 minutes of searching for a basically benign statement. The sources look pretty good, too – but they still aren’t anywhere close to the single, unambiguous, authoritative representations that the DRY principle calls for.

For instance, when the North Charleston city website calls the city “South Carolina’s youngest city of any size,” is that independent of the term “town”? It certainly doesn’t take into account the municipal soap opera that has been the recent history of the Town of James Island, which has been incorporated and disbanded twice in the last decade (James Island is currently unincorporated, which wouldn’t precisely invalidate this statement of fact). Beyond that, can the City of North Charleston be trusted to provide authoritative statements about itself?

Neither is the information up-to-date. There’s a 2003 census estimate that I found that shows North Charleston with roughly 81,500 residents… but that’s at least three years old now, and it’s an estimate. It doesn’t change the statement I made, but now I’m foundering. Which one would I pick as the authoritative representation of the original bit of knowledge?

Given this quick searching, perhaps I would edit my statement: “North Charleston, despite being the youngest city in the state, is also its third-largest.” The sentence is actually three factual statements: 1. North Charleston is the most recently incorporated municipality in South Carolina; 2. North Charleston’s population is estimated at roughly 81,500 people; 3. Only two other municipalities (Columbia and Charleston) in SC have larger populations. So my searching has marginally strengthened my statement and the hypertext footnoting may have improved your willingness to believe its veracity.
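The decomposition above can be sketched as a small data structure. This is a minimal illustration in Python, not a proposed format: the `Claim` class and the `footnotes` helper are hypothetical names, and the source labels simply repeat what the search above turned up (the post gives no URL for the 2003 estimate, so none is supplied).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    sources: tuple = ()  # citations gathered so far; empty means unsupported


# The three factual statements packed into the one edited sentence.
sentence = [
    Claim("North Charleston is the most recently incorporated municipality in South Carolina",
          ("http://www.northcharleston.org/AboutUs/LocationMap.aspx",)),
    Claim("North Charleston's population is estimated at roughly 81,500 people",
          ("2003 census estimate",)),
    Claim("Only Columbia and Charleston have larger populations",
          ("2000 US Census, table GCT-PL",)),
]


def footnotes(claims):
    """Number each claim and list its citations, flagging any claim with none."""
    lines = []
    for number, claim in enumerate(claims, start=1):
        cited = "; ".join(claim.sources) if claim.sources else "NO SOURCE"
        lines.append(f"[{number}] {claim.text} ({cited})")
    return lines
```

Calling `footnotes(sentence)` yields one numbered line per claim. Note that nothing here judges whether a source is authoritative; it only makes visible which claims have a citation at all.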

And yet in no way have I met the standards of the DRY Principle. I’ve wasted valuable time bolstering a sentence that – even when upgraded – makes the same point specifically that it originally made generally. And the items to which I point as my proof lack truly authoritative status. No doubt I’ll be fielding pointless phone calls from miffed James Islanders, who interpret the statement differently and want to argue.

Even under the most cursory examination, my DRY experiment is a tremendous timewasting flop.

All of which demonstrates why a real DRY fact-base would be tremendously valuable.

The trouble with search
Google is far from the definitive source most people imagine it to be. Just try updating your website and Googling the changes for proof. In fact, no web search engine can meet this standard, because the people writing the search algorithms aren’t the same people managing the data. So while web search points us toward facts, it cannot, as a system, create truly authoritative factual statements.

We need another tool. In fact, we need several of them.

  1. We need a curated fact-base. From raw data like census reports to statements contained in magazine articles, we need a database of primary factual statements that have been sourced and verified according to transparent and universally recognized standards.
  2. We need a system by which new primary factual statements may be reviewed and added to the factbase.
  3. We need a system by which all facts within the database can be reviewed and updated automatically. Such a system would also connect changes of primary fact to secondary statements such as “North Charleston is the state’s third-largest city.”

And then there’s No. 4:

  4. We need an intelligent word processing tool that automatically relates each factual claim to its original, unambiguous, authoritative statement.

No. 4 is the idea that transports DRY Principle Journalism from the impractical to the sublime. Why? Because relevant factual statements tend to become pyramids over time. Down at the bottom? Census figures. Incorporation records. Later comes a statement, like mine, that combines census figures and incorporation records. Eventually, you reach statements like this one: “Along with its relative youth and rapid growth comes crime. North Charleston’s violent crime rate was among the highest in the United States in 2005 (ranked No. 79 for US municipalities).” Facts correlate, interrelate, expand and contrast.

If I write using DRY-principle facts, then each level of complexity I ascend becomes its own DRY-principle statement.
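One way to picture the pyramid, and the automatic updating it would need, is a store in which primary facts are recorded once and secondary statements are computed from them on demand. The Python below is a toy sketch under that assumption; `FactBase`, `population_rank` and the key naming are invented for illustration, and the only data used are the 2000 Census figures quoted earlier.

```python
class FactBase:
    """Toy factbase: primary facts are stored once, derived ones are recomputed."""

    def __init__(self):
        self._primary = {}  # key -> (value, source)
        self._derived = {}  # key -> function that computes the value from this factbase

    def set_primary(self, key, value, source):
        self._primary[key] = (value, source)

    def define_derived(self, key, compute):
        self._derived[key] = compute

    def get(self, key):
        if key in self._primary:
            return self._primary[key][0]
        return self._derived[key](self)


def population_rank(city, cities):
    """Build a derived fact: the city's rank by population among the listed cities."""
    def compute(fb):
        ordered = sorted(cities, key=lambda c: fb.get(f"population/{c}"), reverse=True)
        return ordered.index(city) + 1
    return compute


fb = FactBase()
cities = ["Columbia", "Charleston", "North Charleston"]
for city, count in zip(cities, [116_278, 96_650, 79_641]):
    fb.set_primary(f"population/{city}", count, source="2000 US Census")

# Secondary statement: valid only if no unlisted municipality is larger.
fb.define_derived("rank/North Charleston", population_rank("North Charleston", cities))
```

Here `fb.get("rank/North Charleston")` is never stored as its own copy of the truth. If a population figure is revised, the rank follows on the next read, which is the DRY idea applied to the "third-largest" statement.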

With the right tools linking the DRY factbase to my word processor, I’d know if my statement was generally correct, generally incorrect, or questionable. As I write, the built-in analyzer would search the factbase for relevant facts, perhaps listing them in a scrolling window beside my word processing field. At the end of the article, I’d probably edit by scanning back over the cited links generated by my intelligent agent, check to see if there were any obvious ways to improve the factual rigor of my article, and then press save.

Of course, if I’m reporting, my job is to generate new facts. How might DRY help me there?

Well, for starters it might let me know whether my subject is actually news – or just news to me. It would guard against me making factual and context errors. Perhaps we could even train it to recognize and challenge certain types of logical fallacies or misleading rhetorical devices.

But the most important role such an agent might play for a reporter is that it would recognize new, unsupported factual statements, note their cited sources, and submit them to the factbase review process.

Memory and power
On Sept. 1, 2005, with his administration beginning to come under fire for its response to the Katrina disaster, President Bush told reporters “I don’t think anyone anticipated the breach of the levees.”

With an intelligent agent dynamically connecting the DRY factbase to their word processors, reporters would have known this statement to be factually incorrect before they had finished typing the closing quotation mark. Why? Because multiple previous articles and disaster exercises had done exactly that – predicting with great accuracy the impact of a Katrina-like storm.

Yes, we all need an ever-expanding database of original-source facts, stated clearly and authoritatively. But the trick to making such a thing useful would be to embed in our writing tools the kind of pattern-seeking software that first recognizes declarative grammar and then applies the words as search terms.

Positive correlations might stream into the word processor’s “hits” window as green supporting citations. But contradictory facts – like FEMA’s previous disaster exercises – would flash red.
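The green-and-red "hits" window can be sketched in a few lines of Python, with one large assumption stated up front: the hard part, turning a typed sentence into a normalized proposition, is skipped entirely. The `Fact`, `Hit` and `check` names are hypothetical, and the single factbase entry is a placeholder standing in for the exercises mentioned above, not a real citation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    proposition: str  # normalized statement the fact speaks to
    holds: bool       # whether the factbase records the proposition as true
    citation: str


@dataclass(frozen=True)
class Hit:
    color: str        # "green" supports the writer's claim, "red" contradicts it
    citation: str


def check(proposition, asserted, factbase):
    """Compare what the writer asserts with every fact on the same proposition."""
    hits = []
    for fact in factbase:
        if fact.proposition == proposition:
            color = "green" if fact.holds == asserted else "red"
            hits.append(Hit(color, fact.citation))
    return hits


# Placeholder entry; a real factbase would hold the sourced, reviewed record.
factbase = [
    Fact("levee breach was anticipated", True,
         "pre-Katrina disaster exercise (placeholder citation)"),
]

# The quoted statement asserts that the proposition is false.
hits = check("levee breach was anticipated", False, factbase)
```

In this toy run `hits` holds a single red entry, because the recorded fact and the quoted claim disagree. A statement with no matching proposition returns an empty list, which is the case where the agent would flag a new, unsupported claim for review.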

At the very least, a reporter following these protocols would know that the President’s statement was less than rigorously true. So too would anyone following the story at home. What’s the point of building the world’s greatest factual reference and not making it public to the world?

The President should have access to the factbase as well – if only to get his story straight before he goes out to meet the press.

And if the President wishes to contend that the factbase is wrong, well – we should be able to build feedback mechanisms that allow that, as well.

Administration
So, imagine for a moment that our discovery informatics wizards could develop the right interface. And imagine that our systems geniuses could invent the right storage, cross-referencing and retrieval processes. Imagine that our best archivists and data specialists could create a transparent system for batch-converting the huge volume of new data that would soon flood into our factbase. Imagine that the wise among us could create fair and practical ways of making sure the factbase stays accurate.

OK then: How would we pay for it?

One answer might be that the nation’s media outlets could work cooperatively on such a system, much in the same way that competitors work together to make the Associated Press. The project is too large for any single participant, but if all worked together, each would benefit.

Colleges and universities? Sure. Research and development labs? You bet. Anyone with an interest in the expansion and vetting of information could benefit.

Governments?

Well, that’s another question.

Regardless of who would pay and how much such a system would cost, I see nothing in what I propose here that exceeds the theoretical capabilities of existing or developing technologies. And if science is any guide, then the value of having solid factual information at the world’s disposal – without having to independently verify each individual bit of knowledge – would be a tremendous economic multiplier.

I believe a system like this will be within our reach within a decade.

Would it be a magic bullet? No. So much of what passes for fact is actually only “facty.” How much of our political reality is based on guesses, attitudes, opinions? A DRY factbase and standards-based journalism wouldn’t change that.

But creating a standard repository of “single, unambiguous, authoritative representations” of knowledge would be a transformative technology both for journalism and society. Not because it would expand knowledge – but because it would allow the creation of a system of mutually agreed-upon, standards-based journalism and communication.

Some people would choose to stay outside such a system. They would challenge its validity, appeal to fear, appeal to divine authority. They could appeal to “truthiness” just as they do today.

But by making such a system open-source, and by inviting everyone to participate in monitoring it, you would move truthiness from the heart of the culture to its periphery.

People like me will still write to persuade. We will still argue over which facts are relevant.

But no longer will you have to trust me to see the relative value in what I have to say.

And that would be the biggest improvement in communication I could ever imagine.