Business intelligence visualization guru Stephen Few wrote an interesting analysis of a Malcolm Gladwell talk he attended during a SAS Institute conference. Key idea:
Our former problems were usually solved by digging up and revealing the right information. He used Watergate as an example, pointing out that key information was hidden, and the problem was solved when Washington Post journalists Woodward and Bernstein were finally able to uncover this information that had been concealed. Modern problems, on the other hand, are not the result of missing or hidden information, Gladwell argued, but the result, in a sense, of too much information and the complicated challenge of understanding it. Enron was his primary example. The information about Enron’s practices was not kept secret. In fact, it was published in several years’ worth of financial reports to the SEC, totaling millions of pages. The facts that led to Enron’s rapid implosion were there for anyone who was interested to see, freely available on the Internet, but weren’t understood until a journalist spent two months reading and struggling to make sense of Enron’s earnings, which led him to discover that they existed only as contracts to buy energy at a particular price in the future, not as actual cash in the bank.

The problems that we face today, both big ones in society like the current health care debate and smaller ones like strategic business decisions, do not exist because we lack information, but because we don’t understand it. They can be solved only by developing skills and tools to make sense of information that is often complex. In other words, the major obstacle to solving modern problems isn’t the lack of information, solved by acquiring it, but the lack of understanding, solved by analytics.
For Austin’s public sector, there are three related problems.
First, there is no notion that data is a public good that Austin’s citizens are entitled to have. Access to data is presently a discretionary privilege, granted only when a request clears freedom-of-information requirements and/or bureaucratic obstacles. That level of protection makes sense for individualized records, but it does not make sense for the types of data sets that are useful for most policy discussions (think water demand in the WTP4 debate).
Second, because there is no affirmative expectation that data sets should be gathered and released to the public, many opportunities to assemble useful data sets go unexplored. What factors predict a good cop, or a water main bursting? Data on these questions can be gathered and published in ways that do not compromise individual privacy, allowing for crowd-sourced crunching and accountability. Because there is no expectation, there are no systems in place to create these data products. Yet they are just as essential as the budget (which, if you think about it, is also a data product) and should be considered basic elements of local government transparency, just as an annual, line-by-line budget already is.
Third, our overall innumeracy as a society gets in the way of understanding the nuances revealed by data analysis. I am thinking, for example, of the controversy over Austin Energy’s renewable push, a debate conducted without any concept of volatility or any breakdown of annualized rate increases.
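To make the volatility point concrete, here is a minimal sketch of what that analysis would look like. The rates below are hypothetical illustration numbers, not Austin Energy’s actual figures:

```python
import statistics

# Hypothetical residential electric rates (cents per kWh), one per year.
rates = [9.5, 9.8, 10.9, 10.7, 12.4, 12.6]

# Year-over-year percentage increases.
annual_increases = [
    (curr / prev - 1) * 100
    for prev, curr in zip(rates, rates[1:])
]

mean_increase = statistics.mean(annual_increases)
volatility = statistics.stdev(annual_increases)  # sample standard deviation

print(f"mean annual increase: {mean_increase:.1f}%")
print(f"volatility (std dev): {volatility:.1f}%")
```

Two series can share the same average increase while one swings wildly year to year; the standard deviation is what separates them, and that distinction is exactly what the rate debate tends to miss.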
The first two problems are easier to solve, but perhaps if more data and associated analysis affecting daily lives is out there, the median voter will start taking an interest in things like standard deviations. Just think about the number of statistical terms used in sports, and you get a sense that non-quants can get into numbers when they see the link to insights they care about.