maxjmartin.com

The Signal and the Noise

Book notes for “The Signal and the Noise”, The Art and Science of Prediction by Nate Silver

Notes: Enjoyed this look at prediction. It was very maths-light and anecdote-heavy, but it still had some useful insights.

Highlights:

It was hard to tell the signal from the noise. The story the data tells us is often the one we’d like to hear, and we usually make sure that it has a happy ending. loc 128

failures like these have been fairly common in political prediction. A long-term study by Philip E. Tetlock of the University of Pennsylvania found that when political scientists claimed that a political outcome had absolutely no chance of occurring, it nevertheless happened about 15 percent of the time. loc 225

Prediction is indispensable to our lives. Every time we choose a route to work, decide whether to go on a second date, or set money aside for a rainy day, we are making a forecast about how the future will proceed—and how our plans will affect the odds for a favorable outcome. loc 274

Human beings have an extraordinary capacity to ignore risks that threaten their livelihood, as though this will make them go away. loc 447

If you make the wrong assumptions, your model may be extraordinarily wrong. One assumption is that each mortgage is independent of the others. In this scenario, your risks are well diversified: if a carpenter in Cleveland defaults on his mortgage, this will have no bearing on whether a dentist in Denver does. Under this scenario, the risk of losing your bet would be exceptionally small—the equivalent of rolling snake eyes five times in a row. Specifically, it would be 5 percent taken to the fifth power, which is just one chance in 3,200,000. This supposed miracle of diversification is how the ratings agencies claimed that a group of subprime mortgages that had just a B+ credit rating on average—which would ordinarily imply more than a 20 percent chance of default—had almost no chance of defaulting when pooled together. The other extreme is to assume that the mortgages, instead of being entirely independent of one another, will all behave exactly alike. That is, either all five mortgages will default or none will. Instead of getting five separate rolls of the dice, you’re now staking your bet on the outcome of just one. There’s a 5 percent chance that you will roll snake eyes and all the mortgages will default—making your bet 160,000 times riskier than you had thought originally. loc 477
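A quick check of the arithmetic in that passage, as a minimal sketch. The 5 percent default rate and the pool of five mortgages come from the quote; the variable names are my own.

```python
# Chance that all five mortgages default under the two extreme assumptions.
p_default = 0.05   # per-mortgage default chance, from the passage
n_mortgages = 5    # size of the pool, from the passage

# Independent mortgages: each one has to default separately.
p_all_independent = p_default ** n_mortgages

# Perfectly correlated mortgages: they all default together or not at all.
p_all_correlated = p_default

print(f"independent: 1 chance in {1 / p_all_independent:,.0f}")
print(f"correlated:  1 chance in {1 / p_all_correlated:,.0f}")
print(f"ratio: {p_all_correlated / p_all_independent:,.0f} times riskier")
```

Real pools sit somewhere between the two extremes, so the independence assumption is the thing to question first.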

Important to keep in mind when making predictions: things are often correlated in some way, so you cannot blindly update on each piece of evidence as if it were independent, or you risk double-counting.

Risk, as first articulated by the economist Frank H. Knight in 1921, is something that you can put a price on. Say that you’ll win a poker hand unless your opponent draws to an inside straight: the chances of that happening are exactly 1 chance in 11. This is risk. It is not pleasant when you take a “bad beat” in poker, but at least you know the odds of it and can account for it ahead of time. In the long run, you’ll make a profit from your opponents making desperate draws with insufficient odds. Uncertainty, on the other hand, is risk that is hard to measure. You might have some vague awareness of the demons lurking out there. You might even be acutely concerned about them. But you have no real idea how many of them there are or when they might strike. Your back-of-the-envelope estimate might be off by a factor of 100 or by a factor of 1,000; there is no good way to know. This is uncertainty. Risk greases the wheels of a free-market economy; uncertainty grinds them to a halt. loc 528

The housing boom of the 1950s, however, had almost nothing in common with the housing bubble of the 2000s. The comparison helps to reveal why the 2000s became such a mess. The postwar years were associated with a substantial shift in living patterns. Americans had emerged from the war with a glut of savings and into an age of prosperity. There was a great demand for larger living spaces. Between 1940 and 1960, the homeownership rate surged to 62 percent from 44 percent, with most of the growth concentrated in the suburbs. Furthermore, the housing boom was accompanied by the baby boom: the U.S. population was growing at a rate of about 20 percent per decade after the war, about twice its rate of growth during the 2000s. This meant that the number of homeowners increased by about 80 percent during the decade—meeting or exceeding the increase in housing prices. In the 2000s, by contrast, homeownership rates increased only modestly: to a peak of about 69 percent in 2005 from 65 percent a decade earlier. Few Americans who hadn’t already bought a home were in a position to afford one. loc 551

I evaluated nearly 1,000 predictions that were made on the final segment of the show by McLaughlin and the rest of the panelists. About a quarter of the predictions were too vague to be analyzed or concerned events in the far future. But I scored the others on a five-point scale ranging from completely false to completely true. The panel may as well have been flipping coins. I determined 338 of their predictions to be either mostly or completely false. The exact same number—338—were either mostly or completely true. loc 860

Hedgehogs are type A personalities who believe in Big Ideas—in governing principles about the world that behave as though they were physical laws and undergird virtually every interaction in society. Think Karl Marx and class struggle, or Sigmund Freud and the unconscious. Or Malcolm Gladwell and the “tipping point.” Foxes, on the other hand, are scrappy creatures who believe in a plethora of little ideas and in taking a multitude of approaches toward a problem. They tend to be more tolerant of nuance, uncertainty, complexity, and dissenting opinion. If hedgehogs are hunters, always looking out for the big kill, then foxes are gatherers. Foxes, Tetlock found, are considerably better at forecasting than hedgehogs. loc 947

One of Tetlock’s more remarkable findings is that, while foxes tend to get better at forecasting with experience, the opposite is true of hedgehogs: their performance tends to worsen as they pick up additional credentials. Tetlock believes the more facts hedgehogs have at their command, the more opportunities they have to permute and manipulate them in ways that confirm their biases. loc 1011

The FiveThirtyEight forecasting model started out pretty simple—basically, it took an average of polls but weighted them according to their past accuracy—then gradually became more intricate. loc 1081

The wide distribution of outcomes represented the most honest expression of the uncertainty in the real world. The forecast was built from forecasts of each of the 435 House seats individually—and an exceptionally large number of those races looked to be extremely close. loc 1090

The further down the ballot you go, the more volatile the polls tend to be: polls of House races are less accurate than polls of Senate races, which are in turn less accurate than polls of presidential races. Polls of primaries, also, are considerably less accurate than general election polls. During the 2008 Democratic primaries, the average poll missed by about eight points, far more than implied by its margin of error. loc 1100

A Senate candidate with a five-point lead on the day before the election, for instance, has historically won his race about 95 percent of the time—almost a sure thing, even though news accounts are sure to describe the race as “too close to call.” By contrast, a five-point lead a year before the election translates to just a 59 percent chance of winning—barely better than a coin flip. loc 1108

If you forecast that a particular incumbent congressman will win his race 90 percent of the time, you’re also forecasting that he should lose it 10 percent of the time. The signature of a good forecast is that each of these probabilities turns out to be about right over the long run. loc 1146
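One way to check this "about right over the long run" property is a calibration table: group past forecasts by their stated probability and compare against how often the event actually happened. A minimal sketch follows; the track record is hypothetical and the function name is my own.

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Group (probability, happened) pairs by stated probability and
    return {stated probability: observed frequency}."""
    buckets = defaultdict(list)
    for prob, happened in forecasts:
        buckets[prob].append(1 if happened else 0)
    return {prob: sum(hits) / len(hits) for prob, hits in sorted(buckets.items())}

# Hypothetical track record: ten 90 percent calls (nine came true)
# and ten 60 percent calls (six came true).
record = ([(0.9, True)] * 9 + [(0.9, False)]
          + [(0.6, True)] * 6 + [(0.6, False)] * 4)

table = calibration_table(record)
for prob, observed in table.items():
    print(f"forecast {prob:.0%} -> happened {observed:.0%} of the time")
```

A well-calibrated forecaster has observed frequencies close to the stated probabilities in every bucket, including the upsets.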

“When the facts change, I change my mind,” the economist John Maynard Keynes famously said. “What do you do, sir?” loc 1168

Wasserman will usually maintain the same rating after the interview. As hard as he works to glean new information from the candidates, it is often not important enough to override his prior take on the race. Wasserman’s approach works because he is capable of evaluating this information without becoming dazzled by the candidate sitting in front of him. A lot of less-capable analysts would open themselves to being charmed, lied to, spun, or would otherwise get hopelessly lost in the narrative of the campaign. Or they would fall in love with their own spin about the candidate’s interview skills, neglecting all the other information that was pertinent to the race. loc 1277

But Sanders provides the Dodgers with the most valuable kind of information—the kind of information that other people don’t have. loc 1721

Good innovators typically think very big and they think very small. New ideas are sometimes found in the most granular details of a problem where few others bother to look. And they are sometimes found when you are doing your most abstract and philosophical thinking, considering why the world is the way that it is and whether there might be an alternative to the dominant paradigm. Rarely can they be found in the temperate latitudes between these two spaces, where we spend 99 percent of our lives. loc 1854

In 1814, Laplace made the following postulate, which later came to be known as Laplace’s Demon: We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes. loc 1948

This is the process by which modern weather forecasts are made. These small changes, introduced intentionally in order to represent the inherent uncertainty in the quality of the observational data, turn the deterministic forecast into a probabilistic one. For instance, if your local weatherman tells you that there’s a 40 percent chance of rain tomorrow, one way to interpret that is that in 40 percent of his simulations, a storm developed, and in the other 60 percent—using just slightly different initial parameters—it did not. loc 2075

the for-profit weather forecasters rarely predict exactly a 50 percent chance of rain, which might seem wishy-washy and indecisive to consumers. Instead, they’ll flip a coin and round up to 60, or down to 40, even though this makes the forecasts both less accurate and less honest. Floehr also uncovered a more flagrant example of fudging the numbers, something that may be the worst-kept secret in the weather industry. Most commercial weather forecasts are biased, and probably deliberately so. In particular, they are biased toward forecasting more precipitation than will actually occur—what meteorologists call a “wet bias.” The further you get from the government’s original data, and the more consumer facing the forecasts, the worse this bias becomes. Forecasts “add value” by subtracting accuracy. loc 2270

Earthquakes cannot be predicted? This is a book about prediction, not a book that makes predictions, but I’m willing to stick my neck out: I predict that there will be more earthquakes in Japan next year than in New Jersey. And I predict that at some point in the next one hundred years, a major earthquake will hit somewhere in California. loc 2510

If you’re speaking with a seismologist: A prediction is a definitive and specific statement about when and where an earthquake will strike: a major earthquake will hit Kyoto, Japan, on June 28. Whereas a forecast is a probabilistic statement, usually over a longer time scale: there is a 60 percent chance of an earthquake in Southern California over the next thirty years. The USGS’s official position is that earthquakes cannot be predicted. They can, however, be forecasted. loc 2515

there have been a number of medium-size ones; between 1960 and 2009, there were about fifteen earthquakes that measured between 5.0 and 5.9 on the magnitude scale in the area surrounding the city. That works out to about one for every three years. According to the power law that Gutenberg and Richter uncovered, that means that an earthquake measuring between 6.0 and 6.9 should occur about once every thirty years in Tehran. loc 2576
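The extrapolation in that passage can be sketched in a few lines. The count of fifteen and the date range come from the quote. The tenfold drop in frequency per whole magnitude step is my assumption, chosen because it reproduces the passage's "once every thirty years" figure.

```python
count_m5 = 15                # magnitude 5.0-5.9 quakes near Tehran, from the passage
years = 2009 - 1960 + 1      # 1960 through 2009, counted inclusively
factor_per_magnitude = 10    # assumed: each whole magnitude step is ~10x rarer

def recurrence_years(magnitude_floor):
    """Rough years between quakes in the band starting at magnitude_floor."""
    steps = magnitude_floor - 5
    rate_per_year = (count_m5 / years) / factor_per_magnitude ** steps
    return 1 / rate_per_year

for magnitude in (5, 6, 7):
    print(f"magnitude {magnitude}.0-{magnitude}.9: "
          f"about one every {recurrence_years(magnitude):.0f} years")
```

This is only the straight-line extrapolation; it says nothing about when within that span a quake would occur.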

Hough’s 2009 book, Predicting the Unpredictable: The Tumultuous Science of Earthquake Prediction, is a history of efforts to predict earthquakes, and is as damning to that enterprise as Phil Tetlock’s study was to political pundits. loc 2641

What this means is that if San Francisco is forecasted to have a major earthquake every thirty-five years, it does not imply that these will be spaced out evenly (as in 1900, 1935, 1970). It’s safer to assume there is a 1 in 35 chance of an earthquake occurring every year, and that this rate does not change much over time regardless of how long it has been since the last one. loc 2672
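A small sketch of what a constant 1-in-35 annual chance implies over longer horizons, assuming each year is independent of the last (which is what the passage recommends assuming). The function name is my own.

```python
annual_chance = 1 / 35   # from the passage

def chance_within(years):
    """Chance of at least one major quake in the next `years` years,
    assuming independent years with the same annual chance."""
    return 1 - (1 - annual_chance) ** years

for horizon in (1, 10, 35, 100):
    print(f"{horizon:>3} years: {chance_within(horizon):.0%}")
```

Under this assumption the chance for next year is the same whether the last quake was one year ago or fifty, and even a full 35-year stretch is far from certain to contain one.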

In reality, when a group of economists give you their GDP forecast, the true 90 percent prediction interval—based on how these forecasts have actually performed and not on how accurate the economists claim them to be—spans about 6.4 points of GDP (equivalent to a margin of error of plus or minus 3.2 percent). When you hear on the news that GDP will grow by 2.5 percent next year, that means it could quite easily grow at a spectacular rate of 5.7 percent instead. Or it could fall by 0.7 percent—a fairly serious recession. Economists haven’t been able to do any better than that, and there isn’t much evidence that their forecasts are improving. The old joke about economists’ having called nine out of the last six recessions correctly has some truth to it; one actual statistic is that in the 1990s, economists predicted only 2 of the 60 recessions around the world a year ahead of time. loc 2998

How does an indicator that supposedly had just a 1-in-4,700,000 chance of failing flop so badly? For the same reason that, even though the odds of winning the Powerball lottery are only 1 chance in 195 million, somebody wins it every few weeks. The odds are hugely against any one person winning the lottery—but millions of tickets are bought, so somebody is going to get lucky. Likewise, of the millions of statistical indicators in the world, a few will have happened to correlate especially well with stock prices or GDP or the unemployment rate. If not the winner of the Super Bowl, it might be chicken production in Uganda. But the relationship is merely coincidental. loc 3058
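The lottery logic is easy to put numbers on. This is a minimal sketch: the 1-in-4,700,000 figure is from the passage, while the count of five million candidate indicators is a made-up stand-in for "millions".

```python
def chance_at_least_one(p_single, n_tries):
    """Chance that at least one of n independent tries succeeds."""
    return 1 - (1 - p_single) ** n_tries

p_fluke = 1 / 4_700_000      # from the passage
n_indicators = 5_000_000     # hypothetical count of indicators examined

print(f"one indicator:  {chance_at_least_one(p_fluke, 1):.7%}")
print(f"all indicators: {chance_at_least_one(p_fluke, n_indicators):.0%}")
```

A coincidence that is wildly unlikely for any single indicator becomes more likely than not once enough indicators are searched.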

A forecaster should almost never ignore data, especially when she is studying rare events like recessions or presidential elections, about which there isn’t very much data to begin with. Ignoring data is often a tip-off that the forecaster is overconfident, or is overfitting her model—that she is interested in showing off rather than trying to be accurate. loc 3151

Large errors like these have been fairly common. Between 1965 and 2009, the government’s initial estimates of quarterly GDP were eventually revised, on average, by 1.7 points. That is the average change; the range of possible changes in each quarterly GDP is higher still, and the margin of error on an initial quarterly GDP estimate is plus or minus 4.3 percent. That means there’s a chance that the economy will turn out to have been in recession even if the government had initially reported above-average growth, or vice versa. The government first reported that the economy had grown by 4.2 percent in the fourth quarter of 1977, for instance, but that figure was later revised to negative 0.1 percent. So we should have some sympathy for economic forecasters. It’s hard enough to know where the economy is going. But it’s much, much harder if you don’t know where it is to begin with. loc 3185

Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes. loc 3241

If you’re looking for an economic forecast, the best place to turn is the average or aggregate prediction rather than that of any one economist. My research into the Survey of Professional Forecasters suggests that these aggregate forecasts are about 20 percent more accurate than the typical individual’s forecast at predicting GDP, 10 percent better at predicting unemployment, and 30 percent better at predicting inflation. This property—group forecasts beat individual ones—has been found to be true in almost every field in which it has been studied. And yet while the notion that aggregate forecasts beat individual ones is an important empirical regularity, it is sometimes used as a cop-out when forecasts might be improved. The aggregate forecast is made up of individual forecasts; if those improve, so will the group’s performance. loc 3252
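A toy illustration of why averaging helps. All numbers here are hypothetical and are not taken from the Survey of Professional Forecasters.

```python
from statistics import mean

actual_growth = 2.5                 # hypothetical outcome, in percent
forecasts = [1.5, 2.0, 3.5, 4.0]    # hypothetical individual forecasts

# Average absolute miss of the individual forecasters.
typical_error = mean(abs(f - actual_growth) for f in forecasts)

# Absolute miss of the averaged (aggregate) forecast.
aggregate_error = abs(mean(forecasts) - actual_growth)

print(f"typical individual error: {typical_error:.2f} points")
print(f"aggregate forecast error: {aggregate_error:.2f} points")
```

Measured by absolute error, the averaged forecast can never do worse than the average individual, because misses in opposite directions partly cancel. It can still lose to the best individual forecaster.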

He (Hanson) is also an advocate of a system he calls “futarchy” in which decisions on policy issues are made by prediction markets rather than politicians. loc 3319

Eventually, the government reported that a total of about fifty-five million Americans had become infected with H1N1 in 2009—about one sixth of the U.S. population rather than one half—and 11,000 had died from it. Rather than being an unusually severe strain of the virus, H1N1 had in fact been exceptionally mild, with a fatality rate of just 0.02 percent. Indeed, there were slightly fewer deaths from the flu in 2009–10 than in a typical season. loc 3490

If a group of influential designers decide that brown will be the hot color next year and start manufacturing lots of brown clothes, and they get models and celebrities to wear brown, and stores begin to display lots of brown in their windows and their catalogs, the public may well begin to comply with the trend. But they’re responding more to the marketing of brown than expressing some deep underlying preference for it. The designer may look like a savant for having “anticipated” the in color, but if he had picked white or black or lavender instead, the same process might have unfolded. loc 3600

If you compare the number of children who are diagnosed as autistic to the frequency with which the term autism has been used in American newspapers, you’ll find that there is an almost perfect one-to-one correspondence (figure 7-4), with both having increased markedly in recent years. loc 3607

Could be cause or could be effect: more coverage may drive more diagnoses, or more diagnoses may drive more coverage.

So he placed $80,000—his entire life savings less a little he’d left over for food and tuition—on the Lakers to win the NBA championship. If he won his bet, he’d make half a million dollars. If he lost it, it would be back to working double shifts at the airport. loc 3876

This seems silly: he should have bet only a portion of his bankroll (using the Kelly criterion, for example: https://www.gwern.net/Prediction%20markets#how-much-to-bet), sized to the amount of edge he thought he had. With a 20k bet at the same odds he would have made about 125k, still a life-changing amount of money, and would have had enough left either to lock in profit as the competition continued or to make other bets. This would greatly reduce his variance (lose it all or win big!); see Taleb on randomness, etc.
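A minimal sketch of Kelly sizing for a bet like this one. The 6.25-to-1 payout is inferred from the quote ($80,000 staked to make half a million). The 25 percent win probability is purely hypothetical, since the highlight does not say what edge he thought he had.

```python
def kelly_fraction(p_win, net_odds):
    """Fraction of bankroll to stake on a bet that pays net_odds-to-1
    and wins with probability p_win; 0 when there is no edge."""
    edge = p_win * net_odds - (1 - p_win)
    return max(0.0, edge / net_odds)

net_odds = 500_000 / 80_000   # 6.25-to-1, inferred from the quote
p_win = 0.25                  # hypothetical estimate of the Lakers' chances

fraction = kelly_fraction(p_win, net_odds)
print(f"Kelly stake: {fraction:.0%} of bankroll")
```

Under these assumed numbers Kelly suggests staking a small slice of the bankroll rather than nearly all of it, and many bettors stake only a fraction of the Kelly amount to cut variance further.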

The prudent thing for a gambler would have been to hedge his bet. For instance, Voulgaris could have put $200,000 on Portland, who were 3-to-2 underdogs, to win Game 7. That would have locked in a profit. If the Blazers won, he would make more than enough from his hedge to cover the loss of his original $80,000 bet, still earning a net profit of $220,000. If the Lakers won instead, his original bet would still pay out—he’d lose his hedge, but net $320,000 from both bets combined. loc 3898
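The hedge arithmetic, worked through as a sketch. The stakes and the 3-to-2 odds are from the passage. The $520,000 profit on the Lakers bet is my inference: it is the figure that makes the quoted $320,000 net come out.

```python
lakers_stake = 80_000
lakers_profit = 520_000     # inferred: $320,000 net plus the $200,000 lost hedge
portland_stake = 200_000
portland_odds = 3 / 2       # 3-to-2 underdog: win $3 for every $2 staked

# Portland wins Game 7: the hedge pays out and the Lakers stake is lost.
net_if_portland_wins = portland_stake * portland_odds - lakers_stake

# Lakers win: the original bet pays out and the hedge stake is lost.
net_if_lakers_win = lakers_profit - portland_stake

print(f"Portland wins: ${net_if_portland_wins:,.0f}")
print(f"Lakers win:    ${net_if_lakers_win:,.0f}")
```

Both outcomes end in profit, which is what "locked in" means here. Note that the Lakers branch assumes they go on to win the championship, as in the passage's simplification.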

Successful gamblers, instead, think of the future as speckles of probability, flickering upward and downward like a stock market ticker to every new jolt of information. When their estimates of these probabilities diverge by a sufficient margin from the odds on offer, they may place a bet. loc 3926

To most people, the sort of things that Voulgaris observes might seem trivial. And in a sense, they are: the big and obvious edges will have been noticed by other gamblers, and will be reflected in the betting line. So he needs to dig a little deeper. loc 3958

Studies have found, for instance, that about 4 percent of married partners cheat on their spouses in any given year, so we’ll set that as our prior. loc 4056

A professional sports bettor like Voulgaris might place a bet only when he thinks he has at least a 54 percent chance of winning it. This is just enough to cover the “vigorish” (the cut a sportsbook takes on a winning wager), plus the risk associated with putting one’s money into play. And for all his skill and hard work—Voulgaris is among the best sports bettors in the world today—he still gets only about 57 percent of his bets right. It is just exceptionally difficult to do much better than that. loc 4241
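To see why a few percentage points matter so much, here is a sketch of the break-even win rate. It assumes the common sportsbook pricing of risking $110 to win $100; that pricing is my assumption and is not stated in the highlight.

```python
def break_even_win_rate(risk, win):
    """Win rate at which a bet risking `risk` to win `win` breaks even."""
    return risk / (risk + win)

def expected_profit(p_win, risk, win):
    """Average profit per bet at a given win rate."""
    return p_win * win - (1 - p_win) * risk

risk, win = 110, 100   # assumed pricing: risk $110 to win $100

print(f"break-even win rate: {break_even_win_rate(risk, win):.1%}")
for p in (0.52, 0.54, 0.57):
    print(f"win rate {p:.0%}: {expected_profit(p, risk, win):+.2f} dollars per bet")
```

Under that pricing a bettor who wins slightly more than half the time still loses money, so the gap between 52 and 57 percent is the whole business.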

While television coverage has been a great boon to poker, it leaves many casual players with misleading impressions about the right way to play it, focusing too much on the results and not enough on the correct decision-making process. “It’s not very common that you can narrow someone’s holdings down to one hand,” loc 5097

when a field is highly competitive, it is only through this painstaking effort around the margin that you can make any money. There is a “water level” established by the competition and your profit will be like the tip of an iceberg: a small […]

One disturbing example is that members of Congress, who often gain access to inside information about a company while they are lobbied and who also have some ability to influence the fate of companies through legislation, return a profit on their investments that beats market averages by 5 to 10 percent per year, a remarkable rate that would make even Bernie Madoff blush. loc 5620

These statistics represent a potential complication for efficient-market hypothesis: when it’s not your own money on the line but someone else’s, your incentives may change. Under some circumstances, in fact, it may be quite rational for traders to take positions that lose money for their firms and their investors if it allows them to stay with the herd and reduces their chance of getting fired. There is significant theoretical and empirical evidence for herding behavior among mutual funds and other institutional investors. “The answer as to why bubbles form,” Blodget told me, “is that it’s in everybody’s interest to keep markets going up.” loc 5836

I pay quite a bit of attention to what the consensus view is—what a market like Intrade is saying—when I make a forecast. It is never an absolute constraint. But the further I move away from that consensus, the stronger my evidence has to be before I come to the view that I have things right and everyone else has it wrong. loc 6018

Some theorists have proposed that we should think of the stock market as constituting two processes in one. There is the signal track, the stock market of the 1950s that we read about in textbooks. This is the market that prevails in the long run, with investors making relatively few trades, and prices well tied down to fundamentals. It helps investors to plan for their retirement and helps companies capitalize themselves. Then there is the fast track, the noise track, which is full of momentum trading, positive feedbacks, skewed incentives and herding behavior. Usually it is just a rock-paper-scissors game that does no real good to the broader economy—but perhaps also no real harm. It’s just a bunch of sweaty traders passing money around. However, these tracks happen to run along the same road, as though some city decided to hold a Formula 1 race but by some bureaucratic oversight forgot to close one lane to commuter traffic. Sometimes, like during the financial crisis, there is a big accident, and regular investors get run over. loc 6028

Sophomoric forecasters sometimes make the mistake of assuming that just because something is hard to model they may as well ignore it. Good forecasters always have a backup plan—a reasonable baseline case that they can default to if they have reason to worry their model is failing. (In a presidential election, your default prediction might be that the incumbent will win—that will do quite a bit better than just picking between the candidates at random.) loc 6582

There is a tendency in our planning to confuse the unfamiliar with the improbable. The contingency we have not considered seriously looks strange; what looks strange is thought improbable; what is improbable need not be considered seriously. loc 6858

When plotted on a double-logarithmic scale, the relationship between the frequency and the severity of terror attacks appears to be, more or less, a straight line. This is, in fact, a fundamental characteristic of power-law relationships: loc 7042

Judging by the death tolls of attacks from 1979 through 2009, for instance, a power-law model like Clauset’s could be taken to imply there is about a 10 percent chance of an attack that would kill at least 10,000 people in a NATO country over the next decade. There is a 3 percent chance of an attack that would kill 100,000, and a 0.6 percent chance of one that would kill one million or more. loc 7163
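A rough way to see the power-law shape in those three figures is to compute the slope between neighboring points on a log-log scale. This is only a sketch from the quoted numbers, not a reconstruction of Clauset's model; the function name is my own.

```python
import math

# (minimum deaths, chance over the next decade), as quoted in the passage.
points = [(10_000, 0.10), (100_000, 0.03), (1_000_000, 0.006)]

def loglog_slope(a, b):
    """Slope of the line through two points on a log-log plot."""
    (x1, p1), (x2, p2) = a, b
    return (math.log10(p2) - math.log10(p1)) / (math.log10(x2) - math.log10(x1))

slopes = [loglog_slope(a, b) for a, b in zip(points, points[1:])]
for (x, _), slope in zip(points[1:], slopes):
    print(f"up to {x:,} deaths: slope {slope:.2f}")
```

A perfect power law would give the same slope for every segment. The practical point is how slowly the chance falls off: a tenfold larger attack is only a few times less likely, not ten times.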