Class 24 – Tuesday, May 3, 2011

From Maura:
We started out talking again about independent events and calculating probabilities. We went through the coin tossing example – again – and saw the pattern of multiplication. Then we did a few other examples. Suppose a survey shows that 9 out of 10 school children choose cheese as their favorite pizza. If we choose three children at random, what is the probability that all three will have cheese as their favorite? The class was stumped, so we backed up and talked about choosing one child at random. This meant explaining what the “9 out of 10” means and how it applies in general, not to any particular 10 children in a room. They saw that the probability is 90%, so then we moved on to choosing the three children. Then they saw that it was (0.9)^3 or about 73%. Next we did problem 12.5 from the text: given that about 23% of US college presidents are women, what are the odds that a randomly selected group of 12 colleges will all be led by women? Once we re-phrased it that way, they saw that it would be (0.23)^12 or, as the calculator reads, 2.19….E-08. Once again we had to interpret that as 0.000 000 0219. Is it 1 in 50 million? It’s close enough (1 in 46 million is closer). We noted that giving the statistic as 1 in 50 million was more intuitive than 0.00000219%.
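
For readers who want to check the arithmetic, here is a minimal Python sketch of the multiplication pattern. The function name prob_all is made up for this illustration; the numbers are the ones from the two examples above, and the sketch assumes the events really are independent.

```python
def prob_all(p, n):
    """Chance that n independent events, each with probability p, all occur."""
    return p ** n

# Three randomly chosen children all prefer cheese pizza.
pizza = prob_all(0.9, 3)

# Twelve randomly chosen colleges are all led by women.
presidents = prob_all(0.23, 12)

print(f"pizza: {pizza:.1%}")
print(f"presidents: {presidents:.3g}")
# Turn the tiny probability into the more intuitive "1 in N" form.
print(f"about 1 in {1 / presidents / 1e6:.0f} million")
```

Dividing 1 by the probability is what turns the calculator's E-08 notation into a "1 in so many million" statement.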

Next I started talking about contingency tables. We worked through the first example in the book, which is taken from an actual medical journal article. As usual, part of the work is understanding what the different pieces of the situation represent. We broke it down then built the table and noted that we can add across and down to get subtotals. We had just enough time to start talking about false positives and false negatives – mostly to get the idea across and to explain about the consequences of each.

From Ethan:

When I met Maura between classes we realized that we were both pleased that the semester is almost over. Students aren’t supposed to know that teachers think that sometimes too. But they don’t seem to be reading this blog …

I spent most of the class on independence, trying to establish the idea, not the formalism. I worked extensively with an example from the weather. Suppose the web says there’s a 40% chance of rain in Boston today (as in fact it does).

One student appreciated the philosophical question involved in stating such a probability. You can’t compute it by counting cases, as with coins, dice or cards.

The web says the same thing for Brookline. I asked for the probability that it would rain in both towns. It took a little work (but not too much) to establish that the probability would be 40%, since the towns are adjacent and have the same weather (with high probability!).

I filled in the 2×2 table

                      Boston
                    rain    dry
Brookline   rain     40%     0%
            dry       0%    60%

which they all were willing to believe. I noted that the columns summed to 40% and 60% respectively, since that took into account Boston weather whatever happened in Brookline.

We then found that there was a 90% chance of rain in San Francisco, and started on the same question — what’s the probability of rain in both places? I quickly discovered that the numbers were too confusing (too random?) so moved instead to Detroit, and assumed the probability of rain there was also 40%.

It was clear (after a while) that the probability of rain in both Boston and Detroit should be less than 40%, but no one had a clear idea of how to figure out what it was. I fumbled around a bit looking for a convincing argument, and hit on this, which seemed to work. We think of the probabilities as percentages. When we’re confused about percentage calculations it often helps to start with 100: in this case, 100 days. On 40 of those days it will rain in Boston. Since the weather in Detroit is independent of that in Boston (note intuitive meaning of “independent”) it will rain in Detroit on 40% of those 40 days, so 16 days. That means a 16/100 = 16% chance of rain in both places. We easily filled in the table:

                      Boston
                    rain    dry
Detroit     rain     16%    24%
            dry      24%    36%

and then the table for San Francisco

                          Boston
                        rain    dry
San Francisco   rain     36%    54%
                dry       4%     6%

No one was surprised at the small numbers in the last row, since it’s rarely dry in San Francisco.

We checked all the row and column sums; they seemed to make sense. One student noticed that we were just multiplying the probabilities. I applauded that, but said (and think) that remembering a rule (“multiply probabilities”) is a lot less valuable than being able to think about the entries in the table. The former is easy to forget and doesn’t really explain much. It’s good for professionals, but not for ordinary people who just want to make occasional sense of questions like these.
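
The 100-days argument can be sketched in a few lines of Python. This is only an illustration under the same assumption we made in class, that the weather in the two cities is independent; the function name joint_table and its parameters are invented for the sketch.

```python
def joint_table(p_row, p_col, days=100):
    """Fill a 2x2 rain/dry table (in percent) for two independent cities.

    p_row is the chance of rain in the row city, p_col in the column city.
    The counting mirrors the class argument: take `days` days, find the
    days with a given weather in the column city, then take the row
    city's share of those days.
    """
    table = {}
    for row_label, pr in (("rain", p_row), ("dry", 1 - p_row)):
        for col_label, pc in (("rain", p_col), ("dry", 1 - p_col)):
            col_days = days * pc       # e.g. 40 rainy days in Boston
            both_days = col_days * pr  # e.g. 40% of those 40 days
            table[(row_label, col_label)] = 100 * both_days / days
    return table

detroit = joint_table(0.4, 0.4)        # rows: Detroit, columns: Boston
san_francisco = joint_table(0.9, 0.4)  # rows: San Francisco, columns: Boston

for (row, col), pct in san_francisco.items():
    print(f"San Francisco {row}, Boston {col}: {pct:.0f}%")
```

The point of writing it this way, rather than as a bare product, is that each table entry is still a count of days out of 100.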

Moved on to break-the-bank: why doubling your bet won’t work. Successive turns of the roulette wheel are independent, so the probability of n even money losses in a row is about 1/2^n. For the third day in a row they didn’t remember (quickly) that 1 in 2^10 is about 1 in 1000. Maybe they will from now on. So doubling your bet means a large chance of winning a little and a small chance of losing a lot – in fact, of losing it all since you have a finite bankroll, which might even be less than the house limit.
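
Here is a rough Python sketch of the doubling arithmetic. It assumes an idealized even money bet that loses exactly half the time (a real wheel has zero slots, so the chance of losing is a bit more than 1/2), a starting bet of 1 unit, and a bankroll that covers at most max_bets bets. The function name doubling_outcome is invented for the illustration.

```python
def doubling_outcome(max_bets, p_lose=0.5):
    """Bet 1 unit, double after every loss, stop at the first win.

    Returns the chance of losing max_bets times in a row and the total
    amount lost when that happens.
    """
    p_bust = p_lose ** max_bets
    loss_if_bust = 2 ** max_bets - 1  # 1 + 2 + 4 + ... + 2^(max_bets - 1)
    return p_bust, loss_if_bust

p_bust, loss = doubling_outcome(10)

# Any win along the way nets exactly 1 unit, whichever bet it comes on.
expected = (1 - p_bust) * 1 - p_bust * loss

print(f"chance of 10 losses in a row: 1 in {1 / p_bust:.0f}")
print(f"amount lost if that happens: {loss} units")
print(f"average result per sequence: {expected} units")
```

Even in this idealized fair game the average comes out to zero: the many small wins are exactly balanced by the rare large loss.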

Spent the last few minutes exploring the book of odds. Never got to false positives, which are on the homework for next time. So the homework will be a real struggle, but that struggle will make the class more productive.
