In honor of the end of the semester and the arrival of spring, here is my first, and so far only, attempt at writing an exam. This was given as a take-home midterm by a friend teaching a statistics-for-people-who-don't-like-math course at a school which, to protect the innocent, I'll call the University of Winnemac. (It's a long story.) I'm fond of it, but the students at Winnemac *hated* it, and apparently none of them got any of the jokes. Solutions are available upon request. (Note that Problem 5 is a simplified version of the "Carnival Booth" algorithm due to Samidh Chakrabarti and Aaron Strauss.)

Problems are longer and harder than exercises; they also count for twice as much. Some questions are in multiple choice format, but you should always show your work.

A designer measures the height of a hundred models, randomly chosen from the runways in Milan. The sample mean height is 5.85 feet, with a sample
standard deviation of 0.15 feet.

(a) What is the 95% confidence interval for the mean height, in feet, of
Milanese models?

(b) What is the standard deviation of their height in inches?

(c) What is the confidence interval for their height in inches?

(d) What is the confidence interval (in feet or inches) for their height while
wearing three-inch platform shoes?

(e) The standard deviation of their height in platforms?
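
For anyone checking their arithmetic on the unit-conversion parts, here is a minimal Python sketch, not an official solution. It assumes the large-sample normal interval (reasonable with a hundred models); the helper name `mean_ci` and the constants are hypothetical. The point it illustrates is that multiplying by 12 rescales both the interval and the standard deviation, while adding three inches of platform shifts the interval and leaves the standard deviation alone.

```python
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Large-sample (normal) confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sd / n ** 0.5
    return mean - half_width, mean + half_width


INCHES_PER_FOOT = 12
PLATFORM_INCHES = 3

# Figures from the exercise: 100 models, mean 5.85 ft, sd 0.15 ft.
ci_feet = mean_ci(5.85, 0.15, 100)

# Changing units multiplies the mean, the sd, and the interval by 12.
sd_inches = 0.15 * INCHES_PER_FOOT
ci_inches = tuple(INCHES_PER_FOOT * x for x in ci_feet)

# Platform shoes add a constant: the interval shifts, the sd does not change.
ci_platform_inches = tuple(x + PLATFORM_INCHES for x in ci_inches)
sd_platform_inches = sd_inches

print(ci_feet, ci_inches, ci_platform_inches, sd_platform_inches)
```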

Every week the market price for frozen concentrated orange juice (FCOJ) futures either goes up or down, with equal probability. A market analyst obtains a list of 1024 FCOJ traders. At the beginning of the year he sends them a letter announcing a free trial period of his new market prediction service; half the letters say the market will rise that week, and half that it will fall. The next week he discards the names of the traders to whom he made the wrong prediction, and repeats the process. Thus after *k* weeks, the remaining names on the list have received *k* correct predictions in a row for free.

(a) What is the probability that any given trader is still on the list after
seven weeks?

(b) How many names are still on the list after seven weeks?
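
The analyst's procedure can be sketched as a small simulation. This is only an illustration under the assumptions stated in the exercise (an even split of letters each week, a fair-coin market); the function name `run_scheme` is hypothetical. Whichever way the market moves, exactly half the list survives each week.

```python
import random


def run_scheme(names, weeks, rng):
    """Return the names that received a correct prediction every week."""
    remaining = list(names)
    for _ in range(weeks):
        half = len(remaining) // 2
        told_up, told_down = remaining[:half], remaining[half:]
        market_up = rng.random() < 0.5  # market rises or falls with equal odds
        # Keep only the traders whose letter matched what the market did.
        remaining = told_up if market_up else told_down
    return remaining


survivors = run_scheme(range(1024), weeks=7, rng=random.Random(0))
print(len(survivors), len(survivors) / 1024)
```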

What are the probabilities of getting the following sequences of heads and tails from 30 consecutive tosses of a fair coin?

- HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
- HTHTHTHTHTHTHTHTHTHTHTHTHTHTHT
- HHTTTTHHTHHHHHTHHHTTHTHTTHTTTH
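
A minimal sketch of the underlying calculation, assuming independent tosses: the probability of one specific sequence is the product of the per-toss probabilities. The function name `sequence_probability` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Probability of one exact sequence of independent coin tosses."""
    prob = Fraction(1)
    for toss in sequence:
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob


all_heads = sequence_probability("H" * 30)
alternating = sequence_probability("HT" * 15)
print(all_heads, alternating, all_heads == alternating)
```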

Mutual fund annual rates of return are normally distributed. What fraction of funds will have returns between one and three standard deviations above the mean in a given year?

A survey of cats in J. Random College Town finds their weight is normally distributed, with a mean of 9 pounds and a standard deviation of 1.5 pounds. The same survey finds that the weight of cat owners is also normally distributed, with a mean of 150 pounds and a standard deviation of 15 pounds. Describe the distribution of the weight of cat-owners holding their cats, assuming feline and human weights are independent random variables.
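
As a hedged illustration of the rule involved (means add, and for independent variables the variances add), Python's standard-library `NormalDist` supports adding two independent normal variables directly. The variable names are hypothetical.

```python
from statistics import NormalDist

cat = NormalDist(mu=9, sigma=1.5)       # cat weight, pounds
owner = NormalDist(mu=150, sigma=15)    # owner weight, pounds

# Adding two NormalDist objects treats them as independent:
# the means add, and the standard deviations combine in quadrature.
owner_holding_cat = owner + cat

print(owner_holding_cat.mean, owner_holding_cat.stdev)
```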

In Exercises 6--8, A, B and C are three events. P(A) = 0.75, P(B) = 0.65 and P(C) = 0.40. Note: Drawing Venn diagrams is *not required* to solve these problems, but it may help.

(a) What is the *smallest* possible value of P(A *and* B)?

(b) The largest possible value?

Which of the following statements *could* be true?

- C is the complement of A
- B completely contains A
- C is the complement of (A *and* B)
- C is the complement of (A *or* B)
- A and B are mutually exclusive

(a) If C = (A *and* B), what is P(B *or* A)?

(b) If C = (A *and* B), what is P(B|A)?

In his book *I Am Sickened by Your Ignorance*, the critic Orpheus Bruno declared that "no more than one poem in ten thousand is truly great; the rest might as well be shopping lists". Bruno was subsequently abducted by renegade experimental psychologists and made to rate a large number of randomly-selected poems from 0 to 100, and also to say which ones were "truly great". He, of course, ignored the 0--100 scale entirely, preferring a boundless scale to mirror his boundless magnificence. His ratings, in fact, were normally distributed (or, as he said, "followed the law of the immortal Gauss"): the mean rating was 45, with a standard deviation of 16, and poems which scored 93 or above were "truly great". What is the proportion of truly great poetry?

Professor Sheila Nagig of the Miskatonic University Department of Statistics refuses to give tests to her students, saying that most students who get high scores are just lucky, not knowledgeable, so the test isn't informative. Pressed by the Dean to explain herself, she argues as follows. Consider a test with 100 yes-or-no questions. A student's degree of knowledge of the subject (say, finite-temperature canonical quantum gravity) can be measured by the probability *p* of their answering a given question correctly. Assuming the questions are independent (which is the case on Prof. Nagig's tests), a student's score is therefore a binomial random variable, *B*(*p*, 100). Normally a passing score is 70% correct, or 70 questions out of 100. Note that someone who knows nothing and guesses completely at random has *p* = 0.5. (In the following, you may use the normal approximation if you wish.)

(a) What is the probability of a student passing if their *p* = 0.5?

(b) What is the probability of a student passing if their *p* = 0.7?

(c) Assume that one American in a million has a finite-temperature canonical quantum gravity *p* of 0.7, and the rest have *p* = 0.5, i.e., they know nothing about it. What is the probability that a random American who passes a test in the subject knows nothing about it?

(d) Explain what is wrong with Prof. Nagig's argument.
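
Parts (a) through (c) can be checked numerically with a short sketch. This sums the binomial tail exactly rather than using the normal approximation, then applies Bayes' rule with the one-in-a-million prior from part (c). The function names are hypothetical, and nothing here addresses part (d).

```python
from math import comb


def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def posterior_know_nothing(prior_expert, pass_if_expert, pass_if_guessing):
    """P(guesser | passed) by Bayes' rule, for a two-type population."""
    expert = prior_expert * pass_if_expert
    guesser = (1 - prior_expert) * pass_if_guessing
    return guesser / (expert + guesser)


pass_guessing = binom_tail(100, 0.5, 70)  # p = 0.5, needs 70 or more correct
pass_expert = binom_tail(100, 0.7, 70)    # p = 0.7, needs 70 or more correct
posterior = posterior_know_nothing(1e-6, pass_expert, pass_guessing)

print(pass_guessing, pass_expert, posterior)
```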

Glenn and Glenda Martingale have a successful angora sweater business, and, in a fit of vertical integration, buy an angora goat farm. As you know, the most important trait of an angora goat is its fuzziness, measured in hairs per square millimeter. The Martingales, being statistically sophisticated, determine the fuzziness of their goats as follows. For each goat in the herd, fuzziness is measured at a random spot on its body, and then averaged across all goats in the herd. These are their results.

| Quantity | Value (hairs per square millimeter) |
| --- | --- |
| Sample mean fuzziness | 10.30 |
| Sample standard deviation | 1.21 |
| Low end of 95% confidence interval | 9.99 |
| High end of 95% confidence interval | 10.61 |

Assume, like the Martingales, that fuzziness is normally distributed. Calculate the number of goats in the herd, i.e., the number of samples. Round
to the nearest goat. You may assume that there are a *lot* of goats.

**Extra credit**. Can you re-do the calculation, *without* assuming the number of samples is large?
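
For the large-sample version of the question, one way to set up the calculation is to invert the usual interval formula: the half-width equals z times the standard deviation over the square root of n. The sketch below does that with the normal quantile; the function name `sample_size_from_ci` is hypothetical, and the extra-credit small-sample case (which would need the t distribution) is not handled.

```python
from statistics import NormalDist


def sample_size_from_ci(sd, low, high, level=0.95):
    """Solve half_width = z * sd / sqrt(n) for n (large-sample assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = (high - low) / 2
    return (z * sd / half_width) ** 2


n = sample_size_from_ci(sd=1.21, low=9.99, high=10.61)
print(n, round(n))
```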

The International Group of Angora Fanciers (IGAF) stipulates that only wool
from goats whose fuzziness is at least 9.09 can be used to make angora sweaters. Assume that the fuzziness is normally distributed with mean 10.30 and standard deviation 1.21 (as in the previous problem).

(a) What is the probability that a random goat on the farm is fuzzy enough for IGAF?

(b) What is the probability that at least 25 out of a random group of 30 goats meet the IGAF standard? **Calculate this exactly, using the binomial distribution.**

(c) Repeat the previous calculation using the normal approximation.

In his book *Alchemical Management: Getting the Lead Out and the Gold In*, Alex Cagliostro, the famous business consultant, profiles ten companies that achieved excellence after adopting his system of alchemical management (seminars available through appointment with Cagliostro Consulting PLC). The rival consultancy of Hooke, Waterhouse, Comstock and Root points out that there were seventy-two other companies which tried alchemical management without achieving excellence.

(a) Assuming these 82 firms form a representative sample (aside: is that reasonable?), calculate a 95% confidence interval for the proportion of
excellent firms among alchemically-managed companies.

(b) It is known that 12% of all firms achieve excellence. Test HWCR's claim that excellence is no more common among alchemically-managed firms than among non-alchemical companies. State the null and alternative hypotheses. Is this a one-sided or two-sided test? Calculate the *p*-value.

Airport security cannot give a detailed screening to everybody trying to fly by plane, so they select a fraction for detailed screening. Suppose there are
four kinds of passengers: innocent-looking law-abiding citizens, suspicious-looking law-abiding citizens, suspicious-looking terrorists and innocent-looking terrorists. Security officials decide to screen all suspicious-looking passengers, and a random 2% of all innocent-looking people just to be safe. 10% of the total population is suspicious-looking, but 80% of all terrorists are.

(a) What is the probability that no one in a group of four random terrorists
will be screened the next time they fly?

(b) Say a terrorist has *evaded scrutiny* if they have taken five flights without being screened once. What is the probability that a terrorist is innocent-looking, given that they have evaded scrutiny?

(c) Supposing one has a group of four terrorists who have all evaded scrutiny, what is the probability that *none* of them will be screened the next time they fly?

(d) Repeat the calculation in (c), supposing that airport security ignores who looks suspicious or innocent, and screens 12% of all passengers completely at random.
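
The structure of this problem can be sketched in a few lines of Python. This is only an illustration under the problem's stated numbers, assuming flights and passengers are screened independently; the function names are hypothetical. It compares the profiling policy (screen every suspicious-looking passenger, 2% of the rest) with purely random screening of 12%.

```python
def p_innocent_given_evaded(p_suspicious, screen_suspicious, screen_innocent, flights):
    """P(innocent-looking | unscreened on every one of `flights` flights)."""
    suspicious = p_suspicious * (1 - screen_suspicious) ** flights
    innocent = (1 - p_suspicious) * (1 - screen_innocent) ** flights
    return innocent / (suspicious + innocent)


def p_next_unscreened(p_innocent, screen_suspicious, screen_innocent):
    """Chance one terrorist passes the next flight unscreened."""
    return (p_innocent * (1 - screen_innocent)
            + (1 - p_innocent) * (1 - screen_suspicious))


P_SUSPICIOUS_TERRORIST = 0.8  # 80% of terrorists look suspicious

# (a) Four random terrorists under profiling.
part_a = p_next_unscreened(1 - P_SUSPICIOUS_TERRORIST, 1.0, 0.02) ** 4

# (b), (c) Terrorists who have already evaded scrutiny on five flights.
part_b = p_innocent_given_evaded(P_SUSPICIOUS_TERRORIST, 1.0, 0.02, flights=5)
part_c = p_next_unscreened(part_b, 1.0, 0.02) ** 4

# (d) Random screening of 12%: appearance no longer matters.
post_d = p_innocent_given_evaded(P_SUSPICIOUS_TERRORIST, 0.12, 0.12, flights=5)
part_d = p_next_unscreened(post_d, 0.12, 0.12) ** 4

print(part_a, part_b, part_c, part_d)
```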

Posted at April 23, 2004 18:46