
Chapter 1: The Sample Space

HOMEWORK PROBLEMS: 1, 2, 3, 4, 9, 12, 13, 14, 18, 19

Must establish the possible results of the experiment or observation in question. Events = the results of experiments or observations.

Compound (decomposable) events vs. simple (indecomposable) events:
- Compound events can be decomposed into multiple simple events.
- Events that are not mutually exclusive can occur simultaneously.
- A compound event is an aggregate of certain simple events.

The simple events, representing every thinkable outcome, define the idealized experiment.
- They will be called the (sample) points.
- Every indecomposable result is represented by one and only one sample point.

Sample space: the aggregate of all sample points. All events connected with a given experiment can be described as aggregates of sample points.
- Every thinkable outcome of the experiment is completely described by one and only one sample point.
- The collection of all sample points representing outcomes where A has occurred completely describes the event.

DEF: "event" has the same meaning as "an aggregate of sample points."
- New events can be defined in terms of two or more given ones.

Algebra of Point Sets (the formal algebra of events):
- Capital letters denote events (sets of sample points).
- x ∈ A denotes that the point x is contained in the event A; x ∈ G (the sample space) for all x.
- A = B means the two events consist of exactly the same points.
- Events are generally defined by conditions on their points. A = 0 denotes that no point satisfies the specified set of conditions (the event A contains no sample points; it is impossible).
- Every event A has a corresponding event defined by the condition that A does not occur; it contains all points not contained in A. It is called the complementary event (or negation) of A, denoted A′. Note that G′ = 0.

Given any two events A, B, we can associate two new events defined by the conditions that both occur, or that either or both occur:
- AB: both A and B occur (the simultaneous occurrence of A and B); contains all sample points common to both.
- A ∪ B: either A or B or both occur (at least one occurs); contains all sample points except those belonging to neither.

If A and B exclude each other, there are no common points and the event AB is impossible: AB = 0; A and B are mutually exclusive.
- AB′: A but not B occurs.
- A′B′: neither A nor B occurs.

For every collection of events, two new events are defined:
- the intersection (or simultaneous realization): the aggregate of sample points belonging to all the given sets;
- the union (or realization of at least one): the aggregate of sample points belonging to at least one of them.

A, B, C are mutually exclusive if no two of them have a point in common: AB = 0, AC = 0, BC = 0.

To express that A cannot occur without B occurring (occurrence of A implies occurrence of B; every point of A is contained in B), write A ⊂ B or B ⊃ A; B is implied by A. When A ⊂ B, B − A is used rather than BA′ to denote the event that B but not A occurs; B − A contains all points in B but not in A. Note that A′ = G − A and AA′ = 0.

Examples:
(a) A and B mutually exclusive: A ⊂ B′ and vice versa. So AB = 0 means A is a subset of B′ and B is a subset of A′.
(b) A − AB: the occurrence of A but not of both A and B. So A − AB = AB′.

Discrete sample space: a sample space is discrete iff it contains only finitely many points, or infinitely many points which can be arranged into a simple sequence → probabilities of events are obtained by addition.

Probabilities in Discrete Sample Spaces: Preparations:

Inapplicability of the idealized scheme should permit discovery of assignable causes for the discrepancies. Examples: (a) Distinguishable balls: placing 3 balls in 3 cells gives 27 equally likely arrangements, each of probability 1/27. (b) Indistinguishable balls: Bose-Einstein statistics vs. Maxwell-Boltzmann statistics.

Chapter 2: Elements of Combinatorial Analysis


HOMEWORK PROBLEMS Exercise: 2, 4, 5, 9, 11, 15, 17, 25, 26, 27, 30, 31, 32, 33, 34, 37, 44 Binomial: 5, 8, 10, 11, 21, 22, 23, 24

1. Preliminaries
Pairs. With m elements a1, …, am and n elements b1, …, bn, it is possible to form mn pairs (aj, bk) containing one element from each group. This can be shown by arranging the pairs in a rectangular array, in the form of a multiplication table with m rows and n columns. Examples: (a) Bridge. (b) Seven-way lamps. Multiplets. Given n1 elements a1, …, an1 and n2 elements b1, …, bn2, etc., the same rule extends by induction. Proof. Reformulated: r successive selections (decisions) with exactly nk choices possible at the kth step can produce a total of n1 n2 ⋯ nr different results. Examples: (c) Multiple Classifications. (d) Treatments. (e) Balls and Cells. (f) Flag Display. r flags of different colors are to be displayed on n poles in a row. How many ways can this be done? Assume the flags on each pole are in a definite order from top to bottom. The display can be planned by making r successive decisions for the individual flags. For the first flag we can choose one among n poles. The placement of the first flag divides the pole selected into two segments, so there are n + 1 possible choices for the second flag. Each placement creates a new choice for the next. So n(n + 1)(n + 2) ⋯ (n + r − 1) different displays are possible.

2. Ordered Samples
Given a population of n elements, any ordered arrangement aj1, aj2, …, ajr of r symbols is called an ordered sample of size r drawn from the population. Two procedures are possible:
(1) Sampling with Replacement. Each selection is made from the entire population, so the same element can be drawn more than once. Samples are arrangements in which repetition is permitted. Each of the r elements can be chosen in n ways, so the number of possible samples is n^r.
(2) Sampling without Replacement. Once an element is chosen it is removed from the population, so the same element cannot be drawn more than once. Samples are arrangements without repetitions, and the sample size r cannot exceed the population size n. Each selection reduces the population remaining for the next selection: there are n possible choices for the first element, but only n − 1 for the second, n − 2 for the third, etc. There are n(n − 1) ⋯ (n − r + 1) choices in all. Notation: (n)_r = n(n − 1) ⋯ (n − r + 1). Note that (n)_r = 0 for integers r, n such that r > n.
Theorem. For a population of n elements and a prescribed sample size r, there exist n^r different samples with replacement and (n)_r samples without replacement.
Special Case: r = n. In sampling without replacement, a sample of size n includes the whole population and represents a reordering (or permutation) of its elements. Then the n different elements can be ordered in (n)_r = (n)_n = n(n − 1) ⋯ 2·1 different ways, written as n!
Corollary. The number of different orderings of n elements is n! = n(n − 1) ⋯ 2·1. Examples
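A minimal sketch of these counting rules in Python (not from the text; function names are illustrative), including the rising-factorial count from the flag-display example:

```python
from math import factorial, prod

def samples_with_replacement(n: int, r: int) -> int:
    """Ordered samples of size r with replacement: n^r."""
    return n ** r

def samples_without_replacement(n: int, r: int) -> int:
    """Falling factorial (n)_r = n(n-1)...(n-r+1); equals 0 when r > n."""
    return prod(range(n - r + 1, n + 1))

def flag_displays(n: int, r: int) -> int:
    """Rising factorial n(n+1)...(n+r-1) from example (f)."""
    return prod(range(n, n + r))

print(samples_with_replacement(10, 3))     # 1000
print(samples_without_replacement(10, 3))  # 720
print(flag_displays(3, 2))                 # 12: two flags on three poles
assert samples_without_replacement(5, 5) == factorial(5)  # (n)_n = n!
```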

(a) Ordered Samples from the Human Population. Three persons form an ordered sample from the human population; their birthdays are a sample from the population of calendar days; their ages are a sample of three numbers. (b) Sequences of Letters. A sequence of ten letters represents a sample from the population of 26 letters. Repetitions are permitted, so there are 26^10 possible sequences. On the other hand, printing-press letters exist both conceptually and physically in the form of type. Assume there are 1,000 pieces of type available for each letter. The printer has to choose 10 physical pieces of type, with repetitions of the same piece excluded, so a word can be set up in (26,000)_10 different ways: the sample is now drawn without replacement from the population of 26,000 pieces of type. (c) Samples from the Human Population. Two people can form a sample of size two drawn from the human population; at the same time they can form a sample of size one drawn from the population of all couples. The sample size is defined only in relation to a given population. (d) Ordering and Sampling in Practice.

Drawing r elements from a population of n in succession is an experiment whose possible outcomes are samples of size r; their number depends on whether or not replacement is used. In either case, the experiment can be described by a sample space in which each individual point represents one sample of size r. Usually we assign all samples equal probabilities and speak of them as random samples. When applied to samples or selections, the word "random" has a unique meaning: "random choice" is used to imply that all outcomes have equal probability. For random samples of fixed size r, "random" is used to imply that all possible samples have the same probability: 1/n^r when replacement is used and 1/(n)_r without replacement, where n denotes the size of the population from which the sample is drawn. If n is large and r is relatively small, the ratio (n)_r/n^r is near unity, leading us to expect that in these circumstances the two ways of sampling are practically equivalent. Exercise: In sampling without replacement, the probability for any fixed element of the population to be included in a random sample of size r is 1 − (n − 1)_r/(n)_r = r/n.

With replacement, the probability that a fixed element be included at least once is 1 − (1 − 1/n)^r.
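A quick check of both inclusion probabilities (a sketch; the r/n simplification follows from (n − 1)_r/(n)_r = (n − r)/n):

```python
from math import prod

falling = lambda n, r: prod(range(n - r + 1, n + 1))  # (n)_r

def included_without_replacement(n: int, r: int) -> float:
    """1 - (n-1)_r/(n)_r, which simplifies to r/n."""
    return 1 - falling(n - 1, r) / falling(n, r)

def included_with_replacement(n: int, r: int) -> float:
    """1 - (1 - 1/n)^r."""
    return 1 - (1 - 1 / n) ** r

print(included_without_replacement(100, 10))  # 0.1 (= r/n)
print(included_with_replacement(100, 10))     # ~0.0956, slightly smaller
```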

3. Examples

All special cases of the problem: a random sample of size r is taken from a population of n with replacement. We seek the probability of the event that no element appears twice in the sample, i.e., that the sample could have been obtained without replacement. There are n^r different samples in all, of which (n)_r satisfy the stipulated condition. Assuming all arrangements are equally likely, the probability of no repetition in our sample is p = (n)_r/n^r.

Examples (a) Random Sampling Numbers. Given the population of 10 digits, every succession of five digits represents a sample of size r = 5. Assuming all arrangements have probability 10^−5, the probability that five consecutive random digits are all different is p = (10)_5 · 10^−5 = 0.3024. (b) Balls and Cells. Place n balls into n cells. The probability that every cell is occupied is n!/n^n. Car Accident Problem. If in a city seven accidents occur each week, then, assuming all possible distributions are equally likely, practically all weeks will contain days with two or more accidents. On average only about one week out of 165 will show a uniform distribution of one accident per day (7!/7^7 ≈ 0.0061). The example reveals an unexpected characteristic of pure randomness.
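Both computations verified directly (a sketch using only the formula p = (n)_r/n^r):

```python
from math import factorial, prod

falling = lambda n, r: prod(range(n - r + 1, n + 1))  # (n)_r

print(falling(10, 5) / 10**5)   # (a) 0.3024: five random digits all different
print(factorial(7) / 7**7)      # (b) ~0.0061: seven accidents on seven distinct days
```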

(c) Elevator Problem. An elevator starts with r = 7 passengers and stops at n = 10 floors. What's the probability that no two passengers leave at the same floor? Assume all arrangements of discharging the passengers are equiprobable. Then p = (10)_7 · 10^−7 ≈ 0.06. The numerator is 10 × 9 × 8 × 7 × 6 × 5 × 4 (the remaining options for each successive passenger after another exits: the first to leave has 10 choices, the 2nd has 9, etc.).

(d) Birthdays. The birthdays of r people form a sample of size r from the population of all days in the year. Take a random selection of people as equivalent to a random selection of birthdays and consider the year as consisting of 365 days. The probability that all r birthdays are different then equals (365)_r/365^r. Methods of approximating the calculation are discussed in the text. Exercise 25: Given 30 people, find the probability that among the 12 months there are 6 containing 2 birthdays each and 6 containing 3 birthdays each. #S = 12^30; #A = BI(12; 6) · 30!/((2!)^6 (3!)^6).
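A sketch checking the birthday numbers (assumes a 365-day year, as in the text):

```python
from math import comb, factorial, prod

def all_distinct(r: int, days: int = 365) -> float:
    """(365)_r / 365^r: probability that r people have r different birthdays."""
    return prod(range(days - r + 1, days + 1)) / days ** r

print(all_distinct(23))   # ~0.493: with 23 people a shared birthday
                          # is already more likely than not

# Exercise 25: 6 months with 2 birthdays each and 6 months with 3 each.
favorable = comb(12, 6) * factorial(30) // (factorial(2)**6 * factorial(3)**6)
print(favorable / 12**30)  # ~0.00035
```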

4. Subpopulations and Partitions


A population of size n denotes an aggregate of n elements without regard to their order. Two populations are different only if one contains an element the other does not. Consider a subpopulation of size r: by arbitrarily numbering its elements, it becomes an ordered sample of size r, and there are r! different ways to number the r elements. Theorem 1. A population of n elements possesses BI(n; r) different subpopulations of size r ≤ n. This is the number of different ways in which a subset of r elements can be chosen. Such a subset is uniquely determined by the n − r elements not belonging to it, so there are exactly as many subpopulations of size r as there are subpopulations of size n − r. Hence for 1 ≤ r ≤ n, it must be that BI(n; r) = BI(n; n − r). Examples (a) Bridge and Poker. Probability that a poker hand contains 5 different face values.

Probability of 13 different face values for bridge.
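The poker and bridge probabilities from example (a), computed directly (a sketch; math.comb plays the role of BI(n; r)):

```python
from math import comb

# Poker: choose 5 distinct face values, then one of 4 suits for each card.
print(comb(13, 5) * 4**5 / comb(52, 5))   # ~0.5071

# Bridge: all 13 face values distinct -> one suit choice per value.
print(4**13 / comb(52, 13))               # ~0.000106
```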

(b) Senators. Each of the 50 states has 2 senators. In a committee of 50 senators chosen at random: (1) What is the probability that a given state is represented? It's easier to calculate the probability q of the complementary event (that the given state is not represented). There are 100 senators, 98 of them not from the given state. The numerator is BI(98; 50); the denominator is BI(100; 50).

(2) What is the probability that all states are represented? A committee including one senator from each state can be chosen in 2^50 different ways, so the probability that all states are included is 2^50/BI(100; 50). Can estimate the result using Stirling's formula. (c) Occupancy Problem. Consider again a random distribution of r balls in n cells (so that each of the n^r arrangements has probability n^−r). To find the probability pk that a specified cell contains exactly k balls (k = 0, 1, 2, …, r), note that the k balls can be chosen in BI(r; k) ways, and the remaining r − k balls can be placed in the remaining n − 1 cells in (n − 1)^(r−k) ways. It follows that pk = BI(r; k)(n − 1)^(r−k)/n^r. This is a special case of the binomial distribution.
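The three computations above, sketched directly from the stated counts:

```python
from math import comb

# (1) q = BI(98;50)/BI(100;50); P{given state represented} = 1 - q.
q = comb(98, 50) / comb(100, 50)
print(1 - q)                    # ~0.7525

# (2) P{all 50 states represented} = 2^50 / BI(100;50).
print(2**50 / comb(100, 50))    # ~1.1e-14: essentially impossible

# (c) Occupancy: P{specified cell holds exactly k of r balls among n cells}.
def p_k(k: int, r: int, n: int) -> float:
    return comb(r, k) * (n - 1) ** (r - k) / n ** r

print(p_k(1, 7, 7))             # ~0.397: a given day gets exactly 1 of 7 accidents
```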

Relationships. Distinguishable and Indistinguishable Elements. Subpopulations and Corresponding Ordered Samples. Examples (d) Flag Display.

(e) Ordering with Two Kinds of Elements.

Number of Shortest Polygonal Paths.

Theorem 2. Let r1, …, rk be integers such that r1 + r2 + ⋯ + rk = n, ri ≥ 0. The number of ways in which a population of n elements can be divided into k ordered parts (partitioned into k subpopulations) of which the first contains r1 elements, the second r2 elements, etc., is n!/(r1! r2! ⋯ rk!) (these are called multinomial coefficients).

Proof. Examples (g) Bridge. What's the probability that each player has an ace?

(h) Dice. Throw 12 dice once; what's the probability that each face appears exactly twice?
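Both answers via multinomial coefficients (a sketch; the counts follow Theorem 2):

```python
from math import factorial

# (g) Bridge: each player holds exactly one ace. Assign the 4 aces to the 4
# hands (4! ways), then partition the remaining 48 cards into hands of 12.
hands = factorial(52) // factorial(13) ** 4
favorable = factorial(4) * (factorial(48) // factorial(12) ** 4)
print(favorable / hands)                      # ~0.1055

# (h) Twelve dice: each face appears exactly twice: 12!/(2!)^6 over 6^12.
print((factorial(12) // 2 ** 6) / 6 ** 12)    # ~0.0034
```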

5. Application to Occupancy Problems


Result. Proof. Examples (a)

(b) Partial Derivatives.

(c) Balls and Cells. Configurations.

6. The Hypergeometric Distribution


Many problems can be reduced to the following form: in a population of n elements, n1 are red and n2 = n − n1 are black. A group of r elements is chosen at random. What is the probability qk that the group so chosen will contain exactly k red elements? (Here k can be any integer between 0 and n1 or r, whichever is smaller.)1 To find qk, note that the chosen group contains k red and r − k black elements: the red ones can be chosen in BI(n1; k) different ways and the black ones in BI(n2; r − k) ways.

End up with two formulas (the second is derived from the first): qk = BI(n1; k) BI(n2; r − k) / BI(n; r), and equivalently qk = BI(r; k) BI(n − r; n1 − k) / BI(n; n1).
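A sketch of qk with a numerical check that the two forms agree (the card-deck numbers are an illustrative choice, not from this passage):

```python
from math import comb

def q(k: int, n1: int, n2: int, r: int) -> float:
    """Hypergeometric: BI(n1;k) BI(n2;r-k) / BI(n1+n2;r)."""
    return comb(n1, k) * comb(n2, r - k) / comb(n1 + n2, r)

# 5 cards from a 52-card deck with n1 = 4 aces: exactly one ace.
print(q(1, 4, 48, 5))   # ~0.2995

# Second form BI(r;k) BI(n-r;n1-k) / BI(n;n1) gives the same value.
n1, n2, r, k = 4, 48, 5, 1
n = n1 + n2
print(comb(r, k) * comb(n - r, n1 - k) / comb(n, n1))   # ~0.2995
```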

Examples: (a) Quality Inspection. Estimating Size of Subpopulation. Can use the hypergeometric distribution to draw inferences about the likely magnitude of n1. Assume n1 is known; then the probability of observing exactly k is the qk given by the formula. (b) Senators.

(c) Estimating Population Size. Capture Data. Maximum Likelihood Estimate.

Random selection (w/o repetition) is made among objects of two distinct types. Distinction between this and the binomial distribution: here there is no replacement.


(d) Bridge. Generalization to Populations w/ Several Classes of Elements.

7. Examples for Waiting Times


A variation of the occupancy problem: balls are placed randomly into n cells, but without fixing r (the number of balls) in advance; they are placed one by one until the prescribed situation arises. Two situations are considered (see the sketch below): (1) the random placing of balls continues until, for the first time, a ball is placed in a cell that is already occupied; (2) a cell is fixed, and the procedure continues as long as the fixed cell remains empty. Applied to the birthday problem (ex. 3.d): (1) selecting people randomly one by one, how many have to be sampled in order to find a pair with a common birthday (the n = 365 days are the cells)? (2) waiting for my own birthday to turn up in the sample. Key problems: collecting coupons; coins and dice. Class: derived the formula for the type-one and type-two problems, and looked at how to calculate the median of each; something about the cumulative distribution.
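A sketch of both waiting-time distributions and their medians in the birthday setting (n = 365, as in the examples above):

```python
from math import ceil, log, prod

n = 365

# Type 1: P{first repeat occurs after more than k placements} = (n)_k / n^k.
def no_repeat(k: int) -> float:
    return prod(range(n - k + 1, n + 1)) / n ** k

median1 = next(k for k in range(1, n + 1) if 1 - no_repeat(k) >= 0.5)
print(median1)   # 23: median number of people until a shared birthday

# Type 2: fixed cell; geometric waiting time, P{T <= k} = 1 - (1 - 1/n)^k.
median2 = ceil(log(2) / -log(1 - 1 / n))
print(median2)   # 253: median number of people until my own birthday appears
```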

Chapter 5: Conditional Probability


HOMEWORK PROBLEMS 3, 7, 11, 12, 23, 24

1. Conditional Probability
Preparatory Examples. Suppose a population of N people includes NA colorblind people and NH females. Let A be the event that a person chosen at random is colorblind, and H the event that the person is female. Then: P{A} = NA/N and P{H} = NH/N.


The probability that a person chosen at random from the subpopulation consisting of females is colorblind equals NHA/NH, where NHA is the number of colorblind females. P{A|H} denotes the probability of the event A assuming the event H: P{A|H} = NHA/NH = P{AH}/P{H}. All subpopulations can be considered populations in their own right; we use the term subpopulation to indicate that there's a larger population in the back of our mind. When the subpopulation is the only population of interest, P{A|H} reduces to P{A}. Bridge. Once the cards are dealt, N knows his own cards and only cares about the distribution of the remaining 39 cards. The aggregate of those cards could be introduced as a new sample space, but it's more convenient to consider them in conjunction with the given distribution of N's 13 cards (call it event H) and then talk about the probability of an event A (e.g., S has two aces) assuming H. Defined. Let H be an event with positive probability. For an arbitrary event A: P{A|H} = P{AH}/P{H} is called the conditional probability of A on the hypothesis H (or for given H / if it is known that H has occurred). If all sample points are equiprobable, P{A|H} is the ratio NAH/NH of the number of sample points common to A and H to the number of points in H. When the hypothesis has probability 0, conditional probabilities remain undefined. Probabilities in the original sample space are sometimes called absolute probabilities. Taking the conditional probability of an event w/r/t a particular hypothesis H amounts to choosing H as a new sample space with probabilities proportional to the original ones; the proportionality factor 1/P{H} is necessary to reduce the total probability of the new sample space to unity. Note that all general theorems on probabilities are valid for conditional probabilities w/r/t any particular H. The fundamental relation for the probability of the occurrence of either A or B or both takes the form: P{A ∪ B|H} = P{A|H} + P{B|H} − P{AB|H}. The conditional probability formula is often used in the form P{AH} = P{A|H}P{H}.


Called the theorem on compound probabilities. Can generalize to 3 events by taking H = BC as the hypothesis and reapplying the formula, so that P{ABC} = P{A|BC} P{B|C} P{C}. Let H1, …, Hn be a set of mutually exclusive events of which one necessarily occurs (so the union is the whole sample space). Then any event A can occur only in conjunction with some Hj, so A = AH1 ∪ AH2 ∪ ⋯ ∪ AHn. The AHj are mutually exclusive, so their probabilities add. By the compound-probabilities formula, P{A} = Σj P{A|Hj} P{Hj}. This is useful when evaluating the P{A|Hj} is easier than evaluating P{A} directly. Examples (a) Sampling Without Replacement. Take an ordered sample of 1, 2, …, n from a population of n elements. Let i be the first element drawn (event H). What is the probability that the next element drawn is j (event A)? P{AH} = 1/(n(n − 1)), so P{A|H} = 1/(n − 1). Expresses the fact that the second choice refers to a population of n − 1 elements, each with equal probability of being chosen. Leads to a natural definition of random sampling: whatever the first r choices, at the (r + 1)st step each of the remaining n − r elements has probability 1/(n − r) of being chosen. (b) Balls and Cells. 4 balls successively placed into 4 cells; all 4^4 arrangements equally probable. Given that the first 2 balls are in different cells (event H), what is the probability that one cell contains at least 3 balls (event A)? Given H, A can occur in two ways (balls 3 and 4 both join ball 1's cell or both join ball 2's), so P{A|H} = 2·4^−2 = 1/8. Check: event H contains 12·4² points and AH contains 12·2 points. (c) Sex Distribution. Family with two children. Four possibilities (sample points): bb, bg, gb, gg, each of probability ¼. Given that a family has a boy, what's the probability both children are boys? AH = {bb}; H = {bb, bg, gb}. So the probability is 1/3. In 1/3 of the families with the character H we can expect that A will also occur.


(d) Stratified Populations.2 A human population consists of subpopulations, or strata, H1, H2, … (could be by race, age, etc.). Let pj be the probability that an individual chosen at random belongs to Hj, and let qj be the probability that an individual in Hj is left-handed (i.e., qj is the conditional probability of the event A on the hypothesis that the individual belongs to Hj). The probability that an individual is left-handed is p1q1 + p2q2 + ⋯ (a special case of formula 1.8?). Given left-handedness, the conditional probability of belonging to stratum Hj is P{Hj|A} = pjqj/(p1q1 + p2q2 + ⋯).
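A sketch of the two formulas with made-up strata numbers (the pj and qj below are illustrative, not from the text):

```python
p = [0.3, 0.5, 0.2]     # P{random individual belongs to stratum H_j}
q = [0.10, 0.08, 0.12]  # q_j = P{left-handed | H_j}

p_A = sum(pj * qj for pj, qj in zip(p, q))            # total probability of A
posterior = [pj * qj / p_A for pj, qj in zip(p, q)]   # P{H_j | A}

print(p_A)        # 0.094
print(posterior)  # the three values sum to 1
```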

2. Probabilities Defined by Conditional Probabilities. Urn Models


Problems in the previous section took the probabilities in the sample space for granted. Many experiments are instead described by specifying certain conditional probabilities, meaning that theoretically the probabilities in the sample space are to be derived from the given conditional probabilities. The following examples reveal the general scheme more effectively than a direct description could. Examples: (a) Taking Turns. Flashback (I, (5.b)). Three players a, b, c take turns at a game according to the following rules: a and b play first, with c out; c plays the winner; the game continues this way until one player wins twice in succession, thereby winning the game. A possible outcome scheme is given (describing the points of the sample space). It could be the case that no player ever wins twice in a row, and the game would go on indefinitely according to one of two patterns. No probabilities were assigned. (b) Families. Stratified population. (c) Urn Models for Aftereffect. Consider an industrial plant liable to accidents. Can view the occurrence of an accident as a game of chance (fate has an urn of red and black balls; at regular time intervals a ball is drawn at random; red signifies an accident). If the chance of an accident remains constant in time, the composition of the urn is always the same. But it's conceivable that each accident has an aftereffect, which either increases or
2 A stratified population is completely described by stating the absolute probabilities pj of the several strata, and the probabilities qj of the characteristic (left-handedness) within each stratum.


decreases the chance of new accidents. This corresponds to an urn whose composition changes according to certain rules that depend on the outcomes of successive drawings. Urn Model: an urn contains b black and r red balls. A ball is drawn at random. It is replaced, and c balls of the color drawn and d balls of the opposite color are added. A new random drawing is made from the urn, now containing b + r + c + d balls, and this procedure is repeated. Here c and d are arbitrary integers; they may be chosen negative, except that in that case the procedure may terminate after finitely many drawings for lack of balls. Choosing c = −1 and d = 0 gives the model of random drawings without replacement, which terminates after r + b steps. The c's and d's change the urn composition. Note that the description specifies conditional probabilities from which certain basic probabilities are to be calculated. A typical point in the sample space corresponding to n drawings may be represented by a sequence of n letters B and R. The event "black at first drawing" (the aggregate of all sequences starting with B) has probability b/(b + r). If the first ball is black, the (conditional) probability of a black ball at the second drawing is (b + c)/(b + r + c + d). The absolute probability of the sequence B, B (the aggregate of the sample points starting with BB) is, by (1.5),3 b/(b + r) · (b + c)/(b + r + c + d). Proceeding inductively, the probabilities of all sample points can be calculated (and they add to unity). Special Case: Pólya's Urn Scheme. Characterized by d = 0, c > 0. After each drawing the number of balls of the color drawn increases, while the balls of the opposite color remain unchanged in number. The drawing of either color increases the probability of the same color at the next drawing. (Model of contagious disease: every occurrence increases the probability of further occurrences.) The probability that, of n = n1 + n2 drawings, the first n1 result in black balls and the remaining n2 in red balls is given by:

[b(b + c)(b + 2c) ⋯ (b + n1c − c)] · [r(r + c) ⋯ (r + n2c − c)] / [(b + r)(b + r + c)(b + r + 2c) ⋯ (b + r + nc − c)]

Now consider any other ordering of n1 black and n2 red. In calculating the probability that n drawings result in that particular order of colors, the same factors as above are encountered, merely rearranged in a new order. It follows that all possible sequences of n1 black and n2 red have the same probability (a characteristic property of this scheme).

3 Formula (1.5): P{AH} = P{A|H}P{H}.

To obtain the probability pn1,n that n drawings result in n1 black and n2 red in any order, multiply the quantity above by BI(n; n1) (the number of possible orderings). Use of general binomial coefficients allows us to write this in either of the following forms:

pn1,n = BI(n1 − 1 + b/c; n1) · BI(n2 − 1 + r/c; n2) / BI(n − 1 + (b + r)/c; n) = BI(−b/c; n1) · BI(−r/c; n2) / BI(−(b + r)/c; n)
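A sketch of Pólya's scheme verifying the exchangeability claim: every ordering of n1 black and n2 red draws gets the same probability (the urn numbers are an illustrative choice; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction
from itertools import permutations

def sequence_prob(seq, b, r, c):
    """Probability of a given B/R color sequence under Polya's urn (d = 0)."""
    counts, p = {"B": b, "R": r}, Fraction(1)
    for color in seq:
        p *= Fraction(counts[color], counts["B"] + counts["R"])
        counts[color] += c          # add c more balls of the color just drawn
    return p

orderings = set(permutations("BBRR"))
probs = {sequence_prob(s, b=3, r=2, c=1) for s in orderings}
print(probs)   # a single value (3/70): all 6 orderings are equally likely
```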

Ehrenfest Model. Another special case of interest contained in our urn model. In the original description, the model envisioned two containers I and II and k particles distributed between them. A particle is chosen at random and moved to the other container, and the procedure is repeated. What's the distribution of the particles after n steps? Can reduce this to an urn model by calling the particles in container I red (R) and the others black (B). Then at each drawing the ball drawn is replaced by a ball of the opposite color, i.e., c = −1, d = 1. The process can continue as long as wanted (when there are no R balls, a B ball is drawn automatically and replaced by an R one). The special case c = 0, d > 0 could be used to model a safety campaign: each time an accident occurs (an R ball is drawn) the campaign is pushed harder (black balls are added, lowering the chance of a new accident); when no accident occurs the campaign slackens and the probability of an accident increases. (d) Urn Models for Stratification. Spurious Contagion. Suppose each person is liable to accidents and their occurrence is determined by random drawings from an urn, but no aftereffect exists: the urn composition remains unchanged throughout the process. Proneness may vary from person to person (or by profession), each having their own urn. Suppose there are just two types of people and that their numbers in the total population stand in the ratio 1:5. Consider an Urn I containing r1 red and b1 black balls and an Urn II with r2 and b2. The experiment of choosing a person at random and observing the number of accidents they have during n time units has a counterpart: a die is thrown; if an ace appears, choose Urn I; otherwise choose Urn II. In either case, n random drawings with replacement are made from the urn. (Situation of an insurance company accepting a new subscriber.) (e) Note on Bayes's Rule. 3. Stochastic Independence (not covered): Definition 1


Examples (a) (b) (c) (d) (e) Definition 2

4. Product Spaces. Independent Trials

Chapter 6: The Binomial and Poisson Distributions


HOMEWORK PROBLEMS 2, 5, 6, 8, 19

Bernoulli Trials
Repeated independent trials are called Bernoulli trials if: (1) there are only two possible outcomes for each trial; and (2) their probabilities remain the same throughout the trials. The two probabilities are typically denoted p and q; we refer to the outcome with probability p as "success," S, and to the other as "failure," F. Both probabilities are nonnegative, and p + q = 1. The sample space of each individual trial has two points, S and F. The sample space of n Bernoulli trials contains 2^n points (each a succession of n symbols S and F, representing one possible outcome of the compound experiment). The trials are independent, so the probabilities multiply: the probability of any specified sequence is the product obtained by replacing the symbols S and F by p and q, respectively. So P{(SSFSFFFS)} = ppqpqqqp = p^4 q^4. Examples


Coin Tosses. Successive tosses of a fair coin (p = q = ½). If the coin is unfair, the successive tosses are still assumed to be independent, so the model is one of Bernoulli trials in which the probability p for success can have an arbitrary value. Urn Problems. Repeated random drawings from an urn of constant composition also represent Bernoulli trials. Bernoulli Reductions. Bernoulli trials also arise from more complicated experiments if we decide not to distinguish among several outcomes, describing any possible result simply as A or non-A. Dice Rolls. Could do this for a dice problem where the interest is the distinction between ace (S) and non-ace (F); with a fair die, p = 1/6. Could also distinguish between even and odd, leading to Bernoulli trials with p = ½. With an unfair die, the successive throws would still form Bernoulli trials, but the corresponding probability p would be different. Reductions are also used in sampling practice and quality-control settings, where the Bernoulli scheme represents an ideal standard which may never be fully attainable.

The Binomial Distribution


Total Number of Successes in Bernoulli Trials: We may be interested in the total number of successes produced in a succession of n Bernoulli trials, denoted Sn, but not in their order. The number of successes can be 0, 1, …, n. What are the corresponding probabilities? The event "n trials result in k successes and n − k failures" can happen in as many ways as k letters S can be distributed among n places. So the event of interest contains BI(n; k) points, and each of those points has probability p^k q^(n−k). Theorem: Let b(k; n, p) be the probability that n Bernoulli trials with probabilities p for success and q = 1 − p for failure result in k successes and n − k failures. Then: b(k; n, p) = BI(n; k) p^k q^(n−k). The probability of no success is q^n and the probability of at least one success is 1 − q^n. Treating p as a constant, b(k; n, p) = P{Sn = k}.


Sn is a random variable, and the function b(k; n, p) = BI(n; k) p^k q^(n−k) is the distribution of this random variable. It is referred to as the binomial distribution.4 Note: b(0; n, p) + b(1; n, p) + ⋯ + b(n; n, p) = (q + p)^n = 1 (as required by the notion of probability). Examples: (a) Weldon's Dice Data. Let the experiment consist of throwing twelve dice, where 5s and 6s are counted as successes. With fair dice the probability of success is p = 1/3, and the number of successes should follow the binomial distribution b(k; 12, 1/3). The observed data provided in the book do not match up well; this suggests the dice used had a bias, with probability of success p = .3377. (b) Flashbacks. Card Guessing. IV, 4 [@107] had a binomial distribution (p = n^−1). Occupancy Problem. Balls & Cells. Saw a special case of the binomial distribution in the occupancy problem II, (4.c) with p = n^−1 (@ page 35). There a random distribution of r balls in n cells was considered. Each of the n^r possible arrangements had probability n^−r, and the task was to find the probability pk that a specified cell contains exactly k balls (k = 0, 1, …, r). (c) Number of trials needed to ensure that the probability of at least one success meets a specified level. How many trials with p = .01 must be performed to ensure that the probability of at least one success is ½ or greater? Seek the smallest integer n for which 1 − q^n ≥ ½, i.e., n log(1/q) ≥ log 2; with q = .99 this gives n ≥ 68.97, so n = 69 (roughly 70 trials). (d) Supply Problem. Suppose that n = 10 workers intermittently use electric power, and we want to estimate the total load to be expected. As a crude approximation, imagine that at any given time each worker has the same probability p of requiring a unit of power. Working independently, the probability of exactly k workers requiring power at the same time should be b(k; n, p). If on average a worker uses power for 12 minutes per hour, set p = 1/5 (12 minutes is one fifth of an hour). The probability of 7 or more workers requiring power at the same time is then:
4 "Binomial" refers to the fact that the distribution represents the kth term of the binomial expansion of (q + p)^n.


b(7; 10, .2) + ⋯ + b(10; 10, .2) ≈ 0.00086. That number gives the probability of an overload if the power supply is adjusted to six power units. (e) Testing Vaccines. (f) Another Statistical Test.
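A sketch of the overload probability, with b(k; n, p) written out via math.comb:

```python
from math import comb

def b(k: int, n: int, p: float) -> float:
    """Binomial probability b(k; n, p) = BI(n;k) p^k q^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

overload = sum(b(k, 10, 0.2) for k in range(7, 11))
print(overload)   # ~0.00086: with six power units, an overload is very rare
```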

3. The Central Term and the Tails


b(k; n, p) = BI(n; k) p^k q^(n−k).

b(k; n, p) / b(k − 1; n, p) = (n − k + 1)p / (kq) = 1 + ((n + 1)p − k) / (kq)

The term b(k; n, p) is greater than the preceding one for k < (n + 1)p and smaller for k > (n + 1)p: b(k; n, p) > b(k + 1; n, p) exactly when k + 1 > (n + 1)p. If (n + 1)p = m is an integer, then b(m − 1; n, p) = b(m; n, p). There is exactly one integer m such that (n + 1)p − 1 < m ≤ (n + 1)p. Theorem: As k goes from 0 to n, the terms b(k; n, p) first increase monotonically, then decrease monotonically, reaching their greatest value when k = m, except that b(m − 1; n, p) = b(m; n, p) when m = (n + 1)p. b(m; n, p) is called the central term.
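A small check of the central-term rule (floor((n + 1)p) picks out the m above; the n and p are an illustrative choice):

```python
from math import comb, floor

n, p = 10, 0.2
probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
m = floor((n + 1) * p)          # the unique m with (n+1)p - 1 < m <= (n+1)p
assert probs[m] == max(probs)   # m is the mode of the distribution
print(m, probs[m])              # 2, ~0.302
```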

4. The Law of Large Numbers

5. The Poisson Approximation


Examples


(a) (b) Empirical Illustration. (c) Birthdays. (d) Defective Items. (e) Centenarians. (f) Misprints, Raisins, Etc.

6. The Poisson Distribution

7. Observations Fitting the Poisson Distribution

8. Waiting Times. The Negative Binomial Distribution


Examples (a) The Problem of Banach's Match Boxes.

(b) Generalization: Table Tennis.

9. The Multinomial Distribution


Examples: (a)

(b) Sampling.


(c) Multiple Bernoulli Trials.

Chapter 9: Random Variables


Can say that the # Sn of successes in n Bernoulli trials is a function defined on the sample space: to each of the 2^n points in this space corresponds a number Sn. A function defined on a sample space is called a random variable. It is a unique rule which associates a number X(x) with any sample point x. In a discrete space one can theoretically tabulate any random variable X by enumerating in some order all points of the space and associating with each the corresponding value of X. A random variable is really like a "random function," in that the independent variable is a point in the sample space, i.e., the outcome of an experiment. Suppose X is a random variable and let x1, x2, … (the range of X) be the values it assumes.5 The aggregate of all sample points on which X assumes the fixed value xj forms the event that X = xj. That event's probability is denoted P{X = xj}. The function P{X = xj} = f(xj), (j = 1, 2, …), is called the (probability) distribution of the random variable X. Observe: f(xj) ≥ 0 and Σj f(xj) = 1. Terminology Applied: Bernoulli Trials. The # of successes Sn is a random variable with probability distribution {b(k; n, p)}. The # of trials up to and including the 1st success is a random variable with the distribution {q^(k−1) p}. Joint Distribution: Let X and Y be random variables defined on the same sample space, assuming values x1, x2, … and y1, y2, …, with corresponding probability distributions {f(xj)} and {g(yk)}. The aggregate of points in which the two conditions X = xj and Y = yk are both satisfied forms an event whose probability is denoted P{X = xj, Y = yk}. The function P{X = xj, Y = yk} = p(xj, yk), (j, k = 1, 2, …), is the joint probability distribution of X and Y. Observe: p(xj, yk) ≥ 0, and
5 Here xj will typically be an integer.


Σj,k p(xj, yk) = 1. Marginal Distributions: For every fixed j: p(xj, y1) + p(xj, y2) + p(xj, y3) + ⋯ = P{X = xj} = f(xj). For every fixed k: p(x1, yk) + p(x2, yk) + p(x3, yk) + ⋯ = P{Y = yk} = g(yk). Adding the probabilities in the individual rows and columns of a joint distribution table gives the probability distributions of X and Y. The notion of a joint distribution extends to systems of more than 2 variables. Examples: (a) 3 Balls, 3 Cells. (b) Multinomial Distribution. (c) Geometric Distributions. (d) Randomized Sampling.

Conditional Probability: The conditional probability of the event Y = yk given that X = xj (with f(xj) > 0) is: P{Y = yk | X = xj} = p(xj, yk) / f(xj). A number is associated with every value of X, so the above defines a function of X, called the conditional distribution of Y for given X, denoted P{Y = yk | X}. In general this is different from g(yk), indicating that inferences can be drawn from values of X to those of Y and vice versa: the two variables are (stochastically) dependent. The strongest degree of dependence is when Y is a function of X, in which case the value of X uniquely determines Y; this means that in each row of the joint distribution table all entries but one would be zero. If p(xj, yk) = f(xj)g(yk) for all combinations of xj, yk, the events X = xj and Y = yk are independent. In this case, the joint distribution assumes the form of a multiplication table, and we speak of independent random variables. This occurs in connection with independent trials.
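A toy joint-distribution table (the numbers are illustrative) showing the marginals, a conditional distribution, and the multiplication-table test for independence:

```python
joint = {(0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
         (1, 0): 0.15, (1, 1): 0.30, (1, 2): 0.15}
xs = sorted({x for x, _ in joint}); ys = sorted({y for _, y in joint})

f = {x: sum(joint[x, y] for y in ys) for x in xs}  # marginal of X (row sums)
g = {y: sum(joint[x, y] for x in xs) for y in ys}  # marginal of Y (column sums)

print({y: joint[1, y] / f[1] for y in ys})         # P{Y = y | X = 1}

# Independent iff every entry factors as f(x)g(y):
print(all(abs(joint[x, y] - f[x] * g[y]) < 1e-12 for x in xs for y in ys))
# True here: this particular table was built as a multiplication table
```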


The joint distribution of X and Y determines the distributions of X and Y, but the joint distribution of X and Y cannot in general be calculated from their marginal distributions: two variables X and Y with the same distributions may or may not be independent. Formal Recapitulation: [Definition]. A random variable X is a function defined on a given sample space, that is, an assignment of a real number to each sample point. The probability distribution of X is the function defined in (1.1). If two random variables X and Y are defined on the same sample space, their joint distribution is given by (1.3) and assigns probabilities to all combinations (xj, yk) of values assumed by X and Y. This notion carries over, in an obvious manner, to any finite set of variables X, Y, …, W defined on the same sample space. These variables are called mutually independent if, for any combination of values (x, y, …, w) assumed by them, P{X = x, Y = y, …, W = w} = P{X = x}P{Y = y} ⋯ P{W = w}. [ … more on independence and what it means to be a random variable … ] Examples (e) Bernoulli trials with variable probabilities. [ … ] (f) [ … ] (g) Note on Pairwise Independence

Expectations
Let X be a random variable assuming values x1, x2, … with corresponding probabilities f(x1), f(x2), … The expected value (mean) of X is E(X) = Σ xk f(xk), provided the series converges absolutely. If Σ |xk| f(xk) diverges, X has no finite expectation. Calculation of expectations of functions such as X².6 In general, P{X² = xk²} ≠ f(xk); instead it equals f(xk) + f(−xk), and E(X²) is defined as the sum Σ xk²{f(xk) + f(−xk)}. This reduces to E(X²) = Σ xk² f(xk), provided the series converges. Theorem 1:

6 X² is a new random variable assuming the values xk².


Any function φ(x) defines a new random variable φ(X). If φ(X) has finite expectation, then E(φ(X)) = Σ φ(xk) f(xk). The series converges absolutely iff E(φ(X)) exists. For any constant a, E(aX) = aE(X). The sum of several random variables defined on the same sample space is a new random variable; its possible values and corresponding probabilities can be found from the joint distribution of the Xv, and thus E(X1 + ⋯ + Xn) can be calculated. Theorem 2 [simple procedure for calculating E(X1 + ⋯ + Xn)]: If X1, …, Xn are random variables with expectations, then the expectation of their sum exists and is the sum of their expectations: E(X1 + ⋯ + Xn) = E(X1) + ⋯ + E(Xn). Indeed, E(X) + E(Y) = Σj,k xj p(xj, yk) + Σj,k yk p(xj, yk) = Σj,k (xj + yk) p(xj, yk) = E(X + Y). Theorem 3 [multiplication rule for mutually independent variables]: If X and Y are mutually independent random variables with finite expectations, then their product is a random variable with finite expectation and E(XY) = E(X)E(Y): E(XY) = Σj,k xj yk f(xj) g(yk) = {Σj xj f(xj)} · {Σk yk g(yk)}

(The rearrangement is justified by absolute convergence.) The theorem holds for any number of mutually independent random variables.
Expectation of a Conditional Probability Distribution. Let X, Y be random variables with the joint distribution P{X = xj, Y = yk} = p(xj, yk), (j, k = 1, 2, …).

The conditional expectation E(Y|X) of Y for given X is the function which at the place xj assumes the value Σk yk P{Y = yk | X = xj} = Σk yk p(xj, yk) / f(xj)


Only meaningful if the series converges absolutely and f(xj) > 0 for all j. E(Y|X) is a new random variable. To calculate its expectation, multiply Σk yk P{Y = yk | X = xj} by f(xj) and sum over the xj. Result: E(E(Y|X)) = E(Y)
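A sketch verifying E(E(Y|X)) = E(Y) on a tiny joint distribution (the values are illustrative):

```python
joint = {(0, 0): 0.10, (0, 1): 0.25, (1, 0): 0.30, (1, 1): 0.35}
f = {x: sum(p for (xj, _), p in joint.items() if xj == x) for x in (0, 1)}

def E_Y_given(x):   # E(Y | X = x) = sum_k y_k p(x, y_k) / f(x)
    return sum(y * p for (xj, y), p in joint.items() if xj == x) / f[x]

print(sum(E_Y_given(x) * f[x] for x in (0, 1)))   # E(E(Y|X)) = 0.6
print(sum(y * p for (_, y), p in joint.items()))  # E(Y)      = 0.6
```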

Examples and Applications


(a) Binomial Distribution. (b) Poisson Distribution. (c) Negative Binomial Distribution.

(d) Waiting Times in Sampling. A population of N distinct elements is sampled w/ replacement. Because of repetitions, a random sample of size r will usually contain fewer than r distinct elements, and as the sample size increases, new elements enter the sample more and more rarely. We are interested in the sample size Sr necessary for the acquisition of r distinct elements.7 Call a drawing successful if it results in adding a new element to the sample; Sr is then the # of drawings up to and including the rth success. Put Xk = S(k+1) − Sk. Then Xk − 1 is the number of unsuccessful drawings between the kth and the (k + 1)st success. During those drawings the population contains N − k elements that have not yet entered the sample, so Xk − 1 is the # of failures preceding the first success in Bernoulli trials with p = (N − k)/N. Therefore, E(Xk) = 1 + q/p = N/(N − k). Since Sr = 1 + X1 + ⋯ + X(r−1), we get E(Sr) = N{1/N + 1/(N − 1) + ⋯ + 1/(N − r + 1)}

As a special case, consider a population of N = 365 possible birthdays; here Sr represents the number of people sampled up to the moment when the sample contains r different birthdays. A similar interpretation is possible for the random placement of balls into cells. The problem is also of particular interest for coupon collecting and other settings where acquisition can be compared to random sampling.
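A sketch of E(Sr); the two prints illustrate the birthday interpretation (365 cells):

```python
def expected_sample_size(N: int, r: int) -> float:
    """E(S_r) = N * (1/N + 1/(N-1) + ... + 1/(N-r+1))."""
    return N * sum(1 / (N - k) for k in range(r))

print(expected_sample_size(365, 100))  # ~116.7 people for 100 distinct birthdays
print(expected_sample_size(365, 365))  # ~2364.6: the full coupon-collector cost
```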


(e) An Estimation Problem (f) Application to a Statistical Test

The Variance
Terms: Definition; Examples (Poisson Distribution, Binomial Distribution). Covariance; Variance of a Sum: Definition; Theorem 1; Theorem 2; Examples (Binomial Distribution; 1. Bernoulli Trials with Variable Probabilities; 2. Card Matching; 3. Sampling Without Replacement).

Chapter 7: The Normal Approximation to the Binomial Distribution


1. The Normal Distribution
Definition - Normal Density Function - Normal Distribution Function Lemma 1 Lemma 2 Note on Terminology

2. Orientation: Symmetric Distributions


Using the normal distribution as an approximation to the binomial with p = ½. Approximation Theorem. Bounds for Error.


3. The DeMoivre-Laplace Limit Theorem: Theorem 1; Example; Theorem 2; Note on Optional Stopping.
4. Examples: (a) p = ½ and n = 200: probability that in 200 tosses of a coin the number of heads deviates from 100 by at most 5, P{95 ≤ Sn ≤ 105}. (b) (c) (d) (e) (f) A Competition Problem. (g) Random Digits. (h) Sampling.
5. Relation to the Poisson Approximation: Examples (a); (b) A Telephone Trunking Problem.
6. Large Deviations*: Lemma; Proof; Theorem; Proof.
7. Problems Assigned for Solution: (1) (3) Find the probability that among 10,000 random digits the digit 7 appears not more than 968 times. (5) Find the number k such that the probability is about .5 that the number of heads obtained in 1,000 tosses of a coin will be between 490 and k. (7) 10,000 coin tosses result in heads 5,400 times. Is the coin skewed? (10) Normal approximation to the hypergeometric distribution.
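A sketch of example 4(a), comparing the exact binomial sum with the normal approximation (a continuity correction of ½ is included):

```python
from math import comb, erf, sqrt

# Exact: P{95 <= S_n <= 105} for n = 200 fair-coin tosses.
exact = sum(comb(200, k) for k in range(95, 106)) / 2**200

Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))   # standard normal CDF
sigma = sqrt(200 * 0.5 * 0.5)                  # sqrt(npq); mean np = 100
approx = Phi(5.5 / sigma) - Phi(-5.5 / sigma)

print(exact, approx)   # both ~0.56
```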

