Probability Topics
Tree and Venn Diagrams
OpenStaxCollege
Sometimes, when the probability problems are complex, it can be helpful to graph the situation. Tree diagrams and Venn diagrams are two tools that can be used to visualize and solve conditional probabilities.
Tree Diagrams
A tree diagram is a special type of graph used to determine the outcomes of an experiment. It consists of “branches” that are labeled with either frequencies or probabilities. Tree diagrams can make some probability problems easier to visualize and solve. The following example illustrates how to use a tree diagram.
In an urn, there are 11 balls. Three balls are red (R) and eight balls are blue (B). Draw two balls, one at a time, with replacement. “With replacement” means that you put the first ball back in the urn before you select the second ball. The tree diagram using frequencies that shows all the possible outcomes follows.
The first set of branches represents the first draw. The second set of branches represents the second draw. Each of the outcomes is distinct. In fact, we can list each red ball as R1, R2, and R3 and each blue ball as B1, B2, B3, B4, B5, B6, B7, and B8. Then the nine RR outcomes can be written as:
R1R1; R1R2; R1R3; R2R1; R2R2; R2R3; R3R1; R3R2; R3R3
The other outcomes are similar.
There are a total of 11 balls in the urn. Draw two balls, one at a time, with replacement. There are 11(11) = 121 outcomes, the size of the sample space.
a. List the 24 BR outcomes: B1R1, B1R2, B1R3, …
a. B1R1; B1R2; B1R3; B2R1; B2R2; B2R3; B3R1; B3R2; B3R3; B4R1; B4R2; B4R3; B5R1; B5R2; B5R3; B6R1; B6R2; B6R3; B7R1; B7R2; B7R3; B8R1; B8R2; B8R3
b. Using the tree diagram, calculate P(RR).
b. P(RR) = \(\left(\frac{3}{11}\right)\left(\frac{3}{11}\right)\) = \(\frac{9}{121}\)
c. Using the tree diagram, calculate P(RB OR BR).
c. P(RB OR BR) = \(\left(\frac{3}{11}\right)\left(\frac{8}{11}\right)\) + \(\left(\frac{8}{11}\right)\left(\frac{3}{11}\right)\) = \(\frac{48}{121}\)
d. Using the tree diagram, calculate P(R on 1st draw AND B on 2nd draw).
d. P(R on 1st draw AND B on 2nd draw) = P(RB) = \(\left(\frac{3}{11}\right)\left(\frac{8}{11}\right)\) = \(\frac{24}{121}\)
e. Using the tree diagram, calculate P(R on 2nd draw GIVEN B on 1st draw).
e. P(R on 2nd draw GIVEN B on 1st draw) = P(R on 2nd|B on 1st) = \(\frac{24}{88}\) = \(\frac{3}{11}\)
This problem is a conditional one. The sample space has been reduced to those outcomes that already have a blue on the first draw. There are 24 + 64 = 88 possible outcomes (24 BR and 64 BB). Twenty-four of the 88 possible outcomes are BR. \(\frac{24}{88}\) = \(\frac{3}{11}\).
f. Using the tree diagram, calculate P(BB).
f. P(BB) = \(\frac{64}{121}\)
g. Using the tree diagram, calculate P(B on the 2nd draw given R on the first draw).
g. P(B on 2nd draw|R on 1st draw) = \(\frac{8}{11}\)
There are 9 + 24 outcomes that have R on the first draw (9 RR and 24 RB). The sample space is then 9 + 24 = 33. Twenty-four of the 33 outcomes have B on the second draw. The probability is then \(\frac{24}{33}\) = \(\frac{8}{11}\).
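The frequencies in this example can also be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original exercise: the ball labels R1 to R3 and B1 to B8 follow the text, while the helper names (`balls`, `outcomes`, `prob`) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Label the balls as in the text: R1..R3 and B1..B8.
balls = [f"R{i}" for i in range(1, 4)] + [f"B{i}" for i in range(1, 9)]

# With replacement, every ordered pair is possible, including repeats.
outcomes = list(product(balls, repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given) over the equally likely ordered pairs."""
    reduced = [o for o in outcomes if given(o)]
    favorable = [o for o in reduced if event(o)]
    return Fraction(len(favorable), len(reduced))

def first_red(o):
    return o[0].startswith("R")

def second_red(o):
    return o[1].startswith("R")

p_rr = prob(lambda o: first_red(o) and second_red(o))
p_rb_or_br = prob(lambda o: first_red(o) != second_red(o))
p_r2_given_b1 = prob(second_red, given=lambda o: not first_red(o))
p_b2_given_r1 = prob(lambda o: not second_red(o), given=first_red)

print(len(outcomes))                 # 121
print(p_rr, p_rb_or_br)              # 9/121 48/121
print(p_r2_given_b1, p_b2_given_r1)  # 3/11 8/11
```

The `given` argument plays the role of the reduced sample space: conditioning on a blue first draw keeps 88 of the 121 pairs, and conditioning on a red first draw keeps 33.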
In a standard deck, there are 52 cards. 12 cards are face cards (event F) and 40 cards are not face cards (event N). Draw two cards, one at a time, with replacement. All possible outcomes are shown in the tree diagram as frequencies. Using the tree diagram, calculate P(FF).
Total number of outcomes is 144 + 480 + 480 + 1600 = 2,704.
P(FF) = \(\frac{144}{\text{144 + 480 + 480 + 1,600}}=\frac{144}{2,704}=\frac{9}{169}\)
An urn has three red marbles and eight blue marbles in it. Draw two marbles, one at a time, this time without replacement, from the urn. “Without replacement” means that you do not put the first marble back before you select the second marble. Following is a tree diagram for this situation. The branches are labeled with probabilities instead of frequencies. The numbers at the ends of the branches are calculated by multiplying the numbers on the two corresponding branches, for example, \(\left(\frac{3}{11}\right)\left(\frac{2}{10}\right)=\frac{6}{110}\).
If you draw a red on the first draw from the three red possibilities, there are two red marbles left to draw on the second draw. You do not put back or replace the first marble after you have drawn it. You draw without replacement, so that on the second draw there are ten marbles left in the urn.
Calculate the following probabilities using the tree diagram.
a. P(RR) = ________
a. P(RR) = \(\left(\frac{3}{11}\right)\left(\frac{2}{10}\right)=\frac{6}{110}\)
b. Fill in the blanks:
P(RB OR BR) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\text{ }+\text{ (___)(___) }=\text{ }\frac{48}{110}\)
b. P(RB OR BR) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\) + \(\left(\frac{8}{11}\right)\left(\frac{3}{10}\right)\) = \(\frac{48}{110}\)
c. P(R on 2nd|B on 1st) =
c. P(R on 2nd|B on 1st) = \(\frac{3}{10}\)
d. Fill in the blanks.
P(R on 1st AND B on 2nd) = P(RB) = (___)(___) = \(\frac{24}{110}\)
d. P(R on 1st AND B on 2nd) = P(RB) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\) = \(\frac{24}{110}\)
e. Find P(BB).
e. P(BB) = \(\left(\frac{8}{11}\right)\left(\frac{7}{10}\right)\) = \(\frac{56}{110}\)
f. Find P(B on 2nd|R on 1st).
f. Using the tree diagram, P(B on 2nd|R on 1st) = P(B|R) = \(\frac{8}{10}\).
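The branch products for drawing without replacement can be confirmed by listing every ordered pair of distinct marbles. This is a hedged Python sketch under the assumptions of the example (three red, eight blue marbles); the names `marbles`, `draws`, and `prob` are illustrative, not from the text.

```python
from fractions import Fraction
from itertools import permutations

# Three red and eight blue marbles, as in the example.
marbles = ["R"] * 3 + ["B"] * 8

# permutations treats the marbles as distinct by position, so this gives
# 11 * 10 = 110 equally likely ordered draws with no marble used twice.
draws = list(permutations(marbles, 2))

def prob(event, given=lambda d: True):
    """P(event | given) over the equally likely ordered draws."""
    reduced = [d for d in draws if given(d)]
    return Fraction(sum(1 for d in reduced if event(d)), len(reduced))

p_rr = prob(lambda d: d == ("R", "R"))
p_rb_or_br = prob(lambda d: d[0] != d[1])
p_rb = prob(lambda d: d == ("R", "B"))
p_bb = prob(lambda d: d == ("B", "B"))
p_r2_given_b1 = prob(lambda d: d[1] == "R", given=lambda d: d[0] == "B")
p_b2_given_r1 = prob(lambda d: d[1] == "B", given=lambda d: d[0] == "R")

# Fraction reduces automatically, so 6/110 prints as 3/55, and so on.
print(len(draws))                    # 110
print(p_rr, p_rb_or_br, p_rb, p_bb)  # 3/55 24/55 12/55 28/55
print(p_r2_given_b1, p_b2_given_r1)  # 3/10 4/5
```

The four path probabilities (6/110, 24/110, 24/110, and 56/110) sum to 1, which is a quick way to catch an arithmetic slip on a tree.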
If we are using probabilities, we can label the tree in the following general way.
 P(R|R) here means P(R on 2nd|R on 1st)
 P(B|R) here means P(B on 2nd|R on 1st)
 P(R|B) here means P(R on 2nd|B on 1st)
 P(B|B) here means P(B on 2nd|B on 1st)
In a standard deck, there are 52 cards. Twelve cards are face cards (F) and 40 cards are not face cards (N). Draw two cards, one at a time, without replacement. The tree diagram is labeled with all possible probabilities.
 a. Find P(FN OR NF).
 b. Find P(N|F).
 c. Find P(at most one face card).
Hint: “At most one face card” means zero or one face card.
 d. Find P(at least one face card).
Hint: “At least one face card” means one or two face cards.
 a. P(FN OR NF) = \(\frac{480}{2,652} + \frac{480}{2,652} = \frac{960}{2,652} = \frac{80}{221}\)
 b. P(N|F) = \(\frac{40}{51}\)
 c. P(at most one face card) = \(\frac{480 + 480 + 1,560}{2,652}\) = \(\frac{2,520}{2,652}\)
 d. P(at least one face card) = \(\frac{132 + 480 + 480}{2,652}\) = \(\frac{1,092}{2,652}\)
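A probability tree for two draws without replacement can also be built by multiplying along the branches rather than counting outcomes. The sketch below is one possible way to do that in Python; the function name `two_draw_paths` and its dictionary input are assumptions made for illustration.

```python
from fractions import Fraction

def two_draw_paths(counts):
    """Return {(first, second): probability} for two draws without replacement.

    counts maps each category label to how many items of that kind exist.
    """
    total = sum(counts.values())
    paths = {}
    for first, n_first in counts.items():
        for second, n_second in counts.items():
            # One item of the first category is gone before the second draw.
            remaining = n_second - (1 if second == first else 0)
            paths[(first, second)] = (
                Fraction(n_first, total) * Fraction(remaining, total - 1)
            )
    return paths

# 12 face cards (F) and 40 non-face cards (N), as in the example.
paths = two_draw_paths({"F": 12, "N": 40})

p_fn_or_nf = paths[("F", "N")] + paths[("N", "F")]
p_at_most_one = 1 - paths[("F", "F")]    # complement of "two face cards"
p_at_least_one = 1 - paths[("N", "N")]   # complement of "no face cards"
p_n_given_f = paths[("F", "N")] / (paths[("F", "F")] + paths[("F", "N")])

print(p_fn_or_nf)   # 80/221
print(p_n_given_f)  # 40/51
```

Using the complement for "at most one" and "at least one" gives the same values as adding the three relevant paths, 2,520/2,652 and 1,092/2,652 respectively.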
A litter of kittens available for adoption at the Humane Society has four tabby kittens and five black kittens. A family comes in and randomly selects two kittens (without replacement) for adoption.
 a. What is the probability that both kittens are tabby?
a. \(\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)\) b. \(\left(\frac{4}{9}\right)\left(\frac{4}{9}\right)\) c. \(\left(\frac{4}{9}\right)\left(\frac{3}{8}\right)\) d. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)\)
 b. What is the probability that one kitten of each coloring is selected?
a. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)\) b. \(\left(\frac{4}{9}\right)\left(\frac{5}{8}\right)\) c. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)+\left(\frac{5}{9}\right)\left(\frac{4}{9}\right)\) d. \(\left(\frac{4}{9}\right)\left(\frac{5}{8}\right)+\left(\frac{5}{9}\right)\left(\frac{4}{8}\right)\)
 c. What is the probability that a tabby is chosen as the second kitten when a black kitten was chosen as the first?
 d. What is the probability of choosing two kittens of the same color?
a. c, b. d, c. \(\frac{4}{8}\), d. \(\frac{32}{72}\)
Suppose there are four red balls and three yellow balls in a box. Two balls are drawn from the box without replacement. What is the probability that one ball of each coloring is selected?
\(\left(\frac{4}{7}\right)\left(\frac{3}{6}\right)\) + \(\left(\frac{3}{7}\right)\left(\frac{4}{6}\right)\)
Venn Diagram
A Venn diagram is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space S together with circles or ovals. The circles or ovals represent events.
Suppose an experiment has the outcomes 1, 2, 3, … , 12 where each outcome has an equal chance of occurring. Let event A = {1, 2, 3, 4, 5, 6} and event B = {6, 7, 8, 9}. Then A AND B = {6} and A OR B = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The Venn diagram is as follows:
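Alongside the diagram, the regions can be worked out with ordinary set operations. The following Python sketch mirrors the sets defined in this example; the helper `P` and the variable names are illustrative assumptions, and note that Python's `|` operator here means set union (OR), not "given".

```python
from fractions import Fraction

# Sample space and events from the example.
S = set(range(1, 13))
A = {1, 2, 3, 4, 5, 6}
B = {6, 7, 8, 9}

def P(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

a_and_b = A & B          # intersection: the overlap of the two circles
a_or_b = A | B           # union: everything inside either circle
neither = S - a_or_b     # inside the box but outside both circles

print(sorted(a_and_b))   # [6]
print(sorted(a_or_b))    # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sorted(neither))   # [10, 11, 12]

# Addition rule: P(A OR B) = P(A) + P(B) - P(A AND B)
print(P(a_or_b), P(A) + P(B) - P(a_and_b))  # 3/4 3/4
```

Each of the four regions of a two-event Venn diagram corresponds to one of these set expressions, which makes the addition rule easy to verify directly.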
Suppose an experiment has outcomes black, white, red, orange, yellow, green, blue, and purple, where each outcome has an equal chance of occurring. Let event C = {green, blue, purple} and event P = {red, yellow, blue}. Then C AND P = {blue} and C OR P = {green, blue, purple, red, yellow}. Draw a Venn diagram representing this situation.
Flip two fair coins. Let A = tails on the first coin. Let B = tails on the second coin. Then A = {TT, TH} and B = {TT, HT}. Therefore, A AND B = {TT}. A OR B = {TH, TT, HT}.
The sample space when you flip two fair coins is X = {HH, HT, TH, TT}. The outcome HH is in NEITHER A NOR B. The Venn diagram is as follows:
Roll a fair, six-sided die. Let A = a prime number of dots is rolled. Let B = an odd number of dots is rolled. Then A = {2, 3, 5} and B = {1, 3, 5}. Therefore, A AND B = {3, 5}. A OR B = {1, 2, 3, 5}. The sample space for rolling a fair die is S = {1, 2, 3, 4, 5, 6}. Draw a Venn diagram representing this situation.
Forty percent of the students at a local college belong to a club and 50% work part time. Five percent of the students work part time and belong to a club. Draw a Venn diagram showing the relationships. Let C = student belongs to a club and PT = student works part time.
If a student is selected at random, find
 the probability that the student belongs to a club. P(C) = 0.40
 the probability that the student works part time. P(PT) = 0.50
 the probability that the student belongs to a club AND works part time. P(C AND PT) = 0.05
 the probability that the student belongs to a club given that the student works part time. \(P(C|PT) = \frac{P(C\text{ AND }PT)}{P(PT)} = \frac{0.05}{0.50} = 0.1\)
 the probability that the student belongs to a club OR works part time. P(C OR PT) = P(C) + P(PT) – P(C AND PT) = 0.40 + 0.50 – 0.05 = 0.85
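The calculations above, together with the four regions of the Venn diagram, can be reproduced in a few lines. This is a minimal Python sketch; the variable names are assumptions, and `Fraction` is used only to avoid floating-point rounding in the decimal arithmetic.

```python
from fractions import Fraction

# Given in the problem statement.
p_c = Fraction("0.40")         # belongs to a club
p_pt = Fraction("0.50")        # works part time
p_c_and_pt = Fraction("0.05")  # both

# Conditional probability and the addition rule.
p_c_given_pt = p_c_and_pt / p_pt
p_c_or_pt = p_c + p_pt - p_c_and_pt

# The four regions of the Venn diagram.
only_club = p_c - p_c_and_pt
only_part_time = p_pt - p_c_and_pt
neither = 1 - p_c_or_pt

print(float(p_c_given_pt), float(p_c_or_pt))                    # 0.1 0.85
print(float(only_club), float(only_part_time), float(neither))  # 0.35 0.45 0.15
```

The three inner regions plus the outside region add to 1, which is a useful check when filling in a Venn diagram with probabilities.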
Fifty percent of the workers at a factory work a second job, 25% have a spouse who also works, 5% work a second job and have a spouse who also works. Draw a Venn diagram showing the relationships. Let W = works a second job and S = spouse also works.
A person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood type. Four percent of African Americans have type O blood and a negative Rh factor, 5−10% of African Americans have the Rh- factor, and 51% have type O blood.
The “O” circle represents the African Americans with type O blood. The “Rh-” oval represents the African Americans with the Rh- factor.
We will take the average of 5% and 10% and use 7.5% as the percent of African Americans who have the Rh- factor. Let O = African American with Type O blood and R = African American with Rh- factor.
 a. P(O) = ___________
 b. P(R) = ___________
 c. P(O AND R) = ___________
 d. P(O OR R) = ____________
 e. In the Venn Diagram, describe the overlapping area using a complete sentence.
 f. In the Venn Diagram, describe the area in the rectangle but outside both the circle and the oval using a complete sentence.
a. 0.51; b. 0.075; c. 0.04; d. 0.545; e. The area represents the African Americans that have type O blood and the Rh- factor. f. The area represents the African Americans that have neither type O blood nor the Rh- factor.
In a bookstore, the probability that the customer buys a novel is 0.6, and the probability that the customer buys a nonfiction book is 0.4. Suppose that the probability that the customer buys both is 0.2.
 a. Draw a Venn diagram representing the situation.
 b. Find the probability that the customer buys either a novel or a nonfiction book.
 c. In the Venn diagram, describe the overlapping area using a complete sentence.
 d. Suppose that some customers buy only compact disks. Draw an oval in your Venn diagram representing this event.
a. and d. In the following Venn diagram, the blue oval represents customers buying a novel, the red oval represents customers buying nonfiction, and the yellow oval represents customers who buy compact disks.
b. P(novel or nonfiction) = P(Blue OR Red) = P(Blue) + P(Red) – P(Blue AND Red) = 0.6 + 0.4 – 0.2 = 0.8.
c. The overlapping area of the blue oval and red oval represents the customers buying both a novel and a nonfiction book.
References
Data from Santa Clara County Public H.D.
Data from the American Cancer Society.
Data from The Data and Story Library, 1996. Available online at http://lib.stat.cmu.edu/DASL/ (accessed May 2, 2013).
Data from the Federal Highway Administration, part of the United States Department of Transportation.
Data from the United States Census Bureau, part of the United States Department of Commerce.
Data from USA Today.
“Environment.” The World Bank, 2013. Available online at http://data.worldbank.org/topic/environment (accessed May 2, 2013).
“Search for Datasets.” Roper Center: Public Opinion Archives, University of Connecticut., 2013. Available online at http://www.ropercenter.uconn.edu/data_access/data/search_for_datasets.html (accessed May 2, 2013).
Chapter Review
A tree diagram uses branches to show the different outcomes of experiments and makes complex probability questions easy to visualize.
A Venn diagram is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space S together with circles or ovals. The circles or ovals represent events. A Venn diagram is especially helpful for visualizing the OR event, the AND event, and the complement of an event and for understanding conditional probabilities.
The probability that a man develops some form of cancer in his lifetime is 0.4567. The probability that a man has at least one false positive test result (meaning the test comes back positive for cancer when the man does not have it) is 0.51. Let: C = a man develops cancer in his lifetime; P = a man has at least one false positive. Construct a tree diagram of the situation.
Homework
Use the following information to answer the next two exercises. This tree diagram shows the tossing of an unfair coin followed by drawing one bead from a cup containing three red (R), four yellow (Y) and five blue (B) beads. For the coin, P(H) = \(\frac{2}{3}\) and P(T) = \(\frac{1}{3}\) where H is heads and T is tails.
Find P(tossing a Head on the coin AND a Red bead)
 a. \(\frac{2}{3}\)
 b. \(\frac{5}{15}\)
 c. \(\frac{6}{36}\)
 d. \(\frac{5}{36}\)
Find P(Blue bead).
 a. \(\frac{15}{36}\)
 b. \(\frac{10}{36}\)
 c. \(\frac{10}{12}\)
 d. \(\frac{6}{36}\)
a
A box of cookies contains three chocolate and seven butter cookies. Miguel randomly selects a cookie and eats it. Then he randomly selects another cookie and eats it. (How many cookies did he take?)
 Draw the tree that represents the possibilities for the cookie selections. Write the probabilities along each branch of the tree.
 Are the probabilities for the flavor of the SECOND cookie that Miguel selects independent of his first selection? Explain.
 For each complete path through the tree, write the event it represents and find the probabilities.
 Let S be the event that both cookies selected were the same flavor. Find P(S).
 Let T be the event that the cookies selected were different flavors. Find P(T) by two different methods: by using the complement rule and by using the branches of the tree. Your answers should be the same with both methods.
 Let U be the event that the second cookie selected is a butter cookie. Find P(U).
Bringing It Together
Use the following information to answer the next two exercises. Suppose that you have eight cards. Five are green and three are yellow. The cards are well shuffled.
Suppose that you randomly draw two cards, one at a time, with replacement.
Let \(G_{1}\) = first card is green
Let \(G_{2}\) = second card is green
 a. Draw a tree diagram of the situation.
 b. Find P(\(G_{1}\) AND \(G_{2}\)).
 c. Find P(at least one green).
 d. Find P(\(G_{2}\)|\(G_{1}\)).
 e. Are \(G_{2}\) and \(G_{1}\) independent events? Explain why or why not.

 b. P(GG) = \(\left(\frac{5}{8}\right)\left(\frac{5}{8}\right)\) = \(\frac{25}{64}\)
 c. P(at least one green) = P(GG) + P(GY) + P(YG) = \(\frac{25}{64}\) + \(\frac{15}{64}\) + \(\frac{15}{64}\) = \(\frac{55}{64}\)
 d. P(G|G) = \(\frac{5}{8}\)
 e. Yes, they are independent because the first card is placed back in the deck before the second card is drawn; the composition of cards in the deck remains the same from draw one to draw two.
Suppose that you randomly draw two cards, one at a time, without replacement.
\(G_{1}\) = first card is green
\(G_{2}\) = second card is green
 a. Draw a tree diagram of the situation.
 b. Find P(\(G_{1}\) AND \(G_{2}\)).
 c. Find P(at least one green).
 d. Find P(\(G_{2}\)|\(G_{1}\)).
 e. Are \(G_{2}\) and \(G_{1}\) independent events? Explain why or why not.
Use the following information to answer the next two exercises. The percent of licensed U.S. drivers (from a recent year) that are female is 48.60. Of the females, 5.03% are age 19 and under; 81.36% are age 20–64; 13.61% are age 65 or over. Of the licensed U.S. male drivers, 5.04% are age 19 and under; 81.43% are age 20–64; 13.53% are age 65 or over.
Complete the following.
 a. Construct a table or a tree diagram of the situation.
 b. Find P(driver is female).
 c. Find P(driver is age 65 or over|driver is female).
 d. Find P(driver is age 65 or over AND female).
 e. In words, explain the difference between the probabilities in part c and part d.
 f. Find P(driver is age 65 or over).
 g. Are being age 65 or over and being female mutually exclusive events? How do you know?

 a.
<20  20–64  >64  Totals
Female  0.0244  0.3954  0.0661  0.486
Male  0.0259  0.4186  0.0695  0.514
Totals  0.0503  0.8140  0.1356  1
 b. P(F) = 0.486
 c. P(>64|F) = 0.1361
 d. P(>64 and F) = P(F) P(>64|F) = (0.486)(0.1361) = 0.0661
 e. P(>64|F) is the percentage of female drivers who are 65 or older and P(>64 and F) is the percentage of drivers who are female and 65 or older.
 f. P(>64) = P(>64 and F) + P(>64 and M) = 0.1356
 g. No, being female and 65 or older are not mutually exclusive because they can occur at the same time: P(>64 and F) = 0.0661, which is not zero.
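The table in part a comes from applying the multiplication rule to each branch of the tree. The sketch below shows one way to rebuild it in Python from the percentages in the problem statement; the dictionary layout and names are assumptions. Cells are rounded to four places before the column totals are summed, which is how the totals 0.0503 and 0.1356 arise; summing unrounded products would give totals that round slightly differently.

```python
# Marginal and conditional probabilities stated in the exercise.
p_sex = {"Female": 0.486, "Male": 0.514}
p_age_given_sex = {
    "Female": {"<20": 0.0503, "20-64": 0.8136, ">64": 0.1361},
    "Male": {"<20": 0.0504, "20-64": 0.8143, ">64": 0.1353},
}
ages = ["<20", "20-64", ">64"]

# Multiplication rule: P(age AND sex) = P(sex) * P(age | sex).
# Each cell is rounded to four places, as in the printed table.
joint = {
    sex: {age: round(p_sex[sex] * p_age_given_sex[sex][age], 4) for age in ages}
    for sex in p_sex
}

# Column totals are sums of the rounded cells.
age_totals = {
    age: round(sum(joint[sex][age] for sex in joint), 4) for age in ages
}

for sex in joint:
    print(sex, [joint[sex][age] for age in ages], p_sex[sex])
print("Totals", [age_totals[age] for age in ages])
```

Reading across a row gives joint probabilities such as P(>64 and F), while dividing a cell by its row total recovers the conditional probability P(>64|F).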
Suppose that 10,000 U.S. licensed drivers are randomly selected.
 How many would you expect to be male?
 Using the table or tree diagram, construct a contingency table of gender versus age group.
 Using the contingency table, find the probability that out of the age 20–64 group, a randomly selected driver is female.
Approximately 86.5% of Americans commute to work by car, truck, or van. Out of that group, 84.6% drive alone and 15.4% drive in a carpool. Approximately 3.9% walk to work and approximately 5.3% take public transportation.
 a. Construct a table or a tree diagram of the situation. Include a branch for all other modes of transportation to work.
 b. Assuming that the walkers walk alone, what percent of all commuters travel alone to work?
 c. Suppose that 1,000 workers are randomly selected. How many would you expect to travel alone to work?
 d. Suppose that 1,000 workers are randomly selected. How many would you expect to drive in a carpool?

 a.
Car, Truck or Van  Walk  Public Transportation  Other  Totals
Alone  0.7318
Not Alone  0.1332
Totals  0.8650  0.0390  0.0530  0.0430  1
 b. If we assume that all walkers are alone and that none from the other two groups travel alone (which is a big assumption), we have: P(Alone) = 0.7318 + 0.0390 = 0.7708.
 c. Making the same assumptions as in (b), we have: (0.7708)(1,000) = 771
 d. (0.1332)(1,000) = 133
When the Euro coin was introduced in 2002, two math professors had their statistics students test whether the Belgian one Euro coin was a fair coin. They spun the coin rather than tossing it and found that out of 250 spins, 140 showed a head (event H) while 110 showed a tail (event T). On that basis, they claimed that it is not a fair coin.
 Based on the given data, find P(H) and P(T).
 Use a tree to find the probabilities of each possible outcome for the experiment of tossing the coin twice.
 Use the tree to find the probability of obtaining exactly one head in two tosses of the coin.
 Use the tree to find the probability of obtaining at least one head.
Use the following information to answer the next two exercises. The following are real data from Santa Clara County, CA. As of a certain time, there had been a total of 3,059 documented cases of AIDS in the county. They were grouped into the following categories:
Homosexual/Bisexual  IV Drug User*  Heterosexual Contact  Other  Totals  

Female  0  70  136  49  ____ 
Male  2,146  463  60  135  ____ 
Totals  ____  ____  ____  ____  ____ 
Suppose a person with AIDS in Santa Clara County is randomly selected.
 a. Find P(Person is female).
 b. Find P(Person has a risk factor of heterosexual contact).
 c. Find P(Person is female OR has a risk factor of IV drug user).
 d. Find P(Person is female AND has a risk factor of homosexual/bisexual).
 e. Find P(Person is male AND has a risk factor of IV drug user).
 f. Find P(Person is female GIVEN person got the disease from heterosexual contact).
 g. Construct a Venn diagram. Make one group females and the other group heterosexual contact.
The completed contingency table is as follows:
Homosexual/Bisexual  IV Drug User*  Heterosexual Contact  Other  Totals  

Female  0  70  136  49  255 
Male  2,146  463  60  135  2,804 
Totals  2,146  533  196  184  3,059 
 a. \(\frac{255}{3059}\)
 b. \(\frac{196}{3059}\)
 c. \(\frac{718}{3059}\)
 d. 0
 e. \(\frac{463}{3059}\)
 f. \(\frac{136}{196}\)
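These answers can be reproduced directly from the counts in the contingency table. The following Python sketch is illustrative only: the counts are those in the table, while the data layout and variable names are assumptions.

```python
from fractions import Fraction

risk_factors = ["Homosexual/Bisexual", "IV Drug User", "Heterosexual Contact", "Other"]
counts = {
    "Female": [0, 70, 136, 49],
    "Male": [2146, 463, 60, 135],
}

# Row, column, and grand totals of the contingency table.
row_totals = {sex: sum(row) for sex, row in counts.items()}
col_totals = [sum(counts[sex][i] for sex in counts) for i in range(len(risk_factors))]
grand_total = sum(row_totals.values())

iv = risk_factors.index("IV Drug User")
het = risk_factors.index("Heterosexual Contact")
homo = risk_factors.index("Homosexual/Bisexual")

p_female = Fraction(row_totals["Female"], grand_total)
p_het = Fraction(col_totals[het], grand_total)
# OR: add the row and column totals, then subtract the doubly counted cell.
p_female_or_iv = Fraction(
    row_totals["Female"] + col_totals[iv] - counts["Female"][iv], grand_total
)
p_female_and_homo = Fraction(counts["Female"][homo], grand_total)
p_male_and_iv = Fraction(counts["Male"][iv], grand_total)
# GIVEN: the sample space shrinks to the heterosexual-contact column.
p_female_given_het = Fraction(counts["Female"][het], col_totals[het])

print(row_totals, col_totals, grand_total)
```

`Fraction` reduces its results, so for example 196/3,059 is stored as 28/437; the values are equal to the unreduced answers listed above.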

Answer these questions using probability rules. Do NOT use the contingency table. Three thousand fifty-nine cases of AIDS had been reported in Santa Clara County, CA, through a certain date. Those cases will be our population. Of those cases, 6.4% obtained the disease through heterosexual contact and 7.4% are female. Out of the females with the disease, 53.3% got the disease from heterosexual contact.
 a. Find P(Person is female).
 b. Find P(Person obtained the disease through heterosexual contact).
 c. Find P(Person is female GIVEN person got the disease from heterosexual contact).
 d. Construct a Venn diagram representing this situation. Make one group females and the other group heterosexual contact. Fill in all values as probabilities.
Glossary
 Tree Diagram
 the useful visual representation of a sample space and events in the form of a “tree” with branches marked by possible outcomes together with associated probabilities (frequencies, relative frequencies)
 Venn Diagram
 the visual representation of a sample space and events in the form of circles or ovals showing their intersections