
1. What is a two-way table? (2 points)
2. What direction does a row go in, and what direction does a column go?
3. What is the first thing you should do when starting to interpret a two-way table?
1. A table where individuals have been classified according to two variables.
The variables are usually categorical; a quantitative variable can be used if its values are grouped into classes.
 2. Row: Horizontal
 Column: Vertical
3. Make an extra row and an extra column for the totals. Then total each column and row up. Also, get a final total of all individuals in the table to put on the bottom right corner.
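
The totalling step can be sketched in a few lines of Python. This is only an illustration: it borrows the gaming-and-grades counts that appear later in these notes, and the variable names (`observed`, `row_totals`, `column_totals`, `grand_total`) are made up for the example.

```python
# Observed counts from the gaming-and-grades example used later in these notes.
# Each row is a gaming status; the three columns are A's and B's, C's, D's and F's.
observed = {
    "Gamers": [736, 450, 193],
    "Nongamers": [205, 144, 80],
}

# Extra column: one total per row.
row_totals = {status: sum(counts) for status, counts in observed.items()}

# Extra row: one total per column (zip pairs up the entries column by column).
column_totals = [sum(column) for column in zip(*observed.values())]

# Bottom-right corner: the total number of individuals in the table.
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

The row totals and the column totals should each add up to the same grand total, which is a quick check that nothing was mis-added.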

1. What is the marginal distribution of a two-way table?
2. What is the conditional distribution of a two-way table?
3a. How many people does this table describe, and how many have played video games?
3b. How would you calculate the marginal distribution of grades, and how would this look in a table? (visualize)
4. How would you figure out the conditional distribution for players and nonplayers, and what would this look like in a table? (visualize)
1. The row totals and column totals in a two-way table give the marginal distributions of the two individual variables.
It is clearer to present these distributions as percents of the table total.
Marginal distribution tells us nothing about the relationship between the variables.
2. There are two sets of conditional distributions in a two-way table: the distribution of the row variable for each fixed value of the column variable, and vice versa.
To find the conditional distribution of the row variable for one specific value of the column variable, look only at that one column in the table. Find each entry in the column as a percent of the column total.
There is a separate conditional distribution for each value of the other variable.
 3a. Add up all the people in the table.
 736 + 450 + 193 + 205 + 144 + 80 = 1808
 This table describes 1808 people.
 736 + 450 + 193 = 1379
 1379 people played video games
3b. A's and B's: (736 + 205) / 1808 = 0.5205 x 100 = 52.05%
C's: (450 + 144) / 1808 = 0.3285 x 100 = 32.85%
D's and F's: (193 + 80) / 1808 = 0.1510 x 100 = 15.10%
 The complete marginal distribution for grades is:
 Grade          Percent
 A's and B's    52.05%
 C's            32.85%
 D's and F's    15.10%

 4. There are 1379 players (736 + 450 + 193).
 Of these, 53.37% earned A's and B's: (736 / 1379) x 100
C's: (450 / 1379) x 100 = 32.63%
D's and F's: (193 / 1379) x 100 = 14.00%
Now do the same process for nongamers.
There are 429 nongamers (205 + 144 + 80).
 Of these, 47.79% earned A's and B's: (205 / 429) x 100
C's: (144 / 429) x 100 = 33.57%
D's and F's: (80 / 429) x 100 = 18.65%
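
As a check on the hand calculations above, here is a small Python sketch that prints the marginal distribution of grades next to the conditional distributions for gamers and nongamers. The helper name `as_percents` and the other variable names are made up for this example; the counts are the ones from the table.

```python
grade_labels = ["A's and B's", "C's", "D's and F's"]
gamers = [736, 450, 193]
nongamers = [205, 144, 80]


def as_percents(counts):
    """Express each count as a percent of the sum of the counts."""
    total = sum(counts)
    return [round(100 * count / total, 2) for count in counts]


# Marginal distribution of grades: column totals as percents of the table total.
grade_totals = [g + n for g, n in zip(gamers, nongamers)]
marginal_grades = as_percents(grade_totals)

# Conditional distributions of grades: each row as percents of its own row total.
conditional_gamers = as_percents(gamers)
conditional_nongamers = as_percents(nongamers)

print(f"{'Grade':<12}{'All boys':>10}{'Gamers':>10}{'Nongamers':>11}")
for label, m, g, n in zip(grade_labels, marginal_grades,
                          conditional_gamers, conditional_nongamers):
    print(f"{label:<12}{m:>9.2f}%{g:>9.2f}%{n:>10.2f}%")
```

Putting the two conditional distributions side by side is what lets you compare gamers with nongamers; the marginal column on its own says nothing about the relationship.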


There are two types of significance tests for inference from a two-way table.
Which one do we do in this class, what is it used for, and what is another name for it?
The chi-squared test of independence (also known as the chi-squared test of association) is used to find evidence for a relationship (dependence, connection, link, etc.) between two variables.
*The other test is a chi-squared test of homogeneity, but we don't touch on it in this course.

1. What are the conditions for a chi-squared test of independence?
2. In the worksheet, which numbers are the observed counts and which are the expected counts?
1a. The sample is representative of the population
1b. All expected counts are at least 5. A few isolated expected counts below 5 do not matter, provided that they are not all in one row or all in one column.
*Need to check on Minitab to find out*
2. Observed counts: 736, 450, 193, 205, 144 and 80
Expected counts: 717.7, 453.1, 208.2, 223.3, 140.9 and 64.8
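
The notes get the expected counts from Minitab. As a rough cross-check, they can also be reproduced in Python with the standard formula (row total x column total) / table total, which is equivalent to taking the marginal percent for a grade and applying it to a row total. The variable names below are made up for this sketch.

```python
# Observed counts: first row gamers, second row nongamers;
# columns are A's and B's, C's, D's and F's.
observed = [
    [736, 450, 193],
    [205, 144, 80],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for a cell = (row total x column total) / table total.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

for row in expected:
    print([round(count, 1) for count in row])

# Condition 1b: are all expected counts at least 5?
smallest_expected = min(count for row in expected for count in row)
all_at_least_5 = smallest_expected >= 5
print(round(smallest_expected, 1), all_at_least_5)
```

Checking the smallest expected count is enough to confirm the "at least 5" condition for the whole table.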

1. What is a table of observed counts?
2. What does the sample of teenage boys who played games and didn't play games represent?
 1. Observed counts are simply the numbers from a sample that are put into a table.

2. The sample of teenaged boys represents the population of all teenaged boys at all similar schools.

In the population of all 14-18 year-old boys at similar schools, is there a relationship between gaming status and grades? That is, do grades depend on gaming status?
You know what to do!
 STATE:
 How strong is the evidence that grades depend on gaming status?
PLAN: *NO PARAMETER TO WORRY ABOUT
 Ho: Grades do not depend on gaming status
 Ha: Grades depend on gaming status
We'll use a χ² test of association because we're looking for a relationship between two variables.
 SOLVE:
 State the Conditions
 1. Representative sample
 2. All expected counts are at least 5. A few isolated expected counts below 5 do not matter, provided that they are not all in one row or column.
 Checking the Conditions
 1. No information is given about the selection process concerning how they chose the boys. In practice we would contact the person who collected the data to find out.
 2. All expected counts exceed 5. The smallest is 64.8.

 Test stat: 6.739
 p-value: 0.034
*Always use the Pearson p-value*
 CONCLUDE:
 Since the p-value is between 0.01 and 0.05, we have strong evidence that grades depend on gaming status in the population of all teenage boys at similar schools.


1. What type of variables do we use for a chi-square test?
2. Does a chi-square test establish causation?
3. What is a lurking variable for the association between gaming status and grades?
1. Categorical variables. A quantitative variable can be used only after its values are grouped into categories.
2. No. It only gives evidence of an association between the variables.
3. Higher household income for the gamers could be causing them to get better grades, so we can't claim causation. Higher-income households have more money for games and also for tutors to help with school.

1. What do the expected counts mean? Use gaming and grades as an example.
2. How is the chi-square statistic calculated (generally speaking)?
3. What does it mean when the chi-square statistic is large, and when it is small?
1. The expected counts are the numbers of boys we would expect in each cell if grades were the same for gamers and nongamers.
 If grades were the same for gamers and nongamers, then the grade distribution for gamers would be the same as the grade distribution for nongamers and this would be estimated by the marginal distribution of grades.

For example, 52.046% of 1379 (the total number of gamers), or 717.7 boys, would be expected to get A's and B's.
The same goes for the expected counts for nongamers. If grades were the same for nongamers as for gamers, then 52.046% of 429 (the total number of nongamers), or 223.3 boys, would be expected to get A's and B's.
2. The chi-square statistic compares each observed count with the corresponding expected count.
 ex. χ² = sum over all cells of (observed - expected)² / expected

 Don't memorize the specific formula, just have an idea of how it works.
3. If the observed counts are very different from the expected counts, then the chi-squared statistic will be large and we will conclude that grades depend on gaming status.
If the observed counts are close to the expected counts, then the chi-squared statistic will be small, and we can't claim that grades depend on gaming status.
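
To see how the comparison of observed and expected counts turns into a single number, here is a minimal Python sketch using the standard Pearson chi-square formula, the sum over all cells of (observed - expected)² / expected. It is meant only to illustrate the idea, not to replace the Minitab output; the variable names are made up for the example.

```python
# Observed counts: first row gamers, second row nongamers.
observed = [
    [736, 450, 193],
    [205, 144, 80],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Each cell contributes (observed - expected)^2 / expected to the statistic,
# where expected = (row total x column total) / table total.
contributions = []
for row, row_total in zip(observed, row_totals):
    contribution_row = []
    for count, column_total in zip(row, column_totals):
        expected_count = row_total * column_total / grand_total
        contribution_row.append((count - expected_count) ** 2 / expected_count)
    contributions.append(contribution_row)

chi_square = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(value, 3) for value in row])
print(round(chi_square, 3))
```

Cells where the observed count is far from the expected count contribute the most, so the per-cell contributions also show where the table departs from "no relationship".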

1. What is the p-value if the chi-squared test statistic is 6.739 with 2 df?
2. What is the p-value if the chi-squared test statistic is 14.53 with 10 df?
3. How do we figure out the degrees of freedom (df) for a chi-squared test involving a two-way table?
4. Calculate df for the gaming example.
1. p-value = between 0.025 and 0.05
 2. p-value = 0.15

3. In a chi-squared test involving a two-way table that has r rows and c columns, the degrees of freedom are given by:
(r - 1)(c - 1)
4. 2 rows, 3 columns
(2 - 1)(3 - 1) =
1 x 2 =
2 df
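
The degrees of freedom and the two p-values above can be checked with a short Python sketch. One loud assumption: the p-value function below uses a closed form for the chi-square upper tail that is valid only when df is even (which covers both 2 df and 10 df here); it is not the table lookup or the Minitab routine used in the course. The function names are made up for this example.

```python
import math


def degrees_of_freedom(rows, columns):
    """df for a chi-squared test on a two-way table: (r - 1)(c - 1)."""
    return (rows - 1) * (columns - 1)


def p_value_even_df(statistic, df):
    """Upper-tail chi-square probability, closed form for EVEN df only.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only works for positive even df")
    half = statistic / 2
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


gaming_df = degrees_of_freedom(rows=2, columns=3)
p_gaming = p_value_even_df(6.739, gaming_df)
p_second = p_value_even_df(14.53, 10)

print(gaming_df)
print(round(p_gaming, 3))
print(round(p_second, 2))
```

The first p-value falls between 0.025 and 0.05, matching the table lookup, and agrees with the 0.034 reported for the gaming example.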

1. People often try to use chi-square tests as soon as they see a two-way table.
But what are the three things that need to be in place in order to use a chi-square test?
1. The numbers are counts of individuals (not percentages/summarized data)
2. Each individual is counted exactly once
3. The conditions are satisfied (representative sample and expected counts all at least 5...)

