# Learn by Doing 2

Matched Pairs: In this lab you will learn how to conduct a matched pairs T-test for a population mean using StatCrunch. We will work with a data set that has historical importance in the development of the T-test.

Some features of this activity may not work well on a cell phone or tablet. We highly recommend that you complete this activity on a computer.

Here are the directions, grading rubric, and definition of *high-quality feedback* for the *Learn by Doing* discussion board exercises.

A list of StatCrunch directions is provided at the bottom of this page.

### CONTEXT

#### GOSSET’S SEED PLOT DATA

William S. Gosset was employed by the Guinness brewing company of Dublin. Sample sizes available for experimentation in brewing were necessarily small. Gosset contacted the famous statistician Karl Pearson (1857-1936) and was told that there were no techniques for developing probability models for small data sets. Gosset then studied under Pearson, and the outcome of his study was perhaps the most famous paper in statistical literature, “The Probable Error of a Mean” (1908), which introduced the T-distribution.

Since Gosset was employed by Guinness, any work he produced would be owned by Guinness, so he published under a pseudonym, “Student”; hence, the T-distribution is often referred to as *Student’s T-distribution*.

To illustrate his analysis, Gosset used the results of seeding 11 different plots of land with two different types of seed: regular and kiln-dried. He wanted to determine if drying seeds before planting increased plant yield. Since some plots of soil may be naturally more fertile than others, this confounding variable was eliminated by using a matched pairs design: both types of seed were planted in all 11 plots.

The resulting data (corn yield in pounds per acre) are as follows.

Plot | Regular seed | Kiln-dried seed |
---|---|---|
1 | 1903 | 2009 |
2 | 1935 | 1915 |
3 | 1910 | 2011 |
4 | 2496 | 2463 |
5 | 2108 | 2180 |
6 | 1961 | 1925 |
7 | 2060 | 2122 |
8 | 1444 | 1482 |
9 | 1612 | 1542 |
10 | 1316 | 1443 |
11 | 1511 | 1535 |

We use these data to test the hypothesis that kiln-dried seed yields more corn than regular seed.

Because of the nature of the experimental design (matched pairs), we are testing the difference in yield.

Plot | Regular seed | Kiln-dried seed | Difference |
---|---|---|---|
1 | 1903 | 2009 | −106 |
2 | 1935 | 1915 | 20 |
3 | 1910 | 2011 | −101 |
4 | 2496 | 2463 | 33 |
5 | 2108 | 2180 | −72 |
6 | 1961 | 1925 | 36 |
7 | 2060 | 2122 | −62 |
8 | 1444 | 1482 | −38 |
9 | 1612 | 1542 | 70 |
10 | 1316 | 1443 | −127 |
11 | 1511 | 1535 | −24 |

Note that the differences were calculated as *regular* − *kiln-dried*.
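The subtraction order determines the sign of every difference. As a quick cross-check outside StatCrunch, the following Python sketch recomputes the differences from the two yield columns in the first table. The variable names are illustrative and are not part of the lab's data file.

```python
# Yields (pounds per acre) for plots 1-11, copied from the first table above.
regular = [1903, 1935, 1910, 2496, 2108, 1961, 2060, 1444, 1612, 1316, 1511]
kiln_dried = [2009, 1915, 2011, 2463, 2180, 1925, 2122, 1482, 1542, 1443, 1535]

# Difference for each plot, in the order used by this lab: regular - kiln-dried.
differences = [r - k for r, k in zip(regular, kiln_dried)]

for plot, d in enumerate(differences, start=1):
    print(f"Plot {plot}: {d}")
```

A negative difference means the kiln-dried seed produced the higher yield on that plot.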

### VARIABLES

- *Regular seed*: regular seeds that were traditionally used for planting
- *Kiln-dried seed*: seeds that were kiln-dried before planting

### DATA

Download the *seed* data file, and then upload the file into StatCrunch.

### PROMPT

- State the hypotheses and define the parameter.
- Checking conditions: Since Gosset invented the T-distribution, we will assume that his sample meets the conditions and proceed with the T-test. Regardless, answer these questions to demonstrate your understanding of the conditions for use of the T-model. But first, you will need to review the dotplots for the data.
  - Which graph is used to check conditions? Why?
  - What do we look for in the graph to verify that conditions are met?
  - What else do we need to know about the sample of seeds before using the T-test?
- Use StatCrunch to find the T-score and the P-value. Hint: as you work through the StatCrunch directions, keep in mind that we want to calculate the differences as *regular* − *kiln-dried*, so choose *Regular seed* for Sample 1 and *Kiln-dried seed* for Sample 2. Copy and paste the information in the StatCrunch output window into your initial post.
- State a conclusion based on the context of this scenario.
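If you want to cross-check the StatCrunch output, here is a minimal Python sketch of the matched pairs T-test that uses only the standard library. It rests on three assumptions that are not part of the lab:

- The T-score is the usual paired statistic: the mean difference divided by its standard error, with n − 1 degrees of freedom.
- The alternative is left-tailed (mean difference below zero). This follows from the claim that kiln-dried seed yields more and from subtracting *regular* − *kiln-dried*.
- The P-value is approximated by numerical integration rather than taken from a statistics library. The helper names `t_pdf` and `t_cdf_lower` are hypothetical.

```python
import math
import statistics

# Yields (pounds per acre) for plots 1-11, copied from the first data table.
regular = [1903, 1935, 1910, 2496, 2108, 1961, 2060, 1444, 1612, 1316, 1511]
kiln_dried = [2009, 1915, 2011, 2463, 2180, 1925, 2122, 1482, 1542, 1443, 1535]

# Differences in the order used by this lab: regular - kiln-dried.
differences = [r - k for r, k in zip(regular, kiln_dried)]

n = len(differences)
df = n - 1                                # degrees of freedom
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation
t_score = mean_diff / (sd_diff / math.sqrt(n))


def t_pdf(x, df):
    """Density of Student's T-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf_lower(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area


# Left-tailed P-value (assumed alternative: mean difference < 0).
p_value = t_cdf_lower(t_score, df)

print(f"mean difference = {mean_diff:.2f}")
print(f"standard deviation of differences = {sd_diff:.2f}")
print(f"T-score = {t_score:.3f}, df = {df}")
print(f"P-value (left-tailed) = {p_value:.4f}")
```

Compare the printed T-score and P-value with the StatCrunch output window. Small disagreements in the last decimal places can come from rounding or from the numerical approximation used here.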