Distance Education Statistics
Lab G


Eric Parslow
and
Russell T. Hurlburt
University of Nevada, Las Vegas


This lab will consist of 5 parts:
  1. Discriminating between two-independent-samples (Chapter 11) tests and two-dependent-samples (Chapter 12) tests
  2. Practice calculating the pooled variance and the standard error of the difference between two means (Chapter 11)
  3. Practice calculating the standard error of the mean of the differences (Chapter 12)
  4. Cumulative review
  5. The quiz




PART I. DISCRIMINATING BETWEEN TWO-INDEPENDENT-SAMPLES (CHAPTER 11) TESTS AND TWO-DEPENDENT-SAMPLES (CHAPTER 12) TESTS

Chapter 11 describes two-independent-sample tests. In such situations, there is no statistical relationship between the persons in each sample.

The problem might say:

'...10 subjects were randomly assigned to one group, and a different 10 subjects were randomly assigned to the other group...'

Chapter 12 describes two-dependent-sample tests. In such situations there is some statistical connection or relationship between the persons in each sample.

The problem might say:

'...10 subjects were given IQ tests. Then they received the pill. Then these same subjects were given the IQ test again...' (repeated measures design)

or

'...10 subjects were given the pill and then had their IQs measured. The identical twins of these 10 subjects received a placebo and then had their IQs measured...' (related subjects design)

or

'...20 subjects received a screening IQ test. We flipped a coin for the two individuals who received the highest scores on this screening test, assigning one of them to Group A and the other to Group B. Then we flipped a coin for the next two highest scores, again assigning one to Group A and the other to Group B. We continued to randomly assign one of each pair to each of the two groups...' (matched pair design)


Are these two-independent-sample tests (Chapter 11) or two-dependent-sample tests (Chapter 12)?

Q1. An experimenter is investigating whether men or women see more movies. She takes a random sample of men and a random sample of women and asks them how many movies they saw in the last month. Their responses were as follows [imagine a table here]. Can we conclude that men or women see more movies? This problem is a
a. two-independent-samples design
b. two-dependent-samples design

Q2. An experimenter is investigating whether men or women watch more movies. She takes a random sample of married couples and asks the men and women individually how many movies they watched in the last month. Their responses were as follows [imagine a table here]. Can we conclude that men or women see more movies? This problem is a
a. two-independent-samples design
b. two-dependent-samples design

Answer each problem on a piece of scratch paper before reading the answers below.

Answer: Q1 involves two different samples. There is no implication that these two samples are related. Therefore this is a Chapter 11 independent-samples design.

Q2 is a Chapter 12 related-subjects design because a man in the male sample is in fact related to a particular woman in the female sample (his wife), and vice versa.



PART II. PRACTICE CALCULATING THE POOLED VARIANCE AND THE STANDARD ERROR OF THE DIFFERENCE BETWEEN TWO MEANS (CHAPTER 11)

If the samples are independent, you will be computing t from the formula

\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} \]

To determine the denominator of t, you must determine variance within each group, the pooled variance, and the standard error of the difference between two means.

1. Determine the variances within each group. Use a formula from Chapter 5 such as the one below, or use the statistics supplied in the problem:

variance formula from Chapter 5:

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]

2. Pool the two variances together using one of the following formulas from Chapter 11:

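for unequal n:

\[ s^2_{\text{pooled}} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]

for equal n:

\[ s^2_{\text{pooled}} = \frac{s_1^2 + s_2^2}{2} \]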

3. Determine the standard error of the difference between two means using one of the following formulas from Chapter 11:

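for unequal n:

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2_{\text{pooled}} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

for equal n:

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{2 s^2_{\text{pooled}}}{n}} \]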


Example: Compute the standard error of the difference between two means for a problem that has two groups. The standard deviations of the two groups are s1=4 and s2=3, and the sample sizes are both 4.

Answer:

1. Determine the variances within each group. The problem gives s1 = 4, so \( s_1^2 = 16 \). Also s2 = 3, so \( s_2^2 = 9 \).

2. Determine the pooled variance. Because n1 = n2 = 4, we can use the simpler formula:

\[ s^2_{\text{pooled}} = \frac{s_1^2 + s_2^2}{2} = \frac{16 + 9}{2} = 12.5 \]



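3. Determine the standard error of the difference between two means. Because n1 = n2 = 4, we can use the simpler formula. Remember that n is equal to n1 (and to n2), so n = 4 in this example:

\[ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{2 s^2_{\text{pooled}}}{n}} = \sqrt{\frac{2(12.5)}{4}} = \sqrt{6.25} = 2.5 \]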
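The three steps of this example can be checked with a short Python sketch. The function names `pooled_variance` and `se_difference` are made up for illustration and are not from the textbook; the sketch uses the general (unequal n) formulas and confirms that the equal-n shortcuts give the same result.

```python
import math

def pooled_variance(var1, n1, var2, n2):
    """Pooled variance, general (unequal n) form: a weighted average of the two variances."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

def se_difference(pooled_var, n1, n2):
    """Standard error of the difference between two means, general (unequal n) form."""
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Values given in the example: s1 = 4, s2 = 3, n1 = n2 = 4
s1, s2, n = 4, 3, 4
var1, var2 = s1 ** 2, s2 ** 2            # step 1: variances 16 and 9

pooled = pooled_variance(var1, n, var2, n)   # step 2
se = se_difference(pooled, n, n)             # step 3

# Equal-n shortcut formulas, for comparison with the general ones
pooled_shortcut = (var1 + var2) / 2
se_shortcut = math.sqrt(2 * pooled_shortcut / n)

print(pooled, se)   # 12.5 2.5
```

Because the general formulas reduce to the shortcuts whenever n1 = n2, the general functions can be used for every problem; the shortcuts only save hand arithmetic.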






PART III. PRACTICE CALCULATING THE STANDARD ERROR OF THE MEAN OF THE DIFFERENCES (CHAPTER 12)

If the samples are dependent, you will be computing t from the formula

\[ t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} \]

To determine the denominator of t, you must determine the difference scores, determine the standard deviation of those differences, and then determine the standard error of the mean of the differences.

1. Determine the difference scores. This requires creating a new column in the original data. The difference score for the first point is the first value of X in the first group minus the first value of X in the second group, and so on.

2. Determine the standard deviation of the differences from the equation below. Remember that this formula first requires determining \( \bar{D} \), the mean of the difference scores, and then creating two new columns, one for \( D - \bar{D} \) and one for \( (D - \bar{D})^2 \).

\[ s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{n - 1}} \]


3. Determine the standard error of the mean of the differences from the equation

\[ s_{\bar{D}} = \frac{s_D}{\sqrt{n}} \]


Example: Compute the standard error of the mean of the differences for the following data:

X1   X2
 1    2
 7    7
 4    3
 2    4

Answer:

1. Determine the difference scores. Produce a new column for the differences "D":

X1   X2    D
 1    2   -1
 7    7    0
 4    3   +1
 2    4   -2


2. Compute the mean of the difference scores by adding down the D column and dividing by n, the number of pairs. Thus \( \bar{D} = -2/4 = -.5 \).

3. Create two new columns, one for \( D - \bar{D} \) and one for \( (D - \bar{D})^2 \):

X1   X2    D    \( D - \bar{D} \)   \( (D - \bar{D})^2 \)
 1    2   -1        -.5                  .25
 7    7    0         .5                  .25
 4    3   +1        1.5                 2.25
 2    4   -2       -1.5                 2.25
Sum:      -2         0                  5.0


4. Compute the standard deviation of the difference scores.

\[ s_D^2 = \frac{\sum (D - \bar{D})^2}{n - 1} = \frac{5.0}{3} = 1.667 \]

so \( s_D = \sqrt{1.667} = 1.291 \)

5. Compute the standard error of the mean of the differences:

\[ s_{\bar{D}} = \frac{s_D}{\sqrt{n}} = \frac{1.291}{\sqrt{4}} \approx .645 \]
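The five steps of this example can be reproduced with a short Python sketch. The variable names are illustrative only; the data are the four pairs from the example.

```python
import math

# Paired data from the example
x1 = [1, 7, 4, 2]
x2 = [2, 7, 3, 4]

# Step 1: difference scores, D = X1 - X2 for each pair
diffs = [a - b for a, b in zip(x1, x2)]        # [-1, 0, 1, -2]

# Step 2: mean of the difference scores (n = number of pairs)
n = len(diffs)
mean_d = sum(diffs) / n                        # -0.5

# Step 3: sum of the squared deviations (D - mean_d)^2
ss_d = sum((d - mean_d) ** 2 for d in diffs)   # 5.0

# Step 4: standard deviation of the difference scores (n - 1 in the denominator)
sd_d = math.sqrt(ss_d / (n - 1))

# Step 5: standard error of the mean of the differences
se_mean_d = sd_d / math.sqrt(n)

print(round(sd_d, 3), round(se_mean_d, 3))     # 1.291 0.645
```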





Review:

Chapter  Test                  Test statistic                                                                                   To compute denominator               df
11       2-independent-sample  \( t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} \)   1. Within-group variances            n1 + n2 - 2
                                                                                                                        2. Pooled variance
                                                                                                                        3. St. err. of diff. betw. 2 means
12       2-dependent-sample    \( t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} \)                                           1. Difference scores                 n - 1
                                                                                                                        2. Stand. dev. of diffs
                                                                                                                        3. St. err. of mean of diffs
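To make the contrast in the review table concrete, here is a minimal Python sketch that runs both routes on the same four pairs of scores used in Part III. The function names are hypothetical, and both functions assume the null hypothesis of no difference (so the population term in the numerator is 0). Only one route is correct for any real design; treating the paired data as independent here is purely for comparison.

```python
import math
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Chapter 11 route: pooled variance, df = n1 + n2 - 2."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, n1 + n2 - 2

def dependent_t(a, b):
    """Chapter 12 route: difference scores, df = n - 1 (n = number of pairs)."""
    d = [x - y for x, y in zip(a, b)]
    se = stdev(d) / math.sqrt(len(d))
    return mean(d) / se, len(d) - 1

x1 = [1, 7, 4, 2]
x2 = [2, 7, 3, 4]

t_ind, df_ind = independent_t(x1, x2)
t_dep, df_dep = dependent_t(x1, x2)

print(round(t_ind, 3), df_ind)   # t is about -0.293 with df = 6
print(round(t_dep, 3), df_dep)   # t is about -0.775 with df = 3
```

The numerator (the difference between the means, -.5) is the same on both routes; only the denominator and the degrees of freedom change, which is why identifying the design correctly comes first.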


PART IV. CUMULATIVE REVIEW
Test your skill on the following six problems. Answer each problem on a piece of scratch paper before reading its answer.


#1. A professor is studying the effectiveness of live instruction vs. instruction over the Internet. He has a group of 25 students attend a one-week class of live instruction and then gives them a test on the material. He then has the same students do a different but equivalent one-week class over the Internet. He administers another test to assess their knowledge after the Internet class. His question is whether there is any difference in learning between the two methods.

a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #1: This asks a yes/no question ('is there any difference'), so it is a hypothesis test of some kind. The design measures the same people in two conditions, which is a repeated measures design. Therefore this is (f) a two-dependent-sample t test.



#2. A retirement home houses 400 residents. The home manager wishes to know the average IQ of the residents, but she cannot afford to test all 400. She decides to administer an IQ test to a random sample of 15 of the residents. They scored as follows [imagine a table here]. What can the manager say about the average IQ of her residents?
a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #2: This asks what can be said about the population mean IQ (of all 400 residents) based on a sample (of 15 residents). This is therefore (b) a confidence interval.



#3. A professor is studying the effectiveness of live instruction vs. instruction over the Internet. He traditionally teaches an entire course to a class of 25 students, while a different group of 9 students take the same course over the Internet. His question is whether there is any difference in learning between the two methods.

a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #3: This asks a yes/no question ('is there any difference'), so it is a hypothesis test of some kind. The design measures two different groups of people. Therefore this is (e) a two-independent-sample t test.



#4. A musician submits a song to a record company to do a recording. The company says the song is too long (3.8 minutes). The disheartened musician decides to go home and check the song lengths of every song in his record collection (1000 songs). He finds they have a mean of 3.2 minutes, with a standard deviation of .7 of a minute. He goes back to the record company to argue in favor of his song. What percentage of recorded songs can he say are generally longer than his?
a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #4: The problem's answer will be a percentage, and it is not a confidence interval for a proportion, so the question refers to (a) the % area under the curve.
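For readers who want to carry problem #4 through to a number, the following Python sketch computes the area. It assumes that song lengths are approximately normally distributed, which the problem does not state, and it uses the standard library's `statistics.NormalDist` in place of a printed z table.

```python
from statistics import NormalDist

# Values given in problem #4
mean_len = 3.2   # mean song length in minutes
sd_len = 0.7     # standard deviation in minutes
song = 3.8       # length of the musician's song in minutes

# z score of the musician's song within the collection
z = (song - mean_len) / sd_len

# Area under the standard normal curve above z, as a percentage
# (assumption: song lengths are roughly normal)
pct_longer = (1 - NormalDist().cdf(z)) * 100

print(round(z, 2), round(pct_longer, 1))
```

Under that assumption, z is about .86 and a little under 20% of songs are longer than his.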



#5. Sandra runs an automobile repair company with two shops, one in Johnstown and the other in Jamestown. She wishes to know whether her mechanics in Johnstown charge more than do her mechanics in Jamestown. She takes a random sample of 25 charges in her Johnstown shop, and finds that the average charge was $150, with a standard deviation of $55. Then she takes a random sample of 25 charges in her Jamestown shop, and finds that the average charge was $137, with a standard deviation of $40. Is there a significant difference in the charges at the two shops?
a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #5: This asks a yes/no question ('is there a significant difference'), so it is a hypothesis test of some kind. The design measures two different groups of charges. Therefore this is (e) a two-independent-sample t test.
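Problem #5 supplies everything needed for the Part II calculations, so the test statistic can be sketched in Python as below. The variable names are illustrative, and the numerator assumes the null hypothesis of no difference between the shops; deciding significance would still require looking up the critical value of t, which is not done here.

```python
import math

# Summary statistics given in problem #5
mean_johnstown, sd_johnstown, n_johnstown = 150, 55, 25
mean_jamestown, sd_jamestown, n_jamestown = 137, 40, 25

# Step 1: within-group variances
var_johnstown = sd_johnstown ** 2    # 3025
var_jamestown = sd_jamestown ** 2    # 1600

# Step 2: pooled variance (general formula; equals the simple average when n1 = n2)
df = n_johnstown + n_jamestown - 2
pooled = ((n_johnstown - 1) * var_johnstown
          + (n_jamestown - 1) * var_jamestown) / df

# Step 3: standard error of the difference between two means
se = math.sqrt(pooled * (1 / n_johnstown + 1 / n_jamestown))

# Test statistic, assuming the null hypothesis mu1 - mu2 = 0
t = (mean_johnstown - mean_jamestown) / se

print(pooled, round(se, 2), round(t, 2), df)
```

The pooled variance is 2312.5, the standard error is the square root of 185 (about 13.6), and t is a little under 1 with 48 degrees of freedom.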




#6. Each year a local radio station has a fundraiser. In the years gone by, the station has kept close records of how much money has been raised. Each year they raise an average of 2500 dollars, with a standard deviation of 600 dollars. For the past 3 years, they have tried a new format for the fundraiser, and have raised an average of 2000 dollars during these 3 years. Is the amount raised during the last three years significantly different from previous years, or is it simply a normal fluctuation?
a. % area under the curve.
b. confidence interval.
c. one-sample z test.
d. one-sample t test.
e. two-independent-sample t test.
f. two-dependent-sample t test.

Answer #6: This asks a yes/no question ('is the amount ... significantly different'), so it is a hypothesis test of some kind. The design measures one set of outcomes (the last three years) and compares them to the known population ('close records from years gone by'). Therefore this is a one-sample test. Because sigma is given (600 dollars) in the problem, it is (c) a one-sample z test.
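As a sketch of how the one-sample z statistic for problem #6 would be computed, the Python below uses the standard error of the mean, sigma divided by the square root of n. The variable names are illustrative; comparing z with a critical value is left to the reader.

```python
import math

# Values given in problem #6
mu = 2500           # long-run average raised per year (population mean)
sigma = 600         # population standard deviation
sample_mean = 2000  # average raised in the last 3 years
n = 3               # number of years under the new format

# Standard error of the mean, using the known sigma
se_mean = sigma / math.sqrt(n)

# One-sample z statistic
z = (sample_mean - mu) / se_mean

print(round(se_mean, 1), round(z, 2))
```

Here the standard error is about 346.4 dollars and z is about -1.44.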


PART V: THE QUIZ

In the quiz you will be expected to:
1. Be able to determine when a design is independent samples and when it is dependent samples.
2. Be able to identify all of the types of questions presented so far (cumulative review).
3. Know when to compute the pooled variance and the standard error of the difference between two means, and when to compute the standard deviation of the differences and the standard error of the mean of the differences.
4. Be able to compute the pooled variance and the standard error of the difference between two means given two sets of sample data.
5. Be able to compute the standard deviation of the differences and the standard error of the mean of the differences given two sets of sample data.

If you are not comfortable with each of these tasks, you will want to either review this lab again or seek further guidance in Chapters 11 and 12 in the textbook.

Once you are comfortable with these tasks, you are prepared to take the quiz.


You will need the quiz password, which is
7654321