Statistics: T-Distribution and T-Tests
Review on Hypothesis Testing
The objective of the hypothesis testing procedure is to find out
whether we can reject the null hypothesis. To reject or not to reject the null
hypothesis is a decision made by the researcher. This decision must be based
upon an explicitly stated criterion. One has to specify exactly under what
conditions one is willing to reject the null hypothesis. The criterion
is applied to the research result, that is, to a key statistic computed
from the observed sample. It could be an individual score (if there
is one person in the sample, N = 1) or a mean (if there are multiple persons in
the sample, N > 1). One wants to
create a dilemma between the sample statistic (i.e., the observation) and the
null hypothesis. This dilemma or conflict can be represented in terms of
probability or likelihood. The researcher wants to show that if the null
hypothesis were true, a sample statistic (score or mean) as extreme as the one
observed would be highly unlikely. How can we find this probability?
Now let’s think about why we
need the comparison distribution. What is the comparison distribution? What information
does it give us? What kind of use do we put it to? The
comparison distribution is the distribution that the sample must belong to if
the null hypothesis is true. By pretending that the sample had been
drawn from the population described by the comparison distribution, we are
assuming that the null hypothesis is true. The distribution is a distribution
of probability. Some scores on the distribution are more likely, such as those
around the mean. We are interested in those rare scores, those extreme scores
at the tails of the distribution. If we can show that the sample statistic
falls in the region of the distribution that is extremely rare, for instance
less than 5% or 1% out of the entire distribution, then we can say that the
observed sample statistic is unlikely if the null hypothesis is true. This
satisfies the condition we set up for rejecting the null hypothesis.
Another issue is why we have
to deal with different distributions concerning the same population. We have a
distribution of individual scores and distribution of means. Which one should
we use? It depends on the sample statistic. If the sample statistic is a single
score (N = 1), we need to compare it with the distribution of single scores. If
the sample statistic is a sample mean (N >1), then we have to compare it
against a distribution of means.
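To make the difference between the two distributions concrete, here is a minimal simulation sketch in Python. The population (10,000 roughly normal scores with mean 2 and standard deviation 1), the number of samples, and the seed are made-up values for illustration, not data from these notes. The sketch compares the spread of individual scores with the spread of means of samples of size 6.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 scores, roughly normal with mean 2 and SD 1.
population = [random.gauss(2.0, 1.0) for _ in range(10_000)]

# Distribution of means: the mean of each of 2,000 random samples of size 6.
sample_means = [statistics.mean(random.sample(population, 6)) for _ in range(2_000)]

sd_scores = statistics.pstdev(population)
sd_means = statistics.pstdev(sample_means)

print(f"SD of individual scores:    {sd_scores:.2f}")
print(f"SD of sample means (N = 6): {sd_means:.2f}")
```

The means should vary noticeably less than the individual scores, which is why a sample mean has to be compared against the distribution of means rather than the distribution of single scores.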
T-Test for a Single Sample
We have learnt how to do
hypothesis testing in situations where we know the mean and variance
(therefore standard deviation) of the comparison distribution. But in
real-world research we don’t have that much information about the population
distributions. We have to estimate the means and standard deviation of the
populations we are interested in from the data we collect in the experiment.
This process is part of inferential statistics because we are making inferences
about populations, which we cannot actually measure, from actual observations
made on samples drawn from the populations.
This time, we learn
hypothesis testing in a situation where we know the mean of the comparison
distribution, but don’t know the variance. Therefore we have to estimate
the variance and standard deviation from
the sample statistics.
Example 1:
I believe that single
parents spend more time with their children than parents on average. Census
data show that parents on average spend 2 hours each day with their
children. I interviewed 6 single parents and found that on average they
spend 3 hours each day with their children (see following data). Do single
parents actually spend more time with their children than parents on average?
Data
Name     Hours spent with child
Jim      4
Mary     2
Joe      3
Kim      3
Kevin    2
Judy     4
To summarize the raw data:
X    (X - M)    (X - M)^2
4     1         1
2    -1         1
3     0         0
3     0         0
2    -1         1
4     1         1
M = (4 + 2 + 3 + 3 + 2 + 4)/6 = 3
SS = Σ(X - M)^2 = 1 + 1 + 0 + 0 + 1 + 1 = 4
S^2 = Σ(X - M)^2/(N - 1) = SS/(N - 1) = 4/5 = 0.80
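The summary above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import statistics

hours = [4, 2, 3, 3, 2, 4]  # hours per day for the 6 single parents

n = len(hours)
mean = sum(hours) / n                      # M
ss = sum((x - mean) ** 2 for x in hours)   # SS, the sum of squared deviations
variance = ss / (n - 1)                    # S^2, the estimated population variance

print(mean, ss, variance)

# statistics.variance divides by N - 1 as well, so it should agree with S^2.
print(statistics.variance(hours))
```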
Step 1: state the research
and null hypotheses in terms of populations.
Pop. 1: single
parents
Pop. 2: average
parents
Research/alternative hypothesis (HA): single
parents spend more time w/kids than average parents
Null hypothesis (H0): single
parents spend same time as average parents
                    | Reality: H0 true | Reality: H0 false |
Decision: reject H0 | Type I error     | Correct decision  |
Do not reject H0    | Correct decision | Type II error     |
What would be a Type I error in this context?
What would be a Type II error in this context?
What would be statistical power in this context?
According to the null hypothesis, what is the average amount of time spent by single parents?
μ for single parents will be 2 hours, the same as for average parents.
Step 2: Determine the characteristics of the comparison distribution, that is, the distribution that the sample statistic would have come from if the null hypothesis were true. It is called the comparison distribution because you compare the research result (the mean time spent with children by the 6 single parents actually observed) against it to find out how likely such a result would be under the null hypothesis.
To represent the distribution of means of time
spent with children with a sample size of 6 among average parents, one has to
use a t distribution, not a z or normal distribution, because one does
not know the variance of this comparison distribution and has to estimate it.
The t distribution resembles the z distribution but has heavier tails, and its
exact shape depends on the degrees of freedom used in the estimate.
How do you estimate the variance of the comparison
distribution? You have to start with the research data, because these are all
you know about the population. The research results are informative about the
population because you assume the sample observed in the experiment has been
randomly drawn from the population. If there is a lot of variance in the
population, you would find a lot of variance in the sample. If there is little
variance in the population, you would also find little variance in the sample.
First you estimate the variance of the distribution of time spent with children
by parents on average (this is the distribution of individual scores, not means).
S^2 = Σ(X - M)^2/(N - 1) = SS/(N - 1) = 4/5 = 0.8
Here (N - 1) is called the degrees of freedom
(df) for estimating the population variance.
The df gives the number of data points free to vary once the mean is determined.
Our comparison
distribution is the distribution of means, rather than distribution of single
scores. So we have to estimate the variance of the distribution of means.
S_M^2 = S^2/N = 0.8/6 = 0.133
S_M = √(S_M^2) = √0.133 = 0.365
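As a quick check of this step, the sketch below recomputes the variance and standard deviation of the distribution of means in Python. The variable names are illustrative.

```python
import math

s2 = 0.8   # S^2, estimated variance of individual scores (from the step above)
n = 6      # sample size

var_of_means = s2 / n                    # S_M^2
sd_of_means = math.sqrt(var_of_means)    # S_M, also called the standard error

print(round(var_of_means, 3), round(sd_of_means, 3))
```

Carrying the unrounded variance into the square root is what gives 0.365; taking the square root of the rounded value 0.13 would give about 0.361 instead.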
We also know the mean of the distribution of
individual scores for average parents (2 hours), which is also the mean of the
distribution of means for this population. So we have determined the necessary
characteristics of the comparison distribution.
Step 3: determine the cutoff points.
We will use a 5% level of significance as the cutoff. If the likelihood of
getting a sample mean as extreme as or more
extreme than the one we got in the experiment is less than 5 percent, then we
will reject the null hypothesis. Because the research hypothesis predicts a
higher mean, this is a one-tailed test: we need the lowest t score that marks
off the highest 5% of the t distribution with 5 degrees of freedom.
Cutoff: t(5) >= 2.015
Step 4. Now we need to calculate the position of
the sample statistic on the comparison distribution. In this particular case,
we need to transform the sample mean (M) into a t score using the mean and
standard deviation of the comparison distribution (choosing the right
standard deviation is critical, since we have computed several; here it is
S_M, the standard deviation of the distribution of means).
t = (M - μ)/S_M = (3 - 2)/0.365 = 2.74
Step 5. Deciding whether or not to reject the null hypothesis.
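Steps 2 through 5 for this example can be put together in a short Python sketch. The function name `one_sample_t` is hypothetical, and the cutoff 2.015 is copied from Step 3 rather than computed.

```python
import math

def one_sample_t(scores, mu):
    """Return (t, df) for a single-sample t test against a known mean mu."""
    n = len(scores)
    m = sum(scores) / n
    s2 = sum((x - m) ** 2 for x in scores) / (n - 1)  # estimated population variance
    s_m = math.sqrt(s2 / n)                           # SD of the distribution of means
    return (m - mu) / s_m, n - 1

hours = [4, 2, 3, 3, 2, 4]
t, df = one_sample_t(hours, mu=2)

cutoff = 2.015              # one-tailed, p < .05, df = 5 (from Step 3)
reject_h0 = t >= cutoff

print(f"t({df}) = {t:.2f}, reject H0: {reject_h0}")
```

With these data t is about 2.74, which is beyond the cutoff of 2.015, so the sketch reports that the null hypothesis is rejected.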
Example 2:
In order to show that the workshops offered in "Project Impact" (the anti-alcoholism program) have effectively reduced the number of beers students consume at each party, a researcher measured the number of beers drunk at a particular party by 5 students who had participated in the "Project Impact" workshops (see the following data). The researcher knows that beer consumption at UW-Stout is normally distributed with a mean of 4, but does not know the standard deviation or variance of this population. Can we reject the null hypothesis at the .05 level?
Name     # beers
Jim      1
Mary     2
Joe      3
Kim      1
Kevin    1
Descriptive statistics:
Name     X    (X - M)    (X - M)^2
Jim      1    -0.6       0.36
Mary     2     0.4       0.16
Joe      3     1.4       1.96
Kim      1    -0.6       0.36
Kevin    1    -0.6       0.36
M = (1 + 2 + 3 + 1 + 1)/5 = 1.6
SS = 3.2
S^2 = SS/(N - 1) = 3.2/4 = 0.8
S_M^2 = S^2/N = 0.8/5 = 0.16
S_M = √0.16 = 0.4
HA: Participants in the workshops drink less on average than average students
H0: Participants in the workshops drink the same amount on average as average students
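As a check on the numbers above, the following Python sketch carries this example through to a t score. The cutoff of -2.132 is an assumption taken from a standard t table (one-tailed, p < .05, df = 4); it is negative because the research hypothesis predicts a mean below 4.

```python
import math
import statistics

beers = [1, 2, 3, 1, 1]   # Jim, Mary, Joe, Kim, Kevin
mu = 4                    # known population mean

n = len(beers)
m = statistics.mean(beers)
s2 = statistics.variance(beers)   # SS/(N - 1)
s_m = math.sqrt(s2 / n)           # SD of the distribution of means
t = (m - mu) / s_m

cutoff = -2.132           # assumed t-table value: one-tailed, p < .05, df = 4
reject_h0 = t <= cutoff

print(f"t({n - 1}) = {t:.2f}, reject H0: {reject_h0}")
```

Under these assumptions t comes out at about -6, well below the cutoff, so the sketch reports that the null hypothesis is rejected.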
T-Test for Dependent Means
In some
situations, even though we do not know the population mean, we are quite ready
to assume what the population mean would be if the null hypothesis were true. One
such situation is when we are dealing with difference scores. We deal with
difference scores when we study change, e.g., change in behavior, thoughts or
physiological functioning. For instance, after we give patients a
particular therapy, we want to find out how effective the therapy is, in other
words, how much the patients have benefited from the therapy. How can we measure
this change? The answer is “by calculating the difference between the patients’
performance before and after the therapy.” We are trying to show that the
therapy is useful (the research hypothesis). According to the null hypothesis
it is not. What would be the difference between before and after the therapy if
the null hypothesis were true? The answer is “Zero.”
Example 1
A researcher wants to test
the effectiveness of a new therapy for snake phobia, a condition in which persons
have an irrational fear of harmless snakes. He believes that this therapy would
effectively reduce people's fear of snakes. He recruited 6 snake-phobic
subjects. Before giving the treatment, he measured each subject's fear of snakes
on a 0 to 10 scale, from not afraid at all to extremely afraid, with
intermediate levels of fearfulness in between. Then he gave each subject the
treatment. Immediately after the treatment, he measured the subjects' fear of
snakes again, on the same 0 to 10 scale.
The following are the data the researcher has collected. Do the
appropriate statistical test with a significance level of p < .05. Show the
five steps of hypothesis testing.
Subject #   Fear before therapy   Fear after therapy
1           9                     2
2           10                    3
3           9                     2
4           8                     3
5           10                    4
6           9                     1
Summarize the data:
Subject #   Fear before   Fear after   Difference   Deviation from mean   Squared deviation
1           9             2            7            7 - 6.67 =  0.33      0.11
2           10            3            7            7 - 6.67 =  0.33      0.11
3           9             2            7            7 - 6.67 =  0.33      0.11
4           8             3            5            5 - 6.67 = -1.67      2.79
5           10            4            6            6 - 6.67 = -0.67      0.45
6           9             1            8            8 - 6.67 =  1.33      1.77
M_diff (mean of differences) = (7 + 7 + 7 + 5 + 6 + 8)/6 = 6.67
SS of differences = 0.11 + 0.11 + 0.11 + 2.79 + 0.45 + 1.77 = 5.34
S^2 of differences = 5.34/(6 - 1) = 1.068
S of differences = √1.068 = 1.033
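The difference scores and their summary statistics can be reproduced with the Python sketch below (variable names are illustrative). Because it does not round the deviations before squaring them, it gives SS of about 5.33 rather than the hand-calculated 5.34; the two agree to within rounding.

```python
import math

before = [9, 10, 9, 8, 10, 9]   # fear before therapy
after = [2, 3, 2, 3, 4, 1]      # fear after therapy

# One difference score per subject: before minus after.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
m_diff = sum(diffs) / n
ss = sum((d - m_diff) ** 2 for d in diffs)
s2 = ss / (n - 1)     # estimated variance of the difference scores
s = math.sqrt(s2)

print(diffs)
print(f"M_diff = {m_diff:.2f}, SS = {ss:.2f}, S^2 = {s2:.3f}, S = {s:.3f}")
```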
Step 1: restate research and null hypothesis in terms of populations
Pop 1
Pop 2
H1
H0
Step 2: determine the characteristics of the comparison distribution.
Since we are using an estimated
population variance, the comparison distribution is a t distribution.
The mean of the comparison distribution: 0.
Determine the variance of
the comparison distribution: estimate from the sample
S^2 = SS/(N - 1) = 5.34/(6 - 1) = 1.068
S^2 is the estimated population variance. It is written with a Roman letter rather
than a Greek one because it is estimated from the sample statistics. This is a
convention among statisticians.
N - 1 is the degrees of freedom for estimating the population variance.
Have we got the variance of
the comparison distribution? No. S^2
is the estimated variance of the distribution of individual difference scores.
But our comparison distribution is a distribution of mean difference scores.
S_M^2 = S^2/N = 1.068/6 = 0.178
S_M = √(S_M^2) = √0.178 = 0.42
Now we have determined that
the comparison distribution is a t distribution with 5 degrees of freedom, with
a mean of 0 and a standard deviation of 0.42.
Step 3: determine the cutoff points.
One-tailed test.
p < .05
df = 5
t score cutoff = 2.015
Step 4: convert the experimental result into a t score
t = (M_diff - 0)/S_M = (6.67 - 0)/0.42 = 15.88
Step 5: decide whether to
reject null hypothesis or not
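The whole procedure for dependent means can be sketched in Python as follows. The function name `dependent_t` is hypothetical, and the cutoff is the value given in Step 3. Because the sketch carries full precision through every step, it gives t of about 15.81 rather than the hand-calculated 15.88, which used the rounded standard deviation 0.42.

```python
import math

def dependent_t(before, after):
    """Return (t, df) for a t test on difference scores (before minus after)."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    m_diff = sum(diffs) / n
    s2 = sum((d - m_diff) ** 2 for d in diffs) / (n - 1)  # variance of difference scores
    s_m = math.sqrt(s2 / n)                               # SD of the distribution of mean differences
    return (m_diff - 0) / s_m, n - 1                      # 0 is the mean under H0

before = [9, 10, 9, 8, 10, 9]
after = [2, 3, 2, 3, 4, 1]
t, df = dependent_t(before, after)

cutoff = 2.015            # one-tailed, p < .05, df = 5 (from Step 3)
reject_h0 = t >= cutoff

print(f"t({df}) = {t:.2f}, reject H0: {reject_h0}")
```

Either way the t score is far beyond the cutoff, so the sketch reports that the null hypothesis is rejected.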
Example 2
A researcher believes that
testosterone increases aggressiveness in male animals. He obtained 3 male rats
as his experimental subjects and gave them a large dosage of testosterone. He
measured the number of times that each rat attacked other rats before and after
the testosterone injection. The following are the data the researcher has
collected. Do the appropriate statistical test with a significance level of p
< .01. Show the five steps of hypothesis testing.
Subject #   Attacks before injection   Attacks after injection   Diff (before - after)   Dev.   (Dev.)^2
1           2                          9                         -7                      -2     4
2           2                          6                         -4                       1     1
3           5                          9                         -4                       1     1
_______________________________________________________________________________
Descriptive Stats:
M_diff = -15/3 = -5
SS_diff = 6
S^2 = SS/(N - 1) = 6/2 = 3
S_M^2 = S^2/N = 3/3 = 1, S_M = √(S_M^2) = 1
t = [(-5) - 0]/1 = -5
df = 2
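The figures above can be verified with the short Python sketch below. The cutoff of -6.965 is an assumption taken from a standard t table (one-tailed, p < .01, df = 2); it is negative because the differences are computed as before minus after, so an increase in attacks shows up as a negative difference.

```python
import math

before = [2, 2, 5]   # attacks before injection
after = [9, 6, 9]    # attacks after injection

diffs = [b - a for b, a in zip(before, after)]   # before minus after
n = len(diffs)
m_diff = sum(diffs) / n
s2 = sum((d - m_diff) ** 2 for d in diffs) / (n - 1)
s_m = math.sqrt(s2 / n)
t = (m_diff - 0) / s_m
df = n - 1

cutoff = -6.965      # assumed t-table value: one-tailed, p < .01, df = 2
reject_h0 = t <= cutoff

print(f"t({df}) = {t:.2f}, reject H0: {reject_h0}")
```

Under these assumptions t = -5 does not reach the cutoff, so with only 3 rats the null hypothesis would not be rejected at the .01 level.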
T-Test for Independent Means
Example 1:
A researcher is interested in the effect of experience on people's fearfulness when engaging in scary activities (such as parachute jumping). He thought experienced individuals would be less fearful than novices when engaging in scary activities. So he did the following experiment. He selected 6 people who were about to make the first parachute jump of their lives (the first-timers) and 6 people who had trained in a parachuting club for more than one year (experienced parachuters). The experimenter measured each subject's fearfulness just before he or she jumped from the plane on a scale of 1 to 10, where 1 means not fearful at all and 10 means extremely fearful. The following are the data the researcher has collected from the experiment.
Subject #   Experience (0 = first-timer, 1 = experienced)   Fearfulness
1           0                                               9
2           0                                               10
3           0                                               9
4           0                                               8
5           0                                               10
6           0                                               9
7           1                                               2
8           1                                               3
9           1                                               2
10          1                                               3
11          1                                               4
12          1                                               1
Step 1: restate the research and null hypotheses in terms of
populations.
Pop1:
Pop 2:
H1
H0
Step 2: determine the characteristics of the comparison distribution.
We have to decide which distribution shall be used as the comparison distribution on the basis of what kind of data we have.
Here we have a measure of the same variable (fearfulness) from two different samples, representing two different populations (experienced vs. inexperienced). We are trying to show that the two populations have different mean fearfulness. We show this by rejecting the H0, which says that the two populations do not have different fearfulness.
This means that if the null hypothesis is true, the difference between the
means of the two populations would be 0. In our research, the data consist of two
sample means. If the null hypothesis is true, the difference between the sample
means should be 0 in the long run. Therefore our comparison distribution should
be a distribution of differences
between two sample means, the sample mean from the experienced
population and the sample mean from the inexperienced population. Under the
null hypothesis we assume this distribution of differences between the two
sample means has a mean of 0. We don't have information about
the variance of this distribution, so we have to estimate it from what we know
about the populations.
First of all, when we have a sample of scores, we can estimate the mean
and the variance of the population the sample comes from. Here we have two samples
of scores, one from the experienced population, one from the inexperienced
population. According to the research hypothesis, these two samples should have
come from different populations, the distributions of which have different
means and maybe different standard deviations. However, we are working under
the null hypothesis, so it is assumed that the two samples have been drawn
randomly from the same population. The null hypothesis says that the two
populations should have the same mean and standard deviation on fearfulness.
Therefore the estimates of means and variances from the two samples should
apply to the same distribution, i.e. a distribution with the same mean and same
variance. What this means is that using the two samples, we can make two separate estimates of the same distribution. The two
estimates will not yield identical estimated means and variances, because of
the random errors involved in the scores in the sample. However, under the null
hypothesis, the samples should have been drawn from the same population, and
any differences in the estimate are only due to random error.
Let's do this estimating first. What information do we have already,
and starting from what we already know, what population parameters can we estimate
first? These first estimates may not be
the parameters that we want eventually, but they get us closer to them. First
we can get two separate estimates of the population variances from the two
samples. Is this an estimate of the distribution of individual scores or means?
Individual scores because the estimates were based on individual scores from
the samples.
The inexperienced group:
Fearfulness (X)   (X - M)   (X - M)^2
9                 -0.17     0.03
10                 0.83     0.69
9                 -0.17     0.03
8                 -1.17     1.37
10                 0.83     0.69
9                 -0.17     0.03
___________________________________________
SS1 = 2.84
M1 = (9 + 10 + 9 + 8 + 10 + 9)/6 = 55/6 = 9.17
S1^2 = SS1/(N1 - 1) = 2.84/5 = 0.57
The experienced group:
Fearfulness (X)   (X - M)   (X - M)^2
2                 -0.5      0.25
3                  0.5      0.25
2                 -0.5      0.25
3                  0.5      0.25
4                  1.5      2.25
1                 -1.5      2.25
_________________________________________
SS2 = 5.5
M2 = (2 + 3 + 2 + 3 + 4 + 1)/6 = 15/6 = 2.5
S2^2 = SS2/(N2 - 1) = 5.5/5 = 1.1
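The two separate estimates can be checked with Python's standard library, as in the sketch below. The variable names are illustrative; `statistics.variance` divides by N - 1, matching the formula used here.

```python
import statistics

first_timers = [9, 10, 9, 8, 10, 9]   # the inexperienced group
experienced = [2, 3, 2, 3, 4, 1]      # the experienced group

m1 = statistics.mean(first_timers)
s2_1 = statistics.variance(first_timers)   # S1^2 = SS1/(N1 - 1)

m2 = statistics.mean(experienced)
s2_2 = statistics.variance(experienced)    # S2^2 = SS2/(N2 - 1)

print(f"M1 = {m1:.2f}, S1^2 = {s2_1:.2f}")
print(f"M2 = {m2:.2f}, S2^2 = {s2_2:.2f}")
```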
So we have made separate estimates of the population variances from the
two samples. According to the null hypothesis, these should be estimates of the
same distribution, because the two populations have the same distribution, i.e.
with the same mean and same variance. Why
do we need two separate estimates from two samples rather than just one
estimate from one sample? Because estimates based on limited information involve
a lot of random error. Any sample we use may not be representative of the
population. We need to use as much information as possible to reduce random
error and arrive at an estimate as precise as possible. We pool the two
estimates to get an average. However, this is not a simple average (adding them
up and dividing by two), where each estimate contributes half of the value. The
pooled estimate is a weighted average, with each estimate weighted by its share
of the total degrees of freedom. If the two samples differ
in size, the larger sample contains more information useful for the estimate,
and should be given a larger weight in the pooled estimate. If the two samples
have the same size, then they contribute equally to the pooled
estimate.
S_pooled^2 = (df1/df_total) * S1^2 + (df2/df_total) * S2^2
In this formula, df_total
is the total degrees of freedom between the estimated variances from the two
samples. So df_total = df1 + df2 = 5 + 5 = 10
df1/df_total and df2/df_total are
called weights for the estimated variance from each sample. We need to weight
them because we want the estimate from the sample that has more subjects, i.e.
larger df, to contribute more to the pooled estimate of variance.
Going back to our study, S_pooled^2 = (df1/df_total) * S1^2 + (df2/df_total) * S2^2
= 5/10 * 0.57 + 5/10 * 1.1 = 0.84
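The weighting can be written as a small Python function. This is a sketch: the name `pooled_variance` is hypothetical, and the second call uses a made-up df of 15 for the second sample purely to show how unequal sample sizes shift the weights.

```python
def pooled_variance(s2_1, df1, s2_2, df2):
    """Weighted average of two variance estimates, weighted by degrees of freedom."""
    df_total = df1 + df2
    return (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2

# The study above: equal sample sizes, so each estimate gets a weight of 5/10.
equal = pooled_variance(0.57, 5, 1.1, 5)

# Hypothetical case: if the second sample had 16 subjects (df2 = 15),
# its estimate would get a weight of 15/20 and pull the pooled value toward 1.1.
unequal = pooled_variance(0.57, 5, 1.1, 15)

print(f"equal sizes:   {equal:.4f}")
print(f"unequal sizes: {unequal:.4f}")
```

With equal sample sizes the pooled estimate is just the midpoint of the two estimates; with unequal sizes it moves toward the estimate from the larger sample.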
But this pooled estimate of variance is the variance of a distribution
of single scores, which does not match our data, which consist of two sample
means (more precisely, the difference between a pair of sample means). Fortunately
we already know how to get the mean and variance of a distribution of means for a
given sample size.
Remember that according to H0, the two populations have the same
distribution. We have estimated the variance of this distribution, S_pooled^2.
Now we need to estimate the variance of the distribution of means. This estimate
depends upon the sample size. We have to make two estimates to allow
for the possibility that the two samples have different sizes. In this case, we
have the same sample size for both samples.
Therefore:
S_M1^2 = S_pooled^2/N1 = 0.84/6 = 0.14
S_M2^2 = S_pooled^2/N2 = 0.84/6 = 0.14
Now we are still one step away from the estimated variance of the
comparison distribution, which is a distribution of differences between two
sample means.
S_difference^2 = S_M1^2 + S_M2^2 = 0.14 + 0.14 = 0.28
S_difference = √0.28 = 0.53
Step 3: determine the cut off points.
One-tailed, p < 0.01
Use the total df for determining the cutoff point. Find t needed for
rejecting H0 from the t
table.
Step 4: calculate the t score
t = (M1 - M2)/S_difference = (9.17 - 2.5)/0.53 = 12.58
Step 5: Reject Null Hypothesis?
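The full procedure for independent means can be sketched in Python as follows. The function name `independent_t` is hypothetical, and the cutoff of 2.764 is an assumption taken from a standard t table (one-tailed, p < .01, df = 10). Because the sketch carries full precision through every step, it gives t of about 12.65 rather than the hand-calculated 12.58, which used rounded intermediate values.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Return (t, df_total) for a t test for independent means with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    # Pooled estimate of the population variance, weighted by degrees of freedom.
    s2_pooled = ((df1 / df_total) * statistics.variance(sample1)
                 + (df2 / df_total) * statistics.variance(sample2))
    # Variance of the distribution of differences between means.
    s2_difference = s2_pooled / n1 + s2_pooled / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(s2_difference)
    return t, df_total

first_timers = [9, 10, 9, 8, 10, 9]
experienced = [2, 3, 2, 3, 4, 1]
t, df = independent_t(first_timers, experienced)

cutoff = 2.764            # assumed t-table value: one-tailed, p < .01, df = 10
reject_h0 = t >= cutoff

print(f"t({df}) = {t:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the t score is far beyond the cutoff, so the sketch reports that the null hypothesis is rejected.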
Example 2:
We have encountered this research situation before. A researcher
believes that single mothers spend more time with their children than married
mothers. He recruited 7 single mothers and measured how many hours they spent
with their children over a two-day period of time. He recruited 7 married
mothers and measured how many hours each of them spent with their children
during the same period of time. The data are listed below. Carry out the
hypothesis testing steps and state the results.
The calculations:
Single mothers:                          Married mothers:
hrs (X1)   (X1 - M1)   (X1 - M1)^2       hrs (X2)   (X2 - M2)   (X2 - M2)^2
6           0           0                6           3           9
4          -2           4                1          -2           4
9           3           9                5           2           4
7           1           1                3           0           0
7           1           1                1          -2           4
3          -3           9                1          -2           4
6           0           0                4           1           1
________________________________________________________________
SS1 = 24
SS2 = 26
M1 = 6
S1^2 = 24/6 = 4
N1 = 7, df1 = 7 - 1 = 6
M2 = 3
S2^2 = 26/6 = 4.33
N2 = 7, df2 = 7 - 1 = 6
df_total = df1 + df2 = 6 + 6 = 12
S_pooled^2 = (df1/df_total) * S1^2 + (df2/df_total) * S2^2 = 6/12 * 4 + 6/12 * 4.33 = 4.17
We will figure out the rest in class.
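For readers working ahead, one way the remaining steps could go is sketched below in Python, following the same procedure as Example 1. No significance level is stated for this example, so the sketch stops at the t score and degrees of freedom. The helper name `mean_and_ss` is hypothetical.

```python
import math

single = [6, 4, 9, 7, 7, 3, 6]    # hours, single mothers
married = [6, 1, 5, 3, 1, 1, 4]   # hours, married mothers

def mean_and_ss(scores):
    """Return the mean and the sum of squared deviations of a list of scores."""
    m = sum(scores) / len(scores)
    return m, sum((x - m) ** 2 for x in scores)

m1, ss1 = mean_and_ss(single)
m2, ss2 = mean_and_ss(married)
n1, n2 = len(single), len(married)
df_total = (n1 - 1) + (n2 - 1)

# (SS1 + SS2)/df_total is algebraically the same as the df-weighted
# average of S1^2 and S2^2 used above.
s2_pooled = (ss1 + ss2) / df_total
s_difference = math.sqrt(s2_pooled / n1 + s2_pooled / n2)
t = (m1 - m2) / s_difference

print(f"S_pooled^2 = {s2_pooled:.2f}, S_difference = {s_difference:.2f}")
print(f"t({df_total}) = {t:.2f}")
```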