INTRODUCTION
A hypothesis is a preliminary or tentative explanation, put forward by the
researcher, of what the outcome of an investigation is expected to be. It is
an informed, educated guess. It indicates the expectations of the researcher
regarding certain variables. It is the most specific way in which an answer
to a problem can be stated.
MEANING
Hypothesis means a mere assumption or some supposition or a
possibility to be proved or disproved.
1. A tentative explanation for an observation, phenomenon, or scientific problem that can be tested by further investigation.
2. Something taken to be true for the purpose of argument or investigation; an assumption.
3. A statement that explains or makes generalizations about a set of facts or principles, usually forming a basis for possible experiments to confirm its viability.
DEFINITION
“A hypothesis is a tentative generalization, the validity of which remains to be tested.”
- George A. Lundberg
WHEN IS A HYPOTHESIS FORMULATED
A hypothesis is formulated after the problem has been stated and the
literature study has been concluded. It is formulated when the researcher
is fully aware of the theoretical and empirical background to the problem.
THE PURPOSE AND FUNCTION OF A HYPOTHESIS
- It gives direction to an investigation.
- It structures the next phase in the investigation and therefore furnishes continuity to the examination of the problem.
CHARACTERISTICS OF A HYPOTHESIS
- It must be verifiable.
- It must be formulated in simple, understandable terms.
- It should be clear and precise.
- It should be capable of being tested.
- A relational hypothesis should state the relationship between variables.
- It should be specific and limited in scope.
- It should be consistent with most known facts.
- It should be amenable to testing within a reasonable time.
- An important requirement for hypotheses is TESTABILITY.
- A condition for testability is CLEAR and UNAMBIGUOUS CONCEPTS.
OTHER CHARACTERISTICS
- A good hypothesis is based on sound reasoning.
- Your hypothesis should be based on previous research.
- The hypothesis should follow the most likely outcome, not the exceptional outcome.
- A good hypothesis provides a reasonable explanation for the predicted outcome.
- Do not look for unrealistic explanations.
- A good hypothesis clearly states the relationship between the defined variables.
- A clear, simply written hypothesis is easier to test.
- Do not be vague.
- A good hypothesis defines the variables in easy-to-measure terms.
- Who are the participants?
- What is different or will be different in your test?
- What is the effect?
- A good hypothesis is testable in a reasonable amount of time.
- Do not plan a test that will take longer than your class project.
TYPES
DESCRIPTIVE HYPOTHESIS:
Descriptive hypotheses are propositions that describe the existence, size,
form or distribution of some variable.
RELATIONAL HYPOTHESIS:
It describes the relationship between two variables.
WORKING HYPOTHESIS:
The working hypothesis indicates the nature of the data and the methods of
analysis required for the study. Working hypotheses are subject to
modification as the investigation proceeds.
NULL HYPOTHESIS:
When a hypothesis is stated negatively, it is called a null hypothesis. A null
hypothesis should always be specific. The null hypothesis is the one that the
researcher wishes to disprove.
ALTERNATIVE HYPOTHESIS:
The set of alternatives to the null hypothesis is referred to as the
alternative hypothesis. The alternative hypothesis is usually the one that the
researcher wishes to prove.
STATISTICAL HYPOTHESIS:
It is a quantitative statement about a population. When the researcher derives
a hypothesis from a sample and expects it to hold for the entire population, it
is known as a statistical hypothesis.
SIMPLE HYPOTHESIS:
It states the existence of certain empirical uniformities. Many empirical
uniformities are common in sociological research.
COMPOSITE HYPOTHESIS:
These hypotheses aim at testing the existence of logically derived
relationships between the empirical uniformities obtained.
EXPLANATORY HYPOTHESIS:
It states that the existence of one independent variable causes or leads to an
effect on a dependent variable.
PROCEDURE OF TESTING A HYPOTHESIS:
Making a formal statement:
Construct a formal statement of the null hypothesis and also of the
alternative hypothesis.
(E.g.) Null hypothesis H0
Alternative hypothesis Ha
Selecting a statistical technique:
There are many important parametric tests that are frequently used in
hypothesis testing. They include the Z-test, t-test, χ²-test, and F-test. The
researcher has to select the test appropriate to the research.
Selecting the significance level:
Hypotheses are tested at a pre-determined level of significance. In practice,
either the 5% level or the 1% level of significance is adopted for accepting
or rejecting a hypothesis.
Choosing the two-tailed and one-tailed tests:
The alternative hypothesis indicates whether we should use a one-tailed test
or a two-tailed test. If the alternative hypothesis is of the type “greater
than” or of the type “less than”, we use a one-tailed test. On the other hand,
if the alternative hypothesis is of the type “not equal to”, we use a
two-tailed test.
Compute the appropriate statistics from the sample data:
A random sample has to be selected as per the sample design decided. From the
collected data, the appropriate statistic or measure is then computed with
reference to the research question, the type of hypothesis to be tested and
the level of measurement of the data.
Compute the significance test value:
After the sample statistic is calculated, the formula for the selected
significance test is used to obtain the calculated test value.
Obtain the critical test value:
We must locate the critical value in the table concerned with the selected
probability distribution for the given level of significance and the
appropriate number of degrees of freedom. The critical value so located in the
table is commonly known as the table value.
Deriving the inference:
The calculated value is then compared with the predetermined critical value.
If the calculated value exceeds the critical value at the 5% level, the
difference is considered significant. On the other hand, if the calculated
value is less than the critical value at the 5% level, the difference is
considered not significant.
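The comparison of a calculated value with a critical value can be sketched in a few lines of Python. This is only an illustration for a Z-test, where the standard normal distribution stands in for the printed table; the function names `z_critical` and `is_significant` and the calculated value 2.3 are made up for the example.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_tailed: bool = True) -> float:
    """Critical z value (the 'table value') for a given significance level."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def is_significant(z_calculated: float, alpha: float = 0.05,
                   two_tailed: bool = True) -> bool:
    """True when the calculated value exceeds the critical value."""
    # Assumption: for a one-tailed test, z_calculated is oriented so that
    # large positive values favour the alternative hypothesis.
    value = abs(z_calculated) if two_tailed else z_calculated
    return value > z_critical(alpha, two_tailed)

print(round(z_critical(0.05), 2))   # critical value at the 5% level, two-tailed
print(is_significant(2.3))          # hypothetical calculated value
```

For a t-test or χ²-test the same comparison applies, but the critical value would come from the corresponding distribution and its degrees of freedom.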
Hypothesis Tests
Statisticians follow
a formal process to determine whether to reject a null hypothesis, based on
sample data. This process, called hypothesis testing, consists
of four steps.
- State the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false.
- Formulate an analysis plan. The analysis plan describes how to use sample data to evaluate the null hypothesis. The evaluation often focuses around a single test statistic.
- Analyze sample data. Find the value of the test statistic (mean score, proportion, t-score, z-score, etc.) described in the analysis plan.
- Interpret results. Apply the decision rule described in the analysis plan. If the value of the test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.
Decision Errors
Two types of errors
can result from a hypothesis test.
- Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.
- Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called Beta, and is often denoted by β. The probability of not committing a Type II error is called the Power of the test.
| Decision | Ho true | Ho false |
| Reject Ho | Type I error (α) | OK |
| Accept Ho | OK | Type II error (β) |
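To make α, β and power concrete, here is a small sketch for a right-tailed Z-test with a known population standard deviation. The numbers (null mean 100, true mean 105, standard deviation 15, sample size 36) and the function name `power_one_sided_z` are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))    # true effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)      # P(fail to reject H0 | H0 is false)
    return 1 - beta

alpha = 0.05                                   # probability of a Type I error
power = power_one_sided_z(mu0=100, mu1=105, sigma=15, n=36, alpha=alpha)
beta = 1 - power                               # probability of a Type II error
print(f"alpha = {alpha}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true mean equals the null mean, the "power" returned is simply α, which is the chance of rejecting a true null hypothesis.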
Decision Rules
The analysis plan
includes decision rules for rejecting the null hypothesis. In practice,
statisticians describe these decision rules in two ways - with reference to a
P-value or with reference to a region of acceptance.
- P-value. The strength of evidence in support of a null hypothesis is measured by the P-value. Suppose the test statistic is equal to S. The P-value is the probability of observing a test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than the significance level, we reject the null hypothesis.
- Region of acceptance. The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type I error is equal to the significance level.
The set of values outside the region of acceptance is called the region of
rejection. If the test statistic falls within the region of rejection,
the null hypothesis is rejected. In such cases, we say that the hypothesis has
been rejected at the α level of significance.
These approaches are
equivalent. Some statistics texts use the P-value approach; others use the
region of acceptance approach. In subsequent lessons, this tutorial will
present examples that illustrate each approach.
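The equivalence of the two decision rules can be checked with a short sketch. It assumes a two-tailed Z-test, so the test statistic follows the standard normal distribution under the null hypothesis; the function names and the list of z values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def reject_by_p_value(z, alpha):
    """Decision rule 1: reject H0 when the two-tailed P-value is below alpha."""
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return p_value < alpha

def reject_by_region(z, alpha):
    """Decision rule 2: reject H0 when z falls outside the region of acceptance."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    lower, upper = -z_crit, z_crit             # region of acceptance
    return not (lower <= z <= upper)

# Both rules give the same decision for every test statistic tried here.
for z in (-2.5, -1.0, 0.3, 1.9, 2.1):
    same = reject_by_p_value(z, 0.05) == reject_by_region(z, 0.05)
    print(z, reject_by_p_value(z, 0.05), same)
```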
One-Tailed and Two-Tailed Tests
A test of a
statistical hypothesis, where the region of rejection is on only one side of
the sampling
distribution, is called a one-tailed test. For example,
suppose the null hypothesis states that the mean is less than or equal to 10.
The alternative hypothesis would be that the mean is greater than 10. The
region of rejection would consist of a range of numbers located on the right
side of sampling distribution; that is, a set of numbers greater than 10.
A test of a
statistical hypothesis, where the region of rejection is on both sides of the
sampling distribution, is called a two-tailed test. For
example, suppose the null hypothesis states that the mean is equal to 10. The
alternative hypothesis would be that the mean is less than 10 or greater than
10. The region of rejection would consist of a range of numbers located on both
sides of sampling distribution; that is, the region of rejection would consist
partly of numbers that were less than 10 and partly of numbers that were
greater than 10.
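As a rough sketch of where the region of rejection starts, the cutoffs can be computed on the scale of the sample mean. This assumes a normal sampling distribution and a standard error of 0.5, which is a made-up figure; `rejection_region` is a hypothetical helper, not part of any library.

```python
from statistics import NormalDist

def rejection_region(mu0, se, alpha, tail):
    """Return (lower_cutoff, upper_cutoff); None means no cutoff on that side."""
    z = NormalDist()
    if tail == "right":                        # Ha: mean > mu0
        return None, mu0 + z.inv_cdf(1 - alpha) * se
    if tail == "left":                         # Ha: mean < mu0
        return mu0 - z.inv_cdf(1 - alpha) * se, None
    if tail == "two":                          # Ha: mean != mu0
        half_width = z.inv_cdf(1 - alpha / 2) * se
        return mu0 - half_width, mu0 + half_width
    raise ValueError("tail must be 'left', 'right' or 'two'")

# Null mean of 10 as in the examples above; se = 0.5 is an assumed value.
print(rejection_region(10, 0.5, 0.05, "right"))  # reject for means above the upper cutoff
print(rejection_region(10, 0.5, 0.05, "two"))    # reject below the lower or above the upper cutoff
```

The one-tailed cutoff sits closer to 10 than the two-tailed one because all of α is placed in a single tail.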
A General Procedure for Conducting Hypothesis Tests
All hypothesis tests
are conducted the same way. The researcher states a hypothesis to be tested,
formulates an analysis plan, analyzes sample data according to the plan, and
accepts or rejects the null hypothesis, based on results of the analysis.
- State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
- Formulate an analysis plan. The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.
- Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
- Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the test statistic probability is less than the significance level, the null hypothesis is rejected.
- Analyze sample data. Using sample data, perform the computations called for in the analysis plan.
- Test statistic. When the null hypothesis involves a mean or proportion, use either of the following equations to compute the test statistic.
Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
Test statistic = (Statistic - Parameter) / (Standard error of statistic)
where Parameter is the
value appearing in the null hypothesis, and Statistic is the point
estimate of Parameter.
As part of the analysis, you may need to compute the standard deviation or
standard error of the statistic. Previously, we presented common formulas for the
standard deviation and standard error.
When the parameter in the null hypothesis involves categorical data, you may
use a chi-square statistic as the test statistic. Instructions for computing a
chi-square test statistic are presented in the lesson on the chi-square
goodness of fit test.
- P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic, assuming the null hypothesis is true.
- Interpret the results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
Parametric tests:
Parametric methods were developed on the assumption that the underlying
distribution has a known form, such as normal, exponential and the like.
Important parametric tests used for testing significance are the t-test,
F-test, z-test, etc. With these tests, significance is assessed and the
conclusion is drawn on the basis of the nature and extent of the difference
between the values being compared.
Non-parametric tests:
Non-parametric methods are distribution-free methods, which make no assumption
about the underlying distribution. Hence, they can be used regardless of the
shape of the underlying distribution. They are suitable for small samples and
can be applied even to nominal and ordinal scaled data.
Important non-parametric tests used for testing significance are the median
test, Wilcoxon matched-pairs test, chi-square test, Mann-Whitney U test,
Kruskal-Wallis test, etc.
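As one illustration of a distribution-free statistic, the Mann-Whitney U value can be obtained by simply counting pairs. The two groups of scores below are invented, and the sketch stops at the statistic itself: judging its significance would still need a U table or a normal approximation, which is not shown.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by direct pair counting; a tie counts as one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always add up to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

group_a = [12, 15, 11, 18, 14]   # hypothetical ordinal scores
group_b = [10, 13, 9, 16, 8]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b, min(u_a, u_b))   # the smaller U is the one usually looked up
```

Only the ordering of the observations is used, which is why no assumption about the shape of the distribution is needed.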
Hypothesis Test of the Mean
This lesson explains
how to conduct a hypothesis test of a mean, when the following conditions are
met:
- The sampling method is simple random sampling.
- The sample is drawn from a normal or near-normal population.
Generally, the
sampling distribution will be approximately normally distributed if any of the
following conditions apply.
- The population distribution is normal.
- The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
- The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
- The sample size is greater than 40, without outliers.
This approach
consists of four steps: (1) state the hypotheses, (2) formulate an analysis
plan, (3) analyze sample data, and (4) interpret results.
State the Hypotheses
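Every hypothesis test requires the analyst to state a null hypothesis and an
alternative hypothesis. The hypotheses are stated in such a way that they are
mutually exclusive. That is, if one is true, the other must be false; and vice
versa.
The table below shows three sets of hypotheses, where M is the hypothesized
value of the population mean μ.
| Set | Null hypothesis | Alternative hypothesis | Number of tails |
| 1 | μ = M | μ ≠ M | 2 |
| 2 | μ >= M | μ < M | 1 |
| 3 | μ <= M | μ > M | 1 |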
The
first set of hypotheses (Set 1) is an example of a two-tailed
test, since an extreme value on either side of the sampling
distribution would cause a researcher to reject the null hypothesis. The
other two sets of hypotheses (Sets 2 and 3) are one-tailed
tests, since an extreme value on only one side of the sampling distribution
would cause a researcher to reject the null hypothesis.
Formulate an Analysis Plan
The analysis plan describes how to use
sample data to accept or reject the null hypothesis. It should specify the
following elements.
- Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
- Test method. Use the one-sample t-test to determine whether the hypothesized mean differs significantly from the observed sample mean.
Analyze Sample Data
Using sample data, conduct a one-sample
t-test. This involves finding the standard error, degrees of freedom, test
statistic, and the P-value associated with the test statistic.
- Standard error. Compute the standard error (SE) of the sampling distribution.
SE = s * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }
where s is the standard deviation of the sample, N is the population size, and
n is the sample size. When the population size is much larger (at least 10
times larger) than the sample size, the standard error can be approximated by:
SE = s / sqrt( n )
- Degrees of freedom. The degrees of freedom (DF) is equal to the sample size (n) minus one.
Thus, DF = n - 1.
- Test statistic. The test statistic is a t-score (t) defined by the following equation.
t = (x - μ) / SE
where x is the sample mean, μ is the hypothesized population mean in the null
hypothesis, and SE is the standard error.
- P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, given the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)
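The standard error, degrees of freedom and t-score formulas above translate directly into code. The sketch below uses illustrative values (s = 10, n = 20, N = 300, sample mean 108, hypothesized mean 110); the function names are hypothetical.

```python
from math import sqrt

def standard_error(s, n, N=None):
    """SE of the sample mean; pass N to apply the finite population correction."""
    if N is None:
        return s / sqrt(n)                     # approximation for large populations
    return s * sqrt((1 / n) * (1 - n / N) * (N / (N - 1)))

def t_score(sample_mean, hypothesized_mean, se):
    """t = (x - mu) / SE."""
    return (sample_mean - hypothesized_mean) / se

s, n, N = 10, 20, 300
se_exact = standard_error(s, n, N)             # with the population size
se_approx = standard_error(s, n)               # N is at least 10 times n here
df = n - 1                                     # degrees of freedom
t = t_score(108, 110, se_approx)
print(f"SE exact = {se_exact:.3f}, SE approx = {se_approx:.3f}, DF = {df}, t = {t:.3f}")
```

The corrected standard error is slightly smaller than the approximation, and the gap shrinks as the population grows relative to the sample.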
Interpret Results
If the sample findings are unlikely, given the null
hypothesis, the researcher rejects the null hypothesis. Typically, this
involves comparing the P-value to the significance
level, and rejecting the null hypothesis when the P-value is less than the
significance level.
Problem 1: Two-Tailed Test
An inventor has developed a new,
energy-efficient lawn mower engine. He claims that the engine will run
continuously for 5 hours (300 minutes) on a single gallon of regular gasoline.
Suppose a simple random sample of 50 engines is tested. The engines run for an
average of 295 minutes, with a standard deviation of 20 minutes. Test the null
hypothesis that the mean run time is 300 minutes against the alternative
hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of
significance. (Assume that run times for the population of engines are normally
distributed.)
Solution: The
solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:
- State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: μ = 300
Alternative hypothesis: μ ≠ 300
Note that these hypotheses constitute a two-tailed test. The null hypothesis
will be rejected if the sample mean is too big or if it is too small.
- Formulate an analysis plan. For this analysis, the significance level is 0.05. The test method is a one-sample t-test.
- Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).
SE = s / sqrt(n) = 20 / sqrt(50) = 20/7.07 = 2.83
DF = n - 1 = 50 - 1 = 49
t = (x - μ) / SE = (295 - 300)/2.83 = -1.77
where s is the standard deviation of the sample, x is the sample mean, μ is
the hypothesized population mean, and n is the sample size.
Since we have a two-tailed test, the P-value is the probability that the
t-score having 49 degrees of freedom is less than -1.77 or greater than 1.77.
We use the t Distribution Calculator to find P(t < -1.77) = 0.04, and
P(t > 1.77) = 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.
- Interpret results. Since the P-value (0.08) is greater than the significance level (0.05), we cannot reject the null hypothesis.
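For readers without a t distribution calculator at hand, the lawn mower example can be reproduced with the standard library alone. In this sketch the t distribution's cumulative probability is obtained by numerically integrating its density with Simpson's rule; the helpers `t_pdf` and `t_cdf` are written for this example and are an approximation, not a library routine.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|]; steps must be even."""
    upper = abs(x)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                       # area under the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Figures from the problem statement.
n, sample_mean, s, mu0, alpha = 50, 295, 20, 300, 0.05
se = s / sqrt(n)
df = n - 1
t = (sample_mean - mu0) / se
p_value = 2 * t_cdf(-abs(t), df)               # two-tailed: both tails count
decision = "reject H0" if p_value < alpha else "cannot reject H0"
print(f"SE = {se:.2f}, DF = {df}, t = {t:.2f}, P-value = {p_value:.2f}: {decision}")
```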
Problem 2: One-Tailed Test
Bon Air Elementary School has 300 students. The principal of the school thinks that the average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard deviation of 10. Based on these results, should the principal accept or reject her original hypothesis? Assume a significance level of 0.01.
Solution: The
solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:
- State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: μ >= 110
Alternative hypothesis: μ < 110
Note that these hypotheses constitute a one-tailed test. The null hypothesis
will be rejected if the sample mean is too small.
- Formulate an analysis plan. For this analysis, the significance level is 0.01. The test method is a one-sample t-test.
- Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).
SE = s / sqrt(n) = 10 / sqrt(20) = 10/4.472 = 2.236
DF = n - 1 = 20 - 1 = 19
t = (x - μ) / SE = (108 - 110)/2.236 = -0.894
where s is the standard deviation of the sample, x is the sample mean, μ is
the hypothesized population mean, and n is the sample size.
Since we have a one-tailed test, the P-value is the probability that the
t-score having 19 degrees of freedom is less than -0.894.
We use the t Distribution Calculator to find P(t < -0.894) = 0.19. Thus, the
P-value is 0.19.
- Interpret results. Since the P-value (0.19) is greater than the significance level (0.01), we cannot reject the null hypothesis.
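The same conclusion can be reached with the region of rejection instead of the P-value. The sketch below finds the left-tail critical t value for 19 degrees of freedom at the 0.01 level by bisection on a numerically integrated t distribution; `t_pdf`, `t_cdf` and `t_critical_left` are helpers written for this illustration and only approximate what a printed t table gives.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|]; steps must be even."""
    upper = abs(x)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                       # area under the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical_left(alpha, df):
    """Left-tail critical value c with P(T < c) = alpha, found by bisection."""
    lo, hi = -20.0, 0.0                        # assumes alpha is well below 0.5
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < alpha:
            lo = mid                           # mid is too far into the tail
        else:
            hi = mid
    return (lo + hi) / 2

# Figures from the problem statement.
n, sample_mean, s, mu0, alpha = 20, 108, 10, 110, 0.01
se = s / sqrt(n)
t = (sample_mean - mu0) / se
t_crit = t_critical_left(alpha, n - 1)
in_rejection_region = t < t_crit               # one-tailed test, left side only
print(f"t = {t:.3f}, critical value = {t_crit:.3f}, reject H0: {in_rejection_region}")
```

Because the calculated t-score lies to the right of the critical value, it falls in the region of acceptance, which agrees with the P-value comparison.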