hypothesis test

 for example, claiming that a new drug is better than the current drug for treatment of the same symptoms.

In each problem considered, the question of interest is simplified into two competing claims / hypotheses between which we have a choice; the null hypothesis, denoted H0, against the alternative hypothesis, denoted H1.

These two competing claims / hypotheses are not however treated on an equal basis: special consideration is given to the null hypothesis.

We have two common situations:

  1. The experiment has been carried out in an attempt to disprove or reject a particular hypothesis, the null hypothesis, thus we give that one priority so it cannot be rejected unless the evidence against it is sufficiently strong. For example,
    H0: there is no difference in taste between coke and diet coke
    against
    H1: there is a difference.
  1. If one of the two hypotheses is ‘simpler’ we give it priority so that a more ‘complicated’ theory is not adopted unless there is sufficient evidence against the simpler one. For example, it is ‘simpler’ to claim that there is no difference in flavour between coke and diet coke than it is to say that there is a difference.

The hypotheses are often statements about population parameters like expected value and variance; for example H0 might be that the expected value of the height of ten year old boys in the Scottish population is not different from that of ten year old girls. A hypothesis might also be a statement about the distributional form of a characteristic of interest, for example that the height of ten year old boys is normally distributed within the Scottish population.

The outcome of a hypothesis test test is “Reject H0 in favour of H1” or “Do not reject H0”.

Null Hypothesis

We give special consideration to the null hypothesis. This is due to the fact that the null hypothesis relates to the statement being tested, whereas the alternative hypothesis relates to the statement to be accepted if / when the null is rejected.

The final conclusion once the test has been carried out is always given in terms of the null hypothesis. We either “Reject H0 in favour of H1” or “Do not reject H0”; we neverconclude “Reject H1”, or even “Accept H1”.

If we conclude “Do not reject H0”, this does not necessarily mean that the null hypothesis is true, it only suggests that there is not sufficient evidence against H0 in favour of H1. Rejecting the null hypothesis then, suggests that the alternative hypothesis may be true.

Type I Error

In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it is in fact true; that is, H0 is wrongly rejected.

Type II Error

In a hypothesis test, a type II error occurs when the null hypothesis H0, is not rejected when it is in fact false. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug; i.e.H0: there is no difference between the two drugs on average.

 Critical Value(s)

The critical value(s) for a hypothesis test is a threshold to which the value of the test statistic in a sample is compared to determine whether or not the null hypothesis is rejected.

The critical value for any hypothesis test depends on the significance level at which the test is carried out, and whether the test is one-sided or two-sided.

Critical Region

The critical region CR, or rejection region RR, is a set of values of the test statistic for which the null hypothesis is rejected in a hypothesis test. That is, the sample space for the test statistic is partitioned into two regions; one region (the critical region) will lead us to reject the null hypothesis H0, the other will not. So, if the observed value of the test statistic is a member of the critical region, we conclude “Reject H0; if it is not a member of the critical region then we conclude “Do not reject H0“.

Significance Level

The significance level of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0, if it is in fact true.

It is the probability of a type I error and is set by the investigator in relation to the consequences of such an error. That is, we want to make the significance level as small as possible in order to protect the null hypothesis and to prevent, as far as possible, the investigator from inadvertently making false claims.

The significance level is usually denoted by alpha
Significance Level = P(type I error) = alpha

 Usually, the significance level is chosen to be 0.05 (or equivalently, 5%).

 P-Value

The probability value (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis H0, is true.

It is the probability of wrongly rejecting the null hypothesis if it is in fact true.

The p-value is compared with the actual significance level of our test and, if it is smaller, the result is significant. That is, if the null hypothesis were to be rejected at the 5% signficance level, this would be reported as “p < 0.05”.

Small p-values suggest that the null hypothesis is unlikely to be true. The smaller it is, the more convincing is the rejection of the null hypothesis. It indicates the strength of evidence for say, rejecting the null hypothesis H0, rather than simply concluding “Reject H0‘ or “Do not reject H0“.

One-sided Test

A one-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0 are located entirely in one tail of the probability distribution.

In other words, the critical region for a one-sided test is the set of values less than the critical value of the test, or the set of values greater than the critical value of the test.

A one-sided test is also referred to as a one-tailed test of significance.

The choice between a one-sided and a two-sided test is determined by the purpose of the investigation or prior reasons for using a one-sided test.

Example

Suppose we wanted to test a manufacturers claim that there are, on average, 50 matches in a box. We could set up the following hypotheses
H0: µ = 50,
against
H1: µ < 50 or H1: µ > 50
Either of these two alternative hypotheses would lead to a one-sided test. Presumably, we would want to test the null hypothesis against the first alternative hypothesis since it would be useful to know if there is likely to be less than 50 matches, on average, in a box (no one would complain if they get the correct number of matches in a box or more).
Yet another alternative hypothesis could be tested against the same null, leading this time to a two-sided test:
H0: µ = 50,
against
H1: µ not equal to 50
Here, nothing specific can be said about the average number of matches in a box; only that, if we could reject the null hypothesis in our test, we would know that the average number of matches in a box is likely to be less than or greater than 50.
Two-Sided Test

A two-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0 are located in both tails of the probability distribution.

In other words, the critical region for a two-sided test is the set of values less than a first critical value of the test and the set of values greater than a second critical value of the test.

A two-sided test is also referred to as a two-tailed test of significance.

The choice between a one-sided test and a two-sided test is determined by the purpose of the investigation or prior reasons for using a one-sided test.

Example

Suppose we wanted to test a manufacturers claim that there are, on average, 50 matches in a box. We could set up the following hypotheses
H0: µ = 50,
against
H1: µ < 50 or H1: µ > 50
Either of these two alternative hypotheses would lead to a one-sided test. Presumably, we would want to test the null hypothesis against the first alternative hypothesis since it would be useful to know if there is likely to be less than 50 matches, on average, in a box (no one would complain if they get the correct number of matches in a box or more).
Yet another alternative hypothesis could be tested against the same null, leading this time to a two-sided test:
H0: µ = 50,
against
H1: µ not equal to 50
Here, nothing specific can be said about the average number of matches in a box; only that, if we could reject the null hypothesis in our test, we would know that the average number of matches in a box is likely to be less than or greater than 50.

via http://www.stats.gla.ac.uk/steps/glossary/hypothesis_testing.html#h0

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

生活在西班牙

自己动手丰衣足食

BlueAsteroid

Just another WordPress.com site

Jing's Blog

Just another WordPress.com site

Start from here......

我的心情魔方

天才遠私廚

希望能做一個分享各種資訊的好地方

语义噪声

西瓜大丸子汤的博客

笑对人生,傲立寰宇

Just another WordPress.com site

Where On Earth Is Waldo?

A Project By Melanie Coles

the Serious Computer Vision Blog

A blog about computer vision and serious stuff

Cauthy's Blog

paper review...

Cornell Computer Vision Seminar Blog

Blog for CS 7670 - Special Topics in Computer Vision

datarazzi

Life through nerd-colored glasses

Luciana Haill

Brainwaves Augmenting Consciousness

槑烎

1,2,∞

Dr Paul Tennent

and the university of nottingham

turn off the lights, please

A bunch of random, thinned and stateless thoughts around the Web

%d bloggers like this: