Hypothesis testing is a core statistical method for making decisions or inferences about populations based on sample data. It’s a structured, methodical way to put our claims to the test, demanding evidence that’s strong enough to support them.
The Two Hypotheses
Null Hypothesis (H0)
This is our initial assumption. In the courtroom analogy, it’s the ‘innocent until proven guilty’ presumption; in a clinical setting, we might assume that a new drug has no effect on a disease.
Alternative Hypothesis (H1 or Ha)
This is what we want to prove. It’s the ‘guilty’ claim, like arguing that the new drug does have an effect on the disease.
In short, the null hypothesis (H0) usually represents the status quo or no effect, while the alternative hypothesis (Ha) suggests there’s a real effect or relationship.
The Significance Level: Setting the Standard
Before we start, we set a ‘significance level’, denoted as α, which represents the probability of making a Type I error (rejecting a true null hypothesis). Commonly set at 0.05, this is the threshold for how strong our evidence must be before we reject the null hypothesis.
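To see what α actually controls, here is a minimal simulation (made-up data, and a two-sample z-test standing in for whatever test you would really run): when the null hypothesis is true, a test performed at α = 0.05 wrongly rejects it about 5% of the time.

```python
import math
import random

random.seed(42)

ALPHA = 0.05     # significance level: the Type I error rate we accept
N_TRIALS = 2000  # number of simulated experiments
N = 50           # sample size per group

def z_test_p(sample_a, sample_b):
    """Two-sided p-value for a two-sample z-test (normal approximation)."""
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (len(sample_a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (len(sample_b) - 1)
    z = (mean_a - mean_b) / math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Run many experiments where H0 is TRUE: both groups come from the
# same distribution, so every rejection is, by construction, a Type I error.
false_rejections = sum(
    z_test_p([random.gauss(0, 1) for _ in range(N)],
             [random.gauss(0, 1) for _ in range(N)]) <= ALPHA
    for _ in range(N_TRIALS)
)
print(f"Type I error rate: {false_rejections / N_TRIALS:.3f}")
```

The printed rate hovers around 0.05: that is exactly the risk we agreed to tolerate when we picked α.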
Collecting Data: The Witnesses
We collect sample data to evaluate the hypotheses. In our example, this might be the results of a clinical trial with the new drug.
The Test Statistic: The Cross-Examination
We calculate a ‘test statistic’ using the sample data. It’s like a score that tells us how far our sample results are from what we would expect under the null hypothesis. The formula for the test statistic depends on the type of data and the test being used.
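As a sketch, here is one common choice, Welch’s two-sample t statistic, computed on entirely hypothetical symptom scores (lower is better) for the drug and placebo groups:

```python
import math

# Hypothetical symptom scores (lower is better); purely illustrative numbers.
drug    = [42.1, 45.3, 39.8, 41.0, 44.2, 40.5, 43.7, 38.9]
placebo = [47.2, 49.1, 46.8, 50.3, 45.9, 48.4, 51.0, 46.1]

def welch_t(a, b):
    """Welch's two-sample t statistic: the difference in sample means
    divided by its estimated standard error."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

t_stat = welch_t(drug, placebo)
print(f"t = {t_stat:.2f}")
```

A t statistic near 0 means the sample looks much like what H0 predicts; a large absolute value means it doesn’t.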
The P-value: Weighing the Evidence
The ‘p-value’ is the probability of observing our sample results (or something more extreme) if the null hypothesis is true. It’s like asking, “What are the chances of this evidence appearing in a completely innocent scenario?”
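For large samples the test statistic is approximately standard normal, so a rough sketch of the two-sided p-value looks like this (a normal approximation; for small samples you would use the t distribution instead):

```python
import math

def two_sided_p(z):
    """Two-sided p-value: the probability, under H0, of a test statistic
    at least as extreme as the one observed (standard-normal approximation)."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(round(two_sided_p(1.0), 3))   # ~0.317: weak evidence against H0
print(round(two_sided_p(1.96), 3))  # ~0.05: right at the usual threshold
print(round(two_sided_p(3.0), 4))   # ~0.0027: strong evidence against H0
```

The farther the statistic lands in the tails, the smaller the p-value and the harder it is to believe the ‘innocent scenario’.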
Making a Decision: The Verdict
We compare the p-value to our significance level (α) to make a decision:
- If p ≤ α, the evidence against the null hypothesis is strong, so we reject it. It’s like declaring the defendant guilty.
- If p > α, we don’t have enough evidence against the null hypothesis, so we fail to reject it. It’s akin to a ‘not guilty’ verdict.
- When the p-value is at or below α: you reject the null hypothesis. You are effectively saying that the evidence from your sample makes the null hypothesis look unlikely, and you accept a risk of at most α of making a Type I error.
- When the p-value exceeds α: you fail to reject the null hypothesis. You are not making a definitive statement that the null hypothesis is true; rather, you are acknowledging that your sample does not provide enough evidence to conclude that it is false.
In summary, the significance level (α) serves as a predetermined threshold for statistical significance. If the p-value is less than or equal to α, you reject the null hypothesis, indicating that you believe there is enough evidence to suggest an effect or relationship.
If the p-value is greater than α, you fail to reject the null hypothesis, indicating that you do not have sufficient evidence to conclude an effect or relationship exists. It’s important to choose an appropriate α level based on the context and the acceptable level of Type I error for your analysis.
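The decision rule itself is tiny once α is fixed; a sketch:

```python
ALPHA = 0.05  # chosen before looking at the data

def verdict(p_value, alpha=ALPHA):
    """Reject H0 when p <= alpha; otherwise fail to reject (never 'accept') H0."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(verdict(0.003))  # strong evidence -> reject H0 ('guilty')
print(verdict(0.27))   # weak evidence -> fail to reject H0 ('not guilty')
```

Note the asymmetry in the wording: the function never returns “accept H0”, because the test can’t prove the null hypothesis true.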
Remember that ‘failing to reject’ the null hypothesis is not the same as proving it true. Also, a small p-value doesn’t tell us how big or important an effect is, just that it’s statistically significant.
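To see why significance is not the same as importance, consider the very same small effect (a 0.1-standard-deviation difference in means; the numbers are made up) tested at two sample sizes with a two-sample z-test:

```python
import math

def z_and_p(mean_diff, sd, n):
    """Two-sample z statistic and two-sided p-value for a given mean
    difference, common standard deviation, and per-group sample size."""
    z = mean_diff / (sd * math.sqrt(2 / n))
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# The effect size never changes; only the sample size does.
for n in (50, 10_000):
    z, p = z_and_p(mean_diff=0.1, sd=1.0, n=n)
    print(f"n = {n:>6}: z = {z:.2f}, p = {p:.4f}")
```

With n = 50 the effect is nowhere near significant; with n = 10,000 the identical effect is overwhelmingly so. The p-value measures evidence against H0, not the size or practical importance of the effect.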
Mark Twain popularized the quip that there are three kinds of lies: lies, damned lies, and statistics 🙂
Happy Testing!