End to End Guide to Hypothesis Testing in Statistics: Concepts, Methods, and Examples

Some experiments, such as testing the efficacy of a drug or checking whether a new app feature affected downloads, require a statistical approach to data analysis. In statistics, the process of testing assumptions about a population parameter is known as hypothesis testing. In this article we will go through the theory and implementation of various hypothesis testing techniques using Python.

Theory

In simple terms, hypothesis testing is a way of testing whether a claim or assumption about a population parameter is likely to be true, based on sample evidence.

What is a population?

A population refers to all possible subjects or data points that fit the criteria of a given experiment. For example, if we are testing whether a drug is effective in treating the common cold, then our population is all the people suffering from a cold.

A parameter is a numerical summary measure that describes the entire population. Examples of parameters include the mean and the median.
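To make the distinction concrete, here is a minimal sketch that contrasts a parameter (computed over a whole population) with the same measure computed from a sample. The population is simulated purely for illustration, and the names `population` and `sample` are assumptions of this sketch rather than anything from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical population: 100,000 simulated values (illustration only)
population = rng.normal(loc=50, scale=10, size=100_000)

# Parameter: the mean computed over the entire population
population_mean = population.mean()

# In practice we only observe a sample drawn from the population
sample = rng.choice(population, size=100, replace=False)
sample_mean = sample.mean()

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (estimate from 100 points): {sample_mean:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. Hypothesis testing is about judging whether such a gap is larger than chance alone would explain.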


What are the assumptions?

Normality Assumption: Data should ideally follow a normal distribution. This is especially important for small sample sizes. Hypothesis tests such as the t-test and z-test rely on the assumption that the data come from a normal distribution. If the sample size is large, the Central Limit Theorem ensures that the sampling distribution of the mean is approximately normal even when the data themselves are not. Each test also makes assumptions about its test statistic (the test statistic is described in later sections).

Independence of Observations: Observations in the sample should be independent. Variability of the data is one of the key quantities in hypothesis testing, and correlated observations can lead to an incorrect estimate of that variability, which makes the test results unreliable.
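One common way to check the normality assumption is the Shapiro-Wilk test, available as `scipy.stats.shapiro`. The sketch below is only an illustration: the data are simulated, and the helper `looks_normal` with its `alpha` argument is a hypothetical convenience function, not part of SciPy.

```python
import numpy as np
from scipy import stats


def looks_normal(values, alpha=0.05):
    """Return True when the Shapiro-Wilk test does not reject normality.

    `looks_normal` and `alpha` are illustrative names, not part of SciPy.
    """
    _, p_value = stats.shapiro(values)
    return p_value > alpha


rng = np.random.default_rng(seed=42)
bell_shaped = rng.normal(loc=30, scale=5, size=40)   # simulated, roughly normal
right_skewed = rng.exponential(scale=5, size=40)     # simulated, skewed

for label, values in [("bell-shaped", bell_shaped), ("right-skewed", right_skewed)]:
    w_stat, p_value = stats.shapiro(values)
    print(f"{label}: W = {w_stat:.3f}, p-value = {p_value:.4f}, "
          f"looks normal: {looks_normal(values)}")
```

A small p-value here suggests the data deviate from a normal distribution. With very small samples the test has little power, so a plot of the data is a useful complement.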

Homogeneity of Variance (Equal Variances): For some tests, such as two-sample tests, equality of the population variances is very important.
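Equality of variances can itself be checked with a test. A minimal sketch using Levene's test (`scipy.stats.levene`) is shown here; choosing Levene's test is an assumption of this sketch, since the article does not name a specific method. The two groups are the same small illustrative samples used in the two-sample examples of this article.

```python
from scipy import stats

group_a = [25, 30, 35, 40, 45]
group_b = [20, 25, 30, 35, 40]

# Levene's test: H0 is that all groups have equal variances
levene_stat, levene_p = stats.levene(group_a, group_b)

print(f"Levene statistic: {levene_stat:.4f}")
print(f"P-value: {levene_p:.4f}")
```

Here the two groups have exactly the same spread (one is the other shifted by 5), so the statistic is 0 and the test gives no evidence against equal variances.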


What are Null Hypothesis, Alternate Hypothesis and Test Statistic?

The null hypothesis is the statement being tested. For example, in a one-sample z-test the statement could be that the population mean is 5. The alternate hypothesis is the statement against which the null hypothesis is tested. In this case that would be that the population mean is not equal to 5 (also known as a two-sided test). After defining the null and alternate hypotheses, we compute the test statistic. In this case it is the z statistic, computed as below:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

Where:

x̄ = Sample mean (the average of the sample data)
μ0 = Population mean (the mean we are testing against in the null hypothesis)
σ = Population standard deviation (known)
n = Sample size
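As a quick worked example of the z statistic, z = (x̄ − μ0) / (σ / √n), the sketch below plugs in the numbers used in the one-sample z-test example of this article (sample mean 35, hypothesized mean 30, known standard deviation 5, sample size 5). The variable names are illustrative.

```python
import math

sample_mean = 35.0   # x̄: mean of the sample [25, 30, 35, 40, 45]
mu_0 = 30.0          # μ0: population mean under the null hypothesis
sigma = 5.0          # σ: known population standard deviation
n = 5                # sample size

# Standard error of the mean: σ / √n
standard_error = sigma / math.sqrt(n)

# z statistic: how many standard errors the sample mean is from μ0
z = (sample_mean - mu_0) / standard_error

print(f"Standard error: {standard_error:.4f}")
print(f"z statistic: {z:.4f}")
```

With these numbers the standard error is 5/√5 = √5, so z = 5/√5 = √5, roughly 2.236.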

We then use this test statistic to compute the p-value: the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one computed. We compare the p-value with a chosen significance level (α) to decide whether to reject the null hypothesis or fail to reject it.

Let's say we decide on the following decision rule:

If p-value ≤ α (significance level, typically 0.05), reject the null hypothesis (H₀).
If p-value > α, fail to reject the null hypothesis (H₀).

Let's say the p-value (0.246) is greater than the significance level (0.05), so we fail to reject the null hypothesis. We would then infer that there is not enough evidence to reject the null hypothesis, and conclude that the sample mean is not significantly different from the hypothesized population mean.
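The decision rule can be sketched in a few lines of Python. The function names `two_sided_p_value` and `decide` are hypothetical helpers introduced for this illustration; only `scipy.stats.norm.sf` (the survival function, 1 − CDF) comes from SciPy.

```python
from scipy import stats


def two_sided_p_value(z_stat):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * stats.norm.sf(abs(z_stat))


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# The p-value from the example in the text
print(decide(0.246))

# A z statistic near 1.96 sits right at the usual 5% threshold
p = two_sided_p_value(1.96)
print(f"p-value for z = 1.96: {p:.4f} -> {decide(p)}")
```

Note that "fail to reject" is not the same as proving the null hypothesis; it only means the sample does not provide enough evidence against it.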


Top 5 Widely Used Hypothesis Tests & their implementations in Python:

1. One Sample Z Test

This test is used when a population mean is known (or hypothesized) and we need to test whether the mean of a sample is significantly different from it. The z-test applies when the population standard deviation is known. The null hypothesis (H₀) assumes the sample comes from a population whose mean equals the given population mean, while the alternative hypothesis (H₁) assumes it does not.
Assumptions: The population standard deviation is known, and either the data is normally distributed or the sample size is large (commonly n > 30).

import numpy as np
import scipy.stats as stats

# Sample data
data = [25, 30, 35, 40, 45]

# Known population parameters
population_mean = 30
population_std = 5
n = len(data)

# Sample mean
sample_mean = np.mean(data)

# Z-statistic calculation
z_stat = (sample_mean - population_mean) / (population_std / np.sqrt(n))

# Two-sided p-value from the standard normal distribution
# (sf is the survival function, 1 - cdf, and is more accurate in the tails)
p_value = 2 * stats.norm.sf(abs(z_stat))

print(f"Z-statistic: {z_stat}")
print(f"P-value: {p_value}")

2. Two Sample Z Test

Theory:

The two-sample Z-test compares the means of two independent samples. It is used when the population variances are known or the sample sizes are large. The null hypothesis (H₀) assumes the means of the two populations are identical, while the alternative hypothesis (H₁) assumes they are different.
Assumptions: (a) The data in each group is normally distributed. (b) The sample sizes are sufficiently large. (c) The population variances are known.

import numpy as np
import scipy.stats as stats

# Sample data
group1 = [25, 30, 35, 40, 45]
group2 = [20, 25, 30, 35, 40]

# Known population standard deviations
sigma1 = 5
sigma2 = 6

# Sample means
mean1 = np.mean(group1)
mean2 = np.mean(group2)

# Sample sizes
n1 = len(group1)
n2 = len(group2)

# Z-statistic calculation
z_stat = (mean1 - mean2) / np.sqrt(sigma1**2/n1 + sigma2**2/n2)
p_value = 2 * (1 - stats.norm.cdf(abs(z_stat)))

print(f"Z-statistic: {z_stat}")
print(f"P-value: {p_value}")


3. One Sample T Test

The one-sample t-test is used to test whether a sample mean is significantly different from a known or hypothesized population mean when the population standard deviation is unknown and must be estimated from the sample. The null hypothesis (H₀) assumes the sample comes from a population whose mean equals the hypothesized value, while the alternate hypothesis (H₁) assumes it does not.

Assumptions: The sample data is approximately normally distributed.

import scipy.stats as stats

# Sample data
data = [25, 30, 35, 40, 45]

# Known population mean
population_mean = 30

# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(data, population_mean)

print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")
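To see what `ttest_1samp` is doing, the sketch below recomputes the t statistic by hand as t = (x̄ − μ0) / (s / √n), where s is the sample standard deviation (with n − 1 in the denominator), and compares the result with SciPy. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

data = np.array([25, 30, 35, 40, 45])
mu_0 = 30                      # hypothesized population mean
n = data.size

# Sample standard deviation (ddof=1 uses n - 1 in the denominator)
sample_std = data.std(ddof=1)

# t statistic and two-sided p-value with n - 1 degrees of freedom
t_manual = (data.mean() - mu_0) / (sample_std / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Same test via SciPy for comparison
t_scipy, p_scipy = stats.ttest_1samp(data, mu_0)

print(f"Manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"SciPy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

The only differences from the z-test are that the standard deviation is estimated from the sample and the p-value comes from the t distribution instead of the normal distribution.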

4. Independent t-test (Two-Sample t-test)

The independent t-test compares the means of two independent samples to determine whether there is a significant difference between them. The null hypothesis (H₀) assumes that the means of the two populations are identical, while the alternative hypothesis (H₁) assumes they are different.
Assumptions: (a) Data in each sample is approximately normally distributed. (b) The two samples are independent. (c) The variances of the two samples are equal.

import scipy.stats as stats

group1 = [25, 30, 35, 40, 45]
group2 = [20, 25, 30, 35, 40]

# Perform independent t-test (equal_var=True by default, i.e. pooled variance)
t_stat, p_value = stats.ttest_ind(group1, group2)

print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")
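When the equal-variance assumption is doubtful, a common alternative is Welch's t-test, which SciPy exposes through the `equal_var=False` argument of `ttest_ind`. The sketch below compares the two variants. `group_wide` is a made-up sample added only to illustrate unequal spread and unequal size; it is not from the article's data.

```python
from scipy import stats

group1 = [25, 30, 35, 40, 45]
group2 = [20, 25, 30, 35, 40]
group_wide = [10, 30, 50, 20, 40, 60, 0, 35]  # hypothetical, much larger spread

# Equal spread and equal size: pooled and Welch versions agree
t_pooled, p_pooled = stats.ttest_ind(group1, group2, equal_var=True)
t_welch, p_welch = stats.ttest_ind(group1, group2, equal_var=False)
print(f"group1 vs group2     pooled: t = {t_pooled:.4f}, p = {p_pooled:.4f}")
print(f"group1 vs group2     Welch:  t = {t_welch:.4f}, p = {p_welch:.4f}")

# Unequal spread and size: the two versions give different answers
t_pooled_w, p_pooled_w = stats.ttest_ind(group1, group_wide, equal_var=True)
t_welch_w, p_welch_w = stats.ttest_ind(group1, group_wide, equal_var=False)
print(f"group1 vs group_wide pooled: t = {t_pooled_w:.4f}, p = {p_pooled_w:.4f}")
print(f"group1 vs group_wide Welch:  t = {t_welch_w:.4f}, p = {p_welch_w:.4f}")
```

Welch's version does not pool the variances, so it is often the safer default when there is no strong reason to believe the variances are equal.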

5. ANOVA test

ANOVA is used to compare the means of three or more samples to determine whether at least one group mean is significantly different from the others. The null hypothesis (H₀) assumes that all the group means are equal, and the alternate hypothesis (H₁) assumes at least one group mean is different.
Assumptions: (a) The data is normally distributed in each group. (b) The variances of the groups are equal (homogeneity of variance). (c) The groups are independent.

import scipy.stats as stats

group1 = [25, 30, 35, 40, 45]
group2 = [20, 25, 30, 35, 40]
group3 = [15, 20, 25, 30, 35]

# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(group1, group2, group3)

print(f"F-statistic: {f_stat}")
print(f"P-value: {p_value}")
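The F statistic reported by `f_oneway` is the ratio of between-group variability to within-group variability. As a rough sketch of that calculation, the code below rebuilds it from sums of squares for the same three groups and checks it against SciPy. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([25, 30, 35, 40, 45]),
    np.array([20, 25, 30, 35, 40]),
    np.array([15, 20, 25, 30, 35]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)             # number of groups
n_total = all_values.size   # total number of observations

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = stats.f_oneway(*groups)

print(f"Manual: F = {f_manual:.4f}, p = {p_manual:.4f}")
print(f"SciPy:  F = {f_scipy:.4f}, p = {p_scipy:.4f}")
```

For these groups the between-group sum of squares is 250 and the within-group sum of squares is 750, giving F = (250 / 2) / (750 / 12) = 2.0.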

Future improvements

In upcoming articles, we could focus on thorough examples of a few more advanced tests and their applications.


