Parametric hypothesis tests with examples in Python

How To
Parametric Tests
T-test
Z-test
F-test
ANOVA
Python
A tutorial on parametric hypothesis tests with examples in Python.
Author

Rohit Farmer

Published

January 9, 2023

2023-01-09 First draft

Introduction

This article extends Rohit Farmer's “Parametric Hypothesis Tests with Examples in R” (November 10, 2022). Please see the parent article for the theoretical background.

Import packages

import numpy as np
from scipy import stats
import pandas as pd

dat = pd.read_csv("https://raw.githubusercontent.com/opencasestudies/ocs-bp-rural-and-urban-obesity/master/data/wrangled/BMI_long.csv")

Z-test

Example code for a two sample unpaired z-test

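from statsmodels.stats.weightstats import ztest
import random

# random.sample is not seeded here, so the statistics change from run to run

# Women's BMI in 1985: drop missing values, then draw 300 observations
mask1 = (dat['Sex'] == "Women") & (dat['Year'] == 1985)
x1 = dat[mask1]['BMI'].dropna()
x1 = random.sample(x1.tolist(), k = 300)

# Women's BMI in 2017, sampled the same way
mask2 = (dat['Sex'] == "Women") & (dat['Year'] == 2017)
x2 = dat[mask2]['BMI'].dropna()
x2 = random.sample(x2.tolist(), k = 300)

# value=0 is the difference in means assumed under the null hypothesis
z_statistics, p_value = ztest(x1, x2, value=0)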

print("z-statistic:", z_statistics)
print("p-value:", p_value)
z-statistic: -9.201889936608346
p-value: 3.517084717411295e-20
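
To make the calculation behind the z-test concrete, the following sketch computes a two-sample z-statistic by hand. It uses synthetic, BMI-like numbers (an assumption, not the article's data) and a pooled variance estimate. The pooled form should correspond to the default `usevar='pooled'` of `ztest`, but treat that correspondence as an assumption to check against the statsmodels documentation.

```python
import numpy as np
from scipy import stats

# Synthetic BMI-like samples (hypothetical values, not the article's data)
rng = np.random.default_rng(0)
a = rng.normal(loc=25.0, scale=4.0, size=300)
b = rng.normal(loc=27.0, scale=4.0, size=300)

n1, n2 = len(a), len(b)

# Pooled sample variance from the two unbiased per-sample variances
pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)

# Standard error of the difference in means
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# z-statistic and two-sided p-value from the standard normal distribution
z = (a.mean() - b.mean()) / se
p = 2 * stats.norm.sf(abs(z))

print("z-statistic:", z)
print("p-value:", p)
```

The statistic has the same form as the equal-variance t-statistic; only the reference distribution differs (standard normal instead of Student's t), which matters little at this sample size.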

T-test

Example code for a two-sample unpaired, two-tailed t-test

# Rural women's BMI in 1985
mask1 = (dat['Sex'] == "Women") & (dat['Region'] == "Rural") & (dat['Year'] == 1985)
x1 = dat[mask1]['BMI']

# Urban women's BMI in 1985
mask2 = (dat['Sex'] == "Women") & (dat['Region'] == "Urban") & (dat['Year'] == 1985)
x2 = dat[mask2]['BMI']

# equal_var=True runs Student's t-test; nan_policy="omit" ignores missing values
t_statistic, p_value = stats.ttest_ind(x1, x2, equal_var = True, nan_policy = "omit")

print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -3.8952336023562912
p-value: 0.00011523146459551333
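
Setting `equal_var = True` assumes the two groups share a common variance. One way to probe that assumption is an F-test on the ratio of sample variances, sketched below on synthetic data (the values and variable names are illustrative assumptions). SciPy has no dedicated two-sample F-test function that I am certain of, so the p-value is taken directly from the F distribution. If the variances look unequal, Welch's t-test (`equal_var=False`) is a common fallback.

```python
import numpy as np
from scipy import stats

# Synthetic samples with deliberately different spreads (hypothetical values)
rng = np.random.default_rng(1)
a = rng.normal(25.0, 3.0, 200)
b = rng.normal(26.0, 5.0, 200)

# F-statistic: ratio of the two unbiased sample variances
f_stat = a.var(ddof=1) / b.var(ddof=1)
dfn, dfd = len(a) - 1, len(b) - 1

# Two-sided p-value: double the smaller tail of the F distribution
p_f = 2 * min(stats.f.cdf(f_stat, dfn, dfd), stats.f.sf(f_stat, dfn, dfd))

# Welch's t-test does not assume equal variances
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

print("F-statistic:", f_stat)
print("F-test p-value:", p_f)
print("Welch t-statistic:", t_welch)
print("Welch p-value:", p_welch)
```

The variance-ratio F-test is sensitive to departures from normality; `stats.levene` is a more robust alternative when that is a concern.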

Example code for a one-tailed t-test

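# alternative="greater" tests whether the mean of x1 (rural) exceeds that of x2 (urban);
# a negative t-statistic therefore yields a p-value close to 1
t_statistic, p_value = stats.ttest_ind(x1, x2, equal_var = True, nan_policy = "omit", alternative = "greater")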

print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -3.8952336023562912
p-value: 0.9999423842677022
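
The relationship between one-tailed and two-tailed p-values can be checked numerically. The sketch below uses synthetic data (an assumption, not the article's dataset) to show that the two one-tailed p-values sum to 1 and that the two-tailed p-value is twice the smaller of them.

```python
import numpy as np
from scipy import stats

# Synthetic samples where the first group has the lower mean (hypothetical values)
rng = np.random.default_rng(2)
a = rng.normal(25.0, 4.0, 150)
b = rng.normal(27.0, 4.0, 150)

# Same test, three alternative hypotheses
t, p_two = stats.ttest_ind(a, b, equal_var=True)
_, p_greater = stats.ttest_ind(a, b, equal_var=True, alternative="greater")
_, p_less = stats.ttest_ind(a, b, equal_var=True, alternative="less")

print("t-statistic:", t)
print("two-sided p-value:", p_two)
print("p-value (greater):", p_greater)
print("p-value (less):", p_less)
print("greater + less:", p_greater + p_less)
```

Choosing the direction after looking at the data inflates the false positive rate, so the alternative should be fixed before running the test.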

Two sample paired (dependent) t-test

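# ttest_rel pairs observations by position, so x1 and x2 must have the same
# length and list the same units (assumed here: countries) in the same order
t_statistic, p_value = stats.ttest_rel(x1, x2, nan_policy = "omit")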

print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -14.095486243034763
p-value: 1.426675846865914e-31
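
Because a paired test matches observations by position, it can be safer to align the pairs explicitly on a key before testing. The sketch below does this on a small hand-made table; the `Country` column and all values are hypothetical and only illustrate the idea. It also shows that a paired t-test is equivalent to a one-sample t-test on the within-pair differences.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Small hypothetical long-format table; "Country" is an assumed pairing key
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Region": ["Rural"] * 4 + ["Urban"] * 4,
    "BMI": [22.1, 24.3, np.nan, 25.0, 23.0, 25.1, 24.0, 26.2],
})

# One row per country and one column per region, so pairs line up by key
wide = df.pivot(index="Country", columns="Region", values="BMI")

# Keep only complete pairs (country C has a missing rural value)
wide = wide.dropna()

# Paired t-test on the aligned columns
t_rel, p_rel = stats.ttest_rel(wide["Rural"], wide["Urban"])

# Equivalent one-sample t-test on the within-pair differences
t_one, p_one = stats.ttest_1samp(wide["Rural"] - wide["Urban"], popmean=0)

print("paired t-statistic:", t_rel)
print("one-sample t-statistic on differences:", t_one)
```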

ANOVA

Example code for a one-way ANOVA

mask1 = (dat['Sex'] == "Men") & (dat['Region'] == "Rural") & (dat['Year'] == 2017)
x1 = dat[mask1]['BMI']

mask2 = (dat['Sex'] == "Men") & (dat['Region'] == "Urban") & (dat['Year'] == 2017)
x2 = dat[mask2]['BMI']

mask3 = (dat['Sex'] == "Men") & (dat['Region'] == "National") & (dat['Year'] == 2017)
x3 = dat[mask3]['BMI']

# Drop missing values from each group explicitly before running the test
f_value, p_value = stats.f_oneway(x1.dropna(), x2.dropna(), x3.dropna())

print("f-value statistic: ",f_value)
print("p-value: ", p_value)
f-value statistic:  3.4215235158825905
p-value:  0.033309935710150805
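
The F-value reported by a one-way ANOVA is the ratio of between-group to within-group mean squares. The sketch below computes it by hand on three synthetic groups (hypothetical values, not the article's data) and compares the result with `stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Three synthetic groups with slightly different means (hypothetical values)
rng = np.random.default_rng(3)
groups = [
    rng.normal(25.0, 4.0, 100),
    rng.normal(26.0, 4.0, 100),
    rng.normal(27.0, 4.0, 100),
]

k = len(groups)                          # number of groups
n_total = sum(len(g) for g in groups)    # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

# Reference result from SciPy
f_scipy, p_scipy = stats.f_oneway(*groups)

print("manual F:", f_manual, "SciPy F:", f_scipy)
print("manual p:", p_manual, "SciPy p:", p_scipy)
```

A significant F-value only indicates that at least one group mean differs; identifying which ones requires a follow-up pairwise comparison.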

Citation

BibTeX citation:
@online{farmer2023,
  author = {Farmer, Rohit},
  title = {Parametric Hypothesis Tests with Examples in {Python}},
  date = {2023-01-09},
  url = {https://dataalltheway.com/posts/010-02-parametric-hypothesis-tests-python},
  langid = {en}
}
For attribution, please cite this work as:
Farmer, Rohit. 2023. “Parametric Hypothesis Tests with Examples in Python.” January 9, 2023. https://dataalltheway.com/posts/010-02-parametric-hypothesis-tests-python.