import numpy as np
from scipy import stats
import pandas as pd
dat = pd.read_csv("https://raw.githubusercontent.com/opencasestudies/ocs-bp-rural-and-urban-obesity/master/data/wrangled/BMI_long.csv")
Parametric hypothesis tests with examples in Python
How To
Parametric Tests
T-test
Z-test
F-test
ANOVA
Python
A tutorial on parametric hypothesis tests with examples in Python.
Update history
2023-01-09 First draft
Introduction
This article extends Rohit Farmer, “Parametric Hypothesis Tests with Examples in R” (November 10, 2022). Please see that parent article for the theoretical background; this one covers only the Python code.
Import packages
Z-test
Example code for a two sample unpaired z-test
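from statsmodels.stats.weightstats import ztest
import random

# Women's BMI in 1985: drop missing values, then draw 300 values at random.
# No seed is set, so the sample and the resulting statistics vary between runs.
mask1 = (dat['Sex'] == "Women") & (dat['Year'] == 1985)
x1 = dat[mask1]['BMI']
x1 = x1.array.dropna()
x1 = random.sample(x1.tolist(), k = 300)

# Women's BMI in 2017, sampled the same way.
mask2 = (dat['Sex'] == "Women") & (dat['Year'] == 2017)
x2 = dat[mask2]['BMI']
x2 = x2.array.dropna()
x2 = random.sample(x2.tolist(), k = 300)

# Two-sided z-test of the null hypothesis that the mean difference is 0.
z_statistics, p_value = ztest(x1, x2, value=0)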
print("z-statistic:", z_statistics)
print("p-value:", p_value)
z-statistic: -9.201889936608346
p-value: 3.517084717411295e-20
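To make the z-statistic above less of a black box, the sketch below recomputes a two-sample z-test by hand with NumPy and SciPy. It assumes a pooled variance estimate, which appears to be what ztest uses by default. The function name two_sample_ztest and the simulated samples are illustrative assumptions and are not part of the original analysis.

```python
import numpy as np
from scipy import stats


def two_sample_ztest(a, b, value=0.0):
    """Two-sided, two-sample z-test with a pooled variance (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    # Pooled sample variance across both groups
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    # Standard error of the difference in means
    se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    z = (a.mean() - b.mean() - value) / se
    # Two-sided p-value from the standard normal distribution
    p = 2.0 * stats.norm.sf(abs(z))
    return z, p


# Simulated data (assumption: not the BMI dataset used above)
rng = np.random.default_rng(0)
sample_1985 = rng.normal(loc=24.0, scale=2.0, size=300)
sample_2017 = rng.normal(loc=25.5, scale=2.0, size=300)

z_stat, p_val = two_sample_ztest(sample_1985, sample_2017)
print("z-statistic:", z_stat)
print("p-value:", p_val)
```

With samples this large, the normal approximation and the t-distribution give nearly identical p-values, which is why a z-test is a reasonable choice here.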
T-test
Example code for a two-tailed t-test
# Rural women's BMI in 1985
mask1 = (dat['Sex'] == "Women") & (dat['Region'] == "Rural") & (dat['Year'] == 1985)
x1 = dat[mask1]['BMI']
# Urban women's BMI in 1985
mask2 = (dat['Sex'] == "Women") & (dat['Region'] == "Urban") & (dat['Year'] == 1985)
x2 = dat[mask2]['BMI']
# Two-sided, equal-variance t-test; missing values are omitted
t_statistic, p_value = stats.ttest_ind(x1, x2, equal_var = True, nan_policy = "omit")
print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -3.8952336023562912
p-value: 0.00011523146459551333
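The call above sets equal_var = True, which pools the two group variances. When that assumption is doubtful, ttest_ind also accepts equal_var = False (Welch's t-test). The sketch below compares the two variants. The simulated groups and their names are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Simulated groups with unequal spread and size (assumption: illustrative data)
rng = np.random.default_rng(1)
rural = rng.normal(loc=24.0, scale=1.5, size=80)
urban = rng.normal(loc=25.0, scale=3.0, size=120)

# Student's t-test: pooled variance
student = stats.ttest_ind(rural, urban, equal_var=True)
# Welch's t-test: separate variance estimates per group
welch = stats.ttest_ind(rural, urban, equal_var=False)

print("Student t:", student.statistic, "p:", student.pvalue)
print("Welch t:", welch.statistic, "p:", welch.pvalue)
```

When the group sizes are equal, the two statistics coincide and only the degrees of freedom differ; with unequal sizes and variances, as simulated here, they can diverge.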
Example code for a one-tailed t-test
t_statistic, p_value = stats.ttest_ind(x1, x2, equal_var = True, nan_policy = "omit", alternative = "greater")
print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -3.8952336023562912
p-value: 0.9999423842677022
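The one-tailed p-value above is close to 1 because the t-statistic is negative, while alternative = "greater" tests whether the first group's mean is larger than the second's. The sketch below shows how the three alternatives relate to one another. It uses simulated data (an assumption, not the BMI dataset).

```python
import numpy as np
from scipy import stats

# Simulated samples where the first group has the smaller mean (assumption)
rng = np.random.default_rng(42)
a = rng.normal(loc=25.0, scale=2.0, size=50)
b = rng.normal(loc=26.0, scale=2.0, size=50)

two_sided = stats.ttest_ind(a, b, equal_var=True)
greater = stats.ttest_ind(a, b, equal_var=True, alternative="greater")
less = stats.ttest_ind(a, b, equal_var=True, alternative="less")

# The t-statistic is the same in all three cases; only the p-value changes.
print("t-statistic:", two_sided.statistic)
print("two-sided p:", two_sided.pvalue)
print("greater p:", greater.pvalue)
print("less p:", less.pvalue)

# The two one-sided p-values sum to 1, and the two-sided p-value
# is twice the smaller of them.
print("greater + less:", greater.pvalue + less.pvalue)
print("2 * min(one-sided):", 2 * min(greater.pvalue, less.pvalue))
```

Choosing the direction after looking at the data inflates the false-positive rate, so the alternative should be fixed before running the test.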
Two sample paired (dependent) t-test
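# ttest_rel pairs values by position, so x1 and x2 must have the same length.
# This assumes the rural and urban rows list the same countries in the same order.
t_statistic, p_value = stats.ttest_rel(x1, x2, nan_policy = "omit")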
print("t-statistic:", t_statistic)
print("p-value:", p_value)
t-statistic: -14.095486243034763
p-value: 1.426675846865914e-31
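A paired t-test is equivalent to a one-sample t-test on the per-pair differences, which explains why its statistic can differ so much from the unpaired result on the same data. The sketch below checks that equivalence on simulated paired measurements. The variable names and values are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Simulated paired measurements: each "after" value is tied to its "before" value
rng = np.random.default_rng(7)
before = rng.normal(loc=24.0, scale=2.0, size=40)
after = before + rng.normal(loc=0.5, scale=1.0, size=40)

# Paired t-test on the two columns
paired = stats.ttest_rel(before, after)
# One-sample t-test of the differences against a mean of 0
one_sample = stats.ttest_1samp(before - after, popmean=0.0)

print("paired t:", paired.statistic, "p:", paired.pvalue)
print("one-sample t on differences:", one_sample.statistic, "p:", one_sample.pvalue)
```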
ANOVA
Example code for a one-way ANOVA
mask1 = (dat['Sex'] == "Men") & (dat['Region'] == "Rural") & (dat['Year'] == 2017)
x1 = dat[mask1]['BMI']
mask2 = (dat['Sex'] == "Men") & (dat['Region'] == "Urban") & (dat['Year'] == 2017)
x2 = dat[mask2]['BMI']
mask3 = (dat['Sex'] == "Men") & (dat['Region'] == "National") & (dat['Year'] == 2017)
x3 = dat[mask3]['BMI']
f_value, p_value = stats.f_oneway(x1.array.dropna(), x2.array.dropna(), x3.array.dropna())
print("f-value statistic: ",f_value)
print("p-value: ", p_value)
f-value statistic: 3.4215235158825905
p-value: 0.033309935710150805
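The F-value reported above is the ratio of between-group to within-group variability. The sketch below computes it by hand and compares it with stats.f_oneway. The helper name one_way_f and the simulated groups are assumptions for illustration, not part of the original analysis.

```python
import numpy as np
from scipy import stats


def one_way_f(*groups):
    """One-way ANOVA F-statistic and p-value computed by hand (hypothetical helper)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n_total = sum(g.size for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    f = ms_between / ms_within
    # Upper-tail probability of the F distribution
    p = stats.f.sf(f, k - 1, n_total - k)
    return f, p


# Simulated groups (assumption: not the BMI dataset used above)
rng = np.random.default_rng(3)
g1 = rng.normal(loc=25.0, scale=2.0, size=60)
g2 = rng.normal(loc=25.8, scale=2.0, size=60)
g3 = rng.normal(loc=25.4, scale=2.0, size=60)

f_manual, p_manual = one_way_f(g1, g2, g3)
f_scipy, p_scipy = stats.f_oneway(g1, g2, g3)
print("manual F:", f_manual, "p:", p_manual)
print("scipy F:", f_scipy, "p:", p_scipy)
```

A significant F-value only indicates that at least one group mean differs; it does not say which one.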
Citation
BibTeX citation:
@online{farmer2023,
author = {Farmer, Rohit},
title = {Parametric Hypothesis Tests with Examples in {Python}},
date = {2023-01-09},
url = {https://dataalltheway.com/posts/010-02-parametric-hypothesis-tests-python},
langid = {en}
}
For attribution, please cite this work as:
Farmer, Rohit. 2023. “Parametric Hypothesis Tests with Examples in
Python.” January 9, 2023. https://dataalltheway.com/posts/010-02-parametric-hypothesis-tests-python.