t-test: 2 groups
R packages we use in this section
library(tidyverse)
library(broom)
library(patchwork)
library(DT)
library(ggbeeswarm)
library(ggsignif)
library(rcompanion)
library(rmarkdown)
Which t-test should we use?
When you conduct a t-test, you need to consider two things:
(1) whether the groups being compared come from a single population (paired data) or two different populations (unpaired data)
(2) whether you want to test the difference in a specific direction (one-sided) or in both directions (two-sided)
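In R's t.test() function, these two choices correspond to the paired and alternative arguments. The following is a minimal sketch, using the ten Mos Burger and McDonald's scores from mos_mc.csv (typed in by hand here so that the block runs on its own):

```r
# Scores copied from mos_mc.csv (10 respondents rated both burgers)
mos <- c(80, 75, 80, 90, 85, 80, 75, 85, 85, 70)
mc  <- c(75, 70, 80, 85, 90, 75, 85, 80, 80, 75)

# Choice (1): paired or unpaired
res_paired   <- t.test(mos, mc, paired = TRUE)  # the same people rated both burgers
res_unpaired <- t.test(mos, mc)                 # default: unpaired (Welch)

# Choice (2): both directions or a specific direction
res_two_sided <- t.test(mos, mc, paired = TRUE, alternative = "two.sided")
res_greater   <- t.test(mos, mc, paired = TRUE, alternative = "greater")

res_paired$method
res_unpaired$method
res_two_sided$p.value
res_greater$p.value
```

Because the t-value is positive for these data, the one-sided p-value with alternative = "greater" is half of the two-sided p-value.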
| How to show the results | Features |
|---|---|
| (1) Simple output | Rich info, but not easy to see |
| (2) Boxplot/violin plot | Easy to see the result & statistical significance |
| (3) Bar chart | Easy to see |
| (4) Show the difference | Easy to see |
Paired-samples t-test
A sample: (mos_mc.csv)
Conduct a t-test and confirm this.
Read mos_mc.csv
df_mos_mc <- read_csv("data/mos_mc.csv")
DT::datatable(df_mos_mc)
The layout of mos_mc.csv is called wide format
Wide format data should be changed to long format for analysis in R
→ Using the tidyr::pivot_longer() function, we change mos_mc.csv to long format
df_long <- df_mos_mc %>%
  tidyr::pivot_longer(mos:mc,
                      names_to = "burger",
                      values_to = "score")
df_long
DT::datatable(df_long)
long format data
df_long %>%
  mutate(burger = fct_inorder(burger)) %>%
  ggplot(aes(x = burger, y = score)) +
  geom_boxplot() +
  scale_x_discrete(labels = c("Mos Burger", "McDonald's")) +
  labs(x = "Shop names", y = "Evaluation")
df_long %>%
  mutate(burger = fct_inorder(burger)) %>%
  ggplot(aes(x = burger, y = score)) +
  geom_violin() +
  scale_x_discrete(labels = c("Mos Burger", "McDonald's")) +
  labs(x = "Shop names", y = "Evaluation")
df_long %>%
  mutate(burger = fct_inorder(burger)) %>%
  ggplot(aes(x = burger, y = score, color = burger)) +
  geom_violin() +
  geom_boxplot(width = .1) +                 # Set the width of the box plot to 0.1
  stat_summary(fun = mean, geom = "point") + # Show the mean as a point (fun replaces the deprecated fun.y)
  scale_x_discrete(labels = c("Mos Burger", "McDonald's")) +
  labs(x = "Shop names", y = "Evaluation")
summary(df_mos_mc)
ID mos mc
Min. : 1.00 Min. :70.00 Min. :70.00
1st Qu.: 3.25 1st Qu.:76.25 1st Qu.:75.00
Median : 5.50 Median :80.00 Median :80.00
Mean : 5.50 Mean :80.50 Mean :79.50
3rd Qu.: 7.75 3rd Qu.:85.00 3rd Qu.:83.75
Max. :10.00 Max. :90.00 Max. :90.00
Paired t-test: calculating the t-value
We check whether the t-value we get lies within the rejection area
Equation for calculating the t-value using paired data
\[T = \frac{\bar{d} - d_0}{u_d / \sqrt{n}}\] where
\[\bar{d} = \frac{\sum_{i=1}^{n} (x_i - y_i)}{n}\]
\(n\) : Sample size (10)
\(u_d\) : Unbiased standard deviation of the differences \(x_i - y_i\)
\(\bar{d}\) : The mean difference in evaluation between Mos Burger and McDonald’s
\(d_0\) : The mean difference assumed under the null hypothesis (0)
\(x_i\) : Evaluation for Mos Burger
\(y_i\) : Evaluation for McDonald’s
Using the equation above, we can calculate the t-value
x <- df_mos_mc$mos
y <- df_mos_mc$mc
d <- x - y
t <- (mean(d) - 0) / (sd(d) / sqrt(10))
t
[1] 0.557086
This is the t-value (0.557086) we calculated by hand
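Whether this t-value falls in the rejection area can also be checked in R instead of reading a printed table. The following is a minimal sketch, assuming a two-sided test at the 5% significance level; the differences are typed in by hand (Mos Burger minus McDonald's) so that the block runs on its own:

```r
# Differences copied from the mos_mc.csv data (Mos Burger minus McDonald's)
d <- c(5, 5, 0, 5, -5, 5, -10, 5, 5, -5)
n <- length(d)

t_value  <- (mean(d) - 0) / (sd(d) / sqrt(n))  # same formula as above
critical <- qt(0.975, df = n - 1)              # upper 2.5% point with 9 degrees of freedom
p_value  <- 2 * pt(-abs(t_value), df = n - 1)  # two-sided p-value

t_value
critical
abs(t_value) > critical  # TRUE would mean the t-value lies in the rejection area
```

Here critical is about 2.262 and the comparison returns FALSE, so the t-value does not lie in the rejection area.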
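We use a two-tailed t-test (because what we want to know is NOT whether McDonald’s is tastier than Mos Burger, or vice versa).
From the t-distribution table, the critical values for a two-tailed t-test with 9 degrees of freedom at the 5% significance level are (-2.262 & 2.262)
The t-value we calculated by hand (0.557086) lies between these critical values, so it is outside the rejection area and we cannot reject the null hypothesis
We use the long format data (df_long) we modified at Section 4.2
df_long
# A tibble: 20 x 3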
ID burger score
<dbl> <chr> <dbl>
1 1 mos 80
2 1 mc 75
3 2 mos 75
4 2 mc 70
5 3 mos 80
6 3 mc 80
7 4 mos 90
8 4 mc 85
9 5 mos 85
10 5 mc 90
11 6 mos 80
12 6 mc 75
13 7 mos 75
14 7 mc 85
15 8 mos 85
16 8 mc 80
17 9 mos 85
18 9 mc 80
19 10 mos 70
20 10 mc 75
t.test(df_long$score[df_long$burger == "mos"],
df_long$score[df_long$burger == "mc"]) # unpaired is default
Welch Two Sample t-test
data: df_long$score[df_long$burger == "mos"] and df_long$score[df_long$burger == "mc"]
t = 0.37354, df = 18, p-value = 0.7131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.624301 6.624301
sample estimates:
mean of x mean of y
80.5 79.5
The t-value is t = 0.37354 in the output above. Because t.test() treats the data as unpaired by default, it differs from the paired t-value we calculated by hand (0.557086)
Using the geom_signif() function of the ggsignif package, we can draw a boxplot with statistical significance (p-value)
Unpaired data is the default in the geom_signif() function
For paired data, add test.args = list(paired = TRUE)
df_long %>%
  mutate(burger = fct_inorder(burger)) %>%
  ggplot(aes(x = burger, y = score, color = burger)) +
  geom_violin() +
  geom_boxplot(width = .1) +
  stat_summary(fun = mean, geom = "point") + # fun replaces the deprecated fun.y
  scale_x_discrete(labels = c("Mos Burger", "McDonald's")) +
  labs(x = "Store", y = "Evaluation") +
  ggsignif::geom_signif(comparisons = combn(sort(unique(df_long$burger)), 2, FUN = list),
                        test = "t.test",
                        test.args = list(paired = TRUE),
                        na.rm = TRUE,
                        step_increase = 0.1)
→ When you use unpaired data, delete test.args = list(paired = TRUE)
wide format data
df_mos_mc
# A tibble: 10 x 3
ID mos mc
<dbl> <dbl> <dbl>
1 1 80 75
2 2 75 70
3 3 80 80
4 4 90 85
5 5 85 90
6 6 80 75
7 7 75 85
8 8 85 80
9 9 85 80
10 10 70 75
t.test(df_mos_mc$mos,
df_mos_mc$mc,
paired = TRUE)
Paired t-test
data: df_mos_mc$mos and df_mos_mc$mc
t = 0.55709, df = 9, p-value = 0.5911
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.060696 5.060696
sample estimates:
mean of the differences
1
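The printed output is rich but not easy to read at a glance. The individual numbers can be pulled out of the object that t.test() returns. The following is a minimal sketch with the scores typed in by hand; the variable names (res, t_value, dof, ci) are only illustrative:

```r
# Scores copied from the wide format data (df_mos_mc) shown above
mos <- c(80, 75, 80, 90, 85, 80, 75, 85, 85, 70)
mc  <- c(75, 70, 80, 85, 90, 75, 85, 80, 80, 75)

res <- t.test(mos, mc, paired = TRUE)

t_value <- unname(res$statistic)     # t-value
dof     <- unname(res$parameter)     # degrees of freedom
p_value <- res$p.value               # p-value
ci      <- as.numeric(res$conf.int)  # 95% confidence interval (lower, upper)

round(c(t = t_value, df = dof, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

The rounded values should agree with the printed output above (t = 0.55709, df = 9, p-value = 0.5911).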
One Sample t-test
Using a one-sample t-test, we can do the same t-test as in Section 4.5.1
Calculate the difference in evaluation (diff) between Mos Burger and McDonald’s for each respondent
Here, we use wide format data: df_mos_mc
Null hypothesis: the mean of diff = 0
→ there is no difference in taste between Mos Burger and McDonald’s
diff <- df_mos_mc$mos - df_mos_mc$mc
diff
[1] 5 5 0 5 -5 5 -10 5 5 -5
The mean of diff
mean(diff)
[1] 1
t.test(diff)
One Sample t-test
data: diff
t = 0.55709, df = 9, p-value = 0.5911
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-3.060696 5.060696
sample estimates:
mean of x
1
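That the one-sample t-test on the differences gives the same result as the paired t-test can be confirmed programmatically. The following is a minimal sketch with the scores typed in by hand; the same_* names are only illustrative:

```r
# Scores copied from the wide format data (df_mos_mc)
mos <- c(80, 75, 80, 90, 85, 80, 75, 85, 85, 70)
mc  <- c(75, 70, 80, 85, 90, 75, 85, 80, 80, 75)

res_paired <- t.test(mos, mc, paired = TRUE)
res_one    <- t.test(mos - mc, mu = 0)  # mu = 0 is the default null value

# Compare t-value, p-value and confidence interval of the two tests
same_t  <- isTRUE(all.equal(unname(res_paired$statistic), unname(res_one$statistic)))
same_p  <- isTRUE(all.equal(res_paired$p.value, res_one$p.value))
same_ci <- isTRUE(all.equal(as.numeric(res_paired$conf.int), as.numeric(res_one$conf.int)))

c(same_t, same_p, same_ci)
```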
Unpaired-samples t-test
A sample: (mos_mc.csv)
Conduct a t-test and confirm this.
Paired and unpaired data
| Type of data | Details |
|---|---|
| Paired | 10 people eat both burgers |
| Unpaired | 10 people eat Mos and the other 10 people eat Mc burgers |
Read mos_mc.csv
df_mos_mc <- read_csv("data/mos_mc.csv")
DT::datatable(df_mos_mc)
The layout of mos_mc.csv is called wide format
Wide format data should be changed to long format in R
Using the tidyr::pivot_longer() function, we change mos_mc.csv to long format
df_long <- df_mos_mc %>%
  tidyr::pivot_longer(mos:mc,
                      names_to = "burger",
                      values_to = "score")
DT::datatable(df_long)
t.test(df_long$score[df_long$burger == "mos"],
       df_long$score[df_long$burger == "mc"]) # unpaired is default
Welch Two Sample t-test
data: df_long$score[df_long$burger == "mos"] and df_long$score[df_long$burger == "mc"]
t = 0.37354, df = 18, p-value = 0.7131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.624301 6.624301
sample estimates:
mean of x mean of y
80.5 79.5
The t-value is t = 0.37354 in the output above
Using the geom_signif() function of the ggsignif package, we can draw a boxplot + violin plot with statistical significance (p-value)
Unpaired data is the default in the geom_signif() function, so we do not need test.args = list(paired = TRUE)
df_long %>%
  mutate(burger = fct_inorder(burger)) %>%
  ggplot(aes(x = burger, y = score, fill = burger)) +
  geom_violin() +
  geom_boxplot(width = .1) +                 # Set the width of the box plot to 0.1
  stat_summary(fun = mean, geom = "point") + # Show the mean as a point (fun replaces the deprecated fun.y)
  ggbeeswarm::geom_beeswarm() +
  scale_x_discrete(labels = c("Mos Burger", "McDonald's")) +
  labs(x = "Shop names", y = "Evaluation") +
  ggsignif::geom_signif(comparisons = combn(sort(unique(df_long$burger)), 2, FUN = list),
                        test = "t.test",
                        na.rm = TRUE,
                        step_increase = 0.1)
df_mos_mc
# A tibble: 10 x 3
ID mos mc
<dbl> <dbl> <dbl>
1 1 80 75
2 2 75 70
3 3 80 80
4 4 90 85
5 5 85 90
6 6 80 75
7 7 75 85
8 8 85 80
9 9 85 80
10 10 70 75
t.test(df_mos_mc$mos,
df_mos_mc$mc) # unpaired is default
Welch Two Sample t-test
data: df_mos_mc$mos and df_mos_mc$mc
t = 0.37354, df = 18, p-value = 0.7131
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.624301 6.624301
sample estimates:
mean of x mean of y
80.5 79.5
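The numbers in the Welch output can be reproduced by hand. The following is a minimal sketch of the Welch t-value and its degrees of freedom (the Welch-Satterthwaite approximation), with the scores typed in by hand; the names v_mos, v_mc, t_welch and df_welch are only illustrative:

```r
# Scores copied from the wide format data (df_mos_mc)
mos <- c(80, 75, 80, 90, 85, 80, 75, 85, 85, 70)
mc  <- c(75, 70, 80, 85, 90, 75, 85, 80, 80, 75)

# Squared standard error of each group mean
v_mos <- var(mos) / length(mos)
v_mc  <- var(mc) / length(mc)

# Welch t-value: difference in means over the combined standard error
t_welch <- (mean(mos) - mean(mc)) / sqrt(v_mos + v_mc)

# Welch-Satterthwaite degrees of freedom
df_welch <- (v_mos + v_mc)^2 /
  (v_mos^2 / (length(mos) - 1) + v_mc^2 / (length(mc) - 1))

t_welch
df_welch
```

In this data set the two groups happen to have the same variance, which is why the Welch degrees of freedom come out as a whole number (18) rather than a fraction.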
long format data
Recently, researchers in the social sciences tend to show their t-test results as bar charts in major academic journals
We use hypothetical survey data on Mos Burger and McDonald’s.
Unpaired-samples t-test
20 people eat the fried potatoes of either Mos Burger or McDonald’s.
Respondents 1 to 10 eat the fried potatoes of Mos Burger.
Respondents 11 to 20 eat the fried potatoes of McDonald’s.
These 20 respondents evaluate them on a 0 - 100 scale.
What we want to know here:
Which fried potatoes taste better, those of Mos Burger or those of McDonald’s?
Unpaired-samples t-test
potato_ttest <- t.test(fried_potato ~ mosmc, data = df_menu)
potato_ttest
Welch Two Sample t-test
data: fried_potato by mosmc
t = 4.4463, df = 15.32, p-value = 0.0004489
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
3.233228 9.166772
sample estimates:
mean in group mc mean in group mos
80.9 74.7
The t-value is t = 4.4463 in the output above
We define a function (mean_ci) that calculates the mean and the 95% confidence interval for each group
mean_ci <- function(data, by, vari){
  # Standard error of the mean
  se <- function(x) sqrt(var(x) / length(x))
  meanci <- data %>%
    group_by({{by}}) %>%
    summarise(n = n(),
              mean_out = mean({{vari}}),
              se_out = se({{vari}}),
              .groups = "drop"  # the argument name is .groups, not .group
    ) %>%
    mutate(
      lwr = mean_out - 1.96 * se_out,  # 95% CI using the normal approximation
      upr = mean_out + 1.96 * se_out
    ) %>%
    mutate(across(where(is.double), ~ round(.x, 1))) %>%
    mutate(mean_label = format(round(mean_out, 1), nsmall = 1)) %>%
    select({{by}}, mean_out, lwr, upr, mean_label) %>%
    mutate(across(.cols = {{by}}, as.factor))
  return(meanci)
}
Calculate the mean and 95% confidence intervals of fried potatoes for Mos Burger and McDonald’s
potato_mean <- df_menu %>%
  mean_ci(mosmc, fried_potato)
potato_mean
# A tibble: 2 x 5
mosmc mean_out lwr upr mean_label
<fct> <dbl> <dbl> <dbl> <chr>
1 mc 80.9 79.4 82.4 80.9
2 mos 74.7 72.4 77 74.7
lwr: the lower bound of the 95% confidence interval
upr: the upper bound of the 95% confidence interval
mean_label: the labels pasted on the bar chart
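The bounds lwr and upr are computed as the mean plus or minus 1.96 standard errors, which is a normal approximation. With only 10 respondents per group, a t-based interval is somewhat wider. The following is a minimal sketch of both calculations; the scores are hypothetical (they are not taken from menu.csv), and the t-based interval is shown as a variation, not as the method used in this section:

```r
# Hypothetical scores for one group of 10 respondents (not from menu.csv)
scores <- c(72, 75, 78, 70, 74, 76, 73, 77, 71, 79)
n  <- length(scores)
se <- sqrt(var(scores) / n)  # standard error of the mean

# 95% CI with the normal approximation (multiplier 1.96)
ci_normal <- mean(scores) + c(-1, 1) * 1.96 * se

# 95% CI with the t-distribution (multiplier qt(0.975, n - 1), about 2.262 for n = 10)
ci_t <- mean(scores) + c(-1, 1) * qt(0.975, df = n - 1) * se

ci_normal
ci_t
```

The t-based interval is the one that t.test(scores)$conf.int reports.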
Using the broom::tidy() function, we convert the t-test results into tibble format
We need the p-value that we want to show on the bar chart
We set the conditions that decide how the p-value is labeled
We use long format data here (df_menu)
potato_ttest <- t.test(fried_potato ~ mosmc, data = df_menu)
potato_tidy <- tidy(potato_ttest) %>%
  select(estimate, p.value, conf.low, conf.high) %>%
  mutate(
    p_label = case_when(
      p.value <= 0.01 ~ "p < .01",
      p.value > 0.01 & p.value <= 0.05 ~ "p < .05",
      p.value > 0.05 & p.value <= 0.1 ~ "p < .1",
      p.value > 0.1 ~ "N.S"
    )
  )
Merge potato_mean and potato_tidy by using the bind_cols() function
Name the merged data df_potato, relabel mosmc, and keep p_label only on the McDonalds row
df_potato <- bind_cols(potato_mean, potato_tidy) %>%
  mutate(
    mosmc = as.factor(if_else(mosmc == "mc",
                              "McDonalds", "Mos Burger")),
    p_label = if_else(mosmc == "McDonalds", p_label, NA_character_),
    menu = "Fried Potato"
  )
df_potato
# A tibble: 2 x 11
mosmc mean_out lwr upr mean_label estimate p.value conf.low conf.high
<fct> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 McDonalds 80.9 79.4 82.4 80.9 6.2 4.49e-4 3.23 9.17
2 Mos Burger 74.7 72.4 77 74.7 6.2 4.49e-4 3.23 9.17
# … with 2 more variables: p_label <chr>, menu <chr>
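The p_label rule used when building potato_tidy can be tried on a few p-values in isolation. The following is a minimal sketch that reproduces the same thresholds with base R's cut() instead of case_when(); the helper name p_to_label is hypothetical and is not part of this section's code:

```r
# Hypothetical helper: map p-values to the same labels as the case_when() rule
p_to_label <- function(p) {
  as.character(cut(p,
                   breaks = c(-Inf, 0.01, 0.05, 0.1, Inf),
                   labels = c("p < .01", "p < .05", "p < .1", "N.S"),
                   right = TRUE))  # intervals closed on the right, i.e. p <= 0.01 etc.
}

# p-values from this section's outputs plus two made-up ones (0.03, 0.07)
labels_out <- p_to_label(c(0.0004489, 0.03, 0.07, 0.7131))
labels_out
```

With right = TRUE, a p-value exactly equal to a threshold falls in the lower category, which matches the <= conditions of the original rule.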
pl_potato <- df_potato %>%
ggplot(aes(x = mosmc, y = mean_out, fill = mosmc)) +
geom_bar(stat = "identity") +
geom_errorbar(aes(ymin = lwr, ymax = upr, width = 0.3)) +
geom_label(aes(label = mean_label),
size = 7.5, position = position_stack(vjust = 0.5),
show.legend = F, fill = "white") +
geom_segment(aes(x = 1, y = 90, xend = 1, yend = 95)) +
geom_segment(aes(x = 1, y = 95, xend = 2, yend = 95)) +
geom_segment(aes(x = 2, y = 90, xend = 2, yend = 95)) +
geom_text(aes(x = 1.5, y = 100, label = p_label),
size = 4.5, family = "Times New Roman", inherit.aes = FALSE) +
scale_fill_manual(values = c("red", "green4")) +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 105)) +
labs(x = NULL, y = "Average of Evaluation",
title = "Comparing the Average Score of Fried Potato") +
theme(legend.position = "none",
plot.title = element_text(size = 12, hjust = 0.5),
axis.title = element_text(size = 13),
axis.text = element_text(size = 13))
pl_potato
df_potato <- df_potato %>%
  mutate(across(where(is.double), ~ round(.x, 1))) %>%
  mutate(
    diff_x = "difference",
    diff_label = format(round(estimate, 1),
                        nsmall = 1)
  )
df_potato
# A tibble: 2 x 13
mosmc mean_out lwr upr mean_label estimate p.value conf.low conf.high
<fct> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 McDonalds 80.9 79.4 82.4 80.9 6.2 0 3.2 9.2
2 Mos Burger 74.7 72.4 77 74.7 6.2 0 3.2 9.2
# … with 4 more variables: p_label <chr>, menu <chr>, diff_x <chr>,
# diff_label <chr>
pl_mean_poteto <- df_potato %>%
ggplot(aes(x = mosmc,
y = mean_out,
ymin = lwr,
ymax = upr)) +
geom_pointrange(size = 1) +
geom_text(aes(label = mean_label),
size = 6.5,
nudge_x = .13) +
ylim(70, 100) +
labs(x = NULL, y = NULL, title = "Average of Evaluation") +
theme(plot.title = element_text(hjust = 0.5, size = 16),
axis.text = element_text(size = 17),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
strip.background = element_blank(),
strip.text.y = element_blank())
pl_diff_poteto <- df_potato %>%
ggplot(aes(x = diff_x, y = estimate)) +
geom_hline(yintercept = 0, col = "red") +
geom_pointrange(aes(ymin = conf.low, ymax = conf.high), size = 1) +
geom_text(aes(label = diff_label),
size = 6.5,
nudge_x = .19) +
labs(x = NULL, y = NULL, title = "difference") +
theme(plot.title = element_text(hjust = 0.5, size = 16),
axis.text = element_text(size = 17),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
strip.text.y = element_text(size = 17),
strip.background = element_blank())
Using the patchwork package, show the results we get here
pl_mean_diff <- pl_mean_poteto + pl_diff_poteto + plot_layout(widths = c(3, 1))
pl_mean_diff
df_menu %>%
  ggplot(aes(mosmc, fried_potato, fill = mosmc)) +
  geom_violin() +
  geom_boxplot(width = .1) +
  stat_summary(fun = mean, geom = "point") + # fun replaces the deprecated fun.y
  scale_x_discrete(labels = c(mos = "Mos Burger", mc = "McDonald's")) + # named labels, so each label matches its group
  labs(x = "Shop Name", y = "Evaluation") +
  ggsignif::geom_signif(comparisons = combn(sort(unique(df_menu$mosmc)), 2, FUN = list),
                        test = "t.test", na.rm = TRUE,
                        step_increase = 0.1)
menu.csv is hypothetical survey data on Mos Burger and McDonald’s.
✔ You can download the data here: menu.csv
Q1: State null hypothesis
Q2: State alternative hypothesis
Q3: Conduct a t-test using t.test() function and show its result as shown in (1) Simple output in Section 3.
Q4: Explain the result in plain language
Q5: Visualize your result using boxplot and violin plot as shown in (2) Boxplot/Violin plot in Section 3.
Q6: Visualize your result using Bar chart as shown in (3) Bar chart in Section 3.
Q7: Visualize your result using the difference as shown in (4) Show the difference in Section 3.
test_score.csv is hypothetical data on two examinations conducted before and after the Methods of Social Survey (MSS) class at Waseda University.
What we want to know is whether or not the MSS class improves student test scores
✔ You can download the data here: test_score.csv
Q1: State null hypothesis
Q2: State alternative hypothesis
Q3: Conduct a t-test using t.test() function and show its result as shown in (1) Simple output in Section 3.
Q4: Explain the result in plain language
Q5: Visualize your result using boxplot and violin plot as shown in (2) Boxplot/Violin plot in Section 3.
Q6: Visualize your result using Bar chart as shown in (3) Bar chart in Section 3.
Q7: Visualize your result using the difference as shown in (4) Show the difference in Section 3.