# 1. Why do we conduct a t-test?

• A t-test is a statistical test that is used to compare the means of two groups.
• It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups differ from each other.
• The Central Limit Theorem states that even if the original variable is not normally distributed, the distribution of the sample mean converges to a normal distribution as the sample size increases.
→ This is the theoretical foundation that lets us conduct statistical inference and hypothesis tests using a normal distribution.
• However, we do not always have access to a large sample.
• The t-test offers a solution to this small-N problem in hypothesis testing.
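As a rough illustration of the Central Limit Theorem point above, the following sketch draws repeated samples from a clearly non-normal distribution and looks at how the sample means behave as the sample size grows. The exponential distribution with rate 1, the function names `sample_means` and `skewness`, the number of trials, and the seed are all assumptions made for this illustration.

```python
import random
import statistics


def sample_means(n, trials=2000, seed=0):
    """Return the means of `trials` samples of size n drawn from Exp(1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def skewness(values):
    """Sample skewness; it is close to 0 for a symmetric (e.g. normal) shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


# For each sample size: spread of the sample means and their skewness.
for n in (2, 30, 200):
    means = sample_means(n)
    print(n, round(statistics.stdev(means), 3), round(skewness(means), 3))
```

With larger n, the skewness of the sample means should move toward 0 and their spread should shrink roughly like 1/√n, even though the underlying data are strongly skewed.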

# 2. What type of t-test should we use?

When you conduct a t-test, you need to consider two things:
(1) whether the groups being compared come from a single population or two different populations
(2) whether you want to test the difference in a specific direction or both directions

#### One-sample, two-sample, or paired t-test?

• If the groups come from a single population (for example, measuring before and after a medical treatment), perform a paired t-test.
• If the groups come from two different populations (for example, comparing which hamburger tastes better between McDonald’s and In-N-Out Burger), perform a two-sample t-test.
• If there is one group being compared against a standard value (for example, comparing the average test score of your class to a known reference score), perform a one-sample t-test.
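To make the three cases concrete, here is a minimal sketch of the test statistic for each one, using only the Python standard library. All data values are hypothetical and invented for this example. The two-sample version uses Welch's form, which does not assume equal variances; that is one common choice and an assumption here, since the text does not say which variant it has in mind.

```python
import math
import statistics


def one_sample_t(xs, mu0):
    """t = (mean - mu0) / (s / sqrt(n)), with s the n-1 standard deviation."""
    n = len(xs)
    return (statistics.fmean(xs) - mu0) / (statistics.stdev(xs) / math.sqrt(n))


def paired_t(before, after):
    """Paired test: a one-sample test on the per-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


def welch_two_sample_t(xs, ys):
    """Two independent groups, not assuming equal variances (Welch's form)."""
    vx, vy = statistics.variance(xs), statistics.variance(ys)
    se = math.sqrt(vx / len(xs) + vy / len(ys))
    return (statistics.fmean(xs) - statistics.fmean(ys)) / se


# Hypothetical data, invented only to exercise the three functions.
scores = [82, 75, 90, 68, 85]      # one class, compared with a reference of 78
before = [72, 75, 78, 80, 69]      # same subjects before treatment
after = [70, 74, 75, 79, 66]       # same subjects after treatment
group_a = [7.1, 6.8, 7.4, 6.9]     # taste ratings for one restaurant
group_b = [6.2, 6.6, 6.0, 6.5]     # taste ratings for the other

print(one_sample_t(scores, 78))              # one group vs. a standard value
print(paired_t(before, after))               # same subjects, two measurements
print(welch_two_sample_t(group_a, group_b))  # two independent groups
```

Note that the paired test reduces to a one-sample test on the differences, which is why it needs the two measurements to be matched subject by subject.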

#### One-tailed or two-tailed t-test?

• If you only care whether the two populations are different from one another, perform a two-tailed t-test.
• If you want to know whether one population mean is greater than or less than the other, perform a one-tailed t-test.
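• A one-tailed t-test places the whole rejection region in one tail, so it rejects the null hypothesis more easily in the hypothesized direction but cannot detect a difference in the opposite direction.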
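The difference between the two kinds of test can be sketched with critical values from a standard t table. The values below are for 9 degrees of freedom at a significance level of 0.05; both the degrees of freedom and the significance level are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Critical values from a standard t table for 9 degrees of freedom, alpha = 0.05.
# The df and alpha are assumptions chosen for this illustration.
CRIT_TWO_TAILED = 2.262   # alpha split across both tails (0.025 each)
CRIT_ONE_TAILED = 1.833   # all of alpha in one tail


def reject_two_tailed(t):
    """Reject H0 if t is far from 0 in either direction."""
    return abs(t) > CRIT_TWO_TAILED


def reject_upper_tailed(t):
    """Reject H0 only if t is large in the hypothesized (positive) direction."""
    return t > CRIT_ONE_TAILED


for t in (2.0, -2.0):
    print(t, reject_two_tailed(t), reject_upper_tailed(t))
```

In this sketch a t value of 2.0 is significant under the upper-tailed test but not under the two-tailed test, while a t value of -2.0 is significant under neither. The one-tailed test gains sensitivity in one direction at the cost of ignoring the other.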

# 3. Statistical significance

| Term | Denoted as | Details |
| --- | --- | --- |
| Significance level | α | The probability of the study rejecting the null hypothesis |
| Significance probability | p-value | The probability of obtaining a result at least as extreme, given that the null hypothesis is true |

• In statistical hypothesis testing, a result has statistical significance when it is very unlikely to have occurred given the null hypothesis.
• The null hypothesis (often denoted $$H_0$$ ) is a default hypothesis that a quantity to be measured is zero (null).
• The alternative hypothesis (often denoted $$H_1$$ ) states that something is happening, that is, a new theory is preferred over the old one (the null hypothesis).
• A study’s defined significance level, denoted by $$α$$, is the probability of the study rejecting the null hypothesis, given that the null hypothesis is true.
• Significance probability, the p-value of a result ($$p$$), is the probability of obtaining a result at least as extreme, given that the null hypothesis is true.
• The result is statistically significant, by the standards of the study, when $$p≤α$$.
• The significance level for a study is chosen before data collection, and is typically set to 5% or much lower, depending on the field of study.
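The rule $$p≤α$$ can be sketched in code. The Python standard library has no t distribution, so this example integrates the Student's t density numerically with Simpson's rule; in practice a statistics library would normally be used instead. The function names, the default significance level of 0.05, and the example inputs are assumptions made for this illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) under H0, via Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # probability that |T| < |t|
    return 1 - central_area


def is_significant(p, alpha=0.05):
    """The decision rule from the text: significant when p <= alpha."""
    return p <= alpha


# Hypothetical inputs: a t value of 3.0 with 9 degrees of freedom.
p = two_tailed_p(3.0, 9)
print(round(p, 3), is_significant(p))
```

The same p-value can lead to different conclusions under different significance levels, which is why the level has to be fixed before looking at the data.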

## 3.1 Process of a t-test

• Let’s suppose you drew 10 numbers (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) from a population whose mean $$μ$$ is unknown.
• The sample mean = 5.5
(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10)/10 = 5.5
• What we want to know:
Whether or not the population mean is 5.
• To test this, we need to conduct a t-test.
Process of a t-test
• State the null hypothesis: $$H_0$$
• State the alternative hypothesis: $$H_1$$
• Calculate the t value
• Identify the critical values
• Check whether your t value falls within the rejection region
• Draw a conclusion
• Process of the t-test in this case

• $$H_0$$: the population mean = 5
• What we want to know is “The sample mean is 5.5. Given this result, is a population mean of 5 still plausible?”
• $$H_1$$: the population mean is not 5
• $$H_0$$ and $$H_1$$ are mutually exclusive
• Calculate the t value with the following equation:

$T = \frac{\bar{x} - μ_0}{SE} = \frac{\bar{x} - μ_0}{u_x / \sqrt{n}}$

• $$\bar{x}$$ : Sample mean (= 5.5)
• $$μ_0$$ : The hypothesized population mean under $$H_0$$ (= 5)
• $$n$$ : Sample size (= 10)
• $$u_x$$ : Sample standard deviation (the square root of the unbiased variance $$u_x^2$$)
• $$SE$$ : Standard error of the sample mean, $$u_x / \sqrt{n}$$

$$u_x^2$$ can be calculated with the following equation:

$u_x^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}$
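As a worked check for the ten values 1, 2, …, 10 with $$\bar{x}$$ = 5.5, the deviations from the mean come in symmetric pairs (±4.5, ±3.5, ±2.5, ±1.5, ±0.5), so the sum of squares can be written as:

$\sum_{i=1}^{10} (x_i - 5.5)^2 = 2\,(4.5^2 + 3.5^2 + 2.5^2 + 1.5^2 + 0.5^2) = 82.5$

$u_x^2 = \frac{82.5}{10 - 1} \approx 9.17, \qquad u_x = \sqrt{u_x^2} \approx 3.03$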

• $$u_x$$ = 3.03
• If we plug in $$μ_0$$ = 5 in the equation above, we get the following t value:

$T = \frac{\bar{x} - μ_0}{u_x / \sqrt{n}}$

$= \frac{{5.5} - 5}{3.03 / \sqrt{10}}$

$= 0.522$
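The calculation above can be reproduced with a short sketch using only the Python standard library. The critical value 2.262 is the two-tailed value from a standard t table for n − 1 = 9 degrees of freedom at a significance level of 0.05; that significance level is an assumption made here, since the text does not fix one for this example.

```python
import math
import statistics

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mu_0 = 5  # hypothesized population mean under H0

n = len(sample)
x_bar = statistics.fmean(sample)   # sample mean
u_x = statistics.stdev(sample)     # square root of the n-1 (unbiased) variance
se = u_x / math.sqrt(n)            # standard error of the mean
t_value = (x_bar - mu_0) / se

# Two-tailed critical value for 9 degrees of freedom at alpha = 0.05,
# taken from a standard t table; alpha = 0.05 is an assumption, not from the text.
critical = 2.262
reject_h0 = abs(t_value) > critical

print(round(u_x, 2), round(t_value, 3), reject_h0)  # 3.03 0.522 False
```

Under these assumptions the t value lies well inside the interval from −2.262 to 2.262, so the null hypothesis that the population mean is 5 would not be rejected.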

• We want to know whether or not the population mean is 5.

#### Point estimation

• Point estimation involves the use of sample data to calculate a single value (known as a point estimate) which is to serve as a “best guess” or “best estimate” of an unknown population parameter (for example, the population mean).
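• Here the sample mean, 5.5, is the point estimate of the population mean. The t value, 0.522, measures how far this estimate lies from the hypothesized value 5 in units of standard error, and we use it to judge whether a population mean of 5 is plausible.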