- A t-test is a statistical test that is used to compare the means of two groups.

- It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from each other.

- The Central Limit Theorem states that even if the original variable is not normally distributed, the distribution of sample means converges to a normal distribution as the sample size increases.

→ This is the theoretical foundation that lets us conduct statistical inference and hypothesis tests using a normal distribution.
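The convergence described above can be checked with a small simulation. This is a minimal sketch, not part of the original notes: the exponential distribution, the seed, the sample sizes, and the helper names `sample_means` and `share_within_1_96_se` are all assumptions chosen for illustration. If the sample means are close to normal, roughly 95% of them should fall within 1.96 standard errors of the true mean.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

TRUE_MEAN = 1.0  # mean of Exponential(rate=1)
TRUE_SD = 1.0    # standard deviation of Exponential(rate=1)

def sample_means(sample_size, n_samples=2000):
    """Draw n_samples samples from a skewed (exponential) population
    and return the mean of each sample."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

def share_within_1_96_se(means, sample_size):
    """Fraction of sample means within 1.96 standard errors of the true mean."""
    se = TRUE_SD / math.sqrt(sample_size)
    inside = sum(1 for m in means if abs(m - TRUE_MEAN) <= 1.96 * se)
    return inside / len(means)

for size in (2, 10, 50):
    means = sample_means(size)
    print(size, round(statistics.stdev(means), 3),
          round(share_within_1_96_se(means, size), 3))
```

The spread of the sample means shrinks roughly like \(1/\sqrt{n}\), and the share inside the 1.96 standard error band moves toward 0.95 as the sample size grows.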

- However, we do not always have access to a sizable data set.
- The t test offers a solution to this small-N problem in hypothesis testing.

When you conduct a t-test, you need to consider two things:

(1) whether the groups being compared come from a single population or two different populations

(2) whether you want to test the difference in a specific direction or both directions

- If the groups come from a single population (for example, measuring before and after a medical treatment), perform a **paired t-test**.
- If the groups come from two different populations (for example, comparing which hamburger tastes better between McDonald’s and In-N-Out Burger), perform a **two-sample t-test**.
- If there is one group being compared against a standard value (for example, comparing your test score to the average test score of your friends in the same class), perform a **one-sample t-test**.
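The three designs above differ only in how the t statistic is built. The following is a minimal sketch using Python's standard library; the function names and all data values are hypothetical, and the two-sample version uses Welch's form (no equal-variance assumption), which is one common choice rather than the only one.

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """t statistic for one group compared against a standard value mu0."""
    se = statistics.stdev(xs) / math.sqrt(len(xs))
    return (statistics.mean(xs) - mu0) / se

def paired_t(before, after):
    """Paired t statistic: a one-sample test on the pairwise differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)

def welch_t(xs, ys):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    se = math.sqrt(statistics.variance(xs) / len(xs)
                   + statistics.variance(ys) / len(ys))
    return (statistics.mean(xs) - statistics.mean(ys)) / se

# Hypothetical measurements, for illustration only.
before = [120, 132, 128, 141, 135]   # same subjects before treatment
after = [115, 130, 121, 138, 129]    # same subjects after treatment
burger_a = [7, 8, 6, 9, 7]           # taste scores from one group of tasters
burger_b = [6, 7, 8, 8, 9]           # taste scores from a different group

print(paired_t(before, after))
print(welch_t(burger_a, burger_b))
print(one_sample_t([72, 85, 90, 66, 78], 80))
```

Note that the paired test reduces to a one-sample test on the differences, which is why it needs the two measurements to be matched item by item.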

- If you only care whether the two populations are different from one another, perform a **two-tailed t-test**.

- If you want to know whether one population mean is greater than or less than the other, perform a **one-tailed t-test**.

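- At the same significance level, a **one-tailed t-test** places the entire rejection region in one tail, so it can detect an effect only in the hypothesized direction and never in the opposite one.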

| | Denoted as | Details |
|---|---|---|
| significance level | α | The probability of the study rejecting the null hypothesis, given that the null hypothesis is true |
| significance probability | p-value | The probability of obtaining a result at least as extreme, given that the null hypothesis is true |

- In statistical hypothesis testing, a result has **statistical significance** when it is very unlikely to have occurred given the null hypothesis.

- The **null hypothesis** (often denoted \(H_0\)) is a default hypothesis that a quantity to be measured is zero (null).

- The **alternative hypothesis** (often denoted \(H_1\)) is the position that something is happening, that is, that a new theory is preferred over the old one (the null hypothesis).

- A study’s defined **significance level**, denoted by \(α\), is the probability of the study rejecting the null hypothesis, given that the null hypothesis is true.

- **Significance probability**, the p-value of a result (\(p\)), is the probability of obtaining a result at least as extreme, given that the null hypothesis is true.

- The result is **statistically significant**, by the standards of the study, when \(p≤α\).

- The significance level for a study is chosen before data collection, and is typically set to 5% or much lower, depending on the field of study.
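The rule \(p≤α\) can be made concrete by computing a p-value from a t statistic. The sketch below is an illustration under stated assumptions: it integrates the Student's t density numerically with Simpson's rule using only the standard library, whereas a statistics package would normally supply the distribution function. The function names, the example t value of 2.5, and the 9 degrees of freedom are all hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) under H0, using Simpson's rule on [0, |t|].
    steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t| < T < |t|), by symmetry
    return 1 - central_area

def one_tailed_p(t, df, greater=True):
    """P(T >= t) if greater is True, otherwise P(T <= t), under H0."""
    half = two_tailed_p(t, df) / 2
    if greater:
        return half if t >= 0 else 1 - half
    return half if t <= 0 else 1 - half

alpha = 0.05          # significance level, fixed before looking at the data
t_value, df = 2.5, 9  # hypothetical test statistic and degrees of freedom

p_two = two_tailed_p(t_value, df)
p_one = one_tailed_p(t_value, df, greater=True)
print("two-tailed:", p_two, "significant:", p_two <= alpha)
print("one-tailed:", p_one, "significant:", p_one <= alpha)
```

When the observed effect lies in the hypothesized direction, the one-tailed p-value is half the two-tailed one; when it lies in the opposite direction, the one-tailed p-value exceeds 0.5.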

`t test`

- Let’s suppose you drew 10 numbers (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) from the population (population mean \(μ\) = 5.5)

- The sample mean = 5.5

(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10)/10 = 5.5

- What we want to know:

Whether or not the population mean is 5.

- To check this, we need to conduct a t test.

- `t test`
- `null hypothesis`: \(H_0\)
- `alternative hypothesis`: \(H_1\)
- `t value`
- `critical values`
- `t value` is within the rejection area

**Process of t test in this case**

- \(H_0\): the population mean = 5

- What we want to know is “The sample mean is 5.5. With this result, can we conclude that the population mean is 5?”

- \(H_1\): the population mean is not 5

- \(H_0\) and \(H_1\) are `mutually exclusive`

- Calculate the `t value` with the following equation:

\[T = \frac{\bar{x} - μ_0}{SE} = \frac{\bar{x} - μ_0}{u_x / \sqrt{n}}\]

- \(\bar{x}\) : Sample mean (= 5.5)

- \(μ_0\) : The hypothesized population mean under \(H_0\) (= 5)

- \(n\) : Sample size (= 10)

- \(u_x\) : Sample standard deviation (the square root of the unbiased variance \(u_x^2\))

- \(SE\) : `standard error`, equal to \(u_x / \sqrt{n}\)

★ \(u_x^2\) can be calculated with the following equation:

\[u_x^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}\]

- \(u_x\) = 3.03

- If we plug in \(μ_0\) = 5 in the equation above, we get the following t value:

\[T = \frac{\bar{x} - μ_0}{u_x / \sqrt{n}}\]

\[ = \frac{{5.5} - 5}{3.03 / \sqrt{10}}\]

\[ = 0.522\]
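The arithmetic above can be reproduced in a few lines. This is a minimal sketch using Python's standard library; the variable names are chosen here to mirror the symbols in the equations and are not from the original notes. `statistics.stdev` uses the \(n-1\) denominator, so it matches \(u_x\).

```python
import math
import statistics

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mu_0 = 5  # hypothesized population mean under H0

x_bar = statistics.mean(sample)     # sample mean
u_x = statistics.stdev(sample)      # square root of the unbiased variance
se = u_x / math.sqrt(len(sample))   # standard error
t_value = (x_bar - mu_0) / se       # T = (x_bar - mu_0) / SE

print(round(x_bar, 2), round(u_x, 2), round(t_value, 3))  # 5.5 3.03 0.522
```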

**We want to know whether or not the population mean is 5.**

- Point estimation involves the use of sample data to calculate a single value (known as a point estimate) which is to serve as a “best guess” or “best estimate” of an unknown population parameter (for example, the population mean).

- We use the sample mean, 5.5, as the point estimate, and use the t value computed from it, 0.522, to infer whether the population mean is 5.
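The remaining step is to compare the t value with the critical values. The sketch below assumes a two-tailed test at \(α\) = 0.05 with \(n - 1\) = 9 degrees of freedom, for which a standard t table gives a critical value of about 2.262; both the significance level and the helper name `reject_h0` are assumptions made for illustration, not choices stated in the notes.

```python
t_value = 0.522    # t value from the worked example
critical = 2.262   # two-tailed critical value, alpha = 0.05, 9 degrees of freedom (t table)

def reject_h0(t, crit):
    """Return True when t falls in the two-tailed rejection area."""
    return abs(t) > crit

print(reject_h0(t_value, critical))  # False
```

Because 0.522 lies well inside the interval from -2.262 to 2.262, the t value is not within the rejection area, so under these assumptions \(H_0\) (population mean = 5) is not rejected. This does not prove the population mean is 5; it only means the sample gives no significant evidence against it.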