INTRODUCTION TO APPLIED STATISTICS

Tutorial Sheet 6

DATE : October 26, 2018

Question 1: Define or briefly explain each of the following terms:
a) Null hypothesis
b) Alternative hypothesis
c) Critical point(s)
d) Significance level
e) Non-rejection region or Acceptance region
f) Rejection region
g) Test statistic
h) Type I error
i) Type II error
j) P-value
k) Left-tailed test
l) Right-tailed test
m) Two-tailed (double-tailed) test

Answer:
a) Null hypothesis ( $H_0$ ): A statement of no effect, no difference, or status quo. It is the hypothesis tested and assumed true until evidence contradicts it.
b) Alternative hypothesis ( $H_1$ or $H_a$ ): A statement that contradicts $H_0$ , representing the effect, difference, or change the test aims to detect.
c) Critical point(s): Value(s) on the test distribution that separate the rejection region from the non-rejection region. Determined by $\alpha$ and the sampling distribution.
d) Significance level ( $\alpha$ ): The probability of rejecting $H_0$ when it is true (Type I error). Common values are 0.05, 0.01, or 0.10.
e) Non-rejection region (Acceptance region): Range of test statistic values where $H_0$ is not rejected. If the test statistic falls here, we fail to reject $H_0$ .
f) Rejection region: Range of test statistic values where $H_0$ is rejected. If the test statistic falls here, we reject $H_0$ .
g) Test statistic: A standardized value calculated from sample data, used to decide whether to reject $H_0$ (e.g., $z$ , $t$ , $\chi^2$ ).
h) Type I error: Rejecting $H_0$ when it is true. Probability is $\alpha$ .
i) Type II error: Failing to reject $H_0$ when it is false. Probability is $\beta$ .
j) P-value: The probability of obtaining a test statistic as extreme as or more extreme than the observed value, assuming $H_0$ is true. If $p \leq \alpha$ , reject $H_0$ .
k) Left-tailed test: Test where $H_1$ states a parameter is less than a value. Rejection region is in the left tail.
l) Right-tailed test: Test where $H_1$ states a parameter is greater than a value. Rejection region is in the right tail.
m) Double-tailed test (two-tailed): Test where $H_1$ states a parameter is not equal to a value. Rejection region is split between both tails.

Question 2: In a statistical test, we have a choice of a left-tailed test, a right-tailed test, or a two-tailed test. Is it the null hypothesis or the alternate hypothesis that determines which type of test is used? Explain your answer.

Answer:
The alternative hypothesis determines the type of test.

Explanation:

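The null hypothesis ( $H_0$ ) always contains the equality (e.g., $\mu = \mu_0$ , or $\mu \leq \mu_0$ in a one-sided setting), so it is tested at the boundary value $\mu_0$ and does not fix the tail.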
The alternative hypothesis ( $H_1$ ) defines the direction of the test:
- If $H_1: \mu < \mu_0$ → left-tailed test.
- If $H_1: \mu > \mu_0$ → right-tailed test.
- If $H_1: \mu \neq \mu_0$ → two-tailed test.
We design the test (rejection region) based on $H_1$ because we seek evidence against $H_0$ in favor of $H_1$ .
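
A minimal sketch of how the tail named in $H_1$ fixes the rejection region, using the standard normal distribution from Python's standard library. The function name `z_critical` and its `tail` labels are illustrative and not part of the tutorial.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the critical z value(s) for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":      # H1: parameter < value
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":     # H1: parameter > value
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":       # H1: parameter != value, alpha split across both tails
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in z_critical(0.05, tail)])
```

The same $\alpha$ gives different cut-offs depending only on the form of $H_1$ ; $H_0$ supplies the boundary value at which the statistic is computed.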

Question 3: If we fail to reject (i.e., “accept”) the null hypothesis, does this mean that we have proved it to be true beyond all doubt? Explain your answer.

Answer:
No. Failing to reject $H_0$ does not prove it is true beyond doubt.

Explanation:

Hypothesis testing provides evidence against $H_0$ , not proof for it.
Failing to reject $H_0$ means insufficient evidence to support $H_1$ , but it does not confirm $H_0$ .
Possible reasons: Small sample size, high variability, or a true effect too small to detect. There is always a risk of Type II error ( $\beta$ ).
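
To make the Type II risk concrete, the sketch below computes $\beta$ for a two-tailed $z$ -test under assumed conditions. All numbers ( $\mu_0 = 100$ , true mean 102, $\sigma = 10$ ) are hypothetical and chosen only for illustration; `type_ii_error` is an illustrative helper, not something defined in the tutorial.

```python
import math
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for a two-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true mean and mu0, measured in standard errors.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    # H0 is not rejected when the z statistic lands between -z_crit and +z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical scenario: H0 is false, yet small samples usually fail to reject it.
for n in (25, 100, 400):
    beta = type_ii_error(mu0=100, mu_true=102, sigma=10, n=n)
    print(f"n = {n}: beta = {beta:.3f}")
```

With a small sample, $H_0$ is retained most of the time even though it is false, which is why failing to reject cannot be read as proof.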

Question 4: Let X be a random variable that represents the pH of arterial plasma (i.e., acidity of the blood). For healthy adults, the mean of the X distribution is $\mu = 7.4$ (Reference: Merck Manual, a commonly used reference in medical schools and nursing programs). A new drug for arthritis has been developed. However, it is thought that this drug may change blood pH. A random sample of 31 patients with arthritis took the drug for 3 months. Blood tests showed that $\bar{x} = 8.1$ with sample standard deviation $s = 1.9$ . Use a 5% level of significance to test the claim that the drug has changed (either way) the mean pH level of the blood.

Answer:
Step 1: Hypotheses

$H_0: \mu = 7.4$ (no change)
$H_1: \mu \neq 7.4$ (change; two-tailed test)

Step 2: Test statistic
Population standard deviation unknown → use the $t$ -test with $s$ ; since $n = 31 > 30$ , the sampling distribution of $\bar{x}$ is approximately normal.
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{8.1 - 7.4}{1.9 / \sqrt{31}} = \frac{0.7}{1.9 / 5.56776} = \frac{0.7}{0.3413} \approx 2.051$
Degrees of freedom (df) = $n - 1 = 30$ .

Step 3: Critical value
$\alpha = 0.05$ , two-tailed → $\alpha/2 = 0.025$ .
Critical $t$ -values: $t_{0.025, 30} = \pm 2.042$ .

Step 4: Decision
$|t| = 2.051 > 2.042$ → Reject $H_0$ .

Step 5: Conclusion
At 5% significance, there is sufficient evidence that the drug changed the mean blood pH.
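
A short arithmetic check of the steps above, assuming only the summary statistics given in the question. The critical value 2.042 is the table value quoted in Step 3; the standard library has no $t$ distribution, so it is entered by hand.

```python
import math

mu0, x_bar, s, n = 7.4, 8.1, 1.9, 31
t_crit = 2.042  # table value t_{0.025, 30}, copied from Step 3

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu0) / standard_error
reject = abs(t_stat) > t_crit  # two-tailed decision rule

print(f"SE = {standard_error:.4f}, t = {t_stat:.3f}, df = {n - 1}")
print("reject H0:", reject)
```

The statistic exceeds the critical value only narrowly, so the decision is sensitive to rounding in $s$ and $\bar{x}$ .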

Question 5: Total blood volume (in ml) per body weight (in kg) is important in medical research. For healthy adults, the red blood cell volume mean is about $\mu = 28 \, \text{ml/kg}$ . Red blood cell volume that is too low or too high can indicate a medical problem. Suppose that Graham has had seven blood tests, and the red blood cell volumes were
$\begin{array}{ccccccc} 32 & 25 & 41 & 35 & 30 & 37 & 29 \\ \end{array}$
Assume that Graham’s red blood cell volume has a normal distribution with $\sigma = 4.75$ . Do the data indicate that Graham’s red blood cell volume is different from that of healthy adults? Use a 0.01 level of significance.

Answer:
Step 1: Hypotheses

$H_0: \mu = 28$
$H_1: \mu \neq 28$ (two-tailed test)

Step 2: Sample mean
Data: $32, 25, 41, 35, 30, 37, 29$
$\bar{x} = \frac{32 + 25 + 41 + 35 + 30 + 37 + 29}{7} = \frac{229}{7} \approx 32.714$

Step 3: Test statistic
$\sigma$ known, normal distribution → $z$ -test.
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{32.714 - 28}{4.75 / \sqrt{7}} = \frac{4.714}{4.75 / 2.64575} = \frac{4.714}{1.795} \approx 2.626$

Step 4: Critical value
$\alpha = 0.01$ , two-tailed → $z_{0.005} = \pm 2.576$ .

Step 5: Decision
$|z| = 2.626 > 2.576$ → Reject $H_0$ .

Step 6: Conclusion
At 1% significance, there is sufficient evidence that Graham's mean red blood cell volume differs from that of healthy adults.
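
The same test can be run from the raw data, here with a two-tailed P-value added as a cross-check of the critical-value decision. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist, fmean

volumes = [32, 25, 41, 35, 30, 37, 29]  # Graham's seven test results (ml/kg)
mu0, sigma, alpha = 28, 4.75, 0.01

x_bar = fmean(volumes)
z = (x_bar - mu0) / (sigma / math.sqrt(len(volumes)))
# Two-tailed P-value: probability of a statistic at least this far from 0 in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"mean = {x_bar:.3f}, z = {z:.3f}, P-value = {p_value:.4f}")
print("reject H0:", p_value <= alpha)
```

The P-value rule ( $p \leq \alpha$ ) and the critical-value rule always give the same decision for the same $\alpha$ .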

Question 6: A tobacco company advertises that the average nicotine content of its cigarettes is at most 14 milligrams. A consumer protection agency wants to determine whether the average nicotine content is in fact greater than 14. A random sample of 300 cigarettes of the company’s brand yields an average nicotine content of 14.6 milligrams and a standard deviation of 3.8 milligrams. If $\alpha = 0.01$ , is there significant evidence to support the agency’s claim?

Answer:
Step 1: Hypotheses

$H_0: \mu \leq 14$ (company's claim)
$H_1: \mu > 14$ (agency's claim; right-tailed test)

Step 2: Test statistic
Population standard deviation unknown, but $n = 300$ is large, so $s$ is a good estimate of $\sigma$ → $z$ -test.
$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{14.6 - 14}{3.8 / \sqrt{300}} = \frac{0.6}{3.8 / 17.3205} = \frac{0.6}{0.2194} \approx 2.735$

Step 3: Critical value
$\alpha = 0.01$ , right-tailed → $z_{0.01} = 2.326$ .

Step 4: Decision
$z = 2.735 > 2.326$ → Reject $H_0$ .

Step 5: Conclusion
At 1% significance, there is evidence supporting the agency's claim that average nicotine > 14 mg.
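
A sketch of the right-tailed decision in which the critical value is computed from the inverse normal CDF instead of being read from a table. It assumes the large-sample $z$ approximation used in the solution.

```python
import math
from statistics import NormalDist

mu0, x_bar, s, n, alpha = 14, 14.6, 3.8, 300, 0.01

z = (x_bar - mu0) / (s / math.sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)  # right tail holds area alpha

print(f"z = {z:.3f}, critical value = {z_crit:.3f}")
print("reject H0:", z > z_crit)
```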

Question 7: Diltiazem is a commonly prescribed drug for hypertension. However, diltiazem causes headaches in about 12% of patients using the drug. It is hypothesized that regular exercise might help reduce the headaches. If a random sample of 209 patients using diltiazem exercised regularly and only 16 had headaches, would this indicate a reduction in the population proportion of patients having headaches? Use a 1% level of significance.

Answer:
Step 1: Hypotheses

$H_0: p = 0.12$ (no reduction)
$H_1: p < 0.12$ (reduction; left-tailed test)

Step 2: Test statistic
Sample proportion $\hat{p} = 16/209 \approx 0.0766$ .
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.0766 - 0.12}{\sqrt{\frac{0.12 \times 0.88}{209}}} = \frac{-0.0434}{\sqrt{0.000505}} = \frac{-0.0434}{0.02248} \approx -1.93$

Step 3: Critical value
$\alpha = 0.01$ , left-tailed → $z_{0.01} = -2.326$ .

Step 4: Decision
$z = -1.93 > -2.326$ → Fail to reject $H_0$ .

Step 5: Conclusion
At 1% significance, there is insufficient evidence that exercise reduces headaches.
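
A sketch of the one-proportion $z$ -test at full precision. The check that $np_0$ and $n(1 - p_0)$ are both at least 5 is a common rule of thumb for the normal approximation; it is an added assumption, not a step stated in the tutorial.

```python
import math
from statistics import NormalDist

n, headaches, p0, alpha = 209, 16, 0.12, 0.01

# Rule-of-thumb check that the normal approximation is reasonable (assumption).
approximation_ok = n * p0 >= 5 and n * (1 - p0) >= 5

p_hat = headaches / n
standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0, the value under H0
z = (p_hat - p0) / standard_error
z_crit = NormalDist().inv_cdf(alpha)  # left tail holds area alpha

print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, critical value = {z_crit:.3f}")
print("normal approximation ok:", approximation_ok)
print("reject H0:", z < z_crit)
```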

Question 8: Let Y be a random variable that represents red blood cell count (RBC) in millions of cells per cubic millimeter of whole blood. Then Y has a distribution that is approximately normal. For the population of healthy female adults, the mean of the Y distribution is about 4.8 (based on information from Diagnostic Tests with Nursing Implications, Springhouse Corporation). Suppose that a female patient has taken six laboratory blood tests over the past several months and that the RBC count data sent to the patient’s doctor are
$\begin{array}{cccccc} 4.9 & 4.2 & 4.5 & 4.1 & 4.4 & 4.3 \\ \end{array}$
Do the given data indicate that the population mean RBC count for this patient is lower than that of the female adults? Use $\alpha = 0.05$

Answer:
Step 1: Hypotheses

$H_0: \mu = 4.8$
$H_1: \mu < 4.8$ (left-tailed test)

Step 2: Sample mean and SD
Data: $4.9, 4.2, 4.5, 4.1, 4.4, 4.3$
$\bar{x} = \frac{4.9 + 4.2 + 4.5 + 4.1 + 4.4 + 4.3}{6} = \frac{26.4}{6} = 4.4$
Variance: $\frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{(0.5)^2 + (-0.2)^2 + (0.1)^2 + (-0.3)^2 + (0.0)^2 + (-0.1)^2}{5} = \frac{0.4}{5} = 0.08$
$s = \sqrt{0.08} \approx 0.2828$

Step 3: Test statistic
$\sigma$ unknown, $n = 6$ → $t$ -test.
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{4.4 - 4.8}{0.2828 / \sqrt{6}} = \frac{-0.4}{0.2828 / 2.449} = \frac{-0.4}{0.11547} \approx -3.464$
df = $5$ .

Step 4: Critical value
$\alpha = 0.05$ , left-tailed → $t_{0.05, 5} = -2.015$ .

Step 5: Decision
$t = -3.464 < -2.015$ → Reject $H_0$ .

Step 6: Conclusion
At 5% significance, there is sufficient evidence that the patient's mean RBC count is lower than that of healthy female adults.
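
A sketch of the same left-tailed $t$ -test computed directly from the six readings. `statistics.stdev` divides by $n - 1$ , matching the variance formula in Step 2; the critical value is the table value quoted in Step 4.

```python
import math
import statistics

rbc = [4.9, 4.2, 4.5, 4.1, 4.4, 4.3]
mu0 = 4.8
t_crit = -2.015  # table value for alpha = 0.05, df = 5, left tail (from Step 4)

n = len(rbc)
x_bar = statistics.mean(rbc)
s = statistics.stdev(rbc)  # sample standard deviation (n - 1 in the denominator)
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

print(f"mean = {x_bar:.3f}, s = {s:.4f}, t = {t_stat:.3f}, df = {n - 1}")
print("reject H0:", t_stat < t_crit)
```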

Question 9: Let X be a random variable that represents hemoglobin count (HC) in grams per 100 milliliters of whole blood. Then X has a distribution that is approximately normal, with population mean of about 14 for healthy adult women. Suppose that a female patient has taken 10 laboratory blood tests during the past year. The HC data sent to the patient’s doctor are
$\begin{array}{cccccccccc} 15 & 18 & 16 & 19 & 14 & 12 & 14 & 17 & 15 & 11 \\ \end{array}$
Does this information indicate that the population average HC for this patient is higher than that for healthy adult women? Use $\alpha = 0.05$

Answer:
Step 1: Hypotheses

$H_0: \mu = 14$
$H_1: \mu > 14$ (right-tailed test)

Step 2: Sample mean and SD
Data: $15, 18, 16, 19, 14, 12, 14, 17, 15, 11$
$\bar{x} = \frac{151}{10} = 15.1$
Variance: $\frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{(-0.1)^2 + (2.9)^2 + \dots + (-4.1)^2}{9} = \frac{56.9}{9} \approx 6.322$
$s = \sqrt{6.322} \approx 2.514$

Step 3: Test statistic
$t = \frac{15.1 - 14}{2.514 / \sqrt{10}} = \frac{1.1}{2.514 / 3.162} = \frac{1.1}{0.7951} \approx 1.383$
df = $9$ .

Step 4: Critical value
$\alpha = 0.05$ , right-tailed → $t_{0.05, 9} = 1.833$ .

Step 5: Decision
$t = 1.383 < 1.833$ → Fail to reject $H_0$ .

Step 6: Conclusion
At 5% significance, there is insufficient evidence that the patient's HC is higher.
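
The sum of squared deviations is the step where hand arithmetic most easily slips, so the sketch below recomputes it exactly with `fractions.Fraction` before forming the statistic. The critical value is the table value quoted in Step 4.

```python
import math
from fractions import Fraction

hc = [15, 18, 16, 19, 14, 12, 14, 17, 15, 11]
mu0 = 14
t_crit = 1.833  # table value for alpha = 0.05, df = 9, right tail (from Step 4)

n = len(hc)
mean = Fraction(sum(hc), n)                      # exact sample mean
sum_sq_dev = sum((x - mean) ** 2 for x in hc)    # exact sum of squared deviations
variance = sum_sq_dev / (n - 1)
s = math.sqrt(variance)
t_stat = (float(mean) - mu0) / (s / math.sqrt(n))

print(f"mean = {float(mean)}, sum of squared deviations = {float(sum_sq_dev)}")
print(f"s = {s:.3f}, t = {t_stat:.3f}, df = {n - 1}")
print("reject H0:", t_stat > t_crit)
```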

Question 10: A photocopying machine at Apex must be repaired if it produces more than 10% defective prints among a large lot of papers it photocopies in a day. A random sample of 100 papers from a day’s photocopies contain 15 defective papers and the head of examinations says that the machine must be repaired. Does the sample evidence support her decision? Use $\alpha = 0.01$

Answer:
Step 1: Hypotheses

$H_0: p \leq 0.10$ (no repair needed)
$H_1: p > 0.10$ (repair needed; right-tailed test)

Step 2: Test statistic
$\hat{p} = 15/100 = 0.15$
$z = \frac{0.15 - 0.10}{\sqrt{\frac{0.10 \times 0.90}{100}}} = \frac{0.05}{\sqrt{0.0009}} = \frac{0.05}{0.03} \approx 1.667$

Step 3: Critical value
$\alpha = 0.01$ , right-tailed → $z_{0.01} = 2.326$ .

Step 4: Decision
$z = 1.667 < 2.326$ → Fail to reject $H_0$ .

Step 5: Conclusion
At 1% significance, there is insufficient evidence that the defect rate exceeds 10%, so the sample does not support her decision to repair the machine.
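
As a cross-check, the sketch below compares the normal-approximation P-value with an exact binomial tail probability. The exact binomial calculation is a different technique from the $z$ -test used in the solution and is included only to show that both lead to the same decision here.

```python
import math
from statistics import NormalDist

n, defects, p0, alpha = 100, 15, 0.10, 0.01

# Normal approximation, as in the worked solution.
p_hat = defects / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_normal = 1 - NormalDist().cdf(z)

# Exact tail: P(X >= 15) when X ~ Binomial(100, 0.10), i.e. at the boundary of H0.
p_exact = sum(
    math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(defects, n + 1)
)

print(f"z = {z:.3f}, normal-approx P = {p_normal:.4f}, exact P = {p_exact:.4f}")
print("reject H0:", p_exact <= alpha)
```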

Question 11: A study was conducted of 90 adult male patients following a new treatment for congestive heart failure. One of the variables measured on the patients was the increase in exercise capacity (in minutes) over a 4-week treatment period. The previous treatment regime had produced an average increase of $\mu = 2$ minutes. The researchers wanted to evaluate whether the new treatment had increased the value of $\mu$ in comparison to the previous treatment. The data yielded $\bar{x} = 2.17$ and $s = 1.05$ . Using $\alpha = 0.05$ , what conclusions can you draw about the research hypothesis?

Answer:
Step 1: Hypotheses

$H_0: \mu \leq 2$ (no increase)
$H_1: \mu > 2$ (increase; right-tailed test)

Step 2: Test statistic
Population standard deviation unknown, but $n = 90 > 30$ is large, so $s$ is used in place of $\sigma$ → $z$ -test.
$z = \frac{2.17 - 2}{1.05 / \sqrt{90}} = \frac{0.17}{1.05 / 9.4868} = \frac{0.17}{0.1107} \approx 1.536$

Step 3: Critical value
$\alpha = 0.05$ , right-tailed → $z_{0.05} = 1.645$ .

Step 4: Decision
$z = 1.536 < 1.645$ → Fail to reject $H_0$ .

Step 5: Conclusion
At 5% significance, there is insufficient evidence that the new treatment increases exercise capacity.
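
The same decision can be reached with a one-tailed P-value, which also shows how close the result is to the 5% threshold. A sketch assuming the large-sample $z$ approximation used above:

```python
import math
from statistics import NormalDist

mu0, x_bar, s, n, alpha = 2, 2.17, 1.05, 90, 0.05

z = (x_bar - mu0) / (s / math.sqrt(n))
p_value = 1 - NormalDist().cdf(z)  # right-tailed: area above the observed z

print(f"z = {z:.3f}, one-tailed P-value = {p_value:.4f}")
print("reject H0:", p_value <= alpha)
```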

Question 12: Two competing models are under consideration. Ten stoves of the first model and 12 stoves of the second model are tested. The following results are obtained.
Model 1: Mean time $\bar{x}_1 = 11.4 \, \text{min}, \, \sigma_1 = 2.5 \, \text{min}, \, n_1 = 10$
Model 2: Mean time $\bar{x}_2 = 9.9 \, \text{min}, \, \sigma_2 = 3.0 \, \text{min}, \, n_2 = 12$
Assume that the time required to bring water to a boil is normally distributed for each stove. Is there any difference (either way) between the performances of these two models? Use a 5% level of significance.

Answer:
Step 1: Hypotheses

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$ (two-tailed test)

Step 2: Test statistic
$\sigma_1, \sigma_2$ known → $z$ -test.
$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{11.4 - 9.9}{\sqrt{\frac{6.25}{10} + \frac{9}{12}}} = \frac{1.5}{\sqrt{0.625 + 0.75}} = \frac{1.5}{\sqrt{1.375}} \approx \frac{1.5}{1.1726} \approx 1.279$

Step 3: Critical value
$\alpha = 0.05$ , two-tailed → $z_{0.025} = \pm 1.96$ .

Step 4: Decision
$|z| = 1.279 < 1.96$ → Fail to reject $H_0$ .

Step 5: Conclusion
At 5% significance, there is insufficient evidence of a difference in mean boiling time between the two models.
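
A sketch of the two-sample $z$ -test with known population standard deviations, following the formula in Step 2. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mean1, sigma1, n1 = 11.4, 2.5, 10  # Model 1
mean2, sigma2, n2 = 9.9, 3.0, 12   # Model 2
alpha = 0.05

# Standard error of the difference between two independent sample means.
standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = (mean1 - mean2) / standard_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed

print(f"SE = {standard_error:.4f}, z = {z:.3f}, critical values = ±{z_crit:.3f}")
print("reject H0:", abs(z) > z_crit)
```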

Question 13: Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B headache remedies. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The lengths of time in minutes for the drugs to reach a specified level in the blood were recorded. The means, standard deviations, and sizes of the two samples follow.
Brand A: $\bar{x}_1 = 21.8 \, \text{min}; \, s_1 = 8.7 \, \text{min}; \, n_1 = 12$
Brand B: $\bar{x}_2 = 18.9 \, \text{min}; \, s_2 = 7.5 \, \text{min}; \, n_2 = 12$
Past experience with the drug composition of the two remedies permits researchers to assume that both distributions are approximately normal. Let us use a 5% level of significance to test the claim that there is no difference in the mean time required for bodily absorption. Also, find or estimate the P-value of the sample test statistic.

Answer:
Step 1: Hypotheses

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$ (two-tailed test)

Step 2: Test statistic
Variance ratio: $s_1^2/s_2^2 = 75.69/56.25 \approx 1.345 < 2$ → assume equal variances. Pooled variance:

$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2} = \frac{11 \times 75.69 + 11 \times 56.25}{22} = \frac{1451.34}{22} \approx 65.97$

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{21.8 - 18.9}{\sqrt{65.97} \sqrt{\frac{1}{12} + \frac{1}{12}}} = \frac{2.9}{8.122 \times \sqrt{1/6}} \approx \frac{2.9}{8.122 \times 0.4082} \approx \frac{2.9}{3.315} \approx 0.875$

df = $n_1 + n_2 - 2 = 22$ .

Step 3: Critical value
$\alpha = 0.05$ , two-tailed → $t_{0.025, 22} = \pm 2.074$ .

Step 4: Decision
$|t| = 0.875 < 2.074$ → Fail to reject $H_0$ .

P-value estimate
For $t = 0.875$ , df = 22:

From the t-table, $t_{0.20, 22} = 0.858$ , i.e., $P(T > 0.858) = 0.20$ .
Since 0.875 is slightly larger than 0.858, $P(T > 0.875)$ is slightly below 0.20.
The two-tailed P-value is therefore slightly below $2 \times 0.20 = 0.40$ .
(Exact P-value ≈ 0.39.)

Conclusion
P-value > 0.05 → Fail to reject $H_0$ . There is insufficient evidence of a difference in mean absorption time.
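
The table only brackets the P-value, so the sketch below estimates it numerically. Python's standard library has no $t$ distribution, so the tail area is obtained by integrating the Student $t$ density with Simpson's rule; `t_pdf` and `t_upper_tail` are illustrative helpers written for this check, not library functions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area above 0 is 0.5.
    return 0.5 - total * h / 3


n1, mean1, s1 = 12, 21.8, 8.7  # Brand A
n2, mean2, s2 = 12, 18.9, 7.5  # Brand B

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
p_two_tailed = 2 * t_upper_tail(abs(t_stat), df)

print(f"pooled variance = {pooled_var:.2f}, t = {t_stat:.3f}, df = {df}")
print(f"two-tailed P-value = {p_two_tailed:.3f}")
```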

Question 14: A random sample of $n_1 = 228$ students registered in the Pre-medical Faculty showed that 141 voted in the last students’ union election. A random sample of $n_2 = 216$ students registered in the Nursing Faculty showed that 125 voted in the most recent students’ union election. Do these data indicate that the population proportion of voter turnout in the Pre-medical Faculty is higher than that in the Nursing Faculty? Use a 5% level of significance.

Answer:
Step 1: Hypotheses

$H_0: p_1 = p_2$
$H_1: p_1 > p_2$ (right-tailed test)

Step 2: Test statistic
$\hat{p_1} = 141/228 \approx 0.6184$ , $\hat{p_2} = 125/216 \approx 0.5787$ .

Pooled proportion: $\hat{p} = \frac{141 + 125}{228 + 216} = \frac{266}{444} \approx 0.5991$ .

$z = \frac{\hat{p_1} - \hat{p_2}}{\sqrt{\hat{p}(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.6184 - 0.5787}{\sqrt{0.5991 \times 0.4009 \times \left( \frac{1}{228} + \frac{1}{216} \right)}} = \frac{0.0397}{\sqrt{0.2402 \times 0.009016}} \approx \frac{0.0397}{0.04654} \approx 0.853$

Step 3: Critical value
$\alpha = 0.05$ , right-tailed → $z_{0.05} = 1.645$ .

Step 4: Decision
$z = 0.853 < 1.645$ → Fail to reject $H_0$ .

Step 5: Conclusion
At 5% significance, there is insufficient evidence that voter turnout in the Pre-medical Faculty is higher than in the Nursing Faculty.
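
A sketch of the pooled two-proportion $z$ -test from Step 2, carried at full precision so that rounding of the intermediate proportions does not affect the statistic. Variable names are illustrative.

```python
import math
from statistics import NormalDist

voted1, n1 = 141, 228  # Pre-medical Faculty
voted2, n2 = 125, 216  # Nursing Faculty
alpha = 0.05

p1, p2 = voted1 / n1, voted2 / n2
pooled = (voted1 + voted2) / (n1 + n2)  # common proportion assumed under H0
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / standard_error
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tailed

print(f"p1 = {p1:.4f}, p2 = {p2:.4f}, pooled = {pooled:.4f}")
print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
print("reject H0:", z > z_crit)
```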

@Dr. Microbiota

End of Solutions