INTRODUCTION TO APPLIED STATISTICS

Tutorial Sheet 6

DATE : October 26, 2018


Question 1: Define or briefly explain each of the following terms:
a) Null hypothesis
b) Alternative hypothesis
c) Critical point(s)
d) Significance level
e) Non-rejection region or Acceptance region
f) Rejection region
g) Test statistic
h) Type I error
i) Type II error
j) P-value
k) Left-tailed test
l) Right-tailed test
m) Two-tailed (double-tailed) test

Answer:
a) Null hypothesis ($H_0$): A statement of no effect, no difference, or status quo. It is the hypothesis tested and assumed true until evidence contradicts it.
b) Alternative hypothesis ($H_1$ or $H_a$): A statement that contradicts $H_0$, representing the effect, difference, or change the test aims to detect.
c) Critical point(s): Value(s) on the test distribution that separate the rejection region from the non-rejection region. Determined by $\alpha$ and the sampling distribution.
d) Significance level ($\alpha$): The probability of rejecting $H_0$ when it is true (Type I error). Common values are 0.05, 0.01, or 0.10.
e) Non-rejection region (Acceptance region): Range of test statistic values where $H_0$ is not rejected. If the test statistic falls here, we fail to reject $H_0$.
f) Rejection region: Range of test statistic values where $H_0$ is rejected. If the test statistic falls here, we reject $H_0$.
g) Test statistic: A standardized value calculated from sample data, used to decide whether to reject $H_0$ (e.g., $z$, $t$, $\chi^2$).
h) Type I error: Rejecting $H_0$ when it is true. Probability is $\alpha$.
i) Type II error: Failing to reject $H_0$ when it is false. Probability is $\beta$.
j) P-value: The probability of obtaining a test statistic as extreme as or more extreme than the observed value, assuming $H_0$ is true. If $p \leq \alpha$, reject $H_0$.
k) Left-tailed test: Test where $H_1$ states a parameter is less than a value. Rejection region is in the left tail.
l) Right-tailed test: Test where $H_1$ states a parameter is greater than a value. Rejection region is in the right tail.
m) Double-tailed test (two-tailed): Test where $H_1$ states a parameter is not equal to a value. Rejection region is split between both tails.


Question 2: In a statistical test, we have a choice of a left-tailed test, a right-tailed test, or a two-tailed test. Is it the null hypothesis or the alternate hypothesis that determines which type of test is used? Explain your answer.

Answer:
The alternative hypothesis determines the type of test.

Explanation:

  • The null hypothesis ($H_0$) always contains the equality case (e.g., $\mu = \mu_0$; a null such as $\mu \leq \mu_0$ is tested at the boundary $\mu = \mu_0$).
  • The alternative hypothesis ($H_1$) defines the direction of the test:
    • If $H_1: \mu < \mu_0$ → left-tailed test.
    • If $H_1: \mu > \mu_0$ → right-tailed test.
    • If $H_1: \mu \neq \mu_0$ → two-tailed test.

We design the test (rejection region) based on $H_1$ because we seek evidence against $H_0$ in favor of $H_1$.
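
As a minimal sketch of how the form of the alternative hypothesis fixes the rejection region, the Python snippet below returns the critical z-value(s) for each tail type. The function name `critical_z` and the tail labels are illustrative choices, and a standard normal test statistic is assumed.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Critical z-value(s) for a test whose H1 has the given direction.

    tail: 'left'  (H1: mu < mu0),
          'right' (H1: mu > mu0),
          'two'   (H1: mu != mu0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in critical_z(0.05, tail)])
```

The same significance level gives a different cut-off depending on the direction of the alternative, which is why the test type is read off the alternative hypothesis.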

Question 3: If we fail to reject (i.e., “accept”) the null hypothesis, does this mean that we have proved it to be true beyond all doubt? Explain your answer.

Answer:
No. Failing to reject $H_0$ does not prove it is true beyond doubt.

Explanation:

  • Hypothesis testing provides evidence against $H_0$, not proof for it.
  • Failing to reject $H_0$ means there is insufficient evidence to support $H_1$, but it does not confirm $H_0$.
  • Possible reasons: Small sample size, high variability, or a true effect too small to detect. There is always a risk of Type II error ($\beta$).

Question 4: Let X be a random variable that represents the pH of arterial plasma (i.e., acidity of the blood). For healthy adults, the mean of the X distribution is $\mu = 7.4$ (Reference: Merck Manual, a commonly used reference in medical schools and nursing programs). A new drug for arthritis has been developed. However, it is thought that this drug may change blood pH. A random sample of 31 patients with arthritis took the drug for 3 months. Blood tests showed that $\bar{x} = 8.1$ with sample standard deviation $s = 1.9$. Use a 5% level of significance to test the claim that the drug has changed (either way) the mean pH level of the blood.

Answer:
Step 1: Hypotheses

  • $H_0: \mu = 7.4$ (no change)
  • $H_1: \mu \neq 7.4$ (change; two-tailed test)

Step 2: Test statistic
Population variance unknown, $n = 31 > 30$ → use $t$-test.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{8.1 - 7.4}{1.9 / \sqrt{31}} = \frac{0.7}{1.9 / 5.56776} = \frac{0.7}{0.3413} \approx 2.051$$
Degrees of freedom (df) = $n - 1 = 30$.

Step 3: Critical value
$\alpha = 0.05$, two-tailed → $\alpha/2 = 0.025$.
Critical $t$-values: $\pm t_{0.025, 30} = \pm 2.042$.

Step 4: Decision
$|t| = 2.051 > 2.042$ → Reject $H_0$.

Step 5: Conclusion
At 5% significance, there is sufficient evidence that the drug changed the mean blood pH.
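
The arithmetic for this one-sample t-test can be checked with a short Python sketch. The variable names are illustrative, and the critical value is copied from the t-table lookup in the solution rather than computed, since Python's standard library has no t-distribution.

```python
from math import sqrt

# Summary statistics from Question 4
mu0, xbar, s, n = 7.4, 8.1, 1.9, 31

t_stat = (xbar - mu0) / (s / sqrt(n))

# Two-tailed critical value for alpha = 0.05, df = 30 (table value, assumed).
t_crit = 2.042

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {n - 1}, reject H0: {reject_h0}")
```

The statistic clears the critical value only narrowly, so the decision here is sensitive to rounding in the intermediate steps.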


Question 5: Total blood volume (in ml) per body weight (in kg) is important in medical research. For healthy adults, the red blood cell volume mean is about $\mu = 28 \, \text{ml/kg}$. Red blood cell volume that is too low or too high can indicate a medical problem. Suppose that Graham has had seven blood tests, and the red blood cell volumes were
$$\begin{array}{ccccccc} 32 & 25 & 41 & 35 & 30 & 37 & 29 \end{array}$$
Assume that Graham’s red blood cell volume has a normal distribution with $\sigma = 4.75$. Do the data indicate that Graham’s red blood cell volume is different from that of healthy adults? Use a 0.01 level of significance.

Answer:
Step 1: Hypotheses

  • $H_0: \mu = 28$
  • $H_1: \mu \neq 28$ (two-tailed test)

Step 2: Sample mean
Data: $32, 25, 41, 35, 30, 37, 29$
$$\bar{x} = \frac{32 + 25 + 41 + 35 + 30 + 37 + 29}{7} = \frac{229}{7} \approx 32.714$$

Step 3: Test statistic
$\sigma$ known, normal distribution → $z$-test.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{32.714 - 28}{4.75 / \sqrt{7}} = \frac{4.714}{4.75 / 2.64575} = \frac{4.714}{1.795} \approx 2.626$$

Step 4: Critical value
$\alpha = 0.01$, two-tailed → $\pm z_{0.005} = \pm 2.576$.

Step 5: Decision
$|z| = 2.626 > 2.576$ → Reject $H_0$.

Step 6: Conclusion
At 1% significance, there is sufficient evidence that Graham's mean red blood cell volume differs from that of healthy adults.
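
A rough Python check of this z-test, working from the raw readings, is sketched below. It uses only the standard library; the variable names are illustrative, and the two-tailed P-value is an extra that the solution does not compute.

```python
from math import sqrt
from statistics import NormalDist, mean

data = [32, 25, 41, 35, 30, 37, 29]   # Graham's seven readings (ml/kg)
mu0, sigma, alpha = 28, 4.75, 0.01    # sigma is assumed known, as stated

xbar = mean(data)
z = (xbar - mu0) / (sigma / sqrt(len(data)))

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)       # two-tailed cut-off
p_value = 2 * (1 - std_normal.cdf(abs(z)))       # area in both tails

print(f"xbar = {xbar:.3f}, z = {z:.3f}, z_crit = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```

Comparing the P-value with $\alpha$ and comparing $|z|$ with the critical value are equivalent decision rules, so both should agree.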


Question 6: A tobacco company advertises that the average nicotine content of its cigarettes is at most 14 milligrams. A consumer protection agency wants to determine whether the average nicotine content is in fact greater than 14. A random sample of 300 cigarettes of the company’s brand yields an average nicotine content of 14.6 and a standard deviation of 3.8 milligrams. If $\alpha = 0.01$, is there significant evidence that the agency’s claim has been supported by the data?

Answer:
Step 1: Hypotheses

  • $H_0: \mu \leq 14$ (company's claim)
  • $H_1: \mu > 14$ (agency's claim; right-tailed test)

Step 2: Test statistic
$n = 300 > 30$ → $z$-test.
$$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{14.6 - 14}{3.8 / \sqrt{300}} = \frac{0.6}{3.8 / 17.3205} = \frac{0.6}{0.2194} \approx 2.735$$

Step 3: Critical value
$\alpha = 0.01$, right-tailed → $z_{0.01} = 2.326$.

Step 4: Decision
$z = 2.735 > 2.326$ → Reject $H_0$.

Step 5: Conclusion
At 1% significance, there is evidence supporting the agency's claim that the average nicotine content exceeds 14 mg.
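
The large-sample right-tailed test can be sketched in Python as follows. The variable names are illustrative, and the sample standard deviation stands in for the population value, as in the solution.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, s, n, alpha = 14, 14.6, 3.8, 300, 0.01

z = (xbar - mu0) / (s / sqrt(n))

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)   # right-tailed critical value
p_value = 1 - std_normal.cdf(z)          # area to the right of z

print(f"z = {z:.3f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0:", z > z_crit)
```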


Question 7: Diltiazem is a commonly prescribed drug for hypertension. However, diltiazem causes headaches in about 12% of patients using the drug. It is hypothesized that regular exercise might help reduce the headaches. If a random sample of 209 patients using diltiazem exercised regularly and only 16 had headaches, would this indicate a reduction in the population proportion of patients having headaches? Use a 1% level of significance.

Answer:
Step 1: Hypotheses

  • $H_0: p = 0.12$ (no reduction)
  • $H_1: p < 0.12$ (reduction; left-tailed test)

Step 2: Test statistic
Sample proportion $\hat{p} = 16/209 \approx 0.0766$.
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.0766 - 0.12}{\sqrt{\frac{0.12 \times 0.88}{209}}} = \frac{-0.0434}{\sqrt{0.000505}} = \frac{-0.0434}{0.02248} \approx -1.93$$

Step 3: Critical value
$\alpha = 0.01$, left-tailed → $-z_{0.01} = -2.326$.

Step 4: Decision
$z = -1.93 > -2.326$ → Fail to reject $H_0$.

Step 5: Conclusion
At 1% significance, there is insufficient evidence that exercise reduces headaches.
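
A minimal Python sketch of this one-proportion z-test follows. The variable names are illustrative; the check that $np_0$ and $n(1 - p_0)$ both exceed 5 is a common rule of thumb for the normal approximation and is an added assumption, not a step in the solution.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0, alpha = 16, 209, 0.12, 0.01

# Rule-of-thumb check for the normal approximation (assumed threshold of 5).
approx_ok = n * p0 > 5 and n * (1 - p0) > 5

p_hat = x / n
std_error = sqrt(p0 * (1 - p0) / n)      # uses p0 because H0 is assumed true
z = (p_hat - p0) / std_error

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(alpha)       # left-tailed critical value
p_value = std_normal.cdf(z)              # area to the left of z

print(f"approximation ok: {approx_ok}")
print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, z_crit = {z_crit:.3f}")
print("reject H0:", z < z_crit)
```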


Question 8: Let Y be a random variable that represents red blood cell count (RBC) in millions of cells per cubic millimeter of whole blood. Then Y has a distribution that is approximately normal. For the population of healthy female adults, the mean of the Y distribution is about 4.8 (based on information from Diagnostic Tests with Nursing Implications, Springhouse Corporation). Suppose that a female patient has taken six laboratory blood tests over the past several months and that the RBC count data sent to the patient’s doctor are
$$\begin{array}{cccccc} 4.9 & 4.2 & 4.5 & 4.1 & 4.4 & 4.3 \end{array}$$
Do the given data indicate that the population mean RBC count for this patient is lower than that of healthy female adults? Use $\alpha = 0.05$.

Answer:
Step 1: Hypotheses

  • $H_0: \mu = 4.8$
  • $H_1: \mu < 4.8$ (left-tailed test)

Step 2: Sample mean and SD
Data: $4.9, 4.2, 4.5, 4.1, 4.4, 4.3$
$$\bar{x} = \frac{4.9 + 4.2 + 4.5 + 4.1 + 4.4 + 4.3}{6} = \frac{26.4}{6} = 4.4$$
Variance: $$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{(0.5)^2 + (-0.2)^2 + (0.1)^2 + (-0.3)^2 + (0.0)^2 + (-0.1)^2}{5} = \frac{0.4}{5} = 0.08$$
$s = \sqrt{0.08} \approx 0.2828$

Step 3: Test statistic
$\sigma$ unknown, $n = 6$ → $t$-test.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{4.4 - 4.8}{0.2828 / \sqrt{6}} = \frac{-0.4}{0.2828 / 2.449} = \frac{-0.4}{0.11547} \approx -3.464$$
df = $n - 1 = 5$.

Step 4: Critical value
$\alpha = 0.05$, left-tailed → $-t_{0.05, 5} = -2.015$.

Step 5: Decision
$t = -3.464 < -2.015$ → Reject $H_0$.

Step 6: Conclusion
At 5% significance, there is sufficient evidence that the patient's mean RBC count is lower than that of healthy female adults.
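
The sample mean, sample standard deviation, and t statistic can be recomputed from the raw readings with the sketch below. The variable names are illustrative, and the critical value is the table value quoted in the solution, not something the code derives.

```python
from math import sqrt
from statistics import mean, stdev

data = [4.9, 4.2, 4.5, 4.1, 4.4, 4.3]   # six RBC counts
mu0 = 4.8

n = len(data)
xbar = mean(data)
s = stdev(data)                          # sample SD, divisor n - 1

t_stat = (xbar - mu0) / (s / sqrt(n))

# Left-tailed critical value for alpha = 0.05, df = 5 (table value, assumed).
t_crit = -2.015

print(f"xbar = {xbar:.2f}, s = {s:.4f}, t = {t_stat:.3f}, df = {n - 1}")
print("reject H0:", t_stat < t_crit)
```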


Question 9: Let X be a random variable that represents hemoglobin count (HC) in grams per 100 milliliters of whole blood. Then X has a distribution that is approximately normal, with population mean of about 14 for healthy adult women. Suppose that a female patient has taken 10 laboratory blood tests during the past year. The HC data sent to the patient’s doctor are
$$\begin{array}{cccccccccc} 15 & 18 & 16 & 19 & 14 & 12 & 14 & 17 & 15 & 11 \end{array}$$
Does this information indicate that the population average HC for this patient is higher than that for healthy adult women? Use $\alpha = 0.05$.

Answer:
Step 1: Hypotheses

  • $H_0: \mu = 14$
  • $H_1: \mu > 14$ (right-tailed test)

Step 2: Sample mean and SD
Data: $15, 18, 16, 19, 14, 12, 14, 17, 15, 11$
$$\bar{x} = \frac{151}{10} = 15.1$$
Variance: $$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{(-0.1)^2 + (2.9)^2 + \dots + (-4.1)^2}{9} = \frac{56.9}{9} \approx 6.322$$
$s = \sqrt{6.322} \approx 2.514$

Step 3: Test statistic
$$t = \frac{15.1 - 14}{2.514 / \sqrt{10}} = \frac{1.1}{2.514 / 3.162} = \frac{1.1}{0.7951} \approx 1.383$$
df = $n - 1 = 9$.

Step 4: Critical value
$\alpha = 0.05$, right-tailed → $t_{0.05, 9} = 1.833$.

Step 5: Decision
$t = 1.383 < 1.833$ → Fail to reject $H_0$.

Step 6: Conclusion
At 5% significance, there is insufficient evidence that the patient's mean HC is higher than that of healthy adult women.
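
Because the sum of squared deviations is easy to get wrong by hand, a quick recomputation from the raw data is a useful check. The sketch below is one way to do it; the variable names are illustrative, and the critical value is the table value quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

data = [15, 18, 16, 19, 14, 12, 14, 17, 15, 11]   # ten HC readings
mu0 = 14

n = len(data)
xbar = mean(data)
sum_sq_dev = sum((x - xbar) ** 2 for x in data)   # numerator of the variance
s = stdev(data)                                   # sqrt(sum_sq_dev / (n - 1))

t_stat = (xbar - mu0) / (s / sqrt(n))

# Right-tailed critical value for alpha = 0.05, df = 9 (table value, assumed).
t_crit = 1.833

print(f"xbar = {xbar}, sum of squared deviations = {sum_sq_dev:.1f}")
print(f"s = {s:.3f}, t = {t_stat:.3f}, reject H0: {t_stat > t_crit}")
```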


Question 10: A photocopying machine at Apex must be repaired if it produces more than 10% defective prints among a large lot of papers it photocopies in a day. A random sample of 100 papers from a day’s photocopies contains 15 defective papers and the head of examinations says that the machine must be repaired. Does the sample evidence support her decision? Use $\alpha = 0.01$.

Answer:
Step 1: Hypotheses

  • $H_0: p \leq 0.10$ (no repair needed)
  • $H_1: p > 0.10$ (repair needed; right-tailed test)

Step 2: Test statistic
$\hat{p} = 15/100 = 0.15$
$$z = \frac{0.15 - 0.10}{\sqrt{\frac{0.10 \times 0.90}{100}}} = \frac{0.05}{\sqrt{0.0009}} = \frac{0.05}{0.03} \approx 1.667$$

Step 3: Critical value
$\alpha = 0.01$, right-tailed → $z_{0.01} = 2.326$.

Step 4: Decision
$z = 1.667 < 2.326$ → Fail to reject $H_0$.

Step 5: Conclusion
At 1% significance, there is insufficient evidence that the defect rate exceeds 10%, so the sample does not support the decision to repair the machine.
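
The sketch below wraps the one-proportion z statistic in a small helper and shows how the decision depends on the chosen significance level. The helper name `one_proportion_z` is hypothetical, and the comparison at $\alpha = 0.05$ is an added illustration rather than part of the question.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the null standard error."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


z = one_proportion_z(15, 100, 0.10)
p_value = 1 - NormalDist().cdf(z)        # right-tailed area

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
for alpha in (0.01, 0.05):
    print(f"alpha = {alpha}: reject H0 = {p_value <= alpha}")
```

The right-tail area falls between 0.01 and 0.05, so the same data would lead to rejection at the 5% level but not at the 1% level required here.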


Question 11: A study was conducted of 90 adult male patients following a new treatment for congestive heart failure. One of the variables measured on the patients was the increase in exercise capacity (in minutes) over a 4-week treatment period. The previous treatment regime had produced an average increase of $\mu = 2$ minutes. The researchers wanted to evaluate whether the new treatment had increased the value of $\mu$ in comparison to the previous treatment. The data yielded $\bar{x} = 2.17$ and $s = 1.05$. Using $\alpha = 0.05$, what conclusions can you draw about the research hypothesis?

Answer:
Step 1: Hypotheses

  • $H_0: \mu \leq 2$ (no increase)
  • $H_1: \mu > 2$ (increase; right-tailed test)

Step 2: Test statistic
$n = 90 > 30$ → $z$-test.
$$z = \frac{2.17 - 2}{1.05 / \sqrt{90}} = \frac{0.17}{1.05 / 9.4868} = \frac{0.17}{0.1107} \approx 1.536$$

Step 3: Critical value
$\alpha = 0.05$, right-tailed → $z_{0.05} = 1.645$.

Step 4: Decision
$z = 1.536 < 1.645$ → Fail to reject $H_0$.

Step 5: Conclusion
At 5% significance, there is insufficient evidence that the new treatment increases exercise capacity.
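
A short Python sketch of this right-tailed z-test is given below. The variable names are illustrative, and the P-value is an extra that the solution does not report.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, s, n, alpha = 2, 2.17, 1.05, 90, 0.05

z = (xbar - mu0) / (s / sqrt(n))

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)   # right-tailed critical value
p_value = 1 - std_normal.cdf(z)          # area to the right of z

print(f"z = {z:.3f}, z_crit = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0:", z > z_crit)
```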


Question 12: Two competing models are under consideration. Ten stoves of the first model and 12 stoves of the second model are tested. The following results are obtained.
Model 1: Mean time $\bar{x}_1 = 11.4 \, \text{min}$, $\sigma_1 = 2.5 \, \text{min}$, $n_1 = 10$
Model 2: Mean time $\bar{x}_2 = 9.9 \, \text{min}$, $\sigma_2 = 3.0 \, \text{min}$, $n_2 = 12$
Assume that the time required to bring water to a boil is normally distributed for each stove. Is there any difference (either way) between the performances of these two models? Use a 5% level of significance.

Answer:
Step 1: Hypotheses

  • $H_0: \mu_1 = \mu_2$
  • $H_1: \mu_1 \neq \mu_2$ (two-tailed test)

Step 2: Test statistic
$\sigma_1, \sigma_2$ known → $z$-test.
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{11.4 - 9.9}{\sqrt{\frac{6.25}{10} + \frac{9}{12}}} = \frac{1.5}{\sqrt{0.625 + 0.75}} = \frac{1.5}{\sqrt{1.375}} \approx \frac{1.5}{1.1726} \approx 1.279$$

Step 3: Critical value
$\alpha = 0.05$, two-tailed → $\pm z_{0.025} = \pm 1.96$.

Step 4: Decision
$|z| = 1.279 < 1.96$ → Fail to reject $H_0$.

Step 5: Conclusion
At 5% significance, there is insufficient evidence of a difference in mean boiling time between the two models.
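
The two-sample z statistic with known standard deviations can be sketched in Python as follows. The variable names are illustrative, and both $\sigma$ values are treated as known population values, as the solution assumes.

```python
from math import sqrt
from statistics import NormalDist

xbar1, sigma1, n1 = 11.4, 2.5, 10   # Model 1
xbar2, sigma2, n2 = 9.9, 3.0, 12    # Model 2
alpha = 0.05

# Standard error of the difference between two independent sample means.
std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = (xbar1 - xbar2) / std_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cut-off

print(f"standard error = {std_error:.4f}, z = {z:.3f}, z_crit = {z_crit:.2f}")
print("reject H0:", abs(z) > z_crit)
```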


Question 13: Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B headache remedies. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The lengths of time in minutes for the drugs to reach a specified level in the blood were recorded. The means, standard deviations, and sizes of the two samples follow.
Brand A: $\bar{x}_1 = 21.8 \, \text{min}$; $s_1 = 8.7 \, \text{min}$; $n_1 = 12$
Brand B: $\bar{x}_2 = 18.9 \, \text{min}$; $s_2 = 7.5 \, \text{min}$; $n_2 = 12$
Past experience with the drug composition of the two remedies permits researchers to assume that both distributions are approximately normal. Let us use a 5% level of significance to test the claim that there is no difference in the mean time required for bodily absorption. Also, find or estimate the P-value of the sample test statistic.

Answer:
Step 1: Hypotheses

  • $H_0: \mu_1 = \mu_2$
  • $H_1: \mu_1 \neq \mu_2$ (two-tailed test)

Step 2: Test statistic
Variance ratio: $s_1^2/s_2^2 = 75.69/56.25 \approx 1.346 < 2$ → assume equal variances. Pooled variance:

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2} = \frac{11 \times 75.69 + 11 \times 56.25}{22} = \frac{1451.34}{22} \approx 65.97$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{21.8 - 18.9}{\sqrt{65.97} \sqrt{\frac{1}{12} + \frac{1}{12}}} = \frac{2.9}{8.122 \times \sqrt{1/6}} \approx \frac{2.9}{8.122 \times 0.4082} \approx \frac{2.9}{3.315} \approx 0.875$$

df = $n_1 + n_2 - 2 = 22$.

Step 3: Critical value
$\alpha = 0.05$, two-tailed → $\pm t_{0.025, 22} = \pm 2.074$.

Step 4: Decision
$|t| = 0.875 < 2.074$ → Fail to reject $H_0$.

P-value estimate
For $t = 0.875$, df = 22:

  • From the t-table, $t_{0.20} = 0.858$, i.e., $P(T > 0.858) = 0.20$.
  • Since $0.875$ is only slightly larger than $0.858$, $P(T > 0.875)$ is slightly below $0.20$.
  • Two-tailed P-value ≈ $2 \times 0.20 = 0.40$ (slightly less).
  • (Exact P-value ≈ 0.39.)

Conclusion
P-value > 0.05 → Fail to reject $H_0$. There is insufficient evidence of a difference in mean absorption time.
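
The pooled-variance calculation can be checked with the sketch below. The variable names are illustrative; the critical value is the table value quoted in the solution, and the P-value is not computed because Python's standard library has no t-distribution.

```python
from math import sqrt

xbar1, s1, n1 = 21.8, 8.7, 12   # Brand A
xbar2, s2, n2 = 18.9, 7.5, 12   # Brand B

# Pooled variance: weighted average of the two sample variances.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df

std_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
t_stat = (xbar1 - xbar2) / std_error

# Two-tailed critical value for alpha = 0.05, df = 22 (table value, assumed).
t_crit = 2.074

print(f"pooled variance = {pooled_var:.2f}, t = {t_stat:.3f}, df = {df}")
print("reject H0:", abs(t_stat) > t_crit)
```

The equal-variance assumption behind the pooled estimate rests on the variance-ratio rule of thumb used in the solution; a different rule could lead to an unpooled test instead.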


Question 14: A random sample of $n_1 = 228$ students registered in the Pre-medical Faculty showed that 141 voted in the last students' union election. A random sample of $n_2 = 216$ registered students in the Nursing Faculty showed that 125 voted in the most recent students' union election. Do these data indicate that the population proportion of voter turnout in the Pre-medical Faculty is higher than that in the Nursing Faculty? Use a 5% level of significance.

Answer:
Step 1: Hypotheses

  • $H_0: p_1 = p_2$
  • $H_1: p_1 > p_2$ (right-tailed test)

Step 2: Test statistic
$\hat{p}_1 = 141/228 \approx 0.6184$, $\hat{p}_2 = 125/216 \approx 0.5787$.

Pooled proportion: $\hat{p} = \frac{141 + 125}{228 + 216} = \frac{266}{444} \approx 0.5991$.

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.6184 - 0.5787}{\sqrt{0.5991 \times 0.4009 \times \left( \frac{1}{228} + \frac{1}{216} \right)}} = \frac{0.0397}{\sqrt{0.2402 \times 0.009016}} \approx \frac{0.0397}{0.04654} \approx 0.853$$

Step 3: Critical value
$\alpha = 0.05$, right-tailed → $z_{0.05} = 1.645$.

Step 4: Decision
$z = 0.853 < 1.645$ → Fail to reject $H_0$.

Step 5: Conclusion
At 5% significance, there is insufficient evidence that voter turnout in the Pre-medical Faculty is higher than in the Nursing Faculty.
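
A minimal Python sketch of the pooled two-proportion z-test follows. The variable names are illustrative, and the P-value is an extra that the solution does not report.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 141, 228   # Pre-medical Faculty: voters, sample size
x2, n2 = 125, 216   # Nursing Faculty: voters, sample size
alpha = 0.05

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # common proportion under H0: p1 = p2

std_error = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / std_error

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)   # right-tailed critical value
p_value = 1 - std_normal.cdf(z)          # area to the right of z

print(f"p1 = {p1_hat:.4f}, p2 = {p2_hat:.4f}, pooled = {p_pool:.4f}")
print(f"z = {z:.3f}, z_crit = {z_crit:.3f}, p-value = {p_value:.3f}")
print("reject H0:", z > z_crit)
```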

@Dr. Microbiota


End of Solutions