Bus Statistics: US Department of Transportation.
The goal of this lab is to test the linear correlation between two quantitative variables, to find the equation of the regression line for the variables, and to use the line for prediction.
Important Note: The interpretation of Microsoft Excel output is part of this lab; the output itself is not sufficient to complete this lab. Make sure to answer each question in complete sentences.
U.S. Department of Transportation
As part of a study on transportation safety, the U.S. Department of Transportation collected data on the number of fatal accidents per 1000 licenses and the percentage of licensed drivers under the age of 21 in a sample of 42 cities. Data collected over a one-year period follow. These data are contained in the file named Safety.xlsx uploaded along with this lab assignment.
Managerial Report
1. Use Microsoft Excel to construct a scatter plot for number of fatal accidents and percentage of drivers under the age of 21. Paste the output below; please do not upload a separate Excel file. Make sure your graph is clearly labeled within Excel (x and y axis, and a title). Also develop numerical and graphical summaries of the data using the descriptive statistics tool.
2. Use Excel’s regression tool to conduct regression analysis to investigate the relationship between the number of fatal accidents and the percentage of drivers under the age of 21. Use the computer output to answer the following questions:
a) What is your independent variable, X, in this problem?
b) What is your dependent variable, Y, in this problem?
c) What is the Line of regression for this problem?
d) What is the slope? Explain what it means in the context of this problem.
e) What is the y intercept? Explain what it means in the context of this problem.
f) What is R squared? And what does this number mean in the context of the problem?
g) Conduct a hypothesis test on the slope and discuss your findings. Make sure to state your null and alternative hypotheses and show your test statistic, using the numbers from the Excel output. Use a .05 level of significance. Give a statistical conclusion in the context of this problem.
h) If a city has 5 percent of its licensed drivers under the age of 21, what would be the predicted number of fatal accidents per 1000 licenses? Show how you calculated this.
3. What conclusion and recommendations about the two variables can you derive from the above analysis? Use complete sentences to answer this.
Stat Project Points: 25 Points
Stat Project Estimated CU Time: 3 CU
Managerial Report
- A scatter plot for number of fatal accidents and percentage of drivers under the age of 21. Numerical and graphical summaries of the data as well using the descriptive statistics tool.
Descriptive statistics for percent under 21 and fatal accidents per 1000
Statistic | Percent Under 21 | Fatal Accidents per 1000 |
Mean | 12.26190476 | 1.922404762 |
Standard Error | 0.483237634 | 0.165257284 |
Median | 12 | 1.881 |
Mode | 8 | #N/A |
Standard Deviation | 3.1317378 | 1.070989605 |
Sample Variance | 9.807781649 | 1.147018735 |
Kurtosis | -1.137109498 | -0.974888754 |
Skewness | 0.210357273 | 0.193164404 |
Range | 10 | 4.061 |
Minimum | 8 | 0.039 |
Maximum | 18 | 4.1 |
Sum | 515 | 80.741 |
Count | 42 | 42 |
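Several rows of the table above are linked by simple formulas: the mean is the sum divided by the count, the standard error is the standard deviation divided by the square root of the count, and the sample variance is the square of the standard deviation. The following is a minimal Python sketch of that cross-check. The numbers are copied from the table; the function name `derived_stats` and the dictionary layout are illustrative assumptions, not part of the Excel output.

```python
import math

# Values copied from the descriptive statistics table above.
summary = {
    "Percent Under 21": {"sum": 515.0, "count": 42, "sd": 3.1317378},
    "Fatal Accidents per 1000": {"sum": 80.741, "count": 42, "sd": 1.070989605},
}

def derived_stats(total, count, sd):
    """Return (mean, standard error, sample variance) from sum, count and SD."""
    mean = total / count                    # Mean = Sum / Count
    standard_error = sd / math.sqrt(count)  # Standard Error = SD / sqrt(n)
    variance = sd ** 2                      # Sample Variance = SD squared
    return mean, standard_error, variance

for name, s in summary.items():
    mean, se, var = derived_stats(s["sum"], s["count"], s["sd"])
    print(f"{name}: mean={mean:.4f}, SE={se:.4f}, variance={var:.4f}")
```

The printed values should agree with the Mean, Standard Error, and Sample Variance rows of the table up to rounding.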
Scatter plot
- Excel’s regression analysis; to investigate the relationship between the number of fatal accidents and the percentage of drivers under the age of 21.
- What is your independent variable, X, in this problem?
Percentage of licensed drivers under the age of 21 (Percent Under 21)
- What is your dependent variable, Y, in this problem?
Number of fatal accidents per 1000 licenses
- What is the Line of regression for this problem?
y = -1.5974 + 0.2871x
- What is the slope? Explain what it means in the context of this problem.
Slope = 0.2871
For each one-percentage-point increase in the share of licensed drivers under 21, the predicted number of fatal accidents per 1000 licenses increases by 0.2871.
- What is the y intercept? Explain what it means in the context of this problem.
y-intercept = -1.5974
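It is the predicted number of fatal accidents per 1000 licenses when the percent of drivers under 21 is zero. A negative accident rate is not possible, and zero lies well below the observed range of 8% to 18%, so the intercept has no practical interpretation here; it only positions the line.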
- What is R squared? And what does this number mean in the context of the problem?
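R-squared = 0.704571 ≈ 70.5%
About 70.5% of the variation in the number of fatal accidents per 1000 licenses across the 42 cities is explained by its linear relationship with X = percent under 21. The remaining 29.5% is due to factors not included in the model.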
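R-squared also ties the regression line back to the descriptive statistics. In simple linear regression the correlation r is the square root of R-squared (taking the sign of the slope), the slope equals r times the ratio of the standard deviations, and the intercept equals the mean of Y minus the slope times the mean of X. The sketch below, in Python, recomputes these from the reported summary values as a consistency check. The variable names are illustrative assumptions.

```python
import math

# Reported values from the descriptive statistics and regression output.
R_SQUARED = 0.704571
N = 42
MEAN_X, SD_X = 12.26190476, 3.1317378    # percent under 21
MEAN_Y, SD_Y = 1.922404762, 1.070989605  # fatal accidents per 1000
VAR_Y = 1.147018735                      # sample variance of Y

# Split the total variation in Y into explained and unexplained parts.
sst = VAR_Y * (N - 1)    # total sum of squares
ssr = R_SQUARED * sst    # explained by the regression
sse = sst - ssr          # left unexplained (residual)

# Correlation is positive because the fitted slope is positive.
r = math.sqrt(R_SQUARED)
slope = r * SD_Y / SD_X
intercept = MEAN_Y - slope * MEAN_X

print(f"r = {r:.4f}")
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"explained share = {ssr / sst:.3f}, unexplained share = {sse / sst:.3f}")
```

The recomputed slope and intercept should match y = -1.5974 + 0.2871x up to rounding, which suggests the reported coefficients and R-squared are consistent with each other.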
- Conduct a hypothesis test on the slope and discuss your findings. Make sure to state your null and alternative hypotheses and show your test statistic, using the numbers from the Excel output. Use a .05 level of significance. Give a statistical conclusion in the context of this problem.
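Null Hypothesis, H0: slope = 0. The percent of licensed drivers under the age of 21 is not a significant predictor of the number of fatal accidents per 1000 licenses in the 42 cities.
Alternative Hypothesis, H1: slope ≠ 0. The percent of licensed drivers under the age of 21 is a significant predictor of the number of fatal accidents per 1000 licenses in the 42 cities.
Level of significance: alpha = 0.05
Test statistic for the slope: t = sqrt(R-squared × (n - 2) / (1 - R-squared)) = sqrt(0.704571 × 40 / 0.295429) ≈ 9.77, with 40 degrees of freedom. The values t stat = -4.29792 and p value = 0.000107 in the regression output belong to the intercept row (-1.5974 divided by its standard error), not the slope row; a positive slope cannot have a negative t statistic.
Respective p value for the slope: less than 0.001
Decision: the p value is less than the significance level (p < 0.001 < 0.05), and t = 9.77 exceeds the two-tailed critical value of 2.021 for 40 degrees of freedom; hence the null hypothesis is rejected.
Conclusion: the null hypothesis was rejected; therefore, there is a significant positive linear relationship between the two variables, and the percent of licensed drivers under 21 is a significant predictor of the number of fatal accidents per 1000 licenses in the sampled 42 cities.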
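As a cross-check on the slope test, the t statistic in simple linear regression can be recovered from R-squared and the sample size alone, and it always carries the same sign as the slope. The Python sketch below does this with the reported values. The critical value 2.021 is the two-tailed 0.05 value for 40 degrees of freedom from a standard t table, entered here as a constant; the variable names are illustrative assumptions.

```python
import math

# Reported regression results.
R_SQUARED = 0.704571
N = 42
SLOPE = 0.2871

# Two-tailed critical value for alpha = 0.05 with 40 degrees of freedom,
# taken from a standard t table (entered as a constant, not computed here).
T_CRITICAL = 2.021

df = N - 2

# In simple linear regression, t^2 = R^2 * (n - 2) / (1 - R^2).
t_magnitude = math.sqrt(R_SQUARED * df / (1 - R_SQUARED))

# The slope's t statistic has the same sign as the slope itself.
t_stat = math.copysign(t_magnitude, SLOPE)

# Implied standard error of the slope, since t = slope / SE(slope).
se_slope = SLOPE / t_stat

reject_null = abs(t_stat) > T_CRITICAL

print(f"df = {df}, t = {t_stat:.3f}, SE(slope) = {se_slope:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Because the statistic is far beyond the critical value, the decision to reject the null hypothesis does not depend on the exact p value.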
- If a city has 5 percent of its licensed drivers under the age of 21, what would be the predicted number of fatal accidents per 1000 licenses? Show how you calculated this.
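y = -1.5974 + 0.2871x; x = 5
y = -1.5974 + 0.2871 * 5
= -1.5974 + 1.4355
= -0.1619
Therefore, the model predicts -0.1619 fatal accidents per 1000 licenses for a city with 5% of its licensed drivers under 21. A negative rate is impossible, so in practice the prediction is read as approximately zero. Because 5% lies below the observed range of 8% to 18%, this is an extrapolation and should be treated with caution.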
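The prediction can be reproduced with a short Python sketch that also flags whether the input falls outside the observed range of X (8 to 18 percent, per the Minimum and Maximum rows of the descriptive statistics). The function name `predict` and the tuple it returns are illustrative assumptions.

```python
# Coefficients from the reported regression line y = -1.5974 + 0.2871x.
INTERCEPT = -1.5974
SLOPE = 0.2871

# Observed range of percent under 21 in the sample of 42 cities.
X_MIN, X_MAX = 8.0, 18.0

def predict(percent_under_21):
    """Return (predicted fatal accidents per 1000 licenses, extrapolation flag)."""
    y_hat = INTERCEPT + SLOPE * percent_under_21
    # A prediction is an extrapolation when x lies outside the sampled range.
    extrapolated = not (X_MIN <= percent_under_21 <= X_MAX)
    return y_hat, extrapolated

y_hat, extrapolated = predict(5)
print(f"Predicted rate at 5%: {y_hat:.4f} (extrapolated: {extrapolated})")
```

For x = 5 the flag is set, which is a reminder that the fitted line is only supported by data between 8 and 18 percent.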
- Conclusion and recommendations about the two variables
The percentage of licensed drivers under 21 years of age in the 42 cities was found to be a significant predictor of the number of fatal accidents per 1000 licenses in the same cities. The higher the percentage, the higher the number of fatal accidents: each additional percentage point is associated with about 0.29 more fatal accidents per 1000 licenses, and the percentage explains about 70% of the variation between cities. Therefore, I would recommend that the U.S. Department of Transportation issue fewer or no licenses to those under 21 years if fatal accidents are to be minimized.