Mathematics and Statistics: Data Analysis.

Data Analysis #1

Classifying Plant Leaves

Due: Tuesday, April 10, 2018

- Include a copy of the data table, the statistical software code, and the output as an appendix.

__Part I: Analyzing the Data__

- Create a scatter plot of your training sample data, using the species letter as the plotting symbol. Do you have any outliers? If you have made gross measurement errors or data entry errors, correct the errors and redo the plot. If not, why are the outlier leaves so different from the rest?
- Compute summary statistics for your two training samples.
- Do you believe the assumption of bivariate normal distribution for the width and length is justified? Explain.
- Do you believe the assumption of equal covariance matrices is justified? Explain.
- Evaluate the classification rule of the formula

Using your summary statistics. Indicate the boundary on the scatter plot you created in Problem 1.

- Use your classification rule to classify each observation in your training samples and a future leaf with 42-mm width and 68-mm length. Determine the number of correct and incorrect classifications for each training sample.
- Use a discriminant analysis routine in a statistical computing package to classify each observation in your training samples and a future leaf with 42-mm width and 68-mm length. Do the results agree with your answers in Problem 6?
- What further analysis, if any, do you suggest for these data? Why might this analysis be useful?

__Part I: Analyzing the Data__

- Create a scatter plot of your training sample data, using the species letter as the plotting symbol. Do you have any outliers? If you have made gross measurement errors or data entry errors, correct the errors and redo the plot. If not, why are the outlier leaves so different from the rest?

- Compute summary statistics for your two training samples.
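
A minimal sketch of the summary statistics this step asks for, assuming each training sample is stored as a list of (width, length) pairs in millimetres. The measurements and the names `species_a`, `species_b`, `mean_vector`, `covariance_matrix`, and `pooled_covariance` are hypothetical illustrations, not the assignment's data or code.

```python
# Hypothetical (width_mm, length_mm) measurements; not the assignment's data.
species_a = [(38, 62), (41, 66), (44, 70), (40, 63), (43, 69)]
species_b = [(30, 75), (33, 80), (35, 82), (31, 78), (36, 85)]

def mean_vector(sample):
    """Sample mean of width and length."""
    n = len(sample)
    return tuple(sum(obs[j] for obs in sample) / n for j in range(2))

def covariance_matrix(sample):
    """Unbiased 2x2 sample covariance matrix (divisor n - 1)."""
    n = len(sample)
    m = mean_vector(sample)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for obs in sample:
        d = (obs[0] - m[0], obs[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / (n - 1)
    return s

def pooled_covariance(sample_1, sample_2):
    """Pooled covariance, weighting each matrix by its degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    s1, s2 = covariance_matrix(sample_1), covariance_matrix(sample_2)
    return [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
             for j in range(2)] for i in range(2)]

for name, sample in (("A", species_a), ("B", species_b)):
    print(name, mean_vector(sample), covariance_matrix(sample))
print("pooled", pooled_covariance(species_a, species_b))
```

The pooled matrix is only meaningful if the equal-covariance assumption examined later is accepted.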

- Do you believe the assumption of bivariate normal distribution for the width and length is justified? Explain.

The assumption appears justified. All the normality-test p-values were greater than 0.05, so the null hypothesis of normality was not rejected.
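
The answer above does not say which normality test produced the p-values. As a complementary, informal check (an assumption here, not the method used in the answer), one can compute squared Mahalanobis distances and see whether roughly half of the observations fall inside the 50% probability contour, whose cutoff for two variables is the chi-square median 2 ln 2. The data and function names below are hypothetical.

```python
import math

# Hypothetical (width_mm, length_mm) sample; not the assignment's data.
sample = [(38, 62), (41, 66), (44, 70), (40, 63),
          (43, 69), (39, 67), (42, 64), (45, 71)]

def squared_distances(data):
    """Squared Mahalanobis distance of each observation from the sample mean."""
    n = len(data)
    mw = sum(w for w, _ in data) / n
    ml = sum(l for _, l in data) / n
    sww = sum((w - mw) ** 2 for w, _ in data) / (n - 1)
    sll = sum((l - ml) ** 2 for _, l in data) / (n - 1)
    swl = sum((w - mw) * (l - ml) for w, l in data) / (n - 1)
    det = sww * sll - swl ** 2  # determinant of the 2x2 covariance matrix
    return [(sll * (w - mw) ** 2
             - 2 * swl * (w - mw) * (l - ml)
             + sww * (l - ml) ** 2) / det
            for w, l in data]

def fraction_inside_half_contour(data):
    """Share of observations with d^2 at or below the chi-square(2) median."""
    cutoff = -2.0 * math.log(0.5)  # equals 2 ln 2, about 1.386
    d2 = squared_distances(data)
    return sum(1 for v in d2 if v <= cutoff) / len(d2)

print(fraction_inside_half_contour(sample))
```

A fraction far from one half would cast doubt on bivariate normality; with small samples the check is only suggestive.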

- Do you believe the assumption of equal covariance matrices is justified? Explain.

The assumption appears justified. The p-value obtained is greater than 0.05, so the null hypothesis that the covariance matrices are equal was not rejected.
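
The answer does not name the test behind this p-value. One common choice is Box's M test with its chi-square approximation, sketched below for two groups and two variables (so 3 degrees of freedom). Treat the choice of test, the function names, and the data as assumptions for illustration.

```python
import math

# Hypothetical (width_mm, length_mm) measurements; not the assignment's data.
species_a = [(38, 62), (41, 66), (44, 70), (40, 63), (43, 69)]
species_b = [(30, 75), (33, 80), (35, 82), (31, 78), (36, 85)]

def covariance_matrix(sample):
    """Unbiased 2x2 sample covariance matrix."""
    n = len(sample)
    mw = sum(w for w, _ in sample) / n
    ml = sum(l for _, l in sample) / n
    sww = sum((w - mw) ** 2 for w, _ in sample) / (n - 1)
    sll = sum((l - ml) ** 2 for _, l in sample) / (n - 1)
    swl = sum((w - mw) * (l - ml) for w, l in sample) / (n - 1)
    return [[sww, swl], [swl, sll]]

def det2(s):
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    cdf = math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    return 1.0 - cdf

def box_m_test(sample_1, sample_2):
    """Return (corrected Box's M statistic, approximate p-value) for p = 2, g = 2."""
    p, g = 2, 2
    n1, n2 = len(sample_1), len(sample_2)
    s1, s2 = covariance_matrix(sample_1), covariance_matrix(sample_2)
    pooled = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
               for j in range(2)] for i in range(2)]
    m_stat = ((n1 + n2 - 2) * math.log(det2(pooled))
              - (n1 - 1) * math.log(det2(s1))
              - (n2 - 1) * math.log(det2(s2)))
    u = ((1 / (n1 - 1) + 1 / (n2 - 1) - 1 / (n1 + n2 - 2))
         * (2 * p * p + 3 * p - 1) / (6 * (p + 1) * (g - 1)))
    statistic = max((1 - u) * m_stat, 0.0)  # guard against tiny negative rounding
    return statistic, chi2_sf_df3(statistic)

print(box_m_test(species_a, species_b))
```

A p-value above 0.05 would be read, as in the answer above, as no evidence against equal covariance matrices.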

- Evaluate the classification rule of the formula

Using your summary statistics. Indicate the boundary on the scatter plot you created in Problem 1.
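
The formula itself is not reproduced in this document. As a sketch only, the code below assumes the standard linear rule for two groups with equal covariance matrices, equal priors, and equal misclassification costs: assign a leaf to the first species when a'x ≥ m, with a = S_pooled⁻¹(x̄₁ − x̄₂) and m = ½ a'(x̄₁ + x̄₂). The data and function names are hypothetical.

```python
# Hypothetical (width_mm, length_mm) measurements; not the assignment's data.
species_a = [(38, 62), (41, 66), (44, 70), (40, 63), (43, 69)]
species_b = [(30, 75), (33, 80), (35, 82), (31, 78), (36, 85)]

def mean_vector(sample):
    n = len(sample)
    return (sum(w for w, _ in sample) / n, sum(l for _, l in sample) / n)

def covariance_matrix(sample):
    n = len(sample)
    mw, ml = mean_vector(sample)
    sww = sum((w - mw) ** 2 for w, _ in sample) / (n - 1)
    sll = sum((l - ml) ** 2 for _, l in sample) / (n - 1)
    swl = sum((w - mw) * (l - ml) for w, l in sample) / (n - 1)
    return [[sww, swl], [swl, sll]]

def linear_rule(sample_1, sample_2):
    """Return (a, m) so that a leaf x goes to group 1 when a'x >= m."""
    n1, n2 = len(sample_1), len(sample_2)
    m1, m2 = mean_vector(sample_1), mean_vector(sample_2)
    s1, s2 = covariance_matrix(sample_1), covariance_matrix(sample_2)
    sp = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
           for j in range(2)] for i in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    diff = (m1[0] - m2[0], m1[1] - m2[1])
    # a = inverse(pooled covariance) times the difference of the mean vectors
    a = ((sp[1][1] * diff[0] - sp[0][1] * diff[1]) / det,
         (-sp[1][0] * diff[0] + sp[0][0] * diff[1]) / det)
    m = 0.5 * (a[0] * (m1[0] + m2[0]) + a[1] * (m1[1] + m2[1]))
    return a, m

def boundary_length(width, a, m):
    """Length on the boundary line a'x = m at a given width (needs a[1] != 0)."""
    return (m - a[0] * width) / a[1]

a, m = linear_rule(species_a, species_b)
print("a =", a, "m =", m)
print("boundary passes through", (30, boundary_length(30, a, m)),
      "and", (45, boundary_length(45, a, m)))
```

Two points from `boundary_length` are enough to draw the boundary as a straight line on the scatter plot.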

- Use your classification rule to classify each observation in your training samples and a future leaf with 42-mm width and 68-mm length. Determine the number of correct and incorrect classifications for each training sample.

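For the future leaf with 42-mm width and 68-mm length, the inequality in the classification rule evaluates to -179.7156 ≥ -171.2915, which is false, so the leaf does not meet the rule's condition and is assigned to the other species.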
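
The counting part of this step can be sketched as follows, assuming a rule of the form "assign to species A when a'x ≥ m". The coefficients, cutoff, and training data are hypothetical placeholders and will not reproduce the values quoted above.

```python
# Hypothetical rule and data, for illustration only.
A_COEF = (65.5, -44.1)  # hypothetical coefficient vector a (width, length)
CUTOFF = -789.6         # hypothetical cutoff m

training = {
    "A": [(38, 62), (41, 66), (44, 70), (40, 63), (43, 69)],
    "B": [(30, 75), (33, 80), (35, 82), (31, 78), (36, 85), (40, 70)],
}

def classify(width, length):
    """Assign species A when the linear score reaches the cutoff, else B."""
    score = A_COEF[0] * width + A_COEF[1] * length
    return "A" if score >= CUTOFF else "B"

def count_classifications(samples):
    """Map each true species to (number correct, number incorrect)."""
    counts = {}
    for true_label, observations in samples.items():
        correct = sum(1 for w, l in observations if classify(w, l) == true_label)
        counts[true_label] = (correct, len(observations) - correct)
    return counts

print(count_classifications(training))
print("future leaf (42, 68):", classify(42, 68))
```

Because the same observations are used to build and to evaluate the rule, these counts tend to be optimistic about how the rule would perform on new leaves.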

- Use a discriminant analysis routine in a statistical computing package to classify each observation in your training samples and a future leaf with 42-mm width and 68-mm length. Do the results agree with your answers in Problem 6?

The results agree.

**Code**