MBAS901 – Final Assignment November 3, 2022

Essential Tools for Business Analytics (MBAS901)

Trimester 3, 2022

Final Assessment/Alternative Exam: Individual (Marks allocated: 50%)

Due Date: Tuesday 15th November 2022, by 11.30 pm (Submission via Turnitin)

Provide your answers to each question, including relevant figures (e.g. SAS Viya outputs), in a Word/PDF document. You must answer all questions. Note that the word limit for your answer script is 2,500 words.

Task 1: Exploratory Data Analysis (25 points)

Using the ‘FACILITY_TOY’ dataset available in SAS Viya, answer the following questions. Export/copy all charts you create into your answer script, describe the charts (e.g. what your chart visualizes on each axis) and interpret the charts, i.e. what your chart highlights. Ideally, this interpretation should be accessible even to a non-technical reader.

Q1. On a geographic map, show the countries where toy facilities are located. The size of each bubble should represent the total unit capacity (4 points).

Q2. What is the total unit capacity in the United States (1 point) and in Australia (1 point)?

Q3. Temporarily remove the United States from the map you prepared in Q1. Show the updated map (without the US) (3 points). Which country has the second-largest unit capacity after the United States? (1 point).

Q4. Many countries have more than one toy facility. Further, most facilities have more than one unit manufacturing toys. As one would expect, the majority of these units do not operate at full capacity.

Assuming the actual usage of the units is provided by the ‘Unit Actual’ variable and the total unit capacity is provided by the ‘Unit Capacity’ variable, calculate the ‘Capacity Utilization Ratio’ and store values in a new variable. Show how you created this calculated item by taking a screenshot of the appropriate SAS Viya window.

Generate a histogram of the new variable and copy/export it into your answer script. Interpret the histogram (4 Points).
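
The calculation described in Q4 can be sketched outside SAS Viya as follows. This is only an illustration in Python, assuming the ratio is defined as ‘Unit Actual’ divided by ‘Unit Capacity’; the sample rows and the names `units` and `capacity_utilization_ratio` are hypothetical and are not taken from the FACILITY_TOY data. In the assignment itself, the calculated item must be created in SAS Viya.

```python
# Hypothetical sample rows; the real values come from FACILITY_TOY in SAS Viya.
units = [
    {"unit_actual": 45.0, "unit_capacity": 60.0},
    {"unit_actual": 30.0, "unit_capacity": 40.0},
    {"unit_actual": 18.0, "unit_capacity": 20.0},
    {"unit_actual": 10.0, "unit_capacity": 0.0},  # zero capacity: ratio undefined
]


def capacity_utilization_ratio(unit_actual, unit_capacity):
    """Return actual / capacity, or None when capacity is zero or missing."""
    if not unit_capacity:
        return None
    return unit_actual / unit_capacity


# One ratio per unit; None marks rows where the ratio cannot be computed.
ratios = [
    capacity_utilization_ratio(u["unit_actual"], u["unit_capacity"])
    for u in units
]
```

A ratio below 1 indicates a unit operating under its full capacity. Rows with zero or missing capacity are left undefined here rather than forced to a number, which is one possible design choice among several.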

Q5. Prepare a bar chart to show the average ‘Capacity Utilization Ratio’ by facility for each country. Use a filter to show only Spain, Australia, and Japan in this bar chart. Copy/export the chart into your answer script. Interpret your chart (4 points).

Q6. There are many factors that could explain the variation observed in the Unit Capacity Utilization Ratio. Identify two such factors and demonstrate how these two factors explain the variation in Unit Capacity Utilization Ratio with the help of two charts and associated interpretation (3.5 points per chart).

Task 2. Predictive Data Analytics (25 points)

Using ‘FLCRASH’ data, answer the following questions.

Q1. Note that the variable ‘Total Crash Injuries’ gives the number of injuries associated with each crash. In SAS Viya, prepare a histogram showing the distribution of Total Crash Injuries. What can you say about the distribution of crash injuries? (2 points)

Q2. Create a new custom category variable based on the ‘Total Crash Injuries’ variable. This new custom category variable should contain two categories only: one for crashes with zero injuries, and the other for crashes with one or more injuries. (3 points).

Visualize the frequency of the two new categories you just created on a bar chart. How many crashes report zero injuries? (3 points).
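
The two-category recoding in Q2 can be illustrated with a short Python sketch. The injury counts, the category labels, and the names `total_crash_injuries` and `injury_category` are hypothetical and chosen only for illustration; they do not come from the FLCRASH data, and the custom category itself must still be built in SAS Viya.

```python
from collections import Counter

# Hypothetical injury counts per crash; real values come from FLCRASH.
total_crash_injuries = [0, 2, 0, 1, 0, 3]


def injury_category(injuries):
    """Map an injury count to one of two categories."""
    return "No injuries" if injuries == 0 else "One or more injuries"


# Recode every crash, then count how often each category occurs.
categories = [injury_category(n) for n in total_crash_injuries]
frequency = Counter(categories)
```

The `frequency` counts correspond to the bar heights in the bar chart requested above.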

Q3. In Q2, you created a new categorical variable with only two values (binary). Your task now is to develop two models that can predict the value this target variable takes, given other explanatory variables. In other words, you attempt to predict if a crash is going to result in injuries (or not) given other important variables.

What are the two models (or techniques) you can use to predict this target variable? (2 points).

Create one model to predict the target variable you created in Q2. Assess this model’s accuracy.

What are the most important variables in predicting this target variable? (6 points).

Create the second model to predict the target variable. Assess this model’s accuracy. What are the most important variables identified by the model to predict the target variable? (6 points).

Compare the performance of the two models. Report and discuss the results of your comparison.

Which model is the champion? (3 points).
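
One simple way to compare two classifiers is by accuracy, i.e. the share of crashes classified correctly. The Python sketch below assumes this measure; SAS Viya reports its own fit statistics, which may differ and should be the basis of your answer. The labels, predictions, and the names `model_a` and `model_b` are hypothetical.

```python
# Hypothetical actual outcomes (1 = one or more injuries, 0 = no injuries).
actual = [1, 0, 0, 1, 0, 1, 0, 0]

# Hypothetical predictions from two models on the same crashes.
predictions = {
    "model_a": [1, 0, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 0, 0],
}


def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual outcomes."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Score each model, then pick the one with the highest accuracy.
scores = {name: accuracy(actual, pred) for name, pred in predictions.items()}
champion = max(scores, key=scores.get)
```

Accuracy alone can be misleading when one category is far more frequent than the other, so a full comparison would normally look at additional measures as well.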