STAT0022: LSA ICA 4 Instructions, 2022-23
1 Introduction
Please carefully read and understand these instructions before you begin the ICA.
The deadline to submit ICA 4 is 5th September 2023 at 15:00 BST. The goal of the assessment is for you to apply some of the statistical methods you have learned in this module on a given dataset, and write a short report describing your analysis and conclusions. Please note that your report should
contain sections with headings as described in Section 2 below. LSA ICA 4 makes up 50% of your module mark for STAT0022.
This is an individual project, therefore the work you are going to submit must be entirely your own. You are not allowed to discuss your work with other students, as this would amount to collusion
(see also Section 9).
2 Information on the dataset
The dataset is available on Moodle under the “Late summer assessments (LSA’s)>LSA ICA4” tab.
Its aim is to evaluate the resistance of concrete according to certain parameters. Concrete is a mixture
of different materials (water, cement, etc.) and every mixture has a certain resistance. Every line
represents a type of mixture which has been analysed according to 9 different characteristics. Every
line has ten columns separated by “:”. These are:
• id: a unique identifier for the mixture.
• Columns 2-8: amount of different ingredients in the mixture (the units of measure are not
relevant in our project and you can disregard them). The ingredients are: cement, blast furnace
slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate.
• Column 9: age (in days) of the mixture.
• Column 10: resistance (measured in a laboratory). The higher the number, the more resistant
the mixture is.
There are no missing data.
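As an illustration of the file layout described above, the following is a minimal Python sketch for reading colon-separated lines. It is not part of the official instructions. The column names are hypothetical labels, and the sketch assumes that the file has no header row, that columns 2-8 appear in the order in which the ingredients are listed above, and that decimals are written with a point. The two sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical column names: the instructions describe the columns' contents
# but do not name them. Ingredient order is assumed to follow the listing above.
COLUMNS = [
    "id", "cement", "blast_furnace_slag", "fly_ash", "water",
    "superplasticizer", "coarse_aggregate", "fine_aggregate",
    "age_days", "resistance",
]


def read_mixtures(handle):
    """Parse colon-separated lines into a list of dicts keyed by COLUMNS."""
    records = []
    for fields in csv.reader(handle, delimiter=":"):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        record = {"id": fields[0]}  # keep the identifier as text
        for name, value in zip(COLUMNS[1:], fields[1:]):
            record[name] = float(value)
        records.append(record)
    return records


# Invented rows for illustration only; NOT values from the real dataset.
sample = io.StringIO(
    "1:300.0:0.0:0.0:180.0:2.0:1000.0:700.0:28:35.0\n"
    "2:250.0:100.0:50.0:190.0:0.0:950.0:750.0:7:20.5\n"
)
mixtures = read_mixtures(sample)
```

With the real file, the same function could be called on an open file object in place of the in-memory sample.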
The goal of the project is to find the two most significant ingredients that increase resistance.
3 Structure
You should structure your submission according to the headings below. For your report you should
only use the data from the given dataset.
3.1 Literature search [4 marks]
The first step is to familiarize yourself with concrete, and how it works.
• Explain what each ingredient is (excluding water). [2 marks]
• Explain for which purpose each ingredient is used (including water). [2 marks]
You can use any reference you want, but you must cite the sources you used. Some suggestions for
references:
• Neville, Adam M. PROPERTIES OF CONCRETE. Vol. 4. London: Longman, 1995.
• Shetty, M. S., and A. K. Jain. CONCRETE TECHNOLOGY (THEORY AND PRACTICE), 8e. S.
Chand Publishing, 2019.
3.2 Summarizing the data [11 marks]
It is time to analyze the dataset.
• Plot three different graphs/plots using the dataset [6 marks]. The plots must be put in three
separate pages (one each) at the end of the file in an appendix (see Section 3.5).
• Comment on each graph, drawing at least three statistical conclusions on the dataset based on
those graphs/plots [5 marks]. You can comment on the graphs within the text by making a
reference to them. The comments should not go into the Appendix (the Appendix is only for
the plots above).
3.3 Methodology [12 marks]
In performing your analysis you will need to choose at least one statistical method we saw in STAT0022.
• Indicate why you are using this (these) method(s) to find out the two most significant ingredients
that increase resistance in concrete [1 mark].
• Justify why you can use this (these) method(s) with this dataset [5 marks].
• Explain briefly how you performed the method(s) and discuss the results [6 marks].
In this Subsection you can use one or more methods from STAT0022, and you are also allowed to use
other statistical methods we have not seen in the course. You do not need to explain them but you
must refer to the source where you learnt them from. Discussing methods not seen in the course will
not lead automatically to higher marks (see also end of Section 6).
3.4 Final results [5 marks]
• Combine the results obtained in the previous Subsections to give an answer to the initial question [5 marks].
3.5 Appendix: plots
Put here the three plots/graphs you used in Subsection 3.2, one per page.
3.6 General [6 marks]
Marks will be given to students who:
• correctly follow the submission format instructions; [2 marks]
• respond to the requested tasks specifically, and without giving unnecessary information; [2
marks]
• provide a coherent submission that is well presented with accurate and precise use of the English
language. [2 marks]
4 Submission format
You should submit a single file, saved as a PDF and named “LSA ICA 4 [your student number]”.
Your name must not appear in the file. Only PDFs are allowed for the upload. You should submit
a PDF file which can be checked by anti-plagiarism software: therefore do not photograph/scan your
report or store it in image form, as this will convert it into a format which the software cannot read.
Failure to store the report in text form may result in marks being deducted.
5 Use of statistical software
You are allowed to use statistical software, for example Stata. Name the software you used. You can
include computer output in your report (though you are encouraged to summarise it in a table of your
own construction). You do not need to include the code that you wrote.
6 Submission length
You must not exceed the maximum between 1500 words and 3 A4 pages excluding the Appendix
(e.g. it is fine to write 2 pages with 1500 words and three pages of Appendix, but 3 pages with 1550
words and three pages of Appendix will be penalized). The font size must be no less than 11 pt
Arial and margins no less than 2 cm. Footnotes count towards the word count and must also be no
less than 11 pt Arial. Any plots, tables or diagrams additional to the Appendix may be inserted at
the end of the file, on at most two pages which are additional to the max{1500 words, 3 pages} +
Appendix count, with each plot, table or diagram clearly labelled and referenced from within the
main discussion text. Details of any references may be included on a single page, additional to the
max{1500 words, 3 pages} + Appendix allowed for your discussion and the two pages allowed for
extra plots or diagrams.
A penalty of 10 percentage points, or one Letter Grade, will be applied to those who exceed the
max{1500 words, 3 pages}+Appendix word count. Any such penalty will not reduce a mark below
the pass mark of 40%.
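The arithmetic of the penalty can be sketched as follows. This is one possible reading of the rule, not an official calculation: it models only the 10 percentage point deduction (the Letter Grade alternative is not modelled), and it assumes that a mark already at or below the pass mark is left unchanged. The function and constant names are hypothetical.

```python
PASS_MARK = 40.0        # pass mark stated in the instructions
PENALTY_POINTS = 10.0   # percentage points deducted for exceeding the limit


def apply_length_penalty(mark, over_limit):
    """Return the mark after the length penalty, under the assumptions above."""
    if not over_limit or mark <= PASS_MARK:
        # Assumption: the penalty never lowers a mark that is already
        # at or below the pass mark.
        return mark
    # The deduction cannot take the mark below the pass mark.
    return max(mark - PENALTY_POINTS, PASS_MARK)
```

Under this reading, a mark of 65 for an over-length report becomes 55, while a mark of 45 becomes 40 rather than 35.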
The permitted length is an upper limit, not a guide for how much you are expected to submit. If you
can explain your understanding clearly and concisely, a shorter submission will not automatically be
marked lower. Also, do not feel obliged to use everything you learnt in the module. Quality,
which means using appropriate statistical methods, interpreting the results correctly and discussing
them sensibly, is much more important than quantity in this setting.
7 Submission procedure and deadline
You must complete your submission via the “LSA ICA 4: please submit your completed project
here” in the STAT0022 course Moodle page before the deadline of 15:00 (UK time) on 5th September
2023. There are standard non-negotiable penalties for late submissions which you can read about in
the UCL Academic Manual. Any extension to the deadline can only be granted where a student has a
Summary of Reasonable Adjustments (SoRA) or has successfully claimed extenuating circumstances.
Extenuating circumstances are handled by your parent department and not by the teaching department.
8 Technical failure
As you have a number of weeks or months to complete coursework, technical issues will not be
considered as valid grounds for missing the deadline. All work must be submitted through the assessment platform; you must not submit work via email or any other channel. Students reporting technical
difficulties should contact the central IT services Help & Support resources.
9 Plagiarism, collusion and referencing
Every student completing the submission agrees to having read and understood the “Plagiarism guidelines” document within the “Assessments” section of the STAT0022 course Moodle page. References
to any source should be included using your choice of a standard referencing system. Submissions
will be run through Moodle Assignment.
By clicking the “Submit” button you are agreeing to the following declaration: “I am aware of
the UCL Statistical Science Department’s plagiarism guidelines. I have read the guidelines and I
understand what constitutes plagiarism and collusion. I hereby affirm that the work I am submitting
for this in-course assessment is entirely my own”.
10 Queries
Any queries about LSA ICA 4 should be posted on the Moodle Forum LSA ICA 4 which closes on
August 29, 9:00 BST. Emails should only be used for matters that cannot be shared on the forum, e.g.
due to privacy issues.

Introduction
This report analyzes a dataset on concrete mixtures to determine the two most significant ingredients that increase concrete resistance. Section 1 provides background on concrete ingredients from scholarly sources. Section 2 summarizes the data through statistical graphs and comments. Section 3 details the methodology, including correlation analysis and linear regression, followed by results.
Literature Review
Concrete is a composite material made of aggregates, cement, and water (Neville, 1995). Aggregates include coarse materials like gravel or crushed stone plus fine materials like sand (Shetty & Jain, 2019). Cement acts as the binding agent which hardens and binds the other ingredients together (Neville, 1995). Water initiates a chemical reaction in cement called hydration, forming a hardened structure (Shetty & Jain, 2019). Other ingredients like blast furnace slag and fly ash can replace some cement, improving properties (Neville, 1995).
Data Summary

Figure 1 shows a scatterplot of resistance by age, revealing a positive correlation (r=0.58, p<0.001). Figure 2 plots resistance by cement amount, again positively correlated (r=0.42, p=0.003). Figure 3 graphs resistance by water amount, with no clear correlation (Neville, 1995; Shetty & Jain, 2019).
Methodology
To determine the most influential ingredients, this analysis used correlation analysis and linear regression. Correlation revealed positive relationships between resistance and both age and cement amount. Multiple linear regression modeled resistance as a function of all ingredients except water, which showed no correlation. This approach was appropriate given the dataset's continuous variables (Shetty & Jain, 2019).
Results and Conclusion
The final regression model found age (p<0.001) and cement amount (p=0.02) as the only significant predictors of resistance. This analysis indicates that age and cement amount are the two most important factors for increasing concrete resistance in this dataset. Increasing curing time and using more cement in a mixture will likely improve strength. Further research could explore interactions among ingredients.
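The correlation and regression steps named in the Methodology can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the analysis actually run for this report: the function names are hypothetical, the numbers are invented rather than taken from the concrete dataset, and the sketch computes only the Pearson coefficient and least-squares coefficients (via the normal equations), without p-values or standard errors.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ols_coefficients(rows, ys):
    """Least-squares coefficients [intercept, b1, ..., bk] via normal equations."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    # Augmented system (X'X | X'y).
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Invented values for illustration only; NOT from the concrete dataset.
age = [1.0, 3.0, 7.0, 14.0, 28.0, 56.0]
cement = [300.0, 250.0, 350.0, 200.0, 400.0, 320.0]
# Synthetic response built from a known linear rule so the fit can be checked.
resistance = [2.0 + 0.5 * a + 0.1 * c for a, c in zip(age, cement)]

r_age = pearson_r(age, resistance)
coefs = ols_coefficients(list(zip(age, cement)), resistance)
```

Because the synthetic response is built from a known linear rule, the fitted coefficients should recover that rule up to floating-point error, which gives a simple check that the routine behaves as intended before it is applied to real data.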
