Date | May 2021 | Marks available | 2 | Reference code | 21M.2.SL.TZ1.1 |
Level | Standard Level | Paper | Paper 2 | Time zone | Time zone 1 |
Command term | Copy and Complete | Question number | 1 | Adapted from | N/A |
Question
As part of his mathematics exploration about classic books, Jason investigated the time taken by students in his school to read the book The Old Man and the Sea. He collected his data by stopping and asking students in the school corridor, until he reached his target of 10 students from each of the literature classes in his school.
Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this book.
Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.
For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea (x), measured in hours, and paired this with their percentage score on the final exam (y). These data are represented on the scatter diagram.
Jason correctly calculates the equation of the regression line y on x for these students to be
y = −1.54x + 98.8.
He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.
Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.
Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.
State which of the two sampling methods, systematic or quota, Jason has used.
Write down the median time to read the book.
Calculate the interquartile range.
Determine whether Jason is correct. Support your reasoning.
Describe the correlation.
Find the percentage score calculated by Jason.
State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer.
Copy and complete the information in the following table.
Calculate the value of r_s.
Interpret your result.
Markscheme
Quota sampling A1
[1 mark]
10 (hours) A1
[1 mark]
15 − 7 (M1)
Note: Award M1 for 15 and 7 seen.
8 A1
[2 marks]
indication of a valid attempt to find the upper fence (M1)
15 + 1.5 × 8
27 A1
25 < 27 (accept equivalent answer in words) R1
Jason is correct A1
Note: Do not award R0A1. Follow through within this part from their 27, but only if their value is supported by a valid attempt or clearly and correctly explains what their value represents.
[4 marks]
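The interquartile range and upper-fence check above can be sketched in a few lines of Python. This is only an illustration: the quartile values 7 and 15 are taken from the markscheme, since the box and whisker diagram is not reproduced here, and the variable names are assumptions.

```python
# Quartiles as quoted in the markscheme (the diagram itself is not shown here).
q1, q3 = 7, 15
mackenzie_hours = 25

iqr = q3 - q1                   # interquartile range: 15 - 7
upper_fence = q3 + 1.5 * iqr    # a value above this would count as an outlier
is_outlier = mackenzie_hours > upper_fence

print(iqr, upper_fence, is_outlier)  # 8 27.0 False
```

Because 25 is below the upper fence of 27, the check agrees with Jason's claim that Mackenzie's time is not an outlier.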
“negative” seen A1
Note: Strength cannot be inferred visually; ignore “strong” or “weak”.
[1 mark]
correct substitution (M1)
y = −1.54 × 1.5 + 98.8
96.5 (%) (96.49) A1
[2 marks]
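As a quick check of the substitution, the regression line can be evaluated directly. This is a minimal sketch; the function name predicted_score is an assumption for illustration, while the coefficients come from the question.

```python
def predicted_score(hours):
    """Regression line y on x from the question: y = -1.54x + 98.8."""
    return -1.54 * hours + 98.8

estimate = predicted_score(1.5)
print(round(estimate, 2))  # value before rounding to 3 significant figures
```

Rounded to three significant figures this gives the 96.5 quoted in the markscheme.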
not reliable A1
extrapolation OR outside the given range of the data R1
Note: Do not award A1R0. Only accept reasoning that includes reference to the range of the data. Do not accept a contextual reason such as 1.5 hours is too short to read the book.
[2 marks]
A1A1
Note: Do not award A1 for correct ranks for ‘number of pages’. Award A1 for correct ranks for ‘top 50 rating’.
[2 marks]
0.714 (0.714285…) A2
Note: FT from their table.
[2 marks]
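The data table is not reproduced here, so the coefficient cannot be recomputed from the ranks. As a hedged consistency check only, the sketch below uses the standard no-ties formula r_s = 1 − 6Σd² / (n(n² − 1)) with n = 8 books. The value Σd² = 24 is an assumption inferred by working backwards from the markscheme answer, not a figure given in the source, and it only applies if the ranks contain no ties.

```python
def spearman_from_sum_d2(n, sum_d2):
    """Spearman's rank correlation from the sum of squared rank differences.

    Valid only when there are no tied ranks.
    """
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# n = 8 comes from the question; sum_d2 = 24 is inferred, not given in the source.
r_s = spearman_from_sum_d2(n=8, sum_d2=24)
print(round(r_s, 3))
```

With these inputs the formula reproduces the markscheme value of 0.714 to three significant figures.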
EITHER
there is a (strong/moderate) positive association between the number of pages and the top 50 rating. A1
OR
there is a (strong/moderate) agreement between the rank order of number of pages and the rank order top 50 rating. A1
OR
there is a (strong/moderate) positive (linear) correlation between the rank order of number of pages and the rank order top 50 rating. A1
Note: Follow through from their value of r_s.
[1 mark]