Specific Objectives
By the end of this topic, the learner should be able to:
- State the measures of central tendency;
- Calculate the mean using the assumed mean method;
- Create a cumulative frequency table;
- Estimate the median and the quartiles by:
  - Calculation, and
  - Using an ogive;
- Define and calculate the measures of dispersion: range, quartiles, interquartile range, quartile deviation, variance, and standard deviation;
- Interpret measures of dispersion.
Content
- Mean from assumed mean;
- Cumulative frequency table;
- Ogive;
- Median;
- Quartiles;
- Range;
- Interquartile range;
- Quartile deviation;
- Variance;
- Standard deviation.
The mean, the mode, and the median are called measures of central tendency.
Mean Using a Working (Assumed) Mean
The assumed mean method is a technique for calculating the arithmetic mean and standard deviation of a data set. It simplifies the calculations by reducing the size of numbers involved.
Example
The masses, to the nearest kilogram, of 40 students in a Form 3 class were measured and recorded in the table below. Calculate the mean mass.
| Mass (kg) | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of students | 2 | 0 | 1 | 2 | 3 | 2 | 5 | 6 | 7 | 5 | 3 | 2 | 1 | 1 |
Solution
We use an assumed mean of 53.
| Mass x (kg) | t = x – 53 | f | ft |
|---|---|---|---|
| 47 | -6 | 2 | -12 |
| 48 | -5 | 0 | 0 |
| 49 | -4 | 1 | -4 |
| 50 | -3 | 2 | -6 |
| 51 | -2 | 3 | -6 |
| 52 | -1 | 2 | -2 |
| 53 | 0 | 5 | 0 |
| 54 | 1 | 6 | 6 |
| 55 | 2 | 7 | 14 |
| 56 | 3 | 5 | 15 |
| 57 | 4 | 3 | 12 |
| 58 | 5 | 2 | 10 |
| 59 | 6 | 1 | 6 |
| 60 | 7 | 1 | 7 |
| | | Σf = 40 | Σft = 40 |
Mean of t = Σft / Σf = 40 / 40 = 1
Mean of x = 53 + mean of t = 53 + 1 = 54
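The working above can be checked with a short Python sketch. The function name `mean_by_assumed_mean` is only illustrative; the data are taken from the table in the example.

```python
# Masses (kg) and frequencies from the example above.
masses = [47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
freqs = [2, 0, 1, 2, 3, 2, 5, 6, 7, 5, 3, 2, 1, 1]


def mean_by_assumed_mean(values, freqs, assumed):
    """Return the mean computed from the deviations t = x - assumed."""
    total_f = sum(freqs)
    total_ft = sum(f * (x - assumed) for x, f in zip(values, freqs))
    return assumed + total_ft / total_f


mean_mass = mean_by_assumed_mean(masses, freqs, 53)
print(mean_mass)  # 54.0
```

Any other assumed mean, for example 50, gives the same answer; a value near the centre of the data simply keeps the deviations small.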
Mean of Grouped Data
The masses to the nearest gram of 100 eggs were as follows:
| Mass (g) | 100-103 | 104-107 | 108-111 | 112-115 | 116-119 | 120-123 |
|---|---|---|---|---|---|---|
| Frequency | 1 | 15 | 42 | 31 | 8 | 3 |
Find the mean mass.
Solution
Let us use a working mean of 109.5.
| Class | Mid-point x | t = x – 109.5 | f | ft |
|---|---|---|---|---|
| 100-103 | 101.5 | -8 | 1 | -8 |
| 104-107 | 105.5 | -4 | 15 | -60 |
| 108-111 | 109.5 | 0 | 42 | 0 |
| 112-115 | 113.5 | 4 | 31 | 124 |
| 116-119 | 117.5 | 8 | 8 | 64 |
| 120-123 | 121.5 | 12 | 3 | 36 |
| Σ | | | 100 | 156 |
Mean of t = Σft / Σf = 156 / 100 = 1.56
Therefore, mean of x = 109.5 + 1.56 = 111.06 g
To simplify the working further, we subtract the working mean from each mid-point and then divide the result by the class width. To obtain the mean of the original data from the mean of the new set of data, we reverse the steps: multiply the new mean by the class width, then add the working mean.
Example
The example above is used to demonstrate the steps.
| Class | Mid-point x | t = (x – 109.5) / 4 | f | ft |
|---|---|---|---|---|
| 100-103 | 101.5 | -2 | 1 | -2 |
| 104-107 | 105.5 | -1 | 15 | -15 |
| 108-111 | 109.5 | 0 | 42 | 0 |
| 112-115 | 113.5 | 1 | 31 | 31 |
| 116-119 | 117.5 | 2 | 8 | 16 |
| 120-123 | 121.5 | 3 | 3 | 9 |
| Σ | | | 100 | 39 |
Mean of t = 39 / 100 = 0.39
Therefore, mean of x = 0.39 × 4 + 109.5 = 1.56 + 109.5 = 111.06 g
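The two steps (code the mid-points, then reverse the coding) can be sketched in Python as follows. The function name `coded_mean` is hypothetical; the mid-points, frequencies, working mean, and class width come from the egg example.

```python
def coded_mean(midpoints, freqs, working_mean, class_width):
    """Mean of grouped data using t = (x - working_mean) / class_width."""
    t_values = [(x - working_mean) / class_width for x in midpoints]
    mean_t = sum(f * t for f, t in zip(freqs, t_values)) / sum(freqs)
    # Reverse the coding: multiply by the class width, then add the working mean.
    return mean_t * class_width + working_mean


midpoints = [101.5, 105.5, 109.5, 113.5, 117.5, 121.5]
freqs = [1, 15, 42, 31, 8, 3]

egg_mean = coded_mean(midpoints, freqs, 109.5, 4)
print(round(egg_mean, 2))  # 111.06
```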
Quartiles, Deciles, and Percentiles
A median divides a set of data into two equal parts with an equal number of items.
Quartiles divide a set of data into four equal parts. The lower quartile is the median of the bottom half, the upper quartile is the median of the top half, and the middle quartile coincides with the median of the whole set of data.
Deciles divide a set of data into ten equal parts. Percentiles divide a set of data into one hundred equal parts.
Note: For percentiles, deciles, and quartiles, the data must be arranged in order of size.
Example
The table below shows the heights, in centimetres, of 40 students.
| Height in cm | 145-149 | 150-154 | 155-159 | 160-164 | 165-169 | 170-174 | 175-179 |
|---|---|---|---|---|---|---|---|
| Frequency | 2 | 5 | 16 | 9 | 5 | 2 | 1 |
Calculate the:
- Median height;
- Lower quartile and upper quartile;
- 80th percentile.
Solution
There are 40 students. Therefore, the median height is the average of the heights of the 20th and 21st students.
| Class | Frequency | Cumulative frequency |
|---|---|---|
| 145-149 | 2 | 2 |
| 150-154 | 5 | 7 |
| 155-159 | 16 | 23 |
| 160-164 | 9 | 32 |
| 165-169 | 5 | 37 |
| 170-174 | 2 | 39 |
| 175-179 | 1 | 40 |
Both the 20th and 21st students fall in the 155-159 class. This class is called the median class. Using the formula:
m = L + ((N/2 – C) / f) × I
Where:
- L is the lower class boundary of the median class;
- N is the total frequency;
- C is the cumulative frequency before the median class;
- I is the class width (class interval);
- f is the frequency of the median class.
Therefore:
Height of the 20th student = 154.5 + ((20 – 7) / 16) × 5 = 154.5 + 4.0625 = 158.5625 cm
Height of the 21st student = 154.5 + ((21 – 7) / 16) × 5 = 154.5 + 4.375 = 158.875 cm
Therefore, median height = (158.5625 + 158.875) / 2 = 158.7 cm
Lower quartile = L + ((N/4 – C) / f) × I
The 10th student falls in the 155-159 class.
= 154.5 + ((10 – 7) / 16) × 5 = 154.5 + 0.9375 = 155.44 cm
Upper quartile = L + ((3N/4 – C) / f) × I
The 30th student falls in the 160-164 class.
= 159.5 + ((30 – 23) / 9) × 5 = 159.5 + 3.89 = 163.39 cm
Note: The median corresponds to the middle quartile or the 50th percentile.
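The same interpolation applies to any position in the data, including the 80th percentile asked for in the question. The sketch below is one possible way to check the working; the function name `value_at_position` is illustrative, and the median is taken as the average of the 20th and 21st values, as in the solution above.

```python
def value_at_position(lower_boundaries, freqs, class_width, position):
    """Estimate the value at a cumulative position using L + ((position - C) / f) * I."""
    cumulative = 0  # C: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and position <= cumulative + f:
            return lower + (position - cumulative) / f * class_width
        cumulative += f
    raise ValueError("position exceeds the total frequency")


lower_boundaries = [144.5, 149.5, 154.5, 159.5, 164.5, 169.5, 174.5]
freqs = [2, 5, 16, 9, 5, 2, 1]
n = sum(freqs)  # 40 students

lower_quartile = value_at_position(lower_boundaries, freqs, 5, n / 4)
upper_quartile = value_at_position(lower_boundaries, freqs, 5, 3 * n / 4)
percentile_80 = value_at_position(lower_boundaries, freqs, 5, 80 * n / 100)
median_height = (
    value_at_position(lower_boundaries, freqs, 5, 20)
    + value_at_position(lower_boundaries, freqs, 5, 21)
) / 2

print(round(lower_quartile, 2))  # 155.44
print(round(upper_quartile, 2))  # 163.39
print(round(percentile_80, 2))   # 164.5
print(round(median_height, 2))   # 158.72
```

The 80th percentile is the height of the 32nd student, which is the last position in the 160-164 class, so the estimate equals the upper boundary of that class.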
Example
Determine the upper quartile and the lower quartile for the following set of numbers:
5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9
Solution
Arranging in ascending order:
2, 3, 5, 5, 6, 7, 7, 8, 8, 9, 10
The median is 7.
The lower quartile is the median of the first half, which is 5.
The upper quartile is the median of the second half, which is 8.
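The "median of each half" method used above can be sketched in Python. The function name `quartiles_by_halves` is illustrative, and the sketch assumes that for an odd number of items the middle item is left out of both halves, as in the solution. Other quartile conventions exist and may give slightly different values.

```python
from statistics import median


def quartiles_by_halves(data):
    """Return (lower quartile, median, upper quartile) by the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For odd n, skip the middle item so it belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)


print(quartiles_by_halves([5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9]))  # (5, 7, 8)
```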
Median from Cumulative Frequency Curve
A cumulative frequency curve is called an ogive. We draw it by plotting cumulative frequency against the upper class boundary of each class.
Example
Given the class intervals and frequencies, we first find the cumulative frequency as shown below. Then draw the graph of cumulative frequency against the upper class limit.
| Arm Span (cm) | Frequency (f) | Cumulative Frequency |
|---|---|---|
| 140 ≤ x < 145 | 3 | 3 |
| 145 ≤ x < 150 | 1 | 4 |
| 150 ≤ x < 155 | 4 | 8 |
| 155 ≤ x < 160 | 8 | 16 |
| 160 ≤ x < 165 | 7 | 23 |
| 165 ≤ x < 170 | 5 | 28 |
| 170 ≤ x < 175 | 2 | 30 |
Solution
The graph is not reproduced here. The readings below refer to an ogive of test marks, not to the arm span table above:
- The median = 39.5;
- The lower quartile = 28.5;
- The upper quartile = 53.5;
- 23 candidates scored 55 and over;
- The pass mark is 31 if 70% of pupils are to pass;
- The middle 50% of the marks lie between the lower and the upper quartiles, i.e., between 28.5 and 53.5 marks;
- The middle 80% of the marks lie between the 1st decile and the 9th decile, i.e., between 18 and 69 marks.
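For the arm span table in this example, readings from an ogive can be approximated by linear interpolation between the plotted points. The sketch below assumes straight-line segments (a smooth hand-drawn curve may give slightly different readings) and takes N/2, N/4, and 3N/4 as the reading positions. The function name `read_ogive` is illustrative.

```python
from itertools import accumulate

upper_boundaries = [145, 150, 155, 160, 165, 170, 175]
freqs = [3, 1, 4, 8, 7, 5, 2]
cumulative = list(accumulate(freqs))  # [3, 4, 8, 16, 23, 28, 30]


def read_ogive(target_cf, start=140):
    """Arm span at which the ogive reaches target_cf (straight-line segments assumed)."""
    xs = [start] + upper_boundaries  # the curve starts at (140, 0)
    cfs = [0] + cumulative
    for i in range(1, len(xs)):
        if target_cf <= cfs[i]:
            rise = cfs[i] - cfs[i - 1]
            return xs[i - 1] + (target_cf - cfs[i - 1]) / rise * (xs[i] - xs[i - 1])
    raise ValueError("target exceeds the total frequency")


n = cumulative[-1]
median_span = read_ogive(n / 2)
lower_quartile_span = read_ogive(n / 4)
upper_quartile_span = read_ogive(3 * n / 4)

print(round(median_span, 2))          # 159.38
print(round(lower_quartile_span, 2))  # 154.38
print(round(upper_quartile_span, 2))  # 164.64
```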

Measures of Dispersion
Range
The range is the difference between the highest value and the lowest value in a data set.
Disadvantage: It depends only on the two extreme values and ignores the distribution of the other data points.
Interquartile Range
The interquartile range is the upper quartile minus the lower quartile. It is the spread of the middle 50% of the values in the data set.
Semi-Interquartile Range
The semi-interquartile range is half the interquartile range. It is also called the quartile deviation and is a measure of spread.
Mean Absolute Deviation
The mean absolute deviation is found by calculating the absolute differences of each number from the mean and then finding their average. It measures the average distance of data points from the mean.
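As an illustration, the four measures defined so far can be computed for the set of eleven numbers used in the earlier quartile example. This is a minimal sketch; the quartile values 5 and 8 are taken from that example rather than recomputed.

```python
from statistics import mean

data = sorted([5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9])
lower_quartile, upper_quartile = 5, 8  # from the worked quartile example

data_range = max(data) - min(data)                    # highest minus lowest
interquartile_range = upper_quartile - lower_quartile
quartile_deviation = interquartile_range / 2          # semi-interquartile range

m = mean(data)
mean_absolute_deviation = sum(abs(x - m) for x in data) / len(data)

print(data_range, interquartile_range, quartile_deviation)  # 8 3 1.5
print(round(mean_absolute_deviation, 2))                     # 1.97
```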
Variance
Variance is the mean of the squares of the deviations from the mean. It measures how spread out the data points are around the mean.
Example
| Deviation from mean (d) | +1 | -1 | +6 | -4 | -2 | -11 | +1 | 10 |
|---|---|---|---|---|---|---|---|---|
| d² | 1 | 1 | 36 | 16 | 4 | 121 | 1 | 100 |
Variance = …
The square root of the variance is called the standard deviation. It is also called the root mean square deviation. For the above example, its standard deviation = …
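Assuming the eight deviations in the table are the whole data set (so n = 8), the values can be worked out from the squared deviations as follows. This is a reconstruction from the table, not the original working.

```latex
\[
\text{Variance} = \frac{\sum d^2}{n}
= \frac{1 + 1 + 36 + 16 + 4 + 121 + 1 + 100}{8}
= \frac{280}{8} = 35
\]
\[
\text{Standard deviation} = \sqrt{35} \approx 5.92
\]
```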
Example
The table below shows the number of children per family in a housing estate.
| Number of children | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of families | 1 | 5 | 11 | 27 | 10 | 4 | 2 |
Calculate:
- The mean number of children per family;
- The standard deviation.
Solution
| Number of children (x) | Families (f) | fx | Deviation d = x – m | f d² |
|---|---|---|---|---|
| 0 | 1 | 0 | -3 | 9 |
| 1 | 5 | 5 | -2 | 20 |
| 2 | 11 | 22 | -1 | 11 |
| 3 | 27 | 81 | 0 | 0 |
| 4 | 10 | 40 | 1 | 10 |
| 5 | 4 | 20 | 2 | 16 |
| 6 | 2 | 12 | 3 | 18 |
Mean = …
Variance = …
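The mean, variance, and standard deviation for this example can be computed directly from the frequency table. The sketch below uses the definitions given earlier (variance as the mean of the squared deviations from the mean); the variable names are illustrative.

```python
from math import sqrt

children = [0, 1, 2, 3, 4, 5, 6]
families = [1, 5, 11, 27, 10, 4, 2]

total_f = sum(families)  # 60 families
mean_children = sum(f * x for x, f in zip(children, families)) / total_f
variance = sum(f * (x - mean_children) ** 2 for x, f in zip(children, families)) / total_f
std_dev = sqrt(variance)

print(mean_children, variance, round(std_dev, 3))  # 3.0 1.4 1.183
```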
Example
The table below shows the distribution of marks of 40 candidates in a test.
| Marks | 1-10 | 11-20 | 21-30 | 31-40 | 41-50 | 51-60 | 61-70 | 71-80 | 81-90 | 91-100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 2 | 2 | 3 | 9 | 12 | 5 | 2 | 3 | 1 | 1 |
Calculate the mean and standard deviation.
| Marks | Midpoint (x) | Frequency (f) | fx | d = x – m | d² | f d² |
|---|---|---|---|---|---|---|
| 1-10 | 5.5 | 2 | 11.0 | -39.5 | 1560.25 | 3120.5 |
| 11-20 | 15.5 | 2 | 31.0 | -29.5 | 870.25 | 1740.5 |
| 21-30 | 25.5 | 3 | 76.5 | -19.5 | 380.25 | 1140.75 |
| 31-40 | 35.5 | 9 | 319.5 | -9.5 | 90.25 | 812.25 |
| 41-50 | 45.5 | 12 | 546.0 | 0.5 | 0.25 | 3.00 |
| 51-60 | 55.5 | 5 | 277.5 | 10.5 | 110.25 | 551.25 |
| 61-70 | 65.5 | 2 | 131.0 | 20.5 | 420.25 | 840.5 |
| 71-80 | 75.5 | 3 | 226.5 | 30.5 | 930.25 | 2790.75 |
| 81-90 | 85.5 | 1 | 85.5 | 40.5 | 1640.25 | 1640.25 |
| 91-100 | 95.5 | 1 | 95.5 | 50.5 | 2550.25 | 2550.25 |
| Σ | | 40 | 1800 | | | 15190 |
Mean = …
Variance = …
Variance = 379.8
Standard deviation = 19.49
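The stated results are consistent with the column totals of the table (Σf = 40, Σfx = 1800, Σfd² = 15190). One way to write out the working, as a reconstruction rather than the original steps:

```latex
\[
\text{Mean} = \frac{\sum fx}{\sum f} = \frac{1800}{40} = 45
\]
\[
\text{Variance} = \frac{\sum f d^2}{\sum f} = \frac{15190}{40} = 379.75 \approx 379.8
\]
\[
\text{Standard deviation} = \sqrt{379.75} \approx 19.49
\]
```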
Note: Adding or subtracting a constant to or from each number in a set of data does not alter the value of the variance or standard deviation.
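This property can be checked numerically. The sketch below uses Python's standard `statistics` module with an illustrative data set and an arbitrary shift of 100; both choices are assumptions made only for the demonstration.

```python
from math import isclose
from statistics import pstdev, pvariance

data = [5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9]
shifted = [x + 100 for x in data]  # add the same constant to every value

# The mean moves by 100, but the spread about the mean is unchanged.
same_variance = isclose(pvariance(data), pvariance(shifted))
same_std_dev = isclose(pstdev(data), pstdev(shifted))
print(same_variance, same_std_dev)  # True True
```

This is why subtracting an assumed mean from every value leaves the variance and standard deviation unchanged.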
More Formulas
The formula for calculating the variance is:
…
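The formula itself is missing from the text here. The worked example that follows is consistent with the standard computational form of the variance shown below, though this may not be the exact form the original showed. The symbols A (working mean) and c (class width) are used as in the assumed mean method.

```latex
\[
\text{Variance} = \frac{\sum f x^2}{\sum f} - \left( \frac{\sum f x}{\sum f} \right)^2
\]
\[
\text{If } t = \frac{x - A}{c}, \text{ then }
\text{variance of } x = c^2 \times \text{variance of } t
= c^2 \left[ \frac{\sum f t^2}{\sum f} - \left( \frac{\sum f t}{\sum f} \right)^2 \right]
\]
```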
Example
The table below shows the length in centimeters of 80 plants of a particular species of tomato.
| Length | 152-156 | 157-161 | 162-166 | 167-171 | 172-176 | 177-181 |
|---|---|---|---|---|---|---|
| Frequency | 12 | 14 | 24 | 15 | 8 | 7 |
Calculate the mean and the standard deviation.
Solution
Let A = 169.
| Length | Mid-point x | x – 169 | t = (x – 169) / 5 | f | ft |
|---|---|---|---|---|---|
| 152-156 | 154 | -15 | -3 | 12 | -36 |
| 157-161 | 159 | -10 | -2 | 14 | -28 |
| 162-166 | 164 | -5 | -1 | 24 | -24 |
| 167-171 | 169 | 0 | 0 | 15 | 0 |
| 172-176 | 174 | 5 | 1 | 8 | 8 |
| 177-181 | 179 | 10 | 2 | 7 | 14 |
Mean = …
Therefore, mean = -4.125 + 169 = 164.875 ≈ 164.9 (to 4 significant figures)
Variance of t = …
= 2.8 – 0.6806 = 2.119
Therefore, variance of x = 2.119 × 25 = 52.975 ≈ 52.98 (to 4 significant figures)
Standard deviation of x = √52.98 ≈ 7.279 ≈ 7.28 (to 2 decimal places)
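The whole calculation can be checked with a short Python sketch of the coded method, using A = 169 and a class width of 5 as in the solution. The variable names are illustrative.

```python
from math import sqrt

midpoints = [154, 159, 164, 169, 174, 179]
freqs = [12, 14, 24, 15, 8, 7]
A, c = 169, 5  # working mean and class width

t = [(x - A) / c for x in midpoints]  # coded values: -3, -2, -1, 0, 1, 2
n = sum(freqs)
mean_t = sum(f * ti for f, ti in zip(freqs, t)) / n
var_t = sum(f * ti ** 2 for f, ti in zip(freqs, t)) / n - mean_t ** 2

mean_x = A + c * mean_t   # reverse the coding for the mean
var_x = c ** 2 * var_t    # the variance scales by the square of the class width
sd_x = sqrt(var_x)

print(round(mean_x, 3), round(var_x, 2), round(sd_x, 2))  # 164.875 52.98 7.28
```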
End of topic
Did you understand everything? If not, ask a teacher, friends, or anybody and make sure you understand before going to sleep!
Past KCSE Questions on the Topic
Every week the number of absentees in a school was recorded. This was done for 39 weeks. These observations were tabulated as shown below:
| Number of absentees | 0-3 | 4-7 | 8-11 | 12-15 | 16-19 | 20-23 |
|---|---|---|---|---|---|---|
| Number of weeks | 6 | 9 | 8 | 11 | 3 | 2 |

Estimate the median absentee rate per week in the school.
The table below shows high altitude wind speeds recorded at a weather station over a period of 100 days.
| Wind speed (knots) | 0-19 | 20-39 | 40-59 | 60-79 | 80-99 | 100-119 | 120-139 | 140-159 | 160-179 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency (days) | 9 | 19 | 22 | 18 | 13 | 11 | 5 | 2 | 1 |

(a) On the grid provided, draw a cumulative frequency graph for the data.
(b) Use the graph to estimate:
- The interquartile range;
- The number of days when the wind speed exceeded 125 knots.
Five pupils A, B, C, D, and E obtained the marks 53, 41, 60, 80, and 56 respectively. The table below shows part of the work to find the standard deviation.
| Pupil | Mark x | x – a | (x – a)² |
|---|---|---|---|
| A | 53 | -5 | |
| B | 41 | -17 | |
| C | 60 | 2 | |
| D | 80 | 22 | |
| E | 56 | -2 | |

(a) Complete the table.
(b) Find the standard deviation.
In an agricultural research centre, the length of a sample of 50 maize cobs was measured and recorded as shown in the frequency distribution table below.
| Length in cm | 8-10 | 11-13 | 14-16 | 17-19 | 20-22 | 23-25 |
|---|---|---|---|---|---|---|
| Number of cobs | 4 | 7 | 11 | 15 | 8 | 5 |

Calculate:
- The mean;
- (i) The variance;
- (ii) The standard deviation.
The table below shows the frequency distribution of masses of 50 newborn calves in a ranch.
| Mass (kg) | Frequency |
|---|---|
| 15-18 | 2 |
| 19-22 | 3 |
| 23-26 | 10 |
| 27-30 | 14 |
| 31-34 | 13 |
| 35-38 | 6 |
| 39-42 | 2 |
(a) On the grid provided, draw a cumulative frequency graph for the data.
(b) Use the graph to estimate:
- The median mass;
- The probability that a calf picked at random has a mass lying between 25 kg and 28 kg.
The table below shows the weight and price of three commodities in a given period.
| Commodity | Weight | Price relatives |
|---|---|---|
| X | 3 | 125 |
| Y | 4 | 164 |
| Z | 2 | 140 |
Calculate the retail index for the group of commodities.
The number of people who attended an agricultural show in one day was 510 men, 1080 women, and some children. When the information was represented on a pie chart, the combined angle for the men and women was 216°. Find the angle representing the children.
The mass of 40 babies in a certain clinic was recorded as follows:
| Mass in kg | No. of babies |
|---|---|
| 1.0-1.9 | 6 |
| 2.0-2.9 | 14 |
| 3.0-3.9 | 10 |
| 4.0-4.9 | 7 |
| 5.0-5.9 | 2 |
| 6.0-6.9 | 1 |
Calculate:
- The interquartile range of the data;
- The standard deviation of the data using 3.45 as the assumed mean.
The data below shows the masses in grams of 50 potatoes.
| Mass (g) | 25-34 | 35-44 | 45-54 | 55-64 | 65-74 | 75-84 | 85-94 |
|---|---|---|---|---|---|---|---|
| No. of potatoes | 3 | 6 | 16 | 12 | 8 | 4 | 1 |

(a) On the grid provided, draw a cumulative frequency curve for the data.
(b) Use the graph in (a) above to determine:
- The 60th percentile mass;
- The percentage of potatoes whose masses lie in the range 53g to 68g.
The histogram below represents the distribution of marks obtained in a test.
The bar marked A has a height of 3.2 units and a width of 5 units. The bar marked B has a height of 1.2 units and a width of 10 units.

If the frequency of the class represented by bar B is 6, determine the frequency of the class represented by bar A.
A frequency distribution of marks obtained by 120 candidates is to be represented in a histogram. The table below shows the grouped marks, the frequencies for all the groups, and the area and height of the rectangle for the group 30-60 marks.

| Marks | 0-10 | 10-30 | 30-60 | 60-70 | 70-100 |
|---|---|---|---|---|---|
| Frequency | 12 | 40 | 36 | 8 | 24 |
| Area of rectangle | | | 180 | | |
| Height of rectangle | | | 6 | | |

(a) (i) Complete the table.
(ii) On the grid provided below, draw the histogram.
(b) (i) State the group in which the median mark lies.
(ii) A vertical line drawn through the median mark divides the total area of the histogram into two equal parts. Using this information or otherwise, estimate the median mark.


