Specific Objectives

By the end of this topic, the learner should be able to:

  • State the measures of central tendency;
  • Calculate the mean using the assumed mean method;
  • Create a cumulative frequency table;
  • Estimate the median and the quartiles by:
    • Calculation, and
    • Using an ogive;
  • Define and calculate the measures of dispersion: range, quartiles, interquartile range, quartile deviation, variance, and standard deviation;
  • Interpret measures of dispersion.

Content

  • Mean from assumed mean;
  • Cumulative frequency table;
  • Ogive;
  • Median;
  • Quartiles;
  • Range;
  • Interquartile range;
  • Quartile deviation;
  • Variance;
  • Standard deviation.

The mean, the mode, and the median are called measures of central tendency.

Mean Using a Working (Assumed) Mean

The assumed mean method is a technique for calculating the arithmetic mean and standard deviation of a data set. It simplifies the calculations by reducing the size of numbers involved.

Example

The masses, to the nearest kilogram, of 40 students in a Form 3 class were measured and recorded in the table below. Calculate the mean mass.

Mass (kg)             47  48  49  50  51  52  53  54  55  56  57  58  59  60
Number of students     2   0   1   2   3   2   5   6   7   5   3   2   1   1

Solution

We use an assumed mean of 53.

Mass x (kg)    t = x – 53    f     ft
47             -6            2     -12
48             -5            0     0
49             -4            1     -4
50             -3            2     -6
51             -2            3     -6
52             -1            2     -2
53             0             5     0
54             1             6     6
55             2             7     14
56             3             5     15
57             4             3     12
58             5             2     10
59             6             1     6
60             7             1     7
                             Σf = 40    Σft = 40

Mean of t = Σft / Σf = 40 / 40 = 1

Mean of x = 53 + mean of t = 53 + 1 = 54
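
The working above can be checked with a short script. This is a minimal sketch; the function name mean_by_assumed_mean is chosen here for illustration and does not come from the notes.

```python
# Masses (kg) and the number of students with each mass, from the table above.
masses = [47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
freqs = [2, 0, 1, 2, 3, 2, 5, 6, 7, 5, 3, 2, 1, 1]


def mean_by_assumed_mean(values, frequencies, assumed_mean):
    """Return the mean computed from the deviations t = x - assumed_mean."""
    total_f = sum(frequencies)
    total_ft = sum(f * (x - assumed_mean) for x, f in zip(values, frequencies))
    return assumed_mean + total_ft / total_f


mean_mass = mean_by_assumed_mean(masses, freqs, 53)
print(mean_mass)  # 54.0

# A different assumed mean gives the same answer; only the size of t changes.
print(mean_by_assumed_mean(masses, freqs, 50))  # 54.0
```

Any convenient value can serve as the assumed mean. One near the middle of the data keeps the deviations small.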

Mean of Grouped Data

The masses to the nearest gram of 100 eggs were as follows:

Mass (g)     100-103   104-107   108-111   112-115   116-119   120-123
Frequency    1         15        42        31        8         3

Find the mean mass.

Solution

Let us use a working mean of 109.5.

Class      Mid-point x    t = x – 109.5    f      ft
100-103    101.5          -8               1      -8
104-107    105.5          -4               15     -60
108-111    109.5          0                42     0
112-115    113.5          4                31     124
116-119    117.5          8                8      64
120-123    121.5          12               3      36
Σ                                          100    156

Mean of t = Σft / Σf = 156 / 100 = 1.56

Therefore, mean of x = 109.5 + 1.56 = 111.06 g

To simplify the calculation further, we subtract the working mean from each mid-point and then divide the result by the class width. To obtain the mean of the original data from the mean of the new set of values, we reverse these steps:

  • Multiply the mean of t by the class width and then add the working mean.

Example

The example above is used to demonstrate the steps.

Class      Mid-point x    t = (x – 109.5) / 4    f      ft
100-103    101.5          -2                     1      -2
104-107    105.5          -1                     15     -15
108-111    109.5          0                      42     0
112-115    113.5          1                      31     31
116-119    117.5          2                      8      16
120-123    121.5          3                      3      9
Σ                                                100    39

Mean of t = 39 / 100 = 0.39

Therefore, mean of x = 0.39 × 4 + 109.5 = 1.56 + 109.5 = 111.06 g
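
The same steps can be written as a short script. This is a sketch only; the name coded_mean is an illustrative choice, and the mid-points are taken from the table above.

```python
# Class mid-points and frequencies for the egg masses (grams).
midpoints = [101.5, 105.5, 109.5, 113.5, 117.5, 121.5]
freqs = [1, 15, 42, 31, 8, 3]


def coded_mean(mid_values, frequencies, working_mean, class_width):
    """Mean of grouped data using t = (x - working_mean) / class_width."""
    total_f = sum(frequencies)
    total_ft = sum(
        f * (x - working_mean) / class_width
        for x, f in zip(mid_values, frequencies)
    )
    mean_t = total_ft / total_f
    # Reverse the coding: multiply by the class width, then add the working mean.
    return mean_t * class_width + working_mean


mean_mass = coded_mean(midpoints, freqs, 109.5, 4)
print(round(mean_mass, 2))  # 111.06
```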


Quartiles, Deciles, and Percentiles

A median divides a set of data into two equal parts with an equal number of items.

Quartiles divide a set of data into four equal parts. The lower quartile is the median of the bottom half of the data. The upper quartile is the median of the top half. The middle quartile coincides with the median of the whole set of data.

Deciles divide a set of data into ten equal parts. Percentiles divide a set of data into one hundred equal parts.

Note: For percentiles, deciles, and quartiles, the data must be arranged in order of size.

Example

Height in cm    145-149   150-154   155-159   160-164   165-169   170-174   175-179
Frequency       2         5         16        9         5         2         1

Calculate the:

  1. Median height;
  2. Lower quartile and upper quartile;
  3. 80th percentile.

Solution

There are 40 students. Therefore, the median height is the average of the heights of the 20th and 21st students.

Class      Frequency    Cumulative frequency
145-149    2            2
150-154    5            7
155-159    16           23
160-164    9            32
165-169    5            37
170-174    2            39
175-179    1            40

Both the 20th and 21st students fall in the 155-159 class. This class is called the median class. Using the formula:

m = L + ((N/2 – C) / f) × I

Where:

  • L is the lower class boundary of the median class (154.5 for the 155-159 class);
  • N is the total frequency;
  • C is the cumulative frequency before the median class;
  • I is the class width (class interval);
  • f is the frequency of the median class.

Therefore:

Height of the 20th student = 154.5 + ((20 – 7) / 16) × 5 = 154.5 + 4.0625 = 158.5625 cm

Height of the 21st student = 154.5 + ((21 – 7) / 16) × 5 = 154.5 + 4.375 = 158.875 cm

Therefore, median height = (158.5625 + 158.875) / 2 = 158.7 cm

Lower quartile = L + ((N/4 – C) / f) × I

The 10th student falls in the 155-159 class.

= 154.5 + ((10 – 7) / 16) × 5 = 154.5 + 0.9375 = 155.44 cm

Upper quartile = L + ((3N/4 – C) / f) × I

The 30th student falls in the 160-164 class.

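= 159.5 + ((30 – 23) / 9) × 5 = 159.5 + 3.89 = 163.39 cm

80th percentile = L + ((80N/100 – C) / f) × I

The 32nd student falls in the 160-164 class.

= 159.5 + ((32 – 23) / 9) × 5 = 159.5 + 5 = 164.5 cm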

Note: The median corresponds to the middle quartile or the 50th percentile.
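
The interpolation formula used in this example can be sketched in code as follows. The function name value_at_position is hypothetical, and the lower class boundaries are written out by hand from the table. The median is taken as the average of the 20th and 21st values, as in the solution above.

```python
# Lower class boundaries and frequencies for the height table above.
lower_boundaries = [144.5, 149.5, 154.5, 159.5, 164.5, 169.5, 174.5]
freqs = [2, 5, 16, 9, 5, 2, 1]
class_width = 5


def value_at_position(position):
    """Estimate the value of the item at a given position (1 to N)
    using L + ((position - C) / f) * I inside the class that contains it."""
    cumulative = 0  # C: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and position <= cumulative + f:
            return lower + (position - cumulative) / f * class_width
        cumulative += f
    raise ValueError("position is larger than the total frequency")


n = sum(freqs)  # 40 students
median = (value_at_position(n / 2) + value_at_position(n / 2 + 1)) / 2
lower_quartile = value_at_position(n / 4)
upper_quartile = value_at_position(3 * n / 4)
percentile_80 = value_at_position(80 * n / 100)

print(round(median, 1))          # 158.7
print(round(lower_quartile, 2))  # 155.44
print(round(upper_quartile, 2))  # 163.39
print(percentile_80)             # 164.5
```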

Example

Determine the upper quartile and the lower quartile for the following set of numbers:

5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9

Solution

Arranging in ascending order:

2, 3, 5, 5, 6, 7, 7, 8, 8, 9, 10

The median is 7.

The lower quartile is the median of the first half, which is 5.

The upper quartile is the median of the second half, which is 8.
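
The method used in this solution can be sketched as below. It assumes the convention followed above: when the number of items is odd, the median itself is left out of both halves. Other textbooks and software use slightly different conventions and may give slightly different quartiles.

```python
from statistics import median


def quartiles(data):
    """Return (lower quartile, median, upper quartile) using the
    median-of-halves method; the middle item is excluded when n is odd."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)


q1, q2, q3 = quartiles([5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9])
print(q1, q2, q3)  # 5 7 8
```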

Median from Cumulative Frequency Curve

A graph for cumulative frequency is called an ogive. We plot a graph of cumulative frequency against the upper class limit.

Example

Given the class intervals and frequencies, we first find the cumulative frequency as shown below. Then draw the graph of cumulative frequency against the upper class limit.

Arm Span (cm)     Frequency (f)    Cumulative Frequency
140 ≤ x < 145     3                3
145 ≤ x < 150     1                4
150 ≤ x < 155     4                8
155 ≤ x < 160     8                16
160 ≤ x < 165     7                23
165 ≤ x < 170     5                28
170 ≤ x < 175     2                30
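
The cumulative frequency column and the points to plot can be produced with a short script. This is a sketch only. The helper read_ogive is hypothetical and joins the plotted points with straight lines, whereas a hand-drawn ogive is usually a smooth curve, so a reading taken from a graph may differ slightly.

```python
from itertools import accumulate

# Upper class limits and frequencies from the arm span table above.
upper_limits = [145, 150, 155, 160, 165, 170, 175]
freqs = [3, 1, 4, 8, 7, 5, 2]

cumulative = list(accumulate(freqs))
print(cumulative)  # [3, 4, 8, 16, 23, 28, 30]

# Points of the ogive, starting from zero at the lowest class limit.
ogive_points = [(140, 0)] + list(zip(upper_limits, cumulative))


def read_ogive(target_cf):
    """Return the arm span at which the cumulative frequency reaches
    target_cf, assuming straight lines between plotted points."""
    for (x0, c0), (x1, c1) in zip(ogive_points, ogive_points[1:]):
        if c1 > c0 and c0 <= target_cf <= c1:
            return x0 + (target_cf - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the range of the cumulative frequency")


# The median corresponds to half of the total frequency.
median_estimate = read_ogive(cumulative[-1] / 2)
```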

Solution

Reading from the graph:

  • The median = 39.5;
  • The lower quartile = 28.5;
  • The upper quartile = 53.5.

23 candidates scored 55 and over.

Pass mark is 31 if 70% of pupils are to pass.

(i) The middle 50% include the marks between the lower and the upper quartiles, i.e., between 28.5 and 53.5 marks.

(ii) The middle 80% include the marks between the first decile and the 9th decile, i.e., between 18 and 69 marks.


Measures of Dispersion

Range

The range is the difference between the highest value and the lowest value in a data set.

Disadvantage: It depends only on the two extreme values and ignores the distribution of the other data points.

Interquartile Range

The interquartile range is the difference between the upper quartile and the lower quartile. It is the spread of the middle 50% of the values in the data set.

Semi-Interquartile Range

The semi-interquartile range is half the interquartile range. It is also called the quartile deviation and is a measure of spread.
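
As an illustration, the three measures defined so far can be computed for the set of numbers from the earlier quartile example (5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9), whose quartiles were found to be 5 and 8. This is a minimal sketch that reuses those values rather than recomputing them.

```python
data = [5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9]
lower_quartile, upper_quartile = 5, 8  # taken from the earlier worked example

data_range = max(data) - min(data)
interquartile_range = upper_quartile - lower_quartile
quartile_deviation = interquartile_range / 2

print(data_range, interquartile_range, quartile_deviation)  # 8 3 1.5
```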

Mean Absolute Deviation

The mean absolute deviation is found by calculating the absolute differences of each number from the mean and then finding their average. It measures the average distance of data points from the mean.

Variance

Variance is the mean of the squares of the deviations from the mean. It measures how spread out the data points are around the mean.
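
In symbols, the last two definitions can be written as shown below. The notation is an assumption made here to match the tables used in this topic: x is a data value, f is its frequency (f = 1 when each value is listed once), and m is the mean.

```latex
\[
\text{Mean absolute deviation} = \frac{\sum f\,\lvert x - m \rvert}{\sum f},
\qquad
\text{Variance} = \frac{\sum f\,(x - m)^{2}}{\sum f}
\]
```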

Example

Deviation from mean (d)    +1    -1    +6    -4    -2    -11    +1    +10
d²                         1     1     36    16    4     121    1     100

Variance = …

The square root of the variance is called the standard deviation. It is also called the root mean square deviation. For the above example, its standard deviation = …
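
A short script can carry out this calculation. The sketch below assumes that the table lists eight deviations, each occurring once, which is how the two rows of the table are read here.

```python
from math import sqrt

# Deviations from the mean, as listed in the table above (they sum to zero).
deviations = [1, -1, 6, -4, -2, -11, 1, 10]

squares = [d * d for d in deviations]
variance = sum(squares) / len(deviations)
standard_deviation = sqrt(variance)

print(sum(squares))                  # 280
print(variance)                      # 35.0
print(round(standard_deviation, 2))  # 5.92
```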

Example

The table below shows the number of children per family in a housing estate.

Number of children    0    1    2     3     4     5    6
Number of families    1    5    11    27    10    4    2

Calculate:

  1. The mean number of children per family;
  2. The standard deviation.

Solution

Number of children (x)    Families (f)    fx    d = x – m    fd²
0                         1               0     -3           9
1                         5               5     -2           20
2                         11              22    -1           11
3                         27              81    0            0
4                         10              40    1            10
5                         4               20    2            16
6                         2               12    3            18

Mean = …

Variance = …
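
The table can be checked with a short script. This is a sketch that applies the definitions given earlier (mean = Σfx / Σf and variance = Σfd² / Σf) to the data in the question. The variable names are illustrative.

```python
from math import sqrt

children = [0, 1, 2, 3, 4, 5, 6]
families = [1, 5, 11, 27, 10, 4, 2]

total_f = sum(families)
mean_children = sum(f * x for x, f in zip(children, families)) / total_f
sum_fd2 = sum(f * (x - mean_children) ** 2 for x, f in zip(children, families))
variance = sum_fd2 / total_f
standard_deviation = sqrt(variance)

print(total_f, mean_children)        # 60 3.0
print(sum_fd2, variance)             # 84.0 1.4
print(round(standard_deviation, 3))  # 1.183
```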

Example

The table below shows the distribution of marks of 40 candidates in a test.

Marks        1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80   81-90   91-100
Frequency    2      2       3       9       12      5       2       3       1       1

Calculate the mean and standard deviation.

Marks     Midpoint (x)    Frequency (f)    fx       d = x – m    d²         fd²
1-10      5.5             2                11.0     -39.5        1560.25    3120.5
11-20     15.5            2                31.0     -29.5        870.25     1740.5
21-30     25.5            3                76.5     -19.5        380.25     1140.75
31-40     35.5            9                319.5    -9.5         90.25      812.25
41-50     45.5            12               546.0    0.5          0.25       3.00
51-60     55.5            5                277.5    10.5         110.25     551.25
61-70     65.5            2                131.0    20.5         420.25     840.5
71-80     75.5            3                226.5    30.5         930.25     2790.75
81-90     85.5            1                85.5     40.5         1640.25    1640.25
91-100    95.5            1                95.5     50.5         2550.25    2550.25
Total                     40               1800                             15190

Mean = …

Variance = …

Variance = 379.8

Standard deviation = 19.49
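
The totals in the table and the final answers can be reproduced with the sketch below. It uses the class mid-points as representative marks, as the table does.

```python
from math import sqrt

midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5, 65.5, 75.5, 85.5, 95.5]
freqs = [2, 2, 3, 9, 12, 5, 2, 3, 1, 1]

total_f = sum(freqs)
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
mean_mark = sum_fx / total_f
sum_fd2 = sum(f * (x - mean_mark) ** 2 for x, f in zip(midpoints, freqs))
variance = sum_fd2 / total_f
standard_deviation = sqrt(variance)

print(sum_fx, mean_mark)             # 1800.0 45.0
print(sum_fd2, variance)             # 15190.0 379.75
print(round(standard_deviation, 2))  # 19.49
```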

Note: Adding or subtracting a constant to or from each number in a set of data does not alter the value of the variance or standard deviation.
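
This property can be checked numerically. The sketch below uses the set of numbers from the earlier quartile example and an arbitrary constant of 100. The function pvariance from Python's standard statistics module divides by the number of items, which matches the definition of variance used in this topic.

```python
from statistics import pvariance

data = [5, 10, 6, 5, 8, 7, 3, 2, 7, 8, 9]
shifted = [x + 100 for x in data]  # 100 is an arbitrary constant

var_original = pvariance(data)
var_shifted = pvariance(shifted)

# The two variances agree, so the standard deviations agree as well.
print(abs(var_original - var_shifted) < 1e-9)  # True
```

This is why subtracting an assumed mean from every value does not change the standard deviation.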

More Formulas

The variance can also be calculated using the formula:

Variance = Σfx² / Σf – (Σfx / Σf)²

Example

The table below shows the length in centimeters of 80 plants of a particular species of tomato.

Length (cm)    152-156   157-161   162-166   167-171   172-176   177-181
Frequency      12        14        24        15        8         7

Calculate the mean and the standard deviation.

Solution

Let A = 169.

Length     Mid-point x    x – 169    t = (x – 169) / 5    f     ft
152-156    154            -15        -3                   12    -36
157-161    159            -10        -2                   14    -28
162-166    164            -5         -1                   24    -24
167-171    169            0          0                    15    0
172-176    174            5          1                    8     8
177-181    179            10         2                    7     14

Mean = …

Therefore, mean = -4.125 + 169 = 164.875 = 164.9 (to 4 significant figures)

Variance of t = …

= 2.8 – 0.6806 = 2.119

Therefore, variance of x = 2.119 × 25 = 52.975 = 52.98 (4 significant figures)

Standard deviation of x = √52.98 = 7.279 = 7.28 (to 2 decimal places)
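
The whole calculation can be sketched in code as follows. The variable names are illustrative. The script keeps full precision throughout, so the unrounded variance (52.984375) differs slightly from the hand-rounded 52.975 above but gives the same result to 2 decimal places.

```python
from math import sqrt

midpoints = [154, 159, 164, 169, 174, 179]
freqs = [12, 14, 24, 15, 8, 7]
assumed_mean, class_width = 169, 5

# Coded values t = (x - A) / c.
t = [(x - assumed_mean) / class_width for x in midpoints]
total_f = sum(freqs)
mean_t = sum(f * ti for f, ti in zip(freqs, t)) / total_f
mean_of_t_squared = sum(f * ti ** 2 for f, ti in zip(freqs, t)) / total_f

# Undo the coding: the mean is scaled and shifted, the variance only scaled.
mean_x = assumed_mean + class_width * mean_t
variance_x = class_width ** 2 * (mean_of_t_squared - mean_t ** 2)
sd_x = sqrt(variance_x)

print(round(mean_x, 3))      # 164.875
print(round(variance_x, 2))  # 52.98
print(round(sd_x, 2))        # 7.28
```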

End of topic

Did you understand everything?
If not, ask a teacher, friends, or anybody and make sure you understand before going to sleep!

Past KCSE Questions on the Topic

  1. Every week the number of absentees in a school was recorded. This was done for 39 weeks. These observations were tabulated as shown below:

    Number of absentees    0-3    4-7    8-11    12-15    16-19    20-23
    Number of weeks        6      9      8       11       3        2

    Estimate the median absentee rate per week in the school.

  2. The table below shows high altitude wind speeds recorded at a weather station over a period of 100 days.

    Wind speed (knots)    0-19   20-39   40-59   60-79   80-99   100-119   120-139   140-159   160-179
    Frequency (days)      9      19      22      18      13      11        5         2         1

    (a) On the grid provided, draw a cumulative frequency graph for the data.

    (b) Use the graph to estimate:

    • The interquartile range;
    • The number of days when the wind speed exceeded 125 knots.
  3. Five pupils A, B, C, D, and E obtained the marks 53, 41, 60, 80, and 56 respectively. The table below shows part of the work to find the standard deviation.

    Pupil    Mark x    x – a    (x – a)²
    A        53        -5
    B        41        -17
    C        60        2
    D        80        22
    E        56        -2

    (a) Complete the table.

    (b) Find the standard deviation.

  4. In an agricultural research centre, the length of a sample of 50 maize cobs was measured and recorded as shown in the frequency distribution table below.

    Length in cm      8-10   11-13   14-16   17-19   20-22   23-25
    Number of cobs    4      7       11      15      8       5

    Calculate:

    (a) The mean;
    (b) (i) The variance;
        (ii) The standard deviation.
  5. The table below shows the frequency distribution of masses of 50 newborn calves in a ranch.

    Mass (kg)    Frequency
    15-18        2
    19-22        3
    23-26        10
    27-30        14
    31-34        13
    35-38        6
    39-42        2

    (a) On the grid provided, draw a cumulative frequency graph for the data.

    (b) Use the graph to estimate:

    1. The median mass;
    2. The probability that a calf picked at random has a mass lying between 25 kg and 28 kg.
  6. The table below shows the weight and price of three commodities in a given period.

    Commodity    Weight    Price relative
    X            3         125
    Y            4         164
    Z            2         140

    Calculate the retail index for the group of commodities.

  7. The number of people who attended an agricultural show in one day was 510 men, 1080 women, and some children. When the information was represented on a pie chart, the combined angle for the men and women was 216°. Find the angle representing the children.

  8. The mass of 40 babies in a certain clinic was recorded as follows:

    Mass in kg    No. of babies
    1.0-1.9       6
    2.0-2.9       14
    3.0-3.9       10
    4.0-4.9       7
    5.0-5.9       2
    6.0-6.9       1

    Calculate:

    1. The interquartile range of the data;
    2. The standard deviation of the data using 3.45 as the assumed mean.
  9. The data below shows the masses in grams of 50 potatoes.

    Mass (g)          25-34   35-44   45-54   55-64   65-74   75-84   85-94
    No. of potatoes   3       6       16      12      8       4       1

    (a) On the grid provided, draw a cumulative frequency curve for the data.

    (b) Use the graph in (a) above to determine:

    1. The 60th percentile mass;
    2. The percentage of potatoes whose masses lie in the range 53g to 68g.
  10. The histogram below represents the distribution of marks obtained in a test.

    The bar marked A has a height of 3.2 units and a width of 5 units. The bar marked B has a height of 1.2 units and a width of 10 units.

    [Figure: histogram of the test marks, with bars A and B marked]

    If the frequency of the class represented by bar B is 6, determine the frequency of the class represented by bar A.

  11. A frequency distribution of marks obtained by 120 candidates is to be represented in a histogram. The table below shows the grouped marks, the frequencies for all the groups, and the area and height of the rectangle for the group 30-60 marks.

    Marks                  0-10   10-30   30-60   60-70   70-100
    Frequency              12     40      36      8       24
    Area of rectangle                     180
    Height of rectangle                   6

    (a) (i) Complete the table.

    (ii) On the grid provided below, draw the histogram.

    (b) (i) State the group in which the median mark lies.

    (ii) A vertical line drawn through the median mark divides the total area of the histogram into two equal parts. Using this information or otherwise, estimate the median mark.



