Individual Activity - Linear Correlation Coefficient
GOALS:
a. To understand the correlation coefficient qualitatively.
b. To be able to calculate the correlation coefficient.
c. To understand the limits of the correlation coefficient.
DISCUSSION:
The linear correlation coefficient (r) is a number between -1 and 1 that measures how closely a set of points falls along a straight line. The closer the correlation coefficient is to zero, the less the points follow a straight line (hence the term "linear" correlation coefficient).
The closer the correlation coefficient is to one, the more the points will fall along a line stretching from the lower left to the upper right.
The closer the correlation coefficient is to negative one, the more the points will fall along a line stretching from the upper left to the lower right.
CALCULATING THE CORRELATION COEFFICIENT:
Consider a data set of N pairs of numbers:
| n | x | y |
|---|---|---|
| 1 | x1 | y1 |
| 2 | x2 | y2 |
| 3 | x3 | y3 |
| ... | ... | ... |
| N | xN | yN |
(i) The first step is to calculate the averages (<x> and <y>) of the x and y values:
<x> = (1/N) (x1 + x2 + x3 + ... + xN)
<y> = (1/N) (y1 + y2 + y3 + ... + yN)
(ii) Calculate the standard deviations (s_x and s_y) of the x and y data sets:
(iii) Calculate the covariance between the two data sets:
(iv) The correlation coefficient is then defined as:
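For reference, the standard definitions for steps (ii) through (iv) can be written as follows. The N − 1 (sample) normalization is an assumption here; using N instead changes the standard deviations and the covariance, but the factor cancels in r as long as one convention is used throughout.

$$
s_x = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \langle x \rangle\right)^2},
\qquad
s_y = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(y_i - \langle y \rangle\right)^2}
$$

$$
s_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \langle x \rangle\right)\left(y_i - \langle y \rangle\right)
$$

$$
r = \frac{s_{xy}}{s_x \, s_y}
$$

Because r is a ratio of the covariance to the product of the two standard deviations, it has no units and always lies between -1 and 1.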
EXAMPLE 1:
Calculate the correlation coefficient for the following x and y data sets: (Here is some graph paper in case you don't have any at home.)
| n | x | y |
|---|---|---|
| 1 | 1 | 4 |
| 2 | 2 | 4 |
| 3 | 3 | 3 |
| 4 | 4 | 2 |
| 5 | 5 | 1 |
In this case, the x variable corresponds to a constantly increasing parameter, such as time.
(i) Calculate <x> and <y>, average x and average y:
<x> = 1/5 (1 + 2 + 3 + 4 + 5) = 3.0
<y> = 1/5 (4 + 4 + 3 + 2 + 1) = 2.8
(ii) Calculate s_x and s_y, the standard deviations of x and y:
(iii) Calculate s_xy, the covariance:

(iv) Calculate r, the correlation coefficient:
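A worked version of steps (ii) through (iv) for this data set is sketched below. It assumes the sample (N − 1) normalization for the standard deviations and the covariance; with N instead, the intermediate values change but r does not.

$$
\begin{aligned}
\sum_i \left(x_i - \langle x \rangle\right)^2 &= 4 + 1 + 0 + 1 + 4 = 10 \\
\sum_i \left(y_i - \langle y \rangle\right)^2 &= 1.44 + 1.44 + 0.04 + 0.64 + 3.24 = 6.8 \\
\sum_i \left(x_i - \langle x \rangle\right)\left(y_i - \langle y \rangle\right) &= -2.4 - 1.2 + 0 - 0.8 - 3.6 = -8.0
\end{aligned}
$$

$$
s_x = \sqrt{\frac{10}{4}} \approx 1.58,
\qquad
s_y = \sqrt{\frac{6.8}{4}} \approx 1.30,
\qquad
s_{xy} = \frac{-8.0}{4} = -2.0
$$

$$
r = \frac{s_{xy}}{s_x \, s_y} = \frac{-2.0}{\sqrt{2.5 \times 1.7}} \approx -0.97
$$

A value this close to -1 means the points fall nearly along a line running from the upper left to the lower right.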
EXAMPLE 2:
Calculate the correlation coefficient for the following x and y data sets: (Here is some graph paper in case you don't have any at home.)
| n | x | y |
|---|---|---|
| 1 | 2 | 2 |
| 2 | 3 | 3 |
| 3 | 3 | 2 |
| 4 | 3 | 1 |
| 5 | 4 | 2 |
(i) The first step is to calculate the averages (<x> and <y>) of the x and y values:
<x> = 1/5 (2 + 3 + 3 + 3 + 4) = 3.0
<y> = 1/5 (2 + 3 + 2 + 1 + 2) = 2.0
(ii) Calculate the standard deviations (s_x and s_y) of the x and y data sets:
(iii) Calculate s_xy, the covariance:
(iv) Calculate r, the correlation coefficient:
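In this data set every product of an x deviation and a y deviation is zero, so the covariance and therefore r are zero, even though the points form a clear pattern. The four steps can also be checked numerically. The following is a minimal Python sketch, not part of the original activity; the function name `correlation_coefficient` and the N − 1 (sample) normalization are assumptions made for illustration.

```python
import math


def correlation_coefficient(xs, ys):
    """Follow steps (i)-(iv) and return the intermediate values and r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length data sets with at least 2 points")

    # (i) averages of the x and y values
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # (ii) standard deviations (assumption: sample convention, divide by N - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))

    # (iii) covariance, using the same N - 1 convention
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

    # (iv) correlation coefficient; undefined if either data set is constant
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a data set has zero spread")
    r = s_xy / (s_x * s_y)

    return {"mean_x": mean_x, "mean_y": mean_y,
            "s_x": s_x, "s_y": s_y, "s_xy": s_xy, "r": r}


# Data from Example 1 and Example 2
example_1 = correlation_coefficient([1, 2, 3, 4, 5], [4, 4, 3, 2, 1])
example_2 = correlation_coefficient([2, 3, 3, 3, 4], [2, 3, 2, 1, 2])

for name, result in (("Example 1", example_1), ("Example 2", example_2)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Run as a script, this should report r of about -0.97 for Example 1 and r of 0 for Example 2. The second result illustrates a limit of the correlation coefficient: a value of zero only says the points do not follow a straight line, not that they are unrelated.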

