Pearson correlation coefficient: Introduction, formula, calculation, and examples

What is the Pearson correlation coefficient?

The Pearson correlation coefficient, also called Pearson’s correlation coefficient or Pearson’s r, is a statistical measure of the strength and direction of the linear relationship between two variables.

In simple words, Pearson’s correlation coefficient describes how consistently one variable changes as the other variable changes.

For example: Up to a certain age, a child’s height will (in most cases) keep increasing as the child gets older. Of course, growth also depends on other factors such as genes, location, diet, and lifestyle.

This approach is based on covariance, which makes it one of the most widely used methods for measuring the linear relationship between two variables.
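As a sketch of what "based on covariance" means, the standard textbook definition divides the covariance of the two variables by the product of their standard deviations. The sample means \bar{x} and \bar{y}, the standard deviations \sigma_X and \sigma_Y, and the index i are notation introduced here for illustration, not taken from this article.

$$ r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}} $$

Dividing by the standard deviations removes the units of measurement, which is why r always lies between -1 and 1.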


What does the Pearson correlation coefficient test do?

The Pearson correlation coefficient test looks at the relationship between two variables. It seeks to draw a line through the data of the two variables to show their relationship, and the coefficient measures how closely the data follow that line. This linear relationship can be positive or negative.

Pearson correlation coefficient linear relationship types

For example: 

  • Positive linear relationship: In most cases, a person’s income increases as his/her age increases.
  • Negative linear relationship: If a vehicle increases its speed, the time taken to travel a given distance decreases, and vice versa.

From the examples above, it is evident that the Pearson correlation coefficient, r, tries to find out two things: the strength and the direction of the relationship in the given sample.

Pearson correlation coefficient formula

The correlation coefficient formula measures the relationship between the variables. It returns a value between -1 and 1. Use the Pearson correlation coefficient formula below to measure the strength of the relationship between two variables.

Pearson correlation coefficient formula:

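$$ r = \frac{N\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} $$

Where:
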
N = the number of pairs of scores

Σxy = the sum of the products of paired scores

Σx = the sum of x scores

Σy = the sum of y scores

Σx² = the sum of squared x scores

Σy² = the sum of squared y scores
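The formula can be sketched as a small Python function. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and its argument names are made up for this example, and it uses the raw-sum quantities defined above (N, Σxy, Σx, Σy, Σx², Σy²).

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw-sum formula, for paired lists xs and ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # N: number of pairs
    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx²
    sum_y2 = sum(y * y for y in ys)               # Σy²

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when x or y never varies; r is undefined in that case.
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator
```

For instance, `pearson_r([1, 2, 3], [2, 4, 6])` gives 1.0 because the points lie exactly on a rising line, while `pearson_r([1, 2, 3], [6, 4, 2])` gives -1.0.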

Pearson correlation coefficient calculator

Here is a step-by-step guide to calculating Pearson’s correlation coefficient:

Step one: Create a Pearson correlation coefficient table. Make a data chart that includes both variables. Label these variables ‘x’ and ‘y.’ Add three additional columns: (xy), (x^2), and (y^2). Refer to this simple data chart.

[Image: data table with columns x, y, xy, x^2, and y^2]

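Step two: Use basic multiplication to complete the table. For each row, multiply x by y to get xy, and square x and y to get x^2 and y^2.

[Image: data table with the xy, x^2, and y^2 columns filled in]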

Step three: Add up each column to get its total.

[Image: data table with the column totals added]

Step four: Plug the column totals into the correlation formula.

If the result is negative, there is a negative correlation between the two variables. If the result is positive, there is a positive correlation between the variables. The result also indicates the strength of the linear relationship, i.e., strong positive relationship, strong negative relationship, medium positive relationship, and so on.
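The four steps above can be sketched in Python as follows. The data values are hypothetical and invented purely for illustration (the article's own table is not reproduced here), and the variable names are assumptions of this sketch.

```python
import math

# Hypothetical sample data, invented for illustration (not from the article).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Steps one and two: build the table rows (x, y, xy, x^2, y^2).
rows = [(xi, yi, xi * yi, xi ** 2, yi ** 2) for xi, yi in zip(x, y)]

# Step three: total each column.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(col) for col in zip(*rows))
n = len(rows)

# Step four: plug the totals into the formula.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Print the completed table, the totals row, and the result.
print(f"{'x':>4}{'y':>4}{'xy':>5}{'x^2':>5}{'y^2':>5}")
for row in rows:
    print("{:>4}{:>4}{:>5}{:>5}{:>5}".format(*row))
print("{:>4}{:>4}{:>5}{:>5}{:>5}".format(sum_x, sum_y, sum_xy, sum_x2, sum_y2))
print(f"r = {r:.4f}")
```

For this made-up data the column totals are 15, 20, 66, 55, and 86, and r comes out at roughly 0.77, which would be read as a positive relationship.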

Determining the strength of the Pearson product-moment correlation coefficient

The Pearson product-moment correlation coefficient, or simply the Pearson correlation coefficient or Pearson’s r, determines the strength of the linear relationship between two variables. The stronger the association between the two variables, the closer your answer will be to 1 or -1. A value of exactly 1 or -1 signifies that all the data points lie on the straight line of ‘best fit,’ so one variable can be predicted exactly from the other. The closer your answer lies to 0, the more the data points vary around the line and the weaker the linear relationship.

How to interpret the Pearson correlation coefficient

Below are the proposed guidelines for interpreting the Pearson correlation coefficient:

[Image: table of proposed guidelines for interpreting the Pearson correlation coefficient]

Note that the strength of the association between the variables depends on what you measure and on the sample size.

On a graph, one can notice the relationship between the variables and make assumptions before even calculating the coefficient. The closer the data points on the scatterplot lie to the line, the stronger the relationship between the variables. The further they move from the line, the weaker the relationship gets. If the data points are scattered randomly and the best-fit line is nearly parallel to the x-axis, it’s safe to assume that there is no linear correlation between the two variables.

What do the terms strength and direction mean?

The terms ‘strength’ and ‘direction’ each have a specific statistical meaning. Here’s a straightforward explanation of the two words:

  • Strength: Strength signifies how closely the two variables are related, that is, how consistently one variable changes as the other changes. Values that are close to +1 or -1 indicate a strong relationship. These values are attained if the data points fall on or very close to the line. The further the data points move away from the line, the weaker the linear relationship. When there is no practical way to draw a straight line because the data points are scattered, the linear relationship is at its weakest.
  • Direction: The direction of the line indicates a positive or negative linear relationship between the variables. If the line has an upward slope, the variables have a positive relationship. This means an increase in the value of one variable goes with an increase in the value of the other variable. A downward slope depicts a negative correlation. This means an increase in the value of one variable goes with a decrease in the value of the other variable.
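The two ideas can be sketched as a small helper that reads the direction from the sign of r and the strength from its absolute value. The function name `describe_correlation` and the cut-off values 0.3 and 0.5 are assumptions chosen for illustration; they are not thresholds given in this article, and appropriate cut-offs depend on the field and the data.

```python
def describe_correlation(r, weak=0.3, strong=0.5):
    """Return (direction, strength) labels for a Pearson r value.

    The cut-offs `weak` and `strong` are illustrative assumptions,
    not thresholds taken from the article.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    # Direction comes from the sign of r.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # Strength comes from how far r is from 0, regardless of sign.
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= weak:
        strength = "medium"
    else:
        strength = "weak"
    return direction, strength
```

With these assumed cut-offs, `describe_correlation(-0.4)` returns `("negative", "medium")`.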


Examples of Pearson’s correlation coefficient

Let’s look at some visual examples to help you interpret the Pearson correlation coefficient:

  • Large positive correlation:
    [Figure: scatterplot showing a large positive correlation]

The above figure depicts a correlation of almost +1.
The data points lie almost exactly on the straight line.
The slope is positive, which means that as one variable increases, the other variable also increases.
The change in one variable closely tracks the change in the other variable.
An example of a large positive correlation would be: As children grow, so do their clothes and shoe sizes.

  • Medium positive correlation:
    [Figure: scatterplot showing a medium positive correlation]

The figure above depicts a positive correlation.
The correlation is above +0.8 but below +1.
It shows a pretty strong linear uphill pattern.
An example of a medium positive correlation would be: As the number of automobiles increases, so does the demand for fuel.

  • Small negative correlation:
    [Figure: scatterplot showing a small negative correlation]

In the figure above, the data points are not as close to the straight line as in the earlier examples.
It shows a negative linear correlation of approximately -0.5.
The slope is negative, so as one variable increases, the other tends to decrease.
An example of a small negative correlation would be: The more somebody eats, the less hungry they get.

  • Weak / no correlation:
    [Figure: scatterplot showing a weak or no correlation]

The data points are far away from the line.
It is tough to practically draw a line.
The correlation is approximately +0.15.
It can’t be judged whether an increase in one variable goes with an increase or a decrease in the other variable.
An example of a weak/no correlation would be: An increase in fuel prices has little or no bearing on how many people adopt pets.
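The patterns in these examples can be imitated with simulated data. The sketch below is only an illustration under stated assumptions: the helper names `pearson_r` and `simulate`, the slopes, the noise levels, and the random seed are all made up here, and the generated points are not the data behind the article's figures. More noise around the line pushes r towards 0, and the sign of the slope sets the sign of r.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(slope, noise_sd, n=1000, seed=0):
    """Draw n points with y = slope * x + Gaussian noise and return r."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, noise_sd) for x in xs]
    return pearson_r(xs, ys)


# Little noise around a rising line: r close to +1.
r_large_positive = simulate(slope=1, noise_sd=0.1)
# Falling line with heavy noise: r around -0.5.
r_small_negative = simulate(slope=-1, noise_sd=math.sqrt(3))
# No underlying line at all: r close to 0.
r_none = simulate(slope=0, noise_sd=1)

print(r_large_positive, r_small_negative, r_none)
```

The exact values depend on the seed and sample size, so they are best read as approximate.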