How to Calculate Correlation in Excel: Step-by-Step Guide
Last Updated: 02 Sep, 2024
Understanding the relationship between two variables is essential in data analysis, and correlation is a powerful statistical tool to measure that relationship. Excel, as a versatile data analysis tool, allows you to calculate correlation easily.
In this article, you will learn the different methods to calculate correlation in Excel, including using built-in functions and data analysis tools. Whether you're a beginner or an advanced Excel user, this step-by-step guide will help you efficiently analyze the relationships between variables.
What is Correlation in Excel?
Correlation measures the strength and direction of the linear relationship between two variables. The correlation coefficient ranges from -1 to 1:
- +1 indicates a perfect positive correlation, where variables move in the same direction.
- -1 indicates a perfect negative correlation, where variables move in opposite directions.
- 0 means no linear relationship exists between the variables.
Understanding the Correlation Coefficient
Discover how the correlation coefficient helps interpret the relationship between variables.
The correlation coefficient quantifies how strongly two variables are related. The closer the coefficient is to 1 or -1, the stronger the relationship:
Positive Correlation: When the coefficient is positive, both variables increase or decrease together.
Negative Correlation: When the coefficient is negative, one variable increases while the other decreases.
No Correlation: A coefficient close to zero suggests no linear relationship between the variables.
What is Correlation Data Analysis in Excel?
Before calculating correlation, make sure your data is well organized in a spreadsheet. Each variable should have its own column, and each row should represent one observation or data point. Use the following points to prepare your data:
- Open Excel: After launching Microsoft Excel, create a new spreadsheet or open an existing sheet that contains the data you want to analyze.
- Organize your data: Enter the data in the appropriate cells, ensuring that each variable has its own column and each row represents one observation.
- Data Format: Your data should be in numeric format for accurate correlation analysis. If any of it is stored in a non-numeric format, convert it to numeric format first.
You can also enter the correlation formula yourself. The Pearson correlation coefficient r is defined as:
r = \frac{\sum (x-\overline{x})(y-\overline{y})}{\sqrt{\sum (x-\overline{x})^{2} \, \sum (y-\overline{y})^{2}}}
where x and y are the measurements, ∑ denotes the sum over all observations, and \overline{x} and \overline{y} (x and y with bars over them) are the mean values of the measurements.
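To see how the terms of the formula above fit together, here is a minimal Python sketch that computes the coefficient directly from its definition. The function name `pearson_r` and the toy data are assumptions made for illustration only; they are not part of Excel or of the example used later in this article.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed directly from its definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of the deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data: one series rises with x, the other falls with x
hours = [1, 2, 3, 4, 5]
score_up = [10, 20, 30, 40, 50]
score_down = [50, 40, 30, 20, 10]

r_up = pearson_r(hours, score_up)
r_down = pearson_r(hours, score_down)
print(round(r_up, 6), round(r_down, 6))  # prints: 1.0 -1.0
```

The two toy series are perfectly linear in `hours`, so they reproduce the two extreme cases described earlier: a perfect positive and a perfect negative correlation.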
How to Calculate Correlation in Excel: Step by Step Guide
The value of the correlation coefficient ranges from -1 to +1. The closer the value is to -1 or +1, the more strongly the two variables are related. If the correlation coefficient is 0, there is no linear relationship between them. Let's understand this with an example in which we calculate the Pearson correlation coefficient using Excel. Suppose we have the height and weight records of 10 students in a class:
Height (in cm) | Weight (in kg) |
---|---|
155 | 66 |
178 | 82 |
148 | 62 |
162 | 70 |
165 | 71 |
172 | 74 |
158 | 64 |
152 | 65 |
176 | 80 |
185 | 93 |
We can calculate correlation in Excel using two methods:
Method 1: Using CORREL() Function
Excel has a built-in CORREL() function that can be used for calculating the Pearson correlation coefficient. The basic syntax for CORREL() is given as:
=CORREL(array1, array2)
Where array1 and array2 are the arrays of records of the first entity and second entity, respectively.
Step 1: We can calculate the Correlation coefficient between both attributes using the formula applied in the A13 cell, i.e.,
=CORREL(A2:A11, B2:B11)
We pass the first array, Height (in cm) from A2:A11 as the first parameter, and the second array, Weight (in kg) from B2:B11 as the second parameter inside the CORREL() formula.
Figure: Using the CORREL() function to calculate Pearson's correlation coefficient
The correlation coefficient comes out to be 0.959232649, which is very close to +1. We can therefore conclude that the height and weight of the students are strongly positively correlated: a taller student is likely to weigh more as well.
Method 2: Using the Data Analysis Tool
Step 1: Open the Data Tab
Go to the Data tab in the ribbon and look for Data Analysis. If you don't see it, you may need to enable the Analysis ToolPak add-in from Excel Options.

Step 2: Click on Data Analysis
On the Data tab, select the Data Analysis option.

Step 3: Select the Correlation Option
A Data Analysis dialog box listing the available tools will appear. In this dialog box, select the Correlation option.
Figure: Data Analysis dialog box
Step 4: Fill in the Correlation Dialog Box
A second dialog box for Correlation will appear. First, give the input range by selecting the entire table. Since our data is grouped by columns, select the Columns option. Our data also has labels in the first row, so tick the checkbox for labels in the first row. The output can be placed in the current sheet, a new worksheet, or a new workbook; here we select the new worksheet option and click OK.
Figure: Filling all the values inside the correlation dialog box
Step 5: Preview the Result
The output will get automatically generated in the new worksheet.
Figure: The correlation table generated using the Data Analysis tool
In the new worksheet, a correlation table is generated in which the correlation between height and weight is 0.959232649, the same value we obtained with the first method.
Correlations calculated in Excel are a good place to start when creating a marketing, sales, or spending plan, but they don't provide the full picture. Excel's built-in data analysis options are worth using to quickly assess the correlation between two variables and to treat that result as a starting point for more in-depth analysis.
Creating a Correlation Matrix in Excel
Learn how to create a correlation matrix to analyze multiple variables in a dataset.
A correlation matrix allows you to examine relationships between multiple variables simultaneously:
Step 1: Organize Your Data
Ensure each variable is in a separate column and each observation is in a row.
Step 2: Select the Data Range
Highlight the entire range of data, including column headers.
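Step 3: Use the CORREL Function
Go to the Formulas tab and click More Functions > Statistical > CORREL.
Step 4: Enter the Data Ranges in the Function Wizard
CORREL returns a single coefficient for one pair of variables. In the function wizard, select the two columns of a pair as the two arrays and click OK, then repeat this for every other pair of variables.
Step 5: Review the Correlation Matrix
Arrange the coefficients in a grid with the variable names along the rows and columns to see how each variable relates to the others. Alternatively, the Correlation option of the Data Analysis tool generates the whole matrix in one step when the input range covers all the variable columns.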
Benefits of Calculating Correlation in Excel
Understand the advantages of using Excel to calculate correlation for data analysis.
Identify Relationships: Determine if and how strongly variables are related.
Support Decision-Making: Use correlation to make informed decisions in marketing, sales, finance, and other fields.
Visualize Data Trends: Spot trends and patterns in your data quickly.
Tips for Correlation Analysis
- Data Cleaning: Make sure that your data is accurate and error-free before performing the correlation analysis. Incorrect or missing data can affect the output.
- Sample Size: Correlation analysis is more reliable with larger sample sizes. Smaller sample sizes may lead to less accurate results.
- Causation vs. Correlation: Correlation does not imply causation. Even with a strong correlation, it is essential to explore other factors and conduct further research before establishing causation.
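The data cleaning tip can be illustrated with a small Python sketch that keeps only the rows where both values are present and numeric before any correlation is computed. The helper names `to_number` and `clean_pairs` and the raw values are hypothetical, made up for this illustration; how you treat missing entries in a real dataset is a judgment call.

```python
def to_number(value):
    """Return value as a float, or None if it is missing or not numeric."""
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean_pairs(xs, ys):
    """Keep only the observations where both values could be read as numbers."""
    pairs = []
    for x, y in zip(xs, ys):
        fx, fy = to_number(x), to_number(y)
        if fx is not None and fy is not None:
            pairs.append((fx, fy))
    return pairs


# Hypothetical raw columns containing text numbers, blanks, and placeholders
raw_heights = [155, "178", None, 162, "n/a"]
raw_weights = [66, 82, 62, "", 71]

pairs = clean_pairs(raw_heights, raw_weights)
print(pairs)  # prints: [(155.0, 66.0), (178.0, 82.0)]
```

Dropping a row whenever either value is unusable keeps the two variables aligned, which matters because correlation is computed on matched pairs of observations.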
Conclusion
Calculating correlation in Excel is an essential skill for anyone involved in data analysis. Whether you use the CORREL function or Excel's Data Analysis Tool, these methods allow you to quickly assess relationships between variables. Start using these techniques today to gain deeper insights from your data!