# Cross Tabulation and Chi Square Analysis Explained

## Learn how to use cross tabulation and chi-square to examine relationships in your data that are not evident.


### What is Cross Tabulation?

Ever looked at the nutrition chart on the back of a snack pack? This little table gives you a comprehensive breakdown of how a particular snack will contribute to your overall energy levels, and this breakdown helps you make informed decisions about your diet and calorie consumption.

Cross tabulation is a mainstream statistical technique that works along similar lines: it helps you make informed decisions about your research by identifying patterns, trends and correlations between parameters within your study. When conducting a study, the raw data can be daunting and will usually point to several possible, seemingly chaotic outcomes. In such situations a cross-tab helps you narrow down to a well-supported conclusion by drawing out trends, comparisons and correlations between the interrelated factors within your study.

For example, consider your college application - you probably did not realize it at the time, but you were mentally cross tabulating the factors involved to arrive at a conscious decision about which colleges you wanted to attend and where you had the best shot when applying. Let us go through your decision-making process one factor at a time.

First, you needed to look at the academic factor: your grades throughout high school, your SAT scores, the field you wanted to major in and the application essay you would need to write. Second comes the financial factor, which looks at tuition fees and the possibility of a scholarship. Last, but definitely not least, is the emotional factor, which considers your distance from home and how far away the universities your friends are considering would be, so that reunions would not be an issue. In other words, cross tabulating Academics + Finance + Emotions led you to a refined list of universities, one of which is or soon will be your alma mater.

Cross tabulation, also known as a cross-tab or contingency table, is a statistical tool used for categorical data. Categorical data involves values that are mutually exclusive. Data is always collected as numbers, but numbers have no value unless they mean something: 4, 7 and 9 are simply numbers until it is specified what they count, for example 4 apples, 7 bananas and 9 kiwis.

Cross tabulation is typically used to examine relationships within the data that are not readily evident. It is quite useful in market research studies and in surveys. A cross-tab report shows the connection between two or more questions asked in the survey.
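As a minimal sketch of the idea, the snippet below builds a contingency table of counts from raw answers to two survey questions. The responses and the helper name `cross_tabulate` are hypothetical and exist only for illustration; in practice a library function such as pandas `crosstab` does the same job.

```python
from collections import Counter

# Hypothetical survey responses: (age group, gadget the respondent plans to buy).
responses = [
    ("20-25", "Laptop"), ("20-25", "Phone"), ("20-25", "Laptop"),
    ("25-30", "Tablet"), ("25-30", "Phone"),
    ("30-35", "Digital Camera"), ("30-35", "Laptop"),
]

def cross_tabulate(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table

rows, cols, table = cross_tabulate(responses)
print("Age", cols)
for r in rows:
    print(r, [table[r][c] for c in cols])
```

Each cell holds the number of respondents who gave that combination of answers, which is the raw material for the percentages and the chi-square test discussed later.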


### Understanding Cross Tabulation with Example

Cross-tab is a popular choice for statistical data analysis. Since it is a reporting and analysis tool, it can be used with any level of data, ordinal or nominal, because it treats all data as nominal data (nominal data is not measured; it is categorized).
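Treating data as nominal means that a measured value, such as an age in years, is first mapped to a category label. The sketch below shows one way this could be done; the function name `age_group`, the five-year bands and the choice to include the lower bound of each band are all assumptions made for illustration.

```python
def age_group(age):
    """Map an age in years to a category label.

    Assumption: each band includes its lower bound and excludes its upper
    bound, so an age of 25 falls in "25-30".
    """
    if age < 20:
        return "under 20"
    if age >= 40:
        return "40 and above"
    lower = 20 + 5 * ((age - 20) // 5)
    return f"{lower}-{lower + 5}"

ages = [22, 25, 31, 39, 47]
print([age_group(a) for a in ages])
```

Once every response carries a label like this, the cross-tab no longer cares that the underlying value was numeric.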

Let’s say you want to analyze the relationship between two categorical variables, such as age and the purchase of electronic gadgets.

There are two questions asked here:

1. What is your age?
2. What is the electronic gadget that you are likely to buy in the next 6 months?

| Age | Laptop | Phone | Tablet | Digital Camera |
|---|---|---|---|---|
| 20-25 | 38% | 29% | 31% | 12% |
| 25-30 | 19% | 15% | 24% | 17% |
| 30-35 | 23% | 19% | 11% | 27% |
| 35-40 | 19% | 12% | 9% | 30% |
| above 40 | 12% | 17% | 5% | 31% |

In this example you can see a distinct connection between age and the electronic gadget a respondent is likely to purchase. It is not surprising, but certainly interesting, to see the relationship between the two variables through the data collected.
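Percentages like those in the table are derived from raw counts. The sketch below converts each row of a count table into row percentages; the counts are hypothetical and are not the data behind the table above, and `row_percentages` is an illustrative helper name.

```python
def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {
            col: (100.0 * n / total if total else 0.0)
            for col, n in cells.items()
        }
    return result

# Hypothetical counts of respondents per age group and gadget.
counts = {
    "20-25": {"Laptop": 40, "Phone": 30, "Tablet": 20, "Digital Camera": 10},
    "above 40": {"Laptop": 10, "Phone": 20, "Tablet": 5, "Digital Camera": 15},
}

pct = row_percentages(counts)
for row, cells in pct.items():
    print(row, {col: f"{value:.0f}%" for col, value in cells.items()})
```

Row percentages make age groups of different sizes comparable, which is what lets a reader spot that one group leans toward laptops and another toward cameras.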

In survey research, a cross-tab allows you to take a deep dive into the data and analyze it, making it simpler to spot trends and opportunities without getting overwhelmed by all the data gathered from the responses.

### Cross Tabulation and Chi Square

The chi-square test, or Pearson's chi-square test, is a statistical hypothesis test used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.
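For reference, the standard form of Pearson's statistic for a cross-tab with $r$ rows and $c$ columns can be written as follows. The symbols are introduced here for illustration: $O_{ij}$ is the observed count in row $i$ and column $j$, $E_{ij}$ is the count expected if the two variables were independent, $R_i$ and $C_j$ are the row and column totals, and $N$ is the total number of responses.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\text{degrees of freedom} = (r - 1)(c - 1)
```

The larger the gap between observed and expected counts, the larger the statistic and the smaller the resulting p-value.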

An important consideration when cross tabulating the findings of your study is verifying whether the pattern shown in the cross-tab is real or simply the result of chance. This is similar to the doubt some of us feel after joining a university, questioning whether it was indeed a good fit. To resolve this dilemma, the cross-tab is computed along with a chi-square analysis, which helps identify whether the factors involved in the study are independent of or related to each other. If the test finds no evidence against independence, the tabulation is not statistically significant: the null hypothesis stands, and any pattern in the table should not be relied on. If, on the contrary, the test indicates a relationship between the two factors, the tabulation results are significant and can be used to support strategic decisions.

Another significant term that we will introduce here is the “null hypothesis”. The null hypothesis assumes that any difference or pattern seen in a set of data is due to chance. Its opposite is called the “alternative hypothesis”, which states that the difference reflects a real relationship.

Applying chi square to surveys is usually done with these question types:

For example, suppose we need to find out whether there is any association between the buying behavior for electronic devices and the region where they are sold. The data would then be entered like the one in the table below:

As mentioned earlier the Chi square test helps you determine if two discrete variables are associated. If there's an association, the distribution of one variable will differ depending on the value of the second variable. But if the two variables are independent, the distribution of the first variable will be similar for all values of the second variable.
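The comparison described above can be sketched in a few lines: compute the counts expected under independence from the row and column totals, then sum the squared, scaled differences from the observed counts. The observed counts and the function name `chi_square_statistic` are hypothetical; SciPy's `chi2_contingency` offers a tested implementation for real analyses.

```python
def chi_square_statistic(observed):
    """Return (statistic, degrees of freedom, expected counts) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, o in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            e = row_totals[i] * col_totals[j] / n
            expected_row.append(e)
            statistic += (o - e) ** 2 / e
        expected.append(expected_row)

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof, expected

# Hypothetical counts: rows are two regions, columns are two devices.
observed = [[30, 20],
            [20, 30]]
statistic, dof, expected = chi_square_statistic(observed)
print(statistic, dof, expected)

# A table whose rows are exactly proportional shows no association at all.
proportional_statistic, _, _ = chi_square_statistic([[10, 20], [20, 40]])
print(proportional_statistic)
```

In the second table the distribution across columns is identical in both rows, so observed and expected counts coincide and the statistic is zero.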

Using cross tabulation and chi square we derive the following inference:

Applying the Chi square calculation to the above values:

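Pearson's chi-square = 0.803, p-value = 0.05 (as reported)

So what does this mean?

We need to pay attention to the p-value. Compare the p-value to your alpha level, which is commonly 0.05.

• If the p-value is less than or equal to the alpha level, you reject the null hypothesis and conclude that the two variables are associated.
• If the p-value is greater than the alpha level, you cannot reject the null hypothesis: the data show no significant evidence of an association.

In this example the Pearson chi-square statistic is 0.803. A statistic this small corresponds to a p-value well above 0.05 (roughly 0.37 or more, whatever the degrees of freedom), so the p-value of 0.05 quoted above looks like the alpha level rather than the actual result. At an alpha level of 0.05 we therefore conclude that the association between the two variables is not statistically significant.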
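The decision rule can be sketched as follows. The p-value is the upper-tail probability of the chi-square distribution, computed here with the standard library only. The degrees of freedom are not stated in the text, so the values tried below (1 and 2) are assumptions, and the helper names are illustrative.

```python
import math

def chi_square_p_value(statistic, dof):
    """Upper-tail probability of a chi-square distribution with integer dof."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    t = statistic / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-t)               # Q(1, t) = exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))    # Q(1/2, t) = erfc(sqrt(t))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1)
    while a < dof / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

def is_significant(p_value, alpha=0.05):
    """Apply the rule from the text: significant when p-value <= alpha."""
    return p_value <= alpha

for dof in (1, 2):  # assumed degrees of freedom
    p = chi_square_p_value(0.803, dof)
    print(dof, round(p, 3), is_significant(p))
```

With either assumed value the p-value is far above 0.05, so the test does not flag a significant association for a statistic of 0.803.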

### Advantages of Cross Tabulation in research

1. One major advantage of using cross tabulation in a survey is that it is simple to compute and easy to understand. Even a researcher without in-depth knowledge of the concept can interpret the results.
2. It reduces confusion. Raw data can be difficult to understand and interpret, and even a small data set can be confusing if it is not arranged in an orderly manner. Cross tabulation offers a simple way of relating the variables, which minimizes confusion about how the data is represented.
3. One can derive numerous insights from cross tabulation. As the examples in the sections above show, raw data is not easy to interpret. A cross-tab clearly maps out the relationships between variables, so insights that might otherwise have been overlooked become clear, even when the underlying statistics are complicated.
4. It provides qualified or relative data on two or more variables across multiple features with ease.
5. The most important advantage of using cross tabulation for survey analysis is the ease of using any type of data, whether it is nominal, ordinal, interval or ratio, since all values are treated as categories.