Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is a process researchers use to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large chunk of data into smaller fragments that make sense.
Three essential things take place during the data analysis process. The first is data organization. The second is data reduction, achieved through summarization and categorization, which help in finding patterns and themes in the data for easy identification and linking. The third is the analysis itself, which researchers perform in either a top-down or bottom-up fashion.
Marshall and Rossman, on the other hand, describe data analysis as a messy, ambiguous, and time-consuming, but creative and fascinating, process through which a mass of collected data is brought to order, structure, and meaning.
We can say that "data analysis and interpretation is a process representing the application of deductive and inductive logic to research and data analysis."
Researchers rely heavily on data, as they have a story to tell or problems to solve. Analysis starts with a question, and data is nothing but the answer to that question. But what if there is no question to ask? It is still possible to explore data without a problem in mind; we call this 'data mining,' and it often reveals interesting patterns within the data that are worth exploring.
Whatever the type of data researchers explore, their mission and their audience's vision guide them in finding patterns and shaping the story they want to tell. One of the essential things expected of researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember, data analysis sometimes tells the most unforeseen yet exciting stories, ones not at all expected when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory data analysis in research.
Every kind of data describes things by assigning specific values to them. For analysis, you need to organize these values and process and present them in a given context to make them useful. Data can take different forms; here are the primary data types.
- Qualitative data: When the data presented consists of words and descriptions, we call it qualitative data. Although you can observe this data, it is subjective and therefore harder to analyze in research, especially for comparison. Example: anything describing taste, experience, texture, or an opinion is considered qualitative data. This type of data is usually collected through focus groups, personal interviews, or open-ended survey questions.
- Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: age, rank, cost, length, weight, scores, and so on all come under this type of data. You can present such data in graphs or charts, or you can apply statistical analysis methods to it. Outcomes Measurement Systems (OMS) questionnaires in surveys are a significant source of numeric data.
- Categorical data: This is data presented in groups. An item included in categorical data cannot belong to more than one group at a time. Example: a survey respondent reporting their living style, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data.
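To make the chi-square idea concrete, here is a minimal sketch that computes the chi-square statistic by hand for a 2x2 contingency table; the counts below are invented for illustration, not taken from any real survey.

```python
# Chi-square test of independence on a 2x2 contingency table,
# computed by hand. All counts are invented illustration data.
observed = {
    ("smoker", "married"): 30, ("smoker", "single"): 20,
    ("non-smoker", "married"): 50, ("non-smoker", "single"): 100,
}

rows = sorted({r for r, _ in observed})
cols = sorted({c for _, c in observed})
total = sum(observed.values())
row_totals = {r: sum(v for (r2, _), v in observed.items() if r2 == r) for r in rows}
col_totals = {c: sum(v for (_, c2), v in observed.items() if c2 == c) for c in cols}

# Statistic: sum of (observed - expected)^2 / expected, where
# expected = row_total * col_total / grand_total under independence.
chi_square = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / total
        chi_square += (observed[(r, c)] - expected) ** 2 / expected

print(round(chi_square, 2))  # prints 11.11
```

A large statistic relative to the chi-square distribution's critical value suggests the two groupings are not independent.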
Data analysis in qualitative research works a little differently than with numerical data, since qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Getting insight from such complex information is an involved process, which is why qualitative data is typically used for exploratory research and data analysis.
Although there are several ways to find patterns in textual information, the word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is largely manual: researchers read the available data and look for repetitive or commonly used words.
For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that "food" and "hunger" are the most commonly used words and will highlight them for further analysis.
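The word-counting step above can be sketched in a few lines of Python; the survey responses here are invented for illustration.

```python
# Word-based technique: count how often each word appears in
# open-ended responses. Responses are invented illustration data.
from collections import Counter
import re

responses = [
    "Food prices keep rising and hunger is everywhere",
    "Hunger is the biggest problem, we need food aid",
    "Clean water and food shortages worry me most",
]

# Tokenize, lowercase, and count; drop very short stop-like words.
words = []
for text in responses:
    words += [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3]

# The most common words hint at candidate themes for further analysis.
print(Counter(words).most_common(2))  # prints [('food', 3), ('hunger', 2)]
```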
Keyword context is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which participants use a particular keyword.
For example, researchers conducting research and data analysis for studying the concept of ‘diabetes’ amongst respondents might analyze the context of when and how the respondent has used or referred to the word ‘diabetes.’
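A keyword-in-context pass can be sketched as follows; the transcript text, the `kwic` helper, and the two-word window are all illustrative assumptions, not part of any standard tool.

```python
# Keyword-in-context (KWIC) sketch: for each occurrence of a keyword,
# capture a few words of surrounding context. Invented transcript data.
transcript = (
    "My mother has diabetes so we changed her diet. "
    "I worry that diabetes runs in the family."
)

def kwic(text, keyword, window=2):
    """Return (left context, keyword, right context) for each hit."""
    tokens = text.replace(".", "").split()
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

for left, kw, right in kwic(transcript, "diabetes"):
    print(f"... {left} [{kw}] {right} ...")
```

Reading the keyword with its neighbors is what lets the researcher judge whether "diabetes" appears as a personal diagnosis, a family worry, or something else.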
The scrutiny-based technique is another highly recommended text analysis method used to identify patterns in qualitative data. Compare-and-contrast is the most widely used method under this technique, differentiating how one piece of text is similar to or different from another.
For example, to assess the "importance of a resident doctor in a company," the collected data is divided into responses from people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare-and-contrast is the best method for analyzing polls with single-answer question types.
Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.
Variable partitioning is another technique, used to split variables so that researchers can find more coherent descriptions and explanations in enormous datasets.
There are several techniques to analyze data in qualitative research, but here are some commonly used methods:
Content Analysis: It is widely accepted and the most frequently employed technique for data analysis in research methodology. It can be used to analyze the documented information from text, images, and sometimes from the physical items also. It depends on the research questions to predict when and where to use this method.
Narrative Analysis: This is a method used to analyze content gathered from various sources, such as personal interviews, field observation, and surveys. Most of the time, the stories and opinions people share are analyzed to find answers to the research questions.
Discourse Analysis: Similar to narrative analysis, discourse analysis is used to analyze the interactions with people. Nevertheless, this particular method takes into consideration the social context under which or within which the communication between the researcher and respondent takes place. In addition to that, discourse analysis also focuses on the lifestyle and day-to-day environment while deriving any conclusion.
Grounded Theory: When you want to explain why a particular phenomenon happened, then using grounded theory for analyzing quality data is the best resort. Grounded theory is applied to study data about the host of similar cases occurring in different settings. When researchers are using this method, they might alter explanations or produce new ones until they arrive at some conclusion.
The first stage in research and data analysis is to prepare the data for analysis, so that raw, nominal data can be converted into something meaningful. Data preparation consists of three phases.
Phase I: Data Validation
Data validation is done to understand whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:
- Fraud: To ensure an actual human being records each response to the survey or the questionnaire
- Screening: To ensure each participant or respondent is selected or chosen in compliance with the research criteria
- Procedure: To ensure ethical standards were maintained while collecting the data sample
- Completeness: To ensure that the respondent answered all the questions in an online survey, or that the interviewer asked every question devised in the questionnaire
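As a small sketch of the completeness check above, the snippet below flags survey responses with unanswered required questions; the field names and records are invented for illustration.

```python
# Completeness check: flag responses that left required questions
# unanswered. Field names and records are invented illustration data.
required = ["age", "gender", "q1", "q2"]

responses = [
    {"age": 30, "gender": "female", "q1": "yes", "q2": "no"},
    {"age": 25, "gender": "male", "q1": "yes"},               # q2 missing
    {"age": 41, "gender": "female", "q1": None, "q2": "no"},  # q1 blank
]

# A response is incomplete if any required field is absent or blank.
incomplete = [
    i for i, r in enumerate(responses)
    if any(r.get(q) is None for q in required)
]
print(incomplete)  # prints [1, 2]
```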
Phase II: Data Editing
More often than not, an extensive research data sample comes loaded with errors. Respondents sometimes fill in fields incorrectly or skip them accidentally. Data editing is the process wherein researchers confirm that the provided data is free of such errors. For that, they need to conduct necessary checks, including outlier checks, to edit the raw data and make it ready for analysis.
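One common way to run an outlier check during data editing is the interquartile-range rule; the scores below and the 1.5 x IQR cutoff are illustrative conventions, not requirements from the text.

```python
# Outlier check via the interquartile-range (IQR) rule.
# Scores are invented illustration data; 150 is a likely entry error.
from statistics import quantiles

scores = [52, 48, 50, 49, 51, 47, 53, 50, 150]

# Values beyond 1.5 * IQR from the quartiles are flagged for review.
q1, _, q3 = quantiles(scores, n=4)
iqr = q3 - q1
outliers = [x for x in scores if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
print(outliers)  # prints [150]
```

Flagged values are then reviewed and corrected or excluded, rather than silently deleted.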
Phase III: Data Coding
Of the three phases, this is the most critical phase of data preparation, as it involves grouping and assigning values to the survey responses. Suppose a survey is completed with a sample size of 1,000; the researcher might create age brackets to distinguish respondents by age. It then becomes easier to analyze small data buckets rather than deal with the massive data pile.
After the data is prepared for analysis, researchers are free to use different research and data analysis methods to derive meaningful insights. Statistical techniques are certainly the most favored for analyzing numerical data. These methods fall into two groups: 'descriptive statistics,' used to describe the data, and 'inferential statistics,' which help in comparing the data.
This method is used to describe the basic features of many types of research data. It presents the data in such a meaningful way that patterns in the data start to make sense. Nevertheless, descriptive analysis does not go beyond the data at hand to draw conclusions; any conclusions are based on the hypotheses researchers have formulated so far. Here are a few major types of descriptive analysis methods:
Measures of Frequency
- Count, Percent, Frequency
- It is used to denote how often a particular event occurs
- Researchers use it when they want to showcase how often a response is given
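Count and percent for a single survey question can be sketched in a few lines; the answers below are invented for illustration.

```python
# Measures of frequency: count each response and its percent of the
# total. Survey answers are invented illustration data.
from collections import Counter

answers = ["yes", "no", "yes", "yes", "no", "maybe", "yes", "no"]

counts = Counter(answers)
total = len(answers)
percents = {ans: 100 * c / total for ans, c in counts.items()}

print(counts["yes"], percents["yes"])  # prints 4 50.0
```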
Measures of Central Tendency
- Mean, Median, Mode
- The method is widely used to demonstrate the central point of a distribution
- Researchers use this method when they want to showcase the most commonly or averagely indicated response
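Python's standard library covers all three central-tendency measures directly; the response scores below are invented for illustration.

```python
# Measures of central tendency on a set of survey response scores.
# Scores are invented illustration data.
from statistics import mean, median, mode

responses = [3, 4, 4, 5, 2, 4, 3, 5, 4, 1]

print(mean(responses))    # arithmetic average of the scores
print(median(responses))  # middle value of the sorted scores
print(mode(responses))    # most frequently given response
```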
Measures of Dispersion or Variation
- Range, Variance, Standard deviation
- The range equals the difference between the highest and lowest scores
- Variance and standard deviation measure how far observed scores deviate from the mean
- It is used to identify the spread of scores by stating intervals
- Researchers use this method to show how spread out the data is, and how much that spread affects the mean.
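The three dispersion measures can be computed with the standard library; the scores are invented for illustration.

```python
# Measures of dispersion: range, variance, and standard deviation.
# Scores are invented illustration data.
from statistics import pvariance, pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # high point minus low point
variance = pvariance(scores)             # mean squared deviation from the mean
std_dev = pstdev(scores)                 # square root of the variance

print(score_range, variance, std_dev)    # prints 7 4.0 2.0
```

Population formulas (`pvariance`, `pstdev`) are used here; `variance`/`stdev` would give the sample versions.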
Measures of Position
- Percentile ranks, Quartile ranks
- It relies on standardized scores, helping researchers identify the relationship between different scores.
- It is often used when researchers want to compare scores with the average count.
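A percentile rank can be sketched as the share of scores at or below a given score; this "at or below" definition is one common convention, and the scores are invented for illustration.

```python
# Measures of position: percentile rank of a score within a set.
# Scores are invented illustration data.
scores = [55, 62, 70, 70, 75, 80, 85, 90, 95, 98]

def percentile_rank(scores, value):
    """Percent of scores less than or equal to `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

print(percentile_rank(scores, 85))  # prints 70.0
```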
In quantitative market research, descriptive analysis often gives absolute numbers, but it is never sufficient to demonstrate the rationale behind those numbers. Nevertheless, it is necessary to think about the research and data analysis method best suited to your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate students' average scores in schools. It is best to rely on descriptive statistics when the researchers intend to keep the research or outcome limited to the provided sample without generalizing it to the population: for example, when you want to compare average voting in two different cities, descriptive statistics are enough.
Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.
Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample collected from that population. For example, at a movie theater, you can ask some 100 audience members whether they like the movie they are watching. Researchers then use inferential statistics on the collected sample to reason that about 80-90% of people like the movie.
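Continuing the movie-theater example, a minimal inferential sketch estimates the population proportion with a 95% normal-approximation confidence interval; the sample numbers are illustrative assumptions.

```python
# Inferential statistics sketch: estimate a population proportion from
# a sample with a 95% confidence interval (normal approximation).
# Sample numbers are invented illustration data.
import math

n = 100      # audience members asked
liked = 85   # said they liked the movie

p_hat = liked / n
# Standard error of a sample proportion, and the usual 1.96 multiplier
# for a 95% interval under the normal approximation.
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"Estimated share who liked it: {p_hat:.0%} "
      f"(95% CI {low:.0%} to {high:.0%})")
```

The interval (roughly 78% to 92% here) is what justifies the "about 80-90%" claim about the wider audience.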
Here are two significant areas of inferential statistics
- Estimating parameters: this takes statistics from the sample research data and uses them to infer something about the population parameter.
- Hypothesis test: it's about using sampled research data to answer survey research questions. For example, researchers might want to understand whether a newly launched shade of lipstick is good or not, or whether multivitamin capsules help children perform better at games.
These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. It is often used when researchers want something beyond absolute numbers to understand the relationship between variables.
Here are some of the commonly used methods for data analysis in research
- Correlation: When researchers are not conducting experimental research but are interested in understanding the relationship between two or more variables, they opt for correlational research methods.
- Cross-tabulation: Also called contingency tables, cross-tabulation is a method used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns; a two-dimensional cross-tabulation then supports seamless data analysis and research by showing the number of males and the number of females in each age category.
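A two-dimensional cross-tabulation like the one described can be built with a plain counter; the respondent records are invented for illustration.

```python
# Cross-tabulation: count respondents by (age group, gender) cell.
# Respondent records are invented illustration data.
from collections import Counter

respondents = [
    {"gender": "male", "age_group": "18-34"},
    {"gender": "female", "age_group": "18-34"},
    {"gender": "female", "age_group": "35-54"},
    {"gender": "male", "age_group": "35-54"},
    {"gender": "female", "age_group": "18-34"},
]

# Each (age_group, gender) pair is one cell of the contingency table.
table = Counter((r["age_group"], r["gender"]) for r in respondents)
print(table[("18-34", "female")])  # prints 2
```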
- Regression analysis: To understand the strength of the relationship between two variables, researchers rely on the primary and most commonly used regression analysis method, which is also a type of predictive analysis. In this method, you have an essential factor called the dependent variable along with multiple independent variables; in regression analysis, you work to find out the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to be ascertained in an error-free, random manner.
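For a single independent variable, the ordinary-least-squares fit has a closed form; the (x, y) pairs below are invented so the fit comes out exact.

```python
# Ordinary least squares for one independent variable: fit y = a + b*x.
# The (x, y) pairs are invented illustration data, perfectly linear.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: covariance of x and y divided by the variance of x.
num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
den = sum((x - mean_x) ** 2 for x in xs)
b = num / den
a = mean_y - b * mean_x  # intercept

print(a, b)  # prints 0.0 2.0
```

With more independent variables the same idea generalizes to multiple regression, usually via a statistics library rather than hand-rolled formulas.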
- Frequency tables: This procedure tabulates how often each value or response occurs, giving researchers a quick view of the distribution of a variable across the sample.
- Analysis of variance: The statistical procedure is used for testing the degree to which two or more vary or differ in an experiment. A considerable degree of variation means research findings were significant. In many contexts, ANOVA testing and variance analysis are similar.
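A one-way ANOVA F statistic can be computed by hand for small groups; the group scores below are invented for illustration.

```python
# One-way ANOVA sketch: compute the F statistic for three groups.
# Group scores are invented illustration data.
groups = [
    [4, 5, 6],   # group A
    [7, 8, 9],   # group B
    [1, 2, 3],   # group C
]

k = len(groups)                  # number of groups
n = sum(len(g) for g in groups)  # total observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-group sum of squares: how far group means sit from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of scores around their own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))  # prints 27.0
```

A large F (between-group variation dwarfing within-group variation, as here) is what makes the group differences look significant.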
- Researchers must have the necessary skills to analyze the data, and getting trained helps them demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
- Research and data analytics methods usually differ by scientific discipline; therefore, obtaining statistical advice at the beginning of the analysis helps in designing the survey questionnaire, selecting data collection methods, and selecting samples.
- The primary aim of research data analysis is to derive unbiased insights. Any mistake in, or bias toward, collecting data, selecting an analysis method, or choosing an audience sample is likely to result in a biased inference.
- No level of sophistication in research data analysis can rectify poorly defined objectives or outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity might mislead readers, so avoid the practice.
- The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find ways to deal with everyday challenges like outliers, missing data, data alteration, data mining, and developing graphical representations.
The sheer amount of data generated daily is staggering, especially now that data analysis has taken center stage. In 2018, the total data supply amounted to 2.8 trillion gigabytes. Hence, it is clear that enterprises willing to survive in this hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs.
QuestionPro is an online survey platform that empowers organizations not only in data analysis and research but also by providing them a medium to collect data by creating appealing surveys.