Definition of research in data analysis: According to LeCompte and Schensul, research data analysis is the process researchers use to reduce data to a story and interpret it to derive insights. It is a process in which a large mass of data is broken into smaller fragments that make sense. Three major activities take place during data analysis: data organization, data reduction through summarization and categorization, and the identification and linking of patterns and themes so they can be easily recognized. Data analysis can be done in either a top-down or a bottom-up fashion.
Marshall and Rossman, on the other hand, describe data analysis as a messy, ambiguous, and time-consuming, yet creative and fascinating process through which a mass of collected data is brought to order, structure, and meaning.
We can simply say that "data analysis and interpretation is the process of applying deductive and inductive logic to the research data".
Researchers rely heavily on data, as they have a story to tell or problems to solve. It starts with a question, and data is nothing but an answer to that question. But what if there is no question to ask? It is possible to explore data even without a question; the method is called 'data mining', and it often reveals interesting patterns within the data that are worth exploring.
Whatever type of data researchers explore, their mission and their audience's vision guide them in finding the patterns that shape the story they want to tell. One of the most important things expected of researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Sometimes data analysis tells the most unexpected yet interesting stories, ones not anticipated at all when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.
Every type of data can describe things once a certain value is assigned to it. For analysis, these values need to be organized, processed, and presented in a given context to be useful. Data can come in different forms; here are the major data types:
- Qualitative data: Data presented in words and descriptions is called qualitative data. Although you can observe this data, it is subjective and therefore harder to analyze, especially for comparison. Example: anything that describes taste, experience, texture, or an opinion is qualitative data. This type of data is usually collected through focus groups, personal interviews, or open-ended questions in surveys.
- Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data. This type of data can be distinguished into categories, grouped, measured, calculated, or ranked. Example: questions about age, rank, cost, length, weight, scores, and so on all produce this type of data. You can present such data in graphical formats or charts, or apply statistical analysis methods to it. The Outcomes Measurement Systems (OMS) questionnaires in surveys are a major source of numeric data.
- Categorical data: Data presented in groups is called categorical data. An item included in categorical data cannot belong to more than one group at a time. Example: a person responding to a survey about their living style, marital status, smoking habit, or drinking habit provides categorical data. The chi-square test is a common method used to analyze this data.
- Continuous data: Numerical data measured on a continuous range or scale is called continuous data; it can take any possible value without gaps. Example: data on a person's height, weight, or temperature is continuous. Various analysis techniques are used with continuous data, including effect size calculations.
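As a minimal sketch of the chi-square test mentioned above for categorical data, the statistic can be computed by hand in Python; the counts below are hypothetical, not from any real survey:

```python
# Chi-square statistic for a hypothetical 2x2 contingency table:
# rows are smoking status, columns are drinking habit
observed = [[30, 20],   # smokers:     drinkers, non-drinkers
            [10, 40]]   # non-smokers: drinkers, non-drinkers

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(round(chi_square, 2))  # 16.67
```

The resulting statistic is then compared against a chi-square distribution with the appropriate degrees of freedom to judge whether the two categorical variables are independent.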
Data analysis in qualitative research works a little differently from the analysis of numerical data, as qualitative data is made up of words, descriptions, images, objects, and sometimes symbols. Gaining insight from such complex data is a complicated process; hence qualitative data is typically used for exploratory research and data analysis.
Although there are several ways to find patterns in textual data, a word-based method is the most relied-upon and widely used technique for research and data analysis. The process is manual: researchers read the available data and find repetitive or commonly used words. For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that "food" and "hunger" are the most commonly used words and will highlight them for further analysis.
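The word-based approach above can be partly automated with a simple word count; here is a minimal Python sketch using made-up survey responses and an assumed stop-word list:

```python
from collections import Counter
import re

# Hypothetical open-ended survey responses (not real study data)
responses = [
    "We need food security and clean water",
    "Hunger is the biggest issue, food prices keep rising",
    "Jobs, food, and hunger relief programs",
]

# Small illustrative stop-word list to skip filler words
stopwords = {"the", "and", "is", "we", "a", "to", "of"}

words = Counter()
for text in responses:
    words.update(w for w in re.findall(r"[a-z]+", text.lower())
                 if w not in stopwords)

# The most repeated content words hint at themes worth deeper analysis
print(words.most_common(2))  # [('food', 3), ('hunger', 2)]
```

In a real study the researcher would still read the flagged passages; the count only points to where patterns might be.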
Keyword context is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which a particular keyword is used. For example, researchers conducting research and data analysis on the concept of 'diabetes' among respondents might analyze when and how a respondent used or referred to the word 'diabetes'.
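A rough sketch of such keyword-in-context extraction in Python, assuming a hypothetical transcript and an arbitrary window of three words on each side of the keyword:

```python
import re

# Hypothetical interview transcript (illustrative only)
transcript = ("I was diagnosed with diabetes last year. Managing diabetes "
              "means changing my diet completely.")

keyword = "diabetes"
# Capture up to three words on either side of each keyword occurrence
pattern = re.compile(r"((?:\S+\s+){0,3})" + keyword + r"((?:\s+\S+){0,3})")
contexts = [(m.group(1).strip(), m.group(2).strip())
            for m in pattern.finditer(transcript)]
print(contexts)
```

Each tuple pairs the words before and after one occurrence, letting the researcher see how the keyword is actually being used.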
The scrutiny-based technique is another highly recommended text analysis method used to identify patterns in qualitative data. Compare and contrast is the most widely used method under this technique, differentiating how one piece of text is similar to or different from another. For example, to study the "importance of a resident doctor in a company", the collected data is divided into people who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast is the best method for analyzing polls with single-answer question types.
Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.
Variable partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations in large datasets.
Qualitative data can be analyzed using several techniques, but here are some commonly used methods.
Content analysis is the most widely accepted and commonly used technique for data analysis in research. It can be used to analyze documented information from text, images, and sometimes physical items. When and where to use this method depends on the research questions.
Narrative analysis is a method used to analyze content gathered from various sources, such as personal interviews, field observations, and surveys. Most of the time, the stories or opinions shared by people are examined to find answers to the research questions.
Similar to narrative analysis, discourse analysis is used to analyze interactions with people. However, this method takes into consideration the social context within which the communication between researcher and respondent takes place. Discourse analysis also considers the respondent's lifestyle and day-to-day environment while deriving any conclusion.
When you want to explain why a certain phenomenon happened, grounded theory is the best resort for analyzing qualitative data. Grounded theory is applied to study data about a host of similar cases occurring in different settings. While using this method, researchers might alter their explanations or produce new ones until they arrive at a conclusion.
The first stage in research and data analysis is to prepare the data for analysis so that nominal data can be converted into something meaningful. Data preparation consists of three phases.
Phase I: Data Validation
Data validation is done to understand whether the collected data sample is in accordance with pre-set standards or is a biased data sample. It is divided into four stages:
- Fraud: To ensure each response to the survey or the questionnaire is recorded by an actual human being
- Screening: To ensure each participant or respondent is selected or chosen in compliance with the research criteria
- Procedure: To ensure ethical standards were maintained while collecting the data sample
- Completeness: To ensure the respondent either answered all the questions in the online survey or the interviewer has asked all the questions devised in the questionnaire.
Phase II: Data Editing
More often than not, a large research data sample comes loaded with errors. Respondents sometimes fill in fields incorrectly or skip them accidentally. Data editing is a process wherein researchers confirm that the provided data is free of such errors; to do so, they conduct basic checks and outlier checks on the raw data to make it ready for analysis.
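One common form of outlier check flags values far from the mean for manual review; a minimal Python sketch, assuming a hypothetical self-reported age field and an arbitrary two-standard-deviation cutoff:

```python
import statistics

# Hypothetical survey field: self-reported ages, with one likely entry error
ages = [24, 31, 28, 45, 39, 230, 33, 27]

mean = statistics.mean(ages)
stdev = statistics.stdev(ages)

# Flag responses more than 2 standard deviations from the mean for review
flagged = [a for a in ages if abs(a - mean) > 2 * stdev]
print(flagged)  # [230]
```

Flagged values are not deleted automatically; the editor decides whether each one is a genuine response or a recording error.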
Phase III: Data Coding
Out of all three, this is the most important phase of data preparation; it involves grouping and assigning values to the survey responses. Suppose a survey is completed with a sample size of 1,000; the researcher will create age brackets to distinguish respondents based on their age. It then becomes easier to analyze small data buckets rather than deal with one large data pile.
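The age-bracket coding described above might look like this in Python; the bracket boundaries here are illustrative assumptions, not a standard:

```python
# Hypothetical coding scheme: map raw ages to labelled brackets
def code_age(age):
    if age < 18:
        return "under 18"
    elif age < 35:
        return "18-34"
    elif age < 55:
        return "35-54"
    return "55+"

# A few sample responses run through the scheme
responses = [22, 41, 67, 30, 15]
coded = [code_age(a) for a in responses]
print(coded)  # ['18-34', '35-54', '55+', '18-34', 'under 18']
```

Once every response carries a code, analysis proceeds on the small set of brackets instead of the raw values.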
After the data is prepared for analysis, researchers are free to use different research and data analysis methods to derive meaningful insights. Statistical methods are most favored for analyzing numerical data, and they fall into two groups: the first, used to describe data, is called 'descriptive statistics', and the second, used to compare data, is called 'inferential statistics'.
This method is used to describe the basic features of data in such a meaningful way that patterns in the data start making sense. However, descriptive analysis does not reach conclusions beyond the data analyzed or beyond any hypotheses researchers have made so far. Here are a few major types of descriptive analysis methods.
Measures of Frequency
- Count, Percent, Frequency
- It is used to denote how often a particular event occurs
- Researchers use it when they want to showcase how often a response is given
Measures of Central Tendency
- Mean, Median, Mode
- This method is widely used to summarize a distribution by its central points
- Researchers use this method when they want to showcase the most commonly or averagely indicated response
Measures of Dispersion or Variation
- Range, Variance, Standard deviation
- Here the range equals the difference between the highest and lowest scores
- Variance and standard deviation measure the difference between observed scores and the mean
- It is used to identify the spread of scores by stating intervals
- Researchers use this method to show how spread out the data is and the extent to which that spread affects the mean
Measures of Position
- Percentile ranks, Quartile ranks
- These rely on standardized scores, helping researchers identify the relationship between different scores.
- They are often used when researchers want to compare scores with a normal score.
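The measures listed above can all be computed with Python's standard statistics module; a minimal sketch over hypothetical survey ratings on a 1-10 scale:

```python
import statistics

# Hypothetical survey ratings (1-10 scale)
scores = [7, 8, 8, 5, 9, 8, 6, 7]

# Measures of central tendency
print(statistics.mean(scores))    # 7.25
print(statistics.median(scores))  # 7.5
print(statistics.mode(scores))    # 8

# Measures of dispersion
print(max(scores) - min(scores))  # range: 4
print(statistics.pstdev(scores))  # population standard deviation

# Measure of position: quartile cut points
print(statistics.quantiles(scores, n=4))  # [6.25, 7.5, 8.0]
```

Note that `quantiles` defaults to the exclusive method; the inclusive method gives slightly different cut points on small samples.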
In quantitative market research, descriptive analysis often gives absolute numbers, but it is never sufficient to demonstrate the rationale behind those numbers. Nevertheless, it is absolutely necessary to think about the best method of research and data analysis to suit your survey questionnaire and the story researchers want to tell. For example, a mean is the best way to demonstrate the average scores of students in a school. It is better to rely on descriptive statistics when researchers intend to keep the research or outcome limited to the provided sample without generalizing it to the population. For example, when you want to compare the average votes cast in two different cities, descriptive statistics are enough.
Descriptive analysis is also called 'univariate analysis' since it is commonly used to analyze a single variable.
Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample collected from that population. For example, at a movie theater, you can ask some 100 audience members whether they like the movie they are watching. Researchers then use inferential statistics on the collected sample to infer that about 80-90% of people like the movie.
Here are two major areas of inferential statistics
- Estimating parameters: this takes statistics from the sample research data and uses them to demonstrate something about the population parameter.
- Hypothesis testing: this is about sampling research data to answer survey research questions. For example, researchers might want to understand whether a newly launched shade of lipstick is good or not, or whether multivitamin capsules help children perform better at games.
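For the movie-theater example, estimating the population proportion might look like the sketch below; it assumes 85 of 100 sampled viewers liked the film and uses a normal-approximation 95% confidence interval:

```python
import math

# Hypothetical sample: 85 of 100 moviegoers said they liked the film
liked, n = 85, 100
p_hat = liked / n  # sample proportion

# Normal-approximation 95% confidence interval for the population proportion
z = 1.96  # critical value for 95% confidence
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin
print(round(low, 2), round(high, 2))  # 0.78 0.92
```

The interval, not the single sample percentage, is what justifies the "about 80-90% of people" claim about the wider audience.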
These are complex analysis methods used to show the relationship between different variables rather than to describe a single variable. They are often used when researchers want to go beyond absolute numbers and understand the relationships between variables.
Here are some of the commonly used methods for data analysis and research
When researchers are not conducting experimental research but are interested in understanding the relationship between two or more variables, they opt for the correlational research method.
Also called contingency tables, cross-tabulation is a method used to analyze the relationship between multiple variables. Suppose the provided data has age and gender categories presented in rows and columns; a two-dimensional cross-tabulation makes data analysis and research seamless by showing the number of males and females in each age category.
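A two-dimensional cross-tabulation like the gender-by-age example can be built with a simple counter; an illustrative Python sketch over hypothetical respondent records:

```python
from collections import Counter

# Hypothetical respondent records: (gender, age_group)
records = [
    ("female", "18-34"), ("male", "18-34"), ("female", "35-54"),
    ("male", "35-54"), ("female", "18-34"), ("male", "55+"),
]

# Count each (gender, age_group) cell of the contingency table
crosstab = Counter(records)
age_groups = ["18-34", "35-54", "55+"]

for gender in ["female", "male"]:
    row = [crosstab[(gender, a)] for a in age_groups]
    print(gender, row)
```

Each printed row is one row of the contingency table, with one count per age column.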
To understand the strength of the relationship between two variables, researchers rarely look beyond the basic and commonly used regression analysis method, which is also a type of predictive analysis. In this method, you have an important factor called the dependent variable along with multiple independent variables, and regression analysis estimates the impact of the independent variables on the dependent variable. The values of both the independent and dependent variables are assumed to be ascertained in an error-free random manner.
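A minimal sketch of simple linear regression via ordinary least squares, using hypothetical ad-spend (independent) and sales (dependent) figures:

```python
# Hypothetical data: ad spend (independent) vs sales (dependent)
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares: slope = covariance(x, y) / variance(x)
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
print(round(slope, 2), round(intercept, 2))  # 1.99 0.09
```

The fitted slope reads directly as the estimated change in the dependent variable per unit change in the independent variable.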
- Frequency tables:
To get a clear picture, researchers tabulate large volumes of unfragmented data by listing values in ascending order of magnitude along with their corresponding frequencies, giving a clear view of the data set.
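A frequency table of this kind is straightforward to build; a small Python sketch over hypothetical responses (say, household size):

```python
from collections import Counter

# Hypothetical raw responses (e.g., household size per respondent)
values = [2, 4, 3, 2, 5, 3, 2, 4, 2, 3]

# Frequency table: values in ascending order with their counts
freq_table = sorted(Counter(values).items())
for value, count in freq_table:
    print(value, count)
```

Sorting by value reproduces the ascending-order layout described above, with one row per distinct value.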
- Analysis of variance:
This statistical procedure is used for testing the degree to which two or more groups vary or differ in an experiment. A greater degree of variation means the research findings were significant. In many contexts, ANOVA testing and variance analysis are similar.
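A one-way ANOVA F statistic can be computed directly from the between-group and within-group sums of squares; a minimal Python sketch with hypothetical scores for three experimental groups:

```python
import statistics

# Hypothetical scores for three groups in an experiment
groups = [
    [82, 85, 88, 80],
    [75, 78, 74, 77],
    [90, 92, 89, 91],
]

k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)     # total observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-group sum of squares: how far group means sit from the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                 for g in groups)
# Within-group sum of squares: scatter of scores around their own group mean
ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                for g in groups)

# F = (between-group mean square) / (within-group mean square)
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 1))  # 36.6
```

A large F, as here, means the group means differ far more than chance scatter within groups would suggest; the exact significance threshold comes from the F distribution.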
- Researchers must have the necessary skills to analyze data, and getting trained helps them demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
- Research and data analysis methods usually differ by scientific discipline; therefore, obtaining statistical advice at the beginning of the analysis helps in designing the survey questionnaire, selecting data collection methods, and selecting samples.
- The primary aim of research data analysis is to derive insights that are entirely unbiased. Any mistake or bias in collecting data, selecting an analysis method, or choosing an audience sample will result in a biased inference.
- No degree of sophistication in research data analysis can rectify poorly defined objective outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity might mislead readers, so avoid this practice.
- The motive behind data analysis in research is to present honest and reliable data. As far as possible, avoid statistical errors and find ways to deal with common challenges like outliers, missing data, data alteration, data mining, and developing graphical representations.
The sheer amount of data generated daily is frightening, especially now that data analysis has taken center stage. In 2018, the total data supply amounted to 2.8 trillion gigabytes. Based on this fact, it is clear that enterprises willing to survive in a hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights, and adapt to new market needs. QuestionPro is an online survey platform that empowers organizations not only in data analysis and research but also by providing them a medium to collect data through appealing surveys.