(Emerson) Data analysis and interpretation
Qualitative data are forms of information gathered in a non-numeric form, such as interview transcripts, notes, video and audio recordings, images and text documents. Common examples of such data are:
Interview transcripts
Field notes (notes taken in the field being studied)
Video recordings
Audio recordings
Images
Documents (reports, meeting minutes, e-mails)
Manual methods
Notes and interviews are transcribed, and transcripts, images and other materials are copied. The researcher then uses folders, filing cabinets, wallets and the like to gather together materials that are examples of similar themes or analytic ideas. This facilitates easy retrieval of linked material, but necessitates two things:
1. Making multiple copies of the original data as the same data may represent two or more themes or
analytic ideas.
2. A careful method of labelling the material in the folders or files so that it is possible to check back
and examine the broader context in which that data occurred. The analyst needs to know where the
snippets of data in the files came from so that they can be re-contextualized.
Computer based
With the advent of the personal computer, which proved excellent at manipulating text, it became clear that with the right software much of this manual organization could be done efficiently on a PC. Many researchers have therefore replaced physical files and cabinets with computer-based directories and files, along with word processors to write and annotate texts. Many analysts now also use dedicated computer-assisted qualitative data analysis (CAQDAS) packages that make the coding and retrieval of text easy. Coding can be done manually or using qualitative data analysis software such as NVivo, ATLAS.ti 6.0, HyperRESEARCH 2.8, MAXQDA and others.
Qualitative data analysis can be divided into the following five categories:
1. Content analysis. This refers to the process of categorizing verbal or behavioral data to classify,
summarize and tabulate the data.
2. Narrative analysis. This method involves the reformulation of stories presented by respondents, taking into account the context of each case and the different experiences of each respondent. In other words, narrative analysis is the revision of primary qualitative data by the researcher.
3. Discourse analysis. A method of analysis of naturally occurring talk and all types of written text.
4. Framework analysis. This is a more advanced method that consists of several stages, such as familiarization, identifying a thematic framework, coding, charting, mapping and interpretation.
5. Grounded theory. This method of qualitative data analysis starts with an analysis of a single case to
formulate a theory. Then, additional cases are examined to see if they contribute to the theory.
Qualitative data analysis can be conducted through the following three steps:
Step 1: Developing and Applying Codes. Coding can be explained as the categorization of data. A ‘code’ can be a word or a short phrase that represents a theme or an idea. All codes need to be assigned meaningful titles. A wide range of non-quantifiable elements such as events, behaviours, activities and meanings can be coded.
There are three types of coding:
1. Open coding. The initial organization of raw data to try to make sense of it.
2. Axial coding. Interconnecting and linking the categories of codes.
3. Selective coding. Formulating the story through connecting the categories.
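The mechanics of coding and retrieval can be sketched in a few lines of code. The following is a minimal illustration only, not part of any CAQDAS package; the code titles and interview snippets are invented for the example:

```python
from collections import defaultdict

# Open coding sketch: file snippets of raw data under meaningfully
# titled codes. Snippets and code titles here are hypothetical.
codebook = defaultdict(list)

def apply_code(code, snippet):
    """Attach a code (a word or short phrase) to a snippet of data."""
    codebook[code].append(snippet)

apply_code("time pressure", "I never have enough hours to finish a search.")
apply_code("tool frustration", "The catalogue keeps timing out on me.")
apply_code("time pressure", "Deadlines force me to skim rather than read.")

# Retrieval by code: everything filed under one theme, ready for
# axial coding (looking for links between codes).
print(len(codebook["time pressure"]))  # 2
```

This mirrors the manual method described earlier: the dictionary plays the role of the labelled folders, and retrieving all snippets under one code is the electronic equivalent of pulling a folder from the cabinet.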
Step 2: Identifying themes, patterns and relationships. Unlike quantitative methods, qualitative data analysis has no universally applicable techniques for generating findings. The analytical and critical thinking skills of the researcher play a significant role in data analysis in qualitative studies. Therefore, no qualitative study can be repeated to generate exactly the same results.
Nevertheless, there is a set of techniques that you can use to identify common themes, patterns and
relationships within responses of sample group members in relation to codes that have been specified in
the previous stage.
Specifically, the most popular and effective methods of qualitative data interpretation include the
following:
Word and phrase repetitions – scanning primary data for words and phrases most commonly used by respondents, as well as words and phrases used with unusual emotion;
Primary and secondary data comparisons – comparing the findings of interview/focus
group/observation/any other qualitative data collection method with the findings of literature review
and discussing differences between them;
Search for missing information – discussions about which aspects of the issue were not mentioned by respondents, although you expected them to be mentioned;
Metaphors and analogies – comparing primary research findings to phenomena from a different area and discussing similarities and differences.
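The first of these techniques, scanning for word repetitions, is straightforward to sketch with a frequency count. The transcript fragment below is a hypothetical example:

```python
import re
from collections import Counter

# Hypothetical fragment of an interview transcript.
transcript = (
    "Searching takes time, and time is what students lack. "
    "Time pressure changes how we search and what we search for."
)

# Lower-case the text, keep only alphabetic words, drop very short
# function words, then count repetitions of the content words left.
words = re.findall(r"[a-z]+", transcript.lower())
counts = Counter(w for w in words if len(w) > 3)
print(counts["time"], counts["search"])  # 3 2
```

A count like this only points the analyst at candidate themes ("time" recurs unusually often here); deciding whether a repeated word marks a genuine theme remains an interpretive judgement.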
Step 3: Summarizing the data. At this last stage you need to link the research findings to the hypotheses or to the research aim and objectives. When writing the data analysis section, you can use noteworthy quotations from the transcripts to highlight major themes within the findings and possible contradictions.
Grounded theory works in the opposite way to traditional research, and it may even appear to contradict the scientific method. An inductive methodology, grounded theory comprises the following four stages:
1. Codes. Anchors are identified to collect the key points of data.
2. Concepts. Codes of similar content are collected to be able to group the data.
3. Categories. Broad groups of similar concepts are formed to generate a theory.
4. Theory. A collection of explanations is generated that explains the subject of the research (hypothesis).
You could then go on to explain why a particular answer is expected - you put forward a theory. Most often, when researchers are interested in hypothesis testing, they will conduct an experiment to gather their data. For example, we could take one sample of students, give them some training in how to search, and then ask them to find some specific information. We ask another sample of students to search for the same specific information without the training, and we see which group did better through a variety of different measures, some subjective and some objective.
Quantitative studies result in data that provide quantifiable, objective and easy-to-interpret results. The data can typically be summarized in a way that allows for generalizations applicable to the greater population, and the results can be reproduced. The design of most quantitative studies also helps to ensure that personal bias does not impact the data. Quantitative data can be analyzed in several ways. This module describes some of the most commonly used quantitative analysis procedures.
The first step in quantitative data analysis is to identify the levels or scales of measurement as nominal,
ordinal, interval or ratio. The data can typically be entered into a spreadsheet and organized or “coded” in
some way that begins to give meaning to the data.
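As a simple illustration of this kind of coding, a nominal variable such as gender can be mapped to numbers before analysis. The responses and code assignments below are hypothetical:

```python
# Coding a nominal variable: each category gets an arbitrary number.
# The responses and code numbers are invented for illustration.
gender_codes = {"female": 1, "male": 2}

responses = ["female", "male", "female", "female", "male"]
coded = [gender_codes[r] for r in responses]
print(coded)  # [1, 2, 1, 1, 2]
```

Because the scale is nominal, the numbers carry no order or magnitude: counting how many 1s and 2s appear is meaningful, but averaging them is not, which is why the scale of measurement must be identified before choosing statistics.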
Descriptive Statistics
The next step is to use descriptive statistics to summarize or “describe” the data. It can be difficult to
identify patterns or visualize what the data is showing if you are just looking at raw data. Following is a
list of commonly used descriptive statistics:
Frequencies – a count of the number of times a particular score or value is found in the data set
Percentages – used to express a set of scores or values as a percentage of the whole
Mean – numerical average of the scores or values for a particular variable.
To calculate the arithmetic mean add up all the data, and then divide this total by the number of values in the data. For
example seven students are timed whilst searching for information on the Internet. What is the arithmetic mean time
taken to search? 2 minutes + 2 minutes + 3 minutes + 5 minutes + 5 minutes + 7 minutes + 8 minutes = 32 minutes
There are 7 values, so you divide the total by 7: 32 ÷ 7 = Arithmetic mean = 4.57 minutes (to 2 decimal places).
Median – the numerical midpoint of the scores or values, the one at the center of the distribution of the
scores.
Arrange the values in order: 2, 2, 3, (5), 5, 7, 8. The middle value is marked in brackets, and it is 5. So the median is 5. If there are two values in the middle, the median is the mean of those two values.
Mode – the most common score or value for a particular variable. The mode time taken to search is
the value which appears the most often in the data.
It is possible to have more than one mode if more than one value appears most often. So: 2, 2, 3, 5, 5, 7, 8. The values which appear most often are 2 and 5. They both appear more times than any of the other data values. So the modes are 2 and 5.
The Range - To find the range, you first need to find the lowest and highest values in the data. The
range is found by subtracting the lowest value from the highest value.
The data values: 2 , 2 , 3 , 5 , 5 , 7 , 8. The lowest value is 2 and the highest value is 8. Subtracting the lowest from
the highest gives: 8 - 2 = 6. So the range is 6.
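The four worked examples above can be reproduced with Python's standard statistics module, using the same seven search times:

```python
import statistics

# Search times in minutes for the seven students in the examples above.
times = [2, 2, 3, 5, 5, 7, 8]

mean = statistics.mean(times)          # 32 / 7
median = statistics.median(times)      # middle value of the sorted data
modes = statistics.multimode(times)    # every value tied for most frequent
value_range = max(times) - min(times)  # highest minus lowest

print(round(mean, 2), median, modes, value_range)  # 4.57 5 [2, 5] 6
```

Note that `statistics.mode` would raise an error on older Python versions for this bimodal data; `multimode` (Python 3.8+) returns all tied values, matching the hand calculation of modes 2 and 5.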
It is now apparent why determining the scale of measurement is important before beginning to utilize descriptive statistics. For example, nominal scales where data are coded, as in the case of gender, would not have a mean score. Therefore, you must first use the scale of measurement to determine what type of descriptive statistic is appropriate. The results are then expressed as exact numbers and allow you to begin to give meaning to the data. For some studies, descriptive statistics may be sufficient if you do not need to generalize the results to a larger population. For example, if you are comparing the percentage of teenagers that smoke in private versus public high schools, descriptive statistics may be sufficient. However, if you want to use the data to make inferences or predictions about the population, you will need to go a step further and use inferential statistics.
Inferential Statistics
Inferential statistics examine the differences and relationships between two or more samples of the population. These are more complex analyses that look for significant differences between variables and between the sample groups of the population. Inferential statistics allow you to test hypotheses and generalize results to the population as a whole. Following is a list of basic inferential statistical tests:
Correlation – seeks to describe the nature of a relationship between two variables, such as strong, weak, positive, negative or statistically significant. If a correlation is found, it indicates a relationship or pattern, but keep in mind that it does not indicate or imply causation.
Analysis of Variance (ANOVA) – tries to determine whether the difference between the means of two sampled groups is statistically significant or due to random chance. For example, the test scores of two groups of students are examined and found to differ. The ANOVA will tell you if the difference is significant, but it does not speculate regarding “why”.
Regression – used to determine whether one variable is a predictor of another variable. For example, a regression analysis may indicate whether or not participating in a test preparation program results in higher scores for high school students. It is important to note that regression analyses are like correlations in that causation cannot be inferred from the analyses.
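Both a correlation coefficient and a regression slope can be computed directly from their definitions. The sketch below uses invented data (hours in a hypothetical test-preparation programme versus test score) and illustrates Pearson's r and a simple least-squares slope; it is not a full significance test:

```python
import math

# Invented example data: hours of test preparation (x), test score (y).
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 68]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson correlation: covariance over the product of the standard
# deviations. r near +1 means a strong positive relationship; as the
# text notes, this still does not imply causation.
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
var_x = sum((a - mean_x) ** 2 for a in x)
var_y = sum((b - mean_y) ** 2 for b in y)
r = cov / math.sqrt(var_x * var_y)

# Simple linear regression: slope of the least-squares line for y on x,
# i.e. the predicted score gain per extra hour of preparation.
slope = cov / var_x

print(round(r, 3), slope)  # 0.994 4.1
```

In practice these values would be computed by a statistics package, which would also report the significance level; the point here is only that both measures derive from the same covariance term, which is why regression, like correlation, cannot by itself establish causation.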
Finally, the type of data analysis will also depend on the number of variables in the study. Studies may be
univariate, bivariate or multivariate in nature.
A range of analytical software can be used to assist with the analysis of quantitative data. The following table illustrates the advantages and disadvantages of three popular quantitative data analysis packages: Microsoft Excel, Microsoft Access and SPSS.
Microsoft Excel (spreadsheet)
Advantages:
- Cost effective or free of charge
- Files can be sent as e-mail attachments and viewed on most smartphones
- All-in-one program
- Files can be secured with a password
Disadvantages:
- Big Excel files may run slowly
- The numbers of rows and columns are limited
- Advanced analysis functions are time-consuming for beginners to learn
- Virus vulnerability through macros

Microsoft Access
Advantages:
- One of the cheapest among premium programs
- Flexible information retrieval
- Ease of use
Disadvantages:
- Difficulty in dealing with large databases
- Low level of interactivity
- Remote use requires installation of the same version of Microsoft Access
The following table contains examples of research titles, elements to be coded and identification of
relevant codes:
Research title: Born or bred: revisiting The Great Man theory of leadership in the 21st century
Elements to be coded: Leadership practice
Codes: Born leaders; Made leaders; Leadership effectiveness

Research title: A study into advantages and disadvantages of various entry strategies to the Chinese market
Elements to be coded: Market entry strategies
Codes: Wholly-owned subsidiaries; Joint ventures; Franchising; Exporting; Licensing

Research title: Impacts of CSR programs and initiatives on brand image: a case study of Coca-Cola Company UK
Elements to be coded: Activities, phenomena
Codes: Philanthropy; Supporting charitable causes; Ethical behaviour; Brand awareness; Brand value

Research title: An investigation into the ways of customer relationship management in the mobile marketing environment
Elements to be coded: Tactics
Codes: Viral messages; Customer retention; Popularity of social networking sites