Quantitative data can be divided into two distinct groups: categorical and numerical.
Categorical data refer to data whose values cannot be measured numerically but can be either classified into
sets (categories) according to the characteristics that identify or describe the variable or placed in rank order.
They can be further sub-divided into descriptive and ranked. Descriptive data include dichotomous data
(female-male, youngsters-adults, related-unrelated, English-non-English, European-non-European). Ranked (or
ordinal) data are a more precise form of categorical data. Rating or scale questions, such as where a respondent
is asked to rate how strongly she or he agrees with a statement, collect ranked (ordinal) data. Nevertheless, some
researchers argue that, where such data are likely to have similar-sized gaps between data values, they can be
analysed as if they were numerical interval data.
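The coding step this implies can be sketched as follows; a minimal illustration (the Likert labels and the 1-5 coding are invented assumptions, not a standard scheme) of mapping ordinal responses to numerical codes so that, if equal gaps are assumed, they can be treated as interval data:

```python
# Hypothetical sketch: coding Likert-scale (ordinal) responses numerically.
import pandas as pd

responses = pd.Series(
    ["agree", "strongly agree", "neutral", "disagree", "agree"]
)

# Illustrative 1-5 coding; the equal-gap assumption is what lets these
# codes be analysed as if they were interval data.
likert_codes = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

scores = responses.map(likert_codes)
print(scores.mean())  # only meaningful if the gaps are assumed equal
```

The mean computed on the last line is exactly the operation that is contested: it is legitimate only under the equal-gap assumption.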
Numerical data, which are sometimes termed ‘quantifiable’, are those whose values are measured or counted
numerically as quantities. There are two possible ways of sub-dividing numerical data: into interval or ratio
data, alternatively, into continuous or discrete data. If you have interval data you can state the difference or
‘interval’ between any two data values for a particular variable, but you cannot state the relative difference. This
means that values on an interval scale can meaningfully be added and subtracted, but not multiplied and
divided. In contrast, for ratio data you can also calculate the relative difference or ratio between any two data
values for a variable (Figure 1).
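The interval/ratio distinction can be illustrated with temperature scales (the readings below are invented):

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
c_morning, c_noon = 10.0, 20.0
interval = c_noon - c_morning  # a 10-degree difference: meaningful

# Saying "noon is twice as warm" (20 / 10) is NOT meaningful on Celsius,
# because the Celsius zero point is arbitrary.

# Kelvin has a true zero, so it is a ratio scale: ratios ARE meaningful.
k_morning = c_morning + 273.15
k_noon = c_noon + 273.15
ratio = k_noon / k_morning  # a meaningful relative difference (~1.035)
print(interval, round(ratio, 3))
```

The same data value can therefore sit on an interval or a ratio scale depending on how it is measured.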
Your initial analysis should explore data using both tables and diagrams. Your choice of table or diagram will
be influenced by your research question(s) and objectives, the aspects of the data you wish to emphasise, and
the scale of measurement at which the data were recorded. This may involve using:
– tables to show specific values;
– bar charts, multiple bar charts, histograms and, occasionally, pictograms to show highest and lowest values;
– line graphs to show trends;
– pie charts and percentage component bar charts to show proportions;
– box plots to show distributions;
– scatter graphs to show relationships between variables.
(see examples)
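Two of the listed diagram types can be sketched in a few lines; a minimal illustration (using matplotlib, with invented categories and frequencies) of a bar chart for comparing values and a pie chart for showing proportions:

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]  # hypothetical categories
counts = [23, 45, 12, 30]          # hypothetical frequencies

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.bar(categories, counts)        # bar chart: highest and lowest values
ax1.set_title("Bar chart")
ax2.pie(counts, labels=categories, autopct="%1.0f%%")  # pie: proportions
ax2.set_title("Pie chart")
fig.savefig("examples.png")
```

The same frequencies drive both plots; only the aspect of the data being emphasised changes.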
All data should, with few exceptions, be recorded using numerical codes to facilitate analyses. Where possible,
you should use existing coding schemes to enable comparisons. For primary data you should include pre-set
codes on the data collection form to minimize coding after collection. For variables where responses are not
known, you will need to develop a codebook after data have been collected for the first 50 to 100 cases. You
should enter codes for all data values, including missing data. Using an existing coding scheme brings several benefits:
– you save time;
– the codes are normally well tested;
– you can compare your results with those of other (often larger) surveys;
– you can check for errors more easily.
Coding missing data, rather than leaving it blank, preserves the reason a value is missing. Data may be missing for several reasons:
• Data were not required from the respondent, perhaps because of a skip generated by a filter question in the survey.
• The respondent refused to answer the question (a non-response).
• The respondent did not know the answer or did not have an opinion. Sometimes this is treated as implying an
answer; on other occasions it is treated as missing data.
• The respondent may have missed a question by mistake, or the respondent’s answer may be unclear.
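One way to put these ideas into practice is to give each missing-data reason its own code; a hypothetical sketch (the code values themselves are invented, not a standard scheme):

```python
# Distinct numerical codes keep the reasons for missingness distinguishable.
import pandas as pd
import numpy as np

NOT_REQUIRED = -1  # skipped via a filter question
REFUSED = -2       # non-response
DONT_KNOW = -3     # did not know / no opinion
MISSED = -4        # question missed by mistake, or answer unclear

raw = pd.Series([4, 5, REFUSED, 3, NOT_REQUIRED, DONT_KNOW])

# Replace all missing-data codes with NaN before computing statistics,
# so they do not distort the results.
valid = raw.replace([NOT_REQUIRED, REFUSED, DONT_KNOW, MISSED], np.nan)
print(valid.mean())  # mean of the valid responses only
```

Because the reasons remain coded in `raw`, you can later tabulate how often each kind of missingness occurred.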
Subsequent analyses will involve describing your data and exploring relationships using statistics. As before,
your choice of statistics will be influenced by your research question(s) and objectives and the scale of
measurement at which the data were recorded. Your analysis may involve using statistics such as:
– the mean, median and mode to describe the central tendency;
– Chi square, Cramer’s V and Phi to test whether two variables are significantly associated;
– Kolmogorov-Smirnov to test whether the values differ significantly from a specified population;
– t-tests to test whether two group means differ significantly;
– Levene's test to test whether groups have equal variances, and ANOVA or the Friedman test to detect
differences in treatments across multiple test attempts;
– correlation and regression to assess the strength of relationships between variables;
– regression analysis to predict values;
– descriptive analysis, simply to describe the variables.
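A few of these statistics can be computed in one short script; an illustrative sketch (the two sets of scores are invented) using numpy and scipy.stats:

```python
import numpy as np
from scipy import stats

scores_a = np.array([12, 15, 14, 10, 13, 15, 11, 14])
scores_b = np.array([18, 17, 16, 19, 15, 17, 18, 16])

# Central tendency: mean, median and mode
mean, median = scores_a.mean(), np.median(scores_a)
vals, freqs = np.unique(scores_a, return_counts=True)
mode = vals[freqs.argmax()]  # smallest of the most frequent values

# Independent-samples t-test: do the two group means differ significantly?
t_stat, p_value = stats.ttest_ind(scores_a, scores_b)

# Pearson correlation between two paired variables
r, r_p = stats.pearsonr(scores_a, scores_b)
print(mean, median, mode, p_value)
```

A small p-value from the t-test would indicate that the difference between the two group means is unlikely to have occurred by chance alone.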
One of the questions you are most likely to ask in your analysis is: ‘How does a variable relate to another
variable?’ In statistical analysis you answer this question by testing the likelihood of the relationship (or one
more extreme) occurring by chance alone, if there really was no difference in the population from which the
sample was drawn. There are two main groups of statistical significance tests: non-parametric and
parametric.
Non-parametric statistics are designed to be used when your data are not normally distributed. Not
surprisingly, this most often means they are used with categorical data. Non-parametric statistics make no
assumptions about the population, usually because the characteristics of the population are unknown.
In contrast, parametric statistics are used with numerical data. Parametric statistics assume knowledge of the
characteristics of the population, so that inferences (implications) can be made securely; they often
assume a normal, Gaussian distribution, as in reading scores. Non-parametric data are often
derived from questionnaires and surveys (though these can also gain parametric data), while parametric data
tend to be derived from experiments and tests (e.g. examination scores). Although parametric statistics are
considered more powerful because they use numerical data, a number of assumptions about the actual data
being used need to be satisfied if they are not to produce false results (Blumberg et al. 2008). These include:
• the data cases selected for the sample should be independent, in other words the selection of any one case for
your sample should not affect the probability of any other case being included in the same sample;
• the data cases should be drawn from normally distributed populations;
• the populations from which the data cases are drawn should have equal variances;
• the data used should be numerical.
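Two of these assumptions can be checked before running a parametric test; a sketch (with invented, randomly generated data) using the Shapiro-Wilk test for normality and Levene's test for equal variances, falling back to a non-parametric alternative when either looks doubtful:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group1 = rng.normal(loc=50, scale=5, size=40)
group2 = rng.normal(loc=55, scale=5, size=40)

# Shapiro-Wilk: a low p-value suggests the data are NOT normal.
_, p_norm1 = stats.shapiro(group1)
_, p_norm2 = stats.shapiro(group2)

# Levene: a low p-value suggests the variances are NOT equal.
_, p_var = stats.levene(group1, group2)

if min(p_norm1, p_norm2, p_var) > 0.05:
    # Assumptions look tenable: a parametric t-test is defensible.
    _, p = stats.ttest_ind(group1, group2)
else:
    # Fall back to a non-parametric alternative (Mann-Whitney U).
    _, p = stats.mannwhitneyu(group1, group2)
print(p)
```

Checking assumptions first, rather than applying a parametric test by default, is what guards against the false results mentioned above.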
Qualitative data are non-numerical data that have not been quantified. They result from the collection of non-
standardised data that require classification and are analysed through the use of conceptualisation. Qualitative
analysis generally involves one or more of: summarising data, categorising data and structuring data using
narrative to recognise relationships, develop and test propositions and produce well-grounded conclusions. It
can also lead to categories developed from the qualitative data being reanalysed quantitatively.
The processes of data analysis and data collection are necessarily interactive. There are a number of aids that
you might use to help you through the process of qualitative analysis, including temporary summaries, self-
memos and maintaining a researcher’s diary. Qualitative analysis procedures can be related to using either a
deductively based or an inductively based research approach. The use of computer-assisted qualitative data
analysis software (CAQDAS) can help you during qualitative analysis with regard to project management and
data organisation, keeping close to your data, exploration, coding and retrieval of your data, searching and
interrogating to build propositions and theorise, and recording your thoughts systematically.
Ways of collecting data, and the method of representing them, depend on the purpose of the variables under
discussion (see examples: bar chart, histogram).
The correlation coefficient enables you to quantify the strength and direction of the relationship between two variables.
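A minimal sketch of computing Pearson's correlation coefficient r, which ranges from -1 to +1 (the study-hours and exam-score figures below are invented for illustration):

```python
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5, 6])
exam_score = np.array([52, 55, 61, 64, 70, 74])

# Pearson's r from the 2x2 correlation matrix
r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 3))  # close to +1: a strong positive relationship
```

Values near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship.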
Ways of collecting data, and the method of representing them, also depend on the result or outcome to be
obtained from the variables under discussion.