Sampling 1: Sampling-based studies as case studies
Daniel Gile
12 July 2006
The concept of sampling implies that what is sampled is representative of a
larger entity with similar characteristics. Thus, the most fundamental property
sought in a sample is its representativeness with respect to that larger
entity.
In statistics, samples are subsets of populations
(of people, light bulbs, rocks, manufacturing errors, crime occurrences, votes,
etc.). In Translation Studies (TS), the units that make up samples and
populations are translators, students, texts, text genres, words, errors, user
reactions, etc.
If all units are identical with respect to the
characteristics to be investigated (translation strategies, speed, errors,
quality, linguistic features of the target text etc.), studying one unit should
be enough to learn about the population. In real life, such cases are rare.
Variability is the rule, and it forces investigators to look at larger samples
(whose size depends largely on how much the relevant features vary) and to draw
inferences from the central tendencies and variability measured in the sample.
Because of this variability, such inferences are uncertain. Inferential
statistics quantifies that uncertainty with mathematically grounded tests.
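The link between variability and sample size can be illustrated with a minimal
Python sketch. It uses the textbook normal-approximation formula
n = (z * sd / margin)^2 for estimating a mean. The function name
required_sample_size, the translation-speed figures and the 95% confidence
level (z = 1.96) are assumptions made for illustration; they do not come from
any actual study.

```python
import math

def required_sample_size(sd, margin, z=1.96):
    """Approximate number of units needed to estimate a population mean
    to within +/- `margin`, given a standard deviation `sd`.

    Uses the normal-approximation formula n = (z * sd / margin) ** 2.
    At least one unit is always needed, even when sd is 0.
    """
    return max(1, math.ceil((z * sd / margin) ** 2))

# Hypothetical example: translation speed in words per minute,
# to be estimated to within +/- 2 words per minute.
identical_units = required_sample_size(sd=0, margin=2)   # no variability
low_variability = required_sample_size(sd=5, margin=2)
high_variability = required_sample_size(sd=20, margin=2)

print(identical_units, low_variability, high_variability)
```

With these assumed figures the sketch gives 1, 25 and 385 units respectively:
when all units are identical, one unit suffices, and quadrupling the standard
deviation multiplies the required sample size by roughly sixteen.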
It is important to understand that in most
studies, sampling actually occurs in many dimensions, only one or a few of
which are controlled. For instance, an experiment on the effect of experience
on translation quality may be designed around the comparison of the performance
of translators with different levels of experience (say 0-4, 5-9, 10-14, and 15
or more years); this will be the main dimension of the study. Experimenters
will probably attempt to make sure that all participants have the same language
combination and similar knowledge of the passive language (a second dimension) and
of the theme addressed in the text (a third dimension). Perhaps they will
attempt to make sure that all participants have had similar background
education (fourth dimension) and translation training (fifth dimension), that
their usual professional market is similar (sixth dimension), etc., but they
may not control motivation (seventh dimension), general personality features
(eighth dimension), physiological parameters at the time of the experiment
(ninth dimension), the subjects’ mood (tenth dimension), etc. Different values
along any of these dimensions could affect the subjects’ work, and the
variability in each uncontrolled dimension may “hide” fundamental tendencies.
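How uncontrolled variability can “hide” a tendency may be sketched numerically.
The sketch below assumes, as a simplification, that each uncontrolled dimension
acts as an independent, additive source of variation, so that their variances
add up. The function name standardized_effect and all figures are hypothetical.

```python
import math

def standardized_effect(mean_difference, noise_sds):
    """Difference between two group means, expressed in units of the
    combined standard deviation of independent sources of variation
    (variances of independent sources add)."""
    total_sd = math.sqrt(sum(sd ** 2 for sd in noise_sds))
    return mean_difference / total_sd

# Hypothetical figures: experienced translators score 3 points higher
# on a quality scale than novices.
# Case 1: the only source of variation is individual skill (sd = 3).
controlled = standardized_effect(3.0, [3.0])
# Case 2: an uncontrolled dimension, e.g. mood (sd = 4), adds variation.
uncontrolled = standardized_effect(3.0, [3.0, 4.0])

print(controlled, uncontrolled)
```

Under these assumptions the same 3-point difference shrinks from 1.0 to 0.6
standard deviations once the second source of variation is left uncontrolled,
which makes it harder to detect with a sample of a given size.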
On the other hand, if a particular experiment
does show some “significant” trend (i.e. one which is not likely to have been
caused by chance alone), this trend has been observed only for a
particular set of parameter values: a certain text or text genre, a specific
language combination, certain levels of knowledge of the passive language and
of the theme, etc. There is no guarantee that the same trend would be found
with other sets of parameter values, say with different texts or text genres,
different language combinations, etc. In other words, even with a fairly large
sample, most studies remain case studies with respect to several potentially
relevant dimensions.
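The sense in which a study is a case study along uncontrolled dimensions can be
made concrete by counting parameter combinations. The following sketch is
purely illustrative: the dimension names, their values and the helper function
coverage are hypothetical and greatly simplified.

```python
from itertools import product

# Hypothetical values for three of the dimensions mentioned above.
dimensions = {
    "text_genre": ["legal", "medical", "literary"],
    "language_combination": ["EN>FR", "FR>EN", "JA>EN", "DE>EN"],
    "theme_knowledge": ["low", "medium", "high"],
}

def coverage(dimensions, tested):
    """Return (number of tested combinations, number of possible ones)."""
    all_cells = set(product(*dimensions.values()))
    tested_cells = set(tested) & all_cells
    return len(tested_cells), len(all_cells)

# One experiment, however many translators take part, typically fixes
# a single value on each of these dimensions.
tested = [("legal", "EN>FR", "medium")]
covered, total = coverage(dimensions, tested)

print(f"{covered} of {total} parameter combinations observed")
```

Here a single experiment covers 1 of 36 possible combinations, whatever the
number of participants; a trend found in that one cell says nothing certain
about the other 35.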