Experimental I

Why conduct surveys?

Self-reports are easy to administer, low in cost, and generally considered to provide reliable reports of a wide variety of characteristics or features.

Sampling from populations

Theoretical distributions of expected outcomes (all things being equal) are used to estimate the likelihood that specific samples represent their populations. This helps pollsters estimate the 'truth' of their polls.

Scientists, like pollsters, 'make bets' that their samples represent the real populations and that their hypothesis is not incorrect (falsification). The null hypothesis is that there is no difference.

Sampling theory & Probabilities

Problems are addressed on the basis of the probability of events: questions are asked about samples and the theoretical populations from which they came.

2 common sorts of problems:

1) Compare theoretical distribution to observed distribution (your data)

2) Calculate expected frequencies for X values
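Both sorts of problem can be sketched in a few lines. The example below computes expected frequencies under a theoretical distribution and then compares them with an observed distribution using a chi-square goodness-of-fit statistic. The binomial model (number of heads in 3 fair coin tosses), the 80 repetitions, and the observed counts are assumptions made up for illustration; they are not data from these notes.

```python
from math import comb

def expected_frequencies(k, p, n_samples):
    """Expected number of samples showing x = 0..k events under a binomial model."""
    q = 1 - p
    return [n_samples * comb(k, x) * p**x * q**(k - x) for x in range(k + 1)]

def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Theoretical distribution: heads in 3 tosses of a fair coin, repeated 80 times.
expected = expected_frequencies(k=3, p=0.5, n_samples=80)

# Hypothetical observed counts for 0, 1, 2 and 3 heads (invented for this sketch).
observed = [8, 33, 27, 12]

stat = chi_square(observed, expected)
print(expected)
print(round(stat, 2))
```

With 4 categories there are 3 degrees of freedom, for which the .05 critical value of chi-square is about 7.81, so a statistic well below that would give no reason to doubt the theoretical distribution.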

Probability = relative frequency of events X, Y, Z

Probability distribution = theoretical frequency distribution of which there are many

Binomial & Normal distributions are theoretical sampling distributions that are used in sorting out these problems of representation and inference.

E.g., binomial expansions

(p + q)^k

p = probability of event happening

q = 1 - p (probability of not happening)

k = sample size (number of trials)

the coefficient indicates the number of ways an outcome may occur

for k = 3 ----> p^3 + 3p^2q + 3pq^2 + q^3 = 1

three coins: TTT (1 way), HTT (3 ways), HHT (3 ways), HHH (1 way)
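The coefficients for the three-coin case can be checked by brute force. The sketch below lists every possible sequence of three tosses, counts how many sequences give each number of heads, and compares the counts with the binomial coefficients. Taking p as the probability of heads and setting p = 0.5 (a fair coin) are assumptions for the example.

```python
from collections import Counter
from itertools import product
from math import comb

def expansion_terms(k, p):
    """Return (coefficient, probability) for each term of the binomial expansion, for 0..k events."""
    q = 1 - p
    return [(comb(k, x), comb(k, x) * p**x * q**(k - x)) for x in range(k + 1)]

# All 2^3 = 8 sequences of three tosses, grouped by number of heads.
sequences = ["".join(tosses) for tosses in product("HT", repeat=3)]
ways = Counter(seq.count("H") for seq in sequences)

terms = expansion_terms(k=3, p=0.5)
for heads, (coefficient, probability) in enumerate(terms):
    print(heads, "heads:", ways[heads], "ways, coefficient", coefficient, "probability", probability)

total = sum(probability for _, probability in terms)
```

The number of ways counted from the sequences matches the coefficient of each term, and the terms sum to 1.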

Binomial distribution is appropriate for

1) discrete events

2) independent events

3) 2 classes of events (p&q)

4) any sample size, but difficult for large k
(for k > 30, use the Poisson or normal distribution to approximate the binomial)

5) any value of p (from 0 to 1)
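To see why the normal distribution can stand in for the binomial when k is large, the sketch below compares an exact binomial probability with its normal approximation. The values k = 40, p = 0.5 and the cut-off of 15 events are arbitrary choices for illustration, and the continuity correction (adding 0.5) is a standard refinement that these notes do not mention.

```python
from math import comb, erf, sqrt

def binomial_cdf(x, k, p):
    """Exact P(X <= x) for a binomial with k trials and event probability p."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(x + 1))

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx_cdf(x, k, p):
    """Approximate P(X <= x) using a normal with the binomial's mean and SD."""
    mean = k * p
    sd = sqrt(k * p * (1 - p))
    return normal_cdf((x + 0.5 - mean) / sd)  # +0.5 is the continuity correction

exact = binomial_cdf(15, k=40, p=0.5)
approx = normal_approx_cdf(15, k=40, p=0.5)
print(round(exact, 4), round(approx, 4))
```

The approximation is closest when p is near 0.5; for very small or very large p the Poisson distribution is the more usual choice.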

Confidence intervals give you information about the likely size of the error in the sample relative to the population; this error is called sampling error.

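Sample size has an impact on the width of the confidence limits: the sample needs to be larger for narrower limits. E.g., from Cozby:

[Table from Cozby: sample size required for precision of ± 3 %, ± 5 % and ± 10 %, by size of population, with the largest population row being "over 100,000".]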
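The link between sample size and the width of the confidence limits can be sketched with the usual formula for a proportion, n = z^2 p(1 - p) / e^2. The sketch assumes a 95 % confidence level (z = 1.96), the most conservative proportion (p = 0.5) and a very large population; these are assumptions of the example, and the results are not taken from Cozby's table.

```python
from math import ceil

def sample_size(margin, z=1.96, p=0.5):
    """Smallest n whose confidence limits are within +/- margin for a proportion.

    Assumes simple random sampling from a very large population.
    """
    return ceil(z**2 * p * (1 - p) / margin**2)

for margin in (0.03, 0.05, 0.10):
    print(f"+/- {margin:.0%}: n = {sample_size(margin)}")
```

Because the margin is squared in the denominator, halving the width of the limits takes roughly four times as many respondents.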





Sampling Techniques

Probability sampling - making use of probabilities to select specific people. E.g., simple random sampling - using phone numbers to randomly select respondents.

Stratified random sampling - the population is divided into groups or "strata", and people are randomly selected from within each stratum.

E.g., if particular ethnic groups live in particular parts of town, census tracts can be used to randomly select people from various parts of town so that the sample matches the overall proportions in the general population.
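A minimal sketch of proportional stratified sampling follows. The three tracts, their sizes and the total sample of 50 are hypothetical, and the function name `stratified_sample` is introduced only for this illustration.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Randomly draw from each stratum in proportion to its share of the population.

    strata maps a stratum name to the list of its members. Rounding means the
    combined sample can differ from total_n by a person or two in general.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n)  # simple random sample within the stratum
    return sample

# Hypothetical census tracts holding 60 %, 30 % and 10 % of the population.
tracts = {
    "north": list(range(0, 600)),
    "south": list(range(600, 900)),
    "east": list(range(900, 1000)),
}
chosen = stratified_sample(tracts, total_n=50, seed=1)
print({name: len(people) for name, people in chosen.items()})
```

Each tract contributes to the sample in the same proportion as it contributes to the population, which is what distinguishes this from a simple random sample of the whole town.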

Cluster sampling - one may wish to find people in existing units, e.g., schools in various school districts across the whole province. The districts are the clusters and the schools are the samples drawn from them.

Haphazard sampling - take whomever you can find, from whatever channels are available, such as word of mouth, subject pools, or advertising.

'Snowballing' - working through social networks to gradually build larger samples through clumping.

Evaluating samples

Sampling frames are used by researchers to provide definitions of populations from within which their samples arise.

Response or return rates are important as they indicate the proportion of people who agree to respond, or the number of completed vs. incomplete questionnaires.

Constructing questions

Defining Research Objectives - What is the question you are asking? What type of information can you acquire? The types of questions you ask will be defined partly by the nature of the sample and partly by your intended goals.

What to survey? E.g.,
Attitudes and beliefs are most commonly assessed through survey questionnaires. Often such traits or styles are related to factual and demographic information that respondents can provide. Thus, questions may cover when and where you were born, how much TV you watch, what kind of vacuum you own, or how much beer you drink.

Question wording

Simplicity - it is best to keep questions to a simple level of comprehension. However, when dealing with complex issues, that is sometimes impossible.

Double-barreled questions - arise when two or more thoughts or ideas are being questioned at once. Again, this is sometimes unavoidable, and is then best handled with a semantic differential. (divide and add together?)

Semantic Differential - bipolar scales for the relative comparison of two terms or ideas. E.g., good-bad, strong-weak, active-passive, introvert-

Loaded questions - leading or strong terms (negative emotional, morally judgmental)

Negative wording - can be confusing, yet is sometimes needed to avoid positive response bias such as . . .

Yea-saying - always agreeing; a positive response bias.



Responses to Questions

Open-ended vs. fixed questions - compare open-ended questions about Canadian identity, and the diversity of responses they produce, with Agree-Disagree items.

Number of response alternatives - try to keep them to a moderate level (5 - 7) not too many

Response sets arise when respondents provide responses that are biased or altered in some form.

E.g., faking good or bad, positive bias, random responding, ...

Rating Scales

Types of scales vary greatly, from numbers to pictures to a check mark on a long line.

non-verbal scales - can use faces or symbols to acquire information (traffic lights, hands, ...)


Labels for response alternatives - e.g., strongly agree to strongly disagree; never, rarely, sometimes, usually, always



Finalising the questionnaire

Keep the same style throughout, including the fonts and scales, unless there is a good reason not to do so (e.g., construct validation, or when style is itself a variable).

Rewriting & Refining

Use face validity checks (have friends and colleagues read it through) and pilot tests to refine questionnaires. Convergent and discriminant validity studies also help to work out theoretical constructs and questionnaire forms.

Administering surveys

Questionnaires can be administered to groups in person, through the mail or e-mail.

Interviews can be administered over the telephone or face to face. There are possible advantages to each, but both may fall prey to interviewer bias. Watch out for biases in asking leading questions and in interpreting or supplying answers. Sometimes interviewers "look for" answers or complete thoughts for participants.

Focus groups are also a good way to survey attitudes or feelings on certain issues. They can range from 6-8 people to forums of as many as 30, where issues of concern can be addressed or answered.

Experimental Design

Experimental designs are generally used to explore questions of causality through inferential hypothesis testing. Usually, control is used to isolate possible causes.

Confounding factors occur when two or more variables or potential sources of influence or causality occur together. Experimental controls may ignore or fail to test important factors that might be the real cause behind the apparent influence of an independent variable.

Poor designs to avoid include those with no control group or comparison. These yield simple correlation at best.

One group pre-post test may also be confounded by the following factors:

History - any event that occurs between pre & post-test that may have an influence

Maturation - physical, psychological and social development that may change the dependent variable

Testing - taking a test once may change your behaviour the next time you take that test

Instrument decay - the instrument itself may not work as well on second or third trials. People may get bored.

Regression toward the mean - a tendency for scores to move towards the mean on second trials, particularly for extreme scores (high or low)
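Regression toward the mean can be seen in a small simulation with no treatment at all. The sketch assumes each test score is a stable "true" score plus independent random error; the means, standard deviations, sample size and the top-10 % cut-off are arbitrary values chosen for illustration.

```python
import random
from statistics import mean

def simulate_retest(n=10000, seed=0):
    """Return the mean first-test and second-test scores of the top 10 % on test 1."""
    rng = random.Random(seed)
    true_score = [rng.gauss(100, 10) for _ in range(n)]
    test1 = [t + rng.gauss(0, 10) for t in true_score]  # true score + error
    test2 = [t + rng.gauss(0, 10) for t in true_score]  # same people, fresh error
    cutoff = sorted(test1)[int(0.9 * n)]
    top = [i for i in range(n) if test1[i] >= cutoff]  # extreme scorers on test 1
    return mean(test1[i] for i in top), mean(test2[i] for i in top)

first, second = simulate_retest()
print(round(first, 1), round(second, 1))
```

The extreme group scores lower on the second test even though nothing was done to them, which is why a one group pre-post design that selects extreme scorers can mistake regression for a treatment effect.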

Non-equivalent control group design occurs when there are selection factors that may play a role in the outcome. E.g., when the 'treatment' group is self-selected relative to the control group, as with smokers, or when volunteers from a phone list form the treatment group and the others are called the control.

Well-designed experiments attempt to account for specific possible confounding variables. This means keeping K things constant or controlled while varying X factors across Y levels.

Post-test only - participants are taken from the population and randomly assigned to groups, and the dependent variable is measured. Assumes that the population is homogeneous and normally distributed
(probability functions: normal, t, χ², F).

Pre-post test - baseline studies where a first recording is done, the treatment is presented then a second recording is made. The first data record can be compared with the second or subsequent ones.

Assignment to groups is done methodically to ensure certain conditions for the study. Usually, random assignment is done to establish 'equivalent' groups.

Random - can be simple random assignment using first come or even random number generators
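Simple random assignment with a random number generator might look like the sketch below. The participant labels, the two condition names and the function name `random_assignment` are hypothetical.

```python
import random

def random_assignment(participants, conditions, seed=None):
    """Shuffle the participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

people = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participants
groups = random_assignment(people, ["treatment", "control"], seed=42)
print({condition: len(members) for condition, members in groups.items()})
```

Dealing out a shuffled list keeps the group sizes equal (or within one of each other), which a coin flip per participant would not guarantee.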

Matched pairs or groups can also be randomly assigned to conditions. Keeping the pre-test scores as markers for pairing (high-low) can assign
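One possible reading of matched-pairs assignment is sketched below: rank participants on the pre-test, pair neighbours in the ranking, then randomly send one member of each pair to each condition. Pairing adjacent ranks is an assumption of this sketch, and the pre-test scores are invented.

```python
import random

def matched_pairs(pretest, seed=None):
    """Pair participants with adjacent pre-test ranks and split each pair at random.

    pretest maps participant -> pre-test score. With an odd number of
    participants, the lowest scorer is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(pretest, key=pretest.get, reverse=True)  # high to low
    treatment, control, pairs = [], [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        pairs.append(tuple(pair))
        rng.shuffle(pair)  # random assignment within the matched pair
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control, pairs

# Hypothetical pre-test scores.
scores = {"Ann": 92, "Raj": 55, "Lee": 88, "Kim": 71, "Sam": 69, "Eva": 58}
treatment, control, pairs = matched_pairs(scores, seed=3)
print(pairs)
print(treatment, control)
```

Because each pair contributes one person to each condition, the two groups start with very similar pre-test distributions.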

Repeated measures have advantages over equivalent groups because the same people are compared across conditions. "Within subjects" designs give the same or similar measures to each participant, who acts as his or her own 'control' (as in baseline studies). However, ordering, practice, fatigue and other factors may play a role.

Order effects - sequence of events, tests and procedures may have significant impact on the performance of the participant, tainting results

Counterbalancing is when researchers alter the order of presentation for the various conditions. There are several methods.

Latin squares are constructed to ensure that 1) each condition appears in each ordinal position and 2) each condition precedes and follows each other condition once. To determine order effects:

[Figure 2 from Cozby - Mental Rotations: Latin square with one order of conditions per row, Row 1 to Row 4.]





The number of orders (rows) equals the number of conditions, with two or more people assigned to each order.
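A square with both properties (each condition in each ordinal position, and each condition immediately preceding and following every other condition once) can be built with a standard construction for an even number of conditions. The sketch below is one such construction; the condition labels A to D are placeholders, and the resulting rows are not claimed to match those in Cozby's figure.

```python
def balanced_latin_square(n):
    """Build an n x n balanced Latin square of condition indices (n must be even)."""
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    for position in range(1, n):
        if position % 2 == 1:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
    # Each later row shifts every condition in the previous row up by one.
    return [[(c + row) % n for c in first] for row in range(n)]

conditions = ["A", "B", "C", "D"]  # placeholder labels
square = balanced_latin_square(len(conditions))
for number, row in enumerate(square, start=1):
    print("Row", number, " ".join(conditions[i] for i in row))
```

Each row is one order of presentation, so participants are divided equally among the rows.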

Randomised blocks - can also be used to eliminate order effects where a number of blocks are presented each having a randomised sequence within (e.g., lists of words).
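A short sketch of randomised blocks for a word list follows. The four words, the three blocks and the function name `randomised_blocks` are hypothetical.

```python
import random

def randomised_blocks(items, n_blocks, seed=None):
    """Return n_blocks copies of items, each in its own random order."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = list(items)
        rng.shuffle(block)  # fresh random sequence within this block
        blocks.append(block)
    return blocks

words = ["apple", "chair", "river", "candle"]  # hypothetical word list
blocks = randomised_blocks(words, n_blocks=3, seed=7)
for number, block in enumerate(blocks, start=1):
    print("Block", number, block)
```

Every block contains every item exactly once, so across blocks no item is tied to a fixed position in the sequence.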

Time interval between treatments - may be needed for a treatment to take effect or for rest to relieve fatigue, as in developmental studies. Participants may drop out.

Choosing between independent groups & repeated measures

Between - Within designs

statistically ?