This unit covers Research methodology (I-E).
Students completing this unit will be able to discuss:
Observational studies, like
naturalistic observation and correlational studies, are nonexperimental.
For example, a correlational study is not an experimental design because we do not
manipulate an independent variable in a correlational study. We only
record data concerning traits or behaviors. We use statistical procedures
like the Pearson r to calculate the relationship between pairs of
variables.
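The Pearson r calculation mentioned above can be sketched in a few lines. This is a minimal illustration, not a replacement for a statistics package; the function name pearson_r and the sleep and depression scores are hypothetical and were invented for this example.

```python
import math

def pearson_r(x, y):
    """Return the Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of the deviations from each mean
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: nightly hours of sleep and depression scores for six people
sleep_hours = [8, 7, 6, 5, 4, 3]
depression = [10, 12, 15, 17, 21, 24]
r = pearson_r(sleep_hours, depression)
print(round(r, 3))
```

A strongly negative r for these made-up scores says only that less sleep goes with higher depression scores. It does not say which variable, if either, is the cause.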
We use correlational studies when we cannot or ethically should not
manipulate an independent variable. The major limitation of correlational
studies is that they cannot establish internal validity—that the
independent variable was responsible for changes in the dependent
variable.
Correlations cannot prove causation for three reasons. First,
correlations are not directional (depression could reduce sleep or
impaired sleep could produce depression). Second, the problem of
bidirectional causation means that each variable could influence the
other (depression could reduce sleep and impaired sleep could increase
depression). Finally, the third variable problem occurs when a hidden
variable affects both correlated variables (alcohol abuse could both
disrupt sleep and increase depression).
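The third variable problem can be illustrated with a small simulation. In this sketch, which rests on invented assumptions, a hidden variable (labeled alcohol) influences both sleep problems and depression, and neither of those two variables influences the other. The partial correlation formula used here is a standard statistical technique that this unit does not otherwise cover, and all values are simulated.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after statistically removing z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 5000
alcohol = [rng.gauss(0, 1) for _ in range(n)]            # hidden third variable
sleep_problems = [z + rng.gauss(0, 1) for z in alcohol]  # influenced by alcohol only
depression = [z + rng.gauss(0, 1) for z in alcohol]      # influenced by alcohol only

r_sleep_dep = pearson_r(sleep_problems, depression)
r_partial = partial_r(
    r_sleep_dep,
    pearson_r(sleep_problems, alcohol),
    pearson_r(depression, alcohol),
)
print(round(r_sleep_dep, 2), round(r_partial, 2))
```

Sleep problems and depression correlate substantially in the simulated data even though neither affects the other. Once the hidden variable is statistically removed, the correlation falls to about zero.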
In a randomized controlled study, researchers manipulate an independent
variable and randomly assign subjects to conditions, with or without
prior matching on subject variables. A randomized controlled study can
achieve internal validity because matching and random assignment increase
the likelihood that the groups were identical on relevant subject
variables at the start of the experiment.
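Matching followed by random assignment can be sketched as follows. The function name, the subject IDs, and the pretest blood pressures are hypothetical. The pairing rule (rank subjects on the pretest and pair adjacent subjects) is one common way to match, assumed here for illustration.

```python
import random

def matched_random_assignment(pretest_scores, seed=None):
    """Match subjects in pairs on a pretest score, then randomly assign
    one member of each pair to each condition.

    pretest_scores: dict mapping subject ID -> score on the matching variable.
    With an odd number of subjects, the highest scorer is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    groups = {"experimental": [], "control": []}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]  # two adjacent (most similar) subjects
        rng.shuffle(pair)                  # chance decides who gets which condition
        groups["experimental"].append(pair[0])
        groups["control"].append(pair[1])
    return groups

# Hypothetical pretest systolic blood pressures (mmHg)
pretest = {"S1": 152, "S2": 138, "S3": 145, "S4": 160,
           "S5": 141, "S6": 149, "S7": 157, "S8": 136}
groups = matched_random_assignment(pretest, seed=1)
print(groups)
```

Because chance alone decides which member of each pair receives the treatment, the experimenter cannot steer faster learners or more ulcer-prone subjects into one condition.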
In the Brady (1958) study, the experimenter pretested primates on a shock
avoidance task and then assigned the faster learners to the executive
condition and the slower learners to the control condition. The
executives could prevent a painful shock by hitting the control button at
least once every 20 seconds. Failure to do this caused both the executive
and yoked control subjects to receive a shock. The controls had no control over
the delivery of the painful shock.
Brady reported that executives had many ulcers while the controls did
not. Brady confounded this study by using biased assignment. We have no
reason to believe that the two groups were equivalent on subject
variables related to the development of ulcers. Alternative explanations
are that the executives were more sensitive to shock than the controls or
were more ulcer prone.
Weiss (1972) replicated the Brady
study using rats, and randomly assigned them to executive and control
conditions. In contrast to Brady, he found that lack of control over
shock was more stressful than responsibility for pushing a control button
to prevent shock (Myers & Hansen, 2006).
An independent variable (IV) is the variable (antecedent condition) an experimenter intentionally manipulates. An experiment requires at least two levels or values of an independent variable (like prayer and a waiting list).
A dependent variable (DV) is the outcome measure the experimenter uses to assess the change in behavior produced by the independent variable. The value of a dependent variable depends on the level of the independent variable. In a study of the efficacy of slow cortical potential (SCP) biofeedback in treating migraine, the number of emergency room visits per month could serve as one of the dependent variables.
Internal validity is the degree to which the experiment can demonstrate that changes in the dependent variable across treatment conditions are due to the independent variable. In experiments, researchers create values of the independent variable and measure their effect on the dependent variable. Internal validity is important because it establishes a causal relationship between the independent and dependent variables.
An extraneous variable is a variable not controlled by the experimenter that could affect the DV. Confounding occurs when an extraneous variable systematically changes across the experimental conditions. For example, in a study comparing the effects of two different relaxation methods on lowering blood pressure, if the meditation group also exercised more than the prayer group, this would confound the experiment.
When confounding occurs, a researcher has competing explanations for the experimental findings which cannot be ruled out. This prevents the experimenter from drawing cause-effect conclusions. The experimenter's dilemma is analogous to a detective with two or three equally plausible suspects.
The classic threats to internal validity include history, maturation, testing, instrumentation, statistical regression, selection, subject mortality, and selection interaction (Campbell, 1957; Campbell & Stanley, 1966).
History threat occurs when an event outside the experiment threatens internal validity by changing the DV. For example, in a weight loss study, subjects in group A were weighed before lunch while those in group B were weighed after lunch.
Maturation threat is produced when physical or psychological changes in the subject threaten internal validity by changing the DV. For example, boredom may increase subject errors on a test of continuous attention.
Testing threat occurs when prior exposure to a measurement procedure affects performance on this measure during the experiment. For example, if experimental subjects regularly use a blood pressure cuff during relaxation training, and control subjects only use one during pretesting, the experimental subjects could have lower pressures due to familiarity.
Instrumentation threat occurs when changes in the measurement instrument or measurement procedure threaten internal validity. For example, EEG sensor placement may be less consistent for one treatment group than another.
Statistical regression threat occurs when subjects are assigned to conditions on the basis of extreme scores, the measurement procedure is not completely reliable, and subjects are retested using the same procedure to show change on the DV. The scores of both extreme groups tend to regress to the mean on the second measurement: high scorers are lower and low scorers are higher on the second testing. This effect is caused by the random action of measurement error.
For example, all subjects are pretested on anxiety and only those who scored high or low are used as subjects, both groups receive relaxation training, and then the anxiety test is readministered. The scores for lows increase from pre to post while those for highs decrease.
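Statistical regression can be demonstrated with simulated scores. This sketch assumes that each observed anxiety score is a stable true score plus random measurement error, and that the treatment has no effect at all. The means, standard deviations, and 10% cutoffs are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is repeatable
n = 10000
true_anxiety = [rng.gauss(50, 10) for _ in range(n)]
# Each testing adds fresh, independent measurement error to the same true score
pretest = [t + rng.gauss(0, 10) for t in true_anxiety]
posttest = [t + rng.gauss(0, 10) for t in true_anxiety]  # no treatment effect

# Select the extreme 10% at each end of the pretest distribution
ranked = sorted(range(n), key=lambda i: pretest[i])
low = ranked[: n // 10]
high = ranked[-(n // 10):]

low_pre, low_post = mean(pretest[i] for i in low), mean(posttest[i] for i in low)
high_pre, high_post = mean(pretest[i] for i in high), mean(posttest[i] for i in high)
print(round(low_pre, 1), round(low_post, 1), round(high_pre, 1), round(high_post, 1))
```

Low scorers rise and high scorers fall on retest although nothing was done to them. A researcher who selected only high scorers could mistake that drop for a treatment effect.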
Selection threat occurs when individual differences are not balanced across treatment conditions by the assignment procedure. For example, despite random assignment, subjects in the experimental group may have higher systolic blood pressures than those in the control group.
Subject mortality threat occurs when subjects drop out of experimental conditions at different rates. For example, even if random assignment distributed subject characteristics equally across the conditions at the start of the experiment, dropout could render the conditions unequal on an extraneous variable like hypnotic susceptibility.
Selection interactions occur when a selection threat combines with at least one other threat (history, maturation, testing, instrumentation, statistical regression, or subject mortality).
When one of these classic threats confounds an experiment, the study lacks internal validity and we cannot establish a cause-effect relationship between the independent and dependent variables.
Researchers have several options for controlling extraneous variables when random assignment to experimental conditions is insufficient to prevent confounding. Three important types of extraneous variables are physical variables, social variables, and personality variables.
Physical variables are properties of the physical environment like time of day, room size, or noise. These variables may be controlled, in order of preference, through elimination, constancy of conditions, and balancing. Elimination removes the variable (soundproofing a room). Constancy of conditions keeps the variable about the same for all treatment conditions (running all subjects at night). Balancing distributes the variable's effects across all treatment conditions (running half of each condition's subjects in the morning and half in the evening).
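Balancing time of day across conditions can be sketched as a simple scheduling routine. The function name, condition labels, and subject IDs are hypothetical. The sketch assumes each condition has an even number of subjects so that the split is exact.

```python
import random
from collections import Counter

def balance_sessions(subjects_by_condition, slots=("morning", "evening"), seed=None):
    """Give every condition the same share of each time slot.

    Subjects are shuffled within their condition, then dealt to the slots in turn.
    """
    rng = random.Random(seed)
    schedule = {}
    for condition, subjects in subjects_by_condition.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)  # chance decides who is run in which slot
        for i, subject in enumerate(shuffled):
            schedule[subject] = (condition, slots[i % len(slots)])
    return schedule

conditions = {
    "biofeedback": ["S1", "S2", "S3", "S4", "S5", "S6"],
    "control": ["S7", "S8", "S9", "S10", "S11", "S12"],
}
schedule = balance_sessions(conditions, seed=3)
counts = Counter(schedule.values())
print(counts)
```

Each condition ends up with three morning and three evening subjects, so time of day cannot vary systematically between the conditions.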
Social variables are aspects of the relationships between researchers and participants. Two major social variables are demand characteristics and experimenter bias. Demand characteristics are situational cues (like students packing up their belongings at the end of class) that signal expected behavior (dismissal of class). An experimenter can control demand characteristics by performing a single-blind experiment in which subjects are not told their treatment condition. For example, in a single-blind drug study, the experimental and control groups receive capsules that look and taste identical. Alternatively, a cover story, a false plausible explanation of the experimental procedure, can conceal the research hypothesis.
Experimenter bias occurs when
the researcher knows the subjects' treatment condition and acts in a manner that confirms the experimental hypothesis (encourages experimental subjects more than control subjects). A double-blind experiment, which conceals the subjects' treatment condition from both subjects and the experimenter, can control both demand characteristics and experimenter bias. For example, in a double-blind drug study, the experimental and control groups receive capsules that look and taste identical, and the experimenter does not know whether a subject has received a drug or placebo.
Personality variables are personal aspects of a subject or experimenter like anxiety or warmth. Researchers can control a subject's personality variables through random assignment to treatment conditions, matching subjects on this variable and then randomly assigning them to different conditions, or randomly assigning subjects to all treatment conditions using a within-subjects design. They can control the impact of experimenter personality by assigning the same person to run all subjects (constancy of conditions) or assigning several researchers to run an equal number of subjects in each condition (balancing).
The issue of specific and nonspecific treatment effects is crucial to our
understanding of how biofeedback works and its credibility. A
specific treatment effect is a
measurable symptom change associated with a measurable
psychophysiological change produced by biofeedback. For example, there
would be a specific treatment effect on airway resistance in a patient
diagnosed with asthma if decreased resistance were correlated with
increased heart rate variability (HRV).
A nonspecific treatment effect is a
measurable symptom change that is not contingent on a specific
psychophysiological change. For example, training to increase or decrease
frontal SEMG activity might each be associated with reduced anxiety. When
psychophysiological changes in both directions produce the same reduction
in symptom severity, the mechanism responsible for clinical improvement
may be associative learning. For example, demand
characteristics may change psychophysiology through mechanisms like
classical conditioning. As research on the
placebo response or an associatively-conditioned healing response has shown, nonspecific treatment effects are
not confined to subjective self-reported symptoms. The changes produced
by demand characteristics can involve very specific and measurable
changes like reductions in herpes virus counts.
Since effective biofeedback and medical treatments produce both specific
and nonspecific treatment effects, it is important to control demand
characteristics in clinical outcome studies to ensure
internal validity (behavior change is
only due to the independent variable) and understand the mechanisms of
action.
We described earlier how researchers can control demand characteristics and experimenter effects by using a double-blind experiment. An extension of this approach is the double-blind crossover experiment, where
subjects start with one treatment and then receive an alternative
treatment.
Wickramasekera (1999) argues that controlled clinical studies and
double-blind research have shown that “neither the magnitude nor the
direction of physiological change in biofeedback training was strongly
related to the magnitude of the reduction of clinical symptoms.” (p. 92)
He estimates that physiological change accounts for “a small portion
(e.g., 20%) of the variance in clinical outcome.” (p. 93)
He proposes that nonspecific cognitive and emotional factors may account
for most of the variance in symptom reduction when treating problems
like headache and low back pain.
He advances three mechanisms for symptom reduction: First, “biofeedback
temporarily increases trait hypnotic ability and probably the patients’
‘openness’ to altered perceptions of the factors triggering and
maintaining their clinical symptoms.” Second, “biomedical instruments
used in biofeedback elicit a respondently conditioned mind-body
therapeutic placebo response based on the memory of prior healing.”
Third, “biofeedback recruits one of the primary mechanism of risk (high-
or low-trait hypnotic ability) for stress disorders and reverses its
direction of action.”
From Wickramasekera's perspective, biofeedback should produce the greatest symptom
reduction in patients with low to moderate hypnotic ability.
Self-hypnosis and other deep relaxation procedures should produce the
greatest symptom reduction in patients with high hypnotic ability.
A control group receives a zero-level of the independent variable. While
standard or active placebos are easy to administer in pharmaceutical
research, creation of an appropriate control group is very difficult in
biofeedback research.
Biofeedback control groups have included waiting lists, sitting quietly,
pseudo-relaxation procedures, detectable non-contingent feedback
(subjects could discover that the feedback was false), credible
non-contingent feedback (subjects could not detect that the feedback was
false), and reverse contingent feedback (real feedback for physiological
change in the opposite of the clinically-desired direction).
The problem with waiting lists, sitting quietly, and pseudo-relaxation
procedures is they may possess very different demand characteristics than
biofeedback training. They may be less credible and motivating to
patients than biofeedback training.
Detectable non-contingent feedback is feedback the patient learns is
false as soon as it fails to track a voluntary behavior like muscle bracing or
breath holding. Non-contingent feedback can also be frustrating from the
outset if the subject perceives that he or she cannot learn to control
the signal.
Reverse contingent feedback provides reinforcement for the opposite of
clinically-desired changes (skin conductance increase instead of
decrease). While a potentially powerful strategy, it might not be
ethically possible in clinical studies where these changes could harm the
patient (increasing theta amplitude in children diagnosed with ADHD).
In case studies, a researcher
compiles a descriptive study of a subject's experiences, observable
behaviors, and archival records kept by an outside observer.
Deviant case analysis compares
deviant and normal cases to identify differences, which may explain the
causes of a disorder. Example: comparison of identical
twins to identify factors that protected one twin from developing
Asperger's syndrome.
Case studies have several advantages:
Case studies also have potential problems:
In biofeedback research, a case study
is a record of patient experiences and behaviors compiled by a therapist and is nonexperimental. A therapist documents change
in patient symptoms across the phases of treatment, but cannot prove that
the change was due to biofeedback training.
A descriptive case study might use an AB design,
where A is the baseline phase and B is the treatment phase.
N represents the number of subjects in an experiment. Classic experiments like Peniston and Kulkosky's (1990) alcoholism study utilized large N designs that compared the performance of groups of subjects (those who received alpha-theta neurofeedback and those who received conventional medical treatment).
In comparison, a small N design examines one or two subjects. Researchers who advocate small N studies argue that large N studies ignore individual subject responses to the independent variable and instead report aggregate results or trends. When subjects vary greatly in their response to the independent variable, this can cancel out treatment effects so that there appears to be no difference between the groups.
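The cancellation argument can be made concrete with a few invented numbers. In this sketch, the change scores are hypothetical: half of the subjects improve by 10 points and half worsen by 10 points.

```python
from statistics import mean

# Hypothetical change scores (post minus pre) for eight subjects
changes = [10, 10, 10, 10, -10, -10, -10, -10]

group_mean_change = mean(changes)             # what an aggregate summary reports
improved = [c for c in changes if c > 0]      # subjects who got better
worsened = [c for c in changes if c < 0]      # subjects who got worse

print(group_mean_change, len(improved), len(worsened))
```

The group mean change is zero even though every subject changed by 10 points. Inspecting individual records, as small N researchers advocate, reveals the two opposite response patterns.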
A clinical psychologist could use a small N design to test a treatment when there are insufficient subjects to conduct a large N study and when he or she wants to avoid the ethical problem of an untreated control group. Animal researchers prefer small N designs to minimize the acquisition and maintenance cost, training time, and possible sacrifice of their animal subjects.
Small N designs have been most extensively used in operant conditioning research. B. F. Skinner examined the continuous behavior of individual subjects in preference to analyzing discrete measurements from separate groups of subjects.
We will review four types of small N designs: reversal designs, multiple baseline designs, changing criterion designs, and discrete trials designs.
Small N researchers often use variations of the ABA reversal design where the same subject participates in
all treatment conditions. A subject is observed in a control condition (A), treatment
condition (B), and then returns to the control condition. The requirement for this design
is that the treatment be reversible.
In both large and small N designs, baselines are control conditions that allow us to measure behavior without the influence of the independent variable. A return to baseline, which may be repeated several times, is needed to rule out the effects of extraneous variables like history and maturation threats.
An ABABA design is illustrated below.
Baseline 1 measures the number of items of clothing the husband leaves in
the living room. Doing dishes contingent 1 penalizes the husband with
dishwashing when he leaves more clothes in the living room than his wife.
Baseline 2 measures the number of items of clothing the husband leaves in
the living room when there is no penalty. Doing dishes contingent 2
penalizes the husband with dishwashing when he leaves more clothes in the
living room than his wife. Post-checks are measurements of the dependent
variable after completion of training to assess the maintenance of
behavior change, which was picking up clothing.
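One simple way to summarize reversal data is to compare phase means. The daily clothing counts below are hypothetical, and the decision rule (every treatment phase mean falls below every baseline phase mean) is an assumption chosen for illustration rather than a standard statistical test.

```python
from statistics import mean

# Hypothetical daily counts of clothing items left in the living room
phases = [
    ("A1 baseline", [7, 8, 6, 7]),
    ("B1 dishes contingent", [3, 2, 2, 1]),
    ("A2 baseline", [6, 7, 6, 8]),
    ("B2 dishes contingent", [2, 1, 1, 0]),
]
phase_means = {label: mean(counts) for label, counts in phases}

def tracks_treatment(phase_means):
    """True if every treatment (B) phase mean is lower than every baseline (A) phase mean."""
    baselines = [m for label, m in phase_means.items() if label.startswith("A")]
    treatments = [m for label, m in phase_means.items() if label.startswith("B")]
    return max(treatments) < min(baselines)

print(phase_means, tracks_treatment(phase_means))
```

When the behavior improves each time the contingency is applied and recovers each time it is withdrawn, an extraneous variable like history or maturation becomes a less plausible explanation.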
Many clinical reversal studies do not return to a final baseline because it would be unethical to risk patient relapse after treatment appeared to improve behavior. When a reversal study does not end with a baseline condition, we can't rule out the possibility that the patient's clinical improvement was caused by an extraneous variable.
A multiple baseline design is a type of small N design. In this approach, a series of baselines and treatments are compared within the same subject, and once treatments are administered, they are not withdrawn. This design could be used to evaluate the effect of a treatment administered to different individuals after baselines of different lengths. A researcher could also evaluate the effects of a treatment on two or more behaviors or on the same behavior in different settings.
A hypothetical behavior modification program is designed to reduce a single child's cartoon watching both before school and after school. A parent rewards activities, performed in place of cartoon watching, with points that can be saved and traded for toys at the end of the week.
The first intervention is before school. Following a baseline period, the treatment is started and remains in place. The after school baseline starts when the before school baseline does, but lasts twice as long.
The after school intervention starts later to ensure that an extraneous variable (history threat) did not affect cartoon watching during either time period.
The graph below shows that cartoon watching only declined after each treatment was introduced. If the results were due to a history threat, cartoon watching before and after school would have started to decline at the same time. It didn't.
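The logic of the staggered baselines can be checked with a short calculation. The daily minutes of cartoon watching are hypothetical. Following the description above, the before school baseline lasts 5 days and the after school baseline lasts 10 days.

```python
from statistics import mean

# Hypothetical minutes of cartoon watching per day over 15 days
before_school = [60, 58, 62, 59, 61, 40, 30, 25, 20, 18, 15, 15, 12, 10, 10]
after_school = [90, 88, 92, 91, 89, 90, 87, 91, 89, 90, 60, 45, 35, 30, 25]

# Index of the first treatment day in each setting (after school starts later)
start = {"before_school": 5, "after_school": 10}

before_baseline = mean(before_school[:start["before_school"]])
before_treated = mean(before_school[start["before_school"]:])

# After school viewing while only the before school treatment was active
after_early_baseline = mean(after_school[:start["before_school"]])
after_late_baseline = mean(after_school[start["before_school"]:start["after_school"]])
after_treated = mean(after_school[start["after_school"]:])

print(before_baseline, before_treated)
print(after_early_baseline, after_late_baseline, after_treated)
```

In these invented data, after school viewing stays near its baseline level while the before school treatment is already in effect, and it drops only once its own treatment begins. A history threat would be expected to shift both series at the same time.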
A multiple baseline design overcomes the ethical problem of withdrawing an effective treatment by never withdrawing a treatment.
A changing criterion design is a third type of small N design. The criteria for reinforcement are incrementally increased as the participant succeeds. For example, initially, a subject might receive a reward for 30 minutes of daily exercise, later, for 45 minutes, and finally, for 60 minutes. Reinforcement for successive approximations of the target behavior is central to athletic coaching, cognitive-behavior modification, and biofeedback/neurofeedback.
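The stepwise criteria in the exercise example can be sketched as a small routine. The 30, 45, and 60 minute criteria come from the example above. The rule for advancing (three consecutive successful days) and the daily exercise minutes are hypothetical assumptions added for illustration.

```python
def run_changing_criterion(daily_minutes, criteria=(30, 45, 60), days_to_advance=3):
    """Return a (criterion, rewarded) pair for each day.

    The criterion steps up to the next level after `days_to_advance`
    consecutive rewarded days (an assumed rule for this sketch).
    """
    stage = 0
    streak = 0
    log = []
    for minutes in daily_minutes:
        criterion = criteria[stage]
        rewarded = minutes >= criterion
        log.append((criterion, rewarded))
        streak = streak + 1 if rewarded else 0
        if streak >= days_to_advance and stage < len(criteria) - 1:
            stage += 1   # raise the bar for reinforcement
            streak = 0
    return log

# Hypothetical minutes of exercise over nine days
minutes = [30, 35, 32, 40, 45, 50, 47, 55, 60]
log = run_changing_criterion(minutes)
for day, (criterion, rewarded) in enumerate(log, start=1):
    print(day, criterion, rewarded)
```

If behavior repeatedly rises to meet each new criterion, the step pattern itself becomes evidence that the reinforcement contingency controls the behavior.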
A discrete trials design is a small N procedure without baselines that is utilized in sensory research. To replace baselines, the effects of different levels of the independent variable are averaged across hundreds to thousands of trials for each subject. The large number of data points produced by these trials provides a very reliable measurement of the effect of the independent variable. The similarity of human sensory systems allows researchers to generalize from a small number of subjects.
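Why averaging across many trials yields a reliable measurement can be shown with the standard error of a mean. This sketch assumes that trials are independent and equally variable, which real sensory data only approximate. The 50 ms trial-to-trial standard deviation is hypothetical.

```python
import math

def standard_error(trial_sd, n_trials):
    """Standard error of a mean computed from n_trials independent trials."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    return trial_sd / math.sqrt(n_trials)

# Hypothetical trial-to-trial standard deviation of 50 ms in reaction time
for n in (1, 100, 2500):
    print(n, standard_error(50.0, n))
```

Under these assumptions, the uncertainty in a subject's average shrinks with the square root of the number of trials, so thousands of trials per subject can pin down an effect that a single trial would leave hidden in noise.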
For example, Ivry and Lebby (1993) tested whether high or low frequency tones were processed more efficiently by the left or right hemispheres. They tested four subjects using a discrete trials design.
Subjects were asked to judge whether a set of three different tones was, on average, higher or lower than a 1900 Hz or 200 Hz target tone. Each subject participated in four different sessions, two testing high-frequency tones and two testing low-frequency tones, for a total of 2688 trials. The tones were presented to either the right or left ear. Responses were measured for accuracy and speed.
The authors found that discrimination of high-frequency tones was faster and more accurate when they were presented to the right ear (controlled by the left hemisphere). Discrimination of low-frequency tones was faster and more accurate when they were presented to the left ear (controlled by the right hemisphere).
A small N study is appropriate when studying a clinical subject (such as a self-injurious child) or when very few subjects are available. A large N design would be desirable when we have sufficient subjects and want to increase generalizability. The generalizability of a large N study depends on how we select our sample, since a seriously biased sample will not represent the population. In contrast, the generalizability of a small N study depends on repeated successful replications with different subjects (Myers & Hansen, 2006).
A meta-analysis is not an experiment, but a statistical analysis of several comparable studies. A meta-analysis utilizes statistical procedures to combine and quantify data from many experiments that use the same operational definitions for their independent and dependent variables to calculate a typical effect size.
For example, Yucha, Clark, et al. (2001) reported a meta-analysis of 23 studies that compared biofeedback training with active treatments like meditation and inactive treatments like sham biofeedback controls and blood pressure measurement.
The authors found that biofeedback produced greater systolic (6.7 mmHg) and diastolic (3.8 mmHg) blood pressure reductions than the inactive treatments.
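One common way to calculate a typical effect size is an inverse-variance weighted mean, in which more precise studies count more. This is a sketch of that fixed-effect approach under stated assumptions. It is not presented as the procedure Yucha and colleagues used, and the three effect sizes and variances are hypothetical.

```python
def fixed_effect_mean(effects, variances):
    """Inverse-variance weighted mean effect size and its variance."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / v for v in variances]  # precise studies get larger weights
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_variance = 1.0 / sum(weights)
    return pooled, pooled_variance

# Hypothetical standardized effect sizes and sampling variances from three studies
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.02, 0.04]
pooled, pooled_variance = fixed_effect_mean(effects, variances)
print(round(pooled, 3), round(pooled_variance, 3))
```

Pooling in this way only makes sense when the studies share operational definitions for their independent and dependent variables, as noted above.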
The following guidelines for evaluating the clinical efficacy of biofeedback and neurofeedback interventions were recommended by a joint Task Force and adopted by the Boards of Directors of the Association for Applied Psychophysiology and Biofeedback (AAPB) and the International Society for Neuronal Regulation (ISNR) (LaVaque et al., 2002).
Level 1: Not empirically supported
Supported only by anecdotal reports and/or case studies in non-peer reviewed venues.
Level 2: Possibly efficacious
At least one study of sufficient statistical power with well identified outcome measures, but lacking randomized assignment to a control condition internal to the study.
Level 3: Probably efficacious
Multiple observational studies, clinical studies, wait list controlled studies, and within subject and intrasubject replication studies that demonstrate efficacy.
Level 4: Efficacious
a. In a comparison with a no-treatment control group, alternative treatment
group, or sham (placebo) control utilizing randomized assignment, the
investigational treatment is shown to be statistically significantly superior
to the control condition or the investigational treatment is equivalent to a
treatment of established efficacy in a study with sufficient power to
detect moderate differences, and
b. The studies have been conducted with a population treated for a specific
problem, for whom inclusion criteria are delineated in a reliable,
operationally defined manner, and
c. The study used valid and clearly specified outcome measures related to
the problem being treated, and
d. The data are subjected to appropriate data analysis, and
e. The diagnostic and treatment variables and procedures are clearly
defined in a manner that permits replication of the study by independent
researchers, and
f. The superiority or equivalence of the investigational treatment has been
shown in at least two independent research settings.
Level 5: Efficacious and specific
The investigational treatment has been shown to be statistically superior to credible sham therapy, pill, or alternative bona fide treatment in at least two independent research settings.
AB design: in a descriptive case study, A is the baseline phase and B is the treatment phase.
ABA reversal design: in a small N design or large N design, subjects are observed in a control condition (A), treatment condition (B), and then return to the control condition (A).
balancing: method of controlling a physical variable by distributing its effects across all treatment conditions (running half of each condition's subjects in the morning and half in the evening).
baseline: control condition in which participants receive a zero-level of the independent variable (sitting quietly without receiving feedback about physiological performance).
bidirectional causation: reason that correlation does not imply causation; each of two variables could influence the other (depression could reduce sleep and sleep loss could increase depression).
case study: nonexperimental procedure in which a researcher compiles a descriptive study of a subject's experiences, observable behaviors, and archival records kept by an outside observer.
changing criterion design: small N design in which the criteria for reinforcement are incrementally increased as the participant succeeds (shaping).
confounding: loss of internal validity when an extraneous variable systematically changes across the experimental conditions; for example, in a study comparing the effects of two different relaxation methods on lowering blood pressure, if the meditation group also exercised more than the prayer group, this would confound the experiment.
constancy of conditions: method of controlling a physical variable by keeping it constant across all treatment conditions (running all subjects in the evening).
contingent feedback: feedback of a participant's actual physiological performance.
control group: in an experiment, a group that receives a zero-level of the independent variable (placebo or wait-list).
correlational study: nonexperimental procedure in which the researcher does not manipulate an independent variable and only records data concerning traits or behaviors (investigation of the relationship between body mass index and severity of low back pain).
cover story: a false plausible explanation of the experimental procedure used to conceal the research hypothesis to control demand characteristics.
demand characteristics: situational cues (like students packing up their belongings at the end of class) that signal expected behavior (dismissal of class).
dependent variable (DV): the outcome measure the experimenter uses to assess the change in behavior produced by the independent variable (airway resistance could be used to measure the effectiveness of heart rate variability training for asthma).
detectable non-contingent feedback: control condition in which participants could discover that the feedback was false (SEMG "feedback" that does not change when a participant suddenly contracts a monitored muscle).
deviant case analysis: extension of the case study method that compares deviant (Asperger's syndrome) and normal cases to identify differences, which may explain the causes of a disorder.
discrete trials design: small N procedure without baselines that is utilized in sensory research in which the effects of different levels of the independent variable are averaged across hundreds to thousands of trials for each subject.
double-blind crossover experiment: extension of a double-blind design where subjects start with one treatment (placebo) and then receive an alternative treatment (theta enhancement training).
double-blind experiment: the experimenter and participant do not know the condition to which the participant has been assigned to control both demand characteristics and experimenter bias.
elimination: method of controlling a physical variable by removing it (soundproofing a room).
experimenter bias: confounding that occurs when the researcher knows the subjects' treatment condition and acts in a manner that confirms the experimental hypothesis (encourages experimental subjects more than control subjects).
extraneous variable: variable not controlled by the experimenter (room temperature).
history threat: classic threat to internal validity that occurs when an event outside the experiment threatens internal validity by changing the DV (in a weight loss study, subjects in group A were weighed before lunch while those in group B were weighed after lunch).
independent variable (IV): the variable (antecedent condition) an experimenter intentionally manipulates (training to reduce blood pressure: heart rate variability or SEMG biofeedback).
instrumentation threat: classic threat to internal validity in which changes in the measurement instrument or measurement procedure threaten internal validity (EEG sensor placement may be less consistent for one treatment group than another).
internal validity: the degree to which the experiment can demonstrate that changes in the dependent variable across treatment conditions are due to the independent variable.
large N designs: studies that examine the performance of groups of participants.
level 1: not empirically supported.
level 2: possibly efficacious.
level 3: probably efficacious.
level 4: efficacious.
level 5: efficacious and specific.
maturation threat: classic threat to internal validity that occurs when physical or psychological changes in participants threaten internal validity by changing the dependent variable (boredom may increase subject errors on a test of continuous attention).
meta-analysis: a statistical analysis that combines and quantifies data from many experiments that use the same operational definitions for their independent and dependent variables to calculate a typical effect size.
multiple baseline design: type of small N design in which a series of baselines and treatments are compared within the same subject, and once treatments are administered, they are not withdrawn.
N: number of participants.
nonspecific treatment effect: a measurable symptom change that is not contingent on a specific psychophysiological change (training to increase or decrease frontal SEMG activity might each be associated with reduced anxiety).
observational studies: nonexperimental procedures like naturalistic observation and correlational studies.
Pearson r: statistical procedure that calculates the strength of the relationship (from -1.0 to +1.0) between pairs of variables measured using interval or ratio scales.
personality variables: personal aspects of a subject or experimenter like anxiety or warmth.
physical variables: properties of the physical environment like time of day, room size, or noise.
placebo response: associatively-conditioned healing response.
post-checks: measurements of the dependent variable (discarded clothing) after completion of training to assess the maintenance of behavior change.
randomized controlled study: researchers manipulate an independent variable and randomly assign participants to conditions, with or without prior matching on subject variables.
reverse contingent feedback: reinforcement for the opposite of
clinically-desired changes (skin conductance increase instead of
decrease for clients with anxiety).
selection interactions: the combination of a selection threat with at least one other threat (history, maturation, testing, instrumentation, statistical regression, or subject mortality).
selection threat: classic threat to internal validity in which individual differences are not balanced across treatment conditions by the assignment procedure (participants in the experimental group may have higher systolic blood pressures than those in the control group).
single-blind experiment: subjects are not told their treatment condition.
small N design: study that examines one or two subjects instead of groups of participants.
social variables: aspects of the relationships between researchers and participants like demand characteristics and experimenter bias.
specific treatment effect: a measurable symptom change associated with a measurable psychophysiological change produced by biofeedback; for example, there would be a specific treatment effect on airway resistance in a patient diagnosed with asthma if decreased resistance were correlated with increased heart rate variability (HRV).
statistical regression threat: classic threat to internal validity that occurs when subjects are assigned to conditions on the basis of extreme scores, the measurement procedure is not completely reliable, and subjects are retested using the same procedure to show change on the DV; the scores of both extreme groups tend to regress to the mean on the second measurement so that high scorers are lower and low scorers are higher on the second testing.
subject mortality threat: classic threat to internal validity that occurs when subjects drop out of experimental conditions at different rates; for example, even if random assignment distributed subject characteristics equally across the conditions at the start of the experiment, dropout could render the conditions unequal on an extraneous variable like hypnotic susceptibility.
testing threat: classic threat to internal validity that occurs when prior exposure to a measurement procedure affects performance on this measure during the experiment; for example, if experimental subjects regularly use a blood pressure cuff during relaxation training, and control subjects only use one during pretesting, the experimental subjects could have lower pressures due to familiarity.
third variable problem: reason that correlation does not mean causation; a hidden variable may affect both correlated variables (alcohol abuse could both disrupt sleep and increase depression).
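The Pearson r entry above can be illustrated with a short calculation. The following is a minimal sketch in Python, not a prescribed procedure: the function name pearson_r and the sleep and depression scores are hypothetical, invented only to mirror the sleep and depression example used in this unit. The function divides the sum of the cross-products of the deviation scores by the square root of the product of the two sums of squared deviations.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists.

    Hypothetical helper written for illustration; it assumes both
    variables are measured on interval or ratio scales.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least 2 pairs")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviation of each score from its own mean.
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    # Sum of cross-products and the two sums of squared deviations.
    sum_cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")

    return sum_cross / sqrt(ss_x * ss_y)


# Hypothetical scores for five participants (invented for this example).
sleep_hours = [8, 7, 6, 5, 4]
depression_scores = [5, 9, 12, 16, 20]

r = pearson_r(sleep_hours, depression_scores)
print(f"r = {r:.2f}")
```

With these invented scores, r is strongly negative: participants who slept less reported more depression. The coefficient describes only the strength and direction of the relationship. It cannot show whether reduced sleep produced depression, depression reduced sleep, or a third variable affected both.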
Now that you have completed this module, describe how you use or could
use the case study approach in your clinical practice to assess treatment
efficacy.
Brady, J. V. (1958). Ulcers in "executive" monkeys. Scientific American, 199(4), 95-100.
Campbell, D. T. (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54, 297-312.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Hersen, M., & Barlow, D. H. (1976). Single case experimental designs. New
York: Pergamon Press.
Ivry, R. B., & Lebby, P. C. (1993). Hemispheric differences in auditory perception are similar to those found in visual perception. Psychological Science, 4(1), 41-45.
LaVaque, T. J., Hammond, D. C., Trudeau, D., Monastra, V., Perry, J., Lehrer, P., Matheson, D., & Sherman, R. (2002). Template for developing guidelines for the evaluation of the clinical efficacy of psychophysiological interventions. Applied Psychophysiology and Biofeedback, 27(4), 273-281.
Myers, A., & Hansen, C. (2006). Experimental psychology. Pacific Grove,
CA: Wadsworth.
Peniston, E. G., & Kulkosky, P. J. (1990). Alcoholic personality and alpha-theta brainwave training. Medical Psychotherapy, 3, 37-55.
Weiss, J. M. (1972). Psychological factors in stress and disease. Scientific American, 226(6), 104-113.
Wickramasekera, I. A. (1988). Clinical behavioral medicine: Some concepts
and procedures. New York: Plenum Press.
Wickramasekera, I. A. (1999). How does biofeedback reduce clinical
symptoms and do memories and beliefs have biological consequences? Toward
a model of mind-body healing. Applied Psychophysiology and Biofeedback,
24(2), 91-105.
Yaremko, R. M., Harari, H., Harrison, R. C., & Lynn, E. (1982).
Reference handbook of research and statistical methods in psychology: For
students and professionals. New York: Harper & Row, Publishers, Inc.
Yucha, C. B., Clark, L., Smith, M., Uris, P., Lafleur, B., & Duval, S. (2001). The effect of biofeedback in hypertension. Applied Nursing Research, 14(1), 29-35.