New Evidence and Design Considerations for Repeated Measure Experiments in Survey Research

Diana Jordan
Duke University

Trent Ollerenshaw
University of Houston

Andrew Trexler
UW-Madison

APW & MEAD
2025 November 10

Why Should You Care?


  • If you…
    • Conduct experiments with surveys
    • Review or consume scholarship with survey experiments
    • Want accurate estimates of average treatment effects (ATEs)
    • Are concerned about the replication crisis
  • We show…
    • Traditional experimental design is often suboptimal
    • Repeated measure designs dramatically improve power
    • Suitable for many experimental settings, but…
    • Not without costs: slight attenuation of ATEs

A Motivating Example

  • Do you think that federal spending on foreign aid should be increased or decreased?


\(\bar Y_0 = 0.310\)

  • Spending on foreign aid makes up about 1% of the federal budget. Do you think that federal spending on foreign aid should be increased or decreased?

\(\bar Y_1 = 0.398\)
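
Reading \(\bar Y_0\) and \(\bar Y_1\) as the mean responses under the two question wordings (an assumption; the slide does not state how the outcome is coded), the implied difference in means is:

\[\bar Y_1 - \bar Y_0 = 0.398 - 0.310 = 0.088\]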

Traditional Post-only Design


\(Y_i = \beta_0 + \beta_1T_i + \epsilon_i\)

  • Unbiased under weak assumptions (randomization, SUTVA, no differential attrition)
  • Imprecise
  • Requires large samples for adequate power
  • Can be improved with covariate adjustment if…
    • \(X \perp T\)
    • \(\mathrm{corr}(X,Y) \neq 0\)

Repeated Measure Designs

Clifford, Sheagley, & Piston (2021)
  • Traditional post-only design
    • \(Y_{i_{post}} = \beta_0 + \beta_1T_i + \epsilon_i\)
  • Repeated measures designs
    • Pre-post: \(Y_{i_{post}} = \beta_0 + \beta_1T_i + \beta_2Y_{i_{pre}} + \epsilon_i\)
    • Quasi: \(Y_{i_{post}} = \beta_0 + \beta_1T_i + \beta_2Y_{i_{quasi}} + \epsilon_i\)
    • True within-subject: \(Y_{ij} = \alpha + \beta T_{ij} + \epsilon_{ij}\)
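
A minimal sketch of the post-only and pre-post specifications above, run on simulated data. Everything here is hypothetical: the data-generating process, the effect size, the noise levels, and the helper `ols` are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and classical (homoskedastic) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * XtX_inv))
    return beta, se

rng = np.random.default_rng(0)
n = 2000
true_ate = 0.1                              # hypothetical effect size
attitude = rng.normal(size=n)               # hypothetical latent attitude
y_pre = attitude + rng.normal(scale=0.5, size=n)
treat = rng.integers(0, 2, size=n)          # randomized treatment indicator T_i
y_post = attitude + true_ate * treat + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
# Post-only: Y_post = b0 + b1*T + e
b_post, se_post = ols(np.column_stack([ones, treat]), y_post)
# Pre-post: Y_post = b0 + b1*T + b2*Y_pre + e
b_pre, se_pre = ols(np.column_stack([ones, treat, y_pre]), y_post)

print(f"post-only: b1 = {b_post[1]:.3f} (SE {se_post[1]:.3f})")
print(f"pre-post:  b1 = {b_pre[1]:.3f} (SE {se_pre[1]:.3f})")
```

Both specifications target the same \(\beta_1\); in this simulated setup the pre-post standard error is smaller because \(Y_{i_{pre}}\) absorbs much of the outcome variance.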

Conventional Concerns


  • Consistency pressures
  • Demand incentives
  • Priming effects
  • Could produce treatment attenuation or exaggeration

Clifford, Sheagley, and Piston (2021)


  • 6 studies with randomized designs (post-only, pre-post, or quasi-post)
  • Found no evidence of design bias, but large precision gains
    • A post-only experiment with \(N = 1000\) \(\rightarrow\) similar precision with \(N \approx 200\) to \(600\) under repeated measures (RM)
  • Heavily cited by researchers adopting pre-post designs

“Given the clear gains in precision and weak evidence that repeated measures designs change treatment effects, we recommend that researchers use pre-post and within-subjects designs whenever possible.” (CSP, 1062)
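
One way to read the sample-size figures above is the textbook large-sample approximation for covariate adjustment, sketched here under stated assumptions. It is not CSP's own calculation, and \(\rho\) (the correlation between the pre-treatment and post-treatment outcome) is a symbol introduced for illustration:

\[\mathrm{Var}(\hat\beta_1^{\,pre\text{-}post}) \approx (1 - \rho^2)\,\mathrm{Var}(\hat\beta_1^{\,post\text{-}only}) \quad\Rightarrow\quad N_{RM} \approx (1 - \rho^2)\,N_{post}\]

With \(N_{post} = 1000\), \(N_{RM} \approx 600\) corresponds to \(\rho^2 = 0.4\) (\(\rho \approx 0.63\)) and \(N_{RM} \approx 200\) to \(\rho^2 = 0.8\) (\(\rho \approx 0.89\)).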

Need for Further Study


  • CSP provide compelling evidence
    • But enough to shift experimental design doctrine?
  • CSP use nonprobability samples
    • Less professionalized respondents may behave differently
  • CSP use just 1 within-subject study
    • Yet a large share of citations to CSP come from within-subject studies
    • Student sample (\(n = 900\))
  • Other design considerations remain unanswered

Aims of Our Study


  • Large-scale replication of the central claim (no design effect)
  • Three primary extensions:
    • Analyze both probability & non-probability samples
    • Field more within-subject experiments
    • Assess differences in design effects (DEs) by proximity of repeated measures

Experimental Design

Design


  • We fielded 6 experiments on 3 separate samples (\(N_j = 18\) total studies)
  • Each respondent completed all 6 experiments
  • 2 randomly selected experiments used a post-only design
  • The other 4 experiments used a repeated measure design
  • Question order was randomized to vary the distance between measures

Experiments


  • Pre-post designs
    1. Info treatment on foreign aid (Gilens 2001)
    2. Party cues treatment on drug imports (CSP 2021)
    3. Framing treatment on GMOs (CSP 2021)
  • Within-subject designs
    1. Welfare/assistance to the poor (Smith 1987)
    2. Affirmative action for minorities/women (Wilson 2009)
    3. Opioid clinic nearby/distant (De Benedictis-Kessner 2019)

Samples


  • NORC AmeriSpeak Panel (probability sample, \(n = 4033\))
  • Prolific (nonprobability, \(n = 4261\))
  • Lucid (nonprobability, \(n = 4869\))
  • Combined \(N_{ij}=78,978\) observations
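
A quick check of the combined count, assuming one observation per respondent per experiment (an assumption that matches the reported total):

```python
# Sample sizes as reported on the slide
samples = {"NORC AmeriSpeak": 4033, "Prolific": 4261, "Lucid": 4869}
experiments_per_respondent = 6

respondents = sum(samples.values())
observations = respondents * experiments_per_respondent

print(respondents)    # 13163
print(observations)   # 78978
```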

Randomization


  1. Randomly assign treatment/control for each experiment.
  2. Randomly assign 2 experiments to post-only designs, 4 to repeated measures.
  3. Randomize order of pre-treatment blocks.
  4. Randomly assign each participant to 1 of 2 order randomizations.
  5. Randomize order of post-treatment blocks.
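
The five steps above can be sketched for a single respondent as follows. This is an illustrative sketch, not the authors' survey code: the experiment labels and the function `randomize_respondent` are hypothetical, it assumes only repeated measure experiments have a pre-treatment block, and because the slide does not say what the two order randomizations in step 4 are, the arm is only recorded as a label.

```python
import random

# Hypothetical short labels for the six experiments
EXPERIMENTS = ["foreign_aid", "drug_imports", "gmos",
               "welfare", "affirmative_action", "opioid_clinic"]

def randomize_respondent(rng):
    """Draw one respondent's assignments following steps 1-5."""
    # Step 1: treatment (1) or control (0) for each experiment
    treatment = {e: rng.randint(0, 1) for e in EXPERIMENTS}

    # Step 2: 2 experiments are post-only, the other 4 use repeated measures
    post_only = set(rng.sample(EXPERIMENTS, 2))
    design = {e: "post_only" if e in post_only else "repeated_measure"
              for e in EXPERIMENTS}

    # Step 3: order of pre-treatment blocks
    # (assumption: only repeated measure experiments have one)
    pre_order = [e for e in EXPERIMENTS if design[e] == "repeated_measure"]
    rng.shuffle(pre_order)

    # Step 4: one of 2 order randomizations (content unspecified on the slide)
    order_arm = rng.choice([1, 2])

    # Step 5: order of post-treatment blocks
    post_order = list(EXPERIMENTS)
    rng.shuffle(post_order)

    return {"treatment": treatment, "design": design,
            "pre_order": pre_order, "order_arm": order_arm,
            "post_order": post_order}

plan = randomize_respondent(random.Random(2025))
print(plan["design"])
print(plan["pre_order"])
print(plan["post_order"])
```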

Results

Average Treatment Effects

Bootstrapped ATEs

Estimated Design Effects

Meta-analyses

Design Considerations


We find no meaningful differences…

  • Between sample providers
  • By respondent professionalization
  • By respondent attentiveness
  • Between experiment types
  • By how far apart the repeated measures are placed
  • But repeated exposure to repeated measures may increase DEs!
  • Attitude recall questions may be the real culprit

Taking Stock


  • RM designs attenuate treatment effects by ~20%
  • But also shrink SEs by ~50%
  • Suitable for many applied settings
  • Precision gains usually trump design bias
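
A back-of-envelope reading of the two figures above, taking ~20% and ~50% as exact (an assumption) and writing \(\tau\) for the post-only ATE and \(\sigma\) for its standard error (symbols introduced here for illustration):

\[\frac{E[\hat\tau_{RM}] / SE_{RM}}{E[\hat\tau_{post}] / SE_{post}} \approx \frac{0.8\,\tau / 0.5\,\sigma}{\tau / \sigma} = 1.6\]

The expected test statistic is roughly 60% larger, about what a post-only design would gain from \(1.6^2 = 2.56\) times the sample size, since standard errors shrink with \(\sqrt{N}\).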

Taking Stock


  • How should we weigh the attenuation-precision tradeoff?
  • We simulate 300,000 experiments
  • Vary design, sample size, true ATE, & attenuation
  • Evaluate power, absolute error, false discovery, & coverage
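
A scaled-down sketch of this kind of simulation, under stated assumptions. It is not the authors' simulation code: the function `simulate`, the data-generating process, the pre/post correlation `rho`, and all parameter values are hypothetical, and false discovery is not computed separately (it can be approximated by the rejection rate when `true_ate=0`).

```python
import numpy as np

def simulate(design, n, true_ate, attenuation=0.0, rho=0.8,
             n_sims=500, seed=0):
    """Simulate one design; return power, mean absolute error, and coverage.

    design: "post_only" or "pre_post". Attenuation shrinks the realized
    effect in the pre-post design only (hypothetical setup).
    """
    rng = np.random.default_rng(seed)
    effect = true_ate * (1 - attenuation) if design == "pre_post" else true_ate
    reject, abs_err, cover = [], [], []
    for _ in range(n_sims):
        treat = rng.permutation(n) < n // 2          # complete randomization
        y_pre = rng.normal(size=n)
        noise = np.sqrt(1 - rho**2) * rng.normal(size=n)
        y_post = rho * y_pre + noise + effect * treat
        cols = [np.ones(n), treat.astype(float)]
        if design == "pre_post":
            cols.append(y_pre)                        # adjust for the pre-measure
        X = np.column_stack(cols)
        XtX_inv = np.linalg.inv(X.T @ X)
        beta = XtX_inv @ X.T @ y_post
        resid = y_post - X @ beta
        se = np.sqrt(resid @ resid / (n - X.shape[1]) * XtX_inv[1, 1])
        est = beta[1]
        reject.append(abs(est / se) > 1.96)           # two-sided test at 5%
        abs_err.append(abs(est - true_ate))           # error vs. the true ATE
        cover.append(abs(est - true_ate) <= 1.96 * se)
    return {"power": float(np.mean(reject)),
            "abs_error": float(np.mean(abs_err)),
            "coverage": float(np.mean(cover))}

for design in ("post_only", "pre_post"):
    print(design, simulate(design, n=500, true_ate=0.2, attenuation=0.2))
```

Absolute error and coverage are measured against the true (unattenuated) ATE, so the pre-post design is penalized for attenuation while benefiting from the smaller standard error.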

Simulations

Caveats & Next Steps


  • All evidence is from online panels
  • Limited range of interventions & topics
  • Repeated measure designs in other survey modes
  • Sensitive topics/interventions
  • Measurement scales
  • Other design considerations?

Thank you!



Special thanks to TESS, the Rapoport Family Foundation, and Duke Bass Connections for supporting this research.

Contact:

Diana Jordan
Duke University
scholars.duke.edu

Trent Ollerenshaw
University of Houston
trentoll.github.io

Andrew Trexler
UW-Madison
atrexler.com

Appendix

Respondent Professionalization

Respondent Attention

  • Identify first between-groups RM experiment for each respondent
  • Assess accuracy of self-reported attitude change
  • Estimate DEs in subsequent between-groups RM experiments separately for accurate and inaccurate respondents
  • No consistent differences

Repeated Measure Proximity