7  Experiments

Experiments are the backbone of causal inference, and text analysis is no exception. Whether in a laboratory or on Amazon’s Mechanical Turk, experiments can be carefully controlled and are a good way to mitigate the effects of confounding variables. Though many people associate advanced natural language processing with “big data,” the methods discussed in this book can be used effectively even in small-scale laboratory experiments.

As an example of experiments in quantitative language research, Sap et al. (2020) asked online participants to write either true stories about recent personal experiences or fictional stories on the same topics. They then used a large language model, GPT, to estimate two likelihoods for each sentence in a story: the likelihood of the sentence given the previous sentence, and the likelihood of the sentence given a rough summary of the story. The ratio of these two likelihoods measures how predictably the story flows from one point to the next. Sap et al. (2020) found that fictional stories flow much more predictably than true ones, and that true stories begin to flow more predictably when they are retold 2–3 months later. Sap et al. (2022) replicated these findings using a more advanced language model, GPT-3. We will discuss these and other methods of measuring linguistic complexity in Chapter 22.
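The likelihood-ratio idea can be sketched in a few lines of code. The sketch below is illustrative only: it substitutes a toy bigram model with add-one smoothing for the large language model Sap et al. actually used, and the function names (`train_bigram`, `logprob`, `flow_score`) are our own, not from their work. In log space, the ratio of the two likelihoods becomes a difference of log-probabilities.

```python
import math
from collections import Counter

def train_bigram(corpus_sentences):
    # Toy bigram model with add-one smoothing; a stand-in for a
    # neural language model such as GPT.
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for s in corpus_sentences:
        toks = ["<s>"] + s.lower().split()
        vocab.update(toks)
        for a, b in zip(toks, toks[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    return bigrams, unigrams, len(vocab)

def logprob(sentence, context, model):
    # log P(sentence | context): prepend the context tokens, then
    # score only the sentence's own tokens.
    bigrams, unigrams, v = model
    n_ctx = len(context.split())
    toks = ["<s>"] + context.lower().split() + sentence.lower().split()
    lp = 0.0
    for i, (a, b) in enumerate(zip(toks, toks[1:])):
        if i < n_ctx:  # skip transitions that end inside the context
            continue
        lp += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + v))
    return lp

def flow_score(sentence, prev_sentence, summary, model):
    # log of the likelihood ratio: positive when the sentence is
    # more predictable from its predecessor than from the story's gist.
    return logprob(sentence, prev_sentence, model) - logprob(sentence, summary, model)
```

With a real language model the two `logprob` calls would instead sum the model's per-token log-probabilities under each conditioning context, but the ratio computation is the same.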

Advantages of Experimental Data Collection
  • Control: Experiments mitigate the effects of confounding variables.
  • Customization: Experimenters can tailor the experiment to fit their particular research questions.
Disadvantages of Experimental Data Collection
  • Expensive: Recruiting and compensating participants makes experiments costly to run.
  • Time-Consuming: Designing, piloting, and administering an experiment takes considerable time.
  • Small Sample Size: Because they are costly and time-consuming, experiments generally result in small datasets.

Sap, M., Horvitz, E., Choi, Y., Smith, N. A., & Pennebaker, J. (2020). Recollection versus imagination: Exploring human memory and cognition via neural language models. In D. Jurafsky, J. Chai, N. Schluter, & J. Tetreault (Eds.), Proceedings of the 58th annual meeting of the association for computational linguistics (pp. 1970–1978). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.178
Sap, M., Jafarpour, A., Choi, Y., Smith, N. A., Pennebaker, J. W., & Horvitz, E. (2022). Quantifying the narrative flow of imagined versus autobiographical stories. Proceedings of the National Academy of Sciences, 119(45), e2211715119. https://doi.org/10.1073/pnas.2211715119