Preregistered Replication of the Auditory Deviant Effect: A Robust Benchmark Finding

Short-term memory of visually presented lists of items is disrupted by auditory distraction. The auditory deviant effect refers to the finding that a sequence in which a single auditory event deviates from all other auditory objects disrupts serial recall more than a sequence without such a deviant. The changing-state effect refers to the finding that auditory changing-state sequences with changes from one auditory distractor item to the next disrupt the immediate serial recall of verbal items more than steady-state sequences consisting of distractor repetitions. One purpose of the present study is to perform a preregistered replication of the auditory deviant effect as well as (for the purpose of comparison) the changing-state effect and to provide reference data sets for the auditory deviant effect and the changing-state effect in the benchmarks repository with a large sample of participants and trials. Both effects were robustly obtained over the course of two sessions in which participants were tested. We also explored the relationship between auditory distraction and personality, and found auditory distraction to be unrelated to extraversion, neuroticism, and psychoticism.

In the benchmarks project (Oberauer et al., 2018), a list of benchmark findings has been defined that theories and computational models of working memory should be able to account for. Both the changing-state effect and the auditory deviant effect have been included in this list. Specifically, all benchmark findings of working memory have been ranked in three levels of priority (A, B, C) according to three criteria: (1) reproducibility, (2) generalizability, and (3) theoretical leverage. In a first step, the theoretical importance of each phenomenon was ascertained via expert judgments (Oberauer et al., 2018). However, some questions remained about the reproducibility and generalizability of the findings. While the changing-state effect is considered well established and thus has been assigned a B rating, the auditory deviant effect has been assigned "a C rating because the auditory deviant effect constitutes a relatively novel finding in the working memory literature that is of high theoretical leverage but for which robustness and generality still need to be ascertained" (Oberauer et al., 2018, p. 945). The first aim of the present study is to determine the robustness of the auditory deviant effect by performing a large preregistered replication. Given that the auditory deviant effect and the changing-state effect are often studied together to compare their relative impact on performance (e.g., Hughes et al., 2005; Körner et al., 2019), the changing-state effect is included in the preregistered replication as well to allow for a direct comparison of the two phenomena. Note, however, that the two effects need not be investigated within a single study to be obtained: both the changing-state effect and the auditory deviant effect have also been detected independently of each other (e.g., Jones et al., 1992; Vachon et al., 2017).
Computational models aim not only to explain effects but also to reproduce real data reflecting these effects. To support these efforts, a long-term goal of the benchmarks team is to collect a set of reference data for each of the phenomena on the benchmarks list and to make these data available in an open data repository. Preferably, these data sets should "use large samples of participants and trials to provide the basis for precise estimates of model parameters (i.e., effect sizes in statistical models, and estimates of latent variables in theoretical models)" and be "preregistered replications of benchmark findings; such replications are desirable because, despite our efforts to ensure that all benchmarks are robust and replicable, we cannot rule out that the available evidence is compromised by publication bias" (Oberauer et al., 2018, p. 889). A second purpose of the present preregistered replication is thus to provide a reference data set for the changing-state effect and the auditory deviant effect in the benchmarks repository that fulfills the criteria stated above.

Materials and procedure
A standard serial recall paradigm was used (see, for example, Röer et al., 2015). Participants were seated in separate cubicles with sound-absorbing walls in sessions with up to five participants. The instructions emphasized that anything presented over the headphones was to be ignored. Participants started each trial by pressing the space bar. After a 1000 ms blank screen, eight digits (randomly sampled without replacement from the set {1, 2, …, 9}) were successively presented at the center of the screen, each for 1000 ms, in 80 pt Monaco font. Then eight question marks appeared on the screen, which had to be replaced by the digits. Participants used the number pad of the keyboard to enter the recalled digits in forward order. They were not allowed to correct their responses. To motivate participants to concentrate on the task, they received feedback about how many digits they had remembered correctly after each trial. A digit was only scored as correctly recalled when it was entered at the correct serial position. During the encoding of the digits, one of three types of auditory distractor sequences had to be ignored. In the steady-state condition, the same word was repeated 10 times (e.g., Berg, Berg, Berg, Berg, Berg, Berg, Berg, Berg, Berg, Berg). In the auditory deviant condition, the steady-state sequence was disrupted by a deviant word presented at the 7th position (e.g., Berg, Berg, Berg, Berg, Berg, Berg, Chef, Berg, Berg, Berg). In the changing-state condition, 10 different words were presented (e.g., Dank, Berg, Zeug, Chef, Wind, Gold, Typ, Ruf, Haut, Mund). The distractors were played for 500 ms with a 300 ms inter-distractor interval. In all conditions, the distractors were randomly drawn from the set {Berg, Chef, Dank, Gold, Haut, Hof, Mund, Rand, Ruf, Typ, Wind, Zeug}. All words were spoken by the same female voice.
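The stimulus construction described above can be illustrated with a minimal Python sketch. It is not the original LiveCode software; the function names, the condition labels, and the use of a seeded random generator are assumptions made for illustration. The duration helper additionally assumes that the 300 ms interval occurs only between successive distractors, which the description does not state explicitly.

```python
import random

# Distractor word pool as listed in the text.
WORDS = ["Berg", "Chef", "Dank", "Gold", "Haut", "Hof",
         "Mund", "Rand", "Ruf", "Typ", "Wind", "Zeug"]
SEQ_LEN = 10       # distractors per sequence
DEVIANT_POS = 7    # 1-based position of the deviant word


def make_digits(rng):
    """Eight to-be-remembered digits, sampled without replacement from 1..9."""
    return rng.sample(range(1, 10), 8)


def make_distractors(condition, rng):
    """Build one 10-word distractor sequence (condition labels are hypothetical)."""
    if condition == "steady":
        return [rng.choice(WORDS)] * SEQ_LEN
    if condition == "deviant":
        standard, deviant = rng.sample(WORDS, 2)  # two different words
        seq = [standard] * SEQ_LEN
        seq[DEVIANT_POS - 1] = deviant
        return seq
    if condition == "changing":
        return rng.sample(WORDS, SEQ_LEN)         # ten different words
    raise ValueError(f"unknown condition: {condition}")


def sequence_duration_ms(n_items=SEQ_LEN, item_ms=500, gap_ms=300):
    """Total duration, assuming gaps occur only between successive items."""
    return n_items * item_ms + (n_items - 1) * gap_ms
```

Under the stated assumption, `sequence_duration_ms()` gives 7700 ms for a distractor sequence, compared with 8 × 1000 ms = 8000 ms for the presentation of the eight digits.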
The distractors were played at about 65 dB(A) through headphones with high-insulation hearing protection covers (beyerdynamic DT-150) that were plugged directly into the Apple iMacs that controlled the experiment. The software used for this purpose was written in LiveCode. Following 16 steady-state training trials (which were not to be analyzed), the experimental trials (consisting of 8 steady-state trials, 8 auditory deviant trials, and 8 changing-state trials) followed in random order. This procedure was chosen because it closely resembles that of previous studies conducted in our lab in which the auditory deviant effect was robustly obtained (Röer et al., 2015; Röer, Körner, Buchner, & Bell, 2018). We only used 8 trials in each condition because it has been observed that the auditory deviant effect decreases over trials (Röer et al., 2015, 2018). We thus refrained from using more trials per session, because giving participants more opportunity for habituation could have decreased the chances of finding a significant effect. To nevertheless ensure that the data set included a large number of trials, we decided to invite the participants to take part in a second session in the subsequent week in which the experiment described above was repeated in the exact same way. Including two sessions with a large number of trials has many advantages for the usefulness of the data in statistical modeling (Oberauer et al., 2018) and makes it possible to examine whether the measures are robustly observed. At the end of each session, the participants completed the Eysenck personality questionnaire EPQ-R (Ruch, 1999).
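The trial structure of a session can be sketched as follows. This is an illustrative reconstruction only: the constant names, the condition labels, and the helper functions are hypothetical and are not taken from the original experiment software.

```python
import random

N_TRAINING = 16        # steady-state training trials, not analyzed
N_PER_CONDITION = 8    # experimental trials per condition and session
N_SESSIONS = 2         # the same experiment was run in two sessions
CONDITIONS = ("steady", "deviant", "changing")  # hypothetical labels


def make_session(rng):
    """Return (training, experimental) condition labels for one session."""
    training = ["steady"] * N_TRAINING
    experimental = [c for c in CONDITIONS for _ in range(N_PER_CONDITION)]
    rng.shuffle(experimental)  # experimental trials run in random order
    return training, experimental


def analyzed_trials_per_condition(n_participants):
    """Analyzed trials per condition, summed over participants and sessions."""
    return n_participants * N_SESSIONS * N_PER_CONDITION
```

Each participant thus contributes 2 × 8 = 16 analyzed trials per condition. For the 273 participants in the final sample, `analyzed_trials_per_condition(273)` gives 4368 trials per condition.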

Design
A repeated-measures 3 (distractor condition) × 2 (experimental session) × 8 (serial position) design was used. The distractor conditions were steady state, auditory deviant, and changing state. The exact same experiment was completed two times in two sessions separated by approximately 1 week. Serial recall was scored according to a strict serial-recall criterion: Only digits recalled at the correct serial position were scored as correct.
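The strict serial-recall criterion can be made concrete with a short Python sketch. The function names are hypothetical, and the example digits are made up for illustration.

```python
def position_correct(presented, recalled):
    """Per-position correctness (True/False) under the strict criterion."""
    if len(presented) != len(recalled):
        raise ValueError("presented and recalled lists must have equal length")
    return [p == r for p, r in zip(presented, recalled)]


def score_strict(presented, recalled):
    """Number of digits recalled at their correct serial position."""
    return sum(position_correct(presented, recalled))


# Made-up example: the digits at Positions 3 and 4 are transposed.
presented = [3, 1, 4, 9, 5, 2, 6, 8]
recalled = [3, 1, 9, 4, 5, 2, 6, 8]
print(score_strict(presented, recalled))  # 6
```

Note that both transposed digits count as errors even though they were recalled, which is what distinguishes strict serial scoring from free-recall scoring. The per-position output of `position_correct` is the kind of quantity that is averaged to obtain serial position curves.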

Stopping rule
We collected as many data sets as possible in the six weeks the laboratory was available to us. In the first five weeks, we advertised the study on campus to try to find as many participants as possible. In the sixth week, we only invited those participants who had completed their first session in the previous week to complete data collection for the second session.

Differences from pre-data collection plan
Some participants did not have time to participate in the week following the first session, so we gave them the option to participate either in the same week or two weeks after the first session. Therefore, the participants included in the dataset were tested in two sessions that were 3 to 24 days apart; the mean delay between the two sessions was 7 days (SD = 2).

Participants
As specified in the preregistration, only complete data sets (i.e., data sets of those participants who completed both sessions) were included in the analysis. Five participants did not show up for the second session, as a consequence of which their data were not analyzed. One person turned the volume of the computer off and therefore could not hear the distractors; his data files were removed before analysis. The remaining sample consisted of 273 participants (mean age = 22, SD of age = 4), 199 of whom were female. A sensitivity analysis showed that an auditory deviant effect of the size ηp² = .05 could be detected with a statistical power of 1 − β = .95 (Faul, Erdfelder, Lang, & Buchner, 2007).

Preregistered comparisons
For the benchmarks project, the most important question is whether or not the changing-state effect and the auditory deviant effect can be replicated. The following analyses were performed to test this.

Overall distraction
A 3 × 2 × 8 repeated-measures MANOVA with distractor condition (steady state, auditory deviant, changing state), session (1, 2), and serial position (1 to 8) as independent variables and serial recall as dependent variable revealed a main effect of distractor condition, F(2,271) = 75.07, p < .01, ηp² = .36. Helmert contrasts showed that performance in the steady-state control condition was better than in the other two conditions, F(1,272) = 102.53, p < .01, ηp² = .27. Furthermore, performance was worse in the changing-state condition than in the auditory deviant condition, F(1,272) = 63.36, p < .01, ηp² = .19. All of the effects are in the expected direction.
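For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following Python sketch applies this identity to the three tests reported above; the function name is hypothetical, and the sketch is a plausibility check rather than a reproduction of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
reported = [
    ("distractor condition", 75.07, 2, 271),
    ("steady state vs. other conditions", 102.53, 1, 272),
    ("auditory deviant vs. changing state", 63.36, 1, 272),
]
for label, f_value, df1, df2 in reported:
    print(label, round(partial_eta_squared(f_value, df1, df2), 2))
```

Rounded to two decimals, the three values are .36, .27, and .19, which agrees with the effect sizes reported in the text.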

Exploratory analyses
Serial position
Serial position curves are shown in Figure 1. Recall showed a typical serial position curve, F(7,266) = 276.53, p < .01, ηp² = .88. The interaction between distractor condition and serial position was significant, F(14,259) = 3.41, p < .01, ηp² = .16. Both the changing-state effect, F(7,266) = 5.39, p < .01, ηp² = .12, and the auditory deviant effect, F(7,266) = 2.18, p = .04, ηp² = .05, interacted with serial position. Averaged across both sessions, the changing-state effect was significant at all serial positions; the auditory deviant effect was significant at the .05 level at all serial positions with the exception of the highest and lowest points of the serial recall curve (Positions 1 and 7).

Habituation
Performance was better in the second session in comparison to the first session, F(1,272) = 41.43, p < .01, ηp² = .13, but distractor condition did not interact with session, F(2,271) = 0.35, p = .70, ηp² < .01. Neither the changing-state effect, F(1,272) = 0.67, p = .41, ηp² < .01, nor the auditory deviant effect, F(1,272) = 0.31, p = .58, ηp² < .01, interacted with session. Figure 2 shows serial recall across the eight trials of both sessions in each condition. Performance increased across trials, F(7,266) = 2.55, p = .01, ηp² = .06. There was also an interaction between trial and session, F(7,266) = 2.69, p = .01, ηp² = .07; the increase in performance across trials was less pronounced in the second session in which performance was already high at the start of the session. Distractor condition did not interact with trial, F(14,259) = 1.44, p = .13, ηp² = .07. There was also no three-way interaction between distractor condition, trial, and session, F(14,259) = 1.20, p = .27, ηp² = .06. The interaction between the linear contrast component of the trial variable and the variable contrasting the steady-state control condition with the other two conditions was not significant, F(1,272) = 0.04, p = .85, ηp² < .01. The interaction between the linear contrast component of the trial variable and the variable contrasting the auditory deviant condition and the changing-state condition was not significant either, F(1,272) = 1.19, p = .28, ηp² < .01.
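The logic of the two contrast-by-linear-trend interactions can be illustrated with a small Python sketch. It computes, for one participant, an interaction contrast score as the weighted sum of cell means, using the product of a condition contrast and a linear trend across the eight trials. The weights, the data layout (a dictionary mapping hypothetical condition labels to eight trial means), and the function names are assumptions; the scaling of contrast weights differs between statistics packages, and this sketch is not the analysis code of the study.

```python
# Helmert-type condition contrasts (sign and scaling are arbitrary choices).
CONTROL_VS_OTHERS = {"steady": 2, "deviant": -1, "changing": -1}
DEVIANT_VS_CHANGING = {"steady": 0, "deviant": 1, "changing": -1}


def linear_weights(n_levels):
    """Centered, equally spaced linear trend weights, e.g. -7, -5, ..., 7."""
    return [2 * i - (n_levels - 1) for i in range(n_levels)]


def interaction_score(cells, condition_weights, trial_weights):
    """Weighted sum of cell means for a condition x linear-trend contrast.

    cells maps each condition label to a list of per-trial mean scores.
    """
    return sum(
        condition_weights[cond] * w * cells[cond][t]
        for cond in condition_weights
        for t, w in enumerate(trial_weights)
    )
```

A score of zero means that the contrasted conditions change across trials at the same linear rate. Testing whether such per-participant scores differ from zero on average corresponds to a single-degree-of-freedom test of the kind reported above, F(1,272), for a sample of 273 participants.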

Distraction and personality
To examine the relationship between auditory distraction and personality, we followed the procedure of previous studies (Körner et al., 2017;Sörqvist, 2010), and correlated the changing-state effect and the auditory deviant effect with the personality scales. Neither the changing-state effect nor the auditory deviant effect correlated significantly with any of the personality traits (Figure 3).
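The correlational approach can be sketched in Python as follows. Each distraction effect is a per-participant difference score (total digits recalled in the steady-state condition minus total digits recalled in the respective distractor condition, so that higher values indicate more distraction), which is then correlated with a personality scale score. All numbers below are made up for illustration, and the function names are hypothetical.

```python
from math import sqrt


def distraction_effect(steady_total, distractor_total):
    """Difference score: higher values indicate more distraction."""
    return steady_total - distractor_total


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up totals for five hypothetical participants.
steady = [100, 92, 110, 85, 97]
deviant = [95, 90, 101, 84, 90]
extraversion = [12, 7, 15, 9, 11]  # made-up scale scores

effects = [distraction_effect(s, d) for s, d in zip(steady, deviant)]
r = pearson_r(effects, extraversion)
```

Because a difference score combines the measurement error of both of its components, its reliability can be low, which limits the size of the correlations that can be observed with other variables.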

Discussion
The auditory deviant effect was successfully replicated. In the original paper, Hughes et al. (2005) reported auditory deviant effects that were associated with sample effect sizes of ηp² = .23, .05, .17, and .18 in their Experiments 1, 2, 3, and 4, respectively. Here, the sample effect size of the auditory deviant effect was ηp² = .10, which is comparable to that reported by other large-sample studies (Röer et al., 2015, 2018). The effect is thus somewhat smaller than the changing-state effect that was associated with a sample effect size of ηp² = .35 in the present study. However, the auditory deviant effect was reliably obtained. The auditory deviant effect interacted with serial position, but it was not restricted to the serial position at which the auditory deviant was presented. The effect was instead significant at the .05 level at all serial positions except Positions 1 and 7 (at which performance was highest and lowest, respectively). This suggests that the auditory deviant interfered with the representation of the list in memory rather than with the encoding of the digit presented simultaneously with the auditory deviant (Körner et al., 2019). The same applies to the changing-state effect (Miles, Jones, & Madden, 1991), which showed a similar dependence on serial position as the auditory deviant effect but was significant at all serial positions.
Habituation of the auditory deviant effect was examined by testing whether the difference between the auditory deviant condition and the steady-state condition decreased over trials. Serial recall performance increased in all conditions during the first session and remained at a high level in the second session, but the auditory deviant effect was constant within sessions and across sessions. The changing-state effect likewise showed no evidence of across-trial or across-session habituation.
Even though the sample size was much larger than in previous studies, the results provide evidence against a differential relationship between either form of auditory distraction and certain personality traits. Neither the auditory deviant effect nor the changing-state effect correlated with extraversion, neuroticism, or psychoticism, as assessed by the EPQ-R (Ruch, 1999). However, the homogeneity of the sample as well as the low reliability of the difference scores which constitute the auditory deviant effect and the changing-state effect (cf. Körner et al., 2017) may have reduced the chances of observing correlations that are significantly different from zero. The present study also focuses only on a narrow part of the disruptive potential of auditory stimuli, as steady-state sequences already cause significant distraction relative to quiet (Bell, Röer, Lang, & Buchner, 2019) and complex distractor sequences such as sentential speech and music have been found to cause more distraction than sequences consisting of one-syllable words (Körner et al., 2017; Röer, Bell, & Buchner, 2014a).
The present study thus successfully replicates two benchmark findings of working memory. Although both the changing-state effect and the auditory deviant effect have already been reported in many previous studies, there are several good reasons to perform a preregistered replication study with a large sample of participants providing reference data sets in the benchmarks repository. It has been discovered in recent years that some seemingly well-established effects do not replicate (e.g., Hagger et al., 2016). The rate of failed replications is disconcerting, leading to the conclusion that many seemingly well-established findings may be false and effect sizes may be inflated (Open Science Collaboration, 2015).
To counter this replication crisis, it has been proposed that replication efforts should target findings with the lowest likelihood of replication to filter out false positives that cannot be reproduced (Dreber et al., 2015). The benchmarks project takes the opposite, but equally important, approach by compiling a list of well-reproducible and theoretically relevant findings that can then be used as a solid basis for developing and evaluating theories and computational models of working memory. In a first step within the benchmarks project, the theoretical relevance of findings was determined by asking experts in the field to rate the importance of empirical phenomena for models of working memory (Oberauer et al., 2018). It makes sense to accompany this effort with large preregistered replication studies to obtain information on the reproducibility as well as unbiased estimates of the effect sizes of the benchmark findings. This may help to focus theoretical and empirical research and resources on effects that are not only theoretically relevant but also empirically robust and, therefore, likely to be useful in efforts to advance our understanding of working memory. The present preregistered replication shows that two benchmarks of working memory, the auditory deviant effect and the changing-state effect, can be robustly obtained.

Data accessibility statement
The raw data are available in the benchmarks repository at https://osf.io/g49c6/. The preregistration and the personality data are accessible at https://osf.io/hyp94/.

Ethics and consent
The research was performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from each participant prior to participation. All data were anonymized before being published, that is, information about the participants' code, gender, and age was removed.

Figure 3:
Correlations between the distraction effects and the personality scales. The changing-state effect corresponds to the difference in the total number of recalled digits between the steady-state condition and the changing-state condition, and the auditory deviant effect corresponds to the difference in the total number of recalled digits between the steady-state condition and the auditory deviant condition (higher values indicate more distraction). None of the correlations were significant at the .05 significance level. The figure was created with JASP (JASP Team, 2018).