Momentary, Offset-Triggered Dual-Task Interference in Visual Working Memory

There are conflicting views on whether the storage of visual working memory (VWM) involves an active maintenance mechanism. If so, it should suffer from secondary tasks that tap into the same mechanisms during the delay. We found that when observers were asked to remember a visual pattern, the interference from a secondary task was only momentary. This suggests that active attentional processing may serve the initial consolidation, rather than maintenance, of information during the delay period. In three experiments, observers remembered a visual grating while having to perform an intervening change detection task on sets of letters presented at various intervals from the to-be-memorized pattern. The secondary task affected memory on the first task, but only for a limited time window post stimulus offset. We furthermore show that this temporary interference is locked to the offset rather than the onset of the to-be-memorized pattern, suggesting that attention may be transiently recruited for consolidation on a just-in-time basis.


Introduction
Visual working memory (VWM) enables the cognitive system to flexibly maintain a limited amount of visual information necessary for an ongoing task (Baddeley, 1992). Theories of VWM have assumed that storage involves a strong role for active maintenance mechanisms in sustaining or refreshing the neural activity associated with a stored representation, and thus protecting it against interference (e.g., Awh, Vogel, & Oh, 2006; Chun, 2011; Cowan, 1998; Gazzaley & Nobre, 2012; Jonides et al., 2008; Kiyonaga & Egner, 2012; Logie, 2014; Olivers, 2008; Theeuwes, Belopolsky, & Olivers, 2009). However, this assumption is contradicted by evidence that VWM representations readily survive even when observers are asked in the meantime to perform a different intervening task that is assumed to share at least some of the maintenance mechanisms (e.g., Hollingworth & Maxcey-Richard, 2013; Lewis-Peacock, Drysdale, Oberauer, & Postle, 2011; van Moorselaar, Olivers, Theeuwes, Lamme, & Sligte, 2015; Olivers, Peters, Houtkamp, & Roelfsema, 2011; Wang, Theeuwes, & Olivers, 2017). In fact, one could argue that being able to perform one task while holding on to information for another is a core functional characteristic of working memory. To bridge these views, we propose that in multi-task environments, the role of attention during VWM tasks is momentary, serving the initial consolidation of information rather than the sustained activation of that information throughout the delay period (see Ricker, Nieuwenstein, Bayliss, & Barrouillet, 2018 for a review). Durable maintenance is then still related to active processing, because the more active a memorized representation is, the better it is consolidated, and the more likely it is to be preserved across interference. At the same time, once an item is consolidated, active rehearsal does not continue to be crucial throughout the delay period.
To test this account, here we investigated at which point in time dual task interference in memory occurs, and thus what the crucial time window is for the process of storing and maintaining a pattern.
Relevant evidence on a momentary role of consolidation processes in (V)WM comes from studies on the attentional blink, the phenomenon that processing of a second target (T2) is impaired shortly after presentation of a first target (T1; Raymond, Shapiro, & Arnell, 1992). Researchers have argued for a two-stage process, in which multiple sensory representations can initially be activated rapidly and in parallel, but of which only a subset can subsequently pass through a serial, attention-based memory consolidation process that serves conscious report (e.g., Chun & Potter, 1995; Jolicoeur & Dell'Acqua, 1998; see Olivers & Meeter, 2008 for an alternative account). The attentional blink then occurs because consolidation of T1 draws attention away from T2. Relevant for the present purpose is that the attentional blink reflects a temporary interference effect occurring between 100 and 500 ms after T1. Consistent with the two-stage model, Nieuwenstein and Wyble (2014) found evidence that the reverse scenario, where a later task can hinder the consolidation of information for an earlier task, also holds. They presented observers first with a to-be-remembered set of letters, shortly followed by a second task in which observers judged the parity of a digit. They varied the time between the two tasks and found that memory was impaired the closer the second task was to the first, with the shortest interval being 250 ms. Nieuwenstein and Wyble (2014) argued that the second task specifically disrupts the consolidation of the earlier presented items (see Barrouillet, Bernardin, & Camos, 2004; Barrouillet, Bernardin, Portrat, Vergauwe, & Camos, 2007; Ricker & Cowan, 2010; Ueno, Allen, Baddeley, Hitch, & Saito, 2010 for similar dual-task interference). Note that within these studies as well as the current study, consolidation is defined as the mechanism needed to make memory impervious to secondary task interference. This is related to, but not identical to, a more perceptual definition, according to which consolidation protects memory against sensory interference from masks (Liu & Becker, 2013; Vogel, Woodman, & Luck, 2006).
Given the earlier evidence, we sought to investigate exactly when a later secondary task interferes with VWM storage for an earlier task. In three experiments, participants remembered the color and orientation of a visual grating for a later memory test presented several seconds later. In the crucial conditions, participants were also required to perform a letter change detection task during the delay period (Dual Task block). In a control condition, the same letter displays were passively viewed (Single Task block; see Figure 1 for illustration). Importantly, we varied the time between the offset of the visual grating and the onset of the letter task between 0, 100, 250, 500, and 1000 ms. Experiment 1 provided evidence that the secondary task adversely affected memory of the grating, but only during a limited time window centered around 250 ms. Furthermore, Experiments 2 and 3 provide evidence that the interference is locked to the offset rather than the onset of the pattern, suggesting that attention may be efficiently recruited for consolidation on a just-in-time basis.

Experiment 1
Participants memorized both color and orientation of a grating stimulus for a memory test at the end of the trial. In the Dual Task block, participants performed another memory task in between, for which they had to remember a display of six letters. This task was presented at various intervals after the grating to assess when dual-task interference was maximal. In order to control for any sensory effects, a Single Task block also included the letter displays but without any additional task requirements. Finally, both types of block contained baseline trials without any intervening letter displays.

Method
Participants
Twenty-four university students (22 females, mean age = 19.8 years) were recruited for monetary compensation. The number of participants was based on a pilot study with an estimated power of 0.65 for the two-way interaction at an alpha of 0.05. All provided written informed consent and reported normal color vision, and normal or corrected-to-normal visual acuity. Two participants were excluded because their average performance was close to chance (54% and 56% correct). The procedure complied with a generic protocol approved by the Scientific and Ethical Review Board of the Faculty of Behavioral and Movement Sciences of the Vrije Universiteit Amsterdam.
Apparatus and stimuli
Participants were tested in a dimly lit laboratory and were required to keep the chin on a chinrest, positioned at a viewing distance of approximately 65 cm from a 17-in. color monitor. Stimulus presentation and response registration were controlled by custom scripts written in Python.
Task and stimuli were adopted from Wang et al. (2017, 2018) and are illustrated in Figure 1. On each trial observers remembered a single grating, presented on a gray screen (22 cd/m²) and cropped by a circular mask (with a radius of 4°). Its spatial frequency was 0.5 cycles per degree. Its foreground color was randomly selected from one of two color values (either 45° or 135° on a circle in the CIE L*a*b* color space centered at L = 70, a = 5, b = 0 with a radius of 60 color values, plus or minus an offset randomly selected from a range of ±10°; luminance: 36-44 cd/m²), and its orientation was randomly chosen from one of two orientations (either 45° or 135° from vertical, plus or minus an offset randomly selected from a range of ±10°). The additional random jitter in feature value was used to discourage verbal recoding. The test grating's color and orientation were either both the same as those of the sample, or there was either a ±10° color change or a ±15° orientation change (but never both). Here too, the small change relative to the reference stimulus was meant to discourage verbal recoding.
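The feature sampling described above can be sketched as follows. This is a minimal illustration and not the authors' script: the function names are hypothetical, the angle is assumed to be measured counterclockwise from the positive a* axis, and the jitter is assumed to be uniform within ±10°. Conversion from L*a*b* to monitor RGB is omitted.

```python
import math
import random

# Values stated in the text: a circle in the a*b* plane centered at
# (a = 5, b = 0) with radius 60, at lightness L = 70.
L_STAR, A_CENTER, B_CENTER, RADIUS = 70.0, 5.0, 0.0, 60.0


def lab_from_angle(angle_deg):
    """Return (L*, a*, b*) for an angle on the color circle.

    Assumption: the angle runs counterclockwise from the +a* axis.
    """
    theta = math.radians(angle_deg)
    return (L_STAR,
            A_CENTER + RADIUS * math.cos(theta),
            B_CENTER + RADIUS * math.sin(theta))


def sample_feature(rng, bases=(45.0, 135.0), jitter=10.0):
    """Pick one of the two base values and add an offset within ±jitter.

    Assumption: the offset is drawn from a uniform distribution.
    """
    return rng.choice(bases) + rng.uniform(-jitter, jitter)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
color_angle = sample_feature(rng)   # color angle in degrees
orientation = sample_feature(rng)   # orientation in degrees from vertical
sample_lab = lab_from_angle(color_angle)
```

Because both features are jittered around their base values, a sampled grating rarely has a canonical color or orientation, which is the property the text relies on to discourage verbal recoding.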
On Intervening Display Present trials, the grating was followed by the presentation of two sets of six black letters (4 cd/m², each subtending 1° × 1°). Letter locations were randomly selected from an invisible 3 × 3 grid (9° × 9° of visual angle) centered in the middle of the display. The first set was randomly chosen from the English alphabet, without replacement. The second set was the same as the first set, except on change trials, when one letter was replaced with a new one.
Figure 1 illustrates the main procedure. All trials started with a 500 ms fixation cross, followed by the to-be-memorized grating, which was presented in the center of the display for 300 ms. The primary task was to remember this grating for a memory test at the end of the trial. Note that we use the label "primary task" only for our own purposes; to participants, both tasks were stressed as equally important. Participants judged whether, compared to the original one, the grating was changed in either color or orientation (never both), by pressing the "Z" (no) or "/" (yes) key. One third of the trials had no change, one third had a color change, and one third had an orientation change. The test grating was presented until response. The next trial began after a randomly jittered (500-700 ms) inter-trial interval. Only accuracy was emphasized in the instructions.

Procedure and design
On Intervening Display Present trials, the offset of the to-be-remembered grating was followed by a blank interval of 0, 100, 250, 500, or 1000 ms, after which the first of two letter displays appeared for 500 ms, followed by a blank display for 500 ms, and then the second letter display until response or 2000 ms (whichever came first). The second display was identical to the first in 50% of the trials, while in the other 50% one of the letters was replaced with a new one. After the second display there was an interval of 3500 ms before the test grating appeared. On Intervening Display Absent trials, no letter displays were presented during the delay period, which instead remained blank for the same total duration. For efficiency, two thirds of the trials had intervening displays and one third had none, randomly mixed within blocks.
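To make the timing concrete, the sketch below lays out the event onsets of a letters-present trial from the durations given in the text. The function name is hypothetical, and the second letter display is assumed to last its maximum of 2000 ms (it would end earlier if a response were given).

```python
def trial_timeline(isi_ms, sample_ms=300, letters2_ms=2000):
    """Onset (ms from trial start) of each event on a letters-present trial."""
    durations = [
        ("fixation", 500),
        ("grating", sample_ms),
        ("blank_isi", isi_ms),        # the manipulated interval
        ("letters_1", 500),
        ("blank", 500),
        ("letters_2", letters2_ms),   # assumption: shown for the full 2000 ms
        ("retention", 3500),
    ]
    onsets, t = {}, 0
    for name, duration in durations:
        onsets[name] = t
        t += duration
    onsets["test_grating"] = t
    return onsets


for isi in (0, 100, 250, 500, 1000):
    timeline = trial_timeline(isi)
    print(isi, timeline["letters_1"], timeline["test_grating"])
```

Under these assumptions, the grating has to be retained for more than six seconds at every ISI, so the ISI manipulation changes when the letter task begins while barely changing the overall retention demand.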

Figure 1: A) Sequence of events in the with-letters and without-letters conditions. The interval between the sample grating and the first letter display varied between 0, 100, 250, 500, and 1000 ms in Experiment 1, and between 100, 250, and 500 ms in Experiments 2 and 3. All displays were shown against a gray background. B) Because colors defined in CIELAB are difficult to reproduce faithfully in an image, we show an example of the colors used in the sample and probe displays of the primary task.
Two main conditions were blocked. In the Single Task condition observers only did the grating memory task, and any intervening letter displays were to be ignored. In the Dual Task condition, participants performed a change detection task on the intervening letter displays, when they appeared. Participants indicated whether any letter had changed or not by pressing the "Z" (no) or "/" (yes) key, and were given two seconds to respond. Here too only accuracy was stressed.
A 2 (Block Type: Single vs. Dual task) × 2 (Intervening Display: Absent vs. Present) × 5 (Inter-stimulus Interval: 0, 100, 250, 500, and 1000 ms) within-subject design was adopted. Different consolidation intervals and the presence of intervening displays were randomly mixed within blocks, while different block types (single vs dual task) were tested in different sessions, with order counterbalanced across participants. Participants completed 27 practice trials and four sessions of 315 trials, on each of four consecutive days. This resulted in 84 trials per consolidation interval in each of the task conditions (single and dual task) when an intervening display was present, and 42 trials per interval when no intervening display was present.
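The trial counts reported above can be checked with a few lines of arithmetic. The variable names are ours; all numbers are taken from the text.

```python
isis = [0, 100, 250, 500, 1000]                  # ms
block_types = ["single", "dual"]
trials_per_isi = {"present": 84, "absent": 42}   # per block type

# Trials per block type: 5 ISIs x (84 + 42) = 630
per_block_type = len(isis) * sum(trials_per_isi.values())
total_trials = len(block_types) * per_block_type

sessions, trials_per_session = 4, 315
assert total_trials == sessions * trials_per_session   # 1260 in both cases

# Proportion of trials with an intervening display (two thirds in the text)
present_fraction = trials_per_isi["present"] / sum(trials_per_isi.values())
```

The counts are internally consistent: two sessions of 315 trials cover one block type, and two thirds of the trials contain letter displays.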

Results
Primary task performance
Figure 2A shows the mean accuracy score for the grating memory task as a function of Block Type (single vs. dual task), Intervening Display (present vs. absent), and ISI (0, 100, 250, 500, and 1000 ms). A repeated measures ANOVA revealed significant main effects for block type, F(1, 21) = 4.45, p = .047, partial η² = .18, and intervening display, F(1, 21) = 8.23, p = .009, partial η² = .28, but not for ISI, F < 1. Overall, as expected, memory performance was better in single task conditions and without intervening letter displays. Importantly, the three-way interaction between block type, intervening display, and ISI was reliable, F(4, 84) = 6.65, p < .001, partial η² = .24. No other interactions were reliable. Figure 2B shows the interference effects caused by the presence of the letter displays as a function of ISI in the single and dual task blocks (subtracting accuracy when letter displays were present from accuracy when letter displays were absent), and Figure 2C plots the interference effect caused by the additional memory task, minus any effects caused by the mere presence of the letter displays (i.e., the double subtraction), as a function of ISI. The more positive this value, the stronger the interference from the secondary task, above and beyond any potential masking effects stemming from the additional displays. As can be seen, additional task interference was weak early in the interval, peaked at 250 ms, and then waned again towards the longer ISIs of 500 and 1000 ms. Paired t-tests (two-tailed, uncorrected) confirmed this pattern as they showed reliable interference at ISIs 0 ms, t(21) = 2.14, p = .044, 250 ms, t(21) = 6.57, p < .001, and 500 ms, t(21) = 2.7, p = .014, but not for ISIs 100 ms, t(21) = 1.43, p = .167, and 1000 ms, t < 1. Moreover, interference was reliably stronger for the 250 ms ISI than for the 0 ms and 500 ms ISIs, both ts ≥ 2.4, both ps ≤ .026.
Please note that we used uncorrected t-tests as we were interested in tracing the specific source of the already significant omnibus interaction, rather than searching for any effects per se.
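The double subtraction behind Figure 2C can be sketched as follows. The accuracies are made-up values for three hypothetical participants at a single ISI, not data from the study, and the function names are ours. We assume that testing the double difference against zero is equivalent to the paired comparison reported above.

```python
from math import sqrt
from statistics import mean, stdev


def dual_task_cost(acc):
    """Double subtraction for one participant at one ISI.

    acc maps (block, display) to proportion correct.
    """
    dual_interference = acc[("dual", "absent")] - acc[("dual", "present")]
    single_interference = acc[("single", "absent")] - acc[("single", "present")]
    return dual_interference - single_interference


def t_against_zero(values):
    """t statistic and degrees of freedom for mean(values) versus zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


# Made-up accuracies for illustration only.
participants = [
    {("dual", "absent"): 0.80, ("dual", "present"): 0.70,
     ("single", "absent"): 0.82, ("single", "present"): 0.80},
    {("dual", "absent"): 0.78, ("dual", "present"): 0.72,
     ("single", "absent"): 0.80, ("single", "present"): 0.79},
    {("dual", "absent"): 0.85, ("dual", "present"): 0.73,
     ("single", "absent"): 0.84, ("single", "present"): 0.81},
]

costs = [dual_task_cost(p) for p in participants]
t_value, df = t_against_zero(costs)
```

A positive cost means that the letter task hurt grating memory more than the mere presence of the letter displays did.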
Although Figure 2C indicates a clear peak in net interference around 250 ms, it appears from Figures 2A and 2B that the effect stems partly from the baseline conditions, when no letters were presented in the dual task block, or in the single task blocks when observers could be presented with the secondary task displays, but did not have to remember them. We will return to this after Experiment 2, where a similar pattern emerged.

Secondary task performance
Mean accuracy scores for the intervening letter change task as a function of inter-stimulus interval (ISI: 0, 100, 250, 500, and 1000 ms) are shown in Table 1. A repeated measures ANOVA with the same factor showed no significant effect of ISI, F < 1.

Experiment 2
The results of Experiment 1 suggest that the consolidation of the grating needed up to about 500 ms, with secondary task interference peaking at 250 ms. Experiment 2 served to replicate and extend Experiment 1 by investigating whether the post-stimulus interference was locked to stimulus onset or stimulus offset. To this end, in addition to varying the consolidation time (here between 100, 250, and 500 ms), we also varied the presentation duration of the grating, between 100, 300, and 500 ms. If consolidation is locked to the onset of the memory stimulus, we should see a systematic shift in interference with increasing sample duration. In contrast, if the consolidation is locked to the offset of the memory stimulus, then the same temporal pattern of interference should emerge for all sample durations.
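The two predictions can be made explicit with a small worked example. As an illustrative assumption, the onset-locked account is taken to predict peak interference at a fixed stimulus onset asynchrony (SOA), set here to the SOA of the Experiment 1 peak (300 ms sample + 250 ms ISI). The names are ours.

```python
sample_durations = [100, 300, 500]   # ms, from the text


def soa(sample_ms, isi_ms):
    """Asynchrony between grating onset and letter display onset."""
    return sample_ms + isi_ms


# Assumption for illustration: an onset-locked process peaks at a fixed SOA.
PEAK_SOA = soa(300, 250)   # 550 ms, the Experiment 1 peak
PEAK_ISI = 250             # ms, the Experiment 1 peak expressed as an ISI

# Onset-locked: the ISI of peak interference shifts with sample duration.
onset_locked_peaks = [PEAK_SOA - d for d in sample_durations]
# Offset-locked: the ISI of peak interference is the same for all durations.
offset_locked_peaks = [PEAK_ISI for _ in sample_durations]

for d, on, off in zip(sample_durations, onset_locked_peaks, offset_locked_peaks):
    print(f"sample {d} ms: onset-locked peak ISI {on} ms, offset-locked {off} ms")
```

Under this assumption, onset locking predicts that the peak moves from 450 ms to 50 ms ISI as the sample duration increases from 100 to 500 ms, whereas offset locking predicts a peak at 250 ms throughout.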

Method
Twenty-four participants (24 females, mean age = 19.8 years) took part. The procedure was the same as in Experiment 1, except that we added the factor Sample Duration, with three levels (100, 300, and 500 ms), and reduced the number of ISIs to three (100, 250, and 500 ms) to keep the size of the experiment within limits. Thus, a 2 (Block Type: Single vs. Dual task) × 2 (Intervening Display: Present vs. Absent) × 3 (ISI: 100, 250, and 500 ms) × 3 (Sample Duration: 100, 300, and 500 ms) within-subject design was adopted. Different sample durations were tested between blocks. Participants completed 27 practice trials and six sessions of 81 trials each on three successive days. This resulted in 54 trials per cell when intervening displays were present, and 27 trials per cell when intervening displays were absent.

Primary task performance
The main results are presented in Figure 3A, which shows the accuracy scores as a function of Block Type (single vs. dual task), Intervening Display (present vs. absent), and ISI (100, 250, and 500 ms), separately for each of the three Sample Durations (100, 300, and 500 ms), as well as the average across sample durations. A repeated measures ANOVA revealed significant main effects for block type, F(1, 23) = 12.14, p = .002, partial η² = .35, intervening display, F(1, 23) = 29.32, p < .001, partial η² = .56, and sample duration, F(1, 23) = 5.54, p = .027, partial η² = .2, but not for ISI, F < 1. As expected, memory performance was better in the single task conditions, without intervening letter displays, and with longer sample durations. Importantly, and replicating Experiment 1, the three-way interaction between block type, intervening display, and ISI was reliable, F(2, 46) = 11.58, p < .001, partial η² = .34. Moreover, and interestingly, none of the interactions involving the variable sample duration were significant, all Fs < 1. Figure 3B shows the interference effects caused by the presence of the letter displays as a function of ISI in both the single and dual task blocks (subtracting accuracy when letter displays were present from accuracy when letter displays were absent). Figure 3C then shows the resulting dual task-based interference (i.e., the double subtraction of secondary task interference minus any interference caused by mere intervening display presence) as a function of ISI, for each of the sample durations, as well as averaged across sample durations. One-way ANOVAs on these difference scores revealed an effect of ISI that just failed to reach significance in the 100 ms sample duration, F(2, 46) = 3.09, p = .055, partial η² = .12, but was reliable for the 300 ms and the 500 ms sample durations, F(2, 46) = 3.92, p = .027, partial η² = .15; F(2, 46) = 3.48, p = .039, partial η² = .13.
Paired t-tests consistently revealed reliable interference for the 250 ms ISI, regardless of sample duration, all ts ≥ 2.95, all ps ≤ .007 (at 300 ms sample duration interference was also reliable for the 500 ms consolidation interval, t(23) = 2.26, p = .034, all other ts ≤ 1.76, n.s.). Finally, when averaged across sample durations, interference at the 250 ms ISI was stronger than at the 100 ms and 500 ms ISIs, both ts ≥ 3.84, both ps ≤ .001.

Figure 3: The results in Experiment 2. A) The results of the main task for different consolidation intervals and different conditions. B) The interference effects caused by the presence of the letter displays in the single and dual task blocks. C) The dual-task costs for different consolidation intervals. Error bars denote within-subjects 95% confidence intervals.
As in Experiment 1, Figures 3A and 3B suggest that here too the ISI effect stems at least in part from the baseline conditions, especially the reverse pattern in the single task baseline. Here interference from the letter display (there was no task) was actually less at 250 ms than at all the other ISIs, as confirmed by a significant interaction between intervening display (single task letters present vs. absent) and ISI, F(2, 46) = 6.46, p = .003, partial η² = .22. Nevertheless, there was also still the opposite effect of ISI within the dual task condition, F(2, 46) = 4.78, p = .013, partial η² = .17. This baseline effect is rather puzzling. What we may observe here is the other side of the same coin: Any sensory interference from the secondary display onset is minimized when observers are maximally engaged in the consolidation of the primary grating stimulus. Such consolidation efforts with regard to the primary task may be stronger when visual interference is expected, which, given the mixed design of the current experiments, may have been the case for trials with and without interference. Mixing secondary display present and absent trials within blocks may also cause violations of expectations, which in turn may interfere with memory in as yet unknown ways. In Experiment 3 we therefore blocked the baseline conditions.

Secondary task performance
Mean accuracy scores for the intervening letter change task as a function of ISI (100, 250, and 500 ms) and Sample Duration (100, 300, and 500 ms) are shown in Table 1. A repeated measures ANOVA with the same factors showed no significant main effects nor interactions, all Fs < 1.

Experiment 3
Experiment 2 again suggests that the consolidation of the visual memory occurs around 250 ms, which is when secondary task interference was the strongest. Quite surprisingly, Experiment 2 also indicates that this interference was time-locked to the offset of the memorandum, rather than the onset. Because this was an unexpected finding, we first sought to replicate this pattern in Experiment 3. Second, we wanted to disentangle the previous baseline effects, which may have been caused by the way in which single task trials with and without interfering letter displays were mixed in Experiments 1 and 2. Here we therefore presented these conditions in separate blocks. This resulted in three main conditions: Dual task with intervening letter displays, single task with intervening letter displays, and single task without intervening letter displays (note that due to the blocking of these conditions the dual task without intervening letters condition became obsolete). As in Experiment 2, we expected an effect of ISI on dual task interference in the dual task condition, while in the single task with letter displays present, we may again find the opposite pattern, as maximum attention to the primary memorandum may maximally protect against sensory interference. In the single task without intervening letters we should now find little effect of ISI, as no potential expectations were violated in that condition.

Method
Eighteen participants (18 females, mean age = 19.6 years) took part. Relative to Experiment 2, we chose to run fewer participants and instead to increase the number of trials per cell to reduce measurement error (from 54 to 90). The procedure was the same as in Experiment 2, except that only three conditions were tested, which were moreover fully blocked. This led to a 3 (Block Type: Dual task with intervening letters, single task with intervening letters, and single task without intervening letters) × 3 (ISI: 100, 250, and 500 ms) × 3 (Sample Duration: 100, 300, and 500 ms) within-subject design. Participants completed 27 practice trials and 9 sessions of 270 trials each on five successive days. This resulted in 90 trials per cell.

Primary task performance
The main results are presented in Figure 4A, which shows the accuracy scores as a function of Block Type (dual task with letters, single task with letters, and single task without letters) and ISI (100, 250, and 500 ms), separately for each of the three Sample Durations (100, 300, and 500 ms), as well as the average across sample durations. A repeated measures ANOVA revealed a significant main effect of block type, F(2, 34) = 25.96, p < .001, partial η² = .6, but not of ISI, F < 1, nor of sample duration, F(2, 34) = 2.61, p = .088, partial η² = .13. As expected, memory performance was worse in the dual task condition. Importantly, as in Experiments 1 and 2, the two-way interaction between block type and ISI was reliable, F(4, 68) = 8.28, p < .001, partial η² = .33. Moreover, none of the interactions involving the variables sample duration and ISI were significant, both Fs < 1. Figure 4B shows the resulting dual task-based interference (i.e., the subtraction of secondary task interference minus any interference caused by mere intervening display presence) as a function of ISI, for each of the sample durations, as well as averaged across sample durations. One-way ANOVAs on these difference scores revealed a main effect of ISI for the 100 ms sample duration, F(2, 34) = 3.76, p = .034, partial η² = .18, for the 300 ms sample duration, F(2, 34) = 5.27, p = .01, partial η² = .24, and for the 500 ms sample duration, F(2, 34) = 6.54, p = .004, partial η² = .28. Paired t-tests consistently revealed reliable interference for the 250 ms ISI, regardless of sample duration, all ts ≥ 5.09, all ps ≤ .001. At the 300 ms sample duration interference was also reliable for the 100 ms ISI, t(17) = 2.39, p = .029, and 500 ms ISI, t(17) = 2.37, p = .03; all other ts ≤ 1.88, n.s. Finally, when averaged across sample durations, interference at the 250 ms ISI was stronger than at the 100 ms and 500 ms ISIs, both ts ≥ 4.31, both ps ≤ .001.
When assessing the separate blocks, we found a main effect of ISI in the dual task condition, reflecting worst performance at 250 ms, F(2, 34) = 12.48, p < .001, partial η² = .42. Interestingly, the opposite pattern occurred in the single task when the letters were presented, F(2, 34) = 6.44, p = .004, partial η² = .28, resulting in a baseline effect similar to the previous experiments. However, no effect of ISI occurred in the block in which the letters were absent, F < 1, as performance remained essentially flat.
Together, the main finding of Experiment 3 replicates those of Experiments 1 and 2: The secondary task interfered most with the primary memory when presented 250 ms after offset of the memorandum. And again, as in Experiment 2, this effect was time-locked to the offset, rather than the onset of the memorandum. In addition, we find a similar but opposite time course in the single task baseline where the intervening displays were presented without the extra task. Thus, it appears that the primary memorandum is maximally protected against sensory interference at around 250 ms post offset. In contrast, time per se had little effect, as performance remained constant with ISI in the single task condition without intervening letters.

Secondary task performance
Mean accuracy scores for the intervening letter change task as a function of ISI (100, 250, and 500 ms) and Sample Duration (100, 300, and 500 ms) are shown in Table 1. A repeated measures ANOVA with the same factors showed no significant main effects nor interactions, all Fs < 1.

General Discussion
Figure 4: The results in Experiment 3. A) The results of the main task for different consolidation intervals and different conditions. B) The dual-task costs for different consolidation intervals. Error bars denote within-subjects 95% confidence intervals.
Previous studies have come to different conclusions as to whether performing a secondary task during retention of a pattern in VWM results in performance costs, and thus whether active maintenance mechanisms play a continuous role in VWM storage throughout the delay period. We found a case where a secondary attention-demanding task can cause interference during VWM retention, but only transiently, within a relatively short time window centered around 250 ms. Less or no interference was found at both earlier and later time points during the delay period. We interpret these findings as evidence that in VWM, at least for a single grating stimulus as used here, it is primarily the consolidation of the pattern in memory that uses task resources, rather than the continuous maintenance. Presumably, after a strong memory trace has been created, resources can be safely diverted to an intervening task. This raises the question of what these momentarily required task resources are. Many have previously argued for attentional mechanisms to be central to VWM storage (e.g., Awh et al., 2006; Chun, 2011; Cowan, 1998; Gazzaley & Nobre, 2012; Jonides et al., 2008; Kiyonaga & Egner, 2012; Logie, 2014; Olivers, 2008; Theeuwes et al., 2009). We argue here that attention may serve to consolidate a memory into a durable trace by strengthening connectivity patterns rather than by sustained neural firing. This conclusion is also supported by earlier work using the retro-cueing paradigm, in which, after the memory items have disappeared, a cue signals which item is likely to be tested. As in the first retro-cueing studies (Griffin & Nobre, 2003), Rerko, Souza, and Oberauer (2014) found that directing attention to a specific item within VWM improved report for that item. Importantly, Rerko et al. found that this improvement remained even when observers performed another attention-demanding task between the retro-cue and the test, consistent with the idea that attention helps in consolidating a memory, but not in maintaining it. However, we point out that although we chose the secondary task (on letter change detection) to be attentionally demanding, it itself involved a memory component. Hence, the interference with presumed consolidation processes that we found here may have been specific to memory rather than reflecting attention "in general". Specifically, the second task may itself have triggered consolidation mechanisms, which then interfered with consolidation of the primary task information.
Although we cannot exclude this possibility, it remains to be seen what those consolidation-specific mechanisms would be. In our view, attention is a strong candidate to serve exactly this role, as it has been known for decades to enhance memory. In other words, attention may be the de facto consolidation mechanism, but before this is confirmed the argument remains circular.
Interestingly, we found the moment of maximum interference to be tied to the offset, rather than the onset, of the primary stimulus. This would suggest that the process of consolidation only starts when the stimulus disappears. Offset-related memory-based performance has been reported in one previous study: Schmidt and Zelinsky (2011) asked observers to remember a cue as a target template for a subsequent visual search task. They varied both the cue duration and the time between the cue offset and the search display, and found that search performance varied with the latter, but not the former, suggesting that turning a stimulus into a search template started with the offset, rather than the onset, of the cue. Why would memory consolidation be locked to the offset of a stimulus? One reason is that there is no need to commit a stimulus to memory if it is still present in the outside world, as it is available directly to perception (O'Regan, 1992; Rensink, 2002). The stimulus offset then provides the signal to turn remaining iconic activity into a more durable format. This scenario is also consistent with "just in time" conceptions of working memory, which state that observers will avoid using effortful and expensive cognitive processes if tasks can be completed using "cheaper" sensory mechanisms (Ballard, Hayhoe, & Pelz, 1995; Droll, Hayhoe, Triesch, & Sullivan, 2005; Hayhoe, Shrivastava, Mruczek, & Pelz, 2003).
Another important question is why interference peaked at 250 ms. Given the protracted nature of the secondary task, one would expect that it should also interfere at 0 and 100 ms ISIs. On the surface, the optimal consolidation time window estimated here appears very much compatible with earlier estimates from dual-task paradigms, including the attentional blink (Chun & Potter, 1995; Jolicoeur & Dell'Acqua, 1998) and retroactive interference with memory for letters and Japanese characters (Nieuwenstein & Wyble, 2014). In those types of task, too, maximum interference was found to be around 250 ms (though Nieuwenstein & Wyble, 2014, did not measure any shorter durations). Closer temporal spacing of the relevant information in the stream within the attentional blink leads to a phenomenon called lag-1 sparing, where the second task is actually spared from interference despite its close temporal proximity. Explanations for lag-1 sparing vary, but they assume either that both targets are combined within one mnemonic trace (Bowman & Wyble, 2007) or that they share the same attentional enhancement (Olivers & Meeter, 2008). Furthermore, Olivers, Van der Stigchel, and Hulleman (2007) found that the attentional blink can be postponed by presenting sequences of contiguous targets (rather than a single T1), which would be compatible with consolidation being delayed as long as relevant information is coming in. Likewise, then, observers may combine representations or mechanisms when stimuli are presented immediately following each other. However, although the lack of interference at 0 and 100 ms ISI here bears some surface resemblance to such sparing, we note that attentional blink findings have relied on SOA (stimulus onset asynchrony) manipulations rather than ISI. Thus, any blink-related processes were either shifted forward in time relative to ours, or were onset-related rather than offset-related, which makes the timing of the current phenomena and the attentional blink pattern rather different.
One factor is that attentional blink studies featured rapid and masked presentation conditions. It is possible that these data-limited circumstances force observers to start consolidation as soon as possible, rather than as soon as the stimulus disappears. All in all, as long as we do not have a clear idea of the exact timing of the underlying perceptual, attentional, and mnemonic stages involved in the primary as well as the secondary task, we cannot explain why interference peaked specifically at 250 ms ISI, and other setups may result in different windows of interference. What we do claim our findings show is that interference is temporary, and time-locked to stimulus offset.
The finding that interference from the secondary task was only temporary may be due to the fact that only a single object needed to be stored. Storing multiple objects may require each object to be attended in turn for it to be consolidated (Liu & Becker, 2013). There is also evidence that storage of multiple objects may benefit from re-attending (Hardman & Cowan, 2015). Averaged across trials, such serial consolidation or resampling may then be expressed as extended demands for attention during the maintenance period. The 250 ms time frame appears at odds with earlier estimates of VWM consolidation using masked presentation (e.g., Ricker & Hardman, 2017; Vogel et al., 2006). Specifically, Vogel et al. (2006) presented displays of multiple memory items, and found that capacity increased with increasing stimulus-mask intervals. From this performance slope they estimated that each additional memory item adds another 50 ms of consolidation time, thus arriving at a consolidation rate of 50 ms/item. However, we note that this estimate refers to the duration of the consolidation process (i.e., the additional time it takes to consolidate another item), while our estimate pertains to the moment in time when consolidation occurs (or more precisely, the time point at which the two tasks interfere maximally with each other on what we assume to be consolidation). Also, we presented only a single grating per trial, whereas multiple items may be processed partly or entirely in parallel, and this processing may moreover continue beyond the presentation of a mask (Nieuwenstein & Wyble, 2014). Parallel and continued processing would lead to an underestimation of the duration of consolidation. That said, a relatively short process fits with the transient character of the interference we find here.
In conclusion, we provide evidence that, at least under the conditions tested here, VWM only momentarily suffers from a secondary task during maintenance, and thus that attention is only momentarily required during VWM retention. This is consistent with a role in initial consolidation rather than sustained maintenance. Moreover, it appears that this consolidation process is invoked as soon as, but no earlier than, the stimulus itself disappears.

Ethics and Consent
All studies reported here were approved by the Scientific and Ethical Review Board of the Faculty of Behavioral and Movement Sciences of the Vrije Universiteit Amsterdam (VCWE-2016-215). All participants provided informed consent by signing a printed consent form.