Working memory capacity and speech production in L2: evidence from a picture description task

This experimental study investigated the relationship between working memory capacity and measures of L2 speech performance in a picture description task. The main assumption underlying the study was that L2 speaking is a complex cognitive task which is carried out within the constraints of a limited-capacity system, namely, working memory. In this system, there are trade-off effects between the storage and processing functions of working memory, just as in L2 speaking there are trade-off effects among fluency, accuracy and complexity when L2 learners perform under processing pressure.

It would not be too far-fetched to assume that speaking a language fluently is the ultimate goal of most L2 learners and yet, as surprising as it may seem, L2 speaking has received considerably less attention from research than other skills such as reading. One possible reason for this imbalance may be that skills involving language comprehension are more easily assessed than those involved in language production (FORTKAMP, 2000). Nevertheless, given the importance of speaking and particularly the difficulty of speaking an L2 fluently, the effort is worth making.
As a cognitive process, speaking involves many complex subprocesses. One possible way to look at these processes is to adopt the information processing approach, which conceptualizes human beings as autonomous, active, and limited-capacity processors who have a working memory system responsible for online processing and temporary maintenance of information in the performance of complex tasks, such as problem solving, reading and speaking, among others (BADDELEY & LOGIE, 1999). The mental processes involved in the performance of complex tasks compete for the limited attention capacity of working memory, which has to be shared between online processing and storage of relevant information. Working memory (WM) has become such a powerful construct in cognitive research that many authors have attempted to define its characteristics, structures and functions, as becomes evident in Miyake and Shah's book (1999). Most of these models of WM have focused on either its functions or its structures, the latter being more evident in recent studies due to the technology now available for neuroimaging, which allows particular functions of WM to be located in the brain.
Baddeley, by far the most cited author in WM research, conceptualizes WM as a multi-component system that comprises two slave systems (the phonological loop and the visuo-spatial sketchpad) and the central executive, which is responsible for coordinating the two slave systems and allocating attention from a limited-capacity pool (1990). His model of WM is not unitary and focuses more on the functions of WM, although recently he seems to have switched his attention to the possible structures and locations of WM in the brain (1986), suggesting that the frontal lobes play a crucial part in subserving functions assigned to the central executive.
Although a wide range of definitions and conceptualizations of WM can be found in the literature, Miyake and Shah (1999) offer the most all-encompassing definition of the term, based on their study of ten different models of WM. The objective of this study is not to review models of WM, but we feel it is necessary to present at least one model (in this case we opted for Baddeley's, since it is the most cited) and a working definition of the construct. Thus, we will borrow Miyake and Shah's definition because we feel it taps the dynamic nature of WM: "WM is those mechanisms or processes that are involved in the control, regulation, and active maintenance of task-relevant information in the service of complex cognition, including novel as well as familiar, skilled tasks" (p.45).
According to Conway et al (2005), performance on WM span tasks depends on multiple factors, with domain-specific skills such as chunking and rehearsal facilitating storage, and a domain-general capability allowing for cognitive control and executive attention. Working memory span tests are said to predict complex cognitive behavior across domains, primarily because of the general executive-attention demands of the tasks, rather than their domain-specific demands. Daneman and Green (1986) and Daneman (1991) claim that working memory capacity (WMC) is task-specific and varies as a function of how efficient a person is at the task in which WM is involved. They were criticized by the proponents of the domain-free view (TURNER and ENGLE, 1989; ENGLE et al, 1992; ENGLE & ORANSKY, 1999; among others), who, conversely, state that WMC is a stable construct, not variable across tasks.
The ongoing debate over whether WMC is a domain-free or task-specific construct has not been able to shed much light on WM research. It remains to be seen, through empirical investigation, whether the issue is worth pursuing further or should be set aside through a change of focus in the research agenda. The view of WMC as a source of individual differences in L1 acquisition and development is already indisputable (JUST & CARPENTER, 1980; DANEMAN & GREEN, 1986; TURNER & ENGLE, 1989; CONWAY & ENGLE, 1996; ENGLE et al, 1999; KANE et al, 2000). A growing number of researchers have begun to see WMC as a possible independent constraint on the processes of second language acquisition as well, since in L2 there is an extra load imposed on the system, affecting speed and quality of acquisition and processing.
Aiming at investigating the relationship between WMC and L2 speech production, Fortkamp (1999) set out to verify whether WMC would correlate with fluent L2 speech, by replicating Daneman's (1991) study. The measures used to assess WMC were the speaking span test (SST) and the reading span test (RST), both in L1 (Portuguese) and L2 (English); the measures used to assess fluency were the Speaking Generation Task (SGT), the Oral Slip Task (OST) and the Oral Reading Task (ORT). Results showed no significant correlations between the SST in L1 and in L2, nor between the SST and the RST in both languages. However, the SST in L2 correlated significantly with the SGT, indicating that larger WM capacity corresponds to faster speech rate. In sum, the findings of Fortkamp's (1999) study lend partial support to the task-specific view of WMC and suggest that speakers seem to draw on different pools of cognitive resources when L1 and L2 speech are produced.
With the aim of expanding on her previous study, Fortkamp (2000) set out to investigate individual differences in WMC and their relationship with the production of fluent, accurate, complex and lexically dense L2 speech. In order to measure WMC, an SST and an operation-word span test (OWST) were used, both adapted to L2, following Daneman (1991). The participants' speech production was elicited through a picture description and a narrative task. The rationale behind the decision to use two measures of WMC was to see which of these measures correlated best with L2 speech production. Her assumption was that if the SST correlated better with L2 speech production, then the task-specific view would be supported in her study. Conversely, if the OWST correlated better with L2 speech production, then she would have more evidence for the domain-free view of WMC. Unfortunately, she had methodological problems with the OWST and was not able to use its data in the analysis due to ceiling effects.
Results from her study showed a significant correlation between individuals' WMC and fluency, accuracy and complexity. However, against her assumption, no significant correlation was found between WMC and weighted lexical density.
Other researchers have also looked into the relationship between WM and L2 speech production, showing mixed results. D'Ely (2004) looked at the relationship between WMC and L2 performance in various domains, one of which was speech production. Surprisingly, despite using the same WMC measures used by Fortkamp (2000), D'Ely did not find significant correlations between WMC and fluency. Weissheimer and Fortkamp (2004) looked at the role of strategy use and practice in WMC and found positive and significant correlations, thus corroborating the predictive power of the SST.
As can be seen from the studies reviewed above, more systematic research is needed to shed light on the relationship between WMC and L2 speech production, especially given the contrasting evidence found in the few studies carried out with this goal. Aiming at investigating this relationship, we partially replicated Fortkamp's (2000) study using two measures of WMC, namely, the SST and the OWST, and four measures of speech production (fluency, accuracy, complexity and weighted lexical density), adapting her OWST to avoid ceiling effects. Thus, one of the objectives of this study was to verify which span test correlated best with L2 speech production measures, so as to gather more evidence for either the task-specific or the domain-free view of individual differences in working memory capacity.
The main assumption supporting the present study is that L2 speaking is a complex cognitive task which is carried out within the constraints of a limited-capacity system, namely, working memory. In this system, there are trade-off effects between the storage and processing functions of working memory, just as in L2 speaking there now seems to be sufficient evidence for trade-off effects among fluency, accuracy and complexity when L2 learners perform under processing pressure (FORTKAMP, 2000; BYGATE, 2000). To this end, the following research question was put forward: Is there a relationship between working memory capacity and L2 speech production measures in a picture description task?
To answer this general question, five hypotheses were raised:
1. There is no relationship between measures of WMC (SST and OWST) and L2 speech production in terms of fluency, accuracy, complexity and lexical density in a picture description task.
2. There is a relationship between measures of WMC (SST and OWST) and L2 speech production in terms of fluency, accuracy, complexity and lexical density in a picture description task.
3. There is a relationship between measures of WMC (SST and OWST) and/or fluency, accuracy, complexity and lexical density in a picture description task.
4. There is a relationship only between the SST and/or fluency, accuracy, complexity and lexical density in a picture description task.
5. There is a relationship only between the OWST and/or fluency, accuracy, complexity and lexical density in a picture description task.

Twelve intermediate-level EFL learners (6 males and 6 females) participated in this study. They were graduate students taking part in an experimental group at the Federal University of Santa Catarina (UFSC). They had all been pre-tested before joining this group to ensure that all participants had the same L2 proficiency level. One of the researchers taught the group during the entire semester and was responsible for collecting the data, which was done individually with each participant, in the following order: the picture description task, the Speaking Span Test and the Operation Word Span Test.

Data collection and analysis
In order to investigate the relationship between individual differences in working memory capacity and L2 speech production, two working memory tests were used, namely, the speaking span test (SST) and the operation-word span test (OWST). The assumption underlying the use of the former is that it taps a more task-specific ability, whereas the latter taps a more domain-free aspect of L2 speaking. Moreover, when measuring an abstract construct such as WMC, it is methodologically safer to use multiple measures (CONWAY et al, 2005).
The SST used in this study followed Daneman and Green's SST (1986) and was adapted to L2. The test consists of 60 unrelated words arranged in 3 trials of sets of two, three, four, five and six words, which the subjects read silently. At the end of each set, subjects were required to produce a sentence aloud for each word presented. Each sentence had to use the word in its original form, and the sentences had to follow the order of presentation.
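The arithmetic behind the set structure can be checked with a short sketch: three trials, each containing sets of two to six words, add up to exactly 60 words. The function name `build_sets`, the placeholder word list and the assumption that sets appear in ascending size within each trial are illustrative only; they are not taken from the test materials.

```python
# Set sizes within each trial and number of trials, as described in the text.
SET_SIZES = (2, 3, 4, 5, 6)
N_TRIALS = 3


def build_sets(words, set_sizes=SET_SIZES, n_trials=N_TRIALS):
    """Split a flat word list into trials, each made of sets of increasing size.

    Assumption (not stated in the source): sets are ordered by ascending size
    within each trial.
    """
    needed = n_trials * sum(set_sizes)
    if len(words) != needed:
        raise ValueError(f"expected {needed} words, got {len(words)}")
    remaining = iter(words)
    return [
        [[next(remaining) for _ in range(size)] for size in set_sizes]
        for _ in range(n_trials)
    ]


# Placeholder stimuli; the real test used unrelated English words.
words = [f"word{i:02d}" for i in range(1, 61)]
trials = build_sets(words)
total_words = sum(len(word_set) for trial in trials for word_set in trial)
print(len(trials), [len(word_set) for word_set in trials[0]], total_words)
```

Each trial holds 2 + 3 + 4 + 5 + 6 = 20 words, so three trials account for the 60 words mentioned above.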
The OWST used consisted of 60 operation strings and 60 English words, following Turner and Engle (1989). It was also adapted to L2 and controlled for ceiling effects (FORTKAMP, 2000). In the adapted version of the test, participants were required to calculate each operation and speak the result into the microphone while trying to memorize the word following the operation. The speaking test consisted of a picture description.
In this study, four measures of speech production were investigated, namely: fluency, accuracy, complexity and lexical density. Fluency was assessed in terms of unpruned speech rate (including self-repetitions and corrections), which was calculated by dividing the total number of words produced by the total time (including pausing time), expressed in seconds, that the participants took to complete the task. The resulting figure was then multiplied by 60 to express the number of words produced per minute. Accuracy was calculated by counting the number of errors per 100 words. Complexity was operationalized as the number of dependent clauses divided by the time (in seconds) taken to accomplish the task, and the resulting figure was then multiplied by 60 to express the number of dependent clauses per minute. Finally, weighted lexical density was calculated by counting the number of grammatical and lexical items in the speech sample. Lexical and grammatical items were divided into high-frequency and low-frequency items; the low-frequency items were given one point, whereas the high-frequency ones were given half a point. The total number of weighted lexical items was then divided by the total number of weighted linguistic items and multiplied by 100, so as to obtain the percentage of weighted lexical items over the total number of weighted linguistic items in the speech sample. All speech production measures were calculated following Fortkamp (2000).

This section presents the results of the statistical analysis carried out to address whether there is a relationship between WM capacity and L2 speech production in terms of fluency, accuracy, complexity and weighted lexical density in a picture description task. It is divided into three main subsections. Section 5.1 reports the descriptive statistics of the SST and the OWST. Section 5.2 presents the descriptive results for L2 speech production measures. In section 5.3, the correlational results for the WM capacity and L2 speech production measures are reported. Finally, section 5.4 offers a general discussion of the findings.

This subsection presents the descriptive statistical results of two different variables that might influence L2 speech production in terms of fluency, accuracy, complexity and weighted lexical density: the SST and the OWST. Table 5.1 reports the mean (M), standard deviation (SD) and the minimum (Min) and maximum (Max) scores for the SST and OWST (see Appendix A for individual scores on these variables).
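As an illustration of the four speech production measures defined in the data analysis description (speech rate, errors per 100 words, dependent clauses per minute and weighted lexical density), the sketch below restates each formula as a small function. The function names, the argument names and the example counts are hypothetical; they are not the study's data, and the original analysis followed Fortkamp (2000), whose exact counting procedures are not reproduced here.

```python
def speech_rate(total_words, total_seconds):
    """Unpruned speech rate in words per minute (pausing time included)."""
    return total_words * 60 / total_seconds


def errors_per_100_words(n_errors, total_words):
    """Accuracy measure: number of errors per 100 words."""
    return n_errors * 100 / total_words


def dependent_clauses_per_minute(n_dependent_clauses, total_seconds):
    """Complexity measure: dependent clauses per minute."""
    return n_dependent_clauses * 60 / total_seconds


def weighted_lexical_density(lex_high, lex_low, gram_high, gram_low):
    """Percentage of weighted lexical items over all weighted linguistic items.

    Low-frequency items count one point and high-frequency items half a point,
    for lexical and grammatical items alike.
    """
    weighted_lexical = lex_low * 1.0 + lex_high * 0.5
    weighted_grammatical = gram_low * 1.0 + gram_high * 0.5
    return weighted_lexical * 100 / (weighted_lexical + weighted_grammatical)


# Hypothetical speech sample: 150 words in 90 seconds, 6 errors,
# 3 dependent clauses, and invented item counts for lexical density.
print(speech_rate(150, 90))
print(errors_per_100_words(6, 150))
print(dependent_clauses_per_minute(3, 90))
print(weighted_lexical_density(lex_high=20, lex_low=30, gram_high=40, gram_low=0))
```

Note that speech rate and complexity share the same denominator (total task time), so a participant who speaks for longer without adding dependent clauses lowers the complexity score even if speech rate stays constant.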

N=12
As can be seen from Table 5.1, the highest score obtained on the speaking span test was 35, with a relatively small standard deviation of 3.64. The minimum and maximum scores on this variable spanned a 12-point range, which indicates that most of the participants performed similarly on this test. This trend can be observed in Figure 1.

Figure 1 Participants' behavior on the SST

In contrast, results on the operation word span test show a maximum of 42 and scores varying along a 26-point range, with a larger degree of variability in relation to the mean (SD = 7.21), which means that most participants' scores on this test tended to be spread across the distribution (far from the mean of 32.75), thus revealing more heterogeneous behavior, as can be seen in Figure 2.

This subsection depicts the descriptive statistical results for L2 speech production measures. Table 5.2 displays the mean (M), standard deviation (SD) and the minimum (Min) and maximum (Max) scores for fluency, accuracy, complexity and weighted lexical density (see Appendix B for individual scores on these variables). As can be observed from Table 5.2, the mean score on the speech rate (SR) variable was high (84.86), with a large standard deviation of 30.33. The minimum and maximum scores spanned a range of 98 raw points, indicating that half of the participants performed above the mean.
The accuracy (ACC) variable seems to have a different profile, since it presented a small mean value (2.51) and also a small standard deviation (1.31). The minimum and maximum scores spanned a range of over 4 points, which suggests very heterogeneous behavior among participants. However, despite this high variability, ACC scores still seem normally dispersed across the distribution, as illustrated in Figure 3.

Regarding complexity, the COM variable produced the lowest mean and standard deviation scores, 0.75 and 0.93, respectively. Another surprising result is that the range between the minimum and maximum scores equals the maximum score itself, 2.68, showing that some participants did not produce any complex language at all (minimum score = 0), as depicted in Figure 4.

Similarly to the SR variable, the weighted lexical density (WLD) variable presented a high mean score (62.46). However, its standard deviation (6.60) was low compared to that of the SR variable. This result seems to suggest that most participants performed below the mean score. The maximum score for the WLD variable was 78.26, more than 23 raw points above the minimum value of 54.88, suggesting heterogeneous behavior among participants. This trend is illustrated in Figure 5.

This section presents the results of the Pearson Product Moment Coefficient of Correlation (two-tailed), computed among SST, OWST, SR, ACC, COM and WLD, in order to address our main research question: Is there a relationship between working memory capacity and L2 speech production measures in the picture description task?
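For readers unfamiliar with the statistic, the following minimal sketch computes the Pearson product-moment coefficient from two paired lists of scores. The function name `pearson_r` and the twelve pairs of scores are invented for illustration; the participants' individual scores appear only in the appendices and are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical scores for 12 participants (NOT the study's data).
span_scores = [23, 25, 26, 27, 28, 29, 30, 30, 31, 32, 34, 35]
speech_rates = [48, 95, 60, 72, 110, 66, 81, 120, 77, 99, 104, 146]
r = pearson_r(span_scores, speech_rates)
print(round(r, 3))
```

The coefficient ranges from -1 to 1; its sign gives the direction of the relationship and its magnitude the strength, which is how the values reported in this section should be read.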
Table 5.3 depicts the correlation between WMC and L2 measures of speech production. As can be observed from Table 5.3, the Pearson Product Moment Coefficient of Correlation shows a statistically significant correlation between speech rate and the speaking span test, r = .632 (N = 12), p < 0.05, suggesting that participants with larger working memory capacity, as measured by the speaking span test, tended to produce L2 speech more fluently. Moreover, the fact that the operation word span test did not correlate with any measure of L2 speech production might be taken as support for the task-specific view of WM when the task concerns L2 speaking.
Similarly, another significant correlation was found between the speaking span test and the operation word span test, r = .691 (N = 12), p < 0.05. This result suggests that, even though the operation word span test did not show any significant correlation with speech production measures, it seems somehow related to the speaking span test, in the sense that both tap participants' memory capacity for processing and storage of information. In other words, both tests seem to measure what they are expected to measure: working memory capacity.
The fact that no other measure of L2 speech production, except for speech rate, correlated significantly with the memory tests (SST and OWST) may be considered evidence for the trade-off effects among different aspects of oral production, as proposed by Bygate (2001) and Fortkamp (2000).
It is also noteworthy that some statistically significant correlations among speech production measures were found. Table 5.4 displays the correlations between speech rate and complexity, and between speech rate and weighted lexical density. As can be noted from Table 5.4, there is a significant positive correlation between speech rate and complexity, r = .746 (N = 12), p < 0.01, suggesting that participants who were more fluent, that is, who produced more words per minute, also used more complex language. However, for the relationship between speech rate and weighted lexical density, a significant negative correlation was found, r = -.621 (N = 12), p < 0.05. This might indicate that, in order to produce more fluent and complex speech, participants had to use more familiar words, thus penalizing their lexical density.
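The significance levels attached to the four coefficients reported in this section can be checked by converting each r into a t statistic with N - 2 = 10 degrees of freedom, t = r * sqrt(N - 2) / sqrt(1 - r^2), and comparing its absolute value with the standard two-tailed critical values of t for 10 degrees of freedom (2.228 at the 0.05 level and 3.169 at the 0.01 level). The sketch below is a worked check under that standard procedure; the names `t_from_r` and `significance_level` are illustrative, and the critical values are hard-coded for N = 12 only.

```python
import math

N = 12  # number of participants in the study

# Two-tailed critical values of t for df = N - 2 = 10 (standard t table).
T_CRITICAL_DF10 = {0.01: 3.169, 0.05: 2.228}


def t_from_r(r, n):
    """Convert a Pearson r based on n pairs into a t statistic with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def significance_level(r):
    """Return the strictest level (0.01 or 0.05) reached for N = 12, else None."""
    t = abs(t_from_r(r, N))
    for alpha in (0.01, 0.05):
        if t > T_CRITICAL_DF10[alpha]:
            return alpha
    return None


# Coefficients as reported in the text for Tables 5.3 and 5.4.
reported = {
    "SST vs SR": 0.632,
    "SST vs OWST": 0.691,
    "SR vs COM": 0.746,
    "SR vs WLD": -0.621,
}
for pair, r in reported.items():
    print(pair, round(t_from_r(r, N), 2), significance_level(r))
```

With only 12 participants, a coefficient has to exceed roughly .58 in absolute value to reach the 0.05 level, which helps explain why moderate correlations in a sample of this size do not register as significant.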
In sum, these results corroborate several studies in the literature (FORTKAMP, 2000; SKEHAN, 1998; FOSTER and SKEHAN, 1996) in the sense that there are indeed trade-offs among L2 speech production variables. Once L2 speakers favor certain aspects of oral production, others are, consequently, penalized. Unfortunately, this study was not able to address the issue of whether the nature of WMC is domain-free or task-specific through the analysis of the span tests used, perhaps due to the limited number of participants, which allowed very little variance in the data.

6. GENERAL DISCUSSION

The assumption underlying this paper is that speaking is a complex cognitive task which is carried out under the constraints of a limited working memory system and that once attentional resources from this system are allocated to certain aspects of speech production, the remaining capacity is not enough to cover other aspects. Results of the present study seem to corroborate this assumption, since only one instance of significant correlation was found between WMC and L2 speech production measures, namely between the SST and SR. It might be that, in order to be able to produce fluent speech, participants had to direct their attentional resources towards faster oral production, thus penalizing other aspects of L2 speech production. Speaking fast requires a lot of control and attention. Participants were probably left with few resources to allocate to the production of more accurate, complex and/or lexically dense speech.
Moreover, these findings appear to be in line with Foster and Skehan's (1996) and Skehan's (1998) claims that when L2 speakers perform under some information-processing pressure, they are likely to favor specific goals at the expense of others, thus indicating the existence of trade-off effects among speech production variables as a function of individual differences in working memory capacity.
Another important finding related to the trade-off effects in speech production concerns the positive significant correlation found between complexity (COM) and speech rate (SR) and a negative one between speech rate (SR) and weighted lexical density (WLD).Once again it is possible to claim that because L2 speech processes seem to place an extra load on speakers' cognitive system and because they possess a limited working memory capacity, it is likely that when aiming at speaking more fluently and using more complex language structures, L2 speakers will need to penalize other aspects of the skill, in this case, lexical density.
On the other hand, besides the role played by working memory in the performance of complex cognitive tasks, it seems that, in the present study, the lack of statistically significant correlations between span tests and speech production measures may be due to methodological reasons. That is, it might be the case that the picture selected to elicit participants' oral production did not present the appropriate visual stimuli needed to trigger more accurate, complex and, more specifically, lexically dense speech. Particular properties of the picture, such as abstractedness and fuzziness, might have inhibited production, leaving participants without much to say (see the picture used in Appendix D).
It is also important to highlight that only the SST correlated significantly with speech rate (SR), which supports our hypothesis 4, thus providing evidence in favor of a task-specific view of working memory. In other words, participants with larger working memory capacity tended to outperform the lower spans in speaking tasks due to their greater efficiency in speech production processes. According to Daneman and Carpenter (1980), working memory capacity depends on one's processing efficiency at the specific task to which WM is being related, in this case, speaking.
Given that the findings of the present study speak for the existence of trade-off effects among speech production variables as a function of individual differences in working memory capacity, which, in turn, seems to be task-specific, some issues concerning both areas deserve full attention. First, the temporal organization of speech needs to be understood as variations in continuity and speed (VERHOEVEN, PAUW and KLOOTS, 2004). This variable, according to the authors, may be influenced by non-linguistic variables such as gender, age, social status and emotional factors. Studies in this field have shown that men speak faster than women, the elderly speak more slowly than youngsters, people in higher-ranking professions speak more slowly than those in lower-ranking professions, and increases in stress levels are related to faster speech. Hence, it seems plausible to suggest that in future research these variables should be controlled either during the selection of participants or through proper statistical analyses. Another suggestion would be to operationalise speech rate differently, by considering it as flow of delivery without interruptions (pauses) and/or number of self-corrections, for instance. In this case, the fewer the pauses and self-corrections, the more fluent the speaker would be.
Second, if working memory capacity is in fact task-specific and processing efficiency is the reason why L2 speakers perform better, then it would be interesting to design and carry out a study in which the span test applied requires only the processing function of working memory and then correlate these measures with measures of L2 speech production, so as to have a better understanding of the relationship between memory and speaking.
Finally, further data collections with different pictures could be carried out, so as to minimize the effects of particular properties of pictures and the possible lack of sufficient visual stimuli. Different methods of eliciting speech, such as video-cued and listening-cued narratives, would also be interesting for examining the relationship between speech perception and production and working memory capacity.
With regard to the limitations of the present study, it is necessary to mention the small sample size, resulting in little variation in scores on the SST and the OWST, consequently not allowing us to carry out an analysis directly contrasting high and low spans. The lack of variation in memory span scores might have contributed to the lack of significant correlations with speech production measures.

Participant 11 -Juliana Well, I can see eh… one, two, three, four people in a … a….crossroad… and they are very eh… in very different conditions and… are moving and there are a lot of people, trees too, and… uh… seems, it seems that there are music playing and….A very fun music, but there are a man that.. there a man who is…. who is in the ground eh.. looking for a girl, a girl….Who is.. with another man, and maybe they are … eh… they are trying to talk about something .. and… all of the other people are happy, uh… they seem happy, and it is a beautiful day, very sunny, eh… one car stopped uh… I think… the street was closed to cars, and… there are people with few clothes, and other people with more clothes, and…this… this picture seems to be a.. a…a….commercial, a commercial picture of a… I don't know, a jeans or… a… a… I think it is … it's a… a jeans.
Participant 12 -Bruno Well…uh… I can see the… I see in… the whole picture brings me the idea of… uh… young people… young people dancing… and…of course it's a picture that is advertising something, cause.. uh…it would be impossible to go out on the streets and see people dancing and so happy like this… probably it would be some kind of problem, some… somebody fighting because the… the traffic is cut out…and… it's a… it's a very beautiful picture but.. uh… advertising picture.. uh…an unreal picture in my opinion….eh…we have… we have some, a kind of… coreography…coreography scene…of dancing… it's young people who dress eh… young clothes… and colored clothes…and in the main…the main.. the main plain…eh…it's five dancers… that are in a kind of pose… a kind of… eh… pose but brings the idea of movement…cause they.. they are dancing, dancers… and they… they are in some kind of pose.
Eh, in the background we have eh… a lot of young people too… who dress young clothes too, and colored clothes, and they… they are making the background of the scene, of the main scene, the main plain… cause they are…they are dancing too and.. it's a kind of .. it's a kind of harmony in their movement, some kind of… something that was prepared….the…the background, the city … when, where… they are dancing… it's… shows a…a kind of big city, because we have big buildings, eh,..building with three or four or more floors, and.. the… the place where they stay is the… the corner of two… the cross of two… streets… uh… we have… uh…at the … at the right… eh… a building of three floors, and a … a flag with … …with the name Diesel… eh…it would bring me the idea of… that's the.. eh… the thing they are trying to sell.. so you have uh.. the building and people in this building who make part of the scene, of the background, and who are happy too, eh… this building is some kind of historical building, eh… and… have a lot of ve… vegetation too, in the background.On the other side, on the left side, we have a… a house, of one floor, who it's a… maybe it would be a kind of… in my opinion a kind of restaurant, because we have this… we have this… brings me the idea of a restaurant because of the type of … uh… edification…and, uh… more in the background, above this…this house… we have a building like… uh.. a typical building of a … a big city, like Florianopolis…a building that has.. that has more than five floors and… is not very detailed but is a common…uh… a common building… so is… that's it.

5.1 Descriptive Statistics for WMC measures

Figure 2 Participants' behavior on the OWST

Figure 3 Participants' oral behavior on the ACC variable

Figure 4 Participants' oral behavior on the COM variable

Figure 5 Participants' oral behavior on the WLD variable

Table 5
* Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).

… there's a lot of people in this picture, uh… we have people dancing, people … dancing, and they are happy … uh … also, it's a picture from a … it's a ad from a … brand of … clothes, Diesel … uh … people, beautiful people … uh … there is uh, trees and … two buildings … and maybe a … coffee shop … uh, there is a car, and …it's a beautiful day, it's a sunny day … it's hot … there is a … light … for cars … I don't know the name … and … that's it, I don't know what to say.

in front of a… uh… building, but I don't know what city, uh… it is, probably is not in Brazil, cause uh… the adverstising is about uh… the future, a musical, a musical to believe in.I don't know this, what mark it is, uh…, I think is something about clothes or … another kind of wear .. that I don't… I don't know, I know uh… the marks I know is more usual than uh… the future.Uh… yeah.Uh… I'm seeing… the people are raising uh… their … arms in..., and they are keeping uh… their arms in the top, uh.. some of them are dancing, and … other are talking, talking not, sorry, uh,… walking, and looking to, at to each other…but picture… I believe is the uh…all I can see.

Participant 10 -Aline Well, here, the.. there is a.. a street, I don't know where, what's the place… probably not in Brazil…uh.. and there is a street with a lot of people … and.. crazy… (laughs)..and… they are probably dancing.. and there is only one car… and… there are trees and… apartments….People are, uh, their arms… in the air, everybody, everybody… … I see a lot of colors here… ok, that's it.