Are bilingualism effects on the L1 byproducts of implicit knowledge? Evidence from two experimental tasks

Experimental studies in Linguistics rely on data from human participants performing language tasks. Therefore, understanding the constructs that such tasks tap into is fundamental for the interpretation of results yielded by experimental work. In the present study we address issues raised by a previously published study, based on a timed grammaticality judgment task, that failed to replicate reported evidence of cross-linguistic interaction effects in bilingual processing of argument structure constructions that are not part of the bilinguals' L1 construction repertoire. Although the timed grammaticality judgment task has been argued to be a valid measure of implicit linguistic knowledge, we review recent psychometric studies that challenge this assumption by showing that this task either does not tap into implicit knowledge at all, or does not tap into it as completely as online processing psycholinguistic tasks do. In the present study, we conducted two experiments with the same pool of subjects. One of the experiments employed an online processing task, and the other employed a timed grammaticality judgment task. In our tasks, sentences in Brazilian Portuguese that emulated the linguistic behavior of the English resultative construction were the target items. We report results that show a mismatch in the observations yielded by the two task types, with only the online processing task revealing apparent L2 effects on performance in the L1. We interpret our results by suggesting that the locus of cross-linguistic interactions in bilingual language processing is mostly related to implicit processes.

Revista de Estudos da Linguagem, Belo Horizonte, v.25, n.3, p.1685-1716, 2017

Keywords: implicit knowledge; cross-linguistic influences; psycholinguistic tasks; bilingualism; resultative construction.

Introduction
The study of human cognitive processes, which includes language and language processing, is challenged by a central problem for those who conduct it as an empirical scientific enterprise. Cognitive scientists often build up hypotheses about processes and architectures that are not liable to direct observation and measurement. Consequently, a crucial part of the job in this field of scientific enquiry is the establishment of reliable and consistent connections between observable facts (e.g., overt behaviors, response accuracy and type, response latencies, eye movements, deflections in the registration of electrochemical waves produced in the brain, etc.) and the mental states, processes, and traces that the observable facts are hypothetically assumed to instantiate. Thus, advancement in cognitive science depends heavily on the enhancement of the validity of alleged homomorphisms between measures of observed behaviors and measures of the constructs (or latent traces) of which such behaviors are construed as surface manifestations (WILSON, 2005).
Psychometrics is the discipline that targets the development and employment of methods and techniques for construct validity analyses. In experimental work in Linguistics and Psycholinguistics, the critical importance of construct validity is especially highlighted. In this branch of language studies, experiments typically involve the collection and codification of instances of the behavior of human participants who are asked to engage with controlled language tasks. Among such tasks, grammaticality or acceptability judgments play a significant role, rivaled only by tasks designed to capture the real-time processing of strings of linguistic units. The grammaticality judgment task is construed as an offline measure, that is, a measure that captures the output of completed language processing. However, specifically in second language research, it has been proposed that time-ceiling manipulations for responses in grammaticality judgments would yield measures of distinct types of knowledge repositories and distinct levels of cognitive control (ELLIS, R., 2005; BOWLES, 2011). Accordingly, allowing more time for judgment calls would make the task tap into explicit, declarative knowledge about language facts and more controlled processing, both of which would be unavailable should the task be administered in a mode that gave participants but a few seconds for their responses (usually 4 to 6 seconds). In such strictly timed grammaticality judgment tasks, the measure would be of implicit linguistic knowledge.
The interaction between languages in the bilingual mind, and the necessary control upon it that bilinguals must exercise to function in one of their languages, is a central issue in the psycholinguistics of bilingualism (BIALYSTOK et al., 2009). Such cross-linguistic interactions and their control have been hypothesized to be modulated by cognitive and contextual factors in ways that could respond to differences in ultimate attainment in L2 learning. They have also been hypothesized to correlate with a number of possible cognitive advantages bilinguals would show over the lifespan, such as enhanced metalinguistic ability, and higher accuracy and speed of executive functions (BIALYSTOK et al., 2009). A number of studies show that cross-linguistic interactions are pervasive in bilingualism.
Cross-linguistic interactions have been shown to emerge among high L2 proficiency bilinguals in the form of higher tolerance for argument structure constructions that would be either rejected or processed with much difficulty by monolinguals. Fernández and Souza (2016), as well as Souza (2014) and Fernández, Souza and Carando (2017), have documented observations of this phenomenon employing tasks that deal with language processing in comprehension and production. Such observations have led the authors to hypothesize that the phenomenon is not restricted to a temporary and highly localized processing lapse. Rather, the authors argue that bilinguals exhibit a certain degree of innovation in their overall linguistic representations, that is, in their overall linguistic competence in both L1 and L2. Fernández, Souza and Carando (2017) propose that such bilingual innovations might be one of the psycholinguistic mechanisms behind long-term, gradual contact-induced language change. But in a study that employed a timed grammaticality judgment task, Souza, Soares-Silva and Silva (2016) did not replicate a similar heightened tolerance for L2-like argument structure constructions in the L1 with a pool of equally high L2 proficiency Brazilian Portuguese-English bilinguals. The authors suggest that their findings could indicate that bilingual cross-linguistic effects are actually evanescent, failing to last long enough to emerge in the timed grammaticality judgment task. This interpretation is ultimately incompatible with the suggestion of bilingual innovations in overall linguistic representations.
However, recent psychometric studies have challenged the assumption that grammaticality judgment tasks and online processing tasks in experimental second language research can be regarded as measures of similar constructs (VAFAEE et al., 2016; KIM; NAM, 2016). Specifically, these studies suggest that even timed grammaticality judgments fail to tap implicit linguistic knowledge, which is the kind of representation largely assumed to underlie fluent, automatic language processing. Rather, these psychometric studies suggest that judgment tasks, irrespective of higher or lower temporal ceilings for response, tap into explicit linguistic knowledge.
Based on the assumption that bilinguals tend to have enhanced metalinguistic ability when compared to monolinguals from a very early age (BIALYSTOK, 2001; BIALYSTOK et al., 2009), the research question that motivates the present study is whether a timed grammaticality judgment task and an online processing task are actually capturing different phenomena. In order to answer this research question, we had the same pool of participants perform two experimental tasks. Half of our participants were monolingual speakers of Brazilian Portuguese, and half of them were bilingual speakers of Brazilian Portuguese and English with high proficiency in the L2. One of the tasks consisted of a procedure to measure the cost of online sentence processing, the maze task (FOSTER et al., 2009), and the other task was a timed sentence judgment task. Based on the studies we mentioned above, the hypothesis we sought to test was that a bilingualism effect, namely cross-linguistic influences, would be present in the measure that tapped into implicit knowledge (the online processing task), but would be absent in the timed sentence judgment task.
In the following section, we discuss the relevance of implicit linguistic knowledge and implicit learning for bilingualism studies, and we also review psychometric studies that evaluate the construct validity of the timed grammaticality judgment task as a measure of implicit knowledge. We then move to the description of the linguistic focus of the present study: the English resultative construction. Although the surface form of this construction overlaps with an argument structure pattern licensed in Brazilian Portuguese, the resultative reading is only linked to the English construction. In this study, we focus on the behavior of bilinguals in the two tasks with respect to sentences that forced the resultative reading into the surface form available in the Portuguese language. After the linguistic analysis of the resultative construction, we present the details of the study design, and we proceed to the presentation of our results. We finish this report with a discussion of our interpretation of the results, and with a conclusion in which we point towards future directions.

Implicit linguistic knowledge and learning, and its measurement
The hypothesis that the human cognitive architecture is supported by two relatively dissociable systems of learning, and therefore two systems for knowledge storage, has been a central debate in cognitive science for many decades now. Hayes and Broadbent's (1988) distinction of a dual learning system is based on the notion of selectivity. According to the authors (HAYES; BROADBENT, 1988, p. 251), one of the subsystems (or, in their terminology, "modes") is "selective, effortful, and reportable", whereas the other "involves the unselective and passive aggregation of information about the co-occurrence of environmental events and features".
Achieving the capacity to function fluently in a second language, especially after a first language is well established, is by all means a daunting cognitive task. Becoming bilingual after primary language acquisition involves adjustments in linguistic representation and processing that cannot be exclusively traced to any given learning and teaching situation, procedure, or strategy. In our opinion, it therefore comes as no surprise that the hypothesis of a dual system for learning and knowledge representation should provide a useful conceptual tool in psycholinguistic attempts to account for high levels of achievement in second language acquisition.
To the best of our knowledge, the first theoretical account of the nature and development of second language ability to rely on the concept of a dual system of representations was Stephen Krashen's model (for example, KRASHEN, 1994). The model, dubbed the Input Hypothesis, actually consisted of five interconnected hypotheses, among which the "acquisition" versus "learning" hypothesis1 lies at the core. This core hypothesis predicts that there are two distinct and non-commutable processes subsuming second language development. According to the hypothesis, "acquisition" refers to a process that is necessarily subconscious and incidental, and which leads to tacit linguistic knowledge. On the other hand, "learning" is construed as the output of intentional attempts at building knowledge about the organization and functioning of the L2 linguistic system, that is, it involves explicit metalinguistic formulations of some sort. For Krashen (1994), attainment of L2 competence and fluent L2 performance derives exclusively from acquired (i.e., subconscious, tacit, implicit) linguistic representations. Krashen's model predicts no interface between the two representational subsystems, a point made clear by the doubts cast by the author on the efficacy of overt grammatical instruction and corrective feedback for the development of L2 proficiency (KRASHEN, 1994, p. 50-54).
Whereas Krashen's ideas were largely committed to the hypothesis that the acquisition of L2 competence was driven by innate mechanisms, other developments in cognitive psychology and psycholinguistics have conceptualized implicit learning and implicit representations in language within a framework that accommodates usage-based perspectives on language acquisition and processing.2 Like Hayes and Broadbent (1988), Reber (1989) and Winter and Reber (1994) define implicit learning as an individual's capacity to extract regularities from patterned stimuli in the environment, without consciousness of the learning task or reflective action towards it. Both Reber (1989) and Winter and Reber (1994) rely on experiments on miniature artificial grammar learning as the empirical base to argue for an inductive, automatic and mostly unconscious cognitive architecture that allows for probabilistic generalizations to be made for newly processed input based on previous experience with instances.
In Winter and Reber's (1994) definition there is no specification that implicit learning should be necessarily conceptualized as strictly incidental, that is, implicit learning is not necessarily cost-free with respect to cognitive resources, namely attentional allocation. The role of attention in language learning as a whole, and its role in implicit learning in particular, has been a controversial issue among psycholinguistics-oriented second language researchers. A strong version of the notion of implicit learning could be defined as strictly incidental, as discussed above. This would predict that implicit learning might involve learning without attention. But as reviewed by Schmidt (1995, 2001) and by Robinson et al. (2014), research findings have not supported such a strong version. Although language learning may take place without overt intention and without availability of any conscious recollections of learning effort, the currently available evidence does not support claims that learning might take place without attention to the linguistic input available to the learner. This distinction is framed by Schmidt (1995) as a separation between learning without awareness (a possible operational definition of implicit learning), and learning without attention, a hypothetical mode of learning whose actual existence has not been substantiated by empirical findings. Schmidt (2001) argues that attention is an umbrella concept for a multicomponent cognitive function that can include subsystems such as alertness, orientation and processing selectivity (subsuming both activation and inhibition of information). In other words, according to Schmidt (2001), even though attention to linguistic patterns may unfold unavailable to introspection and independently of intentionality, it is qualitatively different from preconscious detection, as it requires controlled cognitive processes. If some threshold level of attention to properties and patterns in the L2 linguistic system is paramount for learning, then the emergence of L2 linguistic representations in the bilingual mind is, to quote Schmidt's (2001, p. 29) words, "a side effect of attended processing" of the L2.
Allocation of attentional resources over language processing routines can in turn also be a cognitive operation supported by implicit knowledge. N. Ellis (2006a) proposes a view of fluent language use as derived from a rational architecture by way of which speakers optimally and implicitly learn the distributional and associative probabilities of their languages, thus achieving processing efficiency that finely converges against the linguistic input. N. Ellis (2006b) further proposes that L2 learning is driven by the same rational learning procedure, despite the usual shortcomings of naturalistic L2 acquisition, in which learners often fail to demonstrate acquisition of highly frequent features of the L2 despite intense exposure to the L2 input. The author suggests that the apparent failure to acquire features of an L2 may stem from implicitly learned attention, which results in the blocking of detection of linguistic cues in the weaker language that compete with key cues in the stronger language.
N. Ellis' (2006b) hypothesis is that one's very language learning history builds up implicit representations that guide language users as to what to attend to in the course of overall language processing. Learned attention is fine-tuned to one's language experience, and it may in turn constrain one's capacity to promptly attend to, and therefore come to represent, features in a new language. This hypothesis has been empirically tested in studies by N. Ellis and Sagarra (2010, 2011), showing both short-term and long-term blocking effects of having learned to attend to adverbial cues in a known language on refocusing attention to verbal morphology cues when dealing with a new language. The authors argue that their findings pose challenges to explanations of L2 variability that evoke maturational decline in language acquisition capacity, interpreting such findings as evidence that the commonly reported failure of L2 learners to achieve native-likeness across the full range of L2-specific features may actually result from the implicit entrenchment of processing routines from the learners' previous language experience. That seems to be an interesting alternative to the hypothesis of a sudden and biologically determined halt in the brain's capacity to acquire new languages after a given age, as it seems to accommodate both the evidence of age effects on ultimate L2 attainment (LONG, 2013) and the growing body of evidence for human neuroplasticity and continued learning capacity despite increased processing demands over the lifespan (RAMSCAR et al., 2014; PAJAKA et al., 2016).
In view of the relevance of the notion of implicit L2 learning and representation, it is not a surprise that the development and validation of measures of implicit knowledge is a major concern for second language and bilingualism scholars. R. Ellis (2005) reports a study in which a psychometric operationalization of the constructs of implicit and explicit L2 knowledge was suggested. The study gathered data on 17 English grammatical constructions from 111 participants by way of five distinct tasks: an untimed grammaticality judgment task, a timed grammaticality judgment task, an oral imitation task, an oral narrative task, and a measure of participants' capacity to verbalize linguistic rules (thus a metalinguistic capacity measure). The author reports having found two distinct measures across the tasks in his study. His factor analysis3 yielded results showing that the scores in the metalinguistic capacity measure and in the untimed grammaticality judgment task loaded on one factor (which he describes as explicit knowledge), whereas scores in the timed grammaticality judgment task and in the two oral tasks involving production loaded on another factor (which he describes as implicit knowledge). The fundamental difference between the tasks tapping into each knowledge base, according to Ellis (2005), is the processing pressure imposed by the time constraints of the task at hand.
Ellis' (2005) study also showed a difference between the knowledge pool tapped by the grammatical and the ungrammatical sentences, specifically in the untimed grammaticality judgment task. The author suggests that when participants were given sufficient time to rely on explicit linguistic knowledge, accurate rejection of ungrammatical sentences loaded on the explicit knowledge factor.
Other studies support Ellis' (2005) proposal that timed grammaticality judgment tasks at least partially tap into implicit representations. For example, Bowles (2011) replicated Ellis' (2005) results using the same test batteries adapted for Spanish. Furthermore, Bowles (2011) reports an effect of language learning history that is convergent with the assumption that the tasks described by Ellis (2005) lead to processing that mostly relies on different knowledge repositories. In her study, Bowles included two groups of English-Spanish bilinguals, one formed by participants who learned Spanish mostly through classroom instruction, and another formed by Spanish heritage speakers. The classroom learner group performed higher in the tasks tapping into explicit knowledge, whereas the heritage speaker group showed the opposite pattern. The overall pattern of Ellis' (2005) findings was also replicated in Godfroid et al. (2015), a study in which sentence reading patterns were observed through eye-tracking in combination with timed and untimed grammaticality judgment tasks. In Godfroid et al. (2015), comparisons were made between native speakers and non-native speakers of English performing the two types of judgment tasks. The authors report that the type of reading found in the ungrammatical items during the untimed task among non-native speakers distinguishes the nature of the processing of such stimuli from the others. The authors interpret this result as evidence of reliance on explicit knowledge, or some similar kind of controlled cognitive process.
Notwithstanding, in recent years there has been a growing controversy as to whether manipulations of time constraint for judgment calls are sufficient to ensure that the grammaticality judgment task can be taken as a valid measure of implicit knowledge. For example, Gutiérrez (2013) reports a factor-analytic study of both timed and untimed judgment tasks in which only the grammaticality or the ungrammaticality of the sentences, not the time constraint, loaded on two distinct factors. The author interprets these two factors as explicit knowledge for ungrammatical sentences in both timed and untimed judgments, and implicit knowledge for grammatical sentences, again irrespective of time constraints. Kim and Nam (2016) conducted a study to further verify the nature of representations that may be tapped into by distinct task formats, specifically comparing the timed grammaticality judgment task (which relies on receptive processing) and the oral elicited imitation task (which relies on speech production),4 notably two tasks that have been identified as tapping into implicit knowledge in previous work. The authors found that the two task types do not load on the same factor, with the production task imposing stricter demands on performance. Kim and Nam (2016) interpret their results as indicating that although the timed grammaticality judgment task may be taken as at least partially a measure of implicit knowledge, responses to such a task might not cover the ultimate level of complexity in the network of implicit representations that subsidize language processing for production. Such a network of representations should include form-meaning pairings, pragmatic entailments, collocational restrictions, and probably, in the specific case of bilingualism, cross-linguistic correspondences. Kim and Nam (2016) argue that their findings reveal that even if the timed grammaticality judgment task taps into some implicit knowledge, it may not tap into the same strength of implicit knowledge as that which actually guides real-time language processing.
4 Trials in this experimental task typically involve the experimenter reading the stimulus out loud to the participant, who then follows up with an oral repetition of the stimulus. For a discussion of the properties of the task and its validity as a measure of implicit linguistic knowledge, see Erlam (2009).
Vafaee et al. (2016) also investigated the validity of timed and untimed grammaticality judgment tasks as measures of implicit or explicit knowledge vis-à-vis online processing tasks (namely, self-paced reading and a word monitoring task). Through detailed factor analyses, the authors reject the hypothesis that even a timed grammaticality judgment task is a reliable measure of implicit knowledge. Vafaee et al. (2016) argue that the very nature of online psycholinguistic tasks, which capture language users' sensitivity to violations and other linguistic features as comprehension unfolds, is likely to minimize the chances of access to conscious linguistic knowledge. On the other hand, the authors argue that the very nature of a grammaticality judgment task, which quite clearly leads its participants to focus on linguistic forms rather than on comprehension, is likely to invoke explicit knowledge. According to Vafaee et al. (2016), even though the imposition of strict time constraints may make activation of explicit knowledge harder, it is not possible to rule out that such a knowledge base is somehow at stake in judgment calls.
All in all, while it is largely accepted that fluent L1 and L2 users rely on implicitly represented knowledge to obtain efficient performance in language use, it is not yet clear that any form of linguistic judgment task provides researchers with a reliable measure of such implicit representations, at least with respect to bilinguals' L2 knowledge. Therefore, bilingualism studies that employ psycholinguistic online processing measures may be capturing phenomena that are not readily comparable with observations from experimental designs relying mostly on judgment task performances. We now turn to the linguistic focus of the present study, which was specifically designed to explore this issue.

The Resultative Construction
In the present study we compare the behavior of Brazilian Portuguese (BP)-English bilinguals and BP monolinguals towards an English argument structure construction, namely the resultative construction (GOLDBERG; JACKENDOFF, 2004; WECHSLER, 2012; OLIVEIRA, 2016). This construction has as its main characteristic the fact that it expresses resultativity. Sentences (1) and (2), for example, express resultativity because in both we have the idea that <the table> reached the property <dry> as a result of the action <wipe> and this result was not entailed by the verb itself. Nevertheless, only sentence (2) is usually considered an instance of the resultative construction because, as opposed to sentence (1), its meaning is not predictable from its component parts (WECHSLER; NOH, 2001).
(1) Samuel wiped the table until it was dry.
There are many syntactic-semantic structures that have been considered instances of the resultative construction. As suggested by Goldberg and Jackendoff (2004), sentences (3)-(9) can all be considered part of the resultative construction family. In (3) and (4), we have sentences that have a transitive verb that can co-exist with the internal argument irrespective of the presence of the resultative predicate. In (3) the result of the action is described by an AP and in (4) by a PP. (5) and (6) also have a transitive verb. However, different from (3) and (4), they cannot co-exist with the internal argument without the resultative predicate due to semantic restrictions. (5) and (6) also have an AP and a PP as a resultative predicate, respectively. In (7) and (8) we have resultative sentences that are formed by an intransitive verb. (7) has an AP as its resultative predicate, whereas (8) has a PP. Finally, in (9) we have a resultative sentence whose internal argument is a reflexive pronoun. Even though these sentences vary with regard to some linguistic properties, they all express results that are not entailed by the verb itself and, hence, are considered instances of the resultative construction. In this study, we are going to focus on sentences that have the syntactic-semantic structure of (2) and (3), which are usually classified as the true resultative construction (LEVINSON, 2007). As we can observe in the aforementioned examples, these resultative sentences are formed by an external argument, an atelic transitive verb, an internal argument and a resultative predicate that generates telicity. The difference in telicity between the true resultative construction and its verb is one of the main characteristics of this construction. At first sight, one may think that any AP that can express result can be part of the true resultative construction. However, the resultative predicate has to be an AP that not only expresses the result of the action, but also indicates the limit of the action, which does not have an implicit endpoint. The relation between verbs and adjectives in this construction involves a homomorphic mapping between the temporal structure of the event described by the verb and the scalar property described by the adjective, as discussed by Wechsler (2012). According to the author, maximum endpoint adjectives seem to be the class that best fits such a role in the true resultative construction since they express the limit of a scale (e.g., dry = 0% moisture). In (2), by way of illustration, the table is wiped until it goes down all the way on the moisture scale and reaches the limit <dry>.
Native speakers of English seem to be very sensitive to those restrictions on the resultative predicate. Oliveira (2014) conducted an acceptability judgment task with the magnitude estimation paradigm to observe whether participants could distinguish true resultative constructions formed by maximum endpoint adjectives, such as (10), from true resultative constructions formed by other adjective types, such as (11). The results indicate that the first type of sentence had a mean acceptability of 0.73 on a 0-to-1 scale, whereas the second type had a mean acceptability of 0.43. Therefore, there is empirical evidence that the true resultative construction does impose these restrictions on the resultative predicate and that native speakers are sensitive to them.
(10) One of the classrooms was very dirty, so Desiree swept it clean.
(11) ??Tara bought a new table, but her crazy brother punched it broken.
Oliveira (2016) points out a set of other restrictions that have been observed in this structure. Verbs and adjectives that come from Romance languages, for example, do not usually form resultative sentences. Moreover, past participle adjectives or adjectives with more than two syllables also tend to be unlicensed. Also, the resultative predicate cannot be topicalized and the construction cannot predicate on its external argument. These restrictions seem to be very peculiar to the English true resultative construction, which makes it an interesting topic for studies about second language acquisition and other phenomena related to bilingualism where the L2 is English.
There have been many proposals of possible resultative constructions in BP, but they all violate some of the basic rules of the resultative construction, as shown by Oliveira (2016). Most of these proposals include sentences with telic verbs or resultative predicates that are not formed by an AP or a PP. On that basis, the author contends that the resultative construction is not part of the BP grammar. Thus, the acquisition of this argument structure construction by BP-English bilinguals can be considered the acquisition of a new construction.
In order to study how the acquisition of the resultative construction can influence the L1, it is necessary to ensure that bilinguals indeed acquire the resultative construction. Oliveira and Souza (2012) and Oliveira (2013) show that BP-English bilinguals with high levels of proficiency exhibit acceptability ratings to resultative sentences indicative of successful learning. The latter study also indicates that bilinguals with lower levels of proficiency may not have learned the resultative construction. Therefore, in order to analyze possible effects of the resultative construction acquisition on the L1, we have to investigate bilinguals with high levels of proficiency.
In addition to comparing monolinguals' and bilinguals' behavior towards the resultative construction, we also analyze how these groups behave towards the depictive construction. The depictive construction has the same surface syntactic pattern observed in the resultative construction, namely NP-VP-NP-AP, but the AP is not mapped to a resultative reading. As argued by Pylkkänen and McElree (2006), whereas the AP in the resultative sentence in (12) indicates the result and endpoint of the action, the AP in the depictive sentence in (13) indicates the state of the internal argument during the action. More importantly, the depictive construction is licensed not only in Portuguese-English bilinguals' L2, but also in their L1. For this reason the depictive construction was included in the present study as a control for the elicited behaviors concerning the L2-only resultative construction. In fact, the BP syntactic counterparts of both (12) and (13) have a depictive reading and are licensed, as illustrated in (14) and (15) respectively.
In the next section we provide details of our design for the empirical component of the present study.

Methods
As discussed above, there is growing evidence of a distinction between the constructs tapped into by acceptability judgment tasks and online processing tasks, with only the latter yielding valid measures of implicit processes. The ultimate focus of the present study is to investigate whether such a latent construct difference can account for the failure reported by Souza, Soares-Silva and Silva (2016) to replicate bilinguals' departure from L1 restrictions when processing argument structure constructions in the L1 that are only productive in their L2. It must be recalled that Souza, Soares-Silva and Silva's (2016) observations were made by way of a timed acceptability judgment task. In order to pursue the present investigation, we planned two experimental tasks (which we refer to as Experiment 1 and Experiment 2 from now on) to be administered in a within-subjects design. Thus, we sought to compare the responses elicited by the two types of task from a single pool of participants.
Experiment 1 aimed at measuring the processing cost of sentences that forced an L2-specific construction, namely the resultative construction, into BP, for both BP monolinguals and BP-English bilinguals immersed in the L2. In order to do so, participants performed a maze-task (FORSTER et al., 2009), which is similar to the self-paced reading paradigm with regard to the fragment-by-fragment sentence presentation. The major difference between the two techniques is that in the maze-task participants have to choose, at each sentence fragment, between two options (one leading to a coherent increment to the sentence, the other failing to do so). Therefore, in the maze-task the reaction time (RT) for each sentence fragment reflects how long participants take to select the correct option. The main advantage of this method is that it does not require comprehension questions, does not exhibit spillover effects, and forces incremental processing. FIG. 1 gives an example of how the sentence "Samuel wiped the table clean" would be displayed in a maze-task. Each screen the participants see when reading this sentence is represented.
In order to observe possible differences between bilinguals and monolinguals, we compared their RTs for the APs in the target sentences, which are ungrammatical in BP, but not in English. Furthermore, we compared bilinguals' and monolinguals' RTs for the APs in control sentences with the depictive construction, which is licensed in both BP and English. Therefore, in this task the dependent variable was the RT for the APs, analyzed separately for target and control sentences, and the independent variable was the participants' linguistic profile.
Experiment 2 aimed at analyzing how participants perceived the acceptability of the same constructions from Experiment 1 under time pressure. In other words, participants performed a speeded acceptability judgment task with a 4-second time ceiling. The 4-second (i.e., 4000ms) time ceiling was based on Souza et al. (2015), which reports an exploratory study showing that the 4000ms time window was about 500ms above the lowest threshold observed for adult native speakers with post-secondary education to make accurate acceptability judgments in their L1. In this task, participants read entire sentences and evaluated how acceptable each of them sounded. In order to observe possible differences between bilinguals and monolinguals, we compared the acceptability ratings they assigned to the target sentences, instances of the resultative construction, and to the control sentences (as previously stated, instances of the depictive construction). Thus, in this task the dependent variable was the acceptability ratings for the target and control sentences and the independent variables were the participants' linguistic profile and the constructions instantiated by target and control sentences.

Participants
A total of 43 people participated in both Experiment 1 and Experiment 2. Their mean age was 26 and they were college students or had higher levels of education. As tested by Oliveira (2016), performing the speeded acceptability judgment task after a maze-task with similar target structures does not seem to bias participants' behavior as a result of order effects. Twenty-seven participants were monolinguals or had only basic knowledge of an L2, and they were all residents of the Belo Horizonte metropolitan area. Sixteen participants were BP-English bilinguals with high levels of L2 proficiency and were residents of the Boston metropolitan area. These bilinguals had been living in the United States for longer than 10 years, but still considered BP their dominant language. All bilinguals reached the highest level of the Vocabulary Levels Test (NATION, 1990). The Vocabulary Levels Test has been empirically shown, with Brazilian Portuguese L1-English L2 speakers, to yield scores that reliably correlate with proficiency test scores based on tasks tapping into grammatical knowledge and comprehension skills in the L2 (SOUZA; SOARES-SILVA, 2015). We therefore understand this test to be an effective diagnosis of bilinguals' levels of proficiency.

Materials
Both experiments used stimuli in Brazilian Portuguese, the participants' L1. The maze task had 58 experimental items, 10 of which formed the training session. All the sentences were grammatical, except for the 8 target items, whose resultative predicate (AP) is unlicensed in BP. These target items, exemplified by (16), forced the resultative construction using a structure suggested by Oliveira (2013). All items had two clauses. The first one had an NP in the subject position, a verb in the past tense, and a direct object NP with a definite article. The second one had a coordinating conjunction, another verb in the past tense, a clitic pronoun as direct object referring to the direct object of the previous clause, and an ungrammatical AP that forced a resultative reading. The two NPs in each sentence differed from each other in terms of gender in order to decrease the possibility of ambiguous readings. The 8 control items, exemplified by (17), were instances of the depictive construction and their structure was similar to that of the target items. The only difference between the target and the control items was the interpretation of the AP. More specifically, whereas the AP in the target items had a resultative reading, which made it ungrammatical in BP, the AP in the control items had a depictive reading, which is licensed in BP.
The speeded acceptability judgment task had 111 experimental items [footnote 5], 15 of which formed the training session. The experimental corpus was balanced in terms of grammaticality so that 50% of the sentences were grammatical and the other 50% were ungrammatical. Thus, it was possible to mitigate possible effects related to the repetition of sentences with similar grammatical status. The 8 target items, exemplified by (18), and the 8 control items, exemplified by (19), were similar in structure to the target and control items of the maze task.
In both the maze-task and the speeded acceptability judgment task the items were pseudo-randomized. This procedure aimed at preventing target and control sentences from being displayed in sequence. Therefore, it was also possible to mitigate possible effects originating from the repetition of the same construction and/or the order of presentation.
Footnote 5: More items were included in the acceptability judgment task than in the maze task because of the different nature of those tasks. For the acceptability judgment task, we believe a wider array of different sentence types is needed in order to keep the target sentences from becoming too salient, and therefore raising participants' awareness of them.
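The pseudo-randomization constraint described above (target and control sentences never shown back to back) can be sketched as follows. This is only an illustrative sketch, not the procedure actually configured in the experiment software: the function name `pseudo_randomize`, the item labels, and the assumption that the 32 remaining non-training maze items act as fillers are all hypothetical.

```python
import random


def pseudo_randomize(critical, fillers, rng):
    """Return a presentation order in which no two critical items are adjacent.

    The fillers are shuffled, and each critical item is then dropped into its
    own randomly chosen gap (before, between, or after the fillers).
    """
    if len(critical) > len(fillers) + 1:
        raise ValueError("not enough fillers to separate the critical items")
    critical, fillers = list(critical), list(fillers)
    rng.shuffle(critical)
    rng.shuffle(fillers)
    # Choose one distinct gap per critical item; gap i sits just before fillers[i].
    gaps = set(rng.sample(range(len(fillers) + 1), len(critical)))
    remaining = iter(critical)
    order = []
    for gap in range(len(fillers) + 1):
        if gap in gaps:
            order.append(next(remaining))
        if gap < len(fillers):
            order.append(fillers[gap])
    return order


# Hypothetical labels for the 48 non-training maze items:
# 8 targets, 8 controls, and (by assumption) 32 fillers.
targets = [f"target_{i}" for i in range(1, 9)]
controls = [f"control_{i}" for i in range(1, 9)]
fillers = [f"filler_{i}" for i in range(1, 33)]
order = pseudo_randomize(targets + controls, fillers, random.Random(2017))
```

Placing each critical item in its own gap guarantees the constraint by construction, which avoids reshuffling until a valid order happens to appear.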

Procedures
Participants performed the maze-task (Experiment 1) and, after an interval in which they provided their personal information, they performed the speeded acceptability judgment task (Experiment 2). Both tasks were performed on the same laptop computer. The DMDX software (FORSTER; FORSTER, 2003) was used for stimulus presentation and randomization management. In the maze-task, the software recorded reaction times (RTs) for each segment and, in the speeded acceptability judgment task, it recorded the acceptability ratings given to each sentence.
In both tasks participants were given a set of instructions. In the maze-task, the instructions informed the participants that they should form sentences by choosing, from each pair of words, the option that best suited the sentence being formed. In order to select words, participants used the left-shift and right-shift keys, which were highlighted with colored stickers. In the speeded acceptability judgment task, participants were instructed to assess the acceptability of each sentence on a 5-point Likert scale by using the number keys from 1 to 5. This scale has been argued to be the most suitable for this type of task (SOUZA; OLIVEIRA, 2014). Also, they were instructed to judge the sentences based on the word order used and the words selected, trying to ignore pragmatic aspects.
Both Experiment 1 and Experiment 2 were fully conducted in BP, in order not to encourage the activation of participants' L2, i.e., to keep them in a monolingual mode (GROSJEAN, 2013). Both tasks were preceded by a training session and participants could ask the experimenter questions. In order to avoid possible fatigue effects, participants took a break halfway through each task. In the maze-task, participants had 4000ms to read each pair of words and select the correct option. In the speeded acceptability judgment task participants had 4000ms to read and assign an acceptability rating to each sentence. As stated above, the 4000ms time ceiling was based on the results reported in Souza et al. (2015), concerning the time window within which adults with post-secondary education can make accurate acceptability judgments in their L1.

Experiment 1
Our hypothesis for Experiment 1 was that bilinguals would exhibit shorter RTs for the AP in the resultative construction in comparison to monolinguals. The rationale is that bilinguals would co-activate both the L1 and the L2 and, in turn, they would be able to process the resultative predicate more easily than monolinguals. Since the control items are available both in the L1 and in the L2, we did not expect that they would yield significant differences between bilinguals and monolinguals.
We tested the maze-task target and control item RTs for normality with the Shapiro-Wilk test. The monolinguals' means for the resultative construction by subjects (W=.893, p=.252) and by items (W=.920, p=.433) did not differ from the normal distribution, and neither did their means for the depictive construction by subjects (W=.948, p=.692) or by items (W=.959, p=.796). Similarly, the bilinguals' means for the resultative construction by subjects (W=.874, p=.166) and by items (W=.908, p=.388) did not differ from the normal distribution, and neither did their means for the depictive construction by subjects (W=.956, p=.770) or by items (W=.933, p=.544).
Due to the normality of all the distributions observed, we used Student's t-test for independent samples to compare bilinguals' and monolinguals' mean RTs by subjects and by items for both target and control sentences. The groups' RTs for the AP in the resultative sentences yielded a significant difference by subjects (t1(39)=3.725, p<.001) and by items (t2(14)=4.732, p<.001), but their RTs for the AP in the depictive sentences did not, either in the analysis by subjects (t1(39)=-.584, p=.563) or in the analysis by items (t2(14)=.442, p=.665), as we expected. The participants' mean RTs for both sentence types are illustrated in GRAPH 1. Thus, bilinguals and monolinguals did not exhibit a significant difference as regards the RTs for the APs in the depictive sentences, which are licensed in BP, but bilinguals were significantly faster on the APs in the resultative sentences, which are illicit in BP, but licit in the bilinguals' L2. Therefore, the results suggest that bilinguals exhibit facilitation, possibly originating from access to the L2 representation during online processing of the argument structure pattern not licensed in their L1.
GRAPH 1 - Monolinguals' and bilinguals' mean RTs for the APs in the resultative and in the depictive constructions
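The by-subjects (t1) and by-items (t2) comparisons reported for Experiment 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the analysis script used in the study: the trial records, field names, and the helpers `means_by` and `student_t` are hypothetical, and the RTs are made up. The sketch returns only the t statistic and its degrees of freedom, since obtaining a p-value requires a t-distribution routine from a statistics package.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def means_by(trials, unit):
    """Average RTs within each subject or item, then group the means by profile.

    `unit` is "subject" for the by-subjects analysis or "item" for by-items.
    """
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial["group"], trial[unit])].append(trial["rt"])
    grouped = defaultdict(list)
    for (group, _), rts in cells.items():
        grouped[group].append(mean(rts))
    return dict(grouped)


def student_t(a, b):
    """Student's t for two independent samples (pooled variance): (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Made-up AP reaction times in ms, for illustration only (not the study's data).
trials = [
    {"group": "monolingual", "subject": "m1", "item": "r1", "rt": 1210},
    {"group": "monolingual", "subject": "m1", "item": "r2", "rt": 1150},
    {"group": "monolingual", "subject": "m2", "item": "r1", "rt": 1320},
    {"group": "monolingual", "subject": "m2", "item": "r2", "rt": 1280},
    {"group": "bilingual", "subject": "b1", "item": "r1", "rt": 980},
    {"group": "bilingual", "subject": "b1", "item": "r2", "rt": 1010},
    {"group": "bilingual", "subject": "b2", "item": "r1", "rt": 1040},
    {"group": "bilingual", "subject": "b2", "item": "r2", "rt": 990},
]

by_subject = means_by(trials, "subject")
by_item = means_by(trials, "item")
t1, df1 = student_t(by_subject["monolingual"], by_subject["bilingual"])
t2, df2 = student_t(by_item["monolingual"], by_item["bilingual"])
```

With 8 items per construction, treating each group's 8 item means as an independent sample gives 8 + 8 - 2 = 14 degrees of freedom, which matches the t2(14) values reported above.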

Experiment 2
Our hypothesis for Experiment 2 was that bilinguals would not exhibit different acceptability ratings for the resultative construction in comparison to monolinguals. The rationale is that the timed acceptability judgment task taps into a type of knowledge different from that involved in the maze-task. As discussed above, our hypothesis is that whereas the maze-task taps into implicit knowledge, the speeded acceptability judgment task taps into explicit knowledge. Since the L2 influence on the L1 is an evanescent, temporary and implicit effect (SOUZA et al., 2016), we did not expect to find this effect in the speeded acceptability judgment task. Therefore, we did not expect to find differences between bilinguals and monolinguals in the way they perceive the acceptability of either the resultative or the depictive sentences.
We tested the acceptability ratings for normality with the Shapiro-Wilk test. The monolinguals' means for the resultative construction (W=.841, p<.001) and for the depictive construction (W=.603, p<.001) differed from the normal distribution. Also, bilinguals' means for the resultative construction (W=.740, p<.001) and for the depictive construction (W=.415, p<.001) differed from the normal distribution.
We ran the Mann-Whitney test to compare bilinguals' and monolinguals' acceptability ratings. The acceptability ratings given to the resultative construction did not yield a significant difference between bilinguals and monolinguals (U=9751.5, W=16537.5, Z=-.262, p=.793). Similar results were observed with regard to the acceptability ratings given to the depictive construction (U=9930.5, W=25861.5, Z=-.914, p=.360). The participants' mean acceptability ratings for both sentence types are illustrated in GRAPH 2. Unlike what was observed in the maze-task, bilinguals and monolinguals behaved similarly, as we expected. More specifically, bilinguals' behavior did not suggest an L2-to-L1 influence on the evaluation of the acceptability of the resultative sentences. Thus, the results suggest that the L2 influence on L1 processing does not last long enough to play a role in participants' metalinguistic analysis regarding the acceptability of an argument structure construction that is L2-specific.
GRAPH 2 - Monolinguals' and bilinguals' mean acceptability for the resultative and the depictive constructions
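For readers less familiar with the Mann-Whitney test used for the ratings, the sketch below shows how U, the rank sum W, and a tie-corrected Z can be computed from two sets of Likert ratings. It is a minimal sketch under stated assumptions, not the routine of the statistics package used in the study: the function names and the example ratings are hypothetical, the p-value relies on the normal approximation, and the convention for which group U and W refer to may differ across packages.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney(a, b):
    """Return (U, W, z, p) for sample a: U, rank sum W, tie-corrected z, two-sided p."""
    na, nb = len(a), len(b)
    n = na + nb
    pooled = list(a) + list(b)
    ranks = average_ranks(pooled)
    w = sum(ranks[:na])
    u = w - na * (na + 1) / 2
    # Tie correction: 1-5 ratings produce many ties, which shrink the variance of U.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = na * nb / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all ratings are identical; z is undefined")
    z = (u - na * nb / 2) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))  # two-sided p under the normal approximation
    return u, w, z, p


# Made-up 1-5 ratings, for illustration only (not the study's data).
bilingual_ratings = [1, 2, 2, 3, 1, 2]
monolingual_ratings = [2, 1, 3, 2, 2, 1, 3]
u, w, z, p = mann_whitney(bilingual_ratings, monolingual_ratings)
```

Under this convention W = U + n(n + 1)/2, where n is the number of ratings in the first sample, which is the same relation that holds between the U and W values reported above.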

Discussion
Although the present study was not designed for factor-analytic data treatment, our results do show that a task that taps into more automatic and online processing (the maze task) yields behavior distinct from what was measured in a timed grammaticality judgment task. Therefore, we understand the present study to converge with both Vafaee et al. (2016) and Kim and Nam (2016) with respect to the fact that psycholinguistic measures are likely to capture features of processing that are dissociable from what is measured by grammaticality judgments, even when the time window for responses in the latter is manipulated so as to impose restrictions on reflective analysis of stimuli by respondents. It must be emphasized once again that the present study explored the behavior of the same participants in each of the two experiments, therefore controlling for the possibility that the variability in performance across the two task types resulted from individual differences clustered in one of the participant pools.
Following the arguments in Vafaee et al. (2016), we interpret our observations as reflecting the differences in outcome of processing in a task that involves automatized and largely implicit parsing routines (what we detected through the maze task) as compared with processing in a task that allowed our participants to rely on explicit knowledge of some sort. Such explicit knowledge need not take the shape of metalinguistic descriptive rules about the linguistic behavior of the resultative construction. Even the awareness of the contrastive readings entailed by the surface forms of sentences such as Samuel wiped the table dry and Samuel limpou a mesa seca might have been available for inspection by our participants during the timed grammaticality judgment task they performed.
We believe that our sample of Brazilian Portuguese-English bilinguals was composed of L2 speakers who achieved a reasonable degree of automaticity and proceduralization in parsing L2 constructions. First, the bilingual participants in our sample were all classified in the highest possible level in the Vocabulary Levels Test, a score that has been shown to positively and significantly correlate with independent measures of L2 proficiency that include morphosyntax (SOUZA; SOARES-SILVA, 2015; SOARES-SILVA, 2016). Second, those were bilinguals immersed in the L2 sociolinguistic environment for several years. This fact, taken together with their explicitly measured proficiency in the L2, makes it highly probable that the bilingual population we sampled is characterized by intense exposure to and productive use of the L2. The performance of our bilinguals in the maze task revealed a processing pattern that diverged from what we observed among monolinguals in the critical items of our experiment, and effects of intense usage of the L2 on the altering of bilingual processing of L1 patterns in relation to monolinguals have been documented elsewhere (e.g.: FERNÁNDEZ, 2003; DUSSIAS; SAGARRA, 2007). Ultimately, as discussed above, there is evidence that it is precisely the abstraction of distributional probabilities in the input that characterizes much of the implicit knowledge that takes the shape of grammar.
As discussed above, Souza, Soares-Silva and Silva (2016) suggested that the non-replication in a timed judgment task of the bilingual cross-linguistic effects found in online processing tasks could be due to a temporal decay of such effects. However, the considerations we bring out in the present study lead us to suggest a different perspective on this issue. We propose the alternative hypothesis that, instead of being the consequence of a time factor, the mismatch reported in the authors' study could be a byproduct of the fact that bilingual cross-linguistic interactions may take place mostly at the level of implicit representations and processes, but are actually inhibited and controlled if the task contingencies allow for integration of explicit representations.
Thus, we interpret our observations in the present study as suggesting that the extent to which bilingual cross-linguistic interactions occur is modulated by task type, and that such interactions can be clearer when performance relies on implicit representations, rather than on more controlled processes. Therefore, we assume that the hypothesis that departure from L1 norms may reflect changes in linguistic competence, put forward in Fernández and Souza (2016) and Fernández, Souza and Carando (2017), could be re-stated as changes in the output of procedures that rely on implicit representations. This assumption is subsumed by the notion that linguistic competence does not refer to a stable knowledge repository, but rather to a malleable set of representations that optimally serve the pressures for the efficient resolution of language processing demands at the point of need, and as imposed by the specific contingencies of specific linguistic tasks.

Conclusion
Our observations in the present study are convergent with the results reported in Fernández, Souza and Carando (2017) and Souza (2014) concerning a facilitative bilingualism effect on the L1 online processing of an argument structure construction that is alien to the bilinguals' L1, but productive in their L2. On the other hand, Souza, Soares-Silva and Silva's (2016) observations that such bilingualism effects are not detectable in timed acceptability judgments were also replicated. We interpret these findings in light of recent research findings that challenge the assumption that the reduction of the time ceiling for judgment calls makes the grammaticality judgment task a reliable measure of mostly implicit linguistic knowledge. In other words, we understand our present findings as suggesting that bilingual cross-linguistic effects on the L1 that have been reported in the literature might be specifically salient in processes that rely on implicit linguistic knowledge.
This interpretation seems to be consistent with the notion that implicit processes are automatic, whereas explicit processes are likely to be more controlled. Ultimately, the picture that we tentatively put forward here is that inhibition of properties of a language irrelevant for a given processing task will be more effective in the performance of tasks that are less dependent on the immediate outcome of automatic processes, therefore allowing for more language control. This seems to be the case of the acceptability judgment task, even under severe time restrictions. We believe that such a tentative picture also seems to accommodate the hypothesis that continued automatic activation of non-relevant language features during online processing might in the long run alter linguistic representations that may be accessed even by explicit processes, thus leading to bilingual innovations that may ultimately drive language change, as proposed by Fernández, Souza and Carando (2017).
A limitation of the present study is the fact that we did not employ a specifically factor-analytic design for the treatment of our data. We based our assumption that explicit knowledge was tapped into by the timed grammaticality judgment task on other research findings that show that bilinguals in general have enhanced metalinguistic capacity. However, we understand that more detailed psychometric validation studies of the constructs tapped into by psycholinguistic tasks are an invaluable future direction both for the psycholinguistics of bilingualism in particular and for psycholinguistic research at large. Another limitation of this study is the fact that we did not employ any validated measurement of language use and/or dominance profile with our bilingual participant pool. The lack of such an instrument prevents us from holding any conclusive position concerning whether it was indeed the high level of attained L2 proficiency, rather than emerging drifts towards L2 dominance resulting from immersion in the L2 sociolinguistic environment, that better explains the bilingualism effects on L1 processing we did observe in our online processing task. These limitations notwithstanding, we hope the present study is above all an example of the fruitful collaboration that can be established between experimental psycholinguistics and solid psychometric considerations.

FIGURE 1 - Example of how the sentence "Samuel wiped the table clean" could be displayed in a maze task
Source: Oliveira (2016)
DET paper and it.ACC blow.PST dry
'The kid painted the paper and blew it dry.'
(17) A professora preparou o chá e o bebeu quente.
DET teacher prepare.PST DET tea and it.ACC drink.PST hot
'The teacher prepared the tea and drank it hot.'
DET girl prepare.PST DET coffee and it.ACC drink.PST hot
'The girl prepared the coffee and drank it hot.'