Agreement effects of gender and number in pronominal coreference processing in Brazilian Portuguese

Pronominal coreference is a syntactic dependency in which pronouns are bound to previous referents in discourse. One of the keys to understanding coreference processing is memory, since information that has already been interpreted and stored must be integrated with new material in real time. The aim of this research is to investigate how pronominal antecedents are retrieved from memory, and more precisely to clarify the role of structural constraints, agreement features and decay factors. Since Brazilian Portuguese has rich morphology, speakers of this language can rely on agreement cues as well as structural constraints to resolve coreference. The hypothesis is that candidates that feature-match pronouns will initially influence coreference processing, even though they violate structural constraints, which will only work later in binding processing to help the parser select the most adequate antecedent. An eyetracking experiment was conducted with twenty-four native speakers of Brazilian Portuguese. The results showed that structurally unacceptable antecedent candidates that feature-matched the pronouns in gender and number facilitated coreference processing. It is claimed that they were considered as potential antecedents. Moreover, it seems that memory might be sensitive to differences that exist between singular and plural features. Plural may be more salient in memory due to the fact it is marked in Brazilian Portuguese. Finally, memory can also be affected by decay factors, which, for example, can be responsible for processing difficulties when there is a long distance between antecedents and pronouns.
Revista de Estudos da Linguagem, Belo Horizonte, v.25, n.3, p. 1327-1366, 2017


Introduction
In order to process language in real time, previously interpreted information must be kept at least momentarily in our memory so that integration with novel upcoming material can take place rapidly (LEWIS et al., 2006). In this way, memory can be considered one of the key factors in processing long distance dependencies such as coreference, in which pronouns are bound to antecedents that occupy linearly distant positions in the discourse.
Among other cues, binding can be influenced by structural constraints, agreement relations between antecedents and anaphors, and the salience of the discourse entities involved in the context. Previous research on how those three factors play a role in binding processing has yielded contradictory results. On the one hand, it has been claimed that structurally unacceptable candidates cannot initially influence binding processing even in cases in which they are salient discourse entities and agree with the anaphors (NICOL; SWINNEY, 1989; CLIFTON et al., 1997; STURT, 2003; LEITÃO et al., 2008; XIANG et al., 2009; OLIVEIRA et al., 2012; DILLON et al., 2013; CHOW et al., 2014). On the other hand, other studies have shown that structural constraints can be fallible: structurally unacceptable candidates can be initially considered as potential antecedents if they are salient entities that feature-match the anaphors (BADECKER; STRAUB, 2002; KENNISON, 2003; PARKER, 2014; PATIL et al., 2016).
One possible explanation for these contradictory results regarding the role of agreement in binding processing may lie in the fact that those studies may have overlooked intrinsic differences that exist among morphological features. For this reason, our research tried to control for the different types of features that may exist under the categories of gender (masculine and feminine) and number (singular and plural). In addition, English may not be the most appropriate language for studying agreement, as it is a language with limited overt morphology. By comparing overt agreement marking in English and in Brazilian Portuguese, one notices that, unlike the former, the latter has redundant agreement marking in most determiners, nouns, adjectives, and verbs, for example. Sentence (1) shows how one of the sentences used in one of our experiments in Brazilian Portuguese would be translated into English. The sentence in Brazilian Portuguese (1a) has 17 overt marks, while its translation in English (1b) has only 8.
b) The engineer[sg] investigated the architects[pl] who have[3rd person, pl] stolen him[masc, 3rd person, sg] for a couple of semesters[pl].
Lago et al. (2015) compared agreement attraction in subject-verb dependencies in Spanish (another morphologically rich language, similar to Brazilian Portuguese) and in English. Their results showed that Spanish comprehenders had more processing difficulty with ungrammatical sentences than English comprehenders did. Moreover, Spanish comprehenders, but not English comprehenders, showed processing difficulties in grammatical sentences with plural attractors.¹ The authors explain that since agreement morphology is functionally more important in Spanish than in English, Spanish speakers would rely more on morphological cues in processing sentences. Therefore, agreement predictions would be stronger in Spanish than in English, which leads to a higher cost when the predictions are not fulfilled and reanalysis is needed.
¹ Sample of the materials of Lago et al. (2015).
Experiment in Spanish:
Gram, sg attractor: La nota que la chica va a escribir en la clase alegrará a su amiga. (The note that the girl is going to write during class will cheer her friend up.)
Gram, pl attractor: Las notas que la chica va a escribir en la clase alegrará a su amiga. (The notes that the girl is going to write during class will cheer her friend up.)
Ungram, sg attractor: *La nota que la chica van a escribir en la clase alegrará a su amiga. (The note that the girl are going to write during class will cheer her friend up.)
Ungram, pl attractor: *Las notas que la chica van a escribir en la clase alegrará a su amiga. (The notes that the girl are going to write during class will cheer her friend up.)
Experiment in English:
Gram, sg attractor: The musician that the reviewer was highly praising last week will probably win a Grammy.
Gram, pl attractor: The musicians that the reviewer was highly praising last week will probably win a Grammy.
Ungram, sg attractor: *The musician that the reviewer were highly praising last week will probably win a Grammy.
Ungram, pl attractor: *The musicians that the reviewer were highly praising last week will probably win a Grammy.
Given that the use of agreement cues may be more fruitful in languages with rich morphology like Spanish, the present work aims to investigate how pronouns retrieve antecedents in Brazilian Portuguese, which is also a language with rich morphology. Moreover, it seems that the use of morphological cues in memory retrieval may also vary depending on the particular binding dependency. Agreement features may be more helpful in pronominal antecedent retrieval due to the looseness of Principle B, since it only posits that the pronoun antecedent must not be local [see section 3 of the present paper].
The recognition of a pronoun must initiate a retrospective search for an antecedent. Since the structural relation between a pronoun and its antecedent is almost free, it is natural to assume that a pronoun initiates a cue-based search for an antecedent that shares its person, number, and gender features, and hence it would not be surprising for this search to detect nouns that match those cues, even when they violate Principle B (PHILLIPS; WAGERS; LAU, 2011, p. 171).
In this way, the present research will fill a gap in the literature: it will not only provide one more piece of evidence for the puzzle of binding processing, which has not been intensively investigated, but will also determine whether different types of gender and number features carried by candidates are responsible for differences in the way memory retrieves the antecedents. It will also be determined whether speakers of morphologically rich languages such as Brazilian Portuguese tend to initially consider structurally unacceptable candidates as potential antecedents despite the fact that they violate binding constraints. In addition, pronouns will be our object of study since they might rely more on content cues than reflexives do.
Thus the main aim of this research is to investigate how pronouns retrieve antecedents in Brazilian Portuguese. In addition, its secondary aim is to examine which features can influence memory retrieval the most. The first hypothesis is that candidates that feature-match the pronouns would initially influence coreference processing in Brazilian Portuguese, even though they violate Principle B, and that structural constraints would only work later on in binding processing to help the parser select the most appropriate antecedent (cf. BADECKER; STRAUB, 2002).
In addition, it is hypothesized that memory, and consequently coreference processing, is sensitive to different types of agreement features such as masculine and feminine, for gender, and singular and plural, for number. Since feminine and plural are marked features in Brazilian Portuguese, we expect that they will be more salient in memory, making the antecedent candidates that carry these types of features more easily retrieved. The correlation between the influence of the structurally unacceptable antecedents and the types of features they display is known as the mismatch asymmetry. It seems that structurally unacceptable candidates with marked features are more influential than those with unmarked features (cf. among others, for plural and singular, BOCK; MILLER, 1991; WAGERS et al., 2009; DILLON, 2013).
Finally, it is also hypothesized that memory is affected by decay over time (LEWIS; VASISHTH, 2005; LEWIS; VASISHTH; VAN DYKE, 2006), so that a long linear distance between antecedent candidates and anaphors brings costs to binding processing.
In order to test the hypotheses, an eye-tracking experiment was conducted with native speakers of Brazilian Portuguese. The eye-tracking technique is suitable for our purposes as it enables the researcher to examine the temporal course of language processing, including early (First Fixation Duration²) and late (Total Fixation Duration³) on-line processing measures. This paper is arranged as follows: Section 2 presents the computational model that is commonly used in the literature to explain how memory retrieval operates; Section 3 briefly reviews the structural constraints on coreference called the Binding Principles; Section 4 discusses previous research in the literature; Section 5 introduces the present study; Section 6 discusses the main conclusions of this study, followed by References.

Content-Addressable Memory (CAM)
Content-addressable memory (CAM) (McELREE, 2000; McELREE et al., 2003; van DYKE; McELREE, 2006) is a computational model that has been used recently to explain how memory operates during language processing. In this model, previously interpreted information can be retrieved by a parallel search based on a set of grammatical cues generated by the target. This parallel search in memory can be affected by similarity-based interference and decay factors (LEWIS; VASISHTH, 2005; LEWIS; VASISHTH; van DYKE, 2006). The former occurs when the similarity between items in memory and the retrieval cues increases, reducing the strength of association between the cue and the target item, as a greater number of items will be associated with the cue. Consequently, memory failure rates increase, and distractors, that is, candidates that partially match the cues, can sometimes be retrieved. The latter occurs when the linear distance between the dependent items is increased and the activation of the distant item decays over time, which makes its retrieval more difficult.
Retrieval cues are of several types, including structural, morphological, semantic, and contextual cues (among others). The present paper will focus on only two of them: structural and morphological cues. During the encoding phase, all information is interpreted and stored in memory. By the time the pronoun is encountered, a group of grammatical cues is generated in order to retrieve the antecedent. In the example portrayed in Figure 1, the antecedent must not be local,⁴ and it must be feminine and singular. After that, there is a parallel search in memory and two candidates that are similar to the cues generated by the target are found: "housekeeper" and "princess". The former candidate is a perfect match; the latter is only a partial match, as it is local, but it can still interfere with memory retrieval, the so-called similarity-based interference effect. Candidates like "princess" are called distractors according to the CAM model. In addition, in this example, "housekeeper" can also decay over time as it was stored in memory before "princess", which, in this case, is more recent. Thus, according to this model, distractors such as "princess" can sometimes be erroneously retrieved as antecedents as a result of a failure caused by both similarity-based interference effects and decay factors.
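The retrieval scenario just described can be illustrated with a small sketch. It is not the model used in the studies cited here; it is a toy scoring scheme loosely inspired by activation-based accounts, in which each stored item receives a decay term that shrinks with time since encoding plus a boost for every matching cue, and the boost is diluted when several items match the same cue (interference). The names `Chunk`, `activation`, `retrieve`, and `make_memory`, as well as the parameter values `decay=0.5` and `max_assoc=1.5` and the time units, are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    features: dict  # e.g. {"gender": "fem", "number": "sg", "local": False}
    age: float      # time since encoding, arbitrary units (assumed > 0)


def activation(chunk, cues, memory, decay=0.5, max_assoc=1.5):
    """Toy activation: a decay term plus cue-match boosts diluted by fan."""
    # Decay: older items (larger age) receive lower base activation.
    base = -decay * math.log(chunk.age)
    boost = 0.0
    for cue, value in cues.items():
        if chunk.features.get(cue) == value:
            # Fan: how many stored items match this cue. More matching
            # items means a weaker boost (similarity-based interference).
            fan = sum(1 for c in memory if c.features.get(cue) == value)
            boost += max_assoc - math.log(fan)
    return base + boost


def retrieve(cues, memory):
    """Parallel search: return the item with the highest activation."""
    return max(memory, key=lambda c: activation(c, cues, memory))


# Cues generated by the pronoun: non-local, feminine, singular.
cues = {"gender": "fem", "number": "sg", "local": False}


def make_memory(target_age):
    """Build the two-candidate memory; only the target's age varies."""
    return [
        Chunk("housekeeper", {"gender": "fem", "number": "sg", "local": False}, target_age),
        Chunk("princess", {"gender": "fem", "number": "sg", "local": True}, 1.0),
    ]


print(retrieve(cues, make_memory(4.0)).name)    # target is fairly recent
print(retrieve(cues, make_memory(100.0)).name)  # target has decayed heavily
```

Under these assumed values, the fully matching "housekeeper" is retrieved when it was encoded fairly recently, whereas the partially matching distractor "princess" is retrieved once the target has decayed enough, which mirrors the combination of interference and decay described above.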

Binding Principles
The Binding Theory (CHOMSKY, 1993) posits three principles, A, B, and C, which explain, respectively, the distributional constraints on (a) anaphors (which, according to Chomsky, include only reflexives and reciprocals); (b) pronouns; and (c) free referential expressions. Chomsky (1993) claimed that, depending on the nature of the NPs involved and the syntactic configurations in which they occur, the anaphoric relations can be possible, necessary, or proscribed.
(3) John criticized him.
Chomsky (1993) states that in (2), him can take John as its referent, which cannot happen in (3). According to Binding Principle B, pronouns cannot have a structurally local antecedent. It is noteworthy that this is not a matter of linear distance, as pronouns can actually linearly precede their antecedents, as in (4), a construction traditionally known as cataphora. Moreover, in (5), him cannot refer to John, even though there is a long linear distance between them.
(4) After he entered the room, John sat down.
Thus Chomsky proposed that a pronoun cannot take as its antecedent an element within its [binding] domain. In (3), the domain of the pronoun is the whole sentence; therefore, as John is within the domain of him, it cannot be its referent. On the other hand, in (2) and (4), John is not in the same domain as the pronoun since they are in different clauses.
(8) A referential expression (neither a pronoun nor an anaphor) must be free.
The previous sentences with their indexes are the following:
(9) John_i said Mary criticized him_i.
(10) After he_i entered the room, John_i sat down.
(11) *He_i said Mary criticized John_i.
(12) He_i said Mary criticized John_j.
Example (11) can only be grammatical if he and John have different indexes, as in (12).
(13) If the α index is different from the β index, α cannot be the antecedent of β and vice-versa.
The example in (3) with indexes would be:
(14) John_i criticized him_j.
By comparing (9) and (15), one notices that a pronoun is able to exist within the binding domain of its antecedent. However, it should be highlighted that a pronoun cannot be too close to its antecedent.
(16) A pronoun must be free in a local domain (Principle B).
The local domain is generally the minimum clause that contains the pronoun. Unlike pronouns, which can have a binding antecedent but do not need one, anaphors (reflexives and reciprocals), as in (17), require antecedents to bind them. In addition, the antecedents of anaphors need to be in the same local domain:
(17) John_i criticized himself_i.
(18) An anaphor must be bound in its local domain (Principle A).
Clearly, pronouns cannot be substituted for anaphors:
Finally, Chomsky (1993) postulates the Binding Principles as follows:
(21) Principle A: an anaphor must be bound in its local domain.
Principle B: a pronoun must be free in its local domain.
Principle C: an R-expression must be free.
Principles A and B had as their local domain the minimum clause that contains the anaphor or pronoun. However, the rule of the minimum domain does not account for (22) and (23):
In (22) himself is not bound in its local domain, but the sentence is grammatical, whereas in (23) him is free in its local domain, but the sentence is ungrammatical. The answer to this can be found in government. In examples (22) and (23), the main verb believe governs himself and him due to the fact that to be does not carry inflection. The concept of local domain must therefore be replaced with that of governing category. This way, the governing category in (22) and in (23) is the whole sentence, and Principles A and B are no longer violated. Chomsky (1993) explains that the governing category of α is the minimum Complete Functional Complex that contains α and in which the binding principle of α can be satisfied.
(24) Principle A: an anaphor must be bound in its governing category.
Principle B: a pronoun must be free in its governing category.
Principle C: an R-expression must be free.
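As a rough illustration of how Principles A and B sort antecedent candidates, the sketch below applies them to sentences like (9), (14) and (17). It is a deliberate simplification: the local domain is approximated by a clause index, and c-command, governing categories, agreement features, and Principle C are ignored. The names `NP` and `acceptable_antecedents` and the clause-index encoding are assumptions introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class NP:
    word: str
    clause: int  # index of the minimal clause containing the NP (assumed encoding)


def acceptable_antecedents(dependent, kind, candidates):
    """Return the candidates allowed by the simplified principles.

    kind == "anaphor": the antecedent must be in the same clause (Principle A).
    kind == "pronoun": the antecedent must be in a different clause (Principle B).
    """
    if kind == "anaphor":
        return [c.word for c in candidates if c.clause == dependent.clause]
    if kind == "pronoun":
        return [c.word for c in candidates if c.clause != dependent.clause]
    raise ValueError(f"unknown kind: {kind}")


# "John said Mary criticized him": John is in the main clause (0);
# Mary and him are in the embedded clause (1).
print(acceptable_antecedents(NP("him", 1), "pronoun", [NP("John", 0), NP("Mary", 1)]))

# "John criticized himself": both NPs share clause 0.
print(acceptable_antecedents(NP("himself", 0), "anaphor", [NP("John", 0)]))

# "John criticized him": John is local, so it is excluded for the pronoun.
print(acceptable_antecedents(NP("him", 0), "pronoun", [NP("John", 0)]))
```

In the last case the empty result only means that no sentence-internal candidate is licensed; the pronoun can still refer to an entity elsewhere in the discourse, since pronouns do not require a binding antecedent.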
Psycholinguists have been trying to investigate whether the principles of the Binding Theory rapidly influence binding processing during on-line sentence processing. In the next section, some studies that examined the influence of binding principles in on-line language processing will be reviewed.

Previous research in binding processing
In this section, some previous research on coreferential processing with respect to Principles A and B will be reviewed.⁵ The relationship between these structural constraints and agreement cues in the time-course of binding processing is very controversial in the literature. Therefore, previous research will be presented here under two subsections: works that showed some evidence of the initial infallibility of structural constraints in binding processing; and works that found the opposite, that is, that structurally unacceptable candidates can be initially considered as potential antecedents if they feature-match the anaphoric expressions.

Evidence of the initial infallibility of structural constraints in binding processing
Nicol and Swinney (1989) conducted a cross-modal priming experiment examining the reactivation of anaphoric antecedents. They found that immediately after the anaphor only the structurally appropriate antecedent was reactivated, while the other referents were not significantly reactivated. The results for pronouns were similar to the results for anaphors. Thus, the authors concluded that the reactivation of prior referents is restricted by grammatical constraints. Nicol and Swinney (1989) explained that only when binding constraints do not narrow the list of potential antecedents down to a single one would pragmatic and other sentence or discourse processing procedures come into play, and only at a later point in processing.
Clifton et al. (1997) studied how antecedents of "her" and "him/his" are reactivated. They performed a phrase-by-phrase self-paced moving window experiment contrasting noun phrase (NP) and specifier (SPEC) usages. They also manipulated the morphological number of the subject in each sentence. The authors found faster reading times for the SPEC trials when the number of the subject agreed with the pronoun, which would make it an appropriate antecedent. However, when the subject and the pronouns mismatched in number, there was a slowdown in reading times as the subject was made inappropriate. Importantly, number did not show any effects on NP trials. Thus Clifton and colleagues concluded that, at least initially, binding principles constrain parsing decisions, and that number would work as a filter to determine whether the accessed antecedents are appropriate.
Sturt (2003) addressed two questions: i) to what extent sentence processing is affected by ungrammatical antecedents; ii) to what extent binding principles act like a filter on the final interpretation of a sentence. He conducted an eye-tracking study to investigate the influence of inaccessible antecedents in reflexive binding when they are put strongly into discourse focus. Stereotypical subjects were used in order not to expose participants to ungrammatical sentences. His results showed that binding constraints were applied extremely early (at First Fixation and First Pass reading times). First Fixation and First Pass reading times were faster when the gender of the reflexive matched the stereotype of the accessible antecedent than when it did not, but they did not differ reliably as a function of whether the inaccessible antecedent matched the reflexive.
However, reliable influences of the inaccessible antecedent were found in late measures (Second Pass in the second area after the reflexive). There were longer Second Pass times when the inaccessible antecedent mismatched the reflexive than when it did not. The author concluded that antecedents that were initially excluded by the binding principles could affect processing at a later stage. In other words, binding constraints are applied at an extremely early stage, but they do not act as filters. Sturt (2003) also conducted a follow-up study, a sentence-by-sentence self-paced reading experiment with a comprehension question to check the interpretation of the anaphor referent. It seems that Principle A did not act as an absolute filter on the final interpretation of the sentence either. Sturt (2003) defends the idea that binding principles act like a defeasible filter, as they can be violated at a later stage when a highly focused unacceptable antecedent is involved.
Leitão et al. (2008) investigated the relationship between Principle B and phi-features (gender, number, and animacy) in coreference processing in Brazilian Portuguese in two self-paced reading experiments. In the first experiment, there were structurally unacceptable antecedents in the sentences, and the results showed that the pronoun+1 region (adverb regions) had longer reading times due to the fact that the structurally unacceptable antecedent in the sentence feature-matched the pronoun. However, in the second experiment, there was a structurally unacceptable candidate available in a preamble. Unlike the first experiment, the results of the second experiment did not show any differences among the conditions, although the reading times at the pronoun region were faster when compared to the first experiment. The authors suggested that when there are no structurally acceptable antecedent candidates available, as in the first experiment, candidates that feature-match the pronouns could be considered as potential antecedents even if they violate Principle B. On the other hand, when there is a structurally acceptable antecedent available, as in the second experiment, the search for an antecedent ends faster and the structurally unacceptable candidates are not taken into account.
In an Event-Related Potentials (ERP) experiment, Xiang et al. (2009) studied intrusion effects of structurally unacceptable noun phrases that matched the reflexive. The authors found a P600-like component for both intrusive and incongruent conditions. However, there were no differences between the intrusive and incongruent sentences, while both were significantly different from the congruent ones. Importantly, they found a marginal late intrusion effect only at 800-1000ms, which matches the late effects of inaccessible antecedents in Sturt (2003). The authors concluded that there is no initial intrusion effect for reflexive binding.
Oliveira et al. (2012) conducted a self-paced reading experiment to determine whether Principle A influences reflexive resolution in Brazilian Portuguese. They found that the grammatical conditions, in which the structurally acceptable antecedent agrees in gender with the reflexives, had faster reading times at the reflexive region when compared to ungrammatical conditions. It should be noted that the structurally unacceptable antecedents were not taken into account in any condition, which suggests that Principle A works as a filter, blocking the candidates that violate it.
Dillon et al. (2013) conducted eye-tracking experiments with the purpose of investigating the impact of structurally illicit noun phrases on the computation of reflexive binding. It should be mentioned that they also conducted off-line judgments to check whether the number mismatch in the materials would be reliably rejected. The results of the off-line grammaticality judgment indicated a main effect of grammaticality, confirming this. Likewise, the on-line results showed a main effect of grammaticality in First Pass and in Total Times, with no facilitatory intrusion effects. The authors concluded that initially the feature content of a structurally illicit NP could not affect reflexive processing. Thus they concluded that the memory retrieval mechanism for reflexives primarily uses syntactic information to guide retrieval of the antecedents. It is relevant to the current study that comprehenders seemed to be less sensitive to the feature match when the head noun was plural, which suggests that the feature mismatch effect is sensitive to the markedness of the features involved.
Chow et al. (2014) were concerned with which kinds of constraints initially restrict antecedent retrieval, and which have later effects, working as filters. In their first self-paced moving window experiment they manipulated the gender match between the pronoun "him" and the structurally acceptable main clause subject and the structurally unacceptable embedded clause subject. Relative clauses could also modify the nouns in order to increase the distance between the pronoun and the antecedent. The structurally unacceptable antecedents could be either a common noun or a proper name. As the mismatch conditions had longer reading times, it seems that comprehenders are immediately sensitive to the structural constraints on pronoun interpretation regardless of the similarity between the candidate antecedents and of linear distance. They found robust effects of grammaticality, but no interference effects of any kind. It should be mentioned that when the linear distance between the pronoun and the structurally acceptable antecedent was long in the modified common noun condition, they found a late ungrammatical match effect, that is, when no grammatical antecedent was available, the presence of a feature-matching structurally unacceptable antecedent led to longer reading times. The authors explain that this may have been caused by the fact that the memory representation of the structurally acceptable antecedent had decayed due to the long distance. In their second experiment, Chow et al. (2014) tried to replicate the results found in Badecker & Straub (2002) [which will be discussed in the next subsection] by using identical materials and procedures. However, Chow et al. (2014) failed to do so and only replicated the results of their first experiment. They also conducted three other experiments, but no effects were found. The authors defended the Simultaneous Constraints hypothesis since it appeared that both agreement features like gender and the structural constraints of binding immediately restricted the set of candidate antecedents during the initial retrieval process.

Evidence of the initial fallibility of the structural constraints in binding processing
Badecker and Straub (2002) studied the processing of reflexive and pronoun binding in a series of self-paced reading experiments. According to the authors, coreference processing is influenced by: morphological and syntactic properties of the dependent expression and the antecedents; structural parallelism; causal semantics; prominence and salience of the local discourse entities; and the world knowledge shared about the discourse entities involved. Among these factors, the authors' study focused on morphosyntactic features and local focus of attention. In one of their experiments, they investigated whether the content of structurally inaccessible NPs would influence pronoun processing. They observed longer reading times in the no-match condition than in the accessible match condition. The results also showed faster reading times when there was a structurally accessible antecedent than when there was an inaccessible antecedent. There was no difference between the multiple match and the accessible-match conditions. The authors concluded that gender was automatically used to identify the referent of a pronoun, and that the structurally accessible antecedents were also rapidly accessed. On the other hand, inaccessible candidates were not blocked from the initial candidate set, as they influenced the evaluation process as soon as the pronoun was encountered. Badecker and Straub (2002) also investigated whether number features could shape the initial candidate set. In another experiment, they studied the influence of grammatical number in reciprocal anaphors like "each other", which are also governed by Principle A, as can be seen in (26):
(26) a) multiple match: The attorneys thought that the judges were telling each other which defendants had appeared as witnesses before.
b) single match: The attorney thought that the judges were telling each other which defendants had appeared as witnesses before.
The results indicated longer reading times in the multiple-match than in the single-match condition, but only 3-4 words after the anaphor. The authors suggested that morphological number contributes to identifying the initial set of antecedent candidates. The multiple-match effect was attenuated in this case because, according to the authors, common nouns may not be as effective as proper names in establishing discourse entities. Badecker and Straub (2002) concluded that binding-theory principles do not function as initial filters, as reading times were longer when the grammatically inaccessible NPs agreed in gender (and number) with the pronoun or anaphor. The authors supported the interactive-parallel-constraint model: the initial candidate set is composed of the focused discourse entities that are compatible with the lexical properties of the referentially dependent expression, while the grammatical constraints on interpretation operate quickly and effectively in the process of selecting from among these options.
Kennison (2003) investigated how comprehenders use structural information during coreference resolution of the pronouns "her", "him", and "his". In a self-paced moving window experiment, Kennison (2003) examined the processing of "her" in object position, functioning as either an NP or SPEC as in (27). She found that the type of subject influenced coreference processing in both conditions, including in NP conditions, which is inconsistent with Nicol and Swinney (1989) and Clifton et al. (1997). In SPEC conditions, reading times were longer when the subject was a male name, while in NP conditions reading times were longer when the subject was female. The shortest times were for the conditions with "they". In other words, when coreference could be achieved, there were longer reading times for NP conditions than for SPEC, as SPEC conditions were easier to process. However, when coreference could not be achieved, there was no difference immediately after the pronoun. Later, when gender and number information was accessed, coreference was impeded in SPEC sentences, as reading times were longer for the SPEC than for the NP condition later on in the sentence. Kennison (2003) also replicated the results of "her" with "his". Kennison's (2003) findings contradict Nicol and Swinney (1989) and Clifton et al. (1997), as structurally unavailable antecedents were considered as potential antecedents, since the type of subject influenced reading times. Her findings also contradict Badecker and Straub (2002), as number features appeared to help compose the initial candidate set, while gender mismatch only influenced processing at a later phase. It seemed that the antecedent search ended more quickly when the unavailable candidate differed in number from the pronoun, whereas the antecedent search was longer when the subject of the sentence in NP conditions matched the pronoun in gender.
In another experiment, Kennison (2003) aimed to determine whether subject type would influence processing when the discourse context contained an available antecedent for the pronoun as in (28). The results suggested that when a single highly salient and structurally available antecedent was in the discourse context, structurally unavailable antecedents did not influence coreference, which means that when there is a good fit between the antecedent and the pronoun, the process of searching for an antecedent terminates. It appeared that, on the other hand, when no antecedent is available or when there is not a strong fit between the structurally available antecedent and the pronoun, the process of searching for an antecedent continues, and structurally unavailable antecedents can be considered.
Parker (2014) studied how the parser targets specific information in memory, and how that information is extracted to elaborate the sentence representation. The author studied attraction effects in anaphora resolution, manipulating gender, number, and animacy. The results for 1-feature mismatch conditions showed only a late slowdown in reading times for ungrammatical sentences, and no attraction effects were found. However, for 2-feature mismatch conditions, early and late reading times were facilitated for ungrammatical sentences with attractors when compared to ungrammatical sentences without attractors. Parker (2014) explains that attraction effects are likely to be a consequence of quantitative similarity. Qualitative factors are also important, since structural cues are weighted more strongly in retrieval than morphological cues.
Patil et al. (2016) argued that reflexive binding may be a very informative phenomenon for understanding the role that grammatical and non-grammatical constraints play in memory. The structural constraints of reflexive binding are relatively clear, and this construction admits manipulations of agreement, distance, and distracting antecedent candidates. They created a model and ran 1000 simulations of each of Sturt's (2003) conditions. Just like Sturt (2003), they found that: retrieval errors in mismatch conditions were higher than in match conditions (mismatch effect); retrieval errors for both interference conditions, mismatch and match, were higher than for the other two conditions (match interference effect); and retrieval times for both mismatch conditions were longer than for the two match conditions (mismatch effect). On the other hand, they also found results that were not consistent with Sturt (2003): retrieval times for the match interference condition were shorter than for the match condition and shorter than for the mismatch conditions (mismatch interference effect). Patil et al. (2016) suggested that the unacceptable candidates in Sturt (2003) could not be good attractors, as semantic matching cues are not able to cause attraction if no grammatical cue is involved. In addition, since they were less recently created in the representation, they could not have enough strength in memory to be retrieved, due to decay factors.
Patil et al. (2016) also conducted an eye-tracking experiment. To increase the strength of the inaccessible subject, they used an object pronoun within a relative clause where the inaccessible antecedents were the subject of the clause. Patil et al. (2016) found a significant main effect of interference in First Pass and in First Pass Regression Probability. There was also a main effect of match for Rereading Times and Total Reading Times. Thus their results are consistent with Badecker and Straub (2002), but inconsistent with Sturt (2003), Nicol and Swinney (1989), Xiang et al. (2009), and Dillon et al. (2013). Patil et al. (2016) concluded that non-structural cues are crucial for antecedent retrieval, so that agreement features such as gender must be included in the set of retrieval cues. Moreover, it seems that strict syntactic constraints on antecedent retrieval are inconsistent with their results, which challenged the idea that the parser is infallible for reflexive binding.

The present study
The experiment that is reported here is an eye-tracking study, and its main purpose is to investigate how and when the structural constraints of Principle B and agreement cues influence the way nominal antecedents are retrieved from memory.
In this experiment, participants had their eye movements recorded while they read text on a computer screen. Using appropriate software, the researcher can measure the duration of eye fixations (among other measures). This technique is one of the most efficient means linguists have to study language processing. Moreover, it has advantages over the self-paced reading technique because the text can be presented more naturally to the readers (i.e., without segmentation and button pressing).
According to Just and Carpenter (1980), the duration of eye fixations during sentence processing depends on information complexity, that is, the more complex information processing is, the longer the fixation duration in the area where that information is located. These authors make two assumptions: the first is called the Immediacy Assumption, which claims that language processing is immediate, that is, a word is processed the first time it is encountered; the second is called the Eye-Mind Assumption, which means that the eye remains fixated on a word as long as the word is being processed. The first assumption is still considered true; however, the second assumption is no longer thought to be true, since a word can still be processed when the eyes are fixated on the next word, which is called the spillover effect.
We assume that since overt and redundant agreement marking is often available in languages with rich morphology such as Brazilian Portuguese, speakers will tend to rely strongly on agreement morphology in order to resolve coreference. In congruence with Badecker and Straub (2002), the hypothesis is that candidates that feature-match the pronouns would initially influence coreference processing, even though they violate Principle B, and that the structural constraints of Principle B would only work later on in binding processing to help the parser select the most adequate antecedent. Therefore, main effects of structurally unacceptable antecedents are expected in early eye-movement measures (First Fixation Duration), and main effects of structurally acceptable antecedents are expected in late eye-movement measures (Total Fixation Duration).
In addition, it is hypothesized that memory is sensitive to different types of features. In other words, marked features in the language will be more salient in memory, facilitating memory retrieval. It is therefore expected that, due to their markedness in Brazilian Portuguese, feminine and plural features on structurally unacceptable candidates will cause facilitation effects when compared to masculine and singular. It should be mentioned that the markedness of these features is not inherent to them. Plural, for example, is not marked because of its morphology (morpheme -s) or notional plurality, but because of its grammatical number (STAUB, 2009). Plural and feminine are marked in opposition to singular and masculine respectively because the latter, and not the former, are the default features, which are automatic, frequent, and dominant.
Moreover, we hypothesize that, as memory decays, sentences in which the structurally acceptable antecedent is linearly distant from the pronoun would show stronger facilitation effects caused by structurally unacceptable antecedents, as these might be more easily retrieved from memory as antecedents due to recency (cf. among others, SCHWEPPE, 2013; CHOW et al., 2014).

Participants
Twenty-nine native speakers of Brazilian Portuguese with normal or corrected-to-normal vision participated as volunteers in the experiment. They were undergraduate students of the Federal University of Rio de Janeiro (UFRJ) and were randomly invited to participate in the study. As compensation for their work, they received three hours of Cultural-Scientific Activities (Atividades-Científico-Culturais Discentes, AACC), which are mandatory for their graduation. All participants were naive with respect to the object of study of the experiment and signed a consent form which stated that the task they would perform would not pose any risk to their health and that the results would eventually be published. Of the twenty-nine participants, five were excluded from analysis as they had less than 80% of their eye movements recorded. Therefore, the experiment was analyzed using data from the remaining twenty-four participants, sixteen female and eight male, with a mean age of 22.6 years (ranging from 18 to 30 years).

Design and materials
There were two independent variables in the experiment. The first one was (i) structurally acceptable antecedent matching. In this variable the structurally acceptable antecedent could feature-match/mismatch the pronoun in number. The second one was (ii) structurally unacceptable antecedent matching, and the structurally unacceptable antecedent could feature-match/mismatch the pronoun in number.
Besides the independent variables, there were three controls in the experiment: i) the number of the structurally unacceptable antecedent: half of the sentences contained plural structurally unacceptable antecedents and the other half singular; ii) the gender of the structurally unacceptable antecedent: half of the sentences contained feminine structurally unacceptable antecedents and the other half masculine; iii) the linear distance between the structurally acceptable antecedent and the pronoun: half of the sentences contained long linear distance and the other half short. Although the controls could not be considered independent variables, they were taken into account in the analysis of the experiment.
The experiment had two on-line dependent variables: (i) the First Fixation Duration and (ii) the Total Fixation Duration at the pronoun areas.
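The two measures can be illustrated with a minimal Python sketch. It assumes that a trial is available as an ordered list of fixations, each labelled with the region it landed in; the `Fixation` class, the region labels, and the durations are hypothetical and are not taken from the experiment or from the eye-tracking software's export format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    region: str       # label of the area of interest the fixation landed in (hypothetical)
    duration_ms: int  # fixation duration in milliseconds


def first_fixation_duration(fixations: List[Fixation], region: str) -> Optional[int]:
    """Duration of the first fixation that lands in `region`, or None if it was skipped."""
    for fixation in fixations:
        if fixation.region == region:
            return fixation.duration_ms
    return None


def total_fixation_duration(fixations: List[Fixation], region: str) -> int:
    """Sum of all fixation durations in `region`, including re-fixations after regressions."""
    return sum(f.duration_ms for f in fixations if f.region == region)


# Invented trial: the reader fixates the pronoun, moves on, then regresses to it.
trial = [
    Fixation("subject", 210),
    Fixation("verb", 190),
    Fixation("pronoun", 230),
    Fixation("spillover", 180),
    Fixation("pronoun", 150),
]

ffd = first_fixation_duration(trial, "pronoun")
tfd = total_fixation_duration(trial, "pronoun")
print(ffd, tfd)
```

In this invented trial the early measure reflects only the first visit to the pronoun, while the late measure also includes the re-reading after the regression, which is why the two measures can dissociate.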
Each of the four lists, which were created using a Latin Square, was pseudo-randomized and contained sixteen experimental sentences and thirty-two fillers. Four sentences from each experimental condition were in each list. Each sentence of the experiment was accompanied by an off-line yes-or-no comprehension question. The filler questions were balanced between yes and no answers, while all the experimental sentences had yes answers.
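A Latin Square distribution of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rotation rule, the condition labels, and the function name `latin_square_lists` are ours, and the pseudo-randomization of presentation order and the fillers are left out.

```python
from collections import Counter
from itertools import product

# Four conditions from crossing the two matching variables
# (acceptable antecedent, unacceptable antecedent); labels are hypothetical.
CONDITIONS = list(product(["match", "mismatch"], repeat=2))
N_ITEMS = 16


def latin_square_lists(n_items, conditions):
    """Rotate conditions over items so that each list sees every item exactly once
    and each item appears in a different condition in each list."""
    n_lists = len(conditions)
    return [
        [(item, conditions[(item + shift) % n_lists]) for item in range(n_items)]
        for shift in range(n_lists)
    ]


lists = latin_square_lists(N_ITEMS, CONDITIONS)

for number, experimental_list in enumerate(lists, start=1):
    per_condition = Counter(condition for _, condition in experimental_list)
    print(number, len(experimental_list), dict(per_condition))
```

With sixteen items and four conditions, the rotation yields four lists of sixteen experimental sentences with four sentences per condition, which is the balance described above.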
Each experimental trial contained a structurally acceptable antecedent (masculine/feminine, singular/plural) in the main clause, followed by a structurally unacceptable antecedent, which was the subject of a relative clause, followed by a 3rd person pronoun ("ele/ela/eles/elas"), which was the direct object of the relative clause. An example sentence is given below:
(29) Short distance between the structurally acceptable antecedent and the pronoun:

Procedure
The experiment was conducted at the laboratory of experimental research (LAPEX) at the Federal University of Rio de Janeiro (UFRJ) in Rio de Janeiro, Brazil. The eye-tracking software used in this experiment was Tobii Studio™ TX 300, which requires an individual calibration at the beginning of the procedure so that the eye-tracker can monitor the participant's pupils during the reading task. The participants were instructed to sit comfortably and were given written and oral task instructions. After that, the calibration process started, followed by a short practice session with filler sentences so that the experimenter could check whether the participants understood the task and were performing it at a natural speed. Finally, the experimenter left the participants alone in a quiet room without distraction. Each sentence of the experiment appeared in whole on the computer screen. The participants could read each sentence as many times as necessary; however, they were instructed to read each sentence as fast as they could while also paying attention to meaning. After reading a sentence, the participants pressed the space bar to continue to a comprehension question about the sentence that had just been read. Participants answered by fixating their eyes on one of the options, "Yes" or "No". Each participant was randomly assigned one of the four lists of the experiment. The experiment lasted approximately twenty minutes.

Analysis
The reading time data were extracted using the Tobii Fixation Filter, which is the default fixation algorithm in Tobii Studio™ 2.X, version 2.2. Approximately 19% of the data were lost due to calibration issues. Therefore, due to the small sample of the test, we decided not to perform any outlier trimming. Our raw data were positively skewed and not normally distributed (Shapiro Test: W=0.893, p<0.05 for First Fixation; and W=0.619, p<0.05 for Total Fixation). We believe that this was a consequence of the small sample and the missing data. We did not transform our data to achieve normality, because we decided to analyze the experiment with a linear mixed-effect model (LMM), which is a statistical model that does not use mean data, as it does not average across individual responses, and which can also cope with unbalanced data (LO; ANDREWS, 2015). Moreover, we were concerned that data transformation could bias the results. All dependent variables were within-subjects, and the statistical analysis was carried out in R, using the plotrix, lmerTest, and ggplot2 packages.
Although there were only two independent variables, as mentioned before, we also included the three controlled variables in our analysis in order to reduce the residual error in our LMM. Therefore, five variables were analyzed: a) structurally acceptable antecedent matching (matching or mismatching), b) structurally unacceptable antecedent matching (matching or mismatching), c) number of the structurally unacceptable antecedent (singular or plural), d) gender of the structurally unacceptable antecedent (feminine or masculine), e) distance between the structurally acceptable antecedent and the pronoun (short or long).

Results
Means as well as standard errors of First Fixation Duration and Total Fixation Duration at the pronoun area are reported for each condition in Tables 1 and 2:
(31) Short distance between the structurally acceptable antecedent and the pronoun:
An LMM was created with the help of the lmerTest package with the following fixed effects: i) structurally acceptable antecedent matching; ii) structurally unacceptable antecedent matching; iii) number of the structurally unacceptable antecedent; iv) gender of the structurally unacceptable antecedent; v) distance between the structurally acceptable antecedent and the pronoun. The random effects were: i) participants; and ii) items.
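As a reading aid, the structure of such a model can be written out as follows. This is a sketch under assumptions, not the exact specification that was fitted: the symbols are ours, the random effects are assumed to be intercepts for participants and items, and the interaction terms are abbreviated because their exact structure is not spelled out in the text.

```latex
% y_{pi}: fixation duration of participant p on item i
% A: acceptable antecedent matching, U: unacceptable antecedent matching,
% N: number, G: gender, D: distance (all two-level factors)
y_{pi} = \beta_0 + \beta_1 A_{pi} + \beta_2 U_{pi} + \beta_3 N_{pi}
       + \beta_4 G_{pi} + \beta_5 D_{pi} + (\text{interaction terms})
       + s_p + w_i + \varepsilon_{pi},
\qquad
s_p \sim \mathcal{N}(0, \sigma_s^2), \quad
w_i \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{pi} \sim \mathcal{N}(0, \sigma^2)
```

Here $s_p$ and $w_i$ are the assumed by-participant and by-item random intercepts, which is what allows the model to use every individual observation rather than condition means.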
For First Fixation Duration, using the anova() function on our model, we found a significant interaction between structurally unacceptable antecedent matching and structurally unacceptable antecedent number: F(1, 258)=6.248, p=0.013*. In addition, there was a trend towards statistical significance in the interaction between structurally unacceptable antecedent gender and number: F(1, 265)=3.44, p=0.064; and a weaker, non-significant interaction between structurally unacceptable antecedent number and distance between the structurally acceptable antecedent and the pronoun: F(1, 264)=2.46, p=0.117.
For Total Fixation Duration, a linear mixed-effect model was also created with the help of the lmerTest package. Its fixed and random effects were the same as those of the First Fixation Duration model. By using the anova() function, we found a significant main effect of structurally acceptable antecedent matching: F(1, 254)=4.046, p=0.045*, and a slight trend towards significance in the interaction between structurally acceptable antecedent matching and linear distance: F(1, 253)=3.556, p=0.060.
In order to determine which pairs of conditions were significantly different, bar plots with Tukey test results for 95% confidence intervals (CI) were also created with the help of the ggplot2 package.
Structurally acceptable antecedents that feature-matched the pronouns had faster reading times than structurally acceptable antecedents that mismatched the pronouns. For First Fixation Duration, this difference was not statistically significant (β=-21, CI [-49, 6], p=0.140); for Total Fixation Duration, it was statistically significant (β=-91, CI [-176, 5], p=0.036*), as one can see in Figures 2 and 3 respectively. Figure 4 illustrates that First Fixation reading times at the pronoun area were numerically slower when there was a long distance between the pronoun and the structurally acceptable antecedent than when the distance was short, although this difference did not reach statistical significance (β=18, CI [-9, 46], p=0.187).
Figure 5
For First Fixation Duration, as one can see in Figure 6, two numerical tendencies did not reach statistical significance: a) structurally unacceptable antecedents in the plural that feature-mismatched the pronoun had longer reading times when compared to structurally unacceptable antecedents that matched the pronouns (β=41, CI [-11, 94], p=0.180); and b) singular structurally unacceptable antecedents that mismatched the pronouns had faster First Fixation times than plural structurally unacceptable antecedents (β=-38, CI [-90, 13], p=0.215). Similarly, for Total Fixation Duration, Figure 7 illustrates a non-significant tendency towards longer reading times when the structurally unacceptable antecedent mismatches the pronoun than when it matches (β=57, CI [-27, 143], p=0.186).

Discussion
This research investigated how pronouns retrieve their antecedents in Brazilian Portuguese, a morphologically rich language. It was hypothesized that, due to the redundant overt agreement marking in Brazilian Portuguese, readers would rely strongly on agreement cues in order to retrieve pronominal antecedents. Therefore, we expected that structurally unacceptable antecedents would be initially considered as potential antecedents, even though they violate the structural constraints of Principle B, which would only influence coreference processing later, helping the parser to select the most adequate antecedent from the initial candidate set. The results of the present experiment seem to corroborate that hypothesis, providing evidence in favor of Badecker and Straub (2002). The results of the LMM for First Fixation Duration, which measures early processing, showed effects of structurally unacceptable antecedents and of agreement features such as number and gender, as well as of the linear distance between the structurally acceptable antecedents and the pronouns. However, it should be mentioned that the 95% confidence intervals for our conditions in First Fixation Duration also showed effects of structurally acceptable antecedents, that is, effects of Principle B. On the other hand, our LMM for Total Fixation Duration only showed effects of structurally acceptable antecedents and linear distance, although the 95% confidence intervals for our conditions in Total Fixation Duration also showed effects of structurally unacceptable antecedents.
It is important to note that the 95% confidence intervals showed that structurally unacceptable antecedents that feature-matched the pronouns were responsible for faster coreference processing in both First Fixation Duration and Total Fixation Duration, which might be evidence that these candidates are actually being initially retrieved as antecedents. According to Dillon (2013), facilitatory effects of structurally unacceptable antecedents might be evidence that antecedents are retrieved through a content-addressable memory. In other words, structurally unacceptable antecedents can cause interference effects in memory due to the fact that they partially match the content cues of the pronouns, leading to erroneous retrieval of them as antecedents, which in CAM is known as similarity-based interference.
Therefore, these results provide evidence against the hypothesis that structural constraints work as an initial filter in binding processing, blocking the influence of structurally unacceptable candidates, as claimed by Nicol and Swinney (1989), Clifton et al. (1997), Sturt (2003), Leitão (2008), Xiang et al. (2009), Oliveira et al. (2012), Dillon et al. (2013) and Chow et al. (2014). These results also contradict Kennison (2003), as it seems that gender agreement is also important at initial stages of coreference processing.
Curiously, in early processing, it was also found that coreference processing was affected even when the structurally unacceptable antecedents mismatched the pronouns. In this case, structurally unacceptable antecedents in the singular tended to facilitate coreference when compared to the ones in the plural. This result apparently contradicts CAM, as this model posits that only partial matches can cause similarity-based interference effects, and since structurally unacceptable antecedents that mismatch the pronouns do not have any content cue matching the pronoun, they should not be taken into account by memory as potential antecedents.
It is noteworthy that no strong evidence was found to support the second hypothesis, that memory is sensitive to different types of agreement features. It was expected that feminine structurally unacceptable antecedents would influence coreference more than the masculine ones due to the fact that feminine features are marked in Brazilian Portuguese. The same applies to number, when comparing plural to singular. Our results did not show any difference between masculine and feminine, but they suggest that, as already mentioned above, structurally unacceptable antecedents in the singular that mismatched the pronouns were responsible for shorter reading times at the pronoun area than those in the plural. Interestingly, plural features did not cause facilitatory effects as we expected, but rather increased reading times. However, one must remember that in that condition the structurally unacceptable antecedents mismatched the pronouns; therefore, for CAM, they could not be considered as potential antecedents. Essentially, the structurally unacceptable antecedents only facilitate coreference when they feature-match the pronouns, and in the relevant condition this was not the case. Nonetheless, the fact that singular and plural features did not behave alike in this condition should not be dismissed. It is likely that structurally unacceptable antecedents in the plural that mismatched pronouns can bring difficulties to the processor due to the salience of plural features in memory. It might be the case that the parser knows that this specific candidate could not be the adequate antecedent, but at the same time the salience of plural disturbs memory, even though those features do not match the pronoun.
As expected, the decay effects we found in both First Fixation Duration and Total Fixation Duration are another piece of evidence in favor of CAM: short linear distance between the structurally acceptable antecedents and the pronouns facilitated coreference when compared to long distance. Again, the hypothesized reason for this is that the more recently an item was stored in memory, the easier it is to retrieve it (cf. among others, Schweppe, 2013; Chow et al., 2014).
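The interplay between cue matching and decay discussed above can be made concrete with a toy Python sketch. It is not the model of Lewis, Vasishth and van Dyke (2006) or of Patil et al. (2016): the scoring function, the decay rate, the cue weight, the feature labels, and the distances in words are all assumptions chosen for illustration. The decay term is loosely inspired by a logarithmic base-level decay for an item encoded once.

```python
import math


def activation(candidate, cues, distance, decay=0.5, cue_weight=1.0):
    """Toy activation score: a decaying base level plus a boost per matching cue.

    `distance` is the number of words between the candidate and the pronoun,
    used here as a stand-in for elapsed time (an assumption of this sketch).
    """
    base_level = -decay * math.log(distance)  # more distant items are weaker
    matches = sum(1 for cue, value in cues.items() if candidate.get(cue) == value)
    return base_level + cue_weight * matches


# Retrieval cues set by a masculine singular pronoun (hypothetical feature labels).
cues = {"gender": "masc", "number": "sg", "accessible": True}

acceptable = {"gender": "masc", "number": "sg", "accessible": True}
distractor_match = {"gender": "masc", "number": "sg", "accessible": False}
distractor_mismatch = {"gender": "fem", "number": "pl", "accessible": False}

for label, candidate, distance in [
    ("acceptable, short distance", acceptable, 10),
    ("acceptable, long distance", acceptable, 40),
    ("unacceptable, feature-matching", distractor_match, 2),
    ("unacceptable, mismatching", distractor_mismatch, 2),
]:
    print(f"{label}: {activation(candidate, cues, distance):.2f}")
```

Under these invented parameters, a recent feature-matching but structurally unacceptable candidate scores higher than a fully matching acceptable antecedent that is far away, while a mismatching candidate matches no cue at all, which is the pattern that makes effects of mismatching candidates hard to explain within CAM.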
Finally, it was found that structurally acceptable antecedents that matched the pronouns facilitated coreference in both First Fixation Duration and Total Fixation Duration. Since structurally acceptable candidates totally match the content cues of the pronouns (both agreement and structural ones), they can clearly be considered as the best antecedents. Nevertheless, this result was not expected in First Fixation Duration, as it was hypothesized that the structural constraints of Principle B would only work at late processing phases, as in Total Fixation Duration. However, as this effect did not appear in our LMM for First Fixation Duration as a significant effect or a trend towards significance, we can continue to support our hypothesis.

Conclusion
This research filled a gap in the literature on coreference processing by showing how agreement cues influence antecedent retrieval in a language with rich morphology and how memory seems to be sensitive to different types of features. In summary, the results suggest that agreement cues, structural constraints, and decay effects can influence coreference from early to late processing stages. However, it seems that agreement cues play a major role in early pronominal coreference processing, while structural constraints play a major role in later processing. Decay appears to influence processing regardless of the particular agreement feature or structural constraint. Moreover, structurally unacceptable antecedents that feature-match the pronoun in gender and number facilitated coreference processing. It was suggested that these structurally unacceptable antecedents might be erroneously retrieved from memory as potential antecedents as a result of similarity-based interference.
The results reported here provide evidence in favor of CAM, as decay and similarity-based interference effects were found in antecedent retrieval from memory; however, it seems that this model needs some adjustments in order to explain how candidates that mismatched the pronouns could interfere in coreference processing, how singular and plural features can be distinguished in memory, and what the consequences of these are for sentence processing. Thus, future studies that seek to compare different types of agreement cues in antecedent memory retrieval can be helpful in order to better understand how memory retrieval and language processing are integrated.

Figure 1 (based on LEWIS; VASISHTH; VAN DYKE, 2006) illustrates how pronouns retrieve their antecedents in memory.

FIGURE 1 - How antecedent retrieval works in CAM. Figure based on Lewis, Vasishth and van Dyke (2006)

(19) *Johnᵢ said Mary criticized himselfᵢ.
(20) *Johnᵢ said Mary criticized himselfᵢ.

(22) Johnᵢ believes [himselfᵢ to be clever].
(23) *Johnᵢ believes [himᵢ to be clever].
(25) a) multiple match: John thought that Bill owed him another chance to solve the problem.
b) accessible match: John thought that Beth owed him another chance to solve the problem.
c) inaccessible match: Jane thought that Bill owed him another chance to solve the problem.
d) no-match: Jane thought that Beth owed him another chance to solve the problem.

(27) SPEC conditions:
Susan watched her classmate during the open rehearsals of the school play.
Carl watched her classmate during the open rehearsals of the school play.
They watched her classmate during the open rehearsals of the school play.
NP conditions:
Susan watched her during the open rehearsals of the school play.
Carl watched her during the open rehearsals of the school play.
They watched her during the open rehearsals of the school play.

(28) Billy complained about having a stomachache.
a) Laura watched him closely throughout the day.
b) Michael watched him closely throughout the day.
c) They watched him closely throughout the day.

FIGURE 4 - 95% CI barplot for First Fixation Duration of the linear distance between the structurally acceptable antecedents and the pronouns

FIGURE 6 - 95% CI barplot for First Fixation Duration of structurally unacceptable antecedents and number

TABLE 1 - Sample of the experimental materials used for short distance conditions
(30) Short distance between the structurally acceptable antecedent and the pronoun:

TABLE 2 - Sample of the experimental materials used for long distance conditions

TABLE 3 - First Fixation Duration means and standard errors in milliseconds for short distance experimental conditions

TABLE 4 - First Fixation Duration means and standard errors in milliseconds for long distance experimental conditions