PARSING PREFERENCES AND LINGUISTIC STRATEGIES

Rodolfo Delmonte
Ca' Garzoni-Moro, San Marco 3417
Università "Ca' Foscari"
30124 - VENEZIA
Tel. 39-41-2578464/52/19 - Fax. 39-41-5287683
E-mail: delmont@unive.it - website: byron.cgm.unive.it

Abstract

We implemented in our parser four parsing strategies that obey LFG grammaticality conditions and follow the hypothesis that knowledge of language is used in a "modular" fashion. The parsing strategies are the following: Minimal Attachment (MA), Functional Preference (FP), Semantic Evaluation (SE), Referential Individuation (RI). Our experiments with them in the implementation show that they are strongly interwoven. In particular, MA depends on FP to satisfy argument/function interpretation principles; with semantically biased sentences, MA, FP and SE apply in hierarchical order to license a phrase as argument or adjunct. RI is required and activated every time a singular definite NP has to be computed, and it depends on the presence of a Discourse Model. The parser shows Garden Path effects and concurrently produces a processing breakdown which is linguistically motivated. Our parser is a DCG (Pereira & Warren, 1980), is implemented in Prolog, and obeys a top-down, depth-first, deterministic parsing policy.

1. Introduction

In order for a parser to achieve psychological reality it should satisfy three different types of requirements: psycholinguistic plausibility, computational efficiency in implementation, and coverage of grammatical principles and constraints. Principles underlying the parser architecture should not conform exclusively to one or the other area, disregarding issues which might explain the behaviour of the human processor. In accordance with this criterion, we assume that the implementation should closely mimic phenomena such as Garden Path effects, or an increase in computational time in the presence of semantically vs. syntactically biased ambiguous structures.
We also assume that a failure should ensue from strong Garden Path effects and that this should be justified at a psycholinguistic interpretation level.

Since we base most of our grammatical principles on LFG, we assume that lexical information is the most important knowledge source in the processing of natural language. However, we also assume that all semantic information should be brought to bear on the processing, and this coincides only partially with lexical information as stored in lexical forms. In particular, subcategorization, semantic roles and all other mechanisms for evaluating semantic compatibility should be active while parsing each word of the input string. In addition, the Discourse Model and External Knowledge of the World should be tapped when needed to resolve ambiguous antecedents.

Unlike global or full-path approaches (see Schubert, 1985; Bear & Hobbs, 1988; Hobbs, Stickel, Appelt, Martin, 1993), we believe that decisions on structural ambiguity should be reached as soon as possible rather than deferred to a later level of representation.

The parser we work with is organized as shown in Fig. 1 and can deal with a certain number of linguistic phenomena at sentence level, while leaving other problems to be solved at discourse level. The parser we present was conceived in the mid-1980s and started as a Transfer module for a Machine Translation Expert system in a very restricted linguistic domain. It then became a general parser for Italian and English, to be used with LFG students. German was added later, at the beginning of the 1990s. Since the people working on it were interested in the semantics as much as in the syntax, it was soon enriched with a Quantifier Raising algorithm and an Anaphoric Binding Module. In 1994 the Discourse Model and the Inferential Processes algorithms were developed. Finally, in 1996, work on a Situational Semantics interface and on the Discourse Structure was carried out.
These experiments were finally enriched - two years ago - with a number of Parsing Strategies procedures like setting up a Lookahead mechanism, a Well-Formed Substring Table and a number of other semantically and/or lexically based triggering lookup procedures.
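
To make the role of a Well-Formed Substring Table more concrete, here is a minimal sketch of the general technique: a top-down, depth-first recognizer that stores, for each (category, start position) pair, the end positions already found, so that a constituent analysed once is not re-parsed when another rule asks for it. This is not the parser described in this paper, which is a Prolog DCG with a deterministic policy; the sketch is in Python, it collects all end positions rather than committing to one, and the toy grammar, the lexicon and the names `GRAMMAR`, `LEXICON` and `Recognizer` are assumptions introduced only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy grammar (no left recursion) and lexicon, for illustration only.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}
LEXICON: Dict[str, str] = {
    "the": "Det", "man": "N", "girl": "N", "telescope": "N",
    "saw": "V", "with": "P",
}


class Recognizer:
    """Top-down, depth-first recognizer with an optional substring table."""

    def __init__(self, words: List[str], use_table: bool = True) -> None:
        self.words = words
        self.use_table = use_table
        # The table maps (category, start) to the set of end positions found.
        self.table: Dict[Tuple[str, int], Set[int]] = {}
        self.computed = 0  # phrases actually expanded through grammar rules
        self.reused = 0    # phrases answered directly from the table

    def ends(self, cat: str, start: int) -> Set[int]:
        """Return every position where a `cat` beginning at `start` can end."""
        if cat not in GRAMMAR:  # preterminal: consult the lexicon
            found = (start < len(self.words)
                     and LEXICON.get(self.words[start]) == cat)
            return {start + 1} if found else set()
        key = (cat, start)
        if self.use_table and key in self.table:
            self.reused += 1
            return self.table[key]
        self.computed += 1
        result: Set[int] = set()
        for rhs in GRAMMAR[cat]:
            positions = {start}
            for symbol in rhs:
                positions = {e for p in positions for e in self.ends(symbol, p)}
                if not positions:  # this rule cannot be completed
                    break
            result |= positions
        if self.use_table:
            self.table[key] = result
        return result

    def recognizes(self) -> bool:
        """True if the whole input can be analysed as an S."""
        return len(self.words) in self.ends("S", 0)


if __name__ == "__main__":
    sentence = "the man saw the girl with the telescope".split()
    for use_table in (True, False):
        r = Recognizer(sentence, use_table=use_table)
        print(use_table, r.recognizes(), r.computed, r.reused)
```

In this toy setting the object NP and the PP following it are requested by both VP rules; with the table switched on they are expanded once and then looked up, which is the saving such a table is meant to provide when a depth-first parser backtracks over an attachment ambiguity.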