<?xml version='1.0' encoding='UTF-8'?>
<collection id="2000.iwpt">
<volume id="1" ingest-date="2020-05-11">
<meta>
<booktitle>Proceedings of the Sixth International Workshop on Parsing Technologies</booktitle>
<publisher>Association for Computational Linguistics</publisher>
<address>Trento, Italy</address>
<month>February 23-25</month>
<year>2000</year>
<editor><first>Alberto</first><last>Lavelli</last></editor>
<editor><first>John</first><last>Carroll</last></editor>
<editor><first>Robert C.</first><last>Berwick</last></editor>
<editor><first>Harry C.</first><last>Bunt</last></editor>
<editor><first>Bob</first><last>Carpenter</last></editor>
<editor><first>Ken</first><last>Church</last></editor>
<editor><first>Mark</first><last>Johnson</last></editor>
<editor><first>Aravind</first><last>Joshi</last></editor>
<editor><first>Ronald</first><last>Kaplan</last></editor>
<editor><first>Martin</first><last>Kay</last></editor>
<editor><first>Bernard</first><last>Lang</last></editor>
<editor><first>Alon</first><last>Lavie</last></editor>
<editor><first>Anton</first><last>Nijholt</last></editor>
<editor><first>Christer</first><last>Samuelsson</last></editor>
<editor><first>Mark</first><last>Steedman</last></editor>
<editor><first>Oliviero</first><last>Stock</last></editor>
<editor><first>Hozumi</first><last>Tanaka</last></editor>
<editor><first>Masaru</first><last>Tomita</last></editor>
<editor><first>Hans</first><last>Uszkoreit</last></editor>
<editor><first>K.</first><last>Vijay-Shanker</last></editor>
<editor><first>David</first><last>Weir</last></editor>
<editor><first>Mats</first><last>Wiren</last></editor>
<url hash="58076030">2000.iwpt-1</url>
</meta>
<paper id="1">
<title>Proceedings of the Sixth International Workshop on Parsing Technologies</title>
<pages/>
<url hash="c480d444">2000.iwpt-1.1</url>
<abstract/>
<bibkey>nn-2000-proceedings</bibkey>
</paper>
<paper id="2">
<title>Automatic Grammar Induction: Combining, Reducing and Doing Nothing</title>
<author><first>Eric</first><last>Brill</last></author>
<author><first>John C.</first><last>Henderson</last></author>
<author><first>Grace</first><last>Ngai</last></author>
<pages>1-5</pages>
<url hash="7023653a">2000.iwpt-1.2</url>
<abstract>This paper surveys three research directions in parsing. First, we look at methods for both automatically generating a set of diverse parsers and combining the outputs of different parsers into a single parse. Next, we will discuss a parsing method known as transformation-based parsing. This method, though less accurate than the best current corpus-derived parsers, is able to parse quite accurately while learning only a small set of easily understood rules, as opposed to the many-megabyte parameter files learned by other techniques. Finally, we review a recent study exploring how people and machines compare at the task of creating a program to automatically annotate noun phrases.</abstract>
<bibkey>brill-etal-2000-automatic</bibkey>
</paper>
<paper id="3">
<title>Guides and Oracles for Linear-Time Parsing</title>
<author><first>Martin</first><last>Kay</last></author>
<pages>6-10</pages>
<url hash="2078f35c">2000.iwpt-1.3</url>
<abstract>If chart parsing is taken to include the process of reading out solutions one by one, then it has exponential complexity. The stratagem of separating read-out from chart construction can also be applied to other kinds of parser, in particular, to left-corner parsers that use early composition. When a limit is placed on the size of the stack in such a parser, it becomes context-free equivalent. However, it is not practical to profit directly from this observation because of the large state sets that are involved in otherwise ordinary situations. It may be possible to overcome these problems by means of a guide constructed from a weakened version of the initial grammar.</abstract>
<bibkey>kay-2000-guides</bibkey>
</paper>
<paper id="4">
<title>Parsing Techniques for Lexicalized Context-Free Grammars</title>
<author><first>Giorgio</first><last>Satta</last></author>
<pages>10-14</pages>
<url hash="477a8506">2000.iwpt-1.4</url>
<abstract/>
<bibkey>satta-2000-parsing</bibkey>
</paper>
<paper id="5">
<title>A Bootstrapping Approach to Parser Development</title>
<author><first>Izaskun</first><last>Aldezabal</last></author>
<author><first>Koldo</first><last>Gojenola</last></author>
<author><first>Kepa</first><last>Sarasola</last></author>
<pages>17-28</pages>
<url hash="ba9b39aa">2000.iwpt-1.5</url>
<abstract>This paper presents a robust parsing system for unrestricted Basque texts. It analyzes a sentence in two stages: a unification-based parser builds basic syntactic units such as NPs, PPs, and sentential complements, while a finite-state parser performs syntactic disambiguation and filtering of the results. The system has been applied to the acquisition of verbal subcategorization information, obtaining 66% recall and 87% precision in the determination of verb subcategorization instances. This information will later be incorporated into the parser in order to improve its performance.</abstract>
<bibkey>aldezabal-etal-2000-bootstrapping</bibkey>
</paper>
<paper id="6">
<title>New Tabular Algorithms for Parsing</title>
<author><first>Miguel A.</first><last>Alonso</last></author>
<author><first>Jorge</first><last>Graña</last></author>
<author><first>Manuel</first><last>Vilares</last></author>
<author><first>Eric</first><last>de la Clergerie</last></author>
<pages>29-40</pages>
<url hash="83d18b19">2000.iwpt-1.6</url>
<abstract>We develop a set of new tabular parsing algorithms for Linear Indexed Grammars, including bottom-up algorithms and Earley-like algorithms with and without the valid prefix property, creating a continuum in which one algorithm can in turn be derived from another. The output of these algorithms is a shared forest in the form of a context-free grammar that encodes all possible derivations for a given input string.</abstract>
<bibkey>alonso-etal-2000-new</bibkey>
</paper>
<paper id="7">
<title>Customizable Modular Lexicalized Parsing</title>
<author id="roberto-basili"><first>R.</first><last>Basili</last></author>
<author id="maria-teresa-pazienza"><first>M. T.</first><last>Pazienza</last></author>
<author><first>F. M.</first><last>Zanzotto</last></author>
<pages>41-52</pages>
<url hash="f6b5522d">2000.iwpt-1.7</url>
<abstract>Different NLP applications have different efficiency constraints (i.e. quality of the results and throughput) that reflect on each core linguistic component. Syntactic processors are basic modules in some NLP applications. A customization that permits the performance control of these components enables their reuse in different application scenarios. Throughput has been commonly improved using partial syntactic processors. On the other hand, specialized lexicons are generally employed to improve the quality of the syntactic material produced by specific parsing (sub)processes (e.g. verb argument detection or PP attachment disambiguation). Building upon the idea of grammar stratification, this paper presents a method to push modularity and lexical sensitivity in parsing, in view of customizable syntactic analysers. A framework for modular parser design is proposed and its main properties are discussed. Parsers (i.e. different parsing module chains) are then presented and their performances are analyzed in application-driven scenarios.</abstract>
<bibkey>basili-etal-2000-customizable</bibkey>
</paper>
<paper id="8">
<title>Range Concatenation Grammars</title>
<author><first>Pierre</first><last>Boullier</last></author>
<pages>53-64</pages>
<url hash="89e27b5a">2000.iwpt-1.8</url>
<abstract>In this paper we present Range Concatenation Grammars, a syntactic formalism which possesses many attractive features, among which we underline here power and closure properties. For example, Range Concatenation Grammars are more powerful than Linear Context-Free Rewriting Systems, though this power is not reached to the detriment of efficiency since its sentences can always be parsed in polynomial time. Range Concatenation Languages are closed both under intersection and complementation, and these closure properties may allow us to consider novel ways to describe some linguistic processes. We also present a parsing algorithm which is the basis of our current prototype implementation.</abstract>
<bibkey>boullier-2000-range</bibkey>
</paper>
<paper id="9">
<title>Automated Extraction of <fixed-case>TAG</fixed-case>s from the <fixed-case>Penn</fixed-case> <fixed-case>Treebank</fixed-case></title>
<author><first>John</first><last>Chen</last></author>
<author><first>K.</first><last>Vijay-Shanker</last></author>
<pages>65-76</pages>
<url hash="6373ce40">2000.iwpt-1.9</url>
<abstract>The accuracy of statistical parsing models can be improved with the use of lexical information. Statistical parsing using Lexicalized tree adjoining grammar (LTAG), a kind of lexicalized grammar, has remained relatively unexplored. We believe that is largely in part due to the absence of large corpora accurately bracketed in terms of a perspicuous yet broad coverage LTAG. Our work attempts to alleviate this difficulty. We extract different LTAGs from the Penn Treebank. We show that certain strategies yield an improved extracted LTAG in terms of compactness, broad coverage, and supertagging accuracy. Furthermore, we perform a preliminary investigation in smoothing these grammars by means of an external linguistic resource, namely, the tree families of an XTAG grammar, a hand built grammar of English.</abstract>
<bibkey>chen-vijay-shanker-2000-automated</bibkey>
</paper>
<paper id="10">
<title>From Cases to Rules and Vice Versa: Robust Practical Parsing With Analogy</title>
<author><first>Alex Chengyu</first><last>Fang</last></author>
<pages>77-88</pages>
<url hash="c2fa7522">2000.iwpt-1.10</url>
<abstract>This article describes the architecture of the Survey Parser and discusses two major components related to the analogy-based parsing of unrestricted English. Firstly, it discusses the automatic generation of a large declarative formal grammar from a corpus that has been syntactically analysed. Secondly, it describes analogy-based parsing that employs both the automatically learned rules and the database of cases to determine the syntactic structure of the input string. Statistics are presented to characterise the performance of the parsing system.</abstract>
<bibkey>fang-2000-cases</bibkey>
</paper>
<paper id="11">
<title>A Transformation-based Parsing Technique With Anytime Properties</title>
<author><first>Kilian</first><last>Foth</last></author>
<author><first>Ingo</first><last>Schröder</last></author>
<author><first>Wolfgang</first><last>Menzel</last></author>
<pages>89-100</pages>
<url hash="e5af1568">2000.iwpt-1.11</url>
<abstract>A transformation-based approach to robust parsing is presented, which achieves a strictly monotonic improvement of its current best hypothesis by repeatedly applying local repair steps to a complex multi-level representation. The transformation process is guided by scores derived from weighted constraints. Besides being interruptible, the procedure exhibits a performance profile typical for anytime procedures and holds great promise for the implementation of time-adaptive behaviour.</abstract>
<bibkey>foth-etal-2000-transformation</bibkey>
</paper>
<paper id="12">
<title><fixed-case>SOUP</fixed-case>: A Parser for Real-world Spontaneous Speech</title>
<author><first>Marsal</first><last>Gavaldà</last></author>
<pages>101-110</pages>
<url hash="d37adc0b">2000.iwpt-1.12</url>
<abstract>This paper describes the key features of SOUP, a stochastic, chart-based, top-down parser, especially engineered for real-time analysis of spoken language with very large, multi-domain semantic grammars. SOUP achieves flexibility by encoding context-free grammars, specified for example in the Java Speech Grammar Format, as probabilistic recursive transition networks, and robustness by allowing skipping of input words at any position and producing ranked interpretations that may consist of multiple parse trees. Moreover, SOUP is very efficient, which allows for practically instantaneous backend response.</abstract>
<bibkey>gavalda-2000-soup</bibkey>
</paper>
<paper id="13">
<title>A Recognizer for <fixed-case>M</fixed-case>inimalist <fixed-case>G</fixed-case>rammars</title>
<author><first>Henk</first><last>Harkema</last></author>
<pages>111-122</pages>
<url hash="4386d0b5">2000.iwpt-1.13</url>
<abstract>Minimalist Grammars are a rigorous formalization of the sort of grammars proposed in the linguistic framework of Chomsky’s Minimalist Program. One notable property of Minimalist Grammars is that they allow constituents to move during the derivation of a sentence, thus creating discontinuous constituents. In this paper we will present a bottom-up parsing method for Minimalist Grammars, prove its correctness, and discuss its complexity.</abstract>
<bibkey>harkema-2000-recognizer</bibkey>
</paper>
<paper id="14">
<title>A Neural Network Parser that Handles Sparse Data</title>
<author><first>James</first><last>Henderson</last></author>
<pages>123-134</pages>
<url hash="3ed32612">2000.iwpt-1.14</url>
<abstract>Previous work has demonstrated the viability of a particular neural network architecture, Simple Synchrony Networks, for syntactic parsing. Here we present additional results on the performance of this type of parser, including direct comparisons on the same dataset with a standard statistical parsing method, Probabilistic Context Free Grammars. We focus these experiments on demonstrating one of the main advantages of the SSN parser over the PCFG, handling sparse data. We use smaller datasets than are typically used with statistical methods, resulting in the PCFG finding parses for under half of the test sentences, while the SSN finds parses for all sentences. Even on the PCFG’s parsed half, the SSN performs better than the PCFG, as measured by recall and precision on both constituents and a dependency-like measure.</abstract>
<bibkey>henderson-2000-neural</bibkey>
</paper>
<paper id="15">
<title>A Context-free Approximation of <fixed-case>H</fixed-case>ead-driven <fixed-case>P</fixed-case>hrase <fixed-case>S</fixed-case>tructure <fixed-case>G</fixed-case>rammar</title>
<author><first>Bernd</first><last>Kiefer</last></author>
<author><first>Hans-Ulrich</first><last>Krieger</last></author>
<pages>135-146</pages>
<url hash="1ede73be">2000.iwpt-1.15</url>
<abstract>We present a context-free approximation of unification-based grammars, such as HPSG or PATR-II. The theoretical underpinning is established through a least fixpoint construction over a certain monotonic function. In order to reach a finite fixpoint, the concrete implementation can be parameterized in several ways, either by specifying a finite iteration depth, by using different restrictors, or by making the symbols of the CFG more complex adding annotations a la GPSG. We also present several methods that speed up the approximation process and help to limit the size of the resulting CF grammar.</abstract>
<bibkey>kiefer-krieger-2000-context</bibkey>
</paper>
<paper id="16">
<title>Optimal Ambiguity Packing in Context-free Parsers with Interleaved Unification</title>
<author><first>Alon</first><last>Lavie</last></author>
<author><first>Carolyn Penstein</first><last>Rosé</last></author>
<pages>147-158</pages>
<url hash="7cc8b864">2000.iwpt-1.16</url>
<abstract>Ambiguity packing is a well known technique for enhancing the efficiency of context-free parsers. However, in the case of unification-augmented context-free parsers where parsing is interleaved with feature unification, the propagation of feature structures imposes difficulties on the ability of the parser to effectively perform ambiguity packing. We demonstrate that a clever heuristic for prioritizing the execution order of grammar rules and parsing actions can achieve a high level of ambiguity packing that is provably optimal. We present empirical evaluations of the proposed technique, performed with both a Generalized LR parser and a chart parser, that demonstrate its effectiveness.</abstract>
<bibkey>lavie-rose-2000-optimal</bibkey>
</paper>
<paper id="17">
<title>Extended Partial Parsing for Lexicalized Tree Grammars</title>
<author><first>Patrice</first><last>Lopez</last></author>
<pages>159-170</pages>
<url hash="298f6733">2000.iwpt-1.17</url>
<abstract>Existing parsing algorithms for Lexicalized Tree Grammars (LTG) formalisms (LTAG, TIG, DTG, ...) are adaptations of algorithms initially dedicated to Context Free Grammars (CFG). They do not really take into account the fact that we do not use context free rules but partial parsing trees that we try to combine. Moreover, the lexicalization raises the important problem of multiplication of structures, a problem which does not exist in CFG. This paper presents parsing techniques for LTG taking into account these two fundamental features. Our approach focuses on robust and practical purposes. Our parsing algorithm results in more extended partial parsing when the global parsing fails and in an interesting average complexity compared with other bottom-up algorithms.</abstract>
<bibkey>lopez-2000-extended</bibkey>
</paper>
<paper id="18">
<title>Improved Left-corner Chart Parsing for Large Context-free Grammars</title>
<author><first>Robert C.</first><last>Moore</last></author>
<pages>171-182</pages>
<url hash="d523663b">2000.iwpt-1.18</url>
<abstract>We develop an improved form of left-corner chart parsing for large context-free grammars, introducing improvements that result in significant speed-ups compared to previously-known variants of left-corner parsing. We also compare our method to several other major parsing approaches, and find that our improved left-corner parsing method outperforms each of these across a range of grammars. Finally, we also describe a new technique for minimizing the extra information needed to efficiently recover parses from the data structures built in the course of parsing.</abstract>
<bibkey>moore-2000-improved</bibkey>
</paper>
<paper id="19">
<title>Measure for Measure: Parser Cross-fertilization - Towards Increased Component Comparability and Exchange</title>
<author><first>Stephan</first><last>Oepen</last></author>
<author><first>Ulrich</first><last>Callmeier</last></author>
<pages>183-194</pages>
<url hash="86a1d82a">2000.iwpt-1.19</url>
<abstract>Over the past few years significant progress was accomplished in efficient processing with wide-coverage HPSG grammars. HPSG-based parsing systems are now available that can process medium-complexity sentences (of ten to twenty words, say) in average parse times equivalent to real (i.e. human reading) time. A large number of engineering improvements in current HPSG systems were achieved through collaboration of multiple research centers and mutual exchange of experience, encoding techniques, algorithms, and even pieces of software. This article presents an approach to grammar and system engineering, termed competence &amp; performance profiling, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. Adapting the profiling metaphor familiar from software engineering to constraint-based grammars and parsers, enables developers to maintain an accurate record of system evolution, identify grammar and system deficiencies quickly, and compare to earlier versions or between different systems. We discuss a number of exemplary problems that motivate the experimental approach, and apply the empirical methodology in a fairly detailed discussion of what was achieved during a development period of three years. Given the collaborative nature in setup, the empirical results we present involve research and achievements of a large group of people.</abstract>
<bibkey>oepen-callmeier-2000-measure</bibkey>
</paper>
<paper id="20">
<title>Computing the Most Probable Parse for a Discontinuous Phrase Structure Grammar</title>
<author><first>Oliver</first><last>Plaehn</last></author>
<pages>195-206</pages>
<url hash="fe7acc9d">2000.iwpt-1.20</url>
<abstract>This paper presents a probabilistic extension of Discontinuous Phrase Structure Grammar (DPSG), a formalism designed to describe discontinuous constituency phenomena adequately and perspicuously by means of trees with crossing branches. We outline an implementation of an agenda-based chart parsing algorithm that is capable of computing the Most Probable Parse for a given input sentence for probabilistic versions of both DPSG and Context-Free Grammar. Experiments were conducted with both types of grammars extracted from the NEGRA corpus. In spite of the much greater complexity of DPSG parsing in terms of the number of (partial) analyses that can be constructed for an input sentence, accuracy results from both experiments are comparable. We also briefly hint at future lines of research aimed at more efficient ways of probabilistic parsing with discontinuous constituents.</abstract>
<bibkey>plaehn-2000-computing</bibkey>
</paper>
<paper id="21">
<title>An Efficient <fixed-case>LR</fixed-case> Parser Generator for <fixed-case>T</fixed-case>ree <fixed-case>A</fixed-case>djoining <fixed-case>G</fixed-case>rammars</title>
<author><first>Carlos A.</first><last>Prolo</last></author>
<pages>207-218</pages>
<url hash="73000bac">2000.iwpt-1.21</url>
<abstract>The first published LR algorithm for Tree Adjoining Grammars (TAGs [Joshi and Schabes, 1996]) was due to Schabes and Vijay-Shanker [1990]. Nederhof [1998] showed that it was incorrect (after [Kinyon, 1997]), and proposed a new one. Experimenting with his new algorithm over the XTAG English Grammar [XTAG Research Group, 1998] he concluded that LR parsing was inadequate for use with reasonably sized grammars because the size of the generated table was unmanageable. Also the degree of conflicts is too high. In this paper we discuss issues involved with LR parsing for TAGs and propose a new version of the algorithm that, by maintaining the degree of prediction while deferring the “subtree reduction”, dramatically reduces both the average number of conflicts per state and the size of the parser.</abstract>
<bibkey>prolo-2000-efficient</bibkey>
</paper>
<paper id="22">
<title>Parsing Scrambling with Path Set: a Graded Grammaticality Approach</title>
<author><first>Siamak</first><last>Rezaei</last></author>
<pages>219-230</pages>
<url hash="71c679a6">2000.iwpt-1.22</url>
<abstract>In this work we introduce the notion of path set for parsing free word order languages. The parsing system uses this notion to parse examples of sentences with scrambling. We show that by using path set, the performance constraints on scrambling such as Resource Limitation Principle (RLP) can be represented easily. Our work contrasts with models based on the notion of immediate dominance rule and binary precedence relations. In our work the precedence relations and word order constraints are defined locally for each clause. Our binary precedence relations are examples of fuzzy relations with weights attached to them. As a result, the word order principles in our approach can be violated and each violation contributes to a lowering of the overall acceptability and grammaticality. The work suggests a robust principle-based approach to parsing ambiguous sentences in verb final languages.</abstract>
<bibkey>rezaei-2000-parsing</bibkey>
</paper>
<paper id="23">
<title>On the Use of Grammar Based Language Models for Statistical Machine Translation</title>
<author><first>Hassan</first><last>Sawaf</last></author>
<author><first>Kai</first><last>Schütz</last></author>
<author><first>Hermann</first><last>Ney</last></author>
<pages>231-241</pages>
<url hash="99f459e7">2000.iwpt-1.23</url>
<abstract>In this paper, we describe some concepts of language models beyond the usually used standard trigram and use such language models for statistical machine translation. In statistical machine translation the language model is the a-priori knowledge source of the system about the target language. One important requirement for the language model is the correct word order, given a certain choice of words, and to score the translations generated by the translation model <tex-math>\textrm{Pr}(f_1^J|e_1^I)</tex-math>, in view of the syntactic context. In addition to standard <tex-math>m</tex-math>-grams with long histories, we examine the use of Part-of-Speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the VERBMOBIL task, where translation is performed from German to English, with vocabulary sizes of 6500 and 4000 words, respectively.</abstract>
<bibkey>sawaf-etal-2000-use</bibkey>
</paper>
<paper id="24">
<title>Algebraic Construction of Parsing Schemata</title>
<author><first>Karl-Michael</first><last>Schneider</last></author>
<pages>242-253</pages>
<url hash="55cd07f0">2000.iwpt-1.24</url>
<abstract>We propose an algebraic method for the design of tabular parsing algorithms which uses parsing schemata [7]. The parsing strategy is expressed in a tree algebra. A parsing schema is derived from the tree algebra by means of algebraic operations such as homomorphic images, direct products, subalgebras and quotient algebras. The latter yields a tabular interpretation of the parsing strategy. The proposed method allows simpler and more elegant correctness proofs by using general theorems and is not limited to left-right parsing strategies, unlike current automaton-based approaches. Furthermore, it allows parsing schemata for linear indexed grammars (LIG) to be derived from parsing schemata for context-free grammars by means of a correctness-preserving algebraic transformation. A new bottom-up head corner parsing schema for LIG is constructed to demonstrate the method.</abstract>
<bibkey>schneider-2000-algebraic</bibkey>
</paper>
<paper id="25">
<title>A <fixed-case>S</fixed-case>panish <fixed-case>POS</fixed-case> Tagger with Variable Memory</title>
<author><first>José</first><last>Triviño</last></author>
<author><first>Rafael</first><last>Morales-Bueno</last></author>
<pages>254-265</pages>
<url hash="693cfc23">2000.iwpt-1.25</url>
<abstract>An implementation of a Spanish POS tagger is described in this paper. This implementation combines three basic approaches: a single word tagger based on decision trees, a POS tagger based on variable memory Markov models, and a feature structures set of tags. Using decision trees for single word tagging allows the tagger to work without a lexicon that lists only possible tags. Moreover, it decreases the error rate because there are no unknown words. The feature structure set of tags is advantageous when the available training corpus is small and the tag set large, which can be the case with morphologically rich languages like Spanish. Finally, variable memory Markov models training is more efficient than traditional full-order Markov models and achieves better accuracy. In this implementation, 98.58% of tokens are correctly classified.</abstract>
<bibkey>trivino-morales-bueno-2000-spanish</bibkey>
</paper>
<paper id="26">
<title>Parsing a Lattice with Multiple Grammars</title>
<author><first>Fuliang</first><last>Weng</last></author>
<author><first>Helen</first><last>Meng</last></author>
<author><first>Po Chui</first><last>Luk</last></author>
<pages>266-277</pages>
<url hash="06c40834">2000.iwpt-1.26</url>
<abstract>Efficiency, memory, ambiguity, robustness and scalability are the central issues in natural language parsing. Because of the complexity of natural language, different parsers may be suited only to certain subgrammars. In addition, grammar maintenance and updating may have adverse effects on tuned parsers. Motivated by these concerns, [25] proposed a grammar partitioning and top-down parser composition mechanism for loosely restricted Context-Free Grammars (CFGs). In this paper, we report on significant progress, i.e., (1) developing guidelines for the grammar partition through a set of heuristics, (2) devising new mix-strategy composition algorithms for any rule-based grammar partition in a lattice framework, and (3) initial but encouraging parsing results for Chinese and English queries from an Air Travel Information System (ATIS) corpus.</abstract>
<bibkey>weng-etal-2000-parsing</bibkey>
</paper>
<paper id="27">
<title>Modular Unification-based Parsers</title>
<author><first>Rémi</first><last>Zajac</last></author>
<author><first>Jan</first><last>Amtrup</last></author>
<pages>278-290</pages>
<url hash="7cc99606">2000.iwpt-1.27</url>
<abstract>We present an implementation of the notion of modularity and composition applied to unification based grammars. Monolithic unification grammars can be decomposed into sub-grammars with well defined interfaces. Sub-grammars are applied in a sequential manner at runtime, allowing incremental development and testing of large coverage grammars. The modular approach to grammar development leads us away from the traditional view of parsing a string of input symbols as the recognition of some start symbol, and towards a richer and more flexible view where inputs and outputs share the same structural properties.</abstract>
<bibkey>zajac-amtrup-2000-modular</bibkey>
</paper>
<paper id="28">
<title>Hypergraph Unification-based Parsing for Incremental Speech Processing</title>
<author><first>Jan</first><last>Amtrup</last></author>
<pages>291-292</pages>
<url hash="b7516173">2000.iwpt-1.28</url>
<abstract/>
<bibkey>amtrup-2000-hypergraph</bibkey>
</paper>
<paper id="29">
<title>Parsing Mildly Context-sensitive <fixed-case>RMS</fixed-case></title>
<author><first>Tilman</first><last>Becker</last></author>
<author><first>Dominik</first><last>Heckmann</last></author>
<pages>293-294</pages>
<url hash="8bcd3ad1">2000.iwpt-1.29</url>
<abstract>We introduce Recursive Matrix Systems (RMS) which encompass mildly context-sensitive formalisms and present efficient parsing algorithms for linear and context-free variants of RMS. The time complexities are <tex-math>\mathcal{O}(n^{2h + 1})</tex-math>, and <tex-math>\mathcal{O}(n^{3h})</tex-math> respectively, where <tex-math>h</tex-math> is the height of the matrix. It is possible to represent Tree Adjoining Grammars (TAG [1], MC-TAG [2], and R-TAG [3]) as RMS uniformly.</abstract>
<bibkey>becker-heckmann-2000-parsing</bibkey>
</paper>
<paper id="30">
<title>Property Grammars: a Solution for Parsing with Constraints</title>
<author><first>Philippe</first><last>Blache</last></author>
<pages>295-296</pages>
<url hash="90559f7d">2000.iwpt-1.30</url>
<abstract/>
<bibkey>blache-2000-property</bibkey>
</paper>
<paper id="31">
<title>Grammar Organization for Cascade-based Parsing in Information Extraction</title>
<author><first>Fabio</first><last>Ciravegna</last></author>
<author><first>Alberto</first><last>Lavelli</last></author>
<pages>297-298</pages>
<url hash="eb265dbc">2000.iwpt-1.31</url>
<abstract/>
<bibkey>ciravegna-lavelli-2000-grammar</bibkey>
</paper>
<paper id="32">
<title>A Bidirectional Bottom-up Parser for <fixed-case>TAG</fixed-case></title>
<author><first>Víctor</first><last>Díaz</last></author>
<author><first>Vicente</first><last>Carrillo</last></author>
<author><first>Miguel</first><last>Alonso</last></author>
<pages>299-300</pages>
<url hash="db776cf1">2000.iwpt-1.32</url>
<abstract/>
<bibkey>diaz-etal-2000-bidirectional-bottom</bibkey>
</paper>
<paper id="33">
<title>A Finite-state Parser with Dependency Structure Output</title>
<author><first>David</first><last>Elworthy</last></author>
<pages>301-302</pages>
<url hash="96f0d64e">2000.iwpt-1.33</url>
<abstract>We show how to augment a finite-state grammar with annotations which allow dependency structures to be extracted. There are some difficulties in determinising the grammar, which is an essential step for computational efficiency, but they can be overcome. The parser also allows syntactically ambiguous structures to be packed into a single representation.</abstract>
<bibkey>elworthy-2000-finite</bibkey>
</paper>
<paper id="34">
<title>Discriminant Reverse <fixed-case>LR</fixed-case> Parsing of Context-free Grammars</title>
<author><first>Jacques</first><last>Farré</last></author>
<pages>303-304</pages>
<url hash="61d83602">2000.iwpt-1.34</url>
<abstract/>
<bibkey>farre-2000-discriminant</bibkey>
</paper>
<paper id="35">
<title>Direct Parsing of Schema-<fixed-case>TAG</fixed-case>s</title>
<author><first>Karin</first><last>Harbusch</last></author>
<author><first>Jens</first><last>Woch</last></author>
<pages>305-306</pages>
<url hash="015f1d30">2000.iwpt-1.35</url>
<abstract/>
<bibkey>harbusch-woch-2000-direct</bibkey>
</paper>
<paper id="36">
<title>Analysis of Equation Structure using Least Cost Parsing</title>
<author><first>R. Nigel</first><last>Horspool</last></author>
<author><first>John</first><last>Aycock</last></author>
<pages>307-308</pages>
<url hash="b0195daa">2000.iwpt-1.36</url>
<abstract>Mathematical equations in LaTeX are composed with tags that express formatting as opposed to structure. For conversion from LaTeX to other word-processing systems, the structure of each equation must be inferred. We show how a form of least cost parsing used with a very general and ambiguous grammar may be used to select an appropriate structure for a LaTeX equation. MathML provides another application for the same technology; it has two alternative tagging schemes: presentation tags to specify formatting and content tags to specify structure. While conversion from content tagging to presentation tagging is straightforward, the converse is not. Our implementation of least cost parsing is based on Earley’s algorithm.</abstract>
<bibkey>horspool-aycock-2000-analysis</bibkey>
</paper>
<paper id="37">
<title>Exploiting Parallelism in Unification-based Parsing</title>
<author><first>Marcel P.</first><last>van Lohuizen</last></author>
<pages>309-310</pages>
<url hash="a2a364a4">2000.iwpt-1.37</url>
<abstract>Because of the nature of the parsing problem, unification-based parsers are hard to parallelize. We present a parallelization technique designed to cope with these difficulties.</abstract>
<bibkey>van-lohuizen-2000-exploiting</bibkey>
</paper>
<paper id="38">
<title>Partial Parsing with Grammatical Features</title>
<author><first>Natasa</first><last>Manousopoulou</last></author>
<author><first>George</first><last>Papakonstantinou</last></author>
<author><first>Panayotis</first><last>Tsanakas</last></author>
<pages>311-312</pages>
<url hash="0a98a2dd">2000.iwpt-1.38</url>
<abstract>This paper describes a rule-based method for partial parsing, particularly for noun phrase recognition, which has been used in the development of a noun phrase recognizer for Modern Greek. This technique is based on a cascade of finite-state machines, adding to them a characteristic that is crucial in the parsing of words with free word order: the simultaneous examination of part-of-speech and grammatical feature information, which are deemed equally important during the parsing procedure, in contrast with other methodologies.</abstract>
<bibkey>manousopoulou-etal-2000-partial</bibkey>
</paper>
<paper id="39">
<title>Uniquely Parsable Accepting Grammar Systems</title>
<author><first>Carlos</first><last>Martín-Vide</last></author>
<author><first>Victor</first><last>Mitrana</last></author>
<pages>313-314</pages>
<url hash="23ccb7b0">2000.iwpt-1.39</url>
<abstract/>
<bibkey>martin-vide-mitrana-2000-uniquely</bibkey>
</paper>
<paper id="40">
<title>Chart Parsing as Constraint Propagation</title>
<author><first>Frank</first><last>Morawietz</last></author>
<pages>315-316</pages>
<url hash="8d92f5bf">2000.iwpt-1.40</url>
<abstract/>
<bibkey>morawietz-2000-chart-parsing</bibkey>
</paper>
<paper id="41">
<title>Tree-structured Chart Parsing</title>
<author><first>Paul W.</first><last>Placeway</last></author>
<pages>317-318</pages>
<url hash="40533bb7">2000.iwpt-1.41</url>
<abstract>We investigate a method of improving the memory efficiency of a chart parser. Specifically, we propose a technique to reduce the number of active arcs created in the process of parsing. We sketch the differences in the chart algorithm, and provide empirical results that demonstrate the effectiveness of this technique.</abstract>
<bibkey>placeway-2000-tree</bibkey>
</paper>
<paper id="42">
<title>A Parsing Methodology for Error Detection</title>
<author><first>Davide</first><last>Turcato</last></author>
<author><first>Devlan</first><last>Nicholson</last></author>
<author><first>Trude</first><last>Heift</last></author>
<author><first>Janine</first><last>Toole</last></author>
<author><first>Stavroula</first><last>Tsiplakou</last></author>
<pages>319-320</pages>
<url hash="4435792d">2000.iwpt-1.42</url>
<abstract/>
<bibkey>turcato-etal-2000-parsing</bibkey>
</paper>
<paper id="43">
<title>Dependency Model using Posterior Context</title>
<author><first>Kiyotaka</first><last>Uchimoto</last></author>
<author><first>Masaki</first><last>Murata</last></author>
<author><first>Satoshi</first><last>Sekine</last></author>
<author><first>Hitoshi</first><last>Isahara</last></author>
<pages>321-322</pages>
<url hash="d1d4da1c">2000.iwpt-1.43</url>
<abstract>We describe a new model for dependency structure analysis. This model learns the relationship between two phrasal units called bunsetsus as three categories: ‘between’, ‘dependent’, and ‘beyond’. It estimates the dependency likelihood by considering not only the relationship between two bunsetsus but also the relationship between the left bunsetsu and all of the bunsetsus to its right. We implemented this model based on the maximum entropy model. When using the Kyoto University corpus, the dependency accuracy of our model was 88%, which is about 1% higher than that of the conventional model using exactly the same features.</abstract>
<bibkey>uchimoto-etal-2000-dependency</bibkey>
</paper>
<paper id="44">
<title>The Editing Distance in Shared Forest</title>
<author><first>Manuel</first><last>Vilares</last></author>
<author><first>David</first><last>Cabrero</last></author>
<author><first>Francisco J.</first><last>Ribadas</last></author>
<pages>323-324</pages>
<url hash="4c3bc888">2000.iwpt-1.44</url>
<abstract>In an information system, indexing can be accomplished by creating a citation based on context-free parses, and matching becomes a natural mechanism to extract patterns. However, the language intended to represent the document can often only be approximately defined, and indices can become shared forests. Queries could also vary from indices, and an approximate matching strategy also becomes necessary. We present a proposal intended to prove the applicability of tabulation techniques in this context.</abstract>
<bibkey>vilares-etal-2000-editing</bibkey>
</paper>
</volume>
</collection>