WikipediaTokenizer incorrectly splits certain syntax into multiple tokens [LUCENE-1141] #2218

Open
asfimport opened this issue Jan 18, 2008 · 2 comments


@asfimport

WikipediaTokenizer incorrectly splits tokens that contain italic/bold markup inside the token. For instance, '''F'''oo is Foo with a bold F and should come out as the single token Foo, not as the two tokens F and oo.
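
To make the expected behavior concrete, here is a minimal standalone sketch. It is not how WikipediaTokenizer is implemented and it does not use any Lucene API; the class name InlineMarkupSketch and the method tokenize are hypothetical, chosen only for illustration. It assumes that any run of two or more apostrophes is italic/bold markup, removes those runs, and then splits on whitespace, so markup in the middle of a word no longer breaks the word apart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
class InlineMarkupSketch {
    // Assumption: a run of two or more apostrophes is wiki italic/bold markup.
    // A single apostrophe (as in "don't") is left untouched.
    private static final Pattern QUOTE_RUN = Pattern.compile("'{2,}");

    // Removes italic/bold delimiters first, then splits on whitespace,
    // so '''F'''oo yields the single token Foo.
    static List<String> tokenize(String wikiText) {
        String stripped = QUOTE_RUN.matcher(wikiText).replaceAll("");
        List<String> tokens = new ArrayList<>();
        for (String piece : stripped.split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Foo, is, a, bold, Foo]
        System.out.println(tokenize("'''F'''oo is a ''bold'' Foo"));
    }
}
```

A real fix would also need to preserve the bold/italic token type and the original offsets, which this sketch ignores.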


Migrated from LUCENE-1141 by Grant Ingersoll (@gsingers), updated May 16 2011
Attachments: LUCENE-1141-test.patch

@asfimport

Grant Ingersoll (@gsingers) (migrated from JIRA)

Here's a test case for the problem.

@asfimport

Jens Muecke (@ryd) (migrated from JIRA)

Patch doesn't apply any more.

common.compile-test:
[javac] Compiling 1 source file to /home/jens/projects/java/lucene-git/build/contrib/wikipedia/classes/test
[javac] /home/jens/projects/java/lucene-git/contrib/wikipedia/src/test/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java:232: cannot find symbol
[javac] Token token = new Token();
[javac] ^
[javac] symbol: class Token
[javac] location: class WikipediaTokenizerTest
[javac] /home/jens/projects/java/lucene-git/contrib/wikipedia/src/test/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java:232: cannot find symbol
[javac] Token token = new Token();
[javac] ^
[javac] symbol: class Token
[javac] location: class WikipediaTokenizerTest
[javac] Note: /home/jens/projects/java/lucene-git/contrib/wikipedia/src/test/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors

BUILD FAILED

It's not fixed by adding the import for the Token class.
