Address extraction code is impossible to work on #128

Open
slinkp opened this issue Sep 28, 2012 · 1 comment

slinkp commented Sep 28, 2012

The regular expression that ebdata.nlp.addresses uses to find addresses is actually a 100-line regex into which a 130-line regex is inserted 11 times. The final regex is over 1500 lines long.

It is almost impossible to debug, fix, or extend this regex.
We need to re-think the address extraction approach completely.
I am investigating whether there is existing natural-language processing work we can leverage.
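
To illustrate the kind of composition described above, here is a minimal sketch of a verbose regex template with a sub-pattern interpolated at several points. The names `STREET_NAME`, `ADDRESS_TEMPLATE`, and `expanded_line_count` are hypothetical and the patterns are toy stand-ins, not the actual code in ebdata.nlp.addresses. It only shows why the expanded pattern grows so quickly: every insertion repeats the whole sub-pattern.

```python
import re

# Hypothetical stand-in for the sub-pattern that gets inserted repeatedly.
# This is NOT the real pattern from ebdata.nlp.addresses.
STREET_NAME = r"""
    (?:[A-Z][a-z]+)          # one capitalized word
    (?:\s+[A-Z][a-z]+)*      # optionally followed by more capitalized words
"""

# Hypothetical outer template; each %(street)s is replaced by the full
# text of STREET_NAME, so the sub-pattern is duplicated at every use.
ADDRESS_TEMPLATE = r"""
    \b\d{1,5}\s+             # house number
    (?:
        %(street)s \s+ (?:St|Ave|Blvd)\b
      | (?:N|S|E|W)\.? \s+ %(street)s
    )
"""

ADDRESS = ADDRESS_TEMPLATE % {"street": STREET_NAME}
address_re = re.compile(ADDRESS, re.VERBOSE)


def expanded_line_count(outer_lines, inner_lines, insertions):
    """Rough size of the final pattern: the outer template plus one full
    copy of the inner pattern per insertion point."""
    return outer_lines + inner_lines * insertions


if __name__ == "__main__":
    match = address_re.search("Meet at 123 Main St today")
    print(match.group(0))
    # Using the figures quoted in this ticket: 100 + 130 * 11
    print(expanded_line_count(100, 130, 11))
```

With the figures from this ticket, the rough estimate is 100 + 130 × 11 = 1530 lines, which is consistent with "over 1500". A bug in the inner pattern is therefore present in every copy, and the compiled result is far too large to read when something goes wrong.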

slinkp commented Sep 28, 2012

Ticket imported from Trac:
http://developer.openblockproject.org/ticket/128
Reported by: slinkp
