Use interval calculus to compute regex ranges #14

Open
malcolmsparks opened this issue Aug 15, 2019 · 1 comment
Labels
enhancement New feature or request

Comments

@malcolmsparks
Contributor

Computing regexes for ranges of code points can be slow. As a result, code points beyond the Basic Multilingual Plane are not yet supported.

A faster approach is needed. Allen's interval algebra may offer a solution.

https://en.wikipedia.org/wiki/Allen%27s_interval_algebra
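
As a rough illustration of what Allen's relations look like for code-point ranges, here is a minimal sketch. It assumes ranges are closed `[lo hi]` vectors of ints with `lo <= hi`, and it treats adjacency (`a2 + 1 = b1`) as `:meets` because code points are discrete. The `relation` function is a hypothetical name and not part of this project.

```clojure
(defn relation
  "Classify closed integer interval [a1 a2] against [b1 b2] using the
  thirteen Allen relations. Assumes a1 <= a2 and b1 <= b2.
  Adjacent intervals (a2 + 1 = b1) are reported as :meets (an assumption)."
  [[a1 a2] [b1 b2]]
  (cond
    ;; Disjoint or adjacent cases first.
    (< (inc a2) b1) :before
    (= (inc a2) b1) :meets
    (< (inc b2) a1) :after
    (= (inc b2) a1) :met-by
    ;; From here on the two intervals share at least one point.
    (and (= a1 b1) (= a2 b2)) :equals
    (and (= a1 b1) (< a2 b2)) :starts
    (and (= a1 b1) (> a2 b2)) :started-by
    (and (= a2 b2) (> a1 b1)) :finishes
    (and (= a2 b2) (< a1 b1)) :finished-by
    (and (> a1 b1) (< a2 b2)) :during
    (and (< a1 b1) (> a2 b2)) :contains
    (< a1 b1) :overlaps
    :else :overlapped-by))

;; Usage
(relation [0x41 0x5A] [0x61 0x7A])             ; two separate ranges
(relation [0x10000 0x1FFFD] [0x1FFFE 0x2FFFD]) ; adjacent ranges
```

Each comparison only looks at the four endpoints, so the cost does not depend on how many code points a range covers.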

@malcolmsparks
Contributor Author

See comment here:

;; These higher code-points that lie outside the BMP
;; are significantly impacting compile performance. We
;; need to be able to do partition-into-ranges in a
;; much more performant way. Perhaps with intervals
;; rather than brute-force expansion of ranges into
;; sequences of ints.
;;(int-range 0x10000 0x1FFFD)
;;(int-range 0x20000 0x2FFFD)
;;(int-range 0x30000 0x3FFFD)
;;(int-range 0x40000 0x4FFFD)
;;(int-range 0x50000 0x5FFFD)
;;(int-range 0x60000 0x6FFFD)
;;(int-range 0x70000 0x7FFFD)
;;(int-range 0x80000 0x8FFFD)
;;(int-range 0x90000 0x9FFFD)
;;(int-range 0xA0000 0xAFFFD)
;;(int-range 0xB0000 0xBFFFD)
;;(int-range 0xC0000 0xCFFFD)
;;(int-range 0xD0000 0xDFFFD)
;;(int-range 0xE1000 0xEFFFD)
)))
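
One possible shape for the interval-based approach hinted at in the comment is to merge `[lo hi]` pairs directly, so the cost scales with the number of ranges rather than the number of code points. This is only a sketch: `merge-intervals` and `intervals->char-class` are hypothetical names, not existing functions in this project, and the `\x{...}` escape assumes the result is compiled by `java.util.regex.Pattern`.

```clojure
(defn merge-intervals
  "Merge closed integer intervals [lo hi] into a sorted vector of disjoint,
  non-adjacent intervals, without expanding any interval into a seq of ints."
  [intervals]
  (reduce
   (fn [acc [lo hi]]
     (let [[plo phi] (peek acc)]
       (if (and phi (<= lo (inc phi)))
         ;; Overlaps or touches the previous interval: extend it.
         (conj (pop acc) [plo (max phi hi)])
         ;; Otherwise start a new interval.
         (conj acc [lo hi]))))
   []
   (sort-by first intervals)))

(defn intervals->char-class
  "Render intervals as a regex character class using \\x{h...h} escapes
  (assumes java.util.regex.Pattern syntax)."
  [intervals]
  (str "["
       (apply str
              (for [[lo hi] (merge-intervals intervals)]
                (if (= lo hi)
                  (format "\\x{%X}" lo)
                  (format "\\x{%X}-\\x{%X}" lo hi))))
       "]"))

;; Usage: two of the commented-out ranges above, plus a BMP range.
(intervals->char-class [[0x20000 0x2FFFD] [0x10000 0x1FFFD] [0x41 0x5A]])
```

Because only endpoints are compared, a range such as `0x10000` to `0x1FFFD` costs the same to process as a single code point.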

@malcolmsparks malcolmsparks added the enhancement New feature or request label Aug 15, 2019