Herml hangs in infinite recursion on double quotes and non-ASCII chars. #5

Open
drdaeman opened this issue Jan 7, 2011 · 2 comments

Comments

@drdaeman

drdaeman commented Jan 7, 2011

I've found Herml (tested on 2942bd1) to hang in parser-generated code on the following template:

!!!
%html
  %head
    %meta[{charset,"utf-8"}]/

The generated code keeps calling herml_scan:string/4 recursively in an endless loop, quickly consuming all available memory. Unfortunately, I'm too new to Erlang and leex to understand why this happens. The calls look like this:

...
string/4(",\"utf-8\"}]/", 1, ",\"utf-8\"}]/", [{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
string/4("\"utf-8\"}]/", 1, "\"utf-8\"}]/", [{comma,1,[44]},{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
string/4("\"utf-8\"}]/", 1, "\"utf-8\"}]/", [{pipe,1,[]},{comma,1,[44]},{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
string/4("\"utf-8\"}]/", 1, "\"utf-8\"}]/", [{pipe,1,[]},{pipe,1,[]},{comma,1,[44]},{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
string/4("\"utf-8\"}]/", 1, "\"utf-8\"}]/", [{pipe,1,[]},{pipe,1,[]},{pipe,1,[]},{comma,1,[44]},{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
string/4("\"utf-8\"}]/", 1, "\"utf-8\"}]/", [{pipe,1,[]},{pipe,1,[]},{pipe,1,[]},{pipe,1,[]},{comma,1,[44]},{chr,1,"charset"},{lcurly,1,[123]},{lbrace,1,[91]},{chr,1,[109,101,116,97]},{tag_start,1,[37]}])
...

There are certainly no "|" characters in the template, and I really don't know why {pipe,1,[]} is there.
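
What the trace does show is that every {pipe,1,[]} token carries an empty character list and the remaining input never gets shorter. Below is a minimal Python sketch of how a scanner ends up in that state when one of its rules is allowed to match zero characters. This is not herml or leex code: the `RULES` table, the `scan` function, and the `max_steps` cap are all hypothetical, and whether herml's scanner really has such a rule is an assumption.

```python
import re

# Hypothetical rule table, NOT herml's real grammar. The assumption being
# illustrated is that the "pipe" pattern can match zero characters.
RULES = [
    ("chr", re.compile(r"[a-z]+")),
    ("comma", re.compile(r",")),
    ("pipe", re.compile(r"\|*")),  # matches "" when no "|" is present
]


def scan(text, max_steps=20):
    """Tokenize text, giving up after max_steps so the demo terminates."""
    tokens, pos, steps = [], 0, 0
    while pos < len(text) and steps < max_steps:
        steps += 1
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()  # stays put when the match is empty
                break
    return tokens, pos


tokens, pos = scan('charset,"utf-8"')
print(tokens[:4], "stopped at offset", pos)
```

Without the `max_steps` cap the loop would keep appending empty `pipe` tokens at the double quote, which is the same shape as the growing token list in the trace.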

After some mindless fiddling, I've found that similar hangs happen on the intentionally malformed %meta[{charset,"utf-8}]/ (missing the closing double quote), and whenever the template contains any "unknown" characters (for example, UTF-8 Cyrillic).

I've attempted to fix the issue with drdaeman/herml@58b4958, but due to my lack of expertise I don't know whether this is the proper solution or whether it just happens to work.

@seancribbs
Collaborator

There was a specific reason why we chose to use single-quotes, but since I haven't touched the code in over a year, I don't recall why. Needs revisiting.

@kevsmith
Owner

Actually, I haven't touched this code in quite a while. I'm thinking about taking down the repo entirely and letting someone else continue development on it.
