Optionally leave Unicode characters unescaped in MathML output #59

Open
skalee opened this issue Apr 8, 2021 · 2 comments

Comments

skalee commented Apr 8, 2021

This is a feature request. I can provide implementation, but I wanted to discuss things first.


MathML does not enforce any particular encoding: UTF-8 is legal, and non-ASCII characters can be used without escaping them [W3C].

However, this gem always encodes non-ASCII characters using numeric XML character references (e.g. &#xE9; for é):

def append_escaped(text)
  text.each_codepoint do |cp|
    if cp == 38               # "&"
      @mathml << "&amp;"
    elsif cp == 60            # "<"
      @mathml << "&lt;"
    elsif cp == 62            # ">"
      @mathml << "&gt;"
    elsif cp > 127            # any non-ASCII code point
      @mathml << "&#x#{cp.to_s(16).upcase};"
    else
      @mathml << cp
    end
  end
end

This is safer, as it never depends on the parent document's encoding, but on the other hand it hampers readability.

My suggestion is to add an :escape_non_ascii option to the Expression#to_mathml method which would disable this kind of escaping (of course <, >, and & would still be escaped). This option should default to false.

Perhaps a similar option could be added to the Expression#to_html method.
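A minimal sketch of how such a switch could work, written as a standalone helper rather than as the gem's actual code. The name escape_mathml_text and its second parameter are hypothetical; the flag is a required argument here so that the sketch takes no position on the default.

```ruby
# Hypothetical standalone helper, not part of the gem's API.
# It mirrors the escaping rules of append_escaped shown above,
# with a flag that controls whether non-ASCII characters are escaped.
def escape_mathml_text(text, escape_non_ascii)
  text.each_char.map { |ch|
    case ch
    when "&" then "&amp;"
    when "<" then "&lt;"
    when ">" then "&gt;"
    else
      if escape_non_ascii && ch.ord > 127
        # Numeric character reference with uppercase hex digits.
        format("&#x%X;", ch.ord)
      else
        ch
      end
    end
  }.join
end

sample = "a<b & caf\u00E9"            # "\u00E9" is the letter e with acute accent
escaped   = escape_mathml_text(sample, true)
unescaped = escape_mathml_text(sample, false)
```

In both modes the three XML-significant characters are always escaped; only the treatment of code points above 127 changes.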

pepijnve (Member) commented Apr 8, 2021

I've added the code to do this already. I'm still trying to figure out how I can let users pass options to to_mathml without breaking backwards compatibility. Looks like I've painted myself into a corner there.

skalee (Author) commented Apr 8, 2021

Maybe something like:

class Expression
  def to_mathml(prefix = "", attrs = {}, options = {})
  end
end

or:

class Expression
  def to_mathml(prefix = "", attrs_or_escape_non_ascii = nil, attrs_or_nil = nil)
    attrs = #...
    escape_non_ascii = #...
  end
end

or even:

class Expression
  # use like:
  # expr.with_options(escape_non_ascii: false).to_mathml
  def with_options(options)
    dup.tap do |new_expr|
      # set some options like disabling escaping on expression, which is quite odd, but at least gives reasonable interface
    end
  end
end

Keyword arguments would be even better, but I suppose this gem still supports good old Ruby 1.9 for some reason?
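For illustration, the first variant (a trailing options hash) can be sketched so that it avoids keyword arguments and existing call styles keep working. SketchExpression is a hypothetical stand-in, not the gem's Expression class; it only returns the resolved arguments instead of generating MathML, and the default of false is taken from the proposal earlier in this thread.

```ruby
# Hypothetical stand-in for the gem's Expression class. It only shows how
# the arguments could be resolved, not how MathML is actually generated.
class SketchExpression
  def to_mathml(prefix = "", attrs = {}, options = {})
    # Default taken from the proposal earlier in this thread.
    escape_non_ascii = options.fetch(:escape_non_ascii, false)
    # Return the resolved arguments so the behaviour is easy to inspect.
    { :prefix => prefix, :attrs => attrs, :escape_non_ascii => escape_non_ascii }
  end
end

expr = SketchExpression.new
no_args      = expr.to_mathml                                   # existing call style
with_attrs   = expr.to_mathml("m", :display => "block")         # existing call style
with_options = expr.to_mathml("", {}, :escape_non_ascii => true)
```

The cost of this shape is that a caller who only wants to set an option still has to pass prefix and attrs explicitly.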
