Markdown rendering changes - stripping classes? #9054
Comments
Currently, output from all renderers is passed through a post-processor and an HTML sanitizer (see gitea/modules/markup/markup.go, lines 83 to 92 in f8bd90b).
The sanitizer rules are getting in your way. At the moment there's no way of bypassing or tailoring those rules.
I don't know in what version that behavior could have changed, though.
Maybe options to bypass those steps could be added to the renderers' configuration?
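To make the pipeline described above concrete, here is a toy sketch of a render, post-process, sanitize chain. It is not Gitea's code: `externalRender`, `postProcess`, and `sanitize` are hypothetical stand-ins, and stripping attributes with a regular expression is only an illustration, not a safe way to sanitize HTML.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// classAttr matches a double-quoted class attribute and its leading whitespace.
// Hypothetical: a real sanitizer parses HTML rather than pattern-matching it.
var classAttr = regexp.MustCompile(`\s+class="[^"]*"`)

// externalRender stands in for an external renderer such as pandoc,
// which wraps math in a span carrying classes.
func externalRender(src string) string {
	return `<span class="math inline">` + src + `</span>`
}

// postProcess stands in for the post-processing step.
func postProcess(html string) string {
	return strings.TrimSpace(html)
}

// sanitize stands in for a sanitizer whose policy does not allow class
// attributes: every class attribute is dropped.
func sanitize(html string) string {
	return classAttr.ReplaceAllString(html, "")
}

// render chains the three steps in the order described in the comment above.
func render(src string) string {
	return sanitize(postProcess(externalRender(src)))
}

func main() {
	fmt.Println("renderer output:", externalRender("x^2"))
	fmt.Println("after sanitizer:", render("x^2"))
}
```

Because the sanitizer runs last, anything the renderer emits that the policy does not explicitly allow never reaches the browser, regardless of how the renderer is configured.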
Ah no, you're right and I'm wrong. I pulled 1.9.1, 1.9.0, and 1.8.3 and all had the same behavior as 1.10.0. I must've been misremembering the order I did the migration in. I thought I went gogs -> gitea and then added the custom renderer but I must've done the reverse. Sorry for the noise!
That'd be fine with me. Alternatively, what about something like:

diff --git a/modules/markup/sanitizer.go b/modules/markup/sanitizer.go
index f873e8105..66bf1b60f 100644
--- a/modules/markup/sanitizer.go
+++ b/modules/markup/sanitizer.go
@@ -48,6 +48,9 @@ func ReplaceSanitizer() {
 	// Allow keyword markup
 	sanitizer.policy.AllowAttrs("class").Matching(regexp.MustCompile(`^` + keywordClass + `$`)).OnElements("span")
+
+	// Allow KaTeX markup from pandoc
+	sanitizer.policy.AllowAttrs("class").Matching(regexp.MustCompile(`^(math\s*|inline\s*|display\s*){0,3}$`)).OnElements("span")
 }

 // Sanitize takes a string that contains a HTML fragment or document and applies policy whitelist.

That seems simpler: you're letting through a very small subset of classes, and in the default installation nothing will happen because you're not shipping KaTeX and you're not shipping the Pandoc renderer. The owner would have to manually add KaTeX rendering (scripts + CSS + ...) and modify the config to switch to Pandoc to hit this in most cases. Thoughts?
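As a quick way to see which class values the pattern in the diff above would let through, the following sketch checks a few candidates using only the standard `regexp` package. The helper name `classAllowed` and the sample values are made up for illustration; the pattern itself is copied from the diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// katexClass is the pattern proposed in the diff for class attributes on span.
var katexClass = regexp.MustCompile(`^(math\s*|inline\s*|display\s*){0,3}$`)

// classAllowed reports whether a class attribute value matches the pattern.
// Hypothetical helper; the real check would happen inside the sanitizer policy.
func classAllowed(value string) bool {
	return katexClass.MatchString(value)
}

func main() {
	candidates := []string{
		"math inline",
		"math display",
		"math",
		"",
		"math inline hidden",
		"btn btn-danger",
	}
	for _, v := range candidates {
		fmt.Printf("%-22q allowed=%v\n", v, classAllowed(v))
	}
}
```

Note that the pattern is permissive in small ways: it accepts the empty string, any ordering of the three words, and repeats such as `math math math`, since each alternative may appear up to three times. It still rejects any value containing a class outside those three.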
Or we could add a custom regexp setting to the third-party external renderer configuration so that users could define the rules themselves.
That sounds like the best candidate. May I take a shot at implementing that?
@cipherboy Of course! You can check
OK, I have a working draft on my fork. Is it possible to run tests without |
You certainly can run partial (but meaningful) tests with:
Once you're satisfied, you can run:
Allowing the Gitea administrator to configure sanitization policy allows them to couple external renderers and custom templates to support more markup. In particular, the `pandoc` renderer allows generating KaTeX annotations, wrapping them in `<span>` elements with class `math` and either `inline` or `display` (depending on whether inline or block mode was requested).

This iteration gives the administrator whitelisting powers; carefully crafted regexes will thus let through only the desired attributes necessary to support their custom markup.

Resolves: go-gitea#9054
Signed-off-by: Alexander Scheel <[email protected]>
* Support custom sanitization policy

  Allowing the Gitea administrator to configure sanitization policy allows them to couple external renderers and custom templates to support more markup. In particular, the `pandoc` renderer allows generating KaTeX annotations, wrapping them in `<span>` elements with class `math` and either `inline` or `display` (depending on whether inline or block mode was requested).

  This iteration gives the administrator whitelisting powers; carefully crafted regexes will thus let through only the desired attributes necessary to support their custom markup.

  Resolves: #9054
  Signed-off-by: Alexander Scheel <[email protected]>

* Document new sanitization configuration

  - Adds basic documentation to app.ini.sample,
  - Adds an example to the Configuration Cheat Sheet, and
  - Adds extended information to External Renderers section.

  Signed-off-by: Alexander Scheel <[email protected]>

* Drop extraneous length check in newMarkupSanitizer(...)

  Signed-off-by: Alexander Scheel <[email protected]>

* Fix plural ELEMENT and ALLOW_ATTR in docs

  These were left over from their initial names. Make them singular to conform with the current expectations.

  Signed-off-by: Alexander Scheel <[email protected]>
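The idea of an administrator-defined whitelist rule can be modeled roughly as follows. This is a sketch under assumptions, not the merged implementation: the field names `Element` and `AllowAttr` echo the `ELEMENT` and `ALLOW_ATTR` names mentioned in the commit message, while the `Regexp` field, the `SanitizerRule` type, the `Allows` method, and the example pattern are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// SanitizerRule is a hypothetical model of one administrator-configured rule:
// on Element, allow attribute AllowAttr when its value matches Regexp.
type SanitizerRule struct {
	Element   string
	AllowAttr string
	Regexp    *regexp.Regexp
}

// Allows reports whether the rule lets the given attribute value through.
// Anything the rule does not explicitly match is rejected (whitelisting).
func (r SanitizerRule) Allows(element, attr, value string) bool {
	return element == r.Element && attr == r.AllowAttr && r.Regexp.MatchString(value)
}

// katexRule builds an example rule for the pandoc/KaTeX case from the issue.
// The pattern is an assumption chosen for illustration.
func katexRule() SanitizerRule {
	return SanitizerRule{
		Element:   "span",
		AllowAttr: "class",
		Regexp:    regexp.MustCompile(`^math (inline|display)$`),
	}
}

func main() {
	rule := katexRule()
	fmt.Println(rule.Allows("span", "class", "math inline"))
	fmt.Println(rule.Allows("div", "class", "math inline"))
	fmt.Println(rule.Allows("span", "class", "math inline hidden"))
}
```

Anchoring the pattern with `^` and `$` is what keeps the rule narrow: an unanchored pattern would also accept values that merely contain an allowed class alongside arbitrary others.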
Nothing relevant in logs.
Description
I've installed a custom Markdown renderer based on `pandoc` as such:

This lets me add a custom header and render LaTeX in Markdown with KaTeX:

Sometime recently (I remember it working early in the 1.9.x series) this broke. Looking at the source, it looks like classes on elements started getting stripped by Gitea after `pandoc` finished rendering.

For example, the browser gets sent source like:

However, running the `pandoc` command above on the server (on the same source file) gives:

This makes me think Gitea changed something recently. The result is that KaTeX does not render anything, which means my Math+Markdown files are now broken.

Did something change? Perhaps more Markdown sanitization was added recently?