You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There are a few issues in the function apply_content_annotation in filter.php, which can result in serious issues with the page display.
The use of DOMDocument->loadHTML, followed immediately by getting the body does not always work as expected.
loadHTML will only generate a body and put things inside it if it needs to be in the body.
Some locations this filter activates contain content which can exist outside the body, such as user added scripts and HTML comments.
These will not be found in the body of the generated DOMDocument.
If the text only contains such elements, there will be no body and warnings will be thrown, and then it will attempt to replace a > symbol with the annotation.
If there is non body HTML, followed by a single div with the class no-overflow, it wont wrap it, it will alter the non-body HTML, and not touch the single div at all.
This can be very problematic if the first non-body HTML is a HTML comment, as the text replacement will then break the end of comment tag, significantly altering the page contents.
This can be seen for example in an assignment by making the description contain only the html:
This will trigger PHP warnings, and it will convert it to something like <!-- hi there -- data-ally-richcontent = assign:assign:intro:3478947548 >
I think one possible solution would be to change the code to this: $doc = local_content::build_dom_doc('<body>' . $text . '</body>'); if (!$doc) { return $text; } $shouldwrap = true; $bodyChildren = $doc->getElementsByTagName('body')->item(0)->childNodes; if($bodyChildren->length === 1) { $node = $bodyChildren->item(0); $shouldwrap = $node->nodeName !== 'div' || strpos($node->getAttribute('class'), 'no-overflow') === false; }
The change to enclose it in body tags will force everything to be in the body, so there will no longer be an issue of no body tag, or the body tag only containing some of the items instead of all.
Setting shouldwrap to true and then updating it will mean by default it will wrap and only when it confirms it shouldn't it wont, which means errors will have it wrap.
Changing it to use nodeName instead of tagName will still produce div for divs, but for text nodes and the comments and the like, instead of being an error it will just report the corresponding node Name.
And here are some test cases which can be used (as a php array of arrays, indicating the input string and if it should be wrapped: $tests = [ ['<!--hi there-->', true], ['<!--hi there--><div class="no-overflow">Test</div>', true], ['<div class="no-overflow">Test</div>', false], ['<p class="no-overflow">Test</p>', true], ['<div>Test</div>', true], ['<div class="no-overflow">Test</div><p>Test</p>', true], ];
With the current code, the first 2 tests fail and the other 4 pass. The first three also produce warnings.
With my suggested changes, all 6 tests pass.
The text was updated successfully, but these errors were encountered:
Many thanks for your clear explanation of the issue, suggestion for fixing it and test examples. This should be included in the next release. Or if you would like to create a PR to master I will try to get it merged for you. (I don't have the ability to merge PRs but I can recommend that they should be)
In the meantime, I thought of another example which causes a problem: <div title="Hover > Get text" class="no-overflow">Hover for more</div>
In this case, the > in the attribute gets replaced, also breaking the HTML
I think a simple way to possibly fix that is to change the pattern that is matched to /<div/mU
However, this would mean if for whatever reason the annotation is applied multiple times, to the same input text, the last annotation that is applied will take precedence rather than the first annotation that is applied.
I'm not sure if this would ever happen.
If it does, it can be addressed by also using $node->getAttribute('data-ally-richcontent') to see if it has already been annotated.
There are a few issues in the function apply_content_annotation in filter.php, which can result in serious issues with the page display.
The use of DOMDocument->loadHTML, followed immediately by getting the body does not always work as expected.
loadHTML will only generate a body and put things inside it if it needs to be in the body.
Some locations this filter activates contain content which can exist outside the body, such as user added scripts and HTML comments.
These will not be found in the body of the generated DOMDocument.
If the text only contains such elements, there will be no body and warnings will be thrown, and then it will attempt to replace a > symbol with the annotation.
If there is non body HTML, followed by a single div with the class no-overflow, it wont wrap it, it will alter the non-body HTML, and not touch the single div at all.
This can be very problematic if the first non-body HTML is a HTML comment, as the text replacement will then break the end of comment tag, significantly altering the page contents.
This can be seen for example in an assignment by making the description contain only the html:
This will trigger PHP warnings, and it will convert it to something like
<!-- hi there -- data-ally-richcontent = assign:assign:intro:3478947548 >
I think one possible solution would be to change the code to this:
$doc = local_content::build_dom_doc('<body>' . $text . '</body>'); if (!$doc) { return $text; } $shouldwrap = true; $bodyChildren = $doc->getElementsByTagName('body')->item(0)->childNodes; if($bodyChildren->length === 1) { $node = $bodyChildren->item(0); $shouldwrap = $node->nodeName !== 'div' || strpos($node->getAttribute('class'), 'no-overflow') === false; }
The change to enclose it in body tags will force everything to be in the body, so there will no longer be an issue of no body tag, or the body tag only containing some of the items instead of all.
Setting shouldwrap to true and then updating it will mean by default it will wrap and only when it confirms it shouldn't it wont, which means errors will have it wrap.
Changing it to use nodeName instead of tagName will still produce div for divs, but for text nodes and the comments and the like, instead of being an error it will just report the corresponding node Name.
And here are some test cases which can be used (as a php array of arrays, indicating the input string and if it should be wrapped:
$tests = [ ['<!--hi there-->', true], ['<!--hi there--><div class="no-overflow">Test</div>', true], ['<div class="no-overflow">Test</div>', false], ['<p class="no-overflow">Test</p>', true], ['<div>Test</div>', true], ['<div class="no-overflow">Test</div><p>Test</p>', true], ];
With the current code, the first 2 tests fail and the other 4 pass. The first three also produce warnings.
With my suggested changes, all 6 tests pass.
The text was updated successfully, but these errors were encountered: