-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate unnecessary escape characters #4176
Conversation
To me, this should directly trigger And this should go in the 3.x branch (so that 4.0 can ship with the new behavior instead of having to wait until 5.0) |
Thanks for the review. I think it makes sense. I will change this PR to target v3 and make sure we only deprecate and not introduce new functionality. Then for v4 we can do the other change. |
707e826
to
7bbbe6f
Compare
Oke, I updated the PR to target V3. We don't change the behavior. We only deprecate, using When this is merged, v4 is rebased with v3, I can provide the PR that changes the behavior there. Please have another look when you have time 😊 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably add a note in the docs explaining when one needs to escape.
f385f58
to
5b1c8b1
Compare
I refactored the code a bit and added more test cases. I think good to carefully review these test cases. Let me know if you think I can add more cases. Also updated the docs. |
5b1c8b1
to
c062f96
Compare
c062f96
to
18bc4f2
Compare
0954f23
to
91f0010
Compare
Alright, I think I'm done. I added more documentation related to Octal and Hexadecimal escape sequences. |
91f0010
to
63b3230
Compare
@fabpot what do you think of this? Anything we should change or add more tests? |
I've just run a small performance test via the script below and (maybe as expected) the hit can be significant: <?php
require_once __DIR__.'/vendor/autoload.php';
$string = str_repeat("日本では、春になると桜の花が咲きます。多くの人々は、公園や川の近くに集まり、お花見を楽しみます。桜の花びらが風に舞い、まるで雪のように見える瞬間は、とても美しいです。", 10);
$loader = new Twig\Loader\ArrayLoader([
'index.twig' => <<<EOF
{{ "$string" }}
EOF
,
]);
$twig = new Twig\Environment($loader);
for ($i = 0; $i < 10000; $i++) {
$twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig'));
} On my machine, ~100ms for the 3.x branch vs ~900ms for this branch. But I suppose that this kind of long string is not very common. I would expect "hardcoded" strings to be quite short in templates. |
I ran the bench on my machine:
When I change the private function stripcslashes(string $str, string $quoteType): string
{
+ if (!str_contains($str, '\\')) {
+ return $str;
+ }
+ // ... it becomes this:
But obviously, that's because the bench does not contain any escape characters. So let's change the bench to: <?php
require_once __DIR__.'/vendor/autoload.php';
$string = str_repeat("日本では、春になると桜の花が咲きます。\n多くの人々は、公園や川の近くに集まり、お花見を楽しみます。\\n桜の花びらが風に舞い、まるで雪のように見える瞬間は、とても美しいです。", 10);
$loader = new Twig\Loader\ArrayLoader([
'index.twig' => <<<EOF
{{ "$string" }}
EOF
,
]);
$twig = new Twig\Environment($loader);
for ($i = 0; $i < 10000; $i++) {
$twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig'));
} Now we get
So the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You haven't added the str_contains()
call yet, right?
fc6704a
to
a4b9cae
Compare
See twigphp#4123 twigphp#2712 This allows to change the implementation in v4 so that we no longer have to escape backslashes.
a4b9cae
to
c6656cf
Compare
@fabpot Applied the feedback and also added the |
Thank you @ruudk. |
Thanks! I will prep the next PR when v4 is synced. |
Hi @ruudk ... i may have a version that reduces the runtime difference, using chunks to avoid looping on every char. It passes the tests, but i'm not sure if this goes in the direction you wanted for it. So if you want to check / benchmark it: private function stripcslashes(string $str, string $quoteType): string
{
if (!str_contains($str, '\\')) {
return $str;
}
$result = '';
$length = strlen($str);
$specialChars = [
'f' => "\f",
'n' => "\n",
'r' => "\r",
't' => "\t",
'v' => "\v",
];
$i = 0;
while ($i < $length) {
if (false === $pos = strpos($str, '\\', $i)) {
$result .= substr($str, $i);
break;
}
$result .= substr($str, $i, $pos - $i);
$i = $pos + 1;
if ($i >= $length) {
$result .= '\\';
break;
}
$nextChar = $str[$i];
if (isset($specialChars[$nextChar])) {
$result .= $specialChars[$nextChar];
} elseif ($nextChar === '\\' || $nextChar === $quoteType) {
$result .= $nextChar;
} elseif ($nextChar === '#' && $i + 1 < $length && $str[$i + 1] === '{') {
$result .= '#{';
$i++;
} elseif ($nextChar === 'x' && $i + 1 < $length && ctype_xdigit($str[$i + 1])) {
$hex = $str[++$i];
if ($i + 1 < $length && ctype_xdigit($str[$i + 1])) {
$hex .= $str[++$i];
}
$result .= chr(hexdec($hex));
} elseif (ctype_digit($nextChar) && $nextChar < '8') {
$octal = $nextChar;
while ($i + 1 < $length && ctype_digit($str[$i + 1]) && $str[$i + 1] < '8' && strlen($octal) < 3) {
$octal .= $str[++$i];
}
$result .= chr(octdec($octal));
} else {
trigger_deprecation('twig/twig', '3.12', sprintf('Character "%s" at position %d does not need to be escaped anymore.', $nextChar, $i + 1));
$result .= $nextChar;
}
$i++;
}
return $result;
} |
@smnandre smart! Will check it out today |
See twigphp#4176 (comment) Co-Authored-By: Simon André <[email protected]> ``` # Old Benchmark 1: php bench.php Time (mean ± σ): 1.192 s ± 0.004 s [User: 1.163 s, System: 0.009 s] Range (min … max): 1.186 s … 1.198 s 10 runs # New Benchmark 1: php bench.php Time (mean ± σ): 356.5 ms ± 1.1 ms [User: 342.0 ms, System: 7.6 ms] Range (min … max): 354.6 ms … 358.2 ms 10 runs ``` Using bench.php: ``` <?php require_once __DIR__.'/vendor/autoload.php'; $string = str_repeat("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut a tincidunt turpis. \\App\\Entity\\Product \World \065 App\#{var} \'quoted\' Donec pharetra enim quis erat pharetra, dignissim molestie erat laoreet. Vivamus auctor purus sed lorem vestibulum rhoncus.", 10); $loader = new Twig\Loader\ArrayLoader([ 'index.twig' => <<<EOF {{ "$string" }} EOF , ]); $twig = new Twig\Environment($loader); for ($i = 0; $i < 10000; $i++) { $twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig')); } ```
See twigphp#4176 (comment) Co-Authored-By: Simon André <[email protected]> ``` # Old Benchmark 1: php bench.php Time (mean ± σ): 1.192 s ± 0.004 s [User: 1.163 s, System: 0.009 s] Range (min … max): 1.186 s … 1.198 s 10 runs # New Benchmark 1: php bench.php Time (mean ± σ): 356.5 ms ± 1.1 ms [User: 342.0 ms, System: 7.6 ms] Range (min … max): 354.6 ms … 358.2 ms 10 runs ``` Using bench.php: ``` <?php require_once __DIR__.'/vendor/autoload.php'; $string = str_repeat("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut a tincidunt turpis. \\App\\Entity\\Product \World \065 App\#{var} \'quoted\' Donec pharetra enim quis erat pharetra, dignissim molestie erat laoreet. Vivamus auctor purus sed lorem vestibulum rhoncus.", 10); $loader = new Twig\Loader\ArrayLoader([ 'index.twig' => <<<EOF {{ "$string" }} EOF , ]); $twig = new Twig\Environment($loader); for ($i = 0; $i < 10000; $i++) { $twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig')); } ```
Created the PR to optimize this further: #4196 |
See twigphp#4176 (comment) Co-Authored-By: Simon André <[email protected]> ``` # Old Benchmark 1: php bench.php Time (mean ± σ): 1.204 s ± 0.012 s [User: 1.185 s, System: 0.011 s] Range (min … max): 1.191 s … 1.225 s 10 runs # New Benchmark 1: php bench.php Time (mean ± σ): 368.3 ms ± 4.8 ms [User: 354.2 ms, System: 10.1 ms] Range (min … max): 362.9 ms … 375.6 ms 10 runs ``` Using bench.php: ``` <?php require_once __DIR__.'/vendor/autoload.php'; $string = str_repeat("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut a tincidunt turpis. \\App\\Entity\\Product \World \065 App\#{var} \'quoted\' Donec pharetra enim quis erat pharetra, dignissim molestie erat laoreet. Vivamus auctor purus sed lorem vestibulum rhoncus.", 10); $loader = new Twig\Loader\ArrayLoader([ 'index.twig' => <<<EOF {{ "$string" }} EOF , ]); $twig = new Twig\Environment($loader); for ($i = 0; $i < 10000; $i++) { $twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig')); } ```
This PR was squashed before being merged into the 3.x branch. Discussion ---------- Optimize stripcslashes See #4176 (comment) by `@smnandre` ``` # Old Benchmark 1: php bench.php Time (mean ± σ): 1.192 s ± 0.004 s [User: 1.163 s, System: 0.009 s] Range (min … max): 1.186 s … 1.198 s 10 runs # New Benchmark 1: php bench.php Time (mean ± σ): 356.5 ms ± 1.1 ms [User: 342.0 ms, System: 7.6 ms] Range (min … max): 354.6 ms … 358.2 ms 10 runs ``` Using bench.php: ```php <?php require_once __DIR__.'/vendor/autoload.php'; $string = str_repeat("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut a tincidunt turpis. \\App\\Entity\\Product \World \065 App\#{var} \'quoted\' Donec pharetra enim quis erat pharetra, dignissim molestie erat laoreet. Vivamus auctor purus sed lorem vestibulum rhoncus.", 10); $loader = new Twig\Loader\ArrayLoader([ 'index.twig' => <<<EOF {{ "$string" }} EOF , ]); $twig = new Twig\Environment($loader); for ($i = 0; $i < 10000; $i++) { $twig->tokenize(new Twig\Source($twig->getLoader()->getSourceContext('index.twig')->getCode(), 'index.twig')); } ``` Commits ------- bc6e36d Optimize stripcslashes
This PR was merged into the 5.4 branch. Discussion ---------- [TwigBridge] fix tests using Twig 3.12 | Q | A | ------------- | --- | Branch? | 5.4 | Bug fix? | no | New feature? | no | Deprecations? | no | Issues | | License | MIT related to twigphp/Twig#4176 Commits ------- a9858c6 fix tests using Twig 3.12
Follow up of twigphp#4176, twigphp#4123 and twigphp#2712. Now that we have deprecated unnecessary escape in v3, we can change the behavior in v4. With this change, unnecessary will no longer be ignored, but handled like any other character. This allows for writing fully qualified class names like this: ```twig {{ constant('App\Entity\User::SOME_CONSTANT') }} ``` Instead of having to escape the `\` in v3: ```twig {{ constant('App\\Entity\\User::SOME_CONSTANT') }} ```
Follow up of twigphp#4176, twigphp#4123 and twigphp#2712. Now that we have deprecated unnecessary escape in v3, we can change the behavior in v4. With this change, unnecessary will no longer be ignored, but handled like any other character. This allows for writing fully qualified class names like this: ```twig {{ constant('App\Entity\User::SOME_CONSTANT') }} ``` Instead of having to escape the `\` in v3: ```twig {{ constant('App\\Entity\\User::SOME_CONSTANT') }} ```
Follow up of twigphp#4176, twigphp#4123 and twigphp#2712. Now that we have deprecated unnecessary escape in v3, we can change the behavior in v4. With this change, unnecessary will no longer be ignored, but handled like any other character. This allows for writing fully qualified class names like this: ```twig {{ constant('App\Entity\User::SOME_CONSTANT') }} ``` Instead of having to escape the `\` in v3: ```twig {{ constant('App\\Entity\\User::SOME_CONSTANT') }} ```
This PR was squashed before being merged into the 4.x branch. Discussion ---------- Do not hide unnecessary escape characters Follow up of #4176, #4123 and #2712. Now that we have deprecated unnecessary escape in v3, we can change the behavior in v4. With this change, unnecessary will no longer be ignored, but handled like any other character. This allows for writing fully qualified class names like this: ```twig {{ constant('App\Entity\User::SOME_CONSTANT') }} ``` Instead of having to escape the `\` in v3: ```twig {{ constant('App\\Entity\\User::SOME_CONSTANT') }} ``` /cc `@fabpot` `@stof` Commits ------- 53c24bf Do not hide unnecessary escape characters
This PR was squashed before being merged into the 3.x branch. Discussion ---------- Improve escape deprecation message This was introduced in #4176 According to #4343 the message can be improved. I made a few changes to the test to make it easier to work with. Best to review this PR **commit by commit**. /cc `@stof` Commits ------- 7256713 Improve escape deprecation message
This is a first attempt at solving #4123 and #2712.
Currently it writes the deprecations to an array in the Lexer. This is probably not the way to do it. Should we directly trigger
E_USER_DEPRECATED
errors?/cc @stof @fabpot Let me know what you think 😊