Remove iconv for decision search highlighting #1766

Merged

tomudding
Member

Unfortunately, using iconv makes it more likely that the converted text is shorter or longer than the original. For example, the euro symbol (€) artificially increases the length of the text we are comparing:

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', '€');
// 'EUR'

This is problematic, as it results in incorrectly aligned <mark> elements. While this can be mitigated by carefully calculating corrections for the offsets, doing so quickly makes this functionality harder to maintain, especially when more of these exceptions are needed.
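
To make the misalignment concrete, here is a minimal sketch of how an offset found in the converted text lands in the wrong place in the original. The `highlight()` helper, the sample sentence and the `?` placeholder are hypothetical and not taken from the actual implementation; the `EUR` expansion is hardcoded from the example above so the result does not depend on the platform's iconv, and the `mbstring` extension is assumed to be available.

```php
<?php

declare(strict_types=1);

// Illustrative text; 'EUR' is the iconv result quoted above, hardcoded here
// so the example does not depend on the platform's iconv implementation.
$original = 'Budget of €500 approved';
$expanded = 'Budget of EUR500 approved'; // 23 characters became 25
$preserved = 'Budget of ?500 approved'; // hypothetical one-for-one replacement

/**
 * Hypothetical helper: find $query in the normalised text, then wrap the
 * same character range of the original text in <mark>.
 */
function highlight(string $original, string $normalised, string $query): string
{
    $offset = mb_strpos($normalised, $query, 0, 'UTF-8');
    if (false === $offset) {
        return $original;
    }

    $length = mb_strlen($query, 'UTF-8');

    return mb_substr($original, 0, $offset, 'UTF-8')
        . '<mark>' . mb_substr($original, $offset, $length, 'UTF-8') . '</mark>'
        . mb_substr($original, $offset + $length, null, 'UTF-8');
}

echo highlight($original, $expanded, '500'), PHP_EOL;
// Budget of €50<mark>0 a</mark>pproved

echo highlight($original, $preserved, '500'), PHP_EOL;
// Budget of €<mark>500</mark> approved
```

The first call is off by two characters because `EUR` is two characters longer than `€`; every such expansion before a match would need its own correction.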


Using only the transliterator with Any-Latin; Latin-ASCII seems to preserve the length of the compared texts while still allowing searches for accented and special characters. Some characters have no equivalent in Latin-ASCII; however, these are probably never used in the context of the association.
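
Since length preservation is only observed ("seems") rather than guaranteed, a quick check along these lines may be useful. This is a sketch, not the code from this pull request: `normaliseForSearch()` is a hypothetical helper, the samples are illustrative, and it assumes the `intl` and `mbstring` extensions are loaded.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: transliterate text for comparison and report whether
 * the character count survived, since offsets only line up when it does.
 *
 * @return array{text: string, lengthPreserved: bool}
 */
function normaliseForSearch(string $text): array
{
    // Transliterator is provided by PHP's intl extension.
    $transliterator = Transliterator::create('Any-Latin; Latin-ASCII');
    if (null === $transliterator) {
        throw new RuntimeException('Could not create the transliterator.');
    }

    $converted = $transliterator->transliterate($text);
    if (false === $converted) {
        throw new RuntimeException('Transliteration failed.');
    }

    return [
        'text' => $converted,
        'lengthPreserved' => mb_strlen($text, 'UTF-8') === mb_strlen($converted, 'UTF-8'),
    ];
}

// Illustrative samples, not taken from the actual decision data.
foreach (['café', '€500', '“quoted”'] as $sample) {
    $result = normaliseForSearch($sample);
    printf(
        "%s => %s (length preserved: %s)\n",
        $sample,
        $result['text'],
        $result['lengthPreserved'] ? 'yes' : 'no',
    );
}
```

Running a check like this over representative search data would show whether any character in practice still changes the length.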

This is a bug fix for GH-1764.

@tomudding
Member Author

€ is not the only symbol that causes problems for us; things like special quotes have the same effect.

@tomudding tomudding merged commit c1fc05b into GEWIS:main Nov 19, 2023
4 checks passed