Smart quotes sometimes too aggressive #47

Open
Omikhleia opened this issue Nov 11, 2022 · 3 comments

@Omikhleia
Contributor

Omikhleia commented Nov 11, 2022

Running with the smart extension enabled.

It's clear it shouldn't happen. Aujourd'hui n'est pas demain.

With Pandoc: <p>It’s clear it shouldn’t happen. Aujourd’hui n’est pas demain.</p>
(with right quotes = apostrophes everywhere)

But Lunamark generates <p>It‘s clear it shouldn’t happen. Aujourd‘hui n’est pas demain.</p>
(with left quotes in "It's" and "Aujourd'hui")

@Witiko
Collaborator

Witiko commented Nov 16, 2022

It seems to me that Pandoc only considers quotes that begin at word boundaries:

https://github.com/jgm/pandoc/blob/139f9c064d056989f3a7071341c7b15bfc1850a7/src/Text/Pandoc/Parsing/Smart.hs#L142-L151

By contrast, we are quite happy to parse quotes anywhere:

larsers.DoubleQuoted = parsers.dquote * Ct((parsers.Inline - parsers.dquote)^1)
* parsers.dquote / writer.doublequoted
larsers.SingleQuoted = parsers.squote_start
* Ct((parsers.Inline - parsers.squote_end)^1)
* parsers.squote_end / writer.singlequoted
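
To make the word-boundary idea concrete, here is a minimal Python sketch (not Lunamark's LPeg grammar and not Pandoc's actual implementation). The function name `smarten_apostrophes` is hypothetical, and the rule it encodes is an assumption drawn from the observation above: a straight single quote directly preceded by a letter or digit cannot open a quotation, so it is rendered as an apostrophe.

```python
import re

RSQUO = "\u2019"  # right single quotation mark, also used as the apostrophe


def smarten_apostrophes(text: str) -> str:
    """Turn word-internal straight quotes into apostrophes.

    Assumed heuristic: a quote immediately preceded by a word character
    (letter, digit or underscore) is never an opening quote.
    """
    # (?<=\w) is a lookbehind: it checks the previous character without consuming it.
    return re.sub(r"(?<=\w)'", RSQUO, text)


if __name__ == "__main__":
    print(smarten_apostrophes(
        "It's clear it shouldn't happen. Aujourd'hui n'est pas demain."
    ))
```

On the sentence from the report, every quote follows a letter, so all four become right quotes, which matches the Pandoc output quoted earlier in the thread. Quotes at the start of a word are left untouched by this sketch.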

@Omikhleia
Contributor Author

An additional case to consider:

D&D was a big thing in the '80s.

Lunamark currently gives:

<p>D&amp;D was a big thing in the &#39;80s.</p>

(with a straight apostrophe).

Expected (as with Pandoc):

<p>D&amp;D was a big thing in the ’80s.</p>
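
One way to get this result, sketched in Python under stated assumptions: try to read a quotation only when the quote does not follow a word character, and fall back to an apostrophe when no plausible closing quote exists. The helper names `find_closing_quote` and `smarten_single_quotes` are hypothetical, and the closing-quote test (not after whitespace, not before a letter or digit) is an assumed heuristic, not something taken from Pandoc's or Lunamark's source.

```python
LSQUO, RSQUO = "\u2018", "\u2019"


def find_closing_quote(text: str, start: int):
    """Return the index of a plausible closing quote at or after `start`, else None."""
    for j in range(start, len(text)):
        if text[j] != "'":
            continue
        follows_space = text[j - 1].isspace()
        before_word = j + 1 < len(text) and text[j + 1].isalnum()
        # Require non-empty content and a quote that hugs the preceding text.
        if j > start and not follows_space and not before_word:
            return j
    return None


def smarten_single_quotes(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        if text[i] != "'":
            out.append(text[i])
            i += 1
            continue
        after_word = i > 0 and text[i - 1].isalnum()
        close = None if after_word else find_closing_quote(text, i + 1)
        if close is None:
            # A quotation cannot start here, or it never closes: apostrophe.
            out.append(RSQUO)
            i += 1
        else:
            # Any quote left inside the quotation is treated as an apostrophe.
            inner = text[i + 1:close].replace("'", RSQUO)
            out.append(LSQUO + inner + RSQUO)
            i = close + 1
    return "".join(out)


if __name__ == "__main__":
    print(smarten_single_quotes("D&D was a big thing in the '80s."))
    print(smarten_single_quotes("He said 'hello' to me."))
```

This is only a heuristic: a sentence that mixes an abbreviated decade with a real quotation later on (for example `'80s ... 'fun'`) would still be paired up incorrectly.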

@Witiko
Collaborator

Witiko commented Dec 31, 2022

It seems that Pandoc gives single quotes additional special treatment. I will need to delve into Pandoc's implementation a bit more. As the fenced divs illustrated, it may be more time-efficient to study Pandoc's code than to treat it as a black box and reverse-engineer its behavior from its outputs.
