BibTex escaping #515

Closed
kosarko opened this issue Apr 6, 2016 · 12 comments

@kosarko
Member

kosarko commented Apr 6, 2016

by @dan-zeman:
"ă" (SMALL A WITH BREVE) disappears from bibtex citation

@kosarko kosarko added this to the 2016.4 milestone Apr 7, 2016
@kosarko kosarko self-assigned this Apr 11, 2016
@kosarko
Member Author

kosarko commented Apr 11, 2016

We should come up with a more sustainable solution than listing all the mappings: https://github.com/ufal/lindat-dspace/blob/lindat/dspace-oai/src/main/java/cz/cuni/mff/ufal/utils/BibtexUtil.java#L122
@amirkamran might know of a js library to do this for us
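
One possible direction, sketched here in Python rather than the project's Java and not taken from BibtexUtil.java: derive the escape from the Unicode decomposition of each character instead of listing every letter. The table of combining marks and the function names `escape_char` and `escape_bibtex` are assumptions made for illustration.

```python
import unicodedata

# Combining mark -> LaTeX accent command (illustrative subset, an assumption).
COMBINING_TO_LATEX = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0306": "u",   # breve
    "\u0307": ".",   # dot above
    "\u0308": '"',   # diaeresis
    "\u030a": "r",   # ring above
    "\u030b": "H",   # double acute
    "\u030c": "v",   # caron
    "\u0327": "c",   # cedilla
    "\u0328": "k",   # ogonek
}

# Accents placed below the letter, which keep the dot on "i".
BELOW_MARKS = {"\u0327", "\u0328"}


def escape_char(ch):
    """Escape one character; characters without a known rule pass through."""
    if ord(ch) < 128:
        return ch
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) != 2:
        return ch
    base, mark = decomposed
    if ord(base) > 127 or mark not in COMBINING_TO_LATEX:
        return ch
    command = COMBINING_TO_LATEX[mark]
    # Use the dotless i under accents that sit above the letter.
    if base == "i" and mark not in BELOW_MARKS:
        base = "\\i"
    # Letter-named commands such as \v need a space before the argument.
    separator = " " if command.isalpha() else ""
    return "{\\" + command + separator + base + "}"


def escape_bibtex(text):
    """Escape every accented character in `text` that has a known rule."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(escape_char(ch) for ch in composed)
```

Letters that do not decompose into a base letter plus one mark (for example ł or ß) are left untouched by this sketch, so a small explicit table would still be needed for them.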

@stranak
Member

stranak commented Apr 11, 2016

This might be a good moment to think about a more comprehensive citation solution that can generate not only bibtex, but also other formats and has support of styles.

Have a look at citeproc and where it is used (CrossRef, Open Office and many more places). There are also many implementations, JS, Java, Python, …

@kosarko
Member Author

kosarko commented Apr 13, 2016

I haven't had time to look into what citeproc/CSL actually is. I just found out there was some debate about this in the DSpace JIRA, https://jira.duraspace.org/browse/DS-1359, and there seems to be a patch of some sort in https://jira.duraspace.org/browse/DS-1224. However, there seem to be some issues/concerns regarding licensing.

@kosarko
Member Author

kosarko commented Jun 6, 2016

I'm testing citeproc-java and, at first glance, it does not escape as I'd expect. There might be a different bibtex style that does that and I just haven't found it yet.

edit: sample output

<div class="csl-entry"> @misc{Košarko_2016, title={Test dataset}, publisher={Some rather lenghty name of a publisher that really does not exist just for convenience to see it working}, author={Košarko, Ondřej}, year={2016}}</div>

but I'd expect Ko{\v s}arko, Ond{\v r}ej

Maybe we are overdoing it. Is there really no support for UTF-8 in bibtex? @vidiecan I think you've linked somewhere saying that the escapes are really the right way...

@kosarko
Member Author

kosarko commented Jun 8, 2016

There are some hints in http://tex.stackexchange.com/a/57827 (maybe that's about biblatex and not bibtex). I tried the book entry from there and the misc entry generated above in ufal's sharelatex (which is probably also not "vanilla" bibtex) with:

\documentclass{article}
\usepackage[utf8]{inputenc}
...
\usepackage{natbib}
\usepackage{graphicx}
...
\bibliographystyle{plain}
\bibliography{references}

and the accented chars are displayed normally. The only trouble was with the accent in the generated key above (Košarko_2016)

What do we really want to support in the exports?

@stranak
Member

stranak commented Jun 8, 2016

I am not sure that Bibtex even supports accents in keys. I would just generate different ASCII keys. Accents in the visible text are important, though.

-Pavel
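
A minimal sketch of such ASCII key generation, in Python for illustration only. The function name `ascii_key`, the `surname_year` pattern (copied from the `Košarko_2016` sample above) and the `anon` fallback are assumptions, not the project's code.

```python
import re
import unicodedata


def ascii_key(surname, year):
    """Build an ASCII-only citation key such as Kosarko_2016."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFD", surname)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop whatever is still outside ASCII (letters with no decomposition).
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    # Keep letters and digits only, so the key has no spaces or punctuation.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "", ascii_only)
    # Hypothetical fallback for names that end up empty.
    return f"{cleaned or 'anon'}_{year}"
```

Letters without a Unicode decomposition, such as Ł, are simply dropped here; a real implementation would probably want a transliteration table for them.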


@kosarko
Member Author

kosarko commented Jun 8, 2016

> I am not sure that Bibtex even supports accents in keys. I would just generate different ASCII keys. Accents in the visible text are important, though.

My point was rather: do we really need to "escape" Košarko as Ko{\v s}arko in the authors field?

@stranak
Member

stranak commented Jun 8, 2016

I think we do: very few people use Biblatex, and Bibtex, which everybody uses for latexing, doesn't support UTF-8, AFAIK.

-Pavel

@stranak
Member

stranak commented Jun 8, 2016

Clarification: Some implementations of bibtex now probably do support some accented characters, but unless we verify that most or all support at least latin1 + latin2 accents, we do need to escape.

Š is also in Finnish and Estonian, so could it be working because it is in Latin 1?

-Pavel

@dan-zeman
Member

I have had issues (repeatedly) with accented UTF-8 characters in .bib file not being displayed properly (actually not being displayed at all). Even when I don't have to escape the characters in the main text of the paper, where UTF-8 is properly recognized.

@stranak
Member

stranak commented Jun 8, 2016

The same happened to me. I think I tried to google it and was surprised to find that Bibtex doesn't support UTF-8. That was several years ago, but I think we should still expect it on the users' side. LaTeX is not something that people tend to keep updated or use bleeding-edge features in.

Pavel

This was referenced Jun 14, 2016
@kosarko kosarko added the Has PR label Jun 14, 2016
@kosarko
Member Author

kosarko commented Jun 14, 2016

In #568 I've added mappings from http://lindat.mff.cuni.cz/services/morph/code-latin2-table.html; hopefully all of them.

Let's leave citeproc/csl for some other time #567
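
With an explicit mapping table, a small coverage check can flag characters that have no entry, instead of letting them vanish from the citation as "ă" did. The sketch below is in Python for illustration; `unmapped_characters` and `SAMPLE_MAPPING` are hypothetical names, and the two entries are only an excerpt-style example, not the table from #568.

```python
def unmapped_characters(text, mapping):
    """Return the sorted non-ASCII characters of `text` missing from `mapping`."""
    return sorted({ch for ch in text if ord(ch) > 127 and ch not in mapping})


# Hypothetical two-entry table: s with caron and r with caron.
SAMPLE_MAPPING = {
    "\u0161": "{\\v s}",
    "\u0159": "{\\v r}",
}
```

Running such a check over the metadata of existing records would show which characters the table still misses.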
