Convert to citations/bibliography rather than links #24
Comments
It'd definitely be desirable, but it wouldn't be easy. Currently the thing that makes the tool uncomplicated is that it doesn't need to talk to Zotero at all during the scan. All the citation data is added when setting a citation style in LibreOffice. The scan just converts the markers to LO Reference Marks with Zotero format and Zotero item.uris that allow for updating. The relevant function is here: https://github.com/Juris-M/zotero-odf-scan-plugin/blob/master/chrome/content/rtfScan.js#L271 Hope you like regular expressions ;) |
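To make the "no need to talk to Zotero during the scan" point concrete, here is a minimal Python sketch of the marker-finding step only. It is not the code in rtfScan.js: the five-field marker layout (`{ prefix | cite | locator | suffix | uri }`), the `zu:0:ABCD1234` identifier, and the names `MARKER_RE`, `FIELDS`, and `find_markers` are all assumptions made for illustration.

```python
import re

# Assumed marker layout (not taken from rtfScan.js):
#   { prefix | cite | locator | suffix | item-uri }
MARKER_RE = re.compile(
    r"\{([^{}|]*)\|([^{}|]*)\|([^{}|]*)\|([^{}|]*)\|([^{}|]*)\}"
)

FIELDS = ("prefix", "cite", "locator", "suffix", "uri")


def find_markers(text):
    """Return one dict per marker found in text, with trimmed fields."""
    return [
        dict(zip(FIELDS, (part.strip() for part in match.groups())))
        for match in MARKER_RE.finditer(text)
    ]


# Hypothetical sample text; the item identifier is made up.
sample = "As shown { see | Smith, 2012 | p. 45 | | zu:0:ABCD1234 }, the effect holds."
markers = find_markers(sample)
```

Everything a later step needs (the item URI plus prefix, locator, and suffix) comes out of the marker text itself, which is why a scan of this kind can run without a Zotero lookup.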
The talking to Zotero bit isn't really too hard. Is https://github.com/Juris-M/zotero-odf-scan-plugin/blob/master/chrome/content/rtfScan.js#L512 the central function that orchestrates the finding and replacing, and https://github.com/Juris-M/zotero-odf-scan-plugin/blob/master/chrome/content/rtfScan.js#L594 the part that does the actual replacements? If I may ask, why use regexes when FF has XML/XPath functionality built in? |
I don't think there's a strong reason to use regex over XML except that Frank likes regex (the original tool this is based on was in Python, I think, but it's not like that would have made using XML/XPath impossible). It might be that regex actually ends up being more stable given different interpretations of the ODF XML model, but it's also possible that the reverse is true. Certainly worth testing out. |
You're not the first to ask that question. 😃 The code was originally rejected for inclusion in Zotero for exactly that reason. (Edit: Dan's third response in this thread on zotero-dev) The problem is that the target string may be cross-nested with XML tags that capture a larger run of document text. Identifying the string and isolating it for replacement using XML methods would be very hard to do. It would also be slower to run (because you would need to iterate to the top of the XML hierarchy to determine that a given match attempt had failed). I offered that explanation at the time, and it didn't find favor, but that's the reason behind using regex there. |
Cross-nested? I thought XML was strictly hierarchical? |
(that link appears to want to search your mailbox -- I don't think I have access to that 😄) |
XML is, but the "scannable cites" are not an XML unit, so you get things like this:
Maybe there is an easy way to find the strings and adjust the tag structure to permit insertion of a well-structured XML element at their location in DOM context, but it looked pretty daunting to me, and I gave up. Didn't notice that the Google Groups links worked that way! Here's the relevant bit (from April 16, 2013): Frank
Dan
|
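The cross-nesting problem described above can be reproduced with a small Python sketch. The fragment below is a simplified, hypothetical stand-in for ODF markup (it is not the example from the thread), and the marker text and the names `fragment`, `MARKER_RE`, and `text_nodes` are made up for illustration. A `<span>` opens before the marker and closes in the middle of it, so no single text node holds the whole marker, while the flattened text does.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <span> starts before the marker and ends inside it.
fragment = (
    "<p>See <span>also { | Smith, 2012</span>"
    " | p. 45 | | zu:0:ABCD1234 } for details.</p>"
)

MARKER_RE = re.compile(r"\{[^{}]*\}")

root = ET.fromstring(fragment)

# Collect the text nodes the XML tree exposes (element text and tails).
text_nodes = []
for element in root.iter():
    if element.text:
        text_nodes.append(element.text)
    if element.tail:
        text_nodes.append(element.tail)

# Searching node by node never sees a complete marker.
hits_per_node = [bool(MARKER_RE.search(node)) for node in text_nodes]

# Searching the flattened text finds it, but loses the tag positions.
flattened = "".join(root.itertext())
flat_match = MARKER_RE.search(flattened)
```

The flattened search shows where the marker is, but replacing it with a single well-formed element would still require deciding what to do with the `<span>` that straddles its start.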
Lord Cthulhu almighty, there's kids in the room, you can't just show things like this out in the open... alright, I see your point. The solution would be ugly in any case given this, and the regexen are arguably less ugly than the XML parsing would have been. Wow. |
It does look like a plain string in the word processor, though, so by adding LibreOffice as a dependency ... |
I'm afraid I do not follow all the technical discussion here, but is this issue linked to the possibility of using the ODF scan as something like a BibTeX-style referencing system? I'm looking at the idea of using LaTeX for writing, but all my research references are in the ODF scan form. I was wondering if there is an easy way of converting them into something that would be recognised in LaTeX. |
I'm interested in adding the possibility to have the ODF-scanner use the Zotero-embedded citeproc to create a finalized document; not to remove the existing functionality to create a Zotero-compatible document but so that I can use Word-online + ODF scan without requiring the use of Word to finalize the document. Would this be:
and if so, can you point me to the part of the code that does the current replacement?
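As a rough picture of what this request amounts to, here is a minimal Python sketch that replaces each marker with a rendered citation and collects a bibliography. It does not use Zotero or citeproc: the `LIBRARY` dict is a hypothetical stand-in for whatever a citation processor would return, and the marker layout, item identifier, sample entry, and the `finalize` function are assumptions made for illustration.

```python
import re

# Assumed marker layout: { prefix | cite | locator | suffix | item-uri }
MARKER_RE = re.compile(
    r"\{([^{}|]*)\|([^{}|]*)\|([^{}|]*)\|([^{}|]*)\|([^{}|]*)\}"
)

# Hypothetical stand-in for citation-processor output, keyed by item URI.
LIBRARY = {
    "zu:0:ABCD1234": {
        "inline": "Smith 2012",
        "entry": "Smith, J. (2012). An Example Title.",
    },
}


def finalize(text, library):
    """Replace known markers with rendered citations; return (text, bibliography)."""
    cited = []  # item URIs in order of first citation

    def render(match):
        prefix, _cite, locator, suffix, uri = (p.strip() for p in match.groups())
        item = library.get(uri)
        if item is None:
            return match.group(0)  # leave unknown markers untouched
        if uri not in cited:
            cited.append(uri)
        body = " ".join(p for p in (prefix, item["inline"]) if p)
        for extra in (locator, suffix):
            if extra:
                body += ", " + extra
        return "(" + body + ")"

    new_text = MARKER_RE.sub(render, text)
    bibliography = [library[uri]["entry"] for uri in cited]
    return new_text, bibliography


source = "Earlier work { see | Smith, 2012 | p. 45 | | zu:0:ABCD1234 } agrees."
final_text, bibliography = finalize(source, LIBRARY)
```

This works on plain text only; inside a real ODF file the cross-nested tags discussed earlier in the thread would still have to be handled.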