Fall 2017 publication thread #11
Comments
@amir-zeldes Apa Johannes document FA 29-30 is done EXCEPT for some additional information (folio #s) needed for the idno metadatum. You should be able to test the TEI converter on it, though. Please see my thread here gucorpling/gitdox#54 about the converter, first. Thanks!! |
@amir-zeldes AP and Apa Johannes are DONE except for two AP sayings where we are still waiting on answers to queries; those sayings are from outside contributors and are marked "review." There are a TON of AP. I edited a few that were already published but needed edits. I updated versioning and committed. However, this means that we have some AP in SGML format, some in Excel, and some in both. Amir, let me know if you have questions about these. I think the rule of thumb is: if there is an SGML file in the gitdox folder, use that. Don't use an Excel file unless there is no SGML in gitdox. Alternatively, I can go through and systematically commit every AP to GitHub in the gitdox folder. Let me know! |
Oh, no worries, I'm not going to export from Excel or SGML files - it'll all happen directly from GitDox based on document status (published/to_publish). If you could quickly verify that the statuses are correct, I can attempt the first conversion. I'll have a look at Johannes first maybe. |
The statuses for the AP and Johannes are correct. I am waiting for responses to those two AP tho.
|
Should I convert AP without those two then or wait? |
There is only one left. Greg emailed last night to say he will add the translation today. I will look for it when I get on the train. —c
|
Ok, I'll hold off on AP. I'm getting on a plane to Pittsburgh but will check mail again in the evening.
|
Dirt is ready!! OMG this is a lot of material. Also @amir-zeldes I saw your email about TEI but could not get to it with everything else. Will get to it in the morning. |
OK, shenoute.dirt is now in ANNIS as well, accessible with your logins. Let me know if everything looks OK (it only had the same issues of pb_xml:id and the TEI column, which I removed) |
re dirt: @amir-zeldes a couple of things:
Text & annotation
Otherwise I think Dirt looks good. |
Re johannes.canons: Document metadata
Text and annotation
Thanks! Let me know if we should annotate anything differently based on this conversation. |
OK, I'm on the doc name and license issues. I figured out the naming problem, which is a bug in the TreeTagger module in SNP. It looks fixable, but might have repercussions I don't understand. I opened an issue here: Basically, in stripping off the extension of the filename, it just removes everything after the first dot. The quickest fix is to not have dots in filenames, but ultimately (after the release) I'd like to see this fixed. I think not putting |
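To illustrate the naming bug described above, here is a minimal Python sketch (not the actual SNP/TreeTagger module code, which is not shown in this thread). It contrasts cutting a filename at the first dot with dropping only the final extension. The function names and the example filename are hypothetical.

```python
import os


def strip_extension_first_dot(filename: str) -> str:
    """Mimics the behaviour described in the thread: keep only what precedes the FIRST dot."""
    return filename.split(".", 1)[0]


def strip_extension_last_dot(filename: str) -> str:
    """Drops only the final extension, so dots inside the document name survive."""
    return os.path.splitext(filename)[0]


# Hypothetical document filename with dots in the name itself.
name = "shenoute.dirt.example.tt"

print(strip_extension_first_dot(name))  # shenoute
print(strip_extension_last_dot(name))   # shenoute.dirt.example
```

With first-dot stripping, every document whose name starts with the same prefix would collapse to the same name, which is consistent with the problem reported here.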
The hyperlink issue also seems to be an internal SNP thing due to our new workflow. Again, no quick solution, but I can simply 'un-escape' the &gt; etc. manually in the ANNIS files for this release. But in the future, it's a problem we'll need to solve. Let me know about the document names, and if we're OK with GL71-74 etc., then I can reimport everything with those 2 problems fixed. |
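As a rough sketch of the manual 'un-escaping' mentioned above, the following Python snippet uses the standard library's `html.unescape`. It assumes the escaped characters are standard HTML/XML entities such as `&gt;` and `&lt;`; the helper name and the sample string are hypothetical, and the actual layout of the ANNIS files is not shown in this thread.

```python
import html


def unescape_markup(text: str) -> str:
    """Convert entities such as &lt; &gt; &quot; &amp; back to literal characters."""
    return html.unescape(text)


# Hypothetical escaped hyperlink, as it might appear in a metadata value.
escaped = "&lt;a href=&quot;https://example.org&quot;&gt;link&lt;/a&gt;"

print(unescape_markup(escaped))  # <a href="https://example.org">link</a>
```

Applying this only to the fields known to contain hyperlinks would avoid touching values where escaped characters are intentional.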
Thank you for investigating. I would really rather not change document
names because we have so many corpora with dots in the names, and some
of these docs go free floating like in the TEI archive (which now
someone is using). Is there any other way around this problem?
|
We could wait for the SNP problem to be solved, but that would delay the release... or we could manually change them back in all output documents, which is a bit irritating. But I'm not sure we should want to: our previous corpora don't do this (so documents in Eagerness are called |
Johannes: |
Victor should be done. I am not sure the layer names are correct, but they are understandable. Also Amir, I know you're busy, but when you get a chance, let us know what to do to change the spans so the normalized and analytical visualizations are more sensible. |
Sure, norm responds to |
Hi. I have added translation spans so the analytic views should look ok now. |
Sorry: Victor, Dirt, Johannes, Ps-Theophilus. Since you haven't published a test run of Ps-Theophilus, maybe you could try the visualization with that one first? |
OK, theo, dirt, johannes and victor are now online and ready for inspection in ANNIS. TEI also converts no problem except for a modeling issue Carrie and I are discussing, but basically they all validate, suggesting there are no issues with the underlying annotations at this point. AP is sadly riddled with little things, so I'll plug away at that next; the other ones would be ready for a complete release on my end. |
Can I help with AP?
|
Thanks, no, I need to fix errors and re-run SNP each time, so you can't (if I find something systematic I'll let you know). What I need from you and @eplatte is just a green light for the other corpora that are in ANNIS right now. If they check out, we just need AP to release! |
I'm at Reed today, but I will check during my lunch break.
Beth
|
Thanks! No need to overdo it though, it can all wait! |
PUBLISHED! yay |
Please use this thread to track our Fall 2017 corpora publication process.
Data freeze: November 9, 2017
Corpora to publish + reviewers:
Reviewers, please be sure to: