Are root-relative paths valid? #1681
This is also linked to #1374: until the URL of the package document is known, there is no way to know how a path-absolute or path-relative URL will be parsed.
The issue was discussed in a meeting on 2021-06-10 List of resolutions:
2. URLs and the package document
See github issue #1681, #1374, #1688, #1686.
Dave Cramer: this is a bunch of issues that revolve around how you interpret URLs in the package document, especially if they're absolute URLs
Matt Garrish: in epubcheck there was a root-relative URL that caused an error, and that spawned all of this
Dave Cramer: in issue 1688 Romain suggests that manifest items should have one of the special schemes (except
Matt Garrish: there are edge cases where file scheme items make sense, but not generally for epub
Dave Cramer: it goes against epub as a portable format, and the file scheme ties the epub to a specific file system
Matt Garrish: never heard of one
Dave Cramer: okay, so what if we just say no file URLs in epub?
Matt Garrish: most RS probably won't do anything with file URL
Wendy Reid: depending on platform you might not even be able to access parts of the file system (e.g. iOS apps)
Dave Cramer: can we start by resolving on this point from 1688?
Dan Lazin: is there a use case for some of these other schemes? Why would you have an FTP in your epub?
Matt Garrish: if we go too far, do we prevent future stuff? will we have to come back and re-add this in the future?
Ben Schroeter: is the idea that if we disallow file scheme, then we also disallow "slash URLs"?
Matt Garrish: not sure those are the same
Dave Cramer: what would be the consequences of forbidding root-relative paths?
Matt Garrish: not sure there are any, because epubcheck had forbidden these until a recent update
Dave Cramer: and this is just for href on manifest?
Matt Garrish: no, this would be anywhere, e.g. in content docs too
Dan Lazin: do we support the base tag?
Dave Cramer: we've been phasing out
Dan Lazin: the base tag allows you to define what the relative path is relative to
Matt Garrish: base would force you to have all external resources, right? It exists, but I don't imagine anyone really going there
Marisa DeMeglio: there was a resolution a few weeks ago about dumping
Dave Cramer: and that's separate from the HTML base element
Dan Lazin: if you set base to some website, and then use root-relative URLs, your URLs would appear to be relative, when they are actually absolute
Dave Cramer: but can we really say anything about base because it's part of HTML?
Matt Garrish: so you must not use root-relative URLs unless you use a base?
Dan Lazin: what was the harm in not banning root-relative?
Matt Garrish: because the RS might treat zip root as the root, but they could also treat location of package doc as root
Dan Lazin: maybe permit it, but use SHOULD NOT?
Dave Cramer: yes, e.g. with books that only work with iBooks because of scripting support
Matt Garrish: maybe just a note that root-relative could cause issues if authors use it?
Dave Cramer: so does that mean that there are epubs that could be built to work in some RS, but expose an interop issue if opened in another RS?
Matt Garrish: right
Dave Cramer: not sure what the right course of action is, but maybe we can continue this another time with Romain present
Wendy Reid: we need RS people here on next call that know exactly what RSes are doing right now
Marisa DeMeglio: one of the github threads has a sample, but I wasn't able to download it
Matt Garrish: also, there's not much hand authoring, and most tools will put all the content into one folder
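
As an illustration of Dan Lazin's point about the HTML base element in the 2021-06-10 discussion, here is a minimal Python sketch using the standard library's `urllib.parse.urljoin`, which follows the generic RFC 3986 resolution rules. The base URL `https://publisher.example/books/title/` and the `images/cover.png` paths are hypothetical values chosen for the example, not anything taken from the thread or the spec.

```python
from urllib.parse import urljoin

# Hypothetical value of a <base href="..."> element in a content document.
base_href = "https://publisher.example/books/title/"

# A root-relative (path-absolute) reference discards the whole base path
# and keeps only the scheme and host.
root_relative = urljoin(base_href, "/images/cover.png")

# A path-relative reference is resolved inside the base directory.
path_relative = urljoin(base_href, "images/cover.png")

print(root_relative)  # https://publisher.example/images/cover.png
print(path_relative)  # https://publisher.example/books/title/images/cover.png
```

The two references look almost identical in markup, yet only the second stays under the directory named by the base.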
The issue was discussed in a meeting on 2021-06-18
2. What is the relationship between URLs and the package doc (what is home?)
See github issue #1681, #1374, #1687, #1686.
Wendy Reid: we started this discussion last week. Core question is: Where is home (given we allow both relative and absolute URLs) in the epub context
Romain Deltour: we have to keep in mind: 1) what things have to be put in epub core spec, and 2) what are the rules for epub RS spec
Wendy Reid: okay, so what is the IRI of the package document then?
Ivan Herman: we can't really answer what the IRI of the package is, and i'm not sure we should try
Matt Garrish: we have 2 issues, 1) are these resources within the container and how do we determine that? 2) what happens when you unpack, and where do these resources go?
Brady Duga: so absolute URIs are not allowed, and what relative IRI is interpreted by the language in question (e.g. HTML, or CSS, depending on what type of document it is)
Matt Garrish: i think the issue is root-relative is still a relative path, so do we have to say "all relative is allowed, except root relative"
Romain Deltour: even with regular relative URLs, the spec is silent on what happens if the relative URL tries to go below the container root?
Ivan Herman: i was surprised to find that some RS don't automatically unpack the whole zip
Matt Garrish: we have requirement in OCF that all relative resources must resolve to something in container
Gregorio Pellegrino: i know that Colibrio streams files out of zip without unzipping
Wendy Reid: yes, there are more examples of RS doing that beyond that
Ivan Herman: but conceptually an RS unpacks the whole zip file onto a domain (as if it were a file system). If we do that then all these concepts become clear
Hadrien Gardeur: streaming from zip is what Readium does by default
Romain Deltour: i'm surprised that resources that are not in the same directory tree as the OPF would not be accessible in the epub
Romain Deltour: this defines unambiguously how relative URLs are to be resolved
Wendy Reid: going back to romain's point about testing, there are a variety of ways that RSes handle these URLs
Ivan Herman: would some sort of conceptual model clash with how things are implemented?
Hadrien Gardeur: we treat OPF as base, and that seems to work in most cases. Seems to make more sense to us than treating zip as base
Matt Garrish: this originally came up in multiple renditions when we had issues referencing across sibling directories
Romain Deltour: drawback of conceptual solution is that sometimes adding this layer of abstraction makes spec harder to use
Wendy Reid: is the best way forward at this point for us to do some sort of testing? (e.g. OPF as base, zip as base, examples of files living outside when OPF is base)
Ivan Herman: i think we should also test environment where multiple renditions is implemented
Wendy Reid: do we know of a functioning implementation of multiple renditions?
Hadrien Gardeur: barnes and noble were using multiple renditions for newspapers and magazines
Wendy Reid: okay, so maybe we test on Nook app
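
The 2021-06-18 discussion mentions an OCF requirement that relative references resolve to something in the container, and notes that the spec is silent on what happens when a relative URL reaches past the container root. The following is a minimal Python sketch of one way a checker could detect that case. The function name `resolve_in_container` and the sample paths are hypothetical, and the choice to return `None` for an escaping reference is an assumption of this sketch, not behaviour defined by the spec or by any reading system named in the thread.

```python
import posixpath


def resolve_in_container(base_path, href):
    """Resolve a relative href against the container path of the referencing file.

    Returns the normalized container path, or None when the reference
    would land outside the container root (an assumption of this sketch).
    """
    # Join against the directory of the referencing document, then
    # collapse "." and ".." segments.
    joined = posixpath.join(posixpath.dirname(base_path), href)
    normalized = posixpath.normpath(joined)
    # Leftover leading ".." means the path climbed above the root;
    # a leading "/" means the href was root-relative.
    if normalized == ".." or normalized.startswith("../") or normalized.startswith("/"):
        return None
    return normalized


# A sibling directory of the package document is still inside the container.
print(resolve_in_container("EPUB/package.opf", "../shared/style.css"))
# Two levels up from EPUB/ climbs above the container root.
print(resolve_in_container("EPUB/package.opf", "../../outside.css"))
```

Root-relative references are rejected here only because the sketch has no agreed meaning for "/"; that is the question this issue leaves open.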
The issue was discussed in a meeting on 2021-06-24 List of resolutions:
1. Refine the requirements on how RS must process the container structure
See github issue #1687, #1681, #1686.
Wendy Reid: per discussion last week, mgarrish made us a test epub for this
Brady Duga: and most of this was done via sideloading, and publisher pipelines are often different
Matt Garrish: we still have the problem that the spec doesn't say anything about this. There is no authoring requirement for where to put your content (other than below the root). And for RS there is no requirement for how to unpack, etc.
Wendy Reid: it probably doesn't hurt to refine language, but at this point creating a firm requirement would impact some existing RS implementations
Matt Garrish: easiest solution is probably an authoring requirement. Esp. because most authors have probably never tried to do anything like the test epub
Brady Duga: this has been an issue forever, and the only time we noticed was with multiple renditions, which hasn't been implemented really. So is a 3rd solution to just leave it?
Matt Garrish: this whole thing really only came up because of that root-relative thing, so on that issue maybe we just say not to use those
Wendy Reid: right, so we advise not to use root-relative, and we can't say specifically how RS will behave if you do it
Matt Garrish: can we resolve just to use something similar to the note we were going to have for multiple renditions?
Wendy Reid: the other two related issues. First: are root-relative paths valid? Is this now moot?
Matt Garrish: i think we are on safer ground to just disallow those, especially because in the past epubcheck has had those come up as an error
Wendy Reid: the second one: what should RS do when manifest item has duplicate entries?
Matt Garrish: i think the issue with this was that if there were multiple copies of the same item in manifest, then RS might not know which manifest item to go to when one copy is referenced
The issue was discussed in a meeting on 2021-07-02
2. Are root-relative paths valid?
See github issue #1681, #1374. See github pull request #1725.
Dave Cramer: What more needs to happen or can happen in the spec for root-relative paths?
Ivan Herman: one problem we need to address is that we have a problem with iBooks and others that rely on Adobe ADE, namely that they rely on a specific way of organizing the files, which is not in the standard.
Romain Deltour: the test was done with valid ePub with shared resources - there is still the issue of root-relative URL paths and paths that would go outside the container. I think we need the spec to address that.
John Foliot: Is an unintended consequence that a publisher would have to create two versions, one for iBooks and another for other reading systems?
Dave Cramer: I don't see huge problems around interoperability because EPUBs are consistent with folder structure, generally.
Ivan Herman: Whatever works for iBooks works for others - but there are perfectly valid ePubs that iBooks doesn't take.
Romain Deltour: these are edge cases, we don't see this problem often if ever.
Ivan Herman: it would be helpful to have a clearly-worded proposal for reading systems. Hoping for Romain's help with this.
Dave Cramer: everyone seems to agree that having
Hadrien Gardeur: from a reading system perspective, they need to resolve URIs, and expose the HTML resource (or any resource) to web view.
Ivan Herman: What precisely should the recommendation in the reading system spec be to cover all implementations?
Hadrien Gardeur: we don't know how each RS works behind the scenes, we can only speculate.
Ivan Herman: If we put something in the spec, it's up to RS how to implement
Hadrien Gardeur: On the web, we don't think about files and root containers. For reading systems, we are deciding how an EPUB behaves. So wary of this conceptual approach.
Dave Cramer: we are really talking about edge cases here.
Hoping that we can build some tests based on the write-up and what we are trying to achieve.
Hadrien Gardeur: difficult to test everywhere - gets tricky when you have to consider different CSS, etc
Dave Cramer: let's get some proposals down with Romain's help, and get Matt to take a look at them, and proceed from there.
Ivan Herman: Must have a clear statement somewhere whether we intend to restrict EPUB content and define organization of EPUB package.
The issue was discussed in a meeting on 2021-10-29
2.2. Are root-relative paths valid? (issue epub-specs#1681)
See github issue epub-specs#1681.
Romain Deltour: the issue is on how we resolve relative URLs in EPUB.
Ivan Herman: are you suggesting to use OCF section on URLs for the other documents?
Romain Deltour: not really, I think there are other issues on URLs.
This was raised in the epubcheck tracker: w3c/epubcheck#1252 (comment)
We don't say anything about resolving a path that starts with a slash.
Is the root the root of the container or is it the location of the package document? If it's the former, the paths will be problematic for reading systems that serve unzipped content using the location of the package document as the root.
It seems like we should formally disallow root-relative paths unless we want to spec out the behaviour and are sure that all reading systems already do the same thing.
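
To make the two candidate roots concrete, here is a minimal Python sketch that maps the same root-relative path to a location inside the container under each interpretation. The helper `root_relative_to_container_path`, the package document location `EPUB/package.opf`, and the href `/images/cover.png` are all hypothetical and chosen only for illustration; no reading system is claimed to work exactly this way.

```python
import posixpath


def root_relative_to_container_path(href, root_dir):
    """Map a root-relative href to a container path.

    root_dir is the directory inside the container that a reading
    system treats as "/" ("" stands for the container root itself).
    """
    if not href.startswith("/"):
        raise ValueError("expected a root-relative path starting with '/'")
    return posixpath.normpath(posixpath.join(root_dir, href.lstrip("/")))


href = "/images/cover.png"
package_document = "EPUB/package.opf"  # hypothetical location in the container

# Interpretation 1: "/" is the root of the container (the zip root).
as_container_root = root_relative_to_container_path(href, "")

# Interpretation 2: "/" is the directory that holds the package document.
as_package_dir = root_relative_to_container_path(
    href, posixpath.dirname(package_document)
)

print(as_container_root)  # images/cover.png
print(as_package_dir)     # EPUB/images/cover.png
```

The same href names two different files, so a publication authored against one interpretation can break under the other.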