Cache Partitioning, HTTP Cache Management, and Conditional DNR Rule Application #708

Open
jaissam10 opened this issue Oct 9, 2024 · 7 comments
Labels
discussion (Needs further discussion); follow-up: chrome (Needs a response from a Chrome representative); needs-triage: chrome (Chrome needs to assess this issue for the first time); needs-triage: safari (Safari needs to assess this issue for the first time)

Comments

@jaissam10

jaissam10 commented Oct 9, 2024

Hi Team,

My use case is depicted below:

We have an extension that, when a PDF URL is loaded, inserts an iframe containing our web-accessible resource (an extension HTML page). This page then creates another iframe that loads our React application hosted on a domain, such as https://abc.com/index.html.

When a user opens a PDF URL, like https://pdfobject.com/pdf/sample.pdf, two iframes are embedded in the page—the first being our extension page and the second being our React application. These two iframes communicate with each other via postMessage.
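A minimal sketch of how the two frames might validate the messages they exchange, assuming the hosted app lives at https://abc.com. The `ViewerMessage` shape, its `type` values, and the `isTrustedMessage` helper are hypothetical names used only for illustration; they are not taken from the actual extension.

```typescript
// Hypothetical message envelope exchanged between the extension page
// and the hosted app frame. Names are illustrative only.
interface ViewerMessage {
  type: "pdf-buffer" | "ready";
  payload?: ArrayBuffer;
}

// Origin of the hosted React app (placeholder domain from this issue).
const HOSTED_APP_ORIGIN = "https://abc.com";

// Accept a message only if it comes from the expected origin
// and carries one of the known message types.
function isTrustedMessage(origin: string, data: unknown): data is ViewerMessage {
  if (origin !== HOSTED_APP_ORIGIN) return false;
  if (typeof data !== "object" || data === null) return false;
  const type = (data as { type?: unknown }).type;
  return type === "pdf-buffer" || type === "ready";
}

// In the extension page this would be wired up roughly as:
//   window.addEventListener("message", (event) => {
//     if (!isTrustedMessage(event.origin, event.data)) return;
//     // handle event.data
//   });
//   appFrame.contentWindow.postMessage(message, HOSTED_APP_ORIGIN);
```

Passing an explicit target origin to `postMessage` and checking `event.origin` on receipt keeps unrelated frames in the same tab from injecting or reading messages.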

To explain further: if I open a PDF such as https://pdfobject.com/pdf/sample.pdf, the top-level site (the PDF URL) is dynamic. We then insert our iframe (a web_accessible_resource), i.e. chrome-extension://id/viewer.html. Inside it, we load a nested iframe that points to our CDN-hosted app. So there are two nested iframes: one for chrome-extension:///index.html and another for https://abc.com/index.html (our hosted CDN).
Reference image: [image not included]

I am facing cookies and storage partitioning issues.

For cookie partitioning/blocking, I am addressing the issue using the chrome.cookies API along with DeclarativeNetRequest.
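The cookie-forwarding approach described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual implementation: the rule ID, the `||abc.com/api/` filter, and the helper names `toCookieHeader` and `buildCookieRule` are hypothetical, and the rule shape is typed locally rather than imported from the Chrome type definitions.

```typescript
// Minimal shape of the cookie objects returned by chrome.cookies.getAll().
interface CookieLike {
  name: string;
  value: string;
}

// Subset of a declarativeNetRequest rule, typed locally so the sketch is self-contained.
interface HeaderRule {
  id: number;
  priority: number;
  action: {
    type: "modifyHeaders";
    requestHeaders: { header: string; operation: "set"; value: string }[];
  };
  condition: { urlFilter: string; resourceTypes: string[] };
}

// Serialize cookies into a Cookie request-header value ("a=1; b=2").
function toCookieHeader(cookies: CookieLike[]): string {
  return cookies.map((c) => `${c.name}=${c.value}`).join("; ");
}

// Build a rule that sets the Cookie header on XHR/fetch requests to the hosted API.
function buildCookieRule(id: number, cookies: CookieLike[]): HeaderRule {
  return {
    id,
    priority: 1,
    action: {
      type: "modifyHeaders",
      requestHeaders: [
        { header: "cookie", operation: "set", value: toCookieHeader(cookies) },
      ],
    },
    // "||" matches abc.com and its subdomains in urlFilter syntax.
    condition: { urlFilter: "||abc.com/api/", resourceTypes: ["xmlhttprequest"] },
  };
}

// In the extension's service worker (needs the "cookies" and
// "declarativeNetRequest" permissions plus host access to abc.com):
//   const cookies = await chrome.cookies.getAll({ domain: "abc.com" });
//   await chrome.declarativeNetRequest.updateSessionRules({
//     removeRuleIds: [1],
//     addRules: [buildCookieRule(1, cookies)],
//   });
```

As written, such a rule applies to every matching request regardless of which client initiated it, which is the limitation raised in Q3 below.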

Q1. Due to storage partitioning, a service worker gets registered every time a new PDF is opened (assuming the domain is considered new), and its cache storage is also partitioned, preventing reuse.
How can this be resolved or made unpartitioned?
Is there an API, similar to chrome.cookies, that allows setting cache storage on the browser domain (either partitioned or unpartitioned)?

Q2. The HTTP cache is also partitioned, causing a network request every time a new PDF is opened.
How can we address this? Is there an existing API to manage the HTTP cache, or can you suggest a way to make it unpartitioned?

Q3. I am applying a DeclarativeNetRequest (DNR) rule to modify the request header to send cookies from the extension. However, I want to apply the DNR rule specifically for my website (for my client), as we have other clients making similar requests. I'm trying to find a way to limit the rule's application to my client only.
For example, I have the https://abc.com/api/getUser API, and both my extension and another website are initiating requests to this endpoint. I want to add a condition based on the request header to differentiate between the clients.

Is there an existing solution for this? If not, would it be possible to support such a check in a dynamic rule?

github-actions bot added the needs-triage: chrome, needs-triage: firefox, and needs-triage: safari labels on Oct 9, 2024
jaissam10 changed the title from "Request for service worker unpartitioned" to "Cache Partitioning, HTTP Cache Management, and Conditional DNR Rule Application" on Oct 9, 2024
@Rob--W
Member

Rob--W commented Nov 7, 2024

Your use case would better be served by a dedicated MIME handler API. Chrome has an internal mimeHandlerPrivate extension API that renders an extension document that receives a stream with the original response body. As someone who has been maintaining the PDF.js Chrome extension for a decade, such an API would be much more preferable than a bunch of patches over a fundamentally broken design. For instance, with the current extension APIs, it is currently not possible to render PDF responses from POST requests, or responses that expire after one request.

Before mimeHandlerPrivate, there was an inferior version of it called streamsPrivate (mozilla/pdf.js#4172). If you are interested in historical details, including the non-viability of getting access to a private extension API, see https://issues.chromium.org/issues/40344209

The feature requests requesting MIME handling extension APIs are here:

Now, since you did not ask "How can we best support PDF viewer use cases", but some specific questions towards work-arounds, I'll also answer them below.

Q1. Due to storage partitioning, a service worker gets registered every time a new PDF is opened (assuming the domain is considered new), and its cache storage is also partitioned, preventing reuse. How can this be resolved or made unpartitioned? Is there an API, similar to chrome.cookies, that allows setting cache storage on the browser domain (either partitioned or unpartitioned)?

In your situation, the ultimate iframe of interest is web content. Although embedded in an extension document, the behavior is ultimately up to the web platform, not extensions. One potential extension-specific solution could be to treat everything within an extension document as unpartitioned, but that comes with its own security and privacy concerns: External websites can trick the extension into triggering requests outside of its partition.

Q2. The HTTP cache is also partitioned, causing a network request every time a new PDF is opened. How can we address this? Is there an existing API to manage the HTTP cache, or can you suggest a way to make it unpartitioned?

There is no tracking issue in the WECG yet. But it has been on my wishlist for a long while to have an extension API that enables extensions to choose the specific partition to use when sending network requests ( https://bugzilla.mozilla.org/show_bug.cgi?id=1670278 ). I believe that such an API would be necessary to solve the general issue of the default partition being "wrong".

Q3. I am applying a DeclarativeNetRequest (DNR) rule to modify the request header to send cookies from the extension. However, I want to apply the DNR rule specifically for my website (for my client), as we have other clients making similar requests. I'm trying to find a way to limit the rule's application to my client only. For example, I have the https://abc.com/api/getUser API, and both my extension and another website are initiating requests to this endpoint. I want to add a condition based on the request header to differentiate between the clients.

If no redirects are involved, you could try to match by a random reference fragment, since that is not included in an initial request. When redirects are involved, be careful that the server can specify a different reference fragment that clears the original one, so that would not be a reliable way to modify requests across redirects. A similar need (matching specific requests) came up before in #694, and you can see its discussion.
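A rough sketch of the fragment-based matching idea, assuming the browser's DNR matcher sees the fragment of the request URL (this should be verified per browser). The helper names `tagUrl` and `fragmentCondition` and the token value are hypothetical; in a real extension the token would be generated randomly (for example with `crypto.randomUUID()`) and shared only with the client that should match.

```typescript
// Subset of a declarativeNetRequest rule condition, typed locally.
interface FragmentCondition {
  urlFilter: string;
  resourceTypes: string[];
}

// Replace any existing fragment on the URL with the marker token.
// The fragment is not sent to the server as part of the request.
function tagUrl(url: string, token: string): string {
  return `${url.split("#")[0]}#${token}`;
}

// Condition that matches only the exact API URL carrying the marker fragment.
// "|" anchors the pattern to the start and the end of the URL in urlFilter syntax.
function fragmentCondition(apiUrl: string, token: string): FragmentCondition {
  return {
    urlFilter: `|${tagUrl(apiUrl, token)}|`,
    resourceTypes: ["xmlhttprequest"],
  };
}

// The extension-controlled client would then request the tagged URL:
//   fetch(tagUrl("https://abc.com/api/getUser", token), { credentials: "include" });
// while other clients calling the same endpoint without the fragment
// would not match the rule.
```

This only holds for the initial request. As noted above, a redirect response can specify its own fragment, so the marker cannot be relied on across redirects.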

Rob--W added the discussion and follow-up: chrome labels and removed the needs-triage: firefox label on Nov 7, 2024
@Rob--W
Member

Rob--W commented Nov 7, 2024

Adding the "follow-up: chrome" label because Patrick said in today's meeting that he'd follow up with the original issue author.

I'd also be interested in Google Chrome's thoughts on the dedicated extension API for MIME handling, as expressed in my previous comment.

@jaissam10
Author

Hi @Rob--W
Thank you for your reply.

I would also like to hear more about the Mime Handler API and if we are planning to implement it.

Regarding your answers to the questions I asked, here are my responses:

In your situation, the ultimate iframe of interest is web content. Although embedded in an extension document, the behavior is ultimately up to the web platform, not extensions. One potential extension-specific solution could be to treat everything within an extension document as unpartitioned, but that comes with its own security and privacy concerns: External websites can trick the extension into triggering requests outside of its partition.

Just like we do not partition the storage when the extension page is the top level resource, it is not clear to us how the scenario changes if the extension page is in an iframe and nested under another top-level domain. Can you please help explain how these two scenarios are different from the lens of security?

There is no tracking issue in the WECG yet. But it has been on my wishlist for a long while to have an extension API that enables extensions to choose the specific partition to use when sending network requests ( https://bugzilla.mozilla.org/show_bug.cgi?id=1670278 ). I believe that such an API would be necessary to solve the general issue of the default partition being "wrong".

In our case, a webpage (e.g., http://www.abc.com/) is embedded within an extension page (a web-accessible resource), itself nested within a top-level PDF domain. With this setup, how would calls initiated from the embedded webpage (https://abc.com/) be unpartitioned in terms of HTTP cache? It seems that the new extension API you’re considering would allow requests initiated by the extension to bypass partitioning.
Could you clarify if this API would address this scenario as well?

CC: @patrickkettner

@Rob--W
Member

Rob--W commented Nov 8, 2024

(...) External websites can trick the extension into triggering requests outside of its partition.

Just like we do not partition the storage when the extension page is the top level resource, it is not clear to us how the scenario changes if the extension page is in an iframe and nested under another top-level domain. Can you please help explain how these two scenarios are different from the lens of security?

Top-level extension documents do not need to be web-accessible. It is clear that the tab displays extension content and presumably doing things on behalf of the user.

Extension iframes in web content can only be displayed when declared in web_accessible_resources. If an extension frame loads arbitrary content based on input from an external website, any other website can load the extension page in a frame to trigger such a load, thereby escaping the partition.

Even if you wanted to take measures to prevent that from happening, I think that many more extension developers won't, to the detriment of users. So as a default, it is less risky to continue to partition as usual.
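One such measure could be for the web-accessible extension page to embed only URLs from a fixed allowlist rather than whatever the embedding site passes in. The sketch below is illustrative only: `ALLOWED_EMBED_ORIGINS` and `resolveEmbedUrl` are hypothetical names, and https://abc.com is the placeholder origin used earlier in this thread.

```typescript
// Origins the extension frame is willing to embed (placeholder from this issue).
const ALLOWED_EMBED_ORIGINS: ReadonlySet<string> = new Set(["https://abc.com"]);

// Return the normalized URL if it is safe to embed, or null otherwise.
function resolveEmbedUrl(requested: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(requested);
  } catch {
    return null; // not an absolute, parseable URL
  }
  // Compare the full origin (scheme + host + port), not a substring of the URL,
  // so look-alike hosts such as abc.com.evil.example are rejected.
  return ALLOWED_EMBED_ORIGINS.has(parsed.origin) ? parsed.href : null;
}

// In the extension page, before creating the nested iframe:
//   const src = resolveEmbedUrl(candidateUrl);
//   if (src !== null) nestedFrame.src = src;
```

This narrows what an embedding site can make the extension frame load, but it does not change how the browser partitions the nested frame.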

There is no tracking issue in the WECG yet. But it has been on my wishlist for a long while to have an extension API that enables extensions to choose the specific partition to use when sending network requests ( https://bugzilla.mozilla.org/show_bug.cgi?id=1670278 ). I believe that such an API would be necessary to solve the general issue of the default partition being "wrong".

In our case, a webpage (e.g., http://www.abc.com/) is embedded within an extension page (a web-accessible resource), itself nested within a top-level PDF domain. With this setup, how would calls initiated from the embedded webpage (https://abc.com/) be unpartitioned in terms of HTTP cache? It seems that the new extension API you’re considering would allow requests initiated by the extension to bypass partitioning.
Could you clarify if this API would address this scenario as well?

Your abc web page could communicate with the extension to request the data stream needed to display the content. I assume that you are already doing that, because regular web content cannot load arbitrary cross-origin data. When the extension receives such a request for the data stream, it can invoke extension APIs, including whatever new API we can come up with.

@jaissam10
Author

Your abc web page could communicate with the extension to request the data stream needed to display the content. I assume that you are already doing that, because regular web content cannot load arbitrary cross-origin data. When the extension receives such a request for the data stream, it can invoke extension APIs, including whatever new API we can come up with.

We're using the Fetch API to retrieve the PDF buffer within the extension and passing the fetched buffer to the nested iframe ( https://abc.com/ ) via postMessage.
However, when the iframe ( https://abc.com/ ) loads, it initiates additional resource requests through script tags, which are either already in the DOM or added to it later.
In addition, we make Fetch API calls from this iframe ( https://abc.com/ ) for our business workflows, and these calls target abc.com and its subdomains.
Both of these cases (Fetch API calls and script loading) are also subject to HTTP cache partitioning.
Given this partitioning, how would we leverage this new API to handle these resource requests effectively in this context?

Mime Handler API

I have an additional question regarding the Mime Handler API. Could you please provide an update? Is there ongoing development for this feature?

@Rob--W
Member

Rob--W commented Nov 14, 2024

Mime Handler API

I have an additional question regarding the Mime Handler API. Could you please provide an update? Is there ongoing development for this feature?

There is no active development on that. I have occasionally raised it in discussions with Google, but it has never reached the planning phase, let alone implementation.

@patrickkettner Do you have anything to add here?

@jaissam10
Author

Thank you for the response, @Rob--W

While we await Patrick's reply on the MIME Handler API, could you please address the HTTP cache question below in the meantime?

We're using the Fetch API to retrieve the PDF buffer within the extension and passing the fetched buffer to the nested iframe ( https://abc.com/ ) via postMessage.
However, when the iframe ( https://abc.com/ ) loads, it initiates additional resource requests through script tags, which are either already in the DOM or added to it later.
In addition, we make Fetch API calls from this iframe ( https://abc.com/ ) for our business workflows, and these calls target abc.com and its subdomains.
Both of these cases (Fetch API calls and script loading) are also subject to HTTP cache partitioning.
Given this partitioning, how would we leverage this new API to handle these resource requests effectively in this context?
