Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offline support fixes #546 #1000

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

jasonwilliams
Copy link
Member

@jasonwilliams jasonwilliams commented Aug 12, 2019

This PR is adding a service worker (via the use of workbox.js) to the mdBook pages which allows for offline navigation and usage. We use the StaleWhileRevalidate mode from workbox

In the gif below the red you see in the network tab is the StaleWhileRevalidate kicking in and trying to load a fresh version, this obviously fails offline so is expected.

Gif, click for animation:
swRust

I have updated the workbox library as it was very out of date. Not sure on the failing test though
@sorin-davidoi @Michael-F-Bryan

How does this work?
When you visit the page the service worker will start to pre-cache all of the chapters within the book, it basically visits those urls and caches the response (but not the assets on those pages). So it doesn't matter which page you hit first, the service work will cache all pages in the book.
When you visit a page in the book, the contents are served from the cache, if there is no matching content in the cache it is served from the network.

Below is an image of what the cache table looks like in Chrome.
I made a slight change to the appendix page, you can see workbox replaced the cache with the updated version, but kept the other pages as they are. (Time Cached is different to the other entries)
image

When does the cache invalidate?
The mode set is staleWhileRevalidate so it will always try to pull a fresh copy in the background (whilst serving you content from the cache). If the content is the same it is basically thrown away. If the content is new workbox replaces the cache with the up to date page and uses that instead.

There is no explicit expiry time set.

Testing Instructions

Testing Instructions (build)

  • Checkout https://github.com/rust-lang/book
  • Add offline-support = true to the [output.html] section
  • Disable the check for localhost at the end of book.js in mdBook repo
  • Build mdBook (set the binary in your path target/debug/mdBook)
  • call mdBook serve in the book repo

@jasonwilliams jasonwilliams changed the title WIP: Offline support (take 2) fixes #546 Offline support (take 2) fixes #546 Aug 19, 2019
@jasonwilliams jasonwilliams changed the title Offline support (take 2) fixes #546 Offline support fixes #546 Aug 19, 2019
@DevQps
Copy link

DevQps commented Aug 22, 2019

@jasonwilliams Sorry for the ultra late reply! I promised last Saturday but it took me longer sadly..

I checked https://jason-williams.co.uk/book/ but when I plug off my Ethernet cable I get a "Not found page" on Firefox 68.0.2 (64-bit). Is it possible that something changed?
image

I will go check the other instructions as well.

@DevQps
Copy link

DevQps commented Aug 22, 2019

@jasonwilliams Could you give me a few more pointers on the instructions?

    1. Checkout https://github.com/rust-lang/book
    2. Add offline-support = true to the [output.html] section
    3. Disable the check for localhost at the end of book.js in mdBook repo
    4. Build mdBook (set the binary in your path target/debug/mdBook)
    5. call mdBook serve in the book repo

I cloned the book repo and added the offline-support line. I guess I also need to clone your version of the mdbook repository? What do you mean exactly with disable the check for localhost? What should I verify once I did these steps?

@Michael-F-Bryan
Copy link
Contributor

@DevQps this line:

  var isLocalhost =
    ["localhost", "127.0.0.1", ""].indexOf(document.location.hostname) !== -1;

@jasonwilliams
Copy link
Member Author

jasonwilliams commented Sep 4, 2019

@DevQps could you refresh your cache and try https://jason-williams.co.uk/book/ again? (whilst online).
Then go offline and give it another go.

If you have the same problem, open your dev tools and go to application then take a screenshot for us

@DevQps
Copy link

DevQps commented Sep 9, 2019

@DevQps could you refresh your cache and try https://jason-williams.co.uk/book/ again? (whilst online).
Then go offline and give it another go.

If you have the same problem, open your dev tools and go to application then take a screenshot for us

I tried it again! I figured out that it caches the pages I have been to before, but not the pages I have been to. For example I could browse to Functions again, but not to other pages. I did a Crtl + Shift + R refresh and deleted all storage / cookies manually as well, but no change.

@jasonwilliams
Copy link
Member Author

@DevQps what browser are you using?
Is your dev tools showing all the pages downloaded in the service worker tab?

@DevQps
Copy link

DevQps commented Sep 14, 2019

Hey Jason!,

I tried it again on Firefox and it worked. Then I tested it on Chrome again, and the first time it failed. But when I reconnected + browsed a page it worked. Could it be that either A) It takes some times (as in around 10 seconds) to fully load everything, or B) That it only works after a reload (i.e. the new service worker is run?)

So I think everything works now!

@jasonwilliams
Copy link
Member Author

@DevQps the dev tools tab (applications) should show you which pages have downloaded
Here is a snippet of mine:
Annotation 2019-09-15 175337

You may have clicked away before everything downloaded

@DevQps
Copy link

DevQps commented Sep 22, 2019

That's probably the case! I think this MR is fine!

@jasonwilliams
Copy link
Member Author

@DevQps what’s the next steps?

@DevQps
Copy link

DevQps commented Oct 16, 2019

I am not sure! @steveklabnik Can you review this maybe?

@steveklabnik
Copy link
Member

I am not really involved in mdbook maintenance these days; @Dylan-DPC should know.

src/theme/sw.js Outdated

// Local resources
workbox.routing.registerRoute(
new RegExp(".woff2?|.ttf|.css|.js|.json|.png|.svg"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a more robust regex?
And all registered routes above may need some modifications and escaping such as ^.

Suggested change
new RegExp(".woff2?|.ttf|.css|.js|.json|.png|.svg"),
/\.(woff2?|ttf|css|js|json|png|svg)$/

previousScrollTop = document.scrollingElement.scrollTop;
}, { passive: true });
if ("serviceWorker" in navigator && !isLocalhost) {
navigator.serviceWorker.register("sw.js").catch(function(error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encountered this error when navigate to pages other than base URL.

Service worker registration failed: TypeError: "ServiceWorker script 
at http://localhost:3000/format/sw.js 
for scope http://localhost:3000/format/ encountered an error during installation."

The problem here is that the register path is relative to current file URL. We can resolve it by adding <base href="https://our.basedomain.com"> though this would be a huge change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that we could merge this PR and open another if the problem is too thorny to deal with at this time.

@jasonwilliams
Copy link
Member Author

jasonwilliams commented Oct 18, 2019

@weihanglo rebased & fixed regex

@weihanglo
Copy link
Member

#1000 (comment)

Besides this baseURI problem, I think we can merge this PR to make some progress. We can try to deal it later in a separate PR.

@weihanglo
Copy link
Member

@jasonwilliams
The relative path issue has been fixed. See jasonwilliams#1

@nihaals
Copy link
Contributor

nihaals commented Jun 12, 2020

What's currently blocking this from being merged? Having it use the cache first then fetch and give a refresh pop up?

@nihaals
Copy link
Contributor

nihaals commented Aug 15, 2020

It looks like fontawesome's acting weird. With "Lie-Fi", it takes 2 failed requests for the SW to return and since it for some reason isn't being cache properly it shows invalid icons every time you go to a new page. Not sure if this has been fixed since the version you have on your hosted version, but seems odd. It loads fine completely offline.

Also, not sure if it's worth adding a config table for custom regex, definitely has a use but you can just make your own sw.js (although would miss out of possible future updates just to add new paths). Haven't tested yet but #1253 might result in chapters not being cached too

@jasonwilliams
Copy link
Member Author

What's currently blocking this from being merged?

Not much just testing.
I’m not going to keep rebasing it if the owners of this project don’t want it though so there needs to be some plan.

I can fix issues but if there’s no response there’s not much I can do.

@nihaals
Copy link
Contributor

nihaals commented Aug 15, 2020

Only issues I've had while testing is FA afaik. Knowing when it's done caching/refreshing the cache might be helpful on mobile (my favourite is https://stripe.com/docs/api which has a little icon at the top, although I don't know how it behaves if it's stale). Since it would be helpful with my project I'd even be willing to maintain a fork and keep rebasing/merging if the devs don't find this useful (judging from the reactions in the issue it didn't seem like a priority but since it's already implemented I'm not sure).

Are you expecting issues to be found? I feel like it would mostly be feature requests that can be added after an initial release (e.g. the popup when updated)

@jasonwilliams
Copy link
Member Author

jasonwilliams commented Aug 15, 2020

@nihaals the first issue we need to sort out is StaleWhilerevalidate vs “fetch first”. I don’t think we made a decision on this.

Personally I don’t mind SWR but @ehuss had issues with it. We need to make a decision on this to move forward, otherwise we’re at an impasse.

I think outside of that it was fine (for me) but @ehuss hit some bugs. I couldn’t reproduce any of them so we we’re stuck.

(my favourite is https://stripe.com/docs/api which has a little icon at the top, although I don't know how it behaves if it's stale)

This is good to know, nice to see there is precedent for doing this on documentation pages

@nihaals
Copy link
Contributor

nihaals commented Aug 15, 2020

Just off the top of my head

SWR Pros/FF Cons

  • "Lie-Fi" is usable
    • Your project might aim for users more likely to be affected by this or they need to prioritise it displaying content
    • The key part of one of the talks I linked is thinking "offline first", assuming good internet connection may harm the UX of users without
  • Loads faster
    • Something that benefits both online and offline users
    • Possibly one of the biggest factors that can be affected by a SW that makes people more likely to see/read the book
  • Can choose between fresh (refreshing when seeing the pop up) and speed (e.g. you just need to quickly check something)
    • If you find yourself always needing fresh data (can't think of an example apart from something news related?), you can change to fetch first in the SW, but switching to offline first is more involved due to a possible popup

SWR Cons/FF Pros

  • Receive stale content
    • In my opinion, not a big deal
    • Individual pages are unlikely to be huge so they would get fresh content soon after
    • A simple pop up saying they can refresh means if they need to prioritise fresh they can

I think double refresh is a hybrid approach but hurts UX and takes away the 2nd and 3rd SWR Pros

I haven't tested on iOS yet but I don't see why it wouldn't work, although workbox is fairly out of date now, there's a new major version, in fact v6 is currently in its 2nd alpha release

Edit: iOS works great

@ehuss
Copy link
Contributor

ehuss commented Aug 22, 2020

I'm sorry that it seems that I am ignoring this. I have very little time to spend on mdbook. I don't really know anything about webworkers, so it is difficult for me to understand or review this. I asked a few questions above which never got answered, which makes me more weary about this. In particular, I have no idea how to inspect, debug, or otherwise work with webworkers. I don't know how to delete them, or stop them, or control them as a user.

I see your points about offline-first, but I still feel like for the common case of people being online will result in a worse experience (that is, a flash and reload, or a manual intervention). Perhaps if offline support required an opt-in by the user, that would reduce the risk, and make it easier to roll out and test? I'm also not proposing that as a permanent solution, but rather as a way to roll it out and get more testing and experience and exposure with it. It would also allay my concern that if it has any problems, there would be some way for the user to disable and remove the cached version.

@jasonwilliams
Copy link
Member Author

I have very little time to spend on mdbook

Is there anyone else maintaining this who can help?

I asked a few questions above which never got answered, which makes me more weary about this.

Oh what did i miss? I'm sure i answered all the Qs.
The only things i can't really help with are the issues you faced as neither me nor @nihaals are facing the problems you described.

In particular, I have no idea how to inspect, debug, or otherwise work with webworkers. I don't know how to delete them, or stop them, or control them as a user.

All of this is achieved in dev tools.

I don't know how to delete them, or stop them, or control them as a user

You can unregister them from that view and that will stop them.
You can inspect the cache storage on the left too, this will help you with debugging stale cache etc.

I see your points about offline-first, but I still feel like for the common case of people being online will result in a worse experience (that is, a flash and reload, or a manual intervention)

This hasn't been my experience when using:

Perhaps if offline support required an opt-in by the user

It's already opt-in by the user of mdBook so no one will be using this feature unless its switched on. Having the end users also opt-in is a bit overkill and would involve some UX explaining to them what they're opting into. I think there's risk of it being confusing, plus I've not seen this prior art with other offline sites or documentation pages.

I think a way forward would be to have the feature off by default (which it already is) and mark it as an experimental feature.
We can start with a short cache lifetime if we're unsure about something not working properly.

there would be some way for the user to disable and remove the cached version

Users are able to do this in their developer tools but it should be rare anyone needs to do this unless there's a bug.

@ehuss
Copy link
Contributor

ehuss commented Aug 22, 2020

Is there anyone else maintaining this who can help?

Unfortunately, not really.

Oh what did i miss?

Here:

  • How do I clear the web worker cache on mobile safari?
  • Is it normal to have lot of errors in the console (online and offline)?
  • Does chrome require a hostname or should an IP work? Or, in other words, how do I get this to work locally with chrome?

Other questions:

  • Have we confirmed it doesn't work for Android? Surely someone has a colleague or friend with an android device?

All of this is achieved in dev tools.

Thanks. How do I do this for Safari and Mobile Safari?

This hasn't been my experience when using:

How do they handle when content changes? Do they just show stale content? How do the user know when to reload?

I've been doing a little testing again, and it is quite concerning when I load up a page, and it shows content from months ago, and it is not really clear how to get it to update. Sometimes it seems navigating to pages is enough, sometimes reloading is enough, but that experience just seems wrong to me. For example, let's say there's a blog post, "hey, check out this new Rust feature" with a link to the docs, and a bunch of users visit that, and the docs aren't there, because they have an old cache. Does that not seem bad? It seems really bad to me.

I'm also getting a lot of font flashing on your book page (chrome and firefox). Is it having problems with the FA fonts? Are those being reloaded from scratch on each page?

I think a way forward would be to have the feature off by default (which it already is) and mark it as an experimental feature.

Who would experiment with it? How would we know it is safe to use, or turn it on for rust-lang books?

@jasonwilliams
Copy link
Member Author

@ehuss could you go into more detail off how you’re testing, what endpoint you’re hitting etc.
We need to be able to reproduce the same issue to help

@ehuss
Copy link
Contributor

ehuss commented Aug 22, 2020

Sure, I have a variety of ways I am trying to test:

  • Built this PR locally (currently on ca2c0e1), and build a book with that. Then serve that with some server. I've been using Python's basic http server (python -m http.server) and an https server, also in Python.
  • Testing with various browsers (chrome, safari, firefox, mobile safari) on a few different platforms (primarily macos, but also Windows and Linux and iOS).
  • Visit the local server (https://192.168.1.108:4443 in my case).
  • Do various offline tests, like visiting the page after turning off the network. On macOS, I just turn off my wifi. On mobile, I go into airplane mode, on Windows I disconnect the network.

When testing content changes, I just make a change to the book, and rebuild it, then try to see what scenarios it will reload (visiting in a new tab, refreshing, navigating between pages, etc.).

So, from a basic test (no offline testing):

  1. Load the book in a web browser.
  2. Make a change to the book and rebuild it.
  3. Open the book in a new tab or window, notice it shows the old content.

@nihaals
Copy link
Contributor

nihaals commented Aug 23, 2020

How do they handle when content changes?

Pretty sure the standard is to show a little popup at the bottom saying there's a new version and they can update by refreshing (sometimes with a refresh button)

This hasn't been my experience when using: [...]

I agree, this seems to be the most popular way and in my opinion the best (using the approach above). I don't think I've seen anyone do a double reload

Having the end users also opt-in is a bit overkill and would involve some UX explaining to them what they're opting into. I think there's risk of it being confusing

I agree, offline websites is somewhat new, and for most users it definitely is. In one of the talks I linked they said that they hope having something like "this website can now be used offline" will fall out of fashion as eventually most websites will be like this, but at the moment, it's useful to simply tell the user as they probably won't expect offline

Is it having problems with the FA fonts?

I also had this issue (#1000 (comment))

[release behind a flag]

I think this would be the best approach as creators may need to be aware of what it is doing due to them possibly having little exposure to SWs. I'm sure when people see the option added in the change log they'll be intrigued and want to test it which will help get more testers (although there shouldn't be many issues as the library used is popular and well tested)

How do I clear the web worker cache on mobile safari?

I'm not sure as I don't use Safari, but they're dev tools seem to be lacking (which doesn't seem rare). This support documentation suggests that there are not Safari dev tools for SWs (although it does warn that it is possibly out of date and could be added in Safari 14). As I said above, browser unique bugs should be extremely rare and would most likely be a library issue or even a browser bug. The current library version is fairly out of date (#1000 (comment)) so updating could help if you do run into any issues

Is it normal to have lot of errors in the console (online and offline)?

The only errors I have are relating to FA (but I believe this issue may be elsewhere, I'm not sure if it's due to a bad service worker. I'm really unsure what's causing the FA issue, I haven't used it much myself but it the setup seems fine. Possibly updating could help but I haven't tested yet

Does chrome require a hostname or should an IP work? Or, in other words, how do I get this to work locally with chrome?

localhost definitely works, not sure if local files is part of the spec but probably safe to use a server

For example, let's say there's a blog post, "hey, check out this new Rust feature" with a link to the docs, and a bunch of users visit that, and the docs aren't there, because they have an old cache

I didn't really think about that and it's a great point. If the page is new, the 404 shouldn't get cached so it would be fine, and if it was something like a changelog page that gets updated, I do agree that it could be confusing, however, with an update pop up I think this could be solved. The Google pop ups aren't obnoxious but are still easily visible which is probably an important balance. If someone just navigates to another page though it would update, so stale content would only be displayed on the first page they see.

It seems most of your issues are UX related. Would fixing the FA problem and having the "update is available" pop up solve those? I think the pop up could make a huge difference and could be useful to actually test on users unaware of the feature to see how well it performs (maybe once this is merged, PRing it into the Rust Book and seeing if that community notices it, finds it annoying and think it's worth the benefit of decreasing load times and having offline access. Even if a feature wasn't used in The Book I still think it could have value for some mdBook users (for example I want to create a book that would primarily be used in places where there wouldn't be great internet access or needs to be accessed no matter what, so this trade off would be trivial to me, especially when stale content would not cause major problems, another type of user who could find this useful)

@jasonwilliams
Copy link
Member Author

If you're referring to my site ignore the fonts errors, i think that's my server not handling that file type, I don't think its related to offline at all.

@nihaals
Copy link
Contributor

nihaals commented Aug 23, 2020

I was, but there was also strange FA behaviour, not just console errors, but I haven't built the PR and tried hosting it myself yet

@ehuss
Copy link
Contributor

ehuss commented Sep 6, 2020

Would fixing the FA problem and having the "update is available" pop up solve those?

I still feel like this would not be a good user experience. For docs that update rapidly, this would show up essentially every day they visit. For users who read the docs sporadically (say once a month), they will essentially run into this every time they visit, which I feel is a bad experience.

Would it be possible to have it be more network-centric so that the current behavior is retained, and the "offline" behavior on engages when needed? For example, can it:

  • Always load from the network first, and display that page.
  • If that load fails (after some relatively short timeout), consult the offline cache and display that.
  • To accommodate the flaky network issue, when it goes into offline mode, switch to loading directly from the offline cache immediately (like this PR works), and then switch back after some period of time (say 24 hours).

Another question, I'm not sure if I asked before, but is there a way to "unload" the service worker? Let's say someone enables this, and then decides to turn it off. Will it be permanently be enabled for any user who visited the site before it was disabled?

@jasonwilliams
Copy link
Member Author

jasonwilliams commented Sep 6, 2020

I’m ok to try “network first”. It’s not “great” when you have no network because some browsers hang for a bit before falling back to the cache but it’s probably the closest we’ll get to pleasing both scenarios.

Another question, I'm not sure if I asked before, but is there a way to "unload" the service worker? Let's say someone enables this, and then decides to turn it off. Will it be permanently be enabled for any user who visited the site before it was disabled?

I think we can have a switch for offline on and off, if you turn it off it can unload the service worker. However I don’t think the switch is needed if we’re doing “network first” as it will make no difference to your experience as you’re fetching from the network anyway.

So we need to make some decisions here:

  • If we do offline “cache first” buy default, then yes a switch would work. The user can switch it on or it can be ok by default set by the author.
  • If we’re fetching from the network first anyway a switch is pointless and only adds confusion. It’s a bit like having a switch to use the HTTP cache in your browser on a website.

@nihaals
Copy link
Contributor

nihaals commented Sep 6, 2020

If the goal is to have both online first and offline first implemented, I think having offline first as the default would be better as it's definitely where most projects seem to be going. I'm not sure how easily you could describe this switch though, you may need a short 2 sentences describing each option.

After reading a web.dev post about pretty much this problem, I thought of another idea which would also have a more user friendly configuration screen. What if we use offline first, but have 3 states:

  1. The content was retrieved less than CONFIG_OPTION_1 [minutes] ago. This means the content will be served straight from the cache. That's it. (This state could be removed but would alleviate some server time.)

  2. The content was retrieved less than CONFIG_OPTION_2 [minutes] ago (and more than CONFIG_OPTION_1). This means the content will be served straight from the cache, but there will also be a network request to fetch new content and refresh the cached content if needed. This would also invoke the standard "new content, refresh" pop up.

  3. The content was retrieved more than CONFIG_OPTION_2 [minutes] ago. This means the cached content is "very" stale and will be discarded. No cached content will be displayed and new content will be retrieved (as if it is their first time loading the site).

Having both of these options user configurable and book author configurable means:

  • For people writing very dynamic books, they can either set (the default values of) CONFIG_OPTION_1 and CONFIG_OPTION_2 to the same value (maybe even 0) which means most of the time only fresh content is shown or just low values and values with a small difference.
  • For people writing books that want long cache times and aren't very dynamic, they can set CONFIG_OPTION_1 and CONFIG_OPTION_2 to larger numbers (or even -1, although that shouldn't really be allowed for CONFIG_OPTION_1) so users will get their cached content and get the pop up when there is an update.

  • CONFIG_OPTION_1 <= CONFIG_OPTION_2
  • CONFIG_OPTION_1 != -1 (to avoid books never being updated)
  • CONFIG_OPTION_2 can equal -1 (standard offline first behaviour)

@ehuss
Copy link
Contributor

ehuss commented Sep 22, 2020

I’m ok to try “network first”. It’s not “great” when you have no network because some browsers hang for a bit before falling back to the cache but it’s probably the closest we’ll get to pleasing both scenarios.

If it can do something like I described above, I think the delay should only be visible once over some time period (say 24 hours). I'm uncertain how often that would happen in practice. What kind of scenarios would cause it to go offline?

If we’re fetching from the network first anyway a switch is pointless and only adds confusion

Yes, my intent was that if it can automatically switch to offline mode, there would not be a need for a user-visible switch.

My question was more for the publisher side. If I publish a book with this turned on, and it causes problems for my readers, I want to turn it off for everyone. AIUI, if you just disable the config option, it doesn't disable the service worker, and there isn't a practical way for users to unload the service worker, leaving it permanently running on their system. Would it be possible, if the option is turned off, if there was some javascript that removed the service worker if it exists?

@nihaals
Copy link
Contributor

nihaals commented Sep 22, 2020

@ehuss What do you think about #1000 (comment)?

@ehuss
Copy link
Contributor

ehuss commented Sep 22, 2020

What do you think about #1000 (comment)?

I guess I don't understand the difference between option1 and option2. Does workbox.js always issue a request in the background? I thought it would always check. What would be the purpose of completely disabling that background request? And for option 3, what would be the purpose of never showing the cached entry? Isn't that really important, like the whole point of offline viewing? I would expect it to show the cached content for as long as possible if it is unable to update.

Also, I may be a bit unclear by what you mean by "user", as there are two different users, the reader and the publisher. Perhaps it would be best to use that terminology when talking about the interaction. I would personally prefer to avoid complicating things for the reader as much as possible.

@nihaals
Copy link
Contributor

nihaals commented Sep 22, 2020

Some pseudo-code that should hopefully explain it better:

def fetch_and_cache(path: str) -> Content:
    # Probably wouldn't actually use a path when refreshing as the entire book should be cached (config option to only cache visited pages in the future?)
    content = get(path)
    set_cache(path, content)  # store content with path and timestamp
    if content_already_displayed():
        # Specifically for the background refresh, true if the user can already see content but this is still running
        # Would also check if the fresh version is actually different to the previous version
        show_popup('Update available, refresh to see')
    return content

def get_content(path: str, config_option_1: int, config_option_2: int) -> Content:
    """Gets content at path respecting cache options"""
    if config_option_1 > config_option_2:
        raise ValueError()
    if config_option_1 <= -1:
        raise ValueError()
    if config_option_2 < -1:
        raise ValueError()
    if config_option_2 == -1:
        # Simplify this snippet's logic
        # Would actually behave like infinity
        config_option_2 = 2^128

    cached_content, timestamp = get_cached_content(path)  # timestamp is a UNIX timestamp for simplicity
    if cached_content is None:
        return fetch_and_cache(path)
    age = now() - timestamp
    if age < config_option_1:
        # Only show cached content, no cache refresh
        return cached_content
    if config_option_1 <= age < config_option_2:
        # Concurrently refresh cache and show cached content (background fetch)
        run_in_background(fetch_and_cache(path))
        return cached_content
    if age >= config_option_2:
        # Fresh content *only*
        set_cache(path, None)
        return fetch_and_cache(path)

Publishers would be able to set default values for config_option_1 and config_option_2 which the reader can change if they want

I guess I don't understand the difference between option1 and option2

State 1 will only show cached content, state 2 will show cached content while also refreshing the cache (useful if they are quickly clicking through pages to remove some origin server load, expected value (config_option_1) around a minute)

Does workbox.js always issue a request in the background? I thought it would always check.

I believe there is a lower level API where you have more control, otherwise you can simply not use workbox and write the serviceworker directly

What would be the purpose of completely disabling that background request?

To remove unneccesary server load and egress (anyone could benefit from having this at a few seconds as it's unlikely the server updating within those 5 seconds will be important since they are already browsing). It would also help the client if they have a bandwidth cap like on cellular data (refreshing the cache every 5 seconds is already probably more often than they would want so if the publisher sets a low value the reader can increase it)

And for option 3, what would be the purpose of never showing the cached entry? Isn't that really important, like the whole point of offline viewing? I would expect it to show the cached content for as long as possible if it is unable to update.

State 3's user experience doesn't feel that much different from online first, many users on mobile stop loading a page after a few seconds and the timeout for online first would need to be high enough to be useful. A possible extra boolean config option could be added to change state 3 (or just replace the state 3 in the above snippet with this one if it doesn't seem useful):

    if age >= config_option_2:
        # Force fetch new content with timeout, fall back to cache
        with timeout(seconds=5):
            return fetch_and_cache(path)
        return cached_content

I think this is probably the best way to implement the advantage of online only since the only benefit is showing fresher content after a period of inactivity (which works well with the final state)

@ehuss
Copy link
Contributor

ehuss commented Sep 22, 2020

To remove unneccesary server load and egress

Does this PR not use the same caching logic as the browser normally does? I would expect it to have the same behavior as normal online behavior (doing whatever a browser normally does to determine if it needs to access the network and checking if a page is out-of-date).

@nihaals
Copy link
Contributor

nihaals commented Sep 22, 2020

You can't cache the initial request without SWs AFAIK, so normal online behaviour doesn't exist (for the actual pages)

In this implementation, any normal browser caching stuff should be bypassed as that would cause conflicts and you wouldn't have control over it. By having it under a config option, the publisher and reader can actually control the no fetch caching of pages (which I think is objectively better?)

@sanmai-NL
Copy link

@jasonwilliams
Copy link
Member Author

jasonwilliams commented Nov 9, 2020

  • if someone can rebase this PR to my branch that would be great, If not I don’t know when I’ll get around to doing it.
  • I think we should allow the publisher to choose swr or network first in their settings and support both. If they choose swr then we can offer a pop up when the data has updated. This means we don’t need to set arbitrary times.

@nihaals looking at #1000 (comment) id rather not get too complex as a first attempt at this feature. I think for now we can say publishers can select network-first or cache-first, then maybe a cache time for how long they want their pages to stay in the cache for (default to some time)

Does workbox.js always issue a request in the background? I thought it would always check.

@ehuss the choices are here: https://developers.google.com/web/tools/workbox/guides/route-requests#handling_a_route_with_a_workbox_strategy

@sanmai-NL
Copy link

@jasonwilliams I don't have write access..

@jasonwilliams
Copy link
Member Author

@sanmai-NL you should be able to clone https://github.com/jasonwilliams/mdBook/tree/offline_support and rebase it against upstream then do a pull request back up

@nihaals
Copy link
Contributor

nihaals commented Nov 13, 2020

Just saw this

image

It reminded me that client side apps already have something like the update pop up we were talking about and I don't think that causes many UX problems (although this is only if you leave the tab open during an update, if you go to it for the first time in a while you still get the new assets unless they also have a service worker)

@jasonwilliams
Copy link
Member Author

@nihaals if the publisher opts in to offline first we can implement that option.

@jasonwilliams
Copy link
Member Author

I still believe there is a significant use case for mdbooks to support offline usability. But won't have time to catch up on this now, im happy if someone else wanted to pick up on this from where I left off

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants