Make FSRS the default? #3616

Open
dae opened this issue Dec 6, 2024 · 65 comments
Comments

@dae
Member

dae commented Dec 6, 2024

In the next non-trivial (not 24.11.x) update, I think it's about time we enable FSRS out of the box. Any objections?

@Expertium
Contributor

Expertium commented Dec 7, 2024

Any objections?

Yes. Let's not make FSRS the default before automatic optimization is implemented. Realistically, how many users do you expect to click "Optimize" at least once in their lifetime? I'd say 50% at best, likely less. And how many users will click "Optimize" multiple times? 10%? 5%?

Right now it's mostly power users and tech-savvy people that are using FSRS, so they know that optimization should be done regularly. An average user who is using Anki with out of the box settings won't realize that optimization has to be done at all.
For a power user, automatic optimization saves 2 seconds of clicking "Optimize". For an average user, it makes the difference between using the default parameters and the personalized parameters.

@L-M-Sherlock
Contributor

My point: FSRS with default parameters is better than SM-2 in 91.9% of cases.

image

source: https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/Superiority-9999.png

A CONFLICT OF INTEREST: I'm the main developer of FSRS, please disregard my opinion.

@Expertium
Contributor

Expertium commented Dec 7, 2024

image

And FSRS-5 with optimized parameters is better in 99.0% of cases. In % it may not seem like a big difference (91.9% vs 99.0%), but in terms of odds, it's an improvement from "1 user out of 12 would be better off using SM-2 than FSRS" to "1 user out of 100 would be better off using SM-2 than FSRS".
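
For anyone who wants to check the odds arithmetic, here is a minimal sketch. The function name `one_in_n` is made up for illustration; the two percentages are the ones quoted in this thread.

```python
def one_in_n(superiority_pct: float) -> float:
    """Return N such that roughly 1 user in N is better off with the other algorithm."""
    losing_share = 1.0 - superiority_pct / 100.0
    return 1.0 / losing_share


for label, pct in [
    ("FSRS-5 (default parameters) vs SM-2", 91.9),
    ("FSRS-5 (optimized parameters) vs SM-2", 99.0),
]:
    # 8.1% of users is about 1 in 12; 1.0% of users is 1 in 100.
    print(f"{label}: about 1 user in {one_in_n(pct):.0f}")
```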

@user1823
Contributor

user1823 commented Dec 7, 2024

And FSRS-5 with optimized parameters is better in 99.0% of cases.

Well, no one here is saying that automatic optimization won't be implemented.

But, until we can develop AO, I think that it's reasonable to provide users with something that is better than what they currently have, even if it is not the best.

Also, let's stop discussing AO now, because arguments from both sides have been made and it's dae who has to make the decision.

If you (or someone else) have any other objection, please feel free to discuss.

@L-M-Sherlock
Contributor

L-M-Sherlock commented Dec 7, 2024

FSRS-5 with AO is better than FSRS-5 with default parameters in 80.5% of cases, which is less than 91.9%.

image

So the improvement from AO is less than improvement from FSRS-5.

@dae
Member Author

dae commented Dec 7, 2024

@Expertium you have contributed a large amount of time and effort both into suggesting improvements to FSRS, and advising users on its correct usage, including very comprehensive posts like https://old.reddit.com/r/Anki/comments/1h2otym/anki_2411_one_of_the_biggest_updates_ever/. They have been noticed, and I really appreciate all the work you have put in.

That said, this is the second time you've attempted to delay an improvement until it's perfect. I don't think that's the best approach - I think it's better that we get these improvements into the hands of the bulk of users, and address any issues in the future.

@Expertium
Contributor

Expertium commented Dec 7, 2024

Alright. Maybe make the mythical optimization reminder (I have never seen it myself) more frequent? I've heard that it only shows up for people who have used "Optimize all presets", though. That would have to change.

@chrislongros

I agree that FSRS should be set to default, a milestone change.

@Expertium
Contributor

@dae how about a really radical solution - Optimize [all presets] right next to Sync? That way it's impossible to miss.
This + a pop-up that appears each time the number of reviews in the collection doubles (starting from, say, 100 reviews), which is a better rule than "every month". So the user will get a pop-up at 100 reviews, then at 200, then at 400, etc.
Optimize right next to sync

@nymvaline

I'm neutral on this change, but I wanted to ask what "by default" means. For new users/profiles only? Or would it take effect for everyone as soon as we download the new version (whatever number it ends up being), unless I intentionally change it back to SM-2 + custom scheduling?

(I'm also wondering about AnkiMobile, since that one often updates in the background without me noticing. If it's not just for new users/profiles and would go out to everyone, would I get an alert that the scheduler has changed?)

@Expertium
Contributor

Expertium commented Dec 7, 2024

No forced transition. If there is a previous installation of Anki on your device, the settings of that installation will be kept. If no previous installations are found aka this is the first time you are installing Anki, FSRS will be enabled by default. The option to enable SM-2 will be there for backward compatibility reasons, it's not like SM-2 will be completely deleted.

That's how I imagine it, and I'm betting 100 bucks that's how it will be.

@brishtibheja
Contributor

I would like to bring @dae's attention to these issues I feel need to get fixed.

@Expertium
Contributor

Expertium commented Dec 7, 2024

I wouldn't expect 2) to be solved any time soon, tbh.
And frankly, I'm much more interested in whether Dae will like this idea.

@Expertium
Contributor

Expertium commented Dec 7, 2024

Wait, I completely forgot about the Hard problem.
https://forums.ankiweb.net/t/how-to-prevent-users-from-misusing-hard-ideas-are-welcome/49092/133?u=expertium
@dae nevermind, I am unironically, chronically, radically, medically, concretely, discreetly opposed to making FSRS the default algorithm until the UI is changed to make it clear that Hard is not "fail".
Remember, at least 10% of Anki users misuse Hard, and this is based on nerds from r/Anki. It's likely worse outside of r/Anki, since r/Anki is a place specifically for Anki enthusiasts.
image

Dae, here are some options:

  1. Do nothing. Cons: Users will continue misusing Hard and then complain that FSRS is terrible.
  2. Implement what I and others have suggested
    image
    Cons: people who like symmetry will hate it.
  3. Make 2 buttons the default, and leave an option to enable Hard and Easy in Preferences. Cons: it will make old documentation and old videos about Anki confusing.
  4. Make an interactive tutorial that helps new users, like this. Cons: it will require a loooooot of effort.

EDIT: Dae, please tell me that this issue is "I am considering making FSRS the default, but willing to postpone it if there are serious roadblocks" and NOT "I have already made up my mind, but feel free to shout into the void to get a false sense of involvement". Because right now I'm getting the vibes of the latter.
Feel free to call me a dunderhead and tell me to be more respectful, but I'm not the only one who feels this way.

@YukiNagat0
Contributor

YukiNagat0 commented Dec 7, 2024

Automatic optimization (without option to disable it and manually tweak the weights) is a very bad idea for many reasons.
FSRS will never be perfect to the point where you can rely on it without a doubt.

The biggest problem with FSRS is overfitting.
I will try to explain what I mean:
Let's say the user only uses the Again and Good buttons, and his usual pattern of answering (with optimized weights, let's call them w_n) is something like:
1st day (New): 1, 3, 3, 3
3rd day: 3
8th day: 3
...
This user answers most of his new cards in such a pattern for quite a long time. The problem arises when the user decides to re-optimize the weights to the w_(n+1) state. FSRS, analyzing his reviews, sees that if the first answer is 1, then the user will answer 3 on both the 3rd and 8th day, so FSRS will increase w_0, the initial stability (Again), in such a way that the user can immediately skip to the 7th day, for example.
But the point is that there is no evidence that this will result in the same "recall quality/memorization of this piece of information" as in the "1st, 3rd, 8th day" case.


To summarize what I have written above: if there is a repeating good pattern of answers, then FSRS will optimize it for fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.


I have exactly this situation, and because of it I have to manually tweak weights (mainly w_0) to get more realistic intervals.

With AO, such manipulations will not be possible.


Manual tweaking and the ability to leave the weights in one state until the user wants to optimize them should stay in Anki (as well as manual "learning/relearning steps").
For me, the right decision seems to be to add a toggle in the preset menu for AO on preset basis.
Or at least, as Expertium suggested - "Optimize button [all presets] right next to Sync".

These two solutions will satisfy all types of users - those who want to fully rely on FSRS and those who want some thoughtful control.


Regarding "Make FSRS default", I fully agree.

@Expertium
Contributor

FSRS will increase w_0, the initial stability (Again), in such a way that the user can immediately skip to the 7th day, for example.

To summarize what I have written above: if there is a repeating good pattern of answers, then FSRS will optimize it for fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.

That's not how FSRS works. @L-M-Sherlock feel free to clear the misunderstanding.

@YukiNagat0
Contributor

YukiNagat0 commented Dec 7, 2024

FSRS will increase w_0, the initial stability (Again), in such a way that the user can immediately skip to the 7th day, for example.

To summarize what I have written above: if there is a repeating good pattern of answers, then FSRS will optimize it for fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps.

That's not how FSRS works. @L-M-Sherlock feel free to clear the misunderstanding.

If you are so sure, please explain why my w_0 always increases after re-optimizing the weights.
Even if my understanding of the FSRS algorithm is not entirely correct, the result is the same: FSRS is overfitting, which results in an ever-increasing w_0 and hence increasing intervals.

@gloryi

gloryi commented Dec 7, 2024

Not a contributor, but I'd like to propose to the team an option to choose one of the two models after first launch, with a simple description and a clarification that the model can be changed in settings later, instead of just changing a years-long default that has proven to attract a lot of users. This would also make users more aware of the different models and of the importance of some settings. I guess way more users would try both models and different settings after initial launch if they saw that banner.

"Before starting, we'd like to ask: how do you prefer to learn?

  • Experimental way, possibly spending less time.
  • More stable and predictable way, sometimes having more reviews for a day.
    (Learning model could be changed in settings later)"

@YukiNagat0
Contributor

YukiNagat0 commented Dec 7, 2024

Not a contributor, but I'd like to propose to the team an option to choose one of the two models after first launch, with a simple description and a clarification that the model can be changed in settings later, instead of just changing a years-long default that has proven to attract a lot of users. This would also make users more aware of the different models and of the importance of some settings. I guess way more users would try both models and different settings after initial launch if they saw that banner.

"Before starting, we'd like to ask: how do you prefer to learn?

  • Experimental way, possibly spending less time.
  • More stable and predictable way, sometimes having more reviews for a day.
    (Learning model could be changed in settings later)"

Unfortunately, the Anki team chose the path of "Let's simplify everything by taking away the user's right to customize the app/their workflow and remove 'unnecessary' options" some time ago (if you want, you can read the PR for Load Balancer - #3230). So from now on there will never be a banner that lets the user decide something for himself because it is "too difficult for the average user".

@david-allison
Contributor

david-allison commented Dec 8, 2024

Set FSRS as the default, with 4 caveats (✨)

Caveats (TL;DR)
  • We should aim for automatic optimization in future.
  • Delve into the comparison between FSRS-5-unoptimized and SM-2 to be sure we're not picking up pennies in front of a steamroller. The users in the 8.1% may have a scheduler with catastrophically worse probability of recall, and FSRS may be marginally better at prediction (unlikely, but let's use the data).
  • A new user to Anki under FSRS needs to be warned if misuse of Hard is detected. I'd be happy with something as simple as repurposing the old 'too many decks can slow down your collection' warning on the main screen if hard misuse is detected/suspected.
  • (nice to have) - simplify FSRS settings. Most of the settings pane can be hidden behind either an 'advanced' option, or prompted when FSRS is enabled for the first time, then hidden.

  • We should not provide a choice of algorithm during onboarding:
    • FSRS is a sensible default, and it should be a very advanced option to move to a (likely) worse scheduler. There is huge cognitive load in selecting an algorithm, and the onboarding user experience would be awful.
  • We should not force a transition to FSRS.

Automatic Optimization

Result: Obvious win for FSRS (better results for 91.9% of users).

✨ Caveat 1: We should aim for automatic optimization in future.
✨ Caveat 2: Delve into the comparison between FSRS-5-unoptimized and SM-2 to be sure we're not picking up pennies in front of a steamroller. The users in the 8.1% may have a scheduler with catastrophically worse probability of recall, and FSRS may be marginally better at prediction (unlikely, but let's use the data).

Treating FSRS without optimization as a separate scheduler (FSRS-5-unoptimized), the question is: "Do we make FSRS-5-unoptimized the default for new users?"

Table: % of collections in the benchmark where Algorithm A (row) estimates the probability of recall more accurately than Algorithm B (column). source

| A (row) vs B (column) | FSRS-5 | FSRS-5-unoptimized | SM-2 |
| --- | --- | --- | --- |
| FSRS-5 | - | 80.5% | 99.0% |
| FSRS-5-unoptimized | 19.5% | - | 91.9% |
| SM-2 | 1.0% | 8.1% | - |
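
To make the meaning of each cell concrete, here is a minimal sketch of how such a percentage could be computed. It assumes accuracy is judged by a per-collection error metric such as RMSE, where lower is better. The function name and the numbers are invented for illustration and are not benchmark data; how the benchmark treats ties is not stated here, so this sketch counts only strict wins.

```python
def superiority(rmse_a: list[float], rmse_b: list[float]) -> float:
    """Percentage of collections where algorithm A has a strictly lower error than B."""
    if len(rmse_a) != len(rmse_b) or not rmse_a:
        raise ValueError("need two equally long, non-empty lists")
    wins = sum(a < b for a, b in zip(rmse_a, rmse_b))
    return 100.0 * wins / len(rmse_a)


# Made-up error values for five hypothetical collections (not benchmark data).
rmse_fsrs_default = [0.05, 0.07, 0.04, 0.09, 0.06]
rmse_sm2 = [0.10, 0.06, 0.12, 0.15, 0.08]

print(superiority(rmse_fsrs_default, rmse_sm2))  # 80.0
```

Note that a cell only says how often A wins, not by how much, which is why Caveat 2 asks for a closer look at the size of the differences.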

Inverting the question: if we were on FSRS-5-unoptimized, would we move back to SM-2? Obviously not.

Misuse of 'Hard'

Blocker (IMO), but I feel this can quickly be resolved with a warning, and improved further with future UX/onboarding efforts.

✨ Caveat 3: A new user to Anki under FSRS needs to be warned if misuse of Hard is detected. I'd be happy with something as simple as repurposing the old 'too many decks can slow down your collection' warning on the main screen if hard misuse is detected/suspected. Note: only an example; I don't have skin in the game regarding implementation.

Reduction in Settings/User Control

✨ Caveat 4: (nice to have) - simplify FSRS settings. Most of the settings pane can be hidden behind either an 'advanced' option, or prompted when FSRS is enabled for the first time, then hidden.

Note: This affects power users; regular users are less likely to change settings in general.

Net positive: too many advanced settings are intimidating and increase cognitive load. FSRS provides one lever for the user to pull (desired retention). Education (visually, in the deck options) around why 100% retention isn't ideal could do with improvement, but it's 'good enough'.

The first question a large number of users have is "[I have an exam in X] what settings should I use?". SM-2's options offer numerous opportunities for a user to make mistakes (especially with how unintuitive spaced repetition is to a new user). Options in FSRS are a 'pit of success', and having 1 option to understand is easier than needing to understand intervals, graduation, the answer buttons, steps, lapses, etc.

If a power user wants control, they have the option to move back to a specialized algorithm.

@Danika-Dakika

@Expertium

Remember, at least 10% of Anki users misuse Hard, and this is based on nerds from r/Anki. It's likely worse outside of r/Anki, since r/Anki is a place specifically for Anki enthusiasts.

I've seen you say that before, but it seems to me it could just as easily be better outside of Anki-enthusiast communities. [That's not even considering the anti-authoritarian/"I just want to watch the world burn" bent that seems to be more common on Reddit than other parts of the internet. 😅 A certain number of your 18 misusers might have simply been lying. ]

Don't the main reasons for Hard-misuse spring from Anki-guru-fostered and Anki-enthusiast-propagated ideas about how you can "game" the scheduling algorithm? You know, all the same stuff that has caused folks to show up asking for help for the first time with SM-2 settings that are unhinged from reality?

I think average-Jane Anki user when faced with the 4 buttons, and no outside knowledge (for good or for ill) would be more likely to look at Again-Hard-Good-Easy and logically analyze them --

image

Hard and Easy are opposites and I (instantly, intuitively) understand what it means to say "it was hard" or "it was easy." Since they are balanced on either side of Good, that must be in between them, like saying "it was good enough." Again is outside of that series, but it can't mean "it was again," because that doesn't make any sense. So it must mean "I want to see it again." It looks like it shows me the card again in a few minutes so I can make sure I remember it? Okay, I'll use that when I get it wrong. [[end scene]]

I acknowledge that I have no more support for my position than you do for yours ... but that's pretty much my point. Unless you have a survey of a randomized sample of the "I've been using Anki for 5 years and I just found out about the manual/forum/subreddit/discord/YTers/etc today" contingent, the results will always be useful-but-pliable. No one should die on the hill of protecting that ephemeral 10% of users.


And: a strong +1 to @david-allison 's point about measuring how-much-worse it is for the 8.1% against how-much-better it is for the 91.9%. Data, data, data.

@L-M-Sherlock
Contributor

I wouldn't expect 2) to be solved any time soon, tbh.

I have a solution for that: freeze the stability during same-day reviews if the user wants. It could be implemented in FSRS-rs.

@david-allison
Contributor

david-allison commented Dec 8, 2024

And: a strong +1 to david-allison's point about measuring how-much-worse it is for the 8.1% against how-much-better it is for the 91.9%. Data, data, data.

Here's a breakdown of the scheduler comparison data, for someone to create a decent looking histogram [warning: heavy page]: https://gist.github.com/david-allison/a623d76654e216478d107655bbb5b2dd

See: #3616 (comment) for charts

@Expertium
Contributor

A new user to Anki under FSRS needs to be warned if misuse of Hard is detected.

Sadly, Jarrett and I couldn't think of a good way to detect it.

@Expertium
Contributor

I have a solution for that: freeze the stability during same-day reviews if the user wants. It could be implemented in FSRS-rs.

So the user would have to decide on their own? That's not a good idea. Most users won't be aware of problems with formulas.

@brishtibheja
Contributor

What Jarrett said in discord:

This issue is involved in three factors:

  1. the ratio: post-lapse stability / last stability
  2. the impact of same-day reviews: w[17] and w[18]
  3. the number of relearning steps

We can automatically detect this issue via taking above factors into account.

@Expertium
Contributor

That's a different issue. I wasn't talking about short-term S, I was talking about misusing Hard. We can't detect it automatically.

@brishtibheja
Contributor

Yeah, I was saying we might detect that one automatically and have FSRS automatically freeze S.

@Expertium
Contributor

@L-M-Sherlock what about my old idea here?
https://forums.ankiweb.net/t/how-to-prevent-users-from-misusing-hard-ideas-are-welcome/49092/61?u=expertium

Just do 3 checks:

  1. Check if w_15 (a parameter in the parameters field) is <0.01
  2. Check if RMSE is >5.0%
  3. Check if the number of reviews is >1000, just to avoid false positives

If all 3 are true, display the following pop-up message:

There is a possibility that you are using the Hard button incorrectly. When you press Hard, Anki assumes that you have successfully recalled the card.
Please keep in mind that Hard is not “fail”, it’s “pass”.

It’s not 100% reliable, but it should be reliable enough. And it’s as simple as it gets: three condition checks and a pop-up. Doesn’t get much simpler than this.

Even if it's not super reliable, it's better than nothing. And the cost of a false positive is small: just mildly annoying the user once.
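
The three checks above could be sketched roughly as follows. This is only an illustration, not code from Anki or FSRS: the function name, the constants, and the example parameter list are hypothetical, and it assumes w_15 is the element at index 15 of the parameter list and that RMSE is available as a percentage.

```python
from typing import Sequence

W15_INDEX = 15            # position of w_15 in the parameter list (assumed 0-based)
W15_THRESHOLD = 0.01
RMSE_THRESHOLD_PCT = 5.0
MIN_REVIEWS = 1000

HARD_MISUSE_MESSAGE = (
    "There is a possibility that you are using the Hard button incorrectly. "
    "When you press Hard, Anki assumes that you have successfully recalled the card. "
    "Please keep in mind that Hard is not \"fail\", it's \"pass\"."
)


def hard_misuse_suspected(
    parameters: Sequence[float], rmse_pct: float, review_count: int
) -> bool:
    """Return True only if all three heuristic checks pass."""
    return (
        parameters[W15_INDEX] < W15_THRESHOLD
        and rmse_pct > RMSE_THRESHOLD_PCT
        and review_count > MIN_REVIEWS
    )


# Hypothetical parameter list: 19 placeholder values with w_15 forced near zero.
example_parameters = [1.0] * 19
example_parameters[W15_INDEX] = 0.005

if hard_misuse_suspected(example_parameters, rmse_pct=6.7, review_count=2500):
    print(HARD_MISUSE_MESSAGE)
```

Requiring all three conditions at once keeps false positives down, at the cost of missing users who misuse Hard but have few reviews.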

@Expertium
Contributor

Expertium commented Dec 8, 2024

Alright, here are 2 charts
SM-2 vs FSRS-5 with default parameters:
Figure_1

SM-2 vs FSRS-5 with optimized parameters:
Figure_2

Obvious caveat: SM-2 wasn't designed to predict probabilities, and the only reason it does so in the benchmark is because Jarrett added extra formulas on top of it.

Actually, let's compare them under the most generous (for SM-2) assumptions possible.

  1. We hooked SM-2 up to the same optimizer that is used by FSRS-5
  2. We forgot to optimize FSRS-5

Figure_3

Even under these assumptions, FSRS-5 still outperforms SM-2 in 85.7% of cases.

@GithubAnon0000

As an FSRS convert and fanboy, hard and easy buttons should simply be removed from Anki.

As a consistent again, hard, good button user (with FSRS of course) I wholeheartedly disagree.

But it can be a good idea to make two buttons the default. Almost half of the people use the four buttons inconsistently (as the data @Expertium provided shows). This negatively impacts their study results and retention.
For those users, an easier fail/pass system might be a good thing and they might benefit from it.

@Gattocrucco

Anki novice here! Two anecdotes:

  • It took me about 1.5 years to use the buttons perfectly consistently all the time
  • When I'm mentally tired, I press "hard" much more due to being tired, which is not related to an intrinsic difficulty of the card

@brishtibheja
Contributor

When I'm mentally tired, I press "hard" much more due to being tired, which is not related to an intrinsic difficulty of the card

Yup, being consistent is hard. But we're probably not solving this issue anytime soon so a 2 button mode in the future should be the answer.

@Expertium
Contributor

@dae I'll quote David

We need an interaction reminder to optimize... an explanation of desired retention, and a tutorial on the buttons

I agree on all three.

@david-allison
Contributor

david-allison commented Dec 25, 2024

It's Christmas. Merry Christmas 🎄🎄🎄

I shouldn't have a /weighty/ opinion here unless I'm deep in each discussion and willing to see it through to conclusion. I'm also concerned that we're letting perfect get in the way of the good. If this turns into a 'we need to improve everything' then this discussion will spiral. Once we decide we're moving onto FSRS, we need a few targeted goals which we can get across the finish line, rather than another 200-post forum thread.

1) Interaction reminder

I would like to see something like this upstream. A common issue is that users don't know they need to optimize at all, and then the "once a month per preset" still causes confusion.

Example location/style (if a user hasn't optimized in a month).
I don't believe there's a reason for us not to optimize immediately if a user turns on FSRS, but I may be mistaken here. If not, this should also be a prompt.

Image

2) Explanation of desired retention

I've briefly toyed with using the graph for a visual representation of what desired retention means.
I feel we can do better here in the explanation. For now we can do this outside the app on the AnkiDroid site.

3) Tutorial on the buttons

Outside the app until we reach consensus. We can do this on the AnkiDroid website.

@pandadeepimpact

Another Anki novice chiming in.

Regarding optimization, I'm one of those users who never use Optimize at all. It's not that I'm unaware of the feature, but I got badly confused when I searched about it, especially with the preponderance of complaints about crazy long intervals after optimization. A reminder of when to optimize could help, but I think there is a need first to clarify why to optimize. What is its advantage/s over using the default values? Why is it recommended to be done as frequently as once per month? I find "Best match your performance" in the help section rather vague, and the rest of the description is too technical.

Moreover, the message that pops up after pressing Optimize contains a lot of jargon, like this:

"Log loss: 0.2002, RMSE(bins): 6.73%. Smaller numbers indicate a better fit to your review history."

What's log loss? What's RMSE(bins)? How small should the numbers be to be acceptable? For what purpose is this information presented?

Regarding the buttons, I'd agree with @dae to decrease the default to 2 buttons. In my experience, I almost never use the "Hard" and "Easy" buttons. If I have difficulty remembering a piece of information, I'd rather repeat it more times with "Again". If a card is too easy, I tend to simply suspend it, as I noticed that it is usually because I have enough repetitions of that information in my daily life that an external tool is not required anymore to remember it.

@Expertium
Contributor

Expertium commented Dec 26, 2024

@pandadeepimpact

but I think there is a need first to clarify why to optimize

I don't see how that could be done without a proper onboarding tutorial (which, IMO, would be the best course of action to fix a whole lot of problems related to people being confused).

What is its advantage/s over using the default values?

FSRS will be able to predict the probability of recall more accurately for you.

Why is it recommended to be done as frequent as once per month?

That rule is kinda crappy because it doesn't take into account how many reviews you did vs how many you already had. If you had 10 reviews and did 100, that's a huge increase. If you had 100,000 reviews and did 100, that's barely noticeable. So a better rule is to optimize every time the number of reviews doubles. So at 100, then at 200, then at 400, then at 800, etc. But "once per month" is simpler, hence why it's used.
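
A minimal sketch of the doubling rule, assuming the app stores the review count at the time of the last optimization. Both function names and the `first_threshold` default of 100 are illustrative, not existing Anki behavior.

```python
def doubling_thresholds(first_threshold: int = 100, count: int = 4) -> list[int]:
    """Review counts at which a reminder would fire: 100, 200, 400, 800, ..."""
    return [first_threshold * 2**i for i in range(count)]


def should_prompt(
    total_reviews: int,
    reviews_at_last_optimization: int,
    first_threshold: int = 100,
) -> bool:
    """Prompt once the review count has doubled since the last optimization."""
    if reviews_at_last_optimization < first_threshold:
        # Never optimized (or optimized very early): wait for the first threshold.
        return total_reviews >= first_threshold
    return total_reviews >= 2 * reviews_at_last_optimization


print(doubling_thresholds())           # [100, 200, 400, 800]
print(should_prompt(110, 10))          # 10 -> 110 reviews is a huge relative increase
print(should_prompt(100_100, 100_000)) # 100 extra reviews on top of 100,000 is not
```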

I find "Best match your performance" in the help section rather vague, and the rest of the description is too technical.

I'm not sure what a good middle ground would be. "Optimization makes parameters well-suited for your review history" or "Optimization fine-tunes parameters to make sure that FSRS is personalized for you" is about as good as it gets without getting too technical.

Moreover, the message that pops up after pressing Optimize contains a lot of jargon

The numbers appear only if you click "Evaluate", though. But I agree that this information is not necessary for the vast majority of users.
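
For the curious, here is a simplified sketch of what the two numbers measure. Both compare predicted recall probabilities with what actually happened (1 = recalled, 0 = forgotten), and smaller is better. The binning below groups reviews by predicted probability, which is an assumption made for brevity; the real optimizer groups reviews differently, so treat this as an illustration of the idea rather than the actual formula.

```python
import math
from collections import defaultdict


def log_loss(predicted: list[float], recalled: list[int]) -> float:
    """Average negative log-likelihood of the observed outcomes."""
    eps = 1e-15  # keep log() away from 0
    total = 0.0
    for p, y in zip(predicted, recalled):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(predicted)


def rmse_bins(predicted: list[float], recalled: list[int], n_bins: int = 10) -> float:
    """Compare average predicted vs. average actual recall within each bin."""
    bins: dict[int, list[tuple[float, int]]] = defaultdict(list)
    for p, y in zip(predicted, recalled):
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))
    weighted_sq_error = 0.0
    for members in bins.values():
        mean_p = sum(p for p, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        # Each bin is weighted by the number of reviews it contains.
        weighted_sq_error += len(members) * (mean_p - mean_y) ** 2
    return math.sqrt(weighted_sq_error / len(predicted))


# Four made-up reviews: 90% predicted recall, but only 3 of 4 were recalled.
predictions = [0.9, 0.9, 0.9, 0.9]
outcomes = [1, 1, 1, 0]
print(f"Log loss: {log_loss(predictions, outcomes):.4f}")
print(f"RMSE(bins): {rmse_bins(predictions, outcomes):.2%}")
```

In this toy example the algorithm predicted 90% recall while the actual rate was 75%, so RMSE(bins) comes out at 15%.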

@GithubAnon0000

If I had to explain to someone who is new to spaced repetition, the forgetting curve, etc. why optimization with FSRS is necessary, I'd probably say something along the lines of this:

Anki uses the FSRS algorithm. This algorithm decides when you should see a certain card again, so that you remember as much information as possible, while having more free time for other things.

While good, the default algorithm isn't well suited to your memory just yet. To make the algorithm more effective (which means it does a better job at predicting when you forget a card), it needs to learn how you remember things. For that, it uses your review data, once you press the optimize button.

If you press the optimize button, the parameters that FSRS uses to predict your forgetting curve get updated and thus become more accurate. If FSRS is more accurate, you'll be able to retain more information and forget less. That's why you should optimize your FSRS parameters regularly.

So basically: try to use easy words and describe it as simply as possible, while making sure the novice understands the general overview / picture.

@Expertium
Contributor

Expertium commented Dec 31, 2024

LMSherlock finished benchmarking the actual variant of SM-2 implemented in Anki (as opposed to the original SM-2), here are new graphs (the graphs here are outdated now since we have a better comparison)

I'll be calling Anki's algorithm "Anki-SM-2", which is more concise than "Anki's variant of SM-2".

Anki-SM-2 with default parameters vs FSRS-5 with default parameters:
Image
FSRS is better for 89.9% of users.

Anki-SM-2 with default parameters vs FSRS-5 with optimized parameters:
Image
FSRS is better for 98.1% of users.

Anki-SM-2 with optimized parameters vs FSRS-5 with default parameters:
Image
FSRS is better for 77.0% of users.

Again, same considerations apply:

  1. Anki's algorithm wasn't designed to predict probabilities, so an extra formula must be added.
  2. You can't optimize the parameters of Anki's algorithm in Anki itself. Theoretically, the optimizer that is used by FSRS works with Anki's variant of SM-2 just fine, but it's not implemented in practice.

A small version of the table that shows how often one algorithm outperforms another:
Image

@Danika-Dakika

@Expertium You're using "Anki" there to mean ... "Anki using the SM-2 algorithm"? It's a bit confusing, because it's all Anki, right? I worry it won't travel well.

@Expertium
Contributor

Expertium commented Jan 1, 2025

I mean "Anki's variant of SM-2"
EDIT: Anki's algorithm isn't exactly the same as SM-2 described by Woz here. But Anki doesn't have a designated name for it. This makes it rather awkward, as I have to either use "Anki", which is short but not clear, or "Anki's variant of SM-2", which is clear but long.

@Danika-Dakika

Got it! "Anki-SM-2" might be a good fit for that, but I trust your judgement. Thanks!

@Expertium
Contributor

Expertium commented Jan 6, 2025

Dae, it might be a bit too early to discuss this, but I hope that once two-button mode becomes the default and Hard+Easy have to be enabled separately, each time they are enabled (or at least once), a pop-up will appear that explains the recommended button usage.

We need a solution that actually teaches users what the intended button usage is, which is why I still think that changing the distance between the buttons and the color of button borders is the best solution. Or an interactive tutorial. Without either of those two, at the very least we will need a pop-up with a brief explanation that Hard is not "fail". Without even a pop-up, two-button mode won't solve anything. Users will just think "Oh god, this app requires me to go to settings just to unlock answer buttons?!" and then new users will make the same mistakes as previous users who misused Hard, just with one extra minor inconvenience.

@tomkahn

tomkahn commented Jan 7, 2025

I see that the ratio of people who are in favor of making FSRS the new default algorithm is about 10:1. I added my vote to the minority camp. Here's why I would wait longer before making the switch:

  1. While 10:1 sounds clear enough, keep in mind that this is the vote among a very small subset of power users. Most Anki users don't have a GitHub account and they also don't use Reddit. They probably don't even post anything but are simply confused by major changes and ever more complexity. They care less about the last bit of efficiency and more about being able to actually handle this magical studying app that everyone is raving about.
  2. If Anki switches to a two-button layout (which seems to be necessary to prevent misuse of the Hard button), this will mean that most videos about Anki will have to be rerecorded. Many people will probably also be wondering how an algorithm can be more efficient if it works with less information. (So far, more buttons were usually a sign of an advanced algorithm, whereas 2 buttons stood for the simple Leitner algorithm.) Of course this alone doesn't mean that the change is bad or shouldn't be done, it just means that there is a non-negligible cost associated with it. This, in combination with other things (like the need to optimize regularly and the completely arcane terms and parameters in the deck options), will confuse many everyday users.
  3. So those are the costs which of course might be justified by even greater benefits that a change in the algorithm brings. In this area I have to say that I am not convinced yet that FSRS really is more efficient than the tried and true SM-2. I am the first to admit that it very well could be, but I certainly don't know and I would argue that very few people are in the position to have a justified belief that this new algorithm is actually better. Hear me out:
    1. Yes I saw the diagrams above but I also know that most published research findings are false, so I have very low priors for statistics published on GitHub by the same (good) people who invested a lot of work into this algorithm. I can't simply check if this is correct as the math is above my paygrade. I also have no idea if, for example, a "positive difference in RMSE" actually maps to "being better" in the real world. It certainly may, but I couldn't tell you.
    2. I really wonder who here fits the following criteria (I certainly don't):
      1. Has a background in statistics and is able to tell if the math used to prove that FSRS is superior (and the suppositions behind it) actually checks out.
      2. Did actually check if the math is correct or not.
      3. Doesn't have a bias in favor of FSRS.
      4. Of course it would be even better if this person also reviewed the algorithm itself.
  4. I also stumbled upon a few people who reported negative experiences with FSRS: 1, 2, 3, 4 Of course they might have been doing something wrong (misusing hard for example) but it seems to me like this could mean trouble if many more users are affected.
  5. So all in all: While the costs are clear and not small, the benefits at least to me are much less certain. I would wait for more real world data to come in to see how many people have issues with FSRS and how real its benefits actually are before making this the default engine of Anki.
  6. At the end of the day, this is of course completely up to you, Damien, and you have every right to disregard the points I made above. It was just important to me to raise them.

Lastly I hope I didn't offend anyone here, I think it's great that @L-M-Sherlock and @Expertium invested so much work in a new and possibly better algorithm!

@Expertium
Contributor

Expertium commented Jan 7, 2025

Regarding 2: yes, old posts/blogs/videos/etc. will become confusing once Hard and Easy are hidden. This is another reason why I keep pushing for this solution instead (distance + color):
Image

I also have no idea if, for example, a "positive difference in RMSE" actually maps to "being better" in the real world. It certainly may, but I couldn't tell you.

Sadly, this is a problem. We have metrics (logloss, RMSE, and others) for determining how accurately an algorithm can predict the probability of recall, but we can't translate that into "reduction of minutes of studying", which is what users actually care about.
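
For anyone unfamiliar with these metrics, here is a minimal Python sketch of how log loss and RMSE can be computed from predicted recall probabilities and actual review outcomes. The function names and the sample data are made up for illustration, and this is the plain per-review RMSE; as far as I know the benchmark reports a binned variant, which this sketch does not reproduce.

```python
import math


def log_loss(predicted, recalled):
    """Mean negative log-likelihood of the observed outcomes (1 = recalled, 0 = forgotten)."""
    eps = 1e-15  # clamp probabilities so log(0) never occurs
    total = 0.0
    for p, y in zip(predicted, recalled):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predicted)


def rmse(predicted, recalled):
    """Root mean squared error between predicted probability and outcome."""
    total = sum((p - y) ** 2 for p, y in zip(predicted, recalled))
    return math.sqrt(total / len(predicted))


# Hypothetical reviews: predicted probability of recall and what actually happened.
predicted = [0.90, 0.80, 0.95, 0.60]
recalled = [1, 1, 1, 0]

print("log loss:", log_loss(predicted, recalled))
print("RMSE:", rmse(predicted, recalled))
```

Lower is better for both. Neither number says anything about minutes of studying saved, which is exactly the gap described above.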

@L-M-Sherlock
Contributor

Regarding 3: It's very hard to find someone who fits all those criteria. Here's why:

  1. They'd need to know about FSRS, which usually means being an Anki user.
  2. They'd have to check FSRS just because they're curious, with no other reward.
  3. Not many people have a strong background in statistics.
  4. Not many people are good at coding.

I think more people like this might show up if FSRS becomes the default algorithm in Anki.

Also, what gets used gets maintained and improved. I've made many FSRS libraries, but I only get bug reports from Anki users. This helps me fix problems. I can't be sure the other libraries work perfectly. They simply lack testing. The only truly end-to-end test is the end-user. I learned this idea from reading https://gwern.net/holy-war.

@ratman-codes

ratman-codes commented Jan 8, 2025

At a minimum, as it stands today I should be able to disable hard and easy buttons. They are a distraction. They damage my algorithm when I accidentally click them or hit their key shortcut. They clutter the UI. For me, Easy and Hard are actively bad things.

@GithubAnon0000

I should be able to disable hard and easy buttons

You already can, e.g. with this addon: https://ankiweb.net/shared/info/876946123

@ratman-codes

It shouldn't require an addon though. Addons like that break other addons such as green/red color addons, etc.

@brishtibheja
Contributor

I think dae has said (at least thrice now) that we'll get a 2-button mode in the future.

@shigeyukey

If only 2 buttons are used with the SM-2 algorithm, Ease may continue to decrease and Ease Hell may occur in the long run (FSRS does not have this problem), so I think that if a user is using SM-2, it may be necessary to disable 2-button mode or display a warning.
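
To illustrate the mechanism, here is a rough Python sketch of how ease moves when only Again and Good are used on review cards. It assumes Anki's default values as I understand them (starting ease 250%, Again lowers ease by 20 percentage points, Good leaves it unchanged, floor of 130%). The function name is hypothetical and this is not Anki's actual code.

```python
START_EASE = 2.5     # assumed default starting ease (250%)
MIN_EASE = 1.3       # assumed floor (130%)
AGAIN_PENALTY = 0.2  # assumed drop per lapse (20 percentage points)


def ease_after(answers, ease=START_EASE):
    """Return the ease factor after a sequence of 'again' / 'good' review answers."""
    for answer in answers:
        if answer == "again":
            # A lapse lowers ease, but never below the floor.
            ease = max(MIN_EASE, ease - AGAIN_PENALTY)
        elif answer == "good":
            # Good leaves ease unchanged, so nothing ever raises it again.
            pass
        else:
            raise ValueError(f"unsupported answer: {answer!r}")
    return round(ease, 2)


# With only two buttons, ease can only go down or stay the same.
history = ["good", "again", "good", "good", "again", "good"]
print(ease_after(history))
```

Since Easy is the only answer that raises ease, a two-button user has no way to recover from lapses under these assumptions; each one permanently shortens that card's future intervals until the floor is reached.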

@inexplicabro

inexplicabro commented Jan 9, 2025

If only 2 buttons are used with the SM-2 algorithm, Ease may continue to decrease and Ease Hell may occur in the long run (FSRS does not have this problem), so I think that if a user is using SM-2, it may be necessary to disable 2-button mode or display a warning.

From my understanding, in SM2's two button mode (Again and Good) neither button affects the Ease so it should not cause Ease Hell.

Edit: Apologies, I in fact did not understand.

@brishtibheja
Contributor

"Again" affects the ease.

@JSchoreels

JSchoreels commented Jan 12, 2025

Something I have been thinking about for the past few weeks is how fundamentally different the Anki experience can be with SM-2 versus FSRS, in terms of whether retention builds up over time or instead stagnates.

With SM-2, one thing you can read everywhere is that Mature cards should have higher retention than Young ones. Another convention is that Leeches should be suspended. This works because in SM-2, the goal is NOT to predict when your Predicted Retention goes below your Desired Retention. In SM-2, a subjective evaluation of the outcome of a card determines how the ease factor is modified. If a user only uses Again/Good, each time the user fails and a new lapse starts, the next succession of intervals will be tighter, so there is a better chance they won't fail again. The Desired Retention here is just a constant factor applied to the whole deck.

In summary, SM-2 in Anki leads to a higher workload, but retention that increases over time on a per-card basis.

By contrast, FSRS is a prediction model. You specify your desired retention, and it adapts based on your results. If you perform above your Desired Retention, the parameters may be optimized to reflect that and your intervals will grow longer, giving you a lighter workload while maintaining the Desired Retention. With FSRS, in theory, Mature cards should not have higher actual retention than Young ones. Also, with FSRS, leeches are pretty normal, since FSRS will keep growing intervals so that your actual retention matches the Desired Retention as closely as possible.

In summary, FSRS in Anki leads to a lower workload, but also a poorer learning/studying/reviewing experience.

Notice that I specified "in Anki". I should have said: in Anki right now. If Anki were adapted to target a higher desired retention on the same cards based on different factors (number of reviews, ...), then using FSRS could also lead to better and better performance over time.

Something I really want to insist on is that, while it's certainly tempting to be blinded by stats and to conclude that "better prediction = better tool", it would be worthwhile to stay focused on why people use Anki in the first place. To build a model that will predict their 80% probability of remembering something, and decreasing their workload while never going above that? Or instead, to gradually increase that retention to mastery level? In my opinion, the latter.

In any case, if you think "Just increase retention": increasing retention at the deck level would lead to an enormous workload, more than even SM-2 would produce.
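
Here is a rough Python sketch of why raising desired retention is expensive. It assumes the FSRS-4.5/FSRS-5 power forgetting curve R(t, S) = (1 + FACTOR * t / S) ^ DECAY with DECAY = -0.5 and FACTOR = 19/81. The function names are hypothetical and this is not the scheduler's actual code; it ignores fuzz, the maximum interval, and how stability itself changes after each review.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that retrievability is 0.9 when t equals stability


def retrievability(t_days, stability):
    """Predicted probability of recall t_days after the last review."""
    return (1 + FACTOR * t_days / stability) ** DECAY


def interval_for(desired_retention, stability):
    """Days until predicted recall drops to desired_retention (forgetting curve inverted)."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


# Same card (stability of 100 days, a made-up value), different desired retention.
for dr in (0.80, 0.90, 0.95, 0.99):
    print(f"desired retention {dr:.2f} -> interval {interval_for(dr, 100):.1f} days")
```

With these constants, a desired retention of 0.9 gives an interval equal to the stability, and anything higher gives a shorter interval, which means more reviews for the same card.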

This issue discusses it well: open-spaced-repetition/fsrs4anki#694

Some people facing that realization after changing to FSRS :

@Expertium
Contributor

As I said in the issue you linked:

There are two issues with your idea.

First, a greater cognitive burden for the user, who will have to configure two different values of desired retention instead of one, and people are already struggling with realizing that desired retention affects interval lengths. I'm not sure how many users know it, my pessimistic estimate would be 50%. In other words, I'd say about 50% of users have no idea that desired retention affects interval lengths. I'm saying this because I've been doing the Anki equivalent of tech support for about a year. Maybe 75% know it, if I'm being optimistic. Even fewer have ever touched "Compute minimum recommended retention" or used different values of desired retention for different presets. I think FSRS should remove options and settings rather than adding them, if we ever want FSRS to be used by anyone who isn't a complete nerd.

Second issue - defining what counts as "mature". It's arbitrary. In Anki a card is considered "mature" if its interval is >=21 days, but why not 20 or 22?

@JSchoreels

JSchoreels commented Jan 12, 2025

You keep saying you want to spare the user the burden, but you still deflect from the fact that people expect cards to "stick better" with time.

Such feedback is not rare.

It's a clear sign that people expect their retention to slowly increase over time, whatever the mature threshold is.

But with all due respect, it seems you're more interested in the statistical prowess of predicting review outcomes than user experience.

Most of the time, answers in those topics are "Optimize your Parameters", "There is nothing we can do about it". It almost feels like the algorithm becomes the focus above the very human experience of mastering something.

Note: I still think FSRS has its uses. FSRS is a nice way to have the minimum workload possible while keeping average knowledge. It's a nice advantage, which comes at the cost of seeing more mature cards dropping in terms of retention. But this has to be highlighted for the sake of the very user experience you are preaching.

@Expertium
Contributor

Expertium commented Jan 12, 2025

I'm simply saying that if users struggle to understand what one desired retention does, having two desired retentions will blow users' minds, but not in a good way.

In a world where >95% of users understand how desired retention affects their workload, I would have no objections to adding a second desired retention for mature (however that is defined) cards. But we don't live in that world.
Or in a world where Anki has two UI layouts: Beginner and Advanced, but we don't live in that world either.

@GithubAnon0000

To build a model that will predict their 80% probability of remembering something, and decreasing their workload while never going above that? Or instead, to gradually increase that retention to mastery level? In my opinion, the latter.

It's impossible to remember everything with a 100% success rate. So it must be lower than that.

Besides, you will eventually reach mastery level. Mastery level doesn't mean you know 100% (which is impossible). You could still have desired retention set to e.g. 95%, and depending on how you grade yourself, the interval will increase. This essentially means:

  1. You might forget card A in 3 days, so fsrs shows it to you just before you forget it in 3 days.
  2. Card B on the other hand will be forgotten in over a decade.

(with 95% probability of recalling the card when A or B is scheduled)
If you know something well enough that you wouldn't forget it even after a very long time (even assuming no other forms of repetition, like using the knowledge in your job), arguably you've reached mastery.

Building on top of this: I don't see why leeches would increase at all:

Also, with FSRS, leeches are pretty normal, since FSRS will keep growing intervals so that your actual retention matches the Desired Retention as closely as possible.

The interval doesn't grow arbitrarily. It is based on your memory curve and how well you remembered that specific card.
