Make FSRS the default? #3616
Comments
Yes. Let's not make FSRS the default before automatic optimization. Realistically, how many users do you expect to click "Optimize" at least once in their lifetime? I'd say 50% at best, likely less. And how many users will click "Optimize" multiple times? 10%? 5%? Right now it's mostly power users and tech-savvy people who are using FSRS, so they know that optimization should be done regularly. An average user who is using Anki with out-of-the-box settings won't realize that optimization has to be done at all. |
My point: FSRS with default parameters is better than SM-2 in 91.9% of cases. Source: https://github.com/open-spaced-repetition/srs-benchmark/blob/main/plots/Superiority-9999.png CONFLICT OF INTEREST: I'm the main developer of FSRS, please disregard my opinion. |
Well, no one here is saying that automatic optimization won't be implemented. But, until we can develop AO, I think that it's reasonable to provide users with something that is better than what they currently have, even if it is not the best. Also, let's stop discussing AO now, because arguments from both sides have been made and it's dae who has to make the decision now. If you (or someone else) have any other objections, please feel free to discuss. |
@Expertium you have contributed a large amount of time and effort both into suggesting improvements to FSRS, and advising users on its correct usage, including very comprehensive posts like https://old.reddit.com/r/Anki/comments/1h2otym/anki_2411_one_of_the_biggest_updates_ever/. They have been noticed, and I really appreciate all the work you have put in. That said, this is the second time you've attempted to delay an improvement until it's perfect. I don't think that's the best approach - I think it's better that we get these improvements into the hands of the bulk of users, and address any issues in the future. |
Alright. Maybe make the mythical optimization reminder (I have never seen it myself) more frequent? I've heard that it only shows up for people who have used "Optimize all presets", though. That would have to change. |
I agree that FSRS should be set to default, a milestone change. |
@dae how about a really radical solution - Optimize [all presets] right next to Sync? That way it's impossible to miss. |
I'm neutral on this change, but I wanted to ask what "by default" means. For new users/profiles only? Or would it take effect for everyone as soon as we download the new version (whatever number it ends up being) unless I intentionally for myself change it back to SM2 + custom scheduling? (I'm also wondering about Ankimobile, since that one often updates in the background without me noticing. If it's not just for new users/profiles and would go out to everyone, would I get an alert that the scheduler has changed?) |
No forced transition. If there is a previous installation of Anki on your device, the settings of that installation will be kept. If no previous installations are found aka this is the first time you are installing Anki, FSRS will be enabled by default. The option to enable SM-2 will be there for backward compatibility reasons, it's not like SM-2 will be completely deleted. That's how I imagine it, and I'm betting 100 bucks that's how it will be. |
I would like to bring @dae's attention to these issues I feel need to get fixed.
|
I wouldn't expect 2) to be solved any time soon, tbh. |
Wait, I completely forgot about the Hard problem. Dae, here are some options:
EDIT: Dae, please tell me that this issue is "I am considering making FSRS the default, but willing to postpone it if there are serious roadblocks" and NOT "I have already made up my mind, but feel free to shout into the void to get a false sense of involvement". Because right now I'm getting the vibes of the latter. |
Automatic optimization (without an option to disable it and manually tweak the weights) is a very bad idea for many reasons. The biggest problem with FSRS is overfitting. To summarize what I have written above: if there is a repeating good pattern of answers, then FSRS will optimize around it, using fewer steps (this is the main principle of any optimization model). But it is very likely that this good pattern existed precisely because of the extra steps. I am in exactly this situation, and because of it I have to manually tweak the weights (mainly w_0) to get more realistic intervals. With AO, such manipulations will not be possible. Manual tweaking, and the ability to leave the weights in one state until the user wants to optimize them, should stay in Anki (as well as manual "learning/relearning steps"). Regarding "Make FSRS default", I fully agree. |
That's not how FSRS works. @L-M-Sherlock feel free to clear the misunderstanding. |
If you are so sure, please explain why my w_0 always increases after re-optimizing the weights? |
Not a contributor, but I'd like to propose to the team an option to choose one of two models on first launch, with a simple description of each and a note that the model can be changed in settings later, instead of just changing a years-long default that has proven to attract a lot of users. That would also make users more aware of the different models and of the importance of some settings. I guess far more users would try both models and different settings after the initial launch if they saw that banner. "Before starting we'd like to ask. How do you prefer to learn.
|
Unfortunately, the Anki team chose the path of "Let's simplify everything by taking away the user's right to customize the app/their workflow and remove 'unnecessary' options" some time ago (if you want, you can read the PR for Load Balancer - #3230). So from now on there will never be a banner that lets the user decide something for himself, because it is "too difficult for the average user". |
Set FSRS as the default, with 4 caveats (✨) Caveats (TL;DR)
Automatic Optimization
Result: Obvious win for FSRS (better results for 91.9% of users).
✨ Caveat 1: We should aim for automatic optimization in future. Treating FSRS without optimization as a separate scheduler (
Table: % of collections in the benchmark where Algorithm A (row) estimates the probability of recall more accurately than Algorithm B (column). source
Inverting the question: if we were on
Misuse of 'Hard'
Blocker (IMO), but I feel this can quickly be resolved with a warning, and improved further with future UX/onboarding efforts.
✨ Caveat 3: A new user to Anki under FSRS needs to be warned if misuse of Hard is detected. I'd be happy with something as simple as repurposing the old 'too many decks can slow down your collection' warning on the main screen if hard misuse is detected/suspected. Note: only an example; I don't have skin in the game regarding implementation.
Reduction in Settings/User Control
✨ Caveat 4: (nice to have) - simplify FSRS settings. Most of the settings pane can be hidden behind either an 'advanced' option, or prompted when FSRS is enabled for the first time, then hidden. Note: This affects power users, regular users are less likely to change settings in general. Net positive: too many advanced settings is intimidating and increases cognitive load. FSRS provides one lever for the user to pull (desired retention). Education (visually, in the deck options) around why 100% retention isn't ideal could do with improvement, but it's 'good enough'. The first question a large number of users have is "[I have an exam in X] what settings should I use?". SM-2's options offer numerous opportunities for a user to make mistakes (especially with how unintuitive spaced repetition is to a new user). Options in FSRS are a 'pit of success', and having 1 option to understand is easier than needing to understand intervals, graduation, the answer buttons, steps, lapses etc... If a power user wants control, they have the option to move back to a specialized algorithm. |
I've seen you say that before, but it seems to me it could just as easily be better outside of Anki-enthusiast communities. [That's not even considering the anti-authoritarian/"I just want to watch the world burn" bent that seems to be more common on Reddit than other parts of the internet. 😅 A certain number of your 18 misusers might have simply been lying. ] Don't the main reasons for Hard-misuse spring from Anki-guru-fostered and Anki-enthusiast-propagated ideas about how you can "game" the scheduling algorithm? You know, all the same stuff that has caused folks to show up asking for help for the first time with SM-2 settings that are unhinged from reality? I think average-Jane Anki user when faced with the 4 buttons, and no outside knowledge (for good or for ill) would be more likely to look at Again-Hard-Good-Easy and logically analyze them --
I acknowledge that I have no more support for my position than you do for yours ... but that's pretty much my point. Unless you have a survey of a randomized sample of the "I've been using Anki for 5 years and I just found out about the manual/forum/subreddit/discord/YTers/etc today" contingent, the results will always be useful-but-pliable. No one should die on the hill of protecting that ephemeral 10% of users. And: a strong +1 to @david-allison 's point about measuring how-much-worse it is for the 8.1% against how-much-better it is for the 91.9%. Data, data, data. |
I have a solution for that: freeze the stability during same-day reviews if the user wants. It could be implemented in FSRS-rs. |
Here's a breakdown of the scheduler comparison data, for someone to create a decent looking histogram [warning: heavy page]: https://gist.github.com/david-allison/a623d76654e216478d107655bbb5b2dd See: #3616 (comment) for charts |
Sadly, me and Jarrett couldn't think of a good way to detect it. |
So the user would have to decide on their own? That's not a good idea. Most users won't be aware of problems with formulas. |
What Jarrett said in discord:
|
That's a different issue. I wasn't talking about short-term S, I was talking about misusing Hard. We can't detect it automatically. |
Yeah, I was saying we might detect that one automatically and have FSRS automatically freeze S. |
@L-M-Sherlock what about my old idea here?
Even if it's not super reliable, it's better than nothing. And the cost of a false positive is small: just mildly annoying the user once. |
Alright, here are 2 charts SM-2 vs FSRS-5 with optimized parameters: Obvious caveat: SM-2 wasn't designed to predict probabilities, and the only reason it does so in the benchmark is because Jarrett added extra formulas on top of it. Actually, let's compare them under the most generous (for SM-2) assumptions possible.
Even under these assumptions, FSRS-5 still outperforms SM-2 in 85.7% of cases. |
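For readers wondering what a figure like "outperforms in 85.7% of cases" means mechanically, here is a minimal sketch. It assumes one error metric per collection per algorithm (lower is better) and counts strict wins; the function name `superiority`, the tie handling, and the toy numbers are illustrative assumptions, not the benchmark's actual code or data.

```python
def superiority(metric_a, metric_b):
    """Percentage of collections where algorithm A has a strictly lower
    (i.e. better) error metric than algorithm B.

    Both lists must be aligned: index i refers to the same collection.
    """
    if len(metric_a) != len(metric_b) or not metric_a:
        raise ValueError("need two non-empty lists of equal length")
    wins = sum(a < b for a, b in zip(metric_a, metric_b))
    return 100 * wins / len(metric_a)


# Toy, made-up per-collection errors (NOT benchmark data).
fsrs_errors = [0.30, 0.40, 0.50, 0.20]
sm2_errors = [0.35, 0.38, 0.60, 0.25]
print(superiority(fsrs_errors, sm2_errors))
```

Note that a win rate like this says nothing about how large each win or loss is, which is why measuring how much worse it is for the minority also matters.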
As a consistent Again, Hard, Good button user (with FSRS of course) I wholeheartedly disagree. But it can be a good idea to make two buttons the default. Almost half of the people use the four buttons inconsistently (as the data @Expertium provided shows). This negatively impacts their study results and retention. |
Anki novice here! Two anecdotes:
|
Yup, being consistent is hard. But we're probably not solving this issue anytime soon so a 2 button mode in the future should be the answer. |
@dae I'll quote David
I agree on all three. |
Another Anki novice chiming in. Regarding optimization, I'm one of those users who never use Optimize at all. It's not that I'm unaware of the feature, but I got badly confused when I searched for information about it, especially given the preponderance of complaints about crazy long intervals after optimization. A reminder of when to optimize could help, but I think there is a need first to clarify why to optimize. What are its advantages over using the default values? Why is it recommended to be done as frequently as once per month? I find "Best match your performance" in the help section rather vague, and the rest of the description is too technical. Moreover, the message that pops up after pressing Optimize contains a lot of jargon, like this: "Log loss: 0.2002, RMSE(bins): 6.73%. Smaller numbers indicate a better fit to your review history." What's log loss? What's RMSE(bins)? How small should the numbers be to be acceptable? For what purpose is this information presented? Regarding the buttons, I'd agree with @dae to decrease the default to 2 buttons. In my experience, I almost never use the "Hard" and "Easy" buttons. If I have difficulty remembering a piece of information, I'd rather repeat it more times with "Again". If a card is too easy, I tend to simply suspend it, as I noticed that it is usually because I get enough repetitions of that information in my daily life that an external tool is not required anymore to remember it. |
I don't see how that could be done without a proper onboarding tutorial (which, IMO, would be the best course of action to fix a whole lot of problems related to people being confused).
FSRS will be able to predict the probability of recall more accurately for you.
That rule is kinda crappy because it doesn't take into account how many reviews you did vs how many you already had. If you had 10 reviews and did 100, that's a huge increase. If you had 100,000 reviews and did 100, that's barely noticeable. So a better rule is to optimize every time the number of reviews doubles. So at 100, then at 200, then at 400, then at 800, etc. But "once per month" is simpler, hence why it's used.
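As a rough illustration of the doubling rule described above (not something Anki implements as far as this thread says), a check could look like the sketch below. The function name, the parameter names, and the first threshold of 100 reviews are assumptions taken from the "100, 200, 400, 800" example.

```python
def should_optimize(reviews_now: int,
                    reviews_at_last_optimization: int,
                    first_threshold: int = 100) -> bool:
    """Return True when the review count has doubled since the last optimization.

    If no optimization has happened yet (count of 0), wait until the
    hypothetical first threshold is reached.
    """
    if reviews_at_last_optimization == 0:
        return reviews_now >= first_threshold
    return reviews_now >= 2 * reviews_at_last_optimization


# 10 reviews before, 100 more since then: a huge relative increase.
print(should_optimize(110, 10))
# 100,000 reviews before, 100 more since then: barely noticeable.
print(should_optimize(100_100, 100_000))
```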
I'm not sure what a good middle ground would be. "Optimization makes parameters well-suited for your review history" or "Optimization fine-tunes parameters to make sure that FSRS is personalized for you" is about as good as it gets without getting too technical.
The numbers appear only if you click "Evaluate", though. But I agree that this information is not necessary for the vast majority of users. |
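Since the question "What's log loss? What's RMSE(bins)?" came up, here is a simplified sketch of both ideas. Log loss is the standard binary cross-entropy between predicted recall probabilities and actual outcomes. The binned RMSE below groups reviews only by predicted probability, which is an assumption made for brevity; the binning FSRS actually uses may differ, so treat this as an illustration of the concept rather than a reimplementation.

```python
import math
from collections import defaultdict


def log_loss(predictions, outcomes):
    """Average binary cross-entropy. predictions are probabilities of recall,
    outcomes are 1 (recalled) or 0 (forgotten). Lower is better."""
    eps = 1e-15  # avoid log(0)
    total = 0.0
    for p, y in zip(predictions, outcomes):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predictions)


def rmse_bins(predictions, outcomes, n_bins=20):
    """Simplified calibration error: group reviews into bins by predicted
    probability, then compare the average prediction with the actual recall
    rate in each bin, weighting bins by their size. Lower is better."""
    bins = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))
    weighted_squares = 0.0
    for items in bins.values():
        mean_p = sum(p for p, _ in items) / len(items)
        mean_y = sum(y for _, y in items) / len(items)
        weighted_squares += len(items) * (mean_p - mean_y) ** 2
    return math.sqrt(weighted_squares / len(predictions))


# Predicting 90% for ten reviews of which nine succeed is well calibrated;
# predicting 90% when only five succeed is overconfident.
print(rmse_bins([0.9] * 10, [1] * 9 + [0]))
print(rmse_bins([0.9] * 10, [1] * 5 + [0] * 5))
```

In plain words: both numbers measure how far the predicted chance of recall is from what actually happened, which is why smaller is better.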
If I had to explain to someone who is new to spaced repetition, the forgetting curve, etc. why optimization with FSRS is necessary, I'd probably say something along the lines of this:
So basically: try to use easy words and describe it as simply as possible, while making sure the novice understands the general picture. |
LMSherlock finished benchmarking the actual variant of SM-2 implemented in Anki (as opposed to the original SM-2), here are new graphs (the graphs here are outdated now since we have a better comparison) I'll be calling Anki's algorithm "Anki-SM-2", which is more concise than "Anki's variant of SM-2". Anki-SM-2 with default parameters vs FSRS-5 with default parameters:
|
@Expertium You're using "Anki" there to mean ... "Anki using the SM-2 algorithm"? It's a bit confusing, because it's all Anki, right? I worry it won't travel well. |
I mean "Anki's variant of SM-2" |
Got it! "Anki-SM-2" might be a good fit for that, but I trust your judgement. Thanks! |
Dae, it might be a bit too early to discuss this, but I hope that once the two-button mode becomes the default and Hard+Easy have to be enabled separately, a pop-up explaining the recommended button usage will appear each time they are enabled (or at least once). We need a solution that actually teaches users what the intended button usage is, which is why I still think that changing the distance between the buttons and the color of button borders is the best solution. Or an interactive tutorial. Without either of those two, at the very least we will need a pop-up with a brief explanation that Hard is not "fail". Without even a pop-up, two-button mode won't solve anything. Users will just think "Oh god, this app requires me to go to settings just to unlock answer buttons?!" and then new users will make the same mistakes as previous users who misused Hard, just with one extra minor inconvenience. |
I see that the ratio of people who are in favor of making FSRS the new default algorithm is about 10:1. I added my vote to the minority camp. Here's why I would wait longer before making the switch:
Lastly I hope I didn't offend anyone here, I think it's great that @L-M-Sherlock and @Expertium invested so much work in a new and possibly better algorithm! |
Regarding 3: It's very hard to find someone who fits all those criteria. Here's why:
I think more people like this might show up if FSRS becomes the default algorithm in Anki. Also, what gets used gets maintained and improved. I've made many FSRS libraries, but I only get bug reports from Anki users. This helps me fix problems. I can't be sure the other libraries work perfectly; they simply lack testing. The only truly end-to-end test is the end user. I learned this idea from reading https://gwern.net/holy-war. |
At a minimum, as it stands today I should be able to disable hard and easy buttons. They are a distraction. They damage my algorithm when I accidentally click them or hit their key shortcut. They clutter the UI. For me, Easy and Hard are actively bad things. |
You already can, e.g. with this addon: https://ankiweb.net/shared/info/876946123 |
It shouldn't require an addon though. Addons like that break other addons such as green/red color addons, etc. |
I think dae has said it (at least thrice now) that we'll get a 2 button mode in the future. |
If only 2 buttons are used with the SM-2 algorithm, Ease may keep decreasing and Ease Hell may occur in the long run (FSRS does not have this problem), so I think that if a user is using SM-2 it may be necessary to disable the 2-button mode or display a warning. |
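The concern above can be sketched with a tiny simulation. It assumes Anki's commonly documented SM-2 defaults: a starting ease of 250%, a 20 percentage point penalty when a review card lapses with Again, no change on Good, and a floor of 130%. The constant and function names are illustrative and not taken from Anki's source. Ease is stored in permille to avoid floating-point drift.

```python
STARTING_EASE = 2500   # 250%, stored in permille (assumed default)
AGAIN_PENALTY = 200    # minus 20 percentage points per lapse (assumed default)
MINIMUM_EASE = 1300    # 130% floor (assumed default)


def ease_after(answers, ease=STARTING_EASE):
    """Apply a sequence of 'again'/'good' answers on a review card and
    return the resulting ease in permille."""
    for answer in answers:
        if answer == "again":
            ease = max(MINIMUM_EASE, ease - AGAIN_PENALTY)
        elif answer == "good":
            pass  # Good leaves the ease unchanged, so it never recovers
        else:
            raise ValueError(f"two-button mode only has again/good, got {answer!r}")
    return ease


# Six lapses, each followed by a successful review, are enough to hit the floor.
print(ease_after(["again", "good"] * 6))
```

With only Again and Good available, nothing ever raises the ease again, which is the one-way ratchet people call Ease Hell.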
From my understanding, in SM-2's two-button mode (Again and Good) neither button affects the Ease, so it should not cause Ease Hell. Edit: Apologies, I in fact did not understand. |
"Again" affects the ease. |
Something I have been thinking about for the past few weeks is how fundamentally different the Anki experience can be with SM-2 versus FSRS in terms of whether retention is built up over time or instead stagnates. With SM-2, one thing you can read everywhere is that Mature cards should have higher retention than Young ones. Another is that Leeches should be suspended. This works because in SM-2 the goal is NOT to predict when your Predicted Retention drops below your Desired Retention. In SM-2, a subjective evaluation of the outcome of a card determines how the ease factor is modified. If a user only uses Again/Good, each time the user fails and a new lapse starts, the new succession of intervals will be tighter, so there is a better chance they won't fail again. The Desired Retention here is just a constant factor that scales the whole deck. In summary, SM-2 in Anki leads to higher workloads, but to retention that increases over time, on a per-card basis. By contrast, FSRS is a prediction model. You specify your desired retention and, based on your results, it adapts. If you perform above the Desired Retention, the parameters may be optimized to reflect that and your intervals will grow longer, giving you a lower workload while maintaining the Desired Retention. With FSRS, in theory, Mature cards should not have higher actual retention than Young ones. Also, with FSRS, leeches are pretty normal, since FSRS will keep growing the interval so that your actual retention matches the Desired Retention as closely as possible. In summary, FSRS in Anki leads to a lower workload, but also a poorer learning/studying/reviewing experience. Notice that I specified "in Anki". I should have said: in Anki right now.
If Anki were adapted to expect a higher desired retention on the same cards based on different factors (number of reviews, ...), then using FSRS could also lead to better and better performance over time. Something I really want to insist on is that, while it's certainly tempting to be blinded by stats and to assume that "better prediction = better tool", it would be worth staying focused on why people use Anki in the first place. To build a model that predicts their 80% probability of remembering something and decreases their workload while never going above that? Or, instead, to gradually increase that retention to mastery level? In my opinion, the latter. In any case, if you think "Just increase retention": increasing retention at the deck level would lead to an enormous workload, more than even SM-2 would. This issue discusses it well: open-spaced-repetition/fsrs4anki#694 Some people facing that realization after changing to FSRS: |
As I said in the issue you linked:
|
You keep saying you want to spare the user extra burden, but you still deflect from the fact that people expect cards to "stick better" with time. Such feedback is not rare:
It's a clear sign that people expect their retention to slowly increase over time, whatever the mature threshold is. But with all due respect, it seems you're more interested in the statistical prowess of predicting review outcomes than in user experience. Most of the time, the answers in those topics are "Optimize your Parameters" or "There is nothing we can do about it". It almost feels like the algorithm becomes the focus, above the very human experience of mastering something. Note: I still think FSRS has its use. FSRS is a nice way to have the minimum workload possible while keeping average knowledge. It's a nice advantage which comes at the cost of seeing retention drop on more mature cards. But this has to be highlighted for the sake of the very user experience you are advocating. |
I'm simply saying that if users struggle to understand what one desired retention does, having two desired retentions will blow users' minds, but not in a good way. In a world where >95% of users understand how desired retention affects their workload, I would have no objections to adding a second desired retention for mature (however that is defined) cards. But we don't live in that world. |
It's impossible to remember everything with a 100% success rate. So it must be lower than that. Besides, you will eventually reach mastery level. Mastery level doesn't mean you know 100% (which is impossible). You could still have desired retention set to e.g. 95% and, depending on how you grade yourself, the interval will increase. Which essentially means
(with 95% probability of recalling the card when A or B is scheduled) Building on top of this: I don't see why leeches would increase at all:
The interval doesn't grow arbitrarily. It is based on your memory curve and how well you remembered that specific card. |
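To make "based on your memory curve" a bit more concrete, here is a minimal sketch of how an interval can follow from a card's stability and the desired retention. It assumes the power forgetting curve published for FSRS-4.5/FSRS-5 (decay of -0.5 and factor of 19/81, chosen so that retrievability is 90% when the elapsed time equals the stability). The function names are illustrative, and fuzz, rounding, and maximum-interval limits are left out.

```python
DECAY = -0.5
FACTOR = 19 / 81  # makes retrievability exactly 0.9 when elapsed == stability


def retrievability(elapsed_days: float, stability: float) -> float:
    """Predicted probability of recall after elapsed_days for a card
    with the given stability (in days)."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY


def next_interval(stability: float, desired_retention: float) -> float:
    """Days until predicted recall drops to desired_retention
    (the forgetting curve solved for elapsed time)."""
    return stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)


stability = 10.0  # hypothetical card
for desired_retention in (0.80, 0.90, 0.95):
    days = next_interval(stability, desired_retention)
    print(f"DR={desired_retention:.2f}: review after about {days:.1f} days")
```

Under these assumptions, a higher desired retention shortens the interval for the same card, and a card that keeps being remembered gains stability, which is what lengthens its intervals over time.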
In the next non-trivial (not 24.11.x) update, I think it's about time we enable FSRS out of the box. Any objections?