-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discover] Filter out globally seen posts client-side #7145
base: main
Are you sure you want to change the base?
Conversation
@@ -1,6 +1,8 @@ | |||
import React from 'react' | |||
import {AppState, AppStateStatus} from 'react-native' | |||
import {AppBskyFeedDefs} from '@atproto/api' | |||
// @ts-ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
// store in a .d.ts file
declare module 'bloomfilter' {
export class BloomFilter {
buckets: Int32Array;
/**
* Create a new empty bloom filter of size m with hashes k or
* provide buckets as a number[] or Int32Array to deserialize a bloom filter
* @param mOrBucketsArray number of bits (will be rounded up to nearest 32), or buckets
* to deserialize into a filled bloomfilter
* @param k number of hashes
*/
constructor(m: number | number[] | Int32Array, k: number);
/**
* Add a value to a bloom filter
* @param value
*/
add(value: any): void;
/**
* Test whether a value exists in a bloom filter. (False positives are
* possible, false negatives are not.)
*/
test(value: any): boolean;
/**
* Estimated cardinality.
*/
size(): number;
}
}
Based on https://www.npmjs.com/package/@types/bloomfilter#additional-details and https://github.com/jasondavies/bloomfilter.js/blob/master/bloomfilter.js#L72-L77
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just use @types/bloomfilter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can, but then you miss size(): number
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think #7145 (comment) is right, and instead you'll want to "just" create a local typescript file, even, porting the 7 year old code. Maintain it locally.
|
This looks really good! Nice one! I did have one thought: the bloomfilter library (file) is only 100 lines long, has no dependencies, and hasn't been updated in 7 years. Likely abandoned. I wonder if this would be a good candiate for vendoring rather than an npm dependency. Only other requirement would be including the BSD-3 licence (MIT compatible) perhaps at the top of the file. Hope you don't mind my drive by comment, saw the mention of bloom filter+ discover feed and definitely had to have a look. Cheers, |
b5995f5
to
1461add
Compare
I'm frustrated that the home button refreshes the timeline instead of adding new posts to the top. I'm just used to using Twitter that way, and on BlueSky when I click it, all the posts disappear for good. This change will only exaggerate the issue since I’ll never get a chance to see them again. 😞 |
This is only relevant for the Discover feed. It was already filtering out already seen posts on Discover so this change has no effect on the issue you're describing. I agree re: it would be nice to maintain continuity; we want to do this too but it's a bigger project. |
We have a tension between two wants:
The proposed solution here, which we arrived at after a discussion with @pfrazee and @whyrusleeping, is client-side deduplication. It's not perfect but it hits these points:
Test Plan
Add a log, verify posts get added to the buffer. Currently they get added:
You can verify that with a log.
yea.mov
Then you can verify that it does the filtering. For that, add a log to where the filtering happens and log the post being filtered out. Then spend some time scrolling Following. Then go to Discover and repeatedly request it — if you're lucky, you'll trigger the behavior. Observe that the filtered post is not being displayed.
Verify logged-out Discover still works.
Verify the filtering is not being applied to other feeds.
The Mechanism
I used a Bloom filter because it's cool. (?) No seriously, it's relatively light in memory (see the linked formula), otherwise a Set would be a lot larger. If we want to maintain continuity and serialize it or sync it between tabs, it'll get more complicated, but I think this is an OK start for now.
I'm doing this in the response handler rather than in the query or the feed tuner because it's not deterministic. The filter is mutable so we need to make sure we apply it once and consistently. Right before handing off the response seems like the perfect time.