Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Send account events to ozone #708

Open
wants to merge 31 commits into
base: main
Choose a base branch
from
Open

✨ Send account events to ozone #708

wants to merge 31 commits into from

Conversation

foysalit
Copy link
Contributor

@foysalit foysalit commented Jul 26, 2024

Specific features:

  • automod engine feature for persisting record events (create/update/delete) to Ozone (if a rule requests this)
  • rules to track ozone event on record subjects, and then persist future record updates to Ozone
  • optional logic to persist account and identity events to ozone

@bnewbold
Copy link
Collaborator

Yup! This is roughly the shape I was thinking. Some notes:

  • should have a conditional that if there isn't an authenticated ozone client configured on the engine, just skip
  • an explicit config flag for this behavior. for example, in prod we run two parallel instances of automod, one for blobs and one for all other labels; we would only want to do these lifecycle event pushes from one of those instances
  • we could add more context/metadata to the args of ProcessIdentityEvent, it is very minimal right now

@bnewbold
Copy link
Collaborator

bnewbold commented Aug 1, 2024

More quick notes:

  • to do lexicon generation in indigo, have the atproto repo checked out in parallel. eg, I have ~/code/indigo/ and ~/code/atproto/, so that the lexicons are at ~/code/atproto/lexicons/. you can have a branch or patches to the lexicon schema JSON files, doesn't need to be main branch. then, in indigo, run make lexgen, and then run make fmt lint test to check if anything broke. for example, new query params mean existing calling code needs to be updated. in some cases make cborgen is also necessary.
  • to "de-dupe" event creation, i'd probably do a queryEvents on the same subject with the same event type, and check that the most recent event wasn't created in the past 48 hours or so and having the same type? might need some more thought/tuning
  • just to call it out, there is a tension here between a model of "here is the account/record state at this point in time" vs "here is what changed (diff)". for a human mod reviewer, "what changed" is probably more meaningful, but the technical architecture isn't built that way. we can try to hack around that, but it is always going to be hard to figure out "what was the state before", which is what is required to figure out "what changed". having a log of "current state" allows a human to look and visually see what changed; for example this is what we do for DID PLC history in the account profile view right now

@foysalit
Copy link
Contributor Author

foysalit commented Aug 8, 2024

Thanks, these notes help a lot!

check that the most recent event wasn't created in the past 48 hours

these are quite tricky since profiles may be updated within minutes because people are changing their bio or smth or accounts may be deactivated/reactivated because someone wanted to check how it works etc. I think just setting a shorter window for the last event check, like within the last 30s or smth would cover most legit cases?

for a human mod reviewer, "what changed" is probably more meaningful, but the technical architecture isn't built that way

I think we would cover this via record snapshots in ozone. automod would just emit the changes and ozone would keep snapshots so that we have full history for record updates. for account events, these don't matter much, do they? I guess for handle change it does but then we kinda have that history in plc, right?

@bnewbold
Copy link
Collaborator

bnewbold commented Aug 9, 2024

for dedupe: yup! you are right, we definitely want to capture all those edits. maybe we can de-dupe from the CID or timestamp in the most recent event, instead of just the event type?

for "what changed": PLC is a good example. if only looking at PLC changes, the existing table works well: you can scan the columns and see what changed. in the ozone event history, it is probably going to be a mix of reports, comments, actions/labeling, and then a identity/PLC event in the middle. in that context is isn't easy to see what that one identity event represented though, would need to dig deeper. it is definitely an improvement to have it show up at all, and we can probably figure this out later in either the ozone backend or client.

@foysalit
Copy link
Contributor Author

Summarizing recent changes

  1. we are now keeping count of mod-event and only sending events for following cases IF there's at least 1 mod-event on the record/account
    a. post delete event
    b. any profile event (update and delete since on create we won't have any mod-events)
  2. cleaned up the resolve appeal stuff that got mixed into this from another PR

@foysalit foysalit changed the title 🚧 Send account events to ozone ✨ Send account events to ozone Oct 26, 2024
bnewbold added a commit that referenced this pull request Oct 30, 2024
@foysalit: this is pulling out some of the engine-layer refactors from
#708
@bnewbold
Copy link
Collaborator

I was re-working this over in #910, but i'll just push that work here and close the other one.

Summarizing changes made here:

  • synced some recent changes from main
  • renamed and shuffled some code around; eg "forward" not "reroute"
  • made "record op" persist stuff just another record persist behavior, not something controlled via general event forwarding. so basically controlled by enabling a rule, not configuring hepa
  • did keep "persist account history" as a hepa-level flag (not a rule), but also made that flag much more explicit about what it is doing.

And some outstanding issues:

at least in staging, the de-dupe query doesn't seem to work. I get:

XRPC ERROR 400: InvalidRequest: Invalid event type

when using event types like tools.ozone.moderation.defs#identityEvent

another problem i'm seeing with "delete" record ops is that some ozone events require a "strong ref"; but with a deletion there is no CID for the record, so a strongref can't be created.

Manually Resolved Conflicts:
 	automod/engine/persist.go
@bnewbold
Copy link
Collaborator

One path forward with this PR is to split out the account/identity update stuff from the record-op update.

The account/identity stuff should be good to go once the de-dupe "queryEvents" issue is resolved, and functionality is tested to work as expected.

The recordOp stuff in the general case needs a resolution to the strongRef problem for deleted records, same as: #739

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants