Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(events): introduce compact JSON form of EventEntry #11707

Draft
wants to merge 2 commits into
base: release/v1.26.0
Choose a base branch
from

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Mar 12, 2024

Optional compact form, can round-trip as either full standard Go style JSON or compact tuple struct with decoded "value" field represented as dag-json. Currently turned on as strict default for GetActorEvents and SubscribeActorEvents.

Bear with me, I'll explain the reasons for some of the crazy in the code here, but first what I'm trying to achieve here.

Here's some calibnet events from nv22 as they are directly out of the json response from Filecoin.GetActorEvents, formatted for emphasis on the pieces that you're expected to decode to work out what the event is doing (i.e. the entries):

Built-in:

{"entries":[
  {"Flags":3,"Key":"$type","Codec":81,"Value":"bmRlYWwtcHVibGlzaGVk"},
  {"Flags":3,"Key":"id","Codec":81,"Value":"GgACyyc="},
  {"Flags":3,"Key":"client","Codec":81,"Value":"GQS2"},
  {"Flags":3,"Key":"provider","Codec":81,"Value":"GQSG"}
],"emitter":"t05","reverted":false,"height":1427997,"tipsetKey":[{"/":"bafy2bzacedsp7zcxejngyftpu6hnr72ahsxyrycn5bjtfsp3vwxgvhs2hbpag"},{"/":"bafy2bzaceduo4goajlqtkarcozgm4hl6yla5kgajypguddk4embb6oxj2ah22"},{"/":"bafy2bzaceafs5ewv73nqxqhqqveanv74il5jxpvkp5oymoje6prtqh6b6hzje"}],"msgCid":{"/":"bafy2bzaceatdcipsape77hnf4lvaslwgkx4axbmro4fsgq3dvfk5ri4nbpoec"}}

FEVM:

{"entries":[
  {"Flags":3,"Key":"t1","Codec":85,"Value":"mzhAeIhrixxtxKRVUgms9lTj+bEmu4rNhD4BOizSMmc="},
  {"Flags":3,"Key":"t2","Codec":85,"Value":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADc="},
  {"Flags":3,"Key":"d","Codec":85,"Value":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAKSfDFPBaMbbj1gAAAAAAAAAAAAAAAAAAAAAAAAAAAAApJ8MU8FoxtuPWAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABl7xEw"}
],"emitter":"t410fazjdjy3l32rrlnavap6uljckuvjl7f4ziticsca","reverted":false,"height":1427998,"tipsetKey":[{"/":"bafy2bzaceb7dl5fqr32biq75c6f5e5lwuc7ikdgrztmetweiiwahrgsmvflp4"},{"/":"bafy2bzacebjzpdk4aybiwfpcjs6fmkxf7me452f4o4tcs7wugiambehsnspde"},{"/":"bafy2bzacecw4f5n6xlw3jqui2pw7eoaga3cxyepisestfztbizcfnzwd5kise"}],"msgCid":{"/":"bafy2bzaceahofvwlvtzcyfx75v26boqnai2t6q5uqogdb6bkepcfu2rqlmmyg"}}

With this branch, these same events come out looking like this:

Built-in:

{"emitter":"t05","entries":[
  [3,81,"$type","deal-published"],
  [3,81,"id",183079],
  [3,81,"client",1206],
  [3,81,"provider",1158]
],"height":1427997,"msgCid":{"/":"bafy2bzaceatdcipsape77hnf4lvaslwgkx4axbmro4fsgq3dvfk5ri4nbpoec"},"reverted":false,"tipsetKey":[{"/":"bafy2bzacedsp7zcxejngyftpu6hnr72ahsxyrycn5bjtfsp3vwxgvhs2hbpag"},{"/":"bafy2bzaceduo4goajlqtkarcozgm4hl6yla5kgajypguddk4embb6oxj2ah22"},{"/":"bafy2bzaceafs5ewv73nqxqhqqveanv74il5jxpvkp5oymoje6prtqh6b6hzje"}]},

FEVM:

{"emitter":"t410fazjdjy3l32rrlnavap6uljckuvjl7f4ziticsca","entries":[
  [3,85,"t1",{"/":{"bytes":"mzhAeIhrixxtxKRVUgms9lTj+bEmu4rNhD4BOizSMmc"}}],
  [3,85,"t2",{"/":{"bytes":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADc"}}],
  [3,85,"d",{"/":{"bytes":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAKSfDFPBaMbbj1gAAAAAAAAAAAAAAAAAAAAAAAAAAAAApJ8MU8FoxtuPWAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABl7xEw"}}]
],"height":1427998,"msgCid":{"/":"bafy2bzaceahofvwlvtzcyfx75v26boqnai2t6q5uqogdb6bkepcfu2rqlmmyg"},"reverted":false,"tipsetKey":[{"/":"bafy2bzaceb7dl5fqr32biq75c6f5e5lwuc7ikdgrztmetweiiwahrgsmvflp4"},{"/":"bafy2bzacebjzpdk4aybiwfpcjs6fmkxf7me452f4o4tcs7wugiambehsnspde"},{"/":"bafy2bzacecw4f5n6xlw3jqui2pw7eoaga3cxyepisestfztbizcfnzwd5kise"}]},

Note the built-in forms become significantly more useful, no matter how you consume these. Even with curl you can use jq to mess with them without involving a CBOR decoder.

The changes being:

  • "Tuple struct" formatting of the EventEntry so they're more compact, but we still retain flags and codec code but put them up front ([flag,codec,key,value])
  • We decode the value using the codec (if we can) and present the value field as it comes out
  • We use dag-json to do the JSONification, so we can properly represent bytes without the ambiguity of the Go standard JSON formatting (i.e. is ths string bytes as base64? is it at string that just happens to be base64?), but also allows us to represent objects as complex as we need (currently there's only scalars here and all of the FEVM events are raw for now so will always be bytes)).

About the complexity:

  • This all works transparently with go-jsonrpc, these objects will round-trip faithfully, even with the value decoding
  • This is currently an option, I was imagining it being a second argument to GetActorEvents and SubscribeActorEvents; but if we land this before 1.26.0 final then it's not a breaking change and we could make it default. If we do that, then maybe a third of the complexity in here can go away because this assumes being able to switch between, and gracefully handle, both forms.

@rvagg rvagg requested a review from a team as a code owner March 12, 2024 05:12
@rvagg rvagg force-pushed the rvagg/event-json branch 2 times, most recently from 090d0af to cc7c311 Compare March 12, 2024 05:29
Flags: uint8(flags),
Key: key,
Codec: uint64(codec),
Value: value,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rvagg Where do we decode this using the CBOR decoder ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see it's in eventEntryCompactFromAny

@@ -32,4 +32,7 @@ type EventEntry struct {
Value []byte
}

// TODO: implement EventEntry#UnmarshalJSON and EventEntry#MarshalJSON to allow for both compact and
// non-compact forms as per ActorEvent#UnmarshalJSON and ActorEvent#MarshalJSON

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get rid of this if we make the compact form the only available form right ? The ActorEvent encoders and decoders already handle the compaction for Event entries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not quite, because there are existing API methods that use this via the Event type that we don't use here. Such as ChainGetEvents.

What I would do I think is implement an unmarshaller that takes either form.

Also, we should probably also do this same treatment for the filter, currently I'm constructing queries with filters like this:

{"method":"Filecoin.GetActorEvents", "params": [{"fromHeight":1427974,"toHeight":-1,"fields":{"$type":[{"Codec":81,"Value":"cHZlcmlmaWVyLWJhbGFuY2U="},{"Codec":81,"Value":"amFsbG9jYXRpb24="},{"Codec":81,"Value":"cmFsbG9jYXRpb24tcmVtb3ZlZA=="},{"Codec":81,"Value":"ZWNsYWlt"},{"Codec":81,"Value":"bWNsYWltLXVwZGF0ZWQ="},{"Codec":81,"Value":"bWNsYWltLXJlbW92ZWQ="},{"Codec":81,"Value":"bmRlYWwtYWN0aXZhdGVk"},{"Codec":81,"Value":"b2RlYWwtdGVybWluYXRlZA=="},{"Codec":81,"Value":"bmRlYWwtY29tcGxldGVk"},{"Codec":81,"Value":"c3NlY3Rvci1wcmVjb21taXR0ZWQ="},{"Codec":81,"Value":"cHNlY3Rvci1hY3RpdmF0ZWQ="},{"Codec":81,"Value":"bnNlY3Rvci11cGRhdGVk"},{"Codec":81,"Value":"cXNlY3Rvci10ZXJtaW5hdGVk"}]}}],"id":1,"jsonrpc":"2.0"}

I'd like to do [81,"sector-terminated"], and it would be nice if that was flexible and took either form too I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this is an existing API that we probably don't want to break:

$ curl -s -X POST   -H "Content-Type: application/json"   --data '{"method":"Filecoin.ChainGetEvents", "params": [{"/":"bafy2bzaceahc33tda6xtwxx7agspupiiwrgjwkniie5x3epz76tstv5nhdn5y"}],"id":1,"jsonrpc":"2.0"}' http://localhost:1235/rpc/v1
{"jsonrpc":"2.0","result":[{"Emitter":6,"Entries":[{"Flags":3,"Key":"$type","Codec":81,"Value":"amFsbG9jYXRpb24="},{"Flags":3,"Key":"id","Codec":81,"Value":"GXxO"},{"Flags":3,"Key":"client","Codec":81,"Value":"GQPz"},{"Flags":3,"Key":"provider","Codec":81,"Value":"GQPo"}]}],"id":1}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super relevant to this PR, but it might be worth just deprecating the uncompacted format -- perhaps renaming the existing methods to ChainGetEventsLegacy (so easy breaking change to integrate initially), and then killing them in 3 months. I'm generally opposed to having two different formats for something so user-facing, it's only gonna be more detail for users to wrap their heads around.

Buuutt...future discussion, doesn't affect this changeset.

// decoded using the specified codec where possible, and they are encoded using dag-json form so
// bytes are represented using the `{"/":{"bytes":"base64"}}` form rather than Go standard base64
// encoding.
func (ae ActorEvent) AsCompactEncoded() ActorEvent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this ? I don't see us not using this in any of the APIs. Is it so that we can ultimately allow users to specify this ? I think the best thing to do would be to ship this PR with 1.26 so we can get rid of this optionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that's what I'd like to do, the bool pointer is annoying and only necessitated by this being an option; I'd remove this and the path that encodes ActorEvent as non-compact (but maybe leave both forms of decoding)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I hate the switching here, agree with @aarshkshah1992

@aarshkshah1992
Copy link
Contributor

@rvagg This is a major UX improvement for Actor Events and the code already looks really tight ! The biggest win for clients is not having to decode CBOR values.

While we shouldn't change anything on the smart contract events, shipping this for Actor events is a great win for the events work in 1.26.

Optional compact form, can round-trip as either full standard Go style JSON
or compact tuple struct with decoded "value" field represented as dag-json.

Currently turned on as strict default for GetActorEvents and
SubscribeActorEvents
Copy link
Contributor

@arajasek arajasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks correct, and is honestly a great improvement. Thank you!

I would strongly favour landing this in 1.26.0, removing the need to switch between the two forms. That logic is...gnarly, and definitely doesn't seem worth the effort to me. I'm typically opposed to rushing things at this stage of a release, but this is a risk-free area of the code, we can correct any issues here fairly easily.

One option we can consider is to simply descope this entire system from the entire 1.26.0 release, and land it in master (or some tagged feature release). That can give us a bit more time to look at this / think this through, while still ensuring that the functionality is there on mainnet upgrade day (just not in the "minimum" release).

// decoded using the specified codec where possible, and they are encoded using dag-json form so
// bytes are represented using the `{"/":{"bytes":"base64"}}` form rather than Go standard base64
// encoding.
func (ae ActorEvent) AsCompactEncoded() ActorEvent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I hate the switching here, agree with @aarshkshah1992

aec, _ := aePtr.(*ActorEvent) // safe to assume type
*ae = *aec

// check if we were encoded in compact form and set the flag accordingly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the completeness here, but I do think this is overkill.

@@ -32,4 +32,7 @@ type EventEntry struct {
Value []byte
}

// TODO: implement EventEntry#UnmarshalJSON and EventEntry#MarshalJSON to allow for both compact and
// non-compact forms as per ActorEvent#UnmarshalJSON and ActorEvent#MarshalJSON

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super relevant to this PR, but it might be worth just deprecating the uncompacted format -- perhaps renaming the existing methods to ChainGetEventsLegacy (so easy breaking change to integrate initially), and then killing them in 3 months. I'm generally opposed to having two different formats for something so user-facing, it's only gonna be more detail for users to wrap their heads around.

Buuutt...future discussion, doesn't affect this changeset.

@rvagg
Copy link
Member Author

rvagg commented Mar 14, 2024

There's one tiny problem this doesn't deal with that ends up being a bit important—the case of a poorly encoded Value field from a user programmed actor. IIRC in here I deal with that case by just representing it as bytes. Unfortunately that case is lossy, as is the case of an unknown future codec that we're not prepared to deal with. You can't quite round-trip because you've lost information about the original form. You couldn't use this for example to reconstruct an events AMT and get the event root in all cases (which isn't something we'd probably expect to be done, but it could be done to validate that an API gave you the events you cared about).

From discussion today, the proposal is:

  • Two separate API pairs: GetActorEvents + SubscribeActorEvents and GetActorEventsRaw + SubscribeActorEventsRaw.
  • Separate return types, and likely separate filter parameters for both that give us the flexibility to give nice decoded forms to the user except in the case where they want the raw, unaltered forms, perhaps to stick in a database, or perhaps to decode in their funky format without us molesting them.
  • Drop Flags and Codec from the nice forms, if you want the gory details go and ask for it with the Raw forms.
  • Ignore events that we can't decode - either because we don't recognise the codec or it's not decodable.

@arajasek's point about ChainGetEvents is a good one, maybe we do the same with it? ChainGetEvents vs ChainGetEventsRaw? Or we could just leave it alone.

As for descoping this for 1.26.0, that idea has merit except that one of the things we're aiming for is to help fill a data void around DDO, it wouldn't be the end of the world without it. But we are marking this clearly as "EXPERIMENTAL: may change in a future release". I'm suspecting we may even want to change the return value at some point as we find this one lacking; see #11680 for some initial thoughts on what we may end up wanting to do.

@rvagg
Copy link
Member Author

rvagg commented Sep 11, 2024

Converted to draft for now, needs a bit of a rethink: this might be better if it's lossless (i.e. you could faithfully reverse the process to get the raw event data) and could break a little harder to deal with problems like pagination and signalling max-results that we don't have on the *Raw() variants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ⌨️ In Progress
Development

Successfully merging this pull request may close these issues.

4 participants