Feed caching #212

Merged: goodboy merged 28 commits into master from feed_caching on Sep 1, 2021

Conversation

@goodboy (Contributor) commented on Aug 10, 2021:

Initial data feed caching over piker.data.feed.open_feed() using the new maybe_open_feed().

Adds a new _cacheables.py which contains a bunch of helpers for cache-y things.
This relies on goodboy/tractor#229 in order to give multiple actor-local task consumers broadcast access to quote streams.

Putting this up to get eyes on it and see if there's any reason not to start building streaming apis under this paradigm.
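A rough sketch of the intended consumer-side usage; the broker name, symbol and the exact `maybe_open_feed()` signature here are assumptions, not the PR's final API:

```python
from piker.data.feed import maybe_open_feed  # added in this PR

async def consume_quotes() -> None:
    # the first entrant in this actor allocates the feed via
    # open_feed(); later entrants get the cached Feed instance plus
    # broadcast access to the already-running quote stream
    async with maybe_open_feed(
        'kraken',     # hypothetical broker backend name
        ['xbtusd'],   # hypothetical symbol list
    ) as feed:
        async for quotes in feed.stream:
            print(quotes)
```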

Still TODO:

  • we should also cache the Feed.index_stream() per actor, which makes me wonder if we should provide new Feed instances on cache hits, or if we can just wrap the cached one and override ._index_stream (or is that too mutate-y?). Ended up using the new maybe_cache_ctx() mngr introduced in this PR.
  • attaching to a data feed currently registers the client stream with the sample and broadcast loop; would there be a benefit to switching from this brokerd-always-pushes model to a tasks-always-pull model here?
    • i'm thinking no
    • the only thing that might be handy is a re-impl of uniform_rate_send as a broadcast_receiver() potentially? in which case i think we can drop all the timing logic and just let sent quotes queue up and then pull on a fixed period? i may be thinking about this wrong..
  • feed "pausing" support which allows sending a pause/resume message to the endpoint to add/remove the subscription dynamically
  • taking the Feed api to our fsp subsystem (this will likely get delayed to a new PR) -> Expose FSP streams as Feeds #216

@goodboy added the data-layer (real-time and historical data processing and storage), (sub-)systems (general sw design and eng) and perf (efficiency and latency optimization) labels on Aug 10, 2021
@goodboy requested a review from guilledk on August 10, 2021 at 21:09
# maybe_open_ctx() below except it uses an async exit stack.
# ideally we pick one or the other.
@asynccontextmanager
async def open_cached_client(
@goodboy (author) commented on Aug 10, 2021:

this is old code that i think should be ported to the maybe_open_ctx() below but for broker backend client instances.
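A rough sketch of what that port might look like, assuming `maybe_open_ctx()` ends up in the new `_cacheables.py` and yields a `(cache_hit, value)` pair keyed per broker (the signature and helper imports are assumptions):

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator

from piker._cacheables import maybe_open_ctx  # assumed module path
from piker.brokers import get_brokermod       # assumed existing helper

@asynccontextmanager
async def open_cached_client(
    brokername: str,
) -> AsyncIterator[Any]:
    # cache one client instance per broker backend, per actor
    brokermod = get_brokermod(brokername)
    async with maybe_open_ctx(
        key=brokername,
        mngr=brokermod.get_client(),  # assumed backend client ctx mngr
    ) as (cache_hit, client):
        yield client
```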

@goodboy (author):

boo yeah. worked like a charm 🏄🏼

def get_and_use() -> AsyncIterable[T]:
# key error must bubble here
feed = cache.ctxs[key]
log.info(f'Reusing cached feed for {key}')
@goodboy (author):

i guess this should use some representation format instead of hard-coding "feed" 😂

if cache_hit:
# add a new broadcast subscription for the quote stream
# if this feed is likely already in use
async with feed.stream.subscribe() as bstream:
@goodboy (author):

This is the part that requires goodboy/tractor#229 tokio style broadcasting.
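In practice that means multiple actor-local tasks can each take their own subscription onto the one underlying quote stream, roughly like this (the task names are made up):

```python
import trio

async def fan_out_locally(feed) -> None:
    async def watch(name: str) -> None:
        # each task gets its own tokio-style broadcast receiver over
        # the same underlying quote stream
        async with feed.stream.subscribe() as bstream:
            async for quote in bstream:
                print(name, quote)

    async with trio.open_nursery() as n:
        n.start_soon(watch, 'chart')
        n.start_soon(watch, 'fsp')
```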

@goodboy (author) commented on Aug 10, 2021:

Just updated infect_asyncio in tractor to match. so we shud be guuuudd.

@goodboy (author) commented on Aug 11, 2021:

Probably also worth keeping 👀 on down the road: the cachetools project.

@goodboy (author) commented on Aug 16, 2021:

Added Feed.pause()/.resume() so we can use it when charts are switched, to avoid brokerd pushing more streams than necessary and to avoid unseen draw cycles on charts not in focus 😎
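Roughly how a chart might drive this on focus changes (the hook name is hypothetical; only `Feed.pause()`/`.resume()` come from this PR):

```python
async def on_chart_focus_change(feed, in_focus: bool) -> None:
    if in_focus:
        # ask brokerd to re-add our quote subscription
        await feed.resume()
    else:
        # stop pushes for this feed and skip unseen draw cycles
        await feed.pause()
```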

uid = ctx.chan.uid
fqsn = f'{symbol}.{brokername}'

async for msg in stream:
@goodboy (author) commented on Aug 16, 2021:

This implements "feed pausing" - the beauty of 2 way streamzz 🏄🏼‍♀️
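A rough sketch of what that 2-way control loop likely amounts to; the 'pause'/'resume' message values and the list-style subscriber registry are assumptions (the `.remove(sub)` call mirrors the `attach_feed_bus()` teardown shown further down):

```python
async for msg in stream:
    if msg == 'pause':
        # drop this client from the broadcast subscriber set so
        # brokerd stops pushing quotes to it
        if sub in bus._subscribers[symbol]:
            bus._subscribers[symbol].remove(sub)

    elif msg == 'resume':
        # re-register the client stream for quote pushes
        if sub not in bus._subscribers[symbol]:
            bus._subscribers[symbol].append(sub)
```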

Maybe i've finally learned my lesson that exit stacks and per task ctx
manager caching is just not trionic.. Use the approach we've taken for
the daemon service manager as well: create a process global nursery for
each unique ctx manager we wish to cache and simply tear it down when
the number of consumers goes to zero.

This seems to resolve all prior issues and gets us error-free cached
feeds!
…ed..."

Think this was fixed by passing through `**kwargs` in
`maybe_open_feed()`; the shielding for fsp respawns wasn't being
properly passed through.

This reverts commit 2f1455d.
@goodboy requested a review from iamzoltan on August 31, 2021 at 13:17
ctx_key = id(mngr)

# TODO: does this need to be a tractor "root nursery"?
async with maybe_open_nursery(cache.nurseries.get(ctx_key)) as n:
@goodboy (author):

I'm still slightly unclear how the teardown part of lifetime works here; pretty sure it's going to tear down when the first consumer task is complete instead of when the last arriving consumer does.

In order to get the latter behavior we might need to have an actor global nursery that's brought up with the runtime / the consumer process?

Not sure this absolutely must be addressed right now since usually the creator task stays up as long as the app / daemon which is using the feed.

@@ -335,6 +355,31 @@ async def attach_feed_bus(
bus._subscribers[symbol].remove(sub)


@asynccontextmanager
async def open_sample_step_stream(
@goodboy (author):

Allows us to actor-cache the OHLC step event stream per delay (since you'll likely want the same event for all local consumers).
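Something like the following, assuming it leans on the same cached-ctx helper and keys on the sample period; the helper import, the remote endpoint name and the argument names are all guesses:

```python
from contextlib import asynccontextmanager

import tractor
from piker._cacheables import maybe_open_ctx        # assumed module path
from piker.data._sampling import iter_ohlc_periods  # assumed endpoint

@asynccontextmanager
async def open_sample_step_stream(
    portal: tractor.Portal,  # portal to the data-feed actor
    delay_s: int,            # sample period in seconds
):
    # all local consumers of e.g. the 1s step event share one remote
    # stream, cached per delay
    async with maybe_open_ctx(
        key=('sample_step', delay_s),
        mngr=portal.open_stream_from(
            iter_ohlc_periods,
            delay_s=delay_s,
        ),
    ) as (cache_hit, istream):
        yield istream
```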

brokername,
[sym],
loglevel=loglevel,
**kwargs,
@goodboy (author):

This `**kwargs` pass-through was critical...

@guilledk (Contributor) left a review:

Love to see the new _broadcast machinery. Also _cacheables looking sexy!

In order to ensure the lifetime of the feed can in fact be kept open
until the last consumer task has completed we need to maintain
a lifetime which is hierarchically greater than all consumer tasks.

This solution is somewhat hacky but seems to work well: we just use the
`tractor` actor's "service nursery" (the one normally used to invoke rpc
tasks) to launch the task which will start and keep open the target
cached async context manager. To make this more "proper" we may want to
offer a "root nursery" in all piker actors that is exposed through some
singleton api or even introduce a public api for it into `tractor`
directly.
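A self-contained sketch of that scheme using plain `trio` primitives, with a caller-supplied long-lived nursery standing in for the tractor service nursery (none of these names are the PR's actual ones):

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncContextManager, Hashable

import trio

_cache: dict[Hashable, tuple[Any, trio.Event]] = {}
_users: dict[Hashable, int] = {}
_lock = trio.Lock()

@asynccontextmanager
async def maybe_open_cached(
    key: Hashable,
    mngr: AsyncContextManager,
    service_nursery: trio.Nursery,
):
    '''Yield ``(cache_hit, value)``; the wrapped ``mngr`` is only
    torn down once the *last* consumer task exits.
    '''
    async with _lock:
        if key not in _cache:

            async def _keeper(task_status=trio.TASK_STATUS_IGNORED):
                # hold the target ctx mngr open in a task living in a
                # longer-lived ("service") nursery
                async with mngr as value:
                    teardown = trio.Event()
                    _cache[key] = (value, teardown)
                    task_status.started()
                    await teardown.wait()

            await service_nursery.start(_keeper)
            cache_hit = False
        else:
            cache_hit = True

        _users[key] = _users.get(key, 0) + 1
        value, teardown = _cache[key]

    try:
        yield cache_hit, value
    finally:
        async with _lock:
            _users[key] -= 1
            if not _users[key]:
                del _cache[key], _users[key]
                teardown.set()  # lets the keeper task exit `mngr`
```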
@iamzoltan (Contributor) left a review:

Looks kosher to me. Digging the improvements!



@asynccontextmanager
async def open_cached_client(
@goodboy (author):

not bad right?

'''
lock = trio.Lock()
users: int = 0
values: dict[tuple[str, str], tuple[AsyncExitStack, Any]] = {}
@goodboy (author):

woops, the AsyncExitStack is tossed now so this annotation is stale.

@goodboy merged commit 37d94fb into master on Sep 1, 2021
@goodboy deleted the feed_caching branch on September 1, 2021 at 14:25