Async architecture (inventory) #880
I think we should start with a discussion regarding inventory and async: what does this mean and how would it work? Currently, inventory loading is single-threaded and synchronous (it happens outside the context of "runners", before any runner is involved). So if we have, for example, a NetBox system with 1000 devices and wanted an async inventory plugin, how would this operate? I picked NetBox because there is a NetBox Nornir plugin and I have a working NetBox system to test against (but it could be any remote HTTP API inventory system). The current inventory load behavior is this:
inventory_plugin = InventoryPluginRegister.get_plugin(config.inventory.plugin)
inv = inventory_plugin(**config.inventory.options).load()
# then basically call a transform function (on the hosts if one has been defined) |
I think the first question I'd raise here is: is it really necessary? Do we actually need an async inventory for any particular reason, or is it just to say we support async inventories? I ask because the inventory is a one-shot thing you run at initialization; afterwards you mostly work with already loaded data. I'd understand async tasks that incrementally add information (i.e. a sync inventory that loads the hosts/groups and metadata, while interface data, BGP information, etc. is loaded afterwards via async tasks), but I'm not sure I understand the use case for an async inventory. |
I would probably generalize the question and ask, "do we need concurrency in inventory loading" (whether async vs threaded). In other words, for large inventories being loaded via API what are the load times and what kind of improvements can be made via using a concurrent solution (and how much complexity does that entail). |
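One way to put numbers on that question is a small timing harness. The sketch below is only an illustration: `fetch_page` is a hypothetical stand-in that sleeps to simulate API latency (it is not part of Nornir or any NetBox client), and the page size and latency are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(page):
    """Hypothetical stand-in for one paginated API call (10 devices per page)."""
    time.sleep(0.02)  # simulated network latency, an arbitrary value
    return [f"device{page * 10 + i}" for i in range(10)]


def load_sequential(pages):
    # One request after another, as a single-threaded loader would do.
    return [name for page in range(pages) for name in fetch_page(page)]


def load_threaded(pages, workers=8):
    # pool.map keeps the page order, so both loaders return the same list.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [name for chunk in pool.map(fetch_page, range(pages)) for name in chunk]


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


seq, seq_seconds = timed(load_sequential, 20)
thr, thr_seconds = timed(load_threaded, 20)
print(f"sequential: {seq_seconds:.2f}s, threaded: {thr_seconds:.2f}s")
```

Against a real API the gain would depend on rate limits and server capacity, so this only shows how the comparison could be set up.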
I could imagine that the API and the rate limit are the bottleneck. Or we are loading too much data. I would appreciate it if an inventory plugin would have an option for a minimal import. (Only host name and some needed properties but not all interfaces, for example.) And if more information is required it could be added to a task. Of course, we have to ensure the threads don't write the same inventory properties, but if we do it by host, it should be fine. I think it would be beneficial to investigate where the bottleneck is. |
100% that. That's why I suggested earlier that what makes more sense is leaving the inventory as it is and relying on "inventory tasks" (which could be async) to load workflow-specific data. IMO, given there is no clear use case in mind, I'd suggest tabling this particular topic (async inventory) until something more concrete arises. P.S.: To keep discussions a bit easier to track, I'd also suggest creating a separate issue for each of your bullet points, otherwise this might get messy real quick :P |
@dbarrosop @ubaumann So just to make sure I am understanding what you are both saying--a better pattern would be to have very simplified inventory plugins (minimal set of information needed to connect to the device). And then subsequent Nornir tasks which populate additional information in hosts/groups for the inventory information that you need. These subsequent Nornir tasks could then be threaded or async (assuming we build an async-runner). Is that a correct summary? |
Yes, exactly. Use the inventory plugin to download your devices, groups and metadata (so you can group them and filter them) and then rely on tasks within your workflows to download the specific data you need for that specific workflow. In my experience it is the best way of managing a large network with lots of configuration data. Otherwise you end up with a humungous inventory that takes ages to be loaded and requires a generous amount of RAM. |
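A rough sketch of that pattern, using plain dictionaries rather than Nornir's real inventory objects: `load_minimal_inventory`, `fetch_interfaces`, and `enrich_host` are hypothetical names, and the data is invented. Each worker writes only to its own host entry, which is what keeps the threaded version safe.

```python
from concurrent.futures import ThreadPoolExecutor


def load_minimal_inventory():
    """Hypothetical inventory plugin output: just enough to connect and filter."""
    return {
        "rtr1": {"hostname": "10.0.0.1", "platform": "ios", "data": {}},
        "rtr2": {"hostname": "10.0.0.2", "platform": "eos", "data": {}},
    }


def fetch_interfaces(name):
    # Stand-in for a workflow-specific call to the source of truth.
    return [f"{name}-eth0", f"{name}-eth1"]


def enrich_host(name, host):
    # Each worker touches only its own host, so no two threads share a key.
    host["data"]["interfaces"] = fetch_interfaces(name)
    return name


inventory = load_minimal_inventory()
with ThreadPoolExecutor(max_workers=4) as pool:
    done = list(pool.map(lambda item: enrich_host(*item), inventory.items()))
```

In Nornir itself the enrichment step would be an ordinary task executed by whichever runner is configured; the thread pool here only stands in for that.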
Depending on which environment Nornir is running in, I think it could be relevant to have a way to load the initial nodes in an asyncio loop. I.e., if Nornir is started from an environment that is already running within asyncio, you wouldn't want a blocking call even if it's just to gather a couple of hostnames. However, given how Nornir works, I don't see this as an issue at all. The only current "blocker" is that the InitNornir function makes two synchronous calls now:
return Nornir(
    inventory=load_inventory(config),
    runner=load_runner(config),
    config=config,
    data=data,
)
If someone wants the initial loading of the inventory to be async, it would be very easy to have this kind of helper function in your own code, even if it isn't part of the core. |
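A helper of that kind could look roughly like the following. This is a sketch under assumptions: `load_inventory_blocking` is a hypothetical placeholder for whatever synchronous loader you already have, and `asyncio.to_thread` requires Python 3.9 or later. The heartbeat task is there only to show that the event loop keeps running while the inventory loads.

```python
import asyncio
import time


def load_inventory_blocking():
    """Hypothetical stand-in for a synchronous inventory load."""
    time.sleep(0.05)  # simulated blocking I/O
    return {"rtr1": {"hostname": "10.0.0.1"}}


async def load_inventory_async():
    # Run the blocking loader in a worker thread so the event loop stays free.
    return await asyncio.to_thread(load_inventory_blocking)


async def main():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    inventory = await load_inventory_async()
    beat.cancel()
    return inventory, ticks


inventory, ticks = asyncio.run(main())
```

This does not make the loader itself asynchronous; it only keeps it from blocking a loop that is already running.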
@dbarrosop came to a similar conclusion with Ansible and documented here https://blog.networktocode.com/post/nautobot-ansible-variable-management-at-scale/ |
A remaining question, though: if you are loading a large number of devices from inventory (say >1000), even with minimal API calls per device, is it worth adding concurrency for this in Nornir? Or is this a case where people would always just solve it on their own (with some sort of custom inventory plugin), or is it just not really a problem? |
I guess you could go as wide as the threads you have on your API, but that seems as likely to cause issues as to resolve them, for example when you fill up all of the threads and potentially (perhaps not likely?) overload the SQL server. |
@ogenstad Wouldn't the For example, for the NetBox plugin: |
I'm 100% for having a minimal inventory, but I still think there is a valid use case for async support in the inventory. I think the argument about too many concurrent calls overloading the API isn't really the point here; it's easy to limit that in the code. Also, as @ogenstad mentioned, when you are running async code, it's best if all I/O functions support async, so if the goal is to run Nornir tasks with async, it makes a lot of sense to have the inventory support async too. One more thing to consider: I think it makes sense to use a single library/client/plugin within Nornir to connect to a SOT system like NetBox/Nautobot. So if we agree that it makes sense to use async for the tasks, it wouldn't make sense to use a different library/client/plugin for the inventory. |
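As an illustration of limiting concurrency in code, an `asyncio.Semaphore` can cap the number of in-flight requests. Everything below is a sketch: `fetch_device` simulates an HTTP round trip with a sleep, and `MAX_IN_FLIGHT` is an arbitrary value, not a recommendation.

```python
import asyncio

MAX_IN_FLIGHT = 5  # assumed cap; tune to what the API can tolerate


async def fetch_device(device_id, limiter, stats):
    async with limiter:  # at most MAX_IN_FLIGHT coroutines pass this point at once
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # simulated HTTP round trip
        stats["in_flight"] -= 1
        return {"id": device_id, "name": f"device{device_id}"}


async def load_devices(count):
    limiter = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = {"in_flight": 0, "peak": 0}
    devices = await asyncio.gather(
        *(fetch_device(i, limiter, stats) for i in range(count))
    )
    return devices, stats["peak"]


devices, peak = asyncio.run(load_devices(50))
```

The `peak` counter is only instrumentation to confirm the cap holds; a real plugin would not need it.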
Yes this would need to support async as well |
I opened PR #882 with a proposal to support both Sync and Async inventory with very minimal changes. |
Sure, there would be a bit more boilerplate, though the Nornir object itself only cares about the inventory data it gets from InitNornir and the loader. What I meant was that it could be loaded directly into the object. |
@dgarros Can you re-state this? I didn't follow what you were saying here?
|
@ktbyers if I take our SDK and our nornir plugin as an example
So if I want to use Async in the Tasks, it wouldn't make sense to use the non async client for the inventory because it would create the wrong objects, without Async support. Also the client itself is usually passed from the inventory to the tasks so it would have to be re-initialized later in the code. |
Apologies, it was not meant to be an argument, just a point of consideration. |
@itdependsnetworks It is all good...we are just trying to see the pros and cons of various options. |
I am now convinced there is value in async plugins, if only because of pagination (even though IMO inventory plugins should come with some pre-filtering capabilities so you can download only the devices you may be interested in). However, this raises the following question: why support both sync and async? For instance, the following comment from @dgarros:
This comment basically suggests that Nornir is all of a sudden going to become a fragmented framework where you need to carefully combine your inventories and plugins to make sure they match and can work together. Or did I misunderstand the comment? Assuming I didn't, why are we trying to support both together, then? Isn't this going to be confusing and/or lead to a lot of complexity within Nornir? Why not consider releasing Nornir 4 as an exclusively async framework in that case? I am personally not a fan of having Nornir support a disjoint set of plugins that aren't compatible with each other. |
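On the pagination point, the benefit comes from requesting the remaining pages concurrently once the first response reports the total count. The sketch below assumes a limit/offset API whose responses carry `count` and `results` fields; `fetch_page` and the in-memory `DEVICES` list are hypothetical stand-ins for a real HTTP client and backend.

```python
import asyncio

DEVICES = [f"device{i}" for i in range(1000)]  # fake backend data
PAGE_SIZE = 100


async def fetch_page(offset, limit):
    """Hypothetical stand-in for GET .../devices/?limit=<limit>&offset=<offset>."""
    await asyncio.sleep(0.01)  # simulated latency
    return {"count": len(DEVICES), "results": DEVICES[offset:offset + limit]}


async def load_all():
    # The first page tells us how many devices exist in total.
    first = await fetch_page(0, PAGE_SIZE)
    offsets = range(PAGE_SIZE, first["count"], PAGE_SIZE)
    # gather returns results in the order the coroutines were passed in.
    rest = await asyncio.gather(*(fetch_page(o, PAGE_SIZE) for o in offsets))
    names = list(first["results"])
    for page in rest:
        names.extend(page["results"])
    return names


names = asyncio.run(load_all())
```

With pre-filtering on the server side, as suggested above, there would simply be fewer pages to fetch.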
Unfortunately I think the industry is not ready to completely migrate to Async and a lot of people are using Nornir without Async successfully so I think it makes sense to support both for the time being. |
That's fine, but then plugins should be compatible with each other. I don't think Nornir users should be put in a position where they need to mix and match plugins and try to understand whether they are compatible with each other. This is just going to be bad UX and cause problems for maintainers as people try to figure out why things don't work. To clarify, what I mean here is that a user should be able to use |
Maybe we need to take a step back and decide goals and non-goals and how the ecosystem is going to work in this brave new world before we start discussing architecture/implementation details. |
I will create a set of goals/non-goals in #881. |
I did a very rough inventory proof-of-concept here: https://github.com/ktbyers/nornir/pull/1/files This is with a separate InitNornirAsync and a separate I didn't really worry about code duplication or making all of the parts async (i.e. was focussing on the NetBox http calls and converting them to async). I used NetBox as I have a new version of Netbox in my environment (so it was easier for me to test). I also wanted to get a better sense of some of the async code implications (on the inventory side). |
Looking at your example I am assuming my interpretation of the comment below was wrong:
I assumed the comment meant that the expectation was that an async inventory plugin wouldn't work with a sync task but your |
Yes, I would definitely want the Nornir inventory objects (Hosts, Groups, Defaults) to not care about their load mechanism (i.e. sync vs async). I think we could accomplish both though (i.e. add a field to the Host objects indicating whether they were loaded via async or not) and then some form of a config option (or potentially logic from the Nornir end-user) disallowing any use of sync tasks (for async inventory objects). |
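To make that idea concrete, here is a minimal sketch. The `Host` dataclass, the `loaded_async` field, the `strict` option, and `IncompatibleTaskError` are all hypothetical; none of them exist in Nornir today.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host object; not Nornir's real Host class."""
    name: str
    loaded_async: bool = False  # records how the host was loaded
    data: dict = field(default_factory=dict)


class IncompatibleTaskError(Exception):
    pass


def check_task_allowed(host, task_is_async, strict=False):
    """Raise if strict mode forbids a sync task on an async-loaded host."""
    if strict and host.loaded_async and not task_is_async:
        raise IncompatibleTaskError(
            f"{host.name} was loaded asynchronously; sync tasks are disabled"
        )
    return True


sync_host = Host("rtr1")
async_host = Host("rtr2", loaded_async=True)
```

With `strict=False` (the default in this sketch) the host objects stay agnostic about how they were loaded, and the restriction becomes opt-in.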
This issue is async inventory related only; see #881 for additional async-related topics and links to separate issues on these topics.