Software

Zach Day edited this page Feb 24, 2020 · 13 revisions

sunneed

sunneed (sunneed users need no external energy daemon) is the main software component of this project.

Goals

  • Keep-alive: sunneed does not let the battery die.
  • Power and time efficiency: if the battery is full while still charging, any additional charge goes to waste, so it is preferable to stay below maximum charge whenever the system is charging. If there is not enough power for an operation, client applications that have other work to do should have some recourse beyond waiting, since time spent waiting is time wasted.
  • Multi-tenancy: sunneed primarily targets multi-tenant systems, so care must be taken to hide as much tenant-specific information as possible.

Design basis

If the battery dies, then the system dies with it; therefore, making sure the battery never reaches zero is the fundamental goal of sunneed. To that end, it should always be keeping an eye on the battery to take emergency measures if it gets in the red.¹ But obviously, we want to come up with ways to achieve that fundamental goal preemptively rather than reactively, so we need to take the first step and limit the power draw of an application well in advance of the battery reaching a critical state. We will do this with two techniques: power-dependent CPU scheduling, and power-based resource access control for external peripherals.

We want to impose low programming overhead on client applications. Therefore, sunneed should play a passive role whenever possible, observing the power behavior of currently running processes and handling them appropriately. Applications already operate under the assumption that they can be halted at any moment, for any length of time, if the scheduler has more pressing matters to attend to. So let's have sunneed act as an auxiliary scheduler, stopping applications based on power usage (the serious logic behind this is below). This way, sunneed is completely invisible to a process until that process needs to use an external device.

Accesses to external devices will be handled separately from other power-consuming acts like using CPU or memory. The rationale behind this is that execution of instructions can be halted at any moment, meaning that pure-CPU computations can be divided into VERY granular pieces. Compare this to taking a photo with the camera, which requires a sudden burst of power to the camera that cannot be stopped in the middle.

Although we do not want to impose too heavily upon application programmers, we can facilitate power management and help clients intelligently respond to power situations if they have some kind of interaction with sunneed beyond just being watched. For example, an application that wants to take a photo might find that there's not enough budget to do so. If that's the case, they might still have some CPU-based processing to do. Since using the CPU is a much more granular operation than taking a photo, it makes sense that they would try to run the processing and let sunneed pause them mid-computation to make more effective use of their time, especially if the alternative is to sit there doing nothing.

Quantums

Let's call each time-of-day stretch a drain epoch (simply put, a day would have two drain epochs; a ~16-hour one during suntime and an ~8-hour one during moontime). At the start of each drain epoch, the availableCharge will be calculated based on what kind of epoch it is. The system should try to consume the epoch's availableCharge over the course of the epoch. To do this, we will divide the epoch into x quantums; then, our goal becomes to consume availableCharge/x each quantum. Each tenant is allotted a fraction of power each quantum relative to that tenant's priority (probably determined by how much money that tenant pays us). If a tenant asks to do a resource op that costs more than their remaining budget for that quantum, their request is rejected. A tenant that runs out of power budget doing CPU operations will be paused until they get another budget when the next quantum begins.
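The quantum rules above can be sketched in a few lines. This is a minimal illustration, not sunneed's implementation: the `Tenant` class, the function names, and the numbers (960 units over 96 quantums, priorities 3 and 1) are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    priority: float      # relative weight; how it is set is left open in the text
    budget: float = 0.0  # charge remaining in the current quantum


def start_quantum(tenants, available_charge, num_quantums):
    """Give each tenant its share of this quantum's slice of the epoch budget."""
    quantum_charge = available_charge / num_quantums
    total_priority = sum(t.priority for t in tenants)
    for t in tenants:
        t.budget = quantum_charge * t.priority / total_priority


def request_resource_op(tenant, cost):
    """Grant the op only if it fits in the tenant's remaining quantum budget."""
    if cost > tenant.budget:
        return False  # rejected; the tenant can do CPU work or wait instead
    tenant.budget -= cost
    return True


tenants = [Tenant("a", priority=3), Tenant("b", priority=1)]
start_quantum(tenants, available_charge=960.0, num_quantums=96)
# Each quantum holds 10 units: tenant a gets 7.5, tenant b gets 2.5.
granted_a = request_resource_op(tenants[0], cost=5.0)  # fits in 7.5
granted_b = request_resource_op(tenants[1], cost=5.0)  # exceeds 2.5
```

Here tenant a's request is granted and tenant b's is rejected, leaving b's budget untouched.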

Power debt/credit

The above rules about rejecting power requests and pausing processes represent the general line of thinking for power budget management, but we will need to copy some features from similar systems to cover certain cases. We gave the example above of a process whose camera request is denied, but can still do some processing. What if it doesn't have any processing left to do, though? Our thinking there is within the bounds of the quantum, but we know that we actually have more power than that: we have availableCharge, minus whatever we've used this epoch! To save time, we can grant the camera request and simply decrement the client's powerCredit by however much is consumed. Conversely, the client can choose to manually go to sleep to save power. Then, we add whatever power they didn't consume to their powerCredit. When the next quantum rolls around, we add their powerCredit to whatever power they would have received normally. With this, we can give applications some flexibility in how they deal with power-constrained situations. It also solves the edge case of a single operation taking more power than is available in a single quantum, which would otherwise be impossible, as the process's budget would never be high enough to perform it.
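One possible reading of the credit/debt bookkeeping is sketched below. The `Account` class and function names are hypothetical. The sketch also assumes that a tenant whose budget is negative simply stays paused, and that the caller tracks how much of the epoch's charge remains.

```python
from dataclasses import dataclass


@dataclass
class Account:
    share: float               # charge normally granted each quantum
    budget: float = 0.0        # charge usable in the current quantum
    power_credit: float = 0.0  # carried between quantums; negative means debt


def begin_quantum(acct):
    # Fold any credit (or debt) into the normal share.
    acct.budget = acct.share + acct.power_credit
    acct.power_credit = 0.0


def end_quantum(acct):
    # Unused budget becomes credit; a negative budget carries forward as debt.
    acct.power_credit += acct.budget
    acct.budget = 0.0


def request_op(acct, cost, epoch_remaining):
    """Return True if the op may run, charging the budget or, failing that, credit."""
    if cost <= acct.budget:
        acct.budget -= cost
        return True
    if cost <= epoch_remaining:
        acct.power_credit -= cost  # borrow against future quantums
        return True
    return False


camera = Account(share=10.0)
begin_quantum(camera)
borrowed = request_op(camera, cost=25.0, epoch_remaining=100.0)
end_quantum(camera)    # credit: -25 borrowed + 10 unused = -15
begin_quantum(camera)  # budget: 10 - 15 = -5, so the tenant stays paused
debt_budget = camera.budget

sleeper = Account(share=10.0)
begin_quantum(sleeper)
end_quantum(sleeper)   # slept through the quantum, banking 10
begin_quantum(sleeper)
saved_budget = sleeper.budget
```

The camera tenant pays its debt off over the following quantums, while the sleeping tenant starts its next quantum with double its usual share.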

Power management math

Note that the solutions I present here ignore a lot of miscellaneous things that need to be accounted for, mostly related to error compensation.

At the beginning of a solar drain epoch (aka the morning, but it sounds super cool, right?), we have startCharge units of charge in the battery. We want to hit maxCharge by the end of the epoch, when night begins. Using predictions, we can know the dailyChargeIntake in advance. With this, we have availableCharge = excessCharge = dailyChargeIntake - (maxCharge - startCharge).² Our goal is therefore to distribute the availableCharge to applications fairly. In a lunar drain epoch the situation is actually quite similar: we have a startCharge and we want to bring the battery down to a minCharge. Thus, our availableCharge in this situation is just startCharge - minCharge.
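As a worked example of the two formulas, with made-up numbers in arbitrary charge units (the function names are just for illustration):

```python
def solar_available_charge(start_charge, max_charge, daily_charge_intake):
    """Charge that can be spent during the day while still ending at max_charge."""
    return daily_charge_intake - (max_charge - start_charge)


def lunar_available_charge(start_charge, min_charge):
    """Charge that can be spent overnight without dropping below min_charge."""
    return start_charge - min_charge


# Morning: battery at 200 of 1000, with 1500 units of intake predicted.
# 800 units go to refilling the battery, leaving 700 to spend.
solar = solar_available_charge(start_charge=200, max_charge=1000,
                               daily_charge_intake=1500)

# Evening: battery full at 1000, and we refuse to go below 200.
lunar = lunar_available_charge(start_charge=1000, min_charge=200)
```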

Power proportions

At the beginning of runtime, each tenant has a power proportion of 1/N where N is the number of tenants. At the start of a quantum, availableCharge is set to the present available charge of the system's battery, and each tenant's chargeUsed is set to 0. A process is allowed to use at most availableCharge * powerProportion charge in the coming quantum. At the end of a quantum, the power proportion of tenants is recalculated to 1 - (chargeUsed / (availableCharge * powerProportion)).
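Taking the recalculation formula exactly as written, a small numeric sketch looks like this. The function names and figures are illustrative only. Note that the result is the fraction of the allowance a tenant left unused, so the values across tenants need not sum to 1; whether they should be normalized afterwards is not specified here.

```python
def allowed_charge(available_charge, power_proportion):
    """Upper bound on what a tenant may use in the coming quantum."""
    return available_charge * power_proportion


def recalculated_proportion(charge_used, available_charge, power_proportion):
    """End-of-quantum update, following the formula in the text verbatim."""
    return 1 - charge_used / (available_charge * power_proportion)


n_tenants = 4
initial = 1 / n_tenants                  # 0.25 each at the beginning of runtime
limit = allowed_charge(100.0, initial)   # 25.0 units for the quantum

# A tenant that used half of its allowance, and one that used all of it.
after_half = recalculated_proportion(12.5, 100.0, initial)
after_all = recalculated_proportion(25.0, 100.0, initial)
```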

Code stuff

Kernel interaction

All this talk is great, but how are we going to control the power consumption of applications? The answer has two parts. The first is tracking CPU usage: processes can spend their power budget on computation (rather than communication), which consumes power in the form of CPU time. The second level of control is at the communication level. Processes will be blocked from accessing any external resources directly. On Linux, devices are communicated with via read and write system calls to specific device paths. If we prevent those calls from being made, we can force the process to go through sunneed to make the call, where we can perform additional bookkeeping on the process's power usage. (TODO: Should we use a full-on container like Docker to isolate the processes, or go with a more lightweight solution with seccomp-bpf?)

Resource profiler

NOTE: This is all tentative; anywhere I say something definitively, it is implied that the point is open for discussion.

Physical resource requests will constitute a significant portion of the power used. The model described above requires knowing the cost of a resource use in order to decide whether to grant or deny a request. Thus, the system should, at startup, perform a sample request on each device. This will require some investment on the part of the people deploying the sunneed-powered system; they will have to write code to register each physical resource they want to use. The deployer will write this code in a special file/function/whatever that is run on sunneed startup; sunneed will perform each given call, calculate the difference in remaining battery before and after the call, and then use that metric during its regular runtime, whenever the call is made, to check whether the call is within budget.

I see some issues with this approach. We would need a way to ensure constant (or near-constant) power usage bounds on the calls, or rather, the deployers themselves would have to make sure that the power cost doesn't fluctuate between subsequent calls. This might not be entirely a bad thing, as it encourages resource access functions that are granular and avoid excessive logic.
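The startup profiling idea can be sketched roughly as follows. Everything here is hypothetical: the `ResourceProfiler` class, its method names, and the simulated battery, which stands in for whatever battery monitor a real deployment would read.

```python
class ResourceProfiler:
    """Measures the charge cost of each registered device call once, at startup."""

    def __init__(self, read_battery):
        self._read_battery = read_battery  # callable returning remaining charge
        self._costs = {}

    def register(self, name, sample_call):
        # Run the deployer's sample call and record the charge it consumed.
        before = self._read_battery()
        sample_call()
        after = self._read_battery()
        self._costs[name] = before - after

    def cost_of(self, name):
        return self._costs[name]

    def within_budget(self, name, budget):
        # Used at regular runtime to decide whether to grant the call.
        return self._costs[name] <= budget


# Simulated battery; a real system would query the battery monitor instead.
battery = {"charge": 1000.0}


def take_photo():
    battery["charge"] -= 12.0  # pretend the camera drew 12 units


profiler = ResourceProfiler(lambda: battery["charge"])
profiler.register("camera", take_photo)
camera_cost = profiler.cost_of("camera")
```

A single sample is only meaningful if the call's cost is roughly constant, which is exactly the concern raised above.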

Scheduler

TODO: More solid distinction between clients and processes—should clients specify specific weights for budgets for individual processes?

TODO: Deeper implementation details; how should CPU usage be recorded?

TODO: Get a better sense of time scale, i.e. how often should sunneed check CPU usage, how small should quantums be

The scheduler portion of sunneed is quite simple. At the start of a quantum, the power budget for the quantum is calculated as described above. Each process is given a fraction of the power budget proportional to their tenant's investment. sunneed checks CPU usage of each tenant process and subtracts from their budget the CPU usage multiplied by the power cost of the CPU. If a process uses up all of its power budget, it is put to sleep until it gets its budget back when the next quantum rolls around.
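As one possible answer to the CPU-recording TODO, here is a rough sketch that reads CPU time from a Linux `/proc/<pid>/stat` line and charges it against a budget. The use of `SIGSTOP` to put a process to sleep, the `charge_per_tick` constant, and all function names are assumptions for illustration, not decisions made in this design.

```python
import os
import signal


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split on the last ')' first. The remainder starts at field 3,
    # which puts utime (field 14) and stime (field 15) at indexes 11 and 12.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]) + int(rest[12])


def charge_for_cpu(budget, prev_ticks, now_ticks, charge_per_tick):
    """Subtract the cost of the CPU time used since the last check."""
    return budget - (now_ticks - prev_ticks) * charge_per_tick


def enforce(pid, budget, send_signal=os.kill):
    """Pause a process whose budget is exhausted; True if it was stopped."""
    if budget <= 0:
        send_signal(pid, signal.SIGSTOP)  # resumed later with SIGCONT
        return True
    return False


# A made-up stat line: utime=250 and stime=50 ticks.
sample = ("1234 (my app) S 1 1234 1234 0 -1 4194304 100 0 0 0 "
          "250 50 0 0 20 0 1 0 100 1000 10")
ticks = cpu_ticks_from_stat(sample)
remaining = charge_for_cpu(budget=80.0, prev_ticks=100,
                           now_ticks=ticks, charge_per_tick=0.5)
```

With these numbers the process used 200 ticks since the last check, overspending its budget of 80 by 20 units, so `enforce` would stop it until the next quantum.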

Footnotes

1: Although certainly not within the realm of possibility anytime soon, I can theoretically see a system that doesn't monitor its battery, but instead uses known power consumption metrics to accurately predict the charge based on the total power consumed since a full charge. I don't know if there's any benefit to ditching the battery monitor, though.

2: There are some additional concerns, such as the fact that a battery at its maxCharge cannot be charged any further, so we would essentially be wasting dailyChargeIntake if we don't make use of it at the time. The dumb solution to this is pretty much to tell applications to go crazy and gobble up as much power as they want (for a brief duration) if the battery is at maxCharge during the daytime.