Mac OS X Sonoma / Sequoia on Mac Mini M2 - NUT 2.8.2 upsdrvctl / usbhid-ups fail to fork the driver as a daemon process. #2642
Comments
Can you please try that later command with more verbosity e.g. Trying to reproduce it on a VM from the CI farm, so with a
So the part where your log ends, in mine is followed by the |
https://stackoverflow.com/questions/31045575/how-to-trace-system-calls-of-a-program-in-mac-os-x has some suggestions about tracing programs (ktrace, dtruss, lowering system protection to be able to trace). Maybe this would expose what the forking in your system blocks on. The CI farm worker is on Monterey, I think, so maybe something in the platform has changed?.. |
Thanks Jim for looking at this. Here's the full log. Want me to try the dummy-ups on Sequoia? What ups.conf are you using? sudo -E ../bin/usbhid-ups -a CP1500PFCLCD -DDDDDD -B -u root |
BTW, I also disabled System Integrity Protection ... I have the same issue with it enabled or disabled. |
I tried your dummy-ups and it still fails on the fork. I'll have to look into dtrace. sudo -E ../bin/dummy-ups -a dummyups -DDDDDD -B -u root |
For dummy-ups setup, I used NUT sources and prepared the prerequisites for MacOS/Homebrew combo per https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests and referenced docs. Then after building NUT, I just Afterwards I killed one of the spawned dummy-ups'es and started one with that config name manually. |
Another quick update ... Using dtrace for fork entry / exit results ... From the results below, it looks like the fork return is actually successful and returns PID of the forked usbhid-ups which matches the PID found in the usbhid-ups-CP1500PFCLCD.PID file. So I assume the child process is created and then must crash after forking.
syscall::fork:entry ' matched 10 probes
4 173 fork:return 3764 sudo fork returned 3765
5 173 fork:return 3765 usbhid-ups fork returned 3766 |
Thinking of it, your "2.8.2" version of NUT is likely from packaging? Are you in a position to build from git per the instructions linked above? On one hand, there were some fixes and changes generally since the last release; on the other, maybe the tooling/settings/libs used for the package build differ from whatever the documented Homebrew-based approach provides. |
Further investigation ... So it appears that the usbhid-ups driver is crashing on a null pointer (Faulting instruction pointer: 0). Based on the error info, I assume it is using dynamic function pointer(s) into Core Foundation. I'm seeing this in the write sys call, and I assume it is being written to an error or system log. But I looked in the logs and did not see any errors before I started debugging. My take is that some functionality in CoreFoundation can't be used by a forked process. It sounds like it needs an exec() call in order to use that functionality. I will look to see if there are any limitations on using fork with Apple libraries.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 176 write:entry PID 16949 TID 1526720 syscall write called
1 395514 .:fault Fault detected in process: usbhid-ups (PID: 16949, TID: 1526720)
1 395786 .:exit Process exited: PID=16949, TID=1526720, Executable=usbhid-ups, Exit Status=1 |
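The traces above suggest that fork() itself succeeds and the child dies shortly afterwards. As a rough illustration of how a parent can tell those cases apart without dtrace, here is a minimal C sketch. It is not NUT code: `run_in_child`, `child_ok` and `child_crashes` are hypothetical names made up for this example, and a real daemonizing parent would normally not wait for its child like this.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of NUT): run `work` in a forked child
 * and report how the child ended.
 * Returns 0..255 for a normal exit status, -(signal number) if the
 * child was killed by a signal, and -1000 if fork()/waitpid() failed. */
int run_in_child(void (*work)(void))
{
    assert(work != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1000;            /* fork itself failed */

    if (pid == 0) {              /* child */
        work();
        _exit(0);                /* skip the parent's atexit handlers */
    }

    int status = 0;              /* parent: wait for the child */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1000;
    }
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1000;
}

/* Two stand-in workloads, for illustration only. */
void child_ok(void)      { /* nothing to do */ }
void child_crashes(void) { abort(); }   /* terminates via SIGABRT */
```

A non-zero or negative result from `run_in_child` means the fork worked but the child did not survive, which is the situation the dtrace output points at.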
Let me know if someone can make these changes. If not, I'll need to spend some time getting a build/debug environment set up to build NUT for Mac. I'm not really familiar with the requirements to build/test any changes. Thanks,
The following is from ChatGPT, so that needs to be taken into account ... but it sounds reasonable, especially since I was getting those messages in my dtrace output.
The error message "The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec()" is typically seen on macOS (or any system using CoreFoundation) when a process uses certain CoreFoundation functions after a fork() but before an exec(). Why This Happens: Key Points: Call exec() After fork(): pid_t pid = fork(); |
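The quoted snippet stops at `pid_t pid = fork();`. For readers unfamiliar with the pattern, a minimal sketch of "exec right after fork" in plain POSIX C might look like the following. This is only an illustration under assumptions, not what NUT does or would do: `spawn_and_wait` is a hypothetical helper, and the parent waits here only so the result can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not NUT code): fork, have the child replace
 * itself with argv[0] via execvp() straight away, and wait in the
 * parent. Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0) {
        /* Child: do nothing between fork() and exec(), so that no
         * library state inherited from the parent gets touched. */
        execvp(argv[0], argv);
        _exit(127);              /* only reached if exec failed */
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The price of this approach is that the exec'ed program starts from scratch, so anything the parent had already set up has to be passed along (arguments, environment, inherited file descriptors) or redone.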
Not really versed with Mac ecosystem, sorry. |
OK ... do you have any references/links on how to set up a build/debug environment for MacPorts and how someone goes about submitting / recommending any changes? I don't have a full suite of Mac OS versions, so I would be limited to testing on a small number of systems / OS versions. |
If ChatGPT is right :) then we might have to add some I have no firm idea what it would do to the C variables already collected and expected to be populated post-detachment, but probably nothing good, given that it would cause execution of a new process from scratch, essentially.
Some research may be needed to check what exactly is not fork-safe, perhaps it may suffice to close/open the syslog socket, or something. Given that forking to detach from console and go into background (or other sort of multi-threading purposes) is common practice for decades, probably there's a gazillion other programs that have solved this one way or another, including open-source ones to take inspiration from. I'd probably start at known multi-process stuff like Apache httpd, perhaps nginx or sendmail - staples of the ecosystem.
Sadly, so far I have zero knowledge of MacPorts beside the name; setting up a cloud VM for CI and going with HomeBrew as the first option I saw is as much as I know of the platform. That said, if you do unravel this - additions to If those two ecosystems can co-exist on the same VM - I might add scenarios to NUT CI farm as well then... |
Also, if the two ecosystems can co-exist (or if you can spawn a Sonoma VM for experiments), and if you can check if current NUT source built with Homebrew works or not, it can help rule out OS problems (something in common libs shared by both ecosystems) vs. ecosystem problems (e.g. HomeBrew might have libs with an implementation of |
Why not just run the driver in the foreground under launchd? I thought libusb had some better protections against using CF in the wrong process, but I also haven't been doing much OS X + USB in the past few years. |
Exactly what I'm doing now ... but without any documentation that I could find, it took a while to get to this point and figure out why upsdrvctl and usbhid-ups failed. BTW, the forking in upsd and upsmon doesn't have the same issue. In this case, I'm using the -FF option to force generating the PID file so that my LaunchDaemon can shut down the usbhid-ups driver. |
I ran into this, using the brew package. (As well as some additional issues.) I've been tearing my hair out trying to figure out why I can't get the driver and upsd to run in the background. It's almost a relief that there is a bug of some kind. Like the OP, I am able to run upsdrvctl (only in the foreground):
And, I am also able to run upsd:
However, when I try to stop upsdrvctl, I do get a sigterm, but I also get this:
There should be a .pid file, shouldn't there? And I don't know if it's related, but I also am baffled by the permissions. I seem to only be able to start upsdrvctl (or usbhid-ups directly) and upsd successfully if I use sudo and -u root. What am I doing wrong (but almost right)? |
And, by the by, how did you get upsdrvctl and upsd to run as launch agents? I've been struggling with that too... my .plists seem like they should work, but they're not. Maybe a permissions thing again? |
Can't really speak to MacOS as a platform to run things; in the cloud I only have a (Monterey?) VM to build them in CI. It may well be that by Sonoma (v12 vs. v14 major) they changed a lot of things that I might test on that VM and would "work for me" there :\ That said, so now regarding the general NUT code and behavior:
I am not sure if and how Apple limits access to USB devices. On most of the Unix-like platforms, there is eventually a device file system and each device has a file-like "node" there, with POSIX permissions attached (maybe ACL somewhere, haven't seen that). Most frequently the problem about starting not as With USB there may be a separate layer of complexity, that some other program or kernel handler grabbed the device node (especially if it poses as a HID - Human Interface Device) exclusively, and so won't let the NUT driver have a piece. This is also part that |
Unfortunately, my practice with the platform is limited to setting up the CI build agent. That said, I think the experience should be all documented at https://networkupstools.org/docs/user-manual.chunked/_build_prerequisites_to_make_nut_from_scratch_on_various_operating_systems.html#_macos_with_homebrew for the platform setup (with Homebrew as the one I tried - a chapter with Fink etc. would be welcome), and then running
|
With the tests prepared, you can This would at least rule out where exactly the forking problem is - NUT daemons themselves, or some third party code (e.g. libusb vs. CF as suggested above). With Note that depending on tooling, your actual libs and programs may be in |
So, the command
does start usbhid-ups in the foreground successfully, and it does spawn
with the correct process ID, owned by system. And, the foreground process receives a signal 15: exiting. But I still get
Maybe that's an erroneous error message? How else could the process be terminated successfully? |
I get the same behavior with MacPorts ... I also wondered about this message, and concluded that it is expected behavior. I've noticed that I don't get the fopen message until the upsdrvctl run with start has already exited, which happens before the upsdrvctl run with stop prints the message. I concluded that the originally started upsdrvctl, once it receives the SIGTERM, checks for and removes the PID file. The upsdrvctl executed with the stop command follows the same path on exit to clean up the PID file, but the file has already been removed, so not finding it is expected if all goes well. I didn't pursue it further since it works without any issues.
BTW, I'm using MacPorts daemondo to start and stop both upsdrvctl and upsd in the background on Mac OS X boot and shutdown, which works out really well for clean startup and shutdown. I've uploaded what I'm using to start and stop at OS boot / shutdown. org.macports.upsdrvctl.plist.txt |
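To make the "already removed, so not finding it is fine" reasoning concrete, here is a minimal C sketch of PID-file cleanup that treats a missing file as success. This is an assumption about how such cleanup could be written, not NUT's actual code; `remove_pidfile` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not NUT's actual cleanup code).
 * Returns 0 if the PID file was removed or was already gone,
 * and -1 on any other error (permissions, I/O, ...). */
int remove_pidfile(const char *path)
{
    assert(path != NULL);

    if (unlink(path) == 0)
        return 0;                /* we removed it */
    if (errno == ENOENT)
        return 0;                /* someone else cleaned up first */

    perror("remove_pidfile");    /* anything else is worth reporting */
    return -1;
}
```

With this shape, two processes racing to clean up the same PID file both finish quietly, and only unexpected failures produce a message.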
If you ran those tools and drivers with debug, it would be a bit less of guesswork. But in recent releases, NUT grew some ways to not depend on PID files (e.g. using the same socket/pipe as the one for driver-to- Conversely, recent versions also try to check the file name associated with a PID, to not signal/kill unsuspecting bystanders that happen to have the same PID (e.g. if a PID-file is written into location which survives a reboot). |
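As a rough illustration of the PID-file side of this, the sketch below reads a PID from a file and asks the kernel whether such a process exists, using the standard `kill(pid, 0)` probe. It covers only the existence check; comparing the program name behind the PID is platform-specific and is left out. `pidfile_process_exists` is a hypothetical name and this is not how NUT implements it.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not NUT code).
 * Returns 1 if the PID recorded in `path` names an existing process,
 * 0 if no such process exists, and -1 if the file cannot be read or
 * does not contain a usable PID. */
int pidfile_process_exists(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long pid = 0;
    int parsed = fscanf(f, "%ld", &pid);
    fclose(f);

    /* PIDs <= 0 would address process groups in kill(), so reject them. */
    if (parsed != 1 || pid <= 0)
        return -1;

    /* Signal 0 performs the permission and existence checks only. */
    if (kill((pid_t)pid, 0) == 0)
        return 1;
    return (errno == EPERM) ? 1 : 0;   /* EPERM: exists, not ours */
}
```

An existence check alone cannot tell a driver apart from an unrelated process that happens to reuse the PID, which is why the additional name check mentioned above matters.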
Hm, I wonder which program and when complains about |
@pjkerly : would you care to make a PR in your name, e.g. adding these files and a README to a new |
What's a PR? I'm not really familiar with MacPorts other than as a user. I would have attempted to actually fix both upsdrvctl and upsd (at least for the Mac OS fork bug), but I'm not familiar with how to set up a build environment or with the process for getting changes accepted, nor can I test on many different platforms. |
OK ... So a PR ... Pull Request. Let me look into it. Would you be the reviewer / maintainer to merge in? |
Yes. The "how to set up" is in this thread less than 10 posts above, at least as much as I know about MacOS as seen on workers for CI :) |
OK. I managed to get launch agents to run successfully, but I had to add a user-level sudo permission for upsd & upsdrvctl in order to do it. There is something funky that I'm definitely not seeing/understanding/both about the permissions. This is almost assuredly a dumb question, but how do I force the upsdrvctl agent to run before the upsd agent? Finally, after many days of farting around with this, it appears that the CyberPower oem software service can't connect to the UPS concurrently with the NUT service. What I wanted, was for the oem software to alert me with its (local) email notification service (is there another way to get this kind of notification?), and then be able to use an app I found (Mac, iOS) that connects to NUT so I can get real-time status information. Because, of course, the Cyber Power cloud service that does this is both 1) incredibly ugly and 2) behind a paywall. Ugh. |
FYI https://apps.apple.com/us/app/ups-power-monitor/id1500563567 https://apps.apple.com/us/app/ups-power-monitor/id1500180529?mt=12 I don't see any specific information regarding notifications... |
Order shouldn't matter ... They should sync up in either order. I used the OEM software but haven't since switching. Have you looked at upsmon? It can use a command option to send e-mail (or a text). Or have you looked at the UPS Power Monitor and Power Guard (from the App Store for Mac)? I personally have a Home Assistant setup with an integration for UPS ... |
Oh I didn't know upsmon could do that. Christ, another NUT service to try to figure out how to run... |
several options for command line mail notifications ... |
upsmon was easier since it doesn't have the same fork() bug that upsd and upsdrvctl have. But I still used the daemondo, plist, and wrapper approach just to keep all of them consistent in implementation. |
I'll be off for today but it seems you've started a fruitful discussion here, so just adding a few bits :)
On most systems only one program can fully attach to a USB device (or serial port), so yes - either one or another. Part of the issues discussed above with As for |
actually, I was just running it for upsd; the upsmon is superfluous for connecting to the software mentioned above. MacOS has a facility in the OS which talks to a UPS and can gracefully shut down. It just doesn't expose any of the granular information nor is it available remotely. But, I will now take a look at using upsmon as well and rolling my own notification, I guess. I gotta say, for a noob this is right on the border of what I can (maybe) manage... Not the upsmon specifically, just the whole thing. NUT is finicky AF. |
I spent many days debugging / root causing the background / fork() issue, so I understand. I use upsmon because I actually have two systems plugged into my UPS. One is a server which I want to run longer. The other isn't critical, so if the UPS has been running on battery for ~10 minutes, I have it shut down more quickly to conserve battery. The server, on the other hand, runs longer, until the battery is down to the last 15%; then it cleanly shuts down a VM which is running my Home Assistant, and shuts down the server cleanly. I can check the status of the UPS through my Home Assistant remotely. Notification is actually secondary, because if I lose power it's unlikely I will be able to send any notifications. I also have my internet router and cable modem on the UPS, but the ISP may also be impacted. But if the internet is still available, it notifies me if running on battery for more than 10 minutes and just before shutdown. When power is restored, I have the UPS send power to the devices only if the power has been restored continuously for more than 10 minutes. |
Nice. I don't need any of that. This started off as just, hmmm I wonder if I can NOT use the Cyber Power dumpster fire OEM software, that also has a subscription (of course it does). I wasn't really intending to get eyeballs deep in a terminal / launch agent project... But here we are. |
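The two-tier policy described above (non-critical host off after roughly 10 minutes on battery, server off at 15% charge) can be written down as a small decision function. This C sketch is purely illustrative: the names, the enum and the exact thresholds are assumptions taken from the description, and in a real NUT setup this logic would live in upsmon configuration and scripts rather than in custom code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical roles for the two machines described in the comment. */
enum host_role { HOST_NONCRITICAL, HOST_SERVER };

/* Assumed thresholds, taken from the description above. */
enum {
    NONCRITICAL_MAX_SECONDS_ON_BATTERY = 10 * 60, /* ~10 minutes */
    SERVER_MIN_CHARGE_PERCENT = 15                /* last 15%    */
};

/* Returns true if a host with the given role should start shutting down. */
bool should_shut_down(enum host_role role, bool on_battery,
                      int seconds_on_battery, int charge_percent)
{
    assert(seconds_on_battery >= 0);
    assert(charge_percent >= 0 && charge_percent <= 100);

    if (!on_battery)
        return false;            /* mains power: nothing to do */

    if (role == HOST_NONCRITICAL)
        return seconds_on_battery >= NONCRITICAL_MAX_SECONDS_ON_BATTERY;

    /* The server keeps running until the battery is nearly drained. */
    return charge_percent <= SERVER_MIN_CHARGE_PERCENT;
}
```

Keeping the time-based and charge-based rules separate per role is what lets the non-critical machine free up battery capacity early for the server.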
All right. Well, I've got upsmon going as well, set up to text me on NOTIFY events as per upsmon.conf. I won't lie it was pretty satisfying. It's not perfectly clean, as I'm basically sending messages to myself through iMessage, but I think I can live with that. Better than emails. Yuck. AND I even got the UPS Power Monitor software working on my phone (bounced through my DDNS). So that's pretty neat. I'm not super pumped about running the three processes via sudo, but I don't see what harm it could cause...? |
Semi-related question. I did the pull-the-plug test and I got the text messages (huzzah) but I noticed my run-time dropped from 47 min to 20 minutes and the unit was unplugged for like 30s. I assume that means the battery is toast? (It is 6 years old...) Do you have any experience or opinions about putting a LiFePO4 battery in place of a Pb-acid one? I'm thinking specifically of the Dakota Lithium: I pull about 100 watts at normal browsing usage and maybe 150 if both PCs are running at the same time at idle. Maybe 230+ if I'm doing stuff? What happens to a UPS with a 12 V Dakota battery (rated to max 20 A) if it tries to pull more than 240 watts...? |
Run-time estimates depend on calibration results that should be done with the same sort of load as you're running, and repeated (because batteries do age). If the device supports such a test, you may be able to |
It's not unusual for a UPS to drop its estimated runtime soon after it switches to battery, particularly for an older battery as it ages. You have to start with what your requirements / needs are. My requirements are simple...
1) A must - Prevent system corruption from catastrophic and instantaneous power failures.
2) A must / highly desirable - Bridge power interruptions / flickers that last a few seconds to tens of minutes. If the power interruption lasts more than ~10 minutes, it's likely to be longer than my UPS will support anyway.
3) Nice to have - Maintain internet connectivity for as long as possible for use with laptop / iPad / iPhones, with no specific time-frame.
If your projected run time drops from 47 minutes to 20 minutes, then I wouldn't call the battery toast. After the initial drop, does the runtime seem to decrease at a constant rate with wall clock time? A shutdown sequence should be on the order of 60 secs or so, depending on what you are running. And 20 minutes is definitely enough time to bridge power flickers/fluctuations during storms as well as support clean shutdown. But your requirements on how long you want to run at your desired load will dictate the size and performance of the UPS you need. I'm not familiar with the battery you listed nor whether it is compatible with your UPS. I have a CyberPower CP1500PFCLCD UPS from Costco which meets my needs. I can find replacement batteries on Amazon for it but haven't done so. My system is also probably ~4 yrs old. I also see a good drop in initial runtime right after switching to battery. |
From the product page: "LiFePO4 charger recommended", which I'm guessing doesn't include most consumer Pb-acid UPSes. 6 years is beyond the usual 3-5 year lifetime for a UPS battery. Maybe 20 minutes is still sufficient for your purposes, but it's like driving on old tires: they're not going to get any better over time, and you've hit the wear indicator. |
Mac OS X Sonoma / Sequoia (I have not tried other OS versions) on Mac Mini M2 - NUT 2.8.2 - I can start both upsdrvctl / usbhid-ups as long as it runs in the foreground. As long as I use -D -F -FF so that it runs in the foreground, all works as expected. If I try to run in background mode, either by default or with -B, the driver will find the UPS device and run correctly all the way up until it forks the process. Right now, I have a workaround, but it does not work as expected.
Then I can stop it successfully using sudo ../sbin/upsdrvctl stop, which means that it is generating the PID file successfully and can use it to stop the driver.
I can also start the usbhid-ups driver manually, as long as it runs in the foreground. You can see that the driver does find the UPS device and processes correctly right up to the point where it would fork the process.