Is it possible to log program crashes to the system log? #192

Open
alankm opened this issue Nov 19, 2019 · 3 comments

alankm commented Nov 19, 2019

I have a program that uses lots of goroutines, and I suspect that a rare panic is possible. The program runs as a Windows service, and as far as I can tell that means there is no way to read stderr and see the stack trace when it crashes (I'm not a Windows expert).

I've been investigating how to capture one of these crashes. The brute-force approach would be to put recovers everywhere, but this program is very complex and the crash is rare. Is there a better way to track down the issue?

I've got two ideas right now, neither of which is ideal. The first is to do something equivalent to dup2 to redirect stderr to a file, and bypass the Windows service logs entirely. The second is to make the service program a dumb wrapper that spawns the real program as a child process, then reads its stderr and logs it.
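For reference, the second idea (a wrapper that spawns the real program and reads its stderr) could look roughly like the sketch below. It is only an illustration: `runAndCaptureStderr` is a made-up helper name, the child program path is taken from the command line, and a real service would forward the captured text to the event log rather than to `log`.

```go
package main

import (
	"bytes"
	"log"
	"os"
	"os/exec"
)

// runAndCaptureStderr (hypothetical helper) starts the real program as a child
// process and returns everything it wrote to stderr. Because the Go runtime
// prints an unrecovered panic and its stack trace to stderr before exiting,
// that text ends up in the returned string.
func runAndCaptureStderr(name string, args ...string) (string, error) {
	var stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = &stderr
	err := cmd.Run() // non-nil if the child could not start or exited non-zero
	return stderr.String(), err
}

func main() {
	if len(os.Args) < 2 {
		log.Println("usage: wrapper <program> [args...]")
		return
	}
	out, err := runAndCaptureStderr(os.Args[1], os.Args[2:]...)
	if err != nil {
		// In a real service this is where the text would go to the event log.
		log.Printf("child exited abnormally: %v\nstderr:\n%s", err, out)
	}
}
```

The buffer holds all of the child's stderr in memory, so a long-running child that logs heavily to stderr would need a bounded or streaming reader instead.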

@shrinidhi111

For logs within the service you can of course log to a file.

But afaik if the service itself crashes, Windows logs the message/error code to the system events (viewable in Event Viewer). You could perhaps refactor your code to exit with a specific error code depending on where the panic happens, then cross-check that code against the system events.
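A minimal sketch of the exit-code idea, assuming the program can be split into stages that each run on the calling goroutine. The names `runStage`, `exitNetworkPanic` and `exitStoragePanic` are invented for illustration, and whether the code actually appears in Event Viewer depends on how the service reports its exit.

```go
package main

import (
	"fmt"
	"os"
)

// Hypothetical exit codes, one per subsystem, so the code alone hints at
// where the panic happened.
const (
	exitOK           = 0
	exitNetworkPanic = 10
	exitStoragePanic = 11
)

// runStage runs fn and, if fn panics on this goroutine, returns code instead
// of letting the panic crash the process. It returns exitOK otherwise.
func runStage(code int, fn func()) (exit int) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Fprintf(os.Stderr, "panic in stage (exit code %d): %v\n", code, r)
			exit = code
		}
	}()
	fn()
	return exitOK
}

func main() {
	stages := []struct {
		code int
		fn   func()
	}{
		{exitNetworkPanic, func() { /* network setup would go here */ }},
		{exitStoragePanic, func() { /* storage setup would go here */ }},
	}
	for _, s := range stages {
		if c := runStage(s.code, s.fn); c != exitOK {
			os.Exit(c)
		}
	}
}
```

This only catches panics raised on the goroutine that calls `runStage`; a panic in a goroutine started by `fn` still terminates the process with the runtime's default exit status.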

@jeremyvisser

> But afaik if the service itself crashes, Windows logs the message/error code to system events

Based on my experience yesterday, I encountered a hard-to-debug problem where a panic occurred but wasn't logged to the event logs.

Windows of course detected the crash, and logged the fact that the service crashed, but the panic itself (and associated stack trace) was never logged. All I got was this log from Windows:

The <foo> service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 30000 milliseconds: Restart the service.

When I ran the service interactively from a terminal, the actual error and panic were printed to stderr, but it appears Windows does not capture stdout/stderr from a service process. (Which makes sense; event logs are the defined way to log things for services, not stdout/stderr.)

I'm not sure whether the correct solution is to perform a recover() and try to print the panic to the nearest windowsService.Logger(), or whether the solution is to override os.Stdout and os.Stderr, pointing them to the event logger.

Either way, letting panics go silently unreported is a bad state of affairs.

jeremyvisser commented Aug 10, 2022

Hmmm, doing some further reading I found golang/go#42888 where it looks like the issue is more complex.

Basically, panic() does not write to os.Stderr (the runtime writes directly to fd 2), so reassigning os.Stderr has no effect on panic output, and recover() only works for the current goroutine. So it's really impractical to solve this inside this module.

In the upstream issue, they are looking at solving this inside the Go runtime (and/or the x/sys/windows package) by writing to the event logs when a panic occurs, possibly requiring a call to a SetCrashEvent() function. That sounds like a much better thing to wait for, since it will catch very low-level problems.
