Add script checks #12
Changes from 8 commits
4aa5b29
dcfe397
2af902b
d900766
9de20d3
87fa1cb
770a126
2286fe7
02fe94c
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
# health-checker | ||
|
||
A simple HTTP server that will return `200 OK` if the given TCP ports are all successfully accepting connections. | ||
A simple HTTP server that will return `200 OK` if the configured checks are all successful. If any of the checks fail, | ||
it will return `HTTP 504 Gateway Not Found`. | ||
|
||
## Motivation | ||
|
||
|
@@ -14,15 +15,23 @@ a single TCP port, or an HTTP(S) endpoint. As a result, our use case just isn't | |
We wrote health-checker so that we could run a daemon on the server that reports the true health of the server by | ||
attempting to open a TCP connection to more than one port when it receives an inbound HTTP request on the given listener. | ||
|
||
Using the `--script` -option, the `health-checker` can be extended to check many other targets. One concrete exeample is monitoring | ||
`ZooKeeper` node status during rolling deployment. Just polling the `ZooKeeper`'s TCP client port doesn't necessarily guarantee | ||
that the node has (re-)joined the cluster. Using the `health-check` with a custom script target, we can | ||
[monitor ZooKeeper](https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_monitoring) using the | ||
[4 letter words](https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands), ensuring we report health back to the | ||
[Load Balancer](https://aws.amazon.com/documentation/elastic-load-balancing/) correctly. | ||
|
||
## How It Works | ||
|
||
When health-checker is started, it will listen for inbound HTTP requests for any URL on the IP address and port specified | ||
by `--listener`. When it receives a request, it will attempt to open TCP connections to each of the ports specified by | ||
an instance of `--port`. If all TCP connections succeed, it will return `HTTP 200 OK`. If any TCP connection fails, it | ||
will return `HTTP 504 Gateway Not Found`. | ||
an instance of `--port` and/or execute the script target specified by `--script`. If all configured checks - all TCP | ||
connections and zero exit status for the script - succeed, it will return `HTTP 200 OK`. If any of the checks fail, | ||
it will return `HTTP 504 Gateway Not Found`. | ||
|
||
Configure your AWS Health Check to only pass the Health Check on `HTTP 200 OK`. Now when an HTTP Health Check request | ||
comes in, all desired TCP ports will be checked. | ||
comes in, all desired TCP ports will be checked and the script target executed. | ||
|
||
For stability, we recommend running health-checker under a process supervisor such as [supervisord](http://supervisord.org/) | ||
or [systemd](https://www.freedesktop.org/wiki/Software/systemd/) to automatically restart health-checker in the unlikely | ||
|
@@ -46,9 +55,13 @@ health-checker [options] | |
| `--listener` | The IP address and port on which inbound HTTP connections will be accepted. | `0.0.0.0:5000` | ||
| `--log-level` | Set the log level to LEVEL. Must be one of: `panic`, `fatal`, `error`, `warning`, `info`, or `debug` | `info` | ||
| `--help` | Show the help screen | | | ||
| `--script` | Path to script to run - will pass if it completes within configured timeout with a zero exit status. Specify one or more times. | | | ||
| `--script-timeout` | Timeout, in seconds, to wait for the scripts to exit. Applies to all configured script targets. | `5` | | ||
| `--version` | Show the program's version | | | ||
|
||
#### Example | ||
If you execute a shell script, ensure you have a `shebang` line in your script, otherwise the script will fail with an `exec format error`. | ||
Comment: Could our code catch that? Seems like a simple
Reply: The catch here is that we can execute targets other than shell scripts. We would first have to identify the script as a shell script (via the extension or some other way) and only then check the prefix within it. I think it would unnecessarily clutter the code and potentially lead to even more confusing errors 😄 |
||
|
||
#### Example 1 | ||
|
||
Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received, | ||
attempt to open TCP connections to port 5432 and 3306. If both succeed, return `HTTP 200 OK`. If any fails, return `HTTP | ||
|
@@ -58,3 +71,23 @@ attempt to open TCP connections to port 5432 and 3306. If both succeed, return ` | |
health-checker --listener "0.0.0.0:6000" --port 5432 --port 3306 | ||
``` | ||
|
||
#### Example 2 | ||
|
||
Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received, | ||
attempt to open a TCP connection to port 5432 and run the script with a 10-second timeout. If the TCP connection succeeds and the script exits with code zero, return `HTTP 200 OK`. If the TCP connection fails or the script exits with a non-zero code, return `HTTP | ||
504 Gateway Not Found`. | ||
|
||
``` | ||
health-checker --listener "0.0.0.0:6000" --port 5432 --script /path/to/script.sh --script-timeout 10 | ||
``` | ||
|
||
#### Example 3 | ||
|
||
Run a listener on port 6000 that accepts all inbound HTTP connections for any URL. When the request is received, | ||
attempt to run the configured scripts. If both return exit code zero, return `HTTP 200 OK`. If either returns a non-zero exit code, return `HTTP | ||
504 Gateway Not Found`. | ||
|
||
``` | ||
health-checker --listener "0.0.0.0:6000" --script "/usr/local/bin/exhibitor-health-check.sh --exhibitor-port 8080" --script "/usr/local/bin/zk-health-check.sh --zk-port 2191" | ||
``` | ||
|
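The README changes above describe the check flow: open a TCP connection to each `--port`, run each `--script` with a timeout, and answer `200` only if every check passes. The server code itself is not part of this diff, so the following is only an illustrative sketch of that flow, not the implementation from this PR. The names `checkTCP`, `checkScript`, and `statusFor` are hypothetical, dialing `127.0.0.1` is an assumption, and using `exec.CommandContext` to enforce the timeout is one reasonable approach among several.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"os/exec"
	"time"
)

// checkTCP (hypothetical helper) tries to open a TCP connection to the given
// local port and gives up after timeoutSec seconds.
func checkTCP(port int, timeoutSec int) error {
	addr := fmt.Sprintf("127.0.0.1:%d", port) // assumption: checks target the local host
	conn, err := net.DialTimeout("tcp", addr, time.Duration(timeoutSec)*time.Second)
	if err != nil {
		return err
	}
	return conn.Close()
}

// checkScript (hypothetical helper) runs the script and returns an error if it
// cannot be started, exits with a non-zero status, or exceeds the timeout.
func checkScript(name string, args []string, timeoutSec int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutSec)*time.Second)
	defer cancel()
	// CommandContext kills the process when the context deadline passes.
	return exec.CommandContext(ctx, name, args...).Run()
}

// statusFor maps the check results to the HTTP status the README describes:
// 200 if every check passed, 504 otherwise.
func statusFor(results ...error) int {
	for _, err := range results {
		if err != nil {
			// net/http names status 504 "Gateway Timeout".
			return http.StatusGatewayTimeout
		}
	}
	return http.StatusOK
}

func main() {
	// Open a throwaway listener so the TCP check has something to connect to.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	port := ln.Addr().(*net.TCPAddr).Port

	tcpErr := checkTCP(port, 1)
	// Hypothetical path that does not exist, so this check fails.
	scriptErr := checkScript("/nonexistent/health-check.sh", nil, 1)

	fmt.Println("TCP only:", statusFor(tcpErr))
	fmt.Println("TCP and script:", statusFor(tcpErr, scriptErr))
}
```

Passing the timeout in whole seconds mirrors the `--script-timeout` option, which the options table documents as a value in seconds.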
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,21 +2,33 @@ package commands | |
|
||
import ( | ||
"fmt" | ||
"github.com/gruntwork-io/health-checker/options" | ||
"github.com/gruntwork-io/gruntwork-cli/logging" | ||
"github.com/urfave/cli" | ||
"github.com/gruntwork-io/health-checker/options" | ||
"github.com/sirupsen/logrus" | ||
"github.com/urfave/cli" | ||
"os" | ||
"strings" | ||
) | ||
|
||
const DEFAULT_LISTENER_IP_ADDRESS = "0.0.0.0" | ||
const DEFAULT_LISTENER_PORT = 5500 | ||
const DEFAULT_SCRIPT_TIMEOUT_SEC = 5 | ||
const ENV_VAR_NAME_DEBUG_MODE = "HEALTH_CHECKER_DEBUG" | ||
|
||
var portFlag = cli.IntSliceFlag{ | ||
Name: "port", | ||
Usage: fmt.Sprintf("[Required] The port number on which a TCP connection will be attempted. Specify one or more times. Example: 8000"), | ||
Name: "port", | ||
Usage: fmt.Sprintf("[One of port/script Required] The port number on which a TCP connection will be attempted. Specify one or more times. Example: 8000"), | ||
} | ||
|
||
var scriptFlag = cli.StringSliceFlag{ | ||
Name: "script", | ||
Usage: fmt.Sprintf("[One of port/script Required] The path to script that will be run. Specify one or more times. Example: \"/usr/local/bin/health-check.sh --http-port 8000\""), | ||
} | ||
|
||
var scriptTimeoutFlag = cli.IntFlag{ | ||
Name: "script-timeout", | ||
Usage: fmt.Sprintf("[Optional] Timeout, in seconds, to wait for the scripts to complete. Example: 10"), | ||
Value: DEFAULT_SCRIPT_TIMEOUT_SEC, | ||
} | ||
|
||
var listenerFlag = cli.StringFlag{ | ||
|
@@ -33,6 +45,8 @@ var logLevelFlag = cli.StringFlag{ | |
|
||
var defaultFlags = []cli.Flag{ | ||
portFlag, | ||
scriptFlag, | ||
scriptTimeoutFlag, | ||
listenerFlag, | ||
logLevelFlag, | ||
} | ||
|
@@ -58,19 +72,27 @@ func parseOptions(cliContext *cli.Context) (*options.Options, error) { | |
logger.SetLevel(level) | ||
|
||
ports := cliContext.IntSlice("port") | ||
if len(ports) == 0 { | ||
return nil, MissingParam(portFlag.Name) | ||
|
||
scriptArr := cliContext.StringSlice("script") | ||
scripts := options.ParseScripts(scriptArr) | ||
|
||
if len(ports) == 0 && len(scripts) == 0 { | ||
return nil, OneOfParamsRequired{portFlag.Name, scriptFlag.Name} | ||
Comment: Nice error handling 👍 |
||
} | ||
|
||
scriptTimeout := cliContext.Int("script-timeout") | ||
|
||
listener := cliContext.String("listener") | ||
if listener == "" { | ||
return nil, MissingParam(listenerFlag.Name) | ||
} | ||
|
||
return &options.Options{ | ||
Ports: ports, | ||
Listener: listener, | ||
Logger: logger, | ||
Ports: ports, | ||
Scripts: scripts, | ||
ScriptTimeout: scriptTimeout, | ||
Listener: listener, | ||
Logger: logger, | ||
}, nil | ||
} | ||
|
||
|
@@ -95,4 +117,13 @@ type MissingParam string | |
|
||
func (paramName MissingParam) Error() string { | ||
return fmt.Sprintf("Missing required parameter --%s", string(paramName)) | ||
} | ||
} | ||
|
||
type OneOfParamsRequired struct { | ||
param1 string | ||
param2 string | ||
} | ||
|
||
func (paramNames OneOfParamsRequired) Error() string { | ||
return fmt.Sprintf("Missing required parameter, one of --%s / --%s required", string(paramNames.param1), string(paramNames.param2)) | ||
} |
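`options.ParseScripts` is called in `parseOptions` above, but its body is not shown in this part of the diff. Example 3 in the README passes a script and its arguments as one quoted string, so one plausible shape for such a helper is to split each `--script` value on whitespace into a command and its arguments. The sketch below assumes exactly that. `Script` and `parseScripts` are hypothetical names, and the actual helper in the PR may behave differently (for instance, this version does not support quoted arguments that contain spaces).

```go
package main

import (
	"fmt"
	"strings"
)

// Script is a hypothetical holder for one parsed --script value.
type Script struct {
	Name string   // path to the executable
	Args []string // arguments passed to it
}

// parseScripts splits each raw --script value on whitespace. The first field
// becomes the command and the rest become its arguments. Empty or
// whitespace-only values are skipped.
func parseScripts(raw []string) []Script {
	scripts := make([]Script, 0, len(raw))
	for _, value := range raw {
		fields := strings.Fields(value)
		if len(fields) == 0 {
			continue
		}
		scripts = append(scripts, Script{Name: fields[0], Args: fields[1:]})
	}
	return scripts
}

func main() {
	raw := []string{
		"/usr/local/bin/exhibitor-health-check.sh --exhibitor-port 8080",
		"/usr/local/bin/zk-health-check.sh --zk-port 2191",
	}
	for _, s := range parseScripts(raw) {
		fmt.Printf("command=%s args=%q\n", s.Name, s.Args)
	}
}
```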
Comment: s/exeample/example/