scope_probe_garden failed on diego-cell #5

Open
therealqxing opened this issue Mar 28, 2018 · 8 comments

Comments

@therealqxing

weave-scope-release: 0.0.17

I deployed Cloud Foundry on top of vSphere using cf-deployment. All other components work well with weave-scope; only diego-cell fails.

diego-cell/602c1bca-0041-43d9-9d30-b9d0bc90fae5:~# monit summary
The Monit daemon 5.2.5 uptime: 10m

Process 'consul_agent'              running
Process 'garden'                    running
Process 'rep'                       running
Process 'route_emitter'             running
Process 'netmon'                    running
Process 'vxlan-policy-agent'        running
Process 'silk-daemon'               running
Process 'nfsv3driver'               running
Process 'metron_agent'              running
Process 'scope_probe'               running
Process 'scope_probe_bosh'          running
Process 'scope_probe_garden'        Does not exist
System 'system_localhost'           running

diego-cell/602c1bca-0041-43d9-9d30-b9d0bc90fae5:~# tail -f /var/vcap/sys/log/scope/probe.stderr.log
<probe> ERRO: 2018/03/28 15:33:42.594926 plugins: /var/vcap/data/scope/plugins/garden/garden.sock: /report error: Get http://plugin/report?api_version=1&probe_id=6b4fbdd0eeed593d: dial unix /var/vcap/data/scope/plugins/garden/garden.sock: connect: connection refused
<probe> ERRO: 2018/03/28 15:33:43.604353 plugins: /var/vcap/data/scope/plugins/garden/garden.sock: /report error: Get http://plugin/report?api_version=1&probe_id=6b4fbdd0eeed593d: dial unix /var/vcap/data/scope/plugins/garden/garden.sock: connect: connection refused
......
@st3v
Owner

st3v commented Mar 28, 2018

@qxing3 Which version of cf-deployment are you using?

@therealqxing
Author

@st3v tag v1.19.0

@st3v
Owner

st3v commented Mar 28, 2018

On the cell VM, could you check if /var/vcap/data/garden/garden.sock exists? By default Garden should listen on that UNIX socket (see https://github.com/cloudfoundry/garden-runc-release/blob/b15bb2570a305fb36156380e99619e5fca136f1b/jobs/garden/spec#L36) and I don't see anything in cf-deployment that overrides those settings, unless you're running a Windows cell.
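For anyone who wants to script that check, here is a minimal sketch in Go. It assumes only the socket path mentioned above; `checkSocket` is a hypothetical helper written for illustration and is not part of Garden or weave-scope. It goes one step beyond `ls` by also dialing the socket, because a socket file can exist on disk while nothing is listening on it, which would show up as the same "connection refused" error seen in the probe log.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// checkSocket (hypothetical helper) returns nil only if path exists,
// is a UNIX socket, and accepts a connection within one second.
func checkSocket(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("stat: %w", err)
	}
	if info.Mode()&os.ModeSocket == 0 {
		return fmt.Errorf("%s exists but is not a socket", path)
	}
	conn, err := net.DialTimeout("unix", path, time.Second)
	if err != nil {
		// A stale socket file typically ends up here with "connection refused".
		return fmt.Errorf("dial: %w", err)
	}
	return conn.Close()
}

func main() {
	// Default taken from the comment above; pass another path as the first argument.
	path := "/var/vcap/data/garden/garden.sock"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	if err := checkSocket(path); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("listening:", path)
}
```

Running it with /var/vcap/data/scope/plugins/garden/garden.sock as the argument would perform the same check against the plugin socket that the probe is trying to reach.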

@st3v
Owner

st3v commented Mar 28, 2018

Sorry, I might have slightly misread the log you pasted. Could you instead get us the logs for the scope_probe_garden job first?

@therealqxing
Author

diego-cell/602c1bca-0041-43d9-9d30-b9d0bc90fae5:~# ls -lh /var/vcap/data/garden/garden.sock
srwxrwxrwx 1 root root 0 Mar 28 15:26 /var/vcap/data/garden/garden.sock

/var/vcap/sys/log/scope/probe_garden.stderr.log:
2018/03/28 13:06:05 Starting on diego-cell/0...
2018/03/28 13:06:05 Listening on unix:///var/vcap/data/scope/plugins/garden/garden.sock
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x63d1fd]

goroutine 24 [running]:
github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient.(*Client).DoRequest(0xc420128410, 0xc4200475b0, 0x17, 0xc42024ccf0, 0x21)
        /var/vcap/packages/scope-garden/src/github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient/client.go:271 +0xcd
github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient.(*Client).ListAppsByQuery(0xc420128410, 0xc42024ccc0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /var/vcap/packages/scope-garden/src/github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient/apps.go:191 +0x227
github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient.(*Client).ListApps(0xc420128410, 0xc420116b40, 0x0, 0x1, 0x0, 0x0)
        /var/vcap/packages/scope-garden/src/github.com/st3v/scope-garden/vendor/github.com/cloudfoundry-community/go-cfclient/apps.go:227 +0x10a
github.com/st3v/scope-garden/cf.(*directory).fetch.func1(0x12a05f200, 0xc420102d80)
        /var/vcap/packages/scope-garden/src/github.com/st3v/scope-garden/cf/cf.go:82 +0xd8
created by github.com/st3v/scope-garden/cf.(*directory).fetch
        /var/vcap/packages/scope-garden/src/github.com/st3v/scope-garden/cf/cf.go:93 +0x49
2018/03/28 13:06:16 Starting on diego-cell/0...
2018/03/28 13:06:16 Listening on unix:///var/vcap/data/scope/plugins/garden/garden.sock
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x63d1fd]
......

@st3v
Owner

st3v commented Mar 28, 2018

This looks like a nil pointer dereference in the version of go-cfclient that the probe uses to query CF. If you look at this line, you'll see that we capture err but never handle it. As a result, we always read resp on line 271, which is nil in the error case.

It looks like this issue has been fixed in the latest version of go-cfclient (see line 268).

Since there's a fix in the latest version, we should bump the vendor-ed package in scope-garden and cut a new release.
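To illustrate the failure mode described above, here is a minimal, self-contained sketch in Go. It is not the actual go-cfclient code: `doer`, `statusUnchecked`, `statusChecked`, `failingDoer`, `newDemoRequest`, and `panics` are all hypothetical names made up for this example, and the request URL is a placeholder. The first function captures err without handling it and then dereferences a nil response; the second returns early on error.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// doer is a hypothetical stand-in for whatever performs the HTTP request.
type doer func(req *http.Request) (*http.Response, error)

// statusUnchecked mirrors the buggy pattern: err is captured but ignored,
// so resp is dereferenced even when it is nil.
func statusUnchecked(do doer, req *http.Request) int {
	resp, err := do(req)
	_ = err                // captured, never handled
	return resp.StatusCode // nil pointer dereference if the request failed
}

// statusChecked handles err before touching resp.
func statusChecked(do doer, req *http.Request) (int, error) {
	resp, err := do(req)
	if err != nil {
		return 0, fmt.Errorf("request failed: %w", err)
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// failingDoer simulates a request that fails before any response exists.
func failingDoer(*http.Request) (*http.Response, error) {
	return nil, errors.New("connection refused")
}

// newDemoRequest builds a request against a placeholder URL.
func newDemoRequest() *http.Request {
	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/v2/apps", nil)
	return req
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	req := newDemoRequest()
	_, err := statusChecked(failingDoer, req)
	fmt.Println("checked variant returns error:", err != nil)
	fmt.Println("unchecked variant panics:", panics(func() { statusUnchecked(failingDoer, req) }))
}
```

In the unchecked variant the panic surfaces inside whichever goroutine made the call, which matches the shape of the stack trace above: the process dies and monit restarts it, only for it to crash again on the next failed request.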

Two questions:

  • How urgent is this for you?
  • Given the explanation above, how would you feel about submitting a PR to both repos? 😉

@therealqxing
Author

It's a little urgent for me. Hope you can fix this issue asap, thanks in advance : )

@st3v
Owner

st3v commented Mar 28, 2018

Realistically, I won't get to this for a while; this weekend at the very earliest. Just to set expectations: this is OSS, which is why I was hoping you could help out and provide a PR. I understand if you don't feel comfortable enough to do that, and that's perfectly fine, but in that case it will require some patience on your end.
