Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: update to boxo without goprocess #10567

Merged
merged 7 commits into from
Nov 19, 2024
Merged

chore: update to boxo without goprocess #10567

merged 7 commits into from
Nov 19, 2024

Conversation

gammazero
Copy link
Contributor

@gammazero gammazero commented Oct 31, 2024

@gammazero gammazero force-pushed the boxo-without-goprocess branch 2 times, most recently from 0089a37 to d63ac5e Compare November 5, 2024 20:38
@gammazero gammazero added the skip/changelog This change does NOT require a changelog entry label Nov 7, 2024
@gammazero gammazero marked this pull request as ready for review November 7, 2024 21:13
@gammazero gammazero requested a review from a team as a code owner November 7, 2024 21:13
@lidel
Copy link
Member

lidel commented Nov 12, 2024

Triage note: ok, just make sure CI is green

Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've tracked down why prometheus metrics tests were failing.

When manually starting ipfs daemon, it produces errors:

2024-11-18T21:07:43.009+0100	ERROR	metrics-prometheus	[email protected]/binding.go:61	Registering prometheus collector, name: <no-scope>_pending_tasks, error: descriptor Desc{fqName: "<no-scope>_pending_tasks", help: "Total number of pending tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_pending_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100	ERROR	metrics-prometheus	[email protected]/binding.go:61	Registering prometheus collector, name: <no-scope>_active_tasks, error: descriptor Desc{fqName: "<no-scope>_active_tasks", help: "Total number of active tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_active_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100	ERROR	metrics-prometheus	[email protected]/binding.go:61	Registering prometheus collector, name: <no-scope>_pending_block_tasks, error: descriptor Desc{fqName: "<no-scope>_pending_block_tasks", help: "Total number of pending blockstore tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_pending_block_tasks" is not a valid metric name
2024-11-18T21:07:43.013+0100	ERROR	metrics-prometheus	[email protected]/binding.go:61	Registering prometheus collector, name: <no-scope>_active_block_tasks, error: descriptor Desc{fqName: "<no-scope>_active_block_tasks", help: "Total number of active blockstore tasks", constLabels: {}, variableLabels: {}} is invalid: "<no-scope>_active_block_tasks" is not a valid metric name

Likely due to how https://github.com/ipfs/boxo/blob/625aadd0c7a669f2df698c687a4cd22c2f78ffd1/bitswap/metrics/metrics.go#L33 works (called before context has all info?).

We may need to refactor the way bitswap metrics are registered, avoid magic, use prometheus.DefaultRegisterer by default and then allow override via WithPrometheusRegistry(prometheus.NewRegistry()) if needed (eg. in parallel tests) – some prior art in ipfs/boxo#722.


Unsure if related, second problem is shutdown crashing.

Trying to kill ipfs deamon with ctrl+c produces panic, there seem to be goprocess still somewhere:

Run 'ipfs id' to inspect announced and discovered multiaddrs of this node.
panic: Process cannot add children after being closed

goroutine 1 [running]:
github.com/jbenet/goprocess.(*process).AddChild(0xc000c38d80, {0x3663ea8, 0xc001d1ab40})
	github.com/jbenet/[email protected]/impl-mutex.go:99 +0x24d
github.com/ipfs/kubo/cmd/ipfs/kubo.daemonFunc(0xc0003f0000, {0x6?, 0xc0000f0050?}, {0x2e15620, 0xc0004b40a0})
	github.com/ipfs/kubo/cmd/ipfs/kubo/daemon.go:515 +0x1aa8
github.com/ipfs/go-ipfs-cmds.(*executor).Execute(0x489a100?, 0xc0003f0000, {0x7ff5b2869578, 0xc0000c7c80}, {0x2e15620, 0xc0004b40a0})
	github.com/ipfs/[email protected]/executor.go:88 +0x126
github.com/ipfs/kubo/cmd/ipfs/kubo.tracingWrappedExecutor.Execute({{0x36337a0?, 0xc00028a078?}}, 0xc0003f0000, {0x7ff5b2869578, 0xc0000c7c80}, {0x2e15620, 0xc0004b40a0})
	github.com/ipfs/kubo/cmd/ipfs/kubo/start.go:375 +0x3eb
github.com/ipfs/go-ipfs-cmds/cli.Run({0x364f258?, 0xc0004434f0?}, 0x49faf20, {0xc000160000, 0x2, 0x2}, 0x6e0?, 0xc00012e028, 0xc00012e030, 0x33b6370, ...)
	github.com/ipfs/[email protected]/cli/run.go:137 +0x99c
github.com/ipfs/kubo/cmd/ipfs/kubo.Start(0x33b6370)
	github.com/ipfs/kubo/cmd/ipfs/kubo/start.go:213 +0x516
main.main()
	github.com/ipfs/kubo/cmd/ipfs/main.go:10 +0x1a

@gammazero
Copy link
Contributor Author

gammazero commented Nov 19, 2024

I updated this PR to use a "fixed" PR version of boxo to fix the metrics registration issue. Another small fix in the daemon code to remove the panic on early ctrl-c before daemon ready.

The change in boxo was to pass the context into the server decision engine. I would prefer not relying on long-lived contexts to do things like metrics registration.

@lidel lidel mentioned this pull request Nov 19, 2024
25 tasks
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, both issues are fixed, and helia-interop was fixed upstream.

Switched to commit from boxo main. I'm going to merge this so we have more time to catch issues (if any).

@lidel lidel merged commit 3a1b8ee into master Nov 19, 2024
17 checks passed
@lidel lidel deleted the boxo-without-goprocess branch November 19, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip/changelog This change does NOT require a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants