Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nsqd: switch to monotonic clocks for ID generation #1249

Closed
wants to merge 1 commit into from

Conversation

slayercat
Copy link

  1. nsqd/guid.go: uses pseudo-random generate method. will not fail to wait-retry loop when time gone backwards
  2. time diff: use time.Time for diff. with monotonic clock to measure elapsed time [1]
  3. pqueue(s): use time.Time for heap / priority compare
  4. fix tests.

See also:
[1] golang Monotonic Clocks: https://golang.org/pkg/time

1. nsqd/guid.go: uses pseudo-random generate method. will not fail to wait-retry loop when time gone backwards
2. time diff: use time.Time for diff. with monotonic clock to measure elapsed time [1]
3. pqueue(s): use time.Time for heap / priority compare
4. fix tests.

See also:
[1] golang Monotonic Clocks: https://golang.org/pkg/time
@slayercat
Copy link
Author

manually test:

open a server

 ./nsqd -log-level 'debug'

pub

sh -c 'while true; do uuidgen; done' |./to_nsq -topic 'asdf#ephemeral' --nsqd-tcp-address '127.0.0.1:4150'

sub

./nsq_tail -topic 'asdf#ephemeral' --nsqd-tcp-address 127.0.0.1:4150

change time

 date -s '2020-01-01 00:00:00'

expected result

change server time will not break the nsqd data stream.

@ploxiln
Copy link
Member

ploxiln commented May 6, 2020

previously discussed in #1124

Looks like good stuff. I'm not sure about the random ID algo ... but it might be time to break previous behavior and make the switch. I'll properly review eventually ...

@slayercat
Copy link
Author

slayercat commented May 7, 2020

UUID generate.

A similar algorithm is UUIDv4 (rfc4122) generate alg. see this for an implement.

We may read rand.Reader for a better seed either. But i think nanosecond mixs nodeID will be good enough to make unique ids (orderless of course).

The nodeID

After a futher dig in this day, The nodeID is much smaller then initially thought...

The nodeID is defaultly assigned here. Could not understand why it's limited in [0,1024). It will much easy to collision with other nodes in such a small range.

This may be better:

	h := md5.New()
	io.WriteString(h, hostname)
	md5hash := h.Sum(nil)
	var defaultID int64
	binary.Read(bytes.NewBuffer(md5hash), binary.LittleEndian, &defaultID)

UUID generate improvement with such a small nodeID

The code uses nanosecond for rand seed, It is unlikely to have any collision in production. However, here's some improvement to discuss.

  • method 1, turn to rand.Reader for seed. discard the nodeID
  • method 2, make nodeID a cycle shift to the higher bits. cycle shift for avoid lose precision. shift 40 bits seems good enough for avoid the nanosecond part.
  • method 3, turn default nodeID to a better way. But I'm not sure if it'll break any cluster existing.

@mreiferson mreiferson added the bug label Jun 14, 2020
@mreiferson mreiferson changed the title nsqd: make backward time turning available nsqd: switch to monotonic clocks for ID generation Jun 14, 2020
@mreiferson mreiferson self-assigned this Jun 14, 2020
@mreiferson
Copy link
Member

Did a little digging on this. The historical requirements here are two-fold:

  1. Per-nsqd / per-topic "globally" unique ID
  2. Unique across nsqd restarts

Regarding (1), channels re-use the ID generated when the message is published to a topic. IDs are persisted via the diskqueue, which results in (2). The current algorithm satisfies (2) by assuming that time only ever moves forward, and generates IDs that are roughly time sorted.

IMO, the "right" answer here is an actual GUID (across all nsqd) for each message in a topic, but that obviously requires some persistent state and coordination. This is possible if we do things like #625 and #510, but practically those may never happen.

The question then becomes: is breaking backwards compatibility to be able to handle time going backwards actually worth it?

The argument in favor is that that you already can't really depend on these message IDs outside of NSQ, given the scarcely documented and limited use of --node-id, so what value do they have as is?

The argument against seems to be that we're worried that newly generated message IDs may conflict with diskqueue persisted messages upon upgrade? Is that all?

@slayercat
Copy link
Author

slayercat commented Jul 1, 2020

If we do not need ID to be time sorted, use rand.Reader to generate it could certainly make the ID globally unique (or at least make the probability of conflict to be negligible. )

(edited)

I've found some information about the probabilities of collision randomly generated values:
https://preshing.com/20110504/hash-collision-probabilities/

In our case, a collision of id is just like a collision of hash. If a nsq cluster have about 2 million message storage, the probabilities of new generate id collision will be 1 in 10million.

refer the article for detail, and there's a picture of collision probabilities here.

图片

@slayercat slayercat closed this Jul 19, 2021
@daoxuans
Copy link

Isn't that a good solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants