discussions of November #19

Open
puncsky opened this issue Nov 29, 2019 · 4 comments

Comments


puncsky commented Nov 29, 2019

This thread records the discussions that took place in the chat group in November.


puncsky commented Nov 30, 2019

Delivery Guarantees

How to achieve exactly-once delivery?

Storm: Edge, Path, XOR

Imagine a data item flowing through a DAG. How do we ensure that each data item goes through each vertex only once?

Along the path, each edge has one starting vertex and one ending vertex, for example a -> c in the following graph.

[Figure: example DAG (Tred-G.svg)]

For each edge and its two vertices, a data item D1 generates one random ID, say 0010. When the item traverses the edge, the same ID is emitted twice, once at each of the two vertices (0010 at a and at c).

Similarly, after c, a derived data item D2 will emit 1011 at c and at e, and D3 will emit 1101 at c and at d.

Finally, for a data item and all of its derived items, the XOR of all random IDs emitted at those vertices should equal 0. Each ID is emitted exactly twice, so the pairs cancel out once every edge has been fully traversed.
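
A minimal sketch of the XOR check, using the IDs from the example above. The names `xor_all` and `tree_is_complete` are made up for illustration and are not Storm's API; the sketch only shows why paired IDs cancel to 0.

```python
from functools import reduce
from operator import xor

# Random IDs per edge, taken from the example in this thread.
EDGE_A_C = 0b0010  # D1 on edge a -> c
EDGE_C_E = 0b1011  # D2 on edge c -> e
EDGE_C_D = 0b1101  # D3 on edge c -> d

# (vertex, edge ID) pairs: every edge ID is emitted once at each end.
emissions = [
    ("a", EDGE_A_C),
    ("c", EDGE_A_C),
    ("c", EDGE_C_E),
    ("c", EDGE_C_D),
    ("e", EDGE_C_E),
    ("d", EDGE_C_D),
]


def xor_all(ids):
    """XOR a sequence of integer IDs together, starting from 0."""
    return reduce(xor, ids, 0)


def tree_is_complete(emissions):
    """True when every emitted ID has been cancelled by its twin."""
    return xor_all(edge_id for _vertex, edge_id in emissions) == 0


print(tree_is_complete(emissions))       # all six emissions seen
print(tree_is_complete(emissions[:-1]))  # d has not emitted yet
```

With IDs this short, unrelated IDs can XOR to 0 by accident, so a real system would want much wider random IDs than the 4 bits used here.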

MillWheel: starting vertex retry, ending vertex dup-check & ack

In a DAG, imagine an edge a -> b. a keeps retrying until it receives an ack from b. b filters out the duplicate messages it receives, processes the rest, and finally returns an ack.
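
A rough sketch of the retry plus dup-check idea, not MillWheel's actual implementation. `Receiver`, `FlakyChannel`, and `send_with_retry` are hypothetical names, and the lost acks are simulated in memory instead of over a network.

```python
class Receiver:
    """Vertex b: processes each message ID once, but acks every delivery."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def receive(self, msg_id, payload):
        if msg_id not in self.seen_ids:  # dup-check
            self.seen_ids.add(msg_id)
            self.processed.append(payload)
        return True  # ack, also for duplicates, so the sender can stop


class FlakyChannel:
    """Delivers every message but loses the first `drop_acks` acks."""

    def __init__(self, receiver, drop_acks):
        self.receiver = receiver
        self.drop_acks = drop_acks

    def send(self, msg_id, payload):
        ack = self.receiver.receive(msg_id, payload)
        if self.drop_acks > 0:
            self.drop_acks -= 1
            return False  # the ack was lost on the way back
        return ack


def send_with_retry(channel, msg_id, payload, max_attempts=5):
    """Vertex a: resend until acked. Returns the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(msg_id, payload):
            return attempt
    raise RuntimeError(f"no ack for {msg_id} after {max_attempts} attempts")


receiver = Receiver()
channel = FlakyChannel(receiver, drop_acks=2)
attempts = send_with_retry(channel, "m1", "hello")
print(attempts, receiver.processed)
```

Here b receives the message three times but processes it once. In a real system the set of seen IDs would have to survive restarts for the dup-check to hold.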


puncsky commented Nov 30, 2019

In GFS, how is the serial order of concurrent writes ensured?

https://users.cs.duke.edu/~chase/cps510/slides/gfs-etc.pdf

  1. The client asks the master for the list of replicas and for which replica holds the lease to act as primary.
  2. The primary assigns serial numbers to write requests, in case multiple clients send requests at the same time.
  3. For replication, the primary forwards the write requests, with the same serial numbers, to the secondaries.
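
The steps above can be sketched as a toy model. `Replica` and `Primary` are hypothetical classes, and the master, leases, and the separate data push are left out. Buffering out-of-order arrivals on a replica is an assumption of this sketch, added only to show that the serial number fixes the order.

```python
import itertools


class Replica:
    """Applies writes strictly in serial-number order."""

    def __init__(self):
        self.log = []         # writes applied so far, in order
        self.pending = {}     # serial -> data that arrived early
        self.next_serial = 0  # next serial number this replica may apply

    def apply(self, serial, data):
        self.pending[serial] = data
        # Drain every write that is now contiguous with the log.
        while self.next_serial in self.pending:
            self.log.append(self.pending.pop(self.next_serial))
            self.next_serial += 1


class Primary(Replica):
    """Lease holder: picks one serial order for concurrent client writes."""

    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries
        self._serials = itertools.count()

    def write(self, data):
        serial = next(self._serials)          # step 2: assign serial number
        self.apply(serial, data)
        for secondary in self.secondaries:    # step 3: forward, same serial
            secondary.apply(serial, data)
        return serial


secondaries = [Replica(), Replica()]
primary = Primary(secondaries)
primary.write("from client 1")
primary.write("from client 2")
print(primary.log == secondaries[0].log == secondaries[1].log)

# A replica that receives serial 1 before serial 0 still applies 0 first.
late = Replica()
late.apply(1, "second")
late.apply(0, "first")
print(late.log)
```

Whatever order the clients' requests reach the primary in, every replica ends up with the same log because all of them follow the primary's numbering.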


puncsky commented Nov 30, 2019

Gary

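A question for the tech experts here: does anyone know how to import data from an Oracle database (12c) into Elasticsearch 7? The historical data is stored in Oracle, and the plan is to import it into Elasticsearch for search and analysis. With the 2.x versions, elasticsearch-jdbc could import the Oracle data, but the new version is no longer supported.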


puncsky commented Nov 30, 2019

CAP Recap

Kafka is CA
