ranger.Envelope.Merge: ensure uniform offset distribution #269
mattnibs referenced this issue in brimdata/super on Aug 13, 2020
Introduce ranger.Envelope.Merge, which merges two Envelopes into a single Envelope. This fixes a bug where indexing a large pcap causes an out-of-memory (OOM) panic. When constructing the time index for a pcap, compress the array of offset points into an Envelope when the size of the array reaches a certain threshold. Subsequent compressions are merged into the section's Envelope, keeping the memory footprint low. The downside of this approach is that, for the indexes of large pcap files, the difference between adjacent X values starts out very wide and then narrows as one iterates through the Bins. This results in larger pcap scans (i.e., slower searches) for hits at the beginning of the file and smaller scans (i.e., faster searches) towards the end. Consensus was that the difference in search times probably won't be noticeable enough to warrant introducing a fancier algorithm. Filed #1095 to revisit. Closes #1039
mattnibs referenced this issue in brimdata/super on Aug 14, 2020
brim-bot referenced this issue in brimdata/zui on Aug 14, 2020
…" by mattnibs
This is an auto-generated commit with a zq dependency update. The zq PR brimdata/super#1096, authored by @mattnibs, has been merged.
pcap index: Compress offsets that exceed threshold
Introduce ranger.Envelope.Merge, which merges two Envelopes into a single Envelope. This fixes a bug where indexing a large pcap causes an out-of-memory (OOM) panic. When constructing the time index for a pcap, compress the array of offset points into an Envelope when the size of the array reaches a certain threshold. Subsequent compressions are merged into the section's Envelope, keeping the memory footprint low. The downside of this approach is that, for the indexes of large pcap files, the difference between adjacent X values starts out very wide and then narrows as one iterates through the Bins. This results in larger pcap scans (i.e., slower searches) for hits at the beginning of the file and smaller scans (i.e., faster searches) towards the end. Consensus was that the difference in search times probably won't be noticeable enough to warrant introducing a fancier algorithm. Filed brimsec/zq#1095 to revisit. Closes brimdata/super#1039
alfred-landrum referenced this issue in brimdata/super on Aug 17, 2020
The solution to brimdata/super#1039 introduces a curious behavior in generated pcap indexes: for the indexes of large pcap files, the difference between adjacent X values starts out very wide and then narrows as one iterates through the Bins. This results in larger pcap scans (i.e., slower searches) for hits at the beginning of the file and smaller scans (i.e., faster searches) towards the end. Consensus was that the difference in search times probably won't be noticeable enough to warrant introducing a fancier algorithm.
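To make the skew concrete, here is a toy Go sketch. It is not the actual ranger code: `decimate`, `buildIndex`, and `gaps` are hypothetical helpers, and "compression" is assumed to be simple every-other-point decimation. Under that assumption, older offsets are thinned again on every pass while newer ones are thinned fewer times, which reproduces the wide-then-narrow pattern described above.

```go
package main

import "fmt"

// decimate keeps every other offset, halving the slice.
// This stands in for the compression step (an assumption, not ranger's method).
func decimate(xs []uint64) []uint64 {
	out := make([]uint64, 0, (len(xs)+1)/2)
	for i := 0; i < len(xs); i += 2 {
		out = append(out, xs[i])
	}
	return out
}

// buildIndex appends offsets one at a time and decimates the whole
// index each time its length reaches threshold.
func buildIndex(offsets []uint64, threshold int) []uint64 {
	var idx []uint64
	for _, x := range offsets {
		idx = append(idx, x)
		if len(idx) >= threshold {
			idx = decimate(idx)
		}
	}
	return idx
}

// gaps returns the differences between adjacent offsets.
func gaps(xs []uint64) []uint64 {
	var out []uint64
	for i := 1; i < len(xs); i++ {
		out = append(out, xs[i]-xs[i-1])
	}
	return out
}

func main() {
	// Twenty evenly spaced offsets: 0, 100, ..., 1900.
	offsets := make([]uint64, 20)
	for i := range offsets {
		offsets[i] = uint64(i) * 100
	}
	idx := buildIndex(offsets, 8)
	fmt.Println(gaps(idx)) // prints [1200 400 200]
}
```

Although the input offsets are evenly spaced, the surviving gaps shrink from 1200 to 200 bytes, so a hit near the start of the file would require scanning a much larger region than a hit near the end.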
This ticket is to revisit the change to ranger.Envelope and find a solution that generates merged Envelopes with more uniform distance between adjacent offsets.
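One possible direction, sketched below in Go under stated assumptions, is to regroup the concatenated bins into fixed-width offset slots at merge time. The `Bin` type and `mergeUniform` function are hypothetical simplifications for illustration and do not mirror the real ranger types or signatures. Each output bin keeps the first real offset that falls in its slot (assuming a scan must start at an offset that actually occurred in the index) and widens its Y bounds to cover everything merged into it.

```go
package main

import "fmt"

// Bin is a hypothetical, simplified bin: X is a file offset and
// [MinY, MaxY] bounds the timestamps found from X up to the next bin.
type Bin struct {
	X          uint64
	MinY, MaxY uint64
}

// mergeUniform concatenates a and b (each sorted by X, with a before b)
// and regroups the bins into at most n fixed-width offset slots.
func mergeUniform(a, b []Bin, n int) []Bin {
	all := append(append([]Bin{}, a...), b...)
	if n <= 0 || len(all) <= n {
		return all
	}
	lo, hi := all[0].X, all[len(all)-1].X
	stride := (hi-lo)/uint64(n) + 1 // slot width in bytes
	out := make([]Bin, 0, n)
	for _, bin := range all {
		last := len(out) - 1
		if last >= 0 && (out[last].X-lo)/stride == (bin.X-lo)/stride {
			// Same slot: keep the first offset, widen the Y bounds.
			if bin.MinY < out[last].MinY {
				out[last].MinY = bin.MinY
			}
			if bin.MaxY > out[last].MaxY {
				out[last].MaxY = bin.MaxY
			}
			continue
		}
		out = append(out, bin)
	}
	return out
}

func main() {
	var a, b []Bin
	for i := uint64(0); i < 8; i++ {
		// a: already-compressed bins, 1000 bytes apart.
		a = append(a, Bin{X: i * 1000, MinY: i * 1000, MaxY: i*1000 + 999})
		// b: fresh bins, 100 bytes apart.
		b = append(b, Bin{X: 8000 + i*100, MinY: 8000 + i*100, MaxY: 8099 + i*100})
	}
	for _, bin := range mergeUniform(a, b, 8) {
		fmt.Println(bin.X, bin.MinY, bin.MaxY)
	}
}
```

With this grouping, the gap between adjacent output offsets stays below twice the slot width. The trade-off is that uniformity is achieved by coarsening the newer, finer-grained bins to match the older ones; detail already discarded from the older Envelope cannot be recovered at merge time.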