Add Offset to libbeat/reader.Message #39873

Merged
AndersonQ merged 13 commits into elastic:main from 39653-filestream-include-msg-parser on Jun 21, 2024

Conversation

@AndersonQ AndersonQ commented Jun 12, 2024

Proposed commit message

libbeat: add Offset to libbeat/reader.Message

This commit introduces the Offset property to libbeat/reader.Message, which stores the total number of bytes read and discarded before generating the message. The Offset field allows inputs to accurately determine how much data has been read up to the message, calculated as Message.Bytes + Message.Offset.

With this new Offset field, the filestream input correctly updates its state to account for data read but discarded by the include_message parser.
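
To make the arithmetic in the commit message concrete, here is a minimal sketch of how an input could advance its file position using Message.Bytes + Message.Offset. It is not the code from this PR: the `message` struct, `filterLines`, and `advance` are hypothetical stand-ins, and the filter only imitates an include_message-style parser by matching a substring.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a simplified, hypothetical stand-in for libbeat/reader.Message.
// Only the two fields discussed in this PR are modelled.
type message struct {
	Content []byte
	Bytes   int // bytes consumed to produce Content, including the line ending
	Offset  int // bytes read and discarded before this message was generated
}

// filterLines imitates an include_message-style parser (an assumption, not
// the real implementation): lines that do not contain pattern are dropped,
// and their size is carried in the Offset of the next message that is kept.
// Lines discarded after the last kept message are not reported here.
func filterLines(data, pattern string) []message {
	var out []message
	discarded := 0
	for _, line := range strings.SplitAfter(data, "\n") {
		if line == "" {
			continue // SplitAfter yields a trailing empty string
		}
		if !strings.Contains(line, pattern) {
			discarded += len(line)
			continue
		}
		out = append(out, message{
			Content: []byte(strings.TrimSuffix(line, "\n")),
			Bytes:   len(line),
			Offset:  discarded,
		})
		discarded = 0
	}
	return out
}

// advance returns the new file position after msg has been processed,
// using the Message.Bytes + Message.Offset rule from the commit message.
func advance(pos int, msg message) int {
	return pos + msg.Bytes + msg.Offset
}

func main() {
	data := "debug: a\nerror: b\ndebug: c\ndebug: d\nerror: e\n"
	pos := 0
	for _, m := range filterLines(data, "error") {
		pos = advance(pos, m)
		fmt.Printf("%q bytes=%d offset=%d position=%d\n", m.Content, m.Bytes, m.Offset, pos)
	}
}
```

In this sketch the final position equals the length of the input (45 bytes), whereas adding only Bytes would leave the stored position 27 bytes short, which is the kind of drift described in #39653.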

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • ~~[ ] I have made corresponding changes to the default configuration files~~
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

None

How to test this PR locally

Follow the instructions on #39653

Related issues

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 12, 2024

mergify bot commented Jun 12, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @AndersonQ? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@AndersonQ AndersonQ added Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team bugfix backport-v8.14.0 Automated backport with mergify labels Jun 13, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jun 13, 2024
@AndersonQ AndersonQ changed the title from "wip" to "Add Offset to libbeat/reader.Message" Jun 17, 2024
@AndersonQ AndersonQ marked this pull request as ready for review June 17, 2024 16:28
@AndersonQ AndersonQ requested a review from a team as a code owner June 17, 2024 16:28
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@pierrehilbert pierrehilbert requested review from rdner and VihasMakwana and removed request for fearful-symmetry June 17, 2024 16:30
@pierrehilbert

@AndersonQ could you please take care of the linter issues?

@@ -94,6 +94,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- Fix cache processor expiries infinite growth when large a large TTL is used and recurring keys are cached. {pull}38561[38561]
- Fix parsing of RFC 3164 process IDs in syslog processor. {issue}38947[38947] {pull}38982[38982]
- Rename the field "apache2.module.error" to "apache.module.error" in Apache error visualization. {issue}39480[39480] {pull}39481[39481]
- Add the Offset property to libbeat/reader.Message to store the total number of bytes read and discarded before generating the message. This enables inputs to accurately determine how much data has been read up to the message, using Message.Bytes + Message.Offset. {pull}39873[39873] {issue}39653[39653]

You don't need this in the user-facing changelog; this is more of a developer-only detail.

I agree with Craig; the other item you wrote is enough.

Suggested change
- Add the Offset property to libbeat/reader.Message to store the total number of bytes read and discarded before generating the message. This enables inputs to accurately determine how much data has been read up to the message, using Message.Bytes + Message.Offset. {pull}39873[39873] {issue}39653[39653]

}
time.Sleep(10 * time.Millisecond)
}
fmt.Fprintf(msg, "unexpected number of events; expected: %d, actual: %d",

Suggested change
- fmt.Fprintf(msg, "unexpected number of events; expected: %d, actual: %d",
+ fmt.Fprintf(msg, "unexpected number of events; expected: %d, actual: %d\n",

@pierrehilbert pierrehilbert requested a review from belimawr June 19, 2024 13:12
@AndersonQ AndersonQ enabled auto-merge (squash) June 21, 2024 13:28
@AndersonQ AndersonQ merged commit 535a174 into elastic:main Jun 21, 2024
121 checks passed
mergify bot pushed a commit that referenced this pull request Jun 21, 2024
This commit introduces the Offset property to libbeat/reader.Message, which stores the total number of bytes read and discarded before generating the message. The Offset field allows inputs to accurately determine how much data has been read up to the message, calculated as Message.Bytes + Message.Offset.

With this new Offset field, the filestream input correctly updates its state to account for data read but discarded by the include_message parser.

(cherry picked from commit 535a174)

# Conflicts:
#	filebeat/input/filestream/environment_test.go
pierrehilbert added a commit that referenced this pull request Jun 22, 2024
* libbeat: add Offset to libbeat/reader.Message (#39873)

This commit introduces the Offset property to libbeat/reader.Message, which stores the total number of bytes read and discarded before generating the message. The Offset field allows inputs to accurately determine how much data has been read up to the message, calculated as Message.Bytes + Message.Offset.

With this new Offset field, the filestream input correctly updates its state to account for data read but discarded by the include_message parser.

(cherry picked from commit 535a174)

# Conflicts:
#	filebeat/input/filestream/environment_test.go

* Fixing conflicts

---------

Co-authored-by: Anderson Queiroz <[email protected]>
Co-authored-by: Pierre HILBERT <[email protected]>
@AndersonQ AndersonQ deleted the 39653-filestream-include-msg-parser branch July 1, 2024 13:16
Labels
backport-v8.14.0 Automated backport with mergify bugfix Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filestream include_message do not correctly track the offset of a file
6 participants