Secor uploads (different) files with the same "name" into different "days". #2126
This is expected. You most likely have messages whose timestamps jump
back and forth between old and new dates. In that case multiple data files
are created, one for each time bucket, and each record is written to the
file for the time bucket it belongs to. There are no duplicate records
between those files. The file name convention is
<file-generation-number>_<kafka-partition-number>_<previous-persisted-kafka-offset>.
Once the files have accumulated enough content, we upload all of them at once.
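To make the bucketing concrete, here is a rough sketch in Python. It is not Secor's actual code: `output_path`, `route`, the UTC daily bucket, and the 20-digit zero padding are assumptions chosen only to mirror the paths in this issue. It shows how one batch that straddles midnight ends up as two files with the same name in different `dt=` directories, with no record appearing in both.

```python
from collections import defaultdict
from datetime import datetime, timezone


def output_path(topic, ts_ms, generation, kafka_partition, persisted_offset):
    """Hypothetical helper: topic/dt=YYYY-MM-DD/<gen>_<partition>_<offset>.gz."""
    # Assumption: buckets are calendar days in UTC, derived from the message timestamp.
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"{topic}/dt={day}/{generation}_{kafka_partition}_{persisted_offset:020d}.gz"


def route(topic, kafka_partition, persisted_offset, messages, generation=1):
    """Group (offset, timestamp_ms) pairs by the file each record would be written to."""
    files = defaultdict(list)
    for offset, ts_ms in messages:
        path = output_path(topic, ts_ms, generation, kafka_partition, persisted_offset)
        files[path].append(offset)
    return dict(files)


def ms(*args):
    """Milliseconds since the epoch for a UTC date and time."""
    return int(datetime(*args, tzinfo=timezone.utc).timestamp() * 1000)


# Three consecutive offsets whose timestamps jump across midnight and back.
messages = [
    (4302033536, ms(2021, 6, 15, 23, 59, 58)),
    (4302033537, ms(2021, 6, 16, 0, 0, 1)),
    (4302033538, ms(2021, 6, 15, 23, 59, 59)),
]

files = route("topic-name", 0, 4302033536, messages)
for path, offsets in sorted(files.items()):
    print(path, offsets)
```

Under these assumptions both files carry the same name because the name is built from the generation, the Kafka partition, and one offset shared by the whole batch, not from the first offset that happens to land in each file. The offsets inside each file can also be non-contiguous.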
On Mon, Jun 21, 2021 at 10:42 AM glebsam ***@***.***> wrote:
Secor uploads files with different content but the same "name" into
different days when one day ends and the next begins.
In the example below, I have two files:
/topic-name/dt=2021-06-15/1_0_00000000004302033536.gz
/topic-name/dt=2021-06-16/1_0_00000000004302033536.gz
2021-06-16 00:01:05,444 [Thread-4] (com.pinterest.secor.uploader.S3UploadManager) INFO uploading file /mnt/secor_data/message_logs/partition/9_13/topic-name/dt=2021-06-15/1_0_00000000004302033536.gz to s3://kafka-backup.s3.domain/dumps/topic-name/dt=2021-06-15/1_0_00000000004302033536.gz with no encryption
2021-06-16 00:01:05,444 [Thread-4] (com.pinterest.secor.uploader.S3UploadManager) INFO uploading file /mnt/secor_data/message_logs/partition/9_13/topic-name/dt=2021-06-16/1_0_00000000004302033536.gz to s3://kafka-backup.s3.domain/dumps/topic-name/dt=2021-06-16/1_0_00000000004302033536.gz with no encryption
Is this OK? Where can I read more about this behaviour?
At a minimum, I need to know which offsets each file contains, and whether
there is a way to make that explicit in the file name. (I expected the
offset in the file name to be the first offset in the file, but in the
case described that is not true.)
@HenryCaiHaiying thank you for the answer, but I still can't get, why