Performance comparison with the sh version #2

Open
Mek101 opened this issue Sep 4, 2021 · 15 comments
Comments


Mek101 commented Sep 4, 2021

[root@home-server /home/mek101/Progetti/go/bin]# time ./btrfs-diff-go /mountpoints/raid1/anime/data_snapshot.20210816/ /mountpoints/raid1/anime/bk_data_snapshot.2021094/
unexpected ChangeType on original /MyNextLifeAsAVillainess/wget-log: !!!unexpected ChangeType on original /MyNextLifeAsAVillainess/Hamefura_Ep_01_SUB_ITA.mp4: !!!       !!!: /MyNextLifeAsAVillainess/Hamefura_Ep_01_SUB_ITA.mp4
       !!!: /MyNextLifeAsAVillainess/wget-log
     added: /86
     added: /Kobayashi-san Chi no Maid Dragon
     added: /Kyousougiga
     added: /Otome Game no Hametsu Flag shika Nai Akuyaku Reijou ni Tensei shiteshimatta (Hamefura)
...
495G	data_snapshot.20210816/
535G	bk_data_snapshot.20210904

with a time of only:

real	0m2,062s
user	0m0,364s
sys	0m2,162s
Author

Mek101 commented Sep 4, 2021

Although

Due to BTRFS implementation, some files appear as changed, when they are not (according to diff utility). I have absolutely no idea why BTRFS is acting like this… If someone can help me figures this out, I'll be glad.

Makes it hard to use: I redirected the output of both versions into files, and...

[mek101@home-server ~/Progetti/go/bin]$ wc -l sh_out go_out 
    33 sh_out
  1723 go_out
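
One way to see which entries account for the gap between the two reports is to reduce each file to a sorted list of paths and keep only the paths that the second file reports. This is only a sketch: the function name `only_in_second` is made up for illustration, and it assumes that every report line ends with a path starting at the first `/` on the line and that both tools print paths relative to the same root. The exact output format of the shell version is not shown in this thread, so the `sed` expression may need adjusting.

```shell
# Print the paths that appear in the second report but not in the first.
# Assumption: each report line ends with a path that starts at the first "/".
only_in_second() {
  a=$(mktemp) && b=$(mktemp) || return 1
  # Keep only the path part of each line, then sort and de-duplicate.
  sed -n 's|^[^/]*\(/.*\)$|\1|p' "$1" | sort -u > "$a"
  sed -n 's|^[^/]*\(/.*\)$|\1|p' "$2" | sort -u > "$b"
  # comm -13 suppresses lines unique to the first file and lines common to both.
  comm -13 "$a" "$b"
  rm -f "$a" "$b"
}

# Usage (file names taken from the wc call above):
#   only_in_second sh_out go_out
```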

Author

Mek101 commented Sep 4, 2021

Regarding that:
./btrfs-diff-go --debug /mountpoints/raid1/anime/data_snapshot.20210816/ /mountpoints/raid1/anime/bk_data_snapshot.2021094/
All debug outputs are a variant of this:

[DEBUG] Cmd [15 0 54 0 49 49 101 121 101 115 47 91 102 114 111 122 101 110 100 97 108 101 93 49 49 95 101 121 101 115 95 48 49 95 91 120 118 105 100 95 105 116 97 93 91 53 56 49 100 52 52 55 102 93 46 97 118 105 11 0 12 0 241 74 30 97 0 0 0 0 160 65 202 43 10 0 12 0 144 150 206 95 0 0 0 0 154 157 48 44 9 0 12 0 109 53 191 96 0 0 0 0 131 75 240 22]; type BTRFS_SEND_C_UTIMES
[DEBUG] TRACE       BTRFS_SEND_C_UTIMES 11eyes/[frozendale]11_eyes_01_[xvid_ita][581d447f].avi
[DEBUG] TRACE    changed 11eyes/[frozendale]11_eyes_01_[xvid_ita][581d447f].avi
[DEBUG] peekAndDiscard() need to read more bytes '4' than there are buffered '0'
[DEBUG] peekAndDiscard() increasing the buffer size to match the need
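
The byte list after `Cmd` can be read by hand. The sketch below assumes it is the command payload of the btrfs send stream, laid out as a sequence of records with a 2-byte little-endian attribute type, a 2-byte little-endian length, and then the value. The attribute numbers used (15 for the path, 11, 10 and 9 for atime, mtime and ctime) and the 12-byte timespec layout (8-byte seconds followed by 4-byte nanoseconds) come from the kernel's send stream definitions, not from this thread. The function name `decode_cmd` is hypothetical, and path bytes are assumed to be plain ASCII.

```shell
# Decode the byte list printed after "Cmd" in a btrfs-diff-go --debug line.
# Assumption: records are (u16 LE type, u16 LE length, value bytes).
decode_cmd() {
  awk '
  BEGIN { names[9] = "ctime"; names[10] = "mtime"; names[11] = "atime" }
  match($0, /Cmd \[[0-9 ]*\]/) {
    # Take the text between "Cmd [" and "]" and split it into bytes.
    n = split(substr($0, RSTART + 5, RLENGTH - 6), b, " ")
    i = 1
    while (i + 3 <= n) {
      attr = b[i] + 256 * b[i + 1]
      size = b[i + 2] + 256 * b[i + 3]
      i += 4
      if (i + size - 1 > n) break              # truncated record, stop
      if (attr == 15) {                        # path attribute
        path = ""
        for (j = 0; j < size; j++) path = path sprintf("%c", b[i + j] + 0)
        print "path=" path
      } else if ((attr in names) && size == 12) {
        sec = 0                                # first 8 bytes: seconds, little-endian
        for (j = 7; j >= 0; j--) sec = sec * 256 + b[i + j]
        printf "%s_sec=%d\n", names[attr], sec
      } else {
        printf "attr=%d size=%d\n", attr, size
      }
      i += size
    }
  }'
}

# Usage: pipe a saved debug line (or a whole debug log) into the function:
#   decode_cmd < debug.log
```

Applied to the `Cmd` line above, this yields the path shown in the TRACE line, then `atime_sec=1629375217`, `mtime_sec=1607374480` and `ctime_sec=1623143789`. If the snapshot names are dates, the atime (2021-08-19 UTC) is the only one of the three that falls after the older snapshot, which would be consistent with an access-time-only update being reported as a change.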

And with ./btrfs-diff-go --debug /mountpoints/raid1/anime/data_snapshot.20210816/ /mountpoints/raid1/anime/bk_data_snapshot.2021094/ 2>&1 | grep " BTRFS_SEND_"

Only false positives appear in the output
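
To check whether one command type dominates the debug trace, the TRACE lines can be tallied per command. This is a rough sketch that assumes the trace format shown above (`[DEBUG] TRACE <command> <path>`); the function name `tally_send_commands` is made up for illustration.

```shell
# Count how often each BTRFS_SEND_C_* command appears in TRACE lines read
# from stdin. Only TRACE lines are counted, so the "Cmd [...]; type ..."
# lines do not count the same command twice.
tally_send_commands() {
  awk '$2 == "TRACE" && $3 ~ /^BTRFS_SEND_C_/ { n[$3]++ }
       END { for (t in n) print n[t], t }' | sort -rn
}

# Usage (paths shortened to placeholders):
#   ./btrfs-diff-go --debug OLD_SNAPSHOT NEW_SNAPSHOT 2>&1 | tally_send_commands
```

If nearly every entry turns out to be BTRFS_SEND_C_UTIMES, that would suggest the extra "changed" lines come from timestamp updates rather than content changes.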

Owner

mbideau commented Sep 5, 2021

Hey @Mek101,

Regarding your first post in this thread, it reveals that there is still a bug (a really tricky one), and I created an issue to track it. It also shows incredible speed compared to the shell version, which is good news: it makes the tool practical.

About your second post, it seems that there are inconsistencies between the Go and shell reports. Which one do you think is right?

Finally, regarding your last comment, I would love to be able to reproduce all of that, to debug both issues and stop bothering you 😅 To help me reproduce it, could you produce a btrfs send stream file and send it to me (either by posting it here, if making the file names public is okay with you, or by sending it to me privately; I won't publish it)?

The command to run looks like the following (timing it is also interesting):

~> sudo time btrfs send --quiet --no-data -f btrfs.stream -p older_subvolume newer_subvolume

For the first post, (the one with the tricky bug), it would be:

~> sudo time btrfs send --quiet --no-data -f btrfs.stream -p /mountpoints/raid1/anime/data_snapshot.20210816 /mountpoints/raid1/anime/bk_data_snapshot.2021094

If the btrfs.stream file is too big to be sent by email, you can drop it on a website like WeTransfer.

It would allow me to reproduce this bug, which I have not managed to trigger so far.

Thank you a lot in advance.

Author

Mek101 commented Sep 5, 2021

About your second post, it seems that there are inconsistencies between the Go and shell reports. Which one do you think is right?

It should be the shell one, since the Go report includes supposed changes to files that I know for sure haven't been modified since the creation of the subvolume (i.e., before data_snapshot.20210816).

Author

Mek101 commented Sep 5, 2021

time btrfs send --quiet --no-data -f btrfs.stream -p /mountpoints/raid1/anime/data_snapshot.20210816 /mountpoints/raid1/anime/bk_data_snapshot.2021094

real	0m1,621s
user	0m0,015s
sys	0m1,991s

Owner

mbideau commented Sep 5, 2021

time btrfs send --quiet --no-data -f btrfs.stream -p /mountpoints/raid1/anime/data_snapshot.20210816 /mountpoints/raid1/anime/bk_data_snapshot.2021094

real	0m1,621s
user	0m0,015s
sys	0m1,991s

Okay, your first run with btrfs-diff-go showed:

real	0m2,062s
user	0m0,364s
sys	0m2,162s

We could interpret this as the Go processing overhead costing about 0.44 s of wall-clock time (2.062 s for btrfs-diff-go versus 1.621 s for the plain btrfs send), which is incredibly fast.
Good news.
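
As a quick check of that arithmetic, the two `real` values can be converted to seconds and subtracted. This is only a sketch; the helper names `to_seconds` and `overhead` are made up for illustration, and the comma decimal separator is handled because that is how `time` printed the values in this thread.

```shell
# Convert a time-style duration such as "0m2,062s" into seconds.
to_seconds() {
  printf '%s\n' "$1" | awk '{
    gsub(",", ".")        # normalise the decimal separator
    sub(/s$/, "")         # drop the trailing "s"
    split($0, p, "m")     # p[1] = minutes, p[2] = seconds
    printf "%.3f\n", p[1] * 60 + p[2]
  }'
}

# Difference between two durations, in seconds.
overhead() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.3f\n", a - b }'
}

# real time of btrfs-diff-go versus real time of the plain btrfs send
overhead 0m2,062s 0m1,621s   # prints 0.441
```

This compares single runs only, so it is an estimate rather than a benchmark; caching effects between the two runs are not accounted for.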

Also, thank you very much for the btrfs.stream file 😉

Author

Mek101 commented Sep 5, 2021

I'll try diffing a couple of subvolumes on the OS SSD and see how it goes.

Author

Mek101 commented Sep 5, 2021

I was trying to benchmark both versions on the snapshots of my root SSD, but I encountered a couple of fatal errors:

4,9G	root_snapshot.20210816/
6,7G	root_snapshot.20210905/
[root@home-server /mountpoints/root/snapshots/root]# time sudo btrfs-diff root_snapshot.20210816/ root_snapshot.20210905/
Warning: unknown raw line 'mksock          ./root_snapshot.20210905/o59141-618-0'
Fatal error: when renaming './root_snapshot.20210905/o59141-618-0' to './root_snapshot.20210905/etc/samba/private/msg.sock/6555', the source wasn't found in the objects buffer

real	12m40,858s
user	13m0,688s
sys	4m48,570s
# time ./btrfs-diff-go /mountpoints/root/snapshots/root/root_snapshot.20210816/ /mountpoints/root/snapshots/root/root_snapshot.20210905/
btrfsSendSyscall returns broken pipe

real	0m8,920s
user	0m1,639s
sys	0m10,044s

Owner

mbideau commented Sep 5, 2021

Okay, you are definitely a good crash-tester 😉 😅

Regarding the bug in the shell version, I think I should read the btrfs receive sources to be sure that I haven't forgotten another case... I wanted to avoid that... (lazy me!).

About the Go version, I am currently improving the debugging information, to better catch and resolve future issues.

I'll keep you posted.

Again, thank you very much for your contributions. I will add you to a Contributors section in the documentation, if you agree. If so, how do you want me to name you: Mek101 or your real name?

Owner

mbideau commented Sep 5, 2021

In the meantime, could you please generate the btrfs stream of this diff and send it to me? It would be a great help for debugging.

Command to do so:

~> sudo btrfs send --quiet --no-data -f btrfs.stream.1 -p root_snapshot.20210816 root_snapshot.20210905

Thank you in advance.

Author

Mek101 commented Sep 5, 2021

Sent the btrfs.stream.1 file through email

Owner

mbideau commented Sep 5, 2021

I am pleased to tell you that I have added an Authors and contributors section to the README (commit 63c329c), and you appear as the first contributor ever 😃 🍾

Owner

mbideau commented Sep 10, 2021

Hey @Mek101,

No more known bugs or inconsistencies (all fixed), so I think you can do more testing if you want 😉
If you find another bug, I'll buy you a beer 🍺 Ahahah

mbideau changed the title from "Performace comparison with the sh version" to "Performance comparison with the sh version" on Oct 15, 2021
Owner

mbideau commented Nov 25, 2021

Ciao @Mek101,
How are you?
Did you have the chance to run another test with this Go version?
I still don't have a real-life-size dataset to test against (I am waiting to transfer my 3 TB to btrfs for that), so I kind of rely on you (for now). I would like to deprecate the sh version if this one is faster and without (known) bugs.
Thank you in advance.

Author

Mek101 commented Oct 31, 2022

Excuse me, would it be possible to wipe my name from the repository history?

2 participants