extract.cpp: Fixed handling multi-output files, added some debug #92

Michal-Ston-Mobica · 2023-12-20T16:01:42Z

Changed logic in handling multi-output files to write them in sequence, not chunk-by-chunk.
This should fix some errors in writing output ZIPs and general file handling for some installers.

mrcebryk · 2023-12-21T13:37:58Z

src/wasm/extract.hpp

@@ -20,6 +20,8 @@

 #include "wasm/fdzipstream/fdzipstream.h"

+#define wasm_debug(X) do { if (extractor::get().get_settings().debug_messages_enabled) log_info << X; } while(0)


Do we need that construction here?
do { ... } while(0)

It's a popular way to create a macro wrapper that doesn't produce any errors, no matter how it's used in the code.
wasm_info is just wrapped here with an additional conditional statement, making this output the log only if the debug output is enabled before starting the extraction.

mrcebryk · 2023-12-21T13:43:10Z

src/wasm/extract.cpp

-          json ret;
-          ret["status"] = "Aborted by user";
-          return ret.dump();
+          wasm_debug("pre-copy chunk=" << chunk.first.first_slice << " " << chunk.first.last_slice


I would rather consider using different format of logging function like:
wasm_log << "pre-copy chunk=" ....
or
conditional_log(extractor::get().get_settings().debug_messages_enabled) << "pre-copy chunk=" ....

Would be more readable in my opinion.

This function (or more specifically macro) was meant to be used only for debug and it's only used in wasm-related code, so I don't really see the point of making this more general or rewriting to be used like a stream.
It might be a good idea to make it more readable, but I assumed it's good enough for now and hoped that it could be addressed later, as important issues should resolved first and without much delay.

mrcebryk · 2023-12-21T13:44:17Z

src/wasm/extract.hpp

@@ -20,6 +20,8 @@

 #include "wasm/fdzipstream/fdzipstream.h"

+#define wasm_debug(X) do { if (extractor::get().get_settings().debug_messages_enabled) log_info << X; } while(0)


Since you use inside log_info I would recommend to change this macro name to: wasm_info

mrcebryk · 2023-12-21T13:46:00Z

src/wasm/extract.cpp

@@ -64,8 +70,12 @@ bool file_output::write(char* data, size_t n) {

  position_ += n;
  total_written_ += n;
-
-  if (is_complete()) {
+  if (extractor::get().get_settings().debug_messages_enabled) {


It would be good to unify logging handling, looks like you could use the "wasm_debug" here.

This was done on purpose to potentially save some execution time, as there are two consecutive log writes handled by a single if statement, which would make two ifs when used with a wasm_debug.
It's not obvious whether a simple if like this is optimized when compiling to WASM, and the main goal of this commit was to expand on actual extract functionality while keeping some of the added debug as an extra and not to overhaul the whole debug system.
The existing debug calls in extract.cpp weren't even operational before introducing wasm_debug, for an unknown reason.

mrcebryk · 2023-12-21T13:49:37Z

src/wasm/extract.cpp

-          throw std::runtime_error("Error writing file \"" + output->path().string() + '"');
-        }
+      bool success = output->write(buffer, extracted_n, file.size);
+      if (!success) {


Since you don't use that "success" variable anywhere below, I would recommend to simplify that code to:
if (!output->write(buffer, extracted_n, file.size)) {

Valid point, although you can probably see this part was already present in the code, probably to make this part a bit more readable.
I agree that it can and probably should be simplified at one point or another, although a general refactor wasn't my goal when making this PR.

mrcebryk

Interesting changes, I would consider improving the format of introduced wasm_debug macro and to add additional logs (or exceptions?) in unexpected cases

mrcebryk · 2023-12-21T13:54:16Z

src/wasm/extract.cpp

+           total_written_, file_->entry().size, size2, n);
+    std::cout << "write: is_complete():" << std::endl;
+  }
+  if (file_->entry().size != 0 && is_complete() || size2 == total_written_) {


We will have different combinations of situations when we don't call:
zs_entryend(zip_, zip_entry_, 0);

I assume some of them might be expected, others not, I would recommend to add proper logs, especially warnings/errors in unexpected cases.

The original is_complete() only checks whether file_->entry().size == total_written_, while this parameter is not always set for an unknown, probably installer-related reason. Thus an additional parameter of size2 with the actual file size that comes from another file descriptor was added to fix this issue and ensure the file is written properly.
This might've not been an issue with the original code, as it was only writing directly to the filesystem, and closing the files was always done at least by the OS when exiting the process, or somehow done via C++ interface.

To sum up: this whole if statement performs a single check - to see if the file was written completely. Maybe it could be rewritten to be a bit more clear, but it's not simply a matter of rewriting is_complete(), as it's not aware of the size2, and it's not a part of the object it's being called on.

The intention is to introduce them in dedicated commit while following the style of already existing logging functions.

Michal-Ston-Mobica self-assigned this Dec 20, 2023

mrcebryk reviewed Dec 21, 2023

View reviewed changes

Michal-Ston-Mobica and others added 2 commits February 5, 2024 11:27

extract.cpp: Fixed handling multi-output files, added some debug

950ca24

Remove extra debug messages

a945650

The intention is to introduce them in dedicated commit while following the style of already existing logging functions.

Jakub-Chylkowski-Mobica force-pushed the 212-fix_multi_output branch from 40223b7 to a945650 Compare February 5, 2024 10:28

Jakub-Chylkowski-Mobica requested a review from Krzysztof-Bak-Mobica February 5, 2024 10:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

extract.cpp: Fixed handling multi-output files, added some debug #92

extract.cpp: Fixed handling multi-output files, added some debug #92

Michal-Ston-Mobica commented Dec 20, 2023

mrcebryk Dec 21, 2023

Michal-Ston-Mobica Jan 4, 2024

mrcebryk Dec 21, 2023

Michal-Ston-Mobica Jan 4, 2024

mrcebryk Dec 21, 2023

mrcebryk Dec 21, 2023

Michal-Ston-Mobica Jan 4, 2024

mrcebryk Dec 21, 2023

Michal-Ston-Mobica Jan 4, 2024

mrcebryk left a comment •

edited

Loading

mrcebryk Dec 21, 2023

Michal-Ston-Mobica Jan 4, 2024

		@@ -20,6 +20,8 @@

		#include "wasm/fdzipstream/fdzipstream.h"

		#define wasm_debug(X) do { if (extractor::get().get_settings().debug_messages_enabled) log_info << X; } while(0)

extract.cpp: Fixed handling multi-output files, added some debug #92

Are you sure you want to change the base?

extract.cpp: Fixed handling multi-output files, added some debug #92

Conversation

Michal-Ston-Mobica commented Dec 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrcebryk left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrcebryk left a comment •

edited

Loading