Prepare v3.19 #2284

Merged
merged 166 commits into from
Oct 22, 2023

Conversation

avagin
Member

@avagin avagin commented Oct 12, 2023

  • The loongarch64 support
  • membarrier() registration c/r
  • Many fixes and improvements from the Google team
  • Drop python 2 compatibility
  • Support XSAVE on newer Intel CPUs
  • Fixes here and there

@znley I suggest that you choose a name for this release.

h0lyalg0rithm and others added 14 commits October 12, 2023 08:36
The TOS (type of service) field in the IP header allows you to specify
the priority of the socket data.

Signed-off-by: Suraj Shirvankar <[email protected]>
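For readers unfamiliar with the TOS option mentioned in the commit message above, here is a minimal sketch of setting and reading it on an IPv4 socket. It is not code from the patch; the helper name `set_tos` and the value 0x10 (IPTOS_LOWDELAY) are chosen only for illustration.

```python
import socket


def set_tos(sock, tos):
    """Set the IPv4 TOS byte on a socket and return the value the kernel reports back."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)


# Illustrative use on a UDP socket; 0x10 is the classic "low delay" TOS value.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    print(hex(set_tos(s, 0x10)))
```

Because the value is per-socket state held by the kernel, a checkpoint/restore tool presumably has to read it at dump time and set it again at restore time.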
The pipe_size type is unsigned int, so when the fcntl call fails and
returns -1, the value wraps around to a large positive number.

Signed-off-by: zhoujie <[email protected]>
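The wraparound described in the pipe_size commit message above can be reproduced outside C. The sketch below uses `ctypes.c_uint` to mimic storing a failed call's -1 in an unsigned int; the helper names are hypothetical and this is not the CRIU fix itself.

```python
import ctypes


def as_unsigned_int(value):
    """Mimic assigning a C int to an unsigned int variable."""
    return ctypes.c_uint(value).value


def checked_pipe_size(ret):
    """Keep the result signed until the error check is done, then convert."""
    if ret < 0:
        raise OSError("fcntl failed")
    return as_unsigned_int(ret)


# A failed call's -1 turns into the largest unsigned int value,
# so a later "size < 0" style check can never fire.
print(as_unsigned_int(-1))
```

The design point is simply ordering: test the signed return value first, and only then store it in the unsigned field.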
Newer Intel CPUs (Sapphire Rapids) have a much larger xsave area than
before. Looking at older CPUs I see 2440 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00000988 (2440)

On newer CPUs (Sapphire Rapids) it grows to 11008 bytes.

    # cpuid -1 -l 0xd -s 0
    ...
        bytes required by XSAVE/XRSTOR area     = 0x00002b00 (11008)

This increases the xsave area from one page to four pages.

Without this patch the fpu03 test fails, with this patch it works again.

Signed-off-by: Adrian Reber <[email protected]>
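As a quick check of the sizes quoted in the xsave commit message above, the sketch below rounds the two cpuid values up to whole pages. The 4096-byte page size and the helper name `pages_needed` are assumptions made for illustration; the four-page figure in the message is the size the patch settles on, which is above the bare minimum computed here.

```python
PAGE_SIZE = 4096  # assumption: standard x86 page size


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Round a byte count up to a whole number of pages."""
    return -(-nbytes // page_size)


# Sizes taken from the cpuid output quoted in the commit message.
for size in (0x988, 0x2B00):
    print(size, pages_needed(size))
```

The older 2440-byte area fits in a single page, while the 11008-byte area cannot, which is why a one-page buffer is no longer enough on Sapphire Rapids.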
Since we know criu_pid, and criu is the parent of the restored process,
we can create the pidfile with the pid as seen from the caller's pidns.

We need to move mount namespace creation to child so that criu-ns can
see caller pidns proc.

Signed-off-by: Pavel Tikhomirov <[email protected]>
By default, the file name 'amdgpu_plugin.txt' is used also as the name
for the corresponding man page (`man amdgpu_plugin`). However, when
this man page is installed system-wide it would be more appropriate
to have a prefix 'criu-' (e.g., `man criu-amdgpu-plugin`).

Signed-off-by: Radostin Stoyanov <[email protected]>
crun wants to set empty_ns and this interface is missing from the
library. This adds it to libcriu.

Signed-off-by: Adrian Reber <[email protected]>
--criu-binary argument provides a way to supply the CRIU binary
location to run_criu().

Related to: checkpoint-restore#1909

Signed-off-by: Dhanuka Warusadura <[email protected]>
These changes remove and update the changes introduced in
7177938 in favor of the
Python version in CI.

The os.waitstatus_to_exitcode() function first appeared in Python 3.9.

Related to: checkpoint-restore#1909

Signed-off-by: Dhanuka Warusadura <[email protected]>
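Since `os.waitstatus_to_exitcode()` only exists from Python 3.9 on, code that must also run on older interpreters needs an equivalent. The following is one possible fallback, not the change made in this commit; the names `legacy_exit_code` and `exit_code_from_status` are hypothetical.

```python
import os


def legacy_exit_code(status):
    """Decode a wait status using the macros available before Python 3.9."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Same convention as os.waitstatus_to_exitcode(): negative signal number.
        return -os.WTERMSIG(status)
    raise ValueError("process neither exited nor was killed: %r" % (status,))


def exit_code_from_status(status):
    """Use the standard helper when the interpreter provides it."""
    if hasattr(os, "waitstatus_to_exitcode"):
        return os.waitstatus_to_exitcode(status)
    return legacy_exit_code(status)
```

A caller would pass the status returned by `os.waitpid()` straight into `exit_code_from_status()`.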
These changes add test implementations for criu-ns script.

Fixes: checkpoint-restore#1909

Signed-off-by: Dhanuka Warusadura <[email protected]>
These changes fix the `ImportError: No module named pathlib`
error when executing criu-ns tests located at criu/test/others/criu-ns

Signed-off-by: Dhanuka Warusadura <[email protected]>
CentOS 7 CI environment uses Python 2. To execute criu-ns
script in CentOS 7 changing the current shebang line to
python is required.

This reverts the changes made in a15a63f

Signed-off-by: Dhanuka Warusadura <[email protected]>
@avagin avagin requested review from a team, xemul and Snorch and removed request for a team October 12, 2023 15:54
@avagin avagin changed the base branch from criu-dev to master October 12, 2023 15:55
@rst0git
Member

rst0git commented Oct 12, 2023

@avagin Would it be possible to include the patch from #2282 in this release?

@avagin
Member Author

avagin commented Oct 12, 2023

@rst0git I will add it when it is merged to criu-dev.

@znley
Contributor

znley commented Oct 13, 2023

Daming Rosefinch

Thank you for your invitation. This name comes from a novel I like. Rosefinch is a mythical animal, and Daming Rosefinch is a sword shaped like a rosefinch; its Chinese name is "大明朱雀". What is your opinion?

@rst0git
Member

rst0git commented Oct 13, 2023

Daming Rosefinch

The word "daming" at first glance looks very similar to damning, which has a different meaning.
What do you think about "Vermilion Rosefinch"?

osctobe and others added 28 commits October 22, 2023 13:28
At least in Google's VM environment, kernel taints are unrelated to CRIU
runs. When kernel taints are ignored, don't fail tests if the taints change.

Signed-off-by: Michał Mirosław <[email protected]>
Make the errno values reported by cgroup04 always correct, and show the
relevant parameters.
While at it, constify constant strings.

Signed-off-by: Michał Mirosław <[email protected]>
cgroup04 test needs full control over mem and devices cgroup hierarchies.
Make the test's .checkskip script better at detecting if the cgroups are
available for use.

Signed-off-by: Michał Mirosław <[email protected]>
This fixes a failure to clean up after a failed test, where CRIU didn't start properly.

```
===================== Run zdtm/transition/socket-tcp in h ======================
Start test
./socket-tcp --pidfile=socket-tcp.pid --outfile=socket-tcp.out
Traceback (most recent call last):
  File ".../zdtm_py.py", line 1906, in do_run_test
    cr(cr_api, t, opts)
  File ".../zdtm_py.py", line 1584, in cr
    cr_api.dump("dump")
  File ".../zdtm_py.py", line 1386, in dump
    self.__dump_process = self.__criu_act(action,
  File ".../zdtm_py.py", line 1224, in __criu_act
    raise test_fail_exc("CRIU %s" % action)
test_fail_exc: CRIU dump

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<embedded module '_launcher'>", line 182, in run_filename_from_loader_as_main
  File "<embedded module '_launcher'>", line 34, in _run_code_in_main
  File ".../zdtm_py.py", line 2790, in <module>
    fork_zdtm()
  File ".../zdtm_py.py", line 2782, in fork_zdtm
    do_run_test(tinfo[0], tinfo[1], tinfo[2], tinfo[3])
  File ".../zdtm_py.py", line 1922, in do_run_test
    t.kill()
  File ".../zdtm_py.py", line 509, in kill
    os.kill(int(self.__pid), sig)
ProcessLookupError: [Errno 3] No such process
```

Signed-off-by: Michał Mirosław <[email protected]>
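The second traceback above comes from `os.kill()` being called for a test process that no longer exists. One way to make such cleanup tolerant is sketched below; the helper `kill_if_alive` is hypothetical and is not necessarily how the zdtm fix is written.

```python
import os
import signal
import subprocess
import sys


def kill_if_alive(pid, sig=signal.SIGKILL):
    """Send sig to pid; return False instead of raising if the process is gone."""
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False
    return True


# A child that has already exited and been reaped is simply reported as gone.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()
print(kill_if_alive(child.pid))
```

With this shape, a failure during cleanup no longer masks the original `test_fail_exc` that triggered it.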
When -- after restore -- sockets can't communicate, the test times out
while waiting on recvfrom(). Since the communication is local, send()
works instantaneously - so mark sockets with SOCK_NONBLOCK and report
failure if the message is not received immediately.

Signed-off-by: Michał Mirosław <[email protected]>
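The non-blocking check described in the commit message above can be illustrated with a local datagram socket pair. This sketch uses `setblocking(False)` rather than the `SOCK_NONBLOCK` creation flag the message mentions, and the helper `recv_now` is a hypothetical name; the actual test is written in C.

```python
import socket


def recv_now(sock, size=64):
    """Return the pending message, or None if nothing has arrived yet."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None


left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
left.setblocking(False)
right.setblocking(False)

# Local delivery is immediate, so a missing message means broken sockets,
# and the check fails right away instead of hanging until a timeout.
left.send(b"ping")
print(recv_now(right))

left.close()
right.close()
```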
All test logs are flooded with the "userns is supported" messages...

Signed-off-by: Andrei Vagin <[email protected]>
Currently page_size() returns an unsigned int value which, after a
"bitwise not", is promoted to unsigned long, e.g. in uffd.c
handle_page_fault. Since the value is unsigned, the promotion fills the
most significant bits with zeros, so the upper bits of the page fault
address are lost when it is masked. Make page_size() return unsigned
long to avoid this.

Signed-off-by: Vladislav Khmelevsky <[email protected]>
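The effect described in the page_size() commit message above can be modelled with explicit bit widths. In this sketch the masks `U32` and `U64` stand in for C's unsigned int and unsigned long, and the function names and sample address are made up for illustration.

```python
U32 = 0xFFFFFFFF
U64 = 0xFFFFFFFFFFFFFFFF


def page_start_u32(addr, page_size=4096):
    """Mask built in 32 bits, then zero-extended: the upper address bits are cleared."""
    mask = ~(page_size - 1) & U32
    return addr & mask


def page_start_u64(addr, page_size=4096):
    """Mask built in 64 bits: only the in-page offset bits are cleared."""
    mask = ~(page_size - 1) & U64
    return addr & mask


addr = 0x7F1234567ABC  # hypothetical fault address above 4 GiB
print(hex(page_start_u32(addr)))
print(hex(page_start_u64(addr)))
```

The first result keeps only the low 32 bits of the page address, which is the truncation the patch avoids by widening the return type.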
Currently, most of the time we don't have problems with the VVAR segment
and lazy restore, because when the VDSO is parked there is a munmap call
that calls UFFDIO_UNREGISTER on the destination address.
But we don't want to enable userfaultfd for VDSO and VVAR in the first
place.

Signed-off-by: Vladislav Khmelevsky <[email protected]>
It means CRIU has to close it when it is not needed.

It looks more logically correct and matches the behaviour of
the RESTORE_EXT_FILE callback.

Signed-off-by: Andrei Vagin <[email protected]>
This patch adds the `libdrm-dev` package to the list of CRIU
dependencies installed in CI to build CRIU with amdgpu plugin.

Signed-off-by: Radostin Stoyanov <[email protected]>
amdgpu_plugin.c:930:6: error: variable 'buffer' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (ret) {
            ^~~
amdgpu_plugin.c:988:8: note: uninitialized use occurs here
        xfree(buffer);

Signed-off-by: Radostin Stoyanov <[email protected]>
One memfd can be shared by a few restored files. Only one of these files is
restored with a file created with memfd_open. Others are restored by reopening
memfd files via /proc/self/fd/.

It seems unnecessary for restoring memfd memory mappings. We can always use the
origin file.

Signed-off-by: Andrei Vagin <[email protected]>
The "ColumnLimit: 120" is not only allowing lines to be longer than 80
characters but it also forces line wrapping at 120 characters. If total
expression length is more than 120 characters, clang-format will try to
wrap it as close to 120 as it can, it would not even allow to wrap at 80
characters if we really want it. But, as we all know, 80 characters is
the Linux kernel coding style default, and since our coding style is
based on it, it is really strange to prohibit wrapping lines at 80
characters...

Signed-off-by: Pavel Tikhomirov <[email protected]>
GCC's lto source:
> To avoid this problem the compiler must assume that it sees the
> whole program when doing link-time optimization.  Strictly
> speaking, the whole program is rarely visible even at link-time.
> Standard system libraries are usually linked dynamically or not
> provided with the link-time information.  In GCC, the whole
> program option (@option{-fwhole-program}) asserts that every
> function and variable defined in the current compilation
> unit is static, except for function @code{main} (note: at
> link time, the current unit is the union of all objects compiled
> with LTO).  Since some functions and variables need to
> be referenced externally, for example by another DSO or from an
> assembler file, GCC also provides the function and variable
> attribute @code{externally_visible} which can be used to disable
> the effect of @option{-fwhole-program} on a specific symbol.

As far as I read gcc's source, ipa_comdats() will avoid placing symbols
that are either already in a user-defined section or have
externally_visible attribute into new optimized gcc sections.

Signed-off-by: Dmitry Safonov <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
fork_and_ptrace_attach has to fork a child with CLONE_UNTRACED,
so that strace doesn't trace it.

Signed-off-by: Andrei Vagin <[email protected]>
read_ns_sys_file() can return an error, but we are trying to parse a
buffer before checking a return code.

CID 417395 (#3 of 3): String not null terminated (STRING_NULL)
2. string_null: Passing unterminated string buf to strtol, which expects
   a null-terminated string.

Signed-off-by: Andrei Vagin <[email protected]>
This check is redundant as line 201 checks for this condition.

Signed-off-by: Taemin Ha <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
The is_native field is a boolean. Therefore, the else if() can be
changed to a simple else{}.

Signed-off-by: Taemin Ha <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
The condition meant to check fd2 instead of fd1, which is checked in
line 24.

Signed-off-by: Taemin Ha <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Line 131 checks if (ret >= 0), so line 133 can be replaced by a simple else statement.

Signed-off-by: Taemin Ha <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
Eventpollentry's fields are set only when ret == 3 or ret == 6. The
remaining cases can be grouped together as an error.

Signed-off-by: Taemin Ha <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
At this point the correct position is already restored, so reading from
the fd results in the position being moved forward by 5 bytes.

Fixes: 9191f87 ("criu/files-reg.c: add build-id validation functionality")
Signed-off-by: Michal Clapinski <[email protected]>
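The side effect described in the commit message above, where a plain read moves an already-restored file position, is easy to demonstrate. The sketch below contrasts `os.read()` with the positional `os.pread()`; it only illustrates the problem and does not claim to show how the patch fixes it. The helper names and the sample file contents are invented.

```python
import os
import tempfile


def peek_bytes(fd, count, offset=0):
    """Read count bytes at offset without touching the descriptor's position."""
    return os.pread(fd, count, offset)


def position_after(read_fn, start=10):
    """Seek a scratch file to start, run read_fn(fd), and report the new position."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789abcdef")
        os.lseek(fd, start, os.SEEK_SET)
        read_fn(fd)
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.unlink(path)


print(position_after(lambda fd: os.read(fd, 5)))     # plain read advances the offset
print(position_after(lambda fd: peek_bytes(fd, 5)))  # positional read leaves it alone
```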
@avagin avagin merged commit 41938f1 into checkpoint-restore:master Oct 22, 2023
24 of 34 checks passed