Experimental Live ISO: 'pkg remove' reboots the system #303

Closed
probonopd opened this issue Nov 14, 2021 · 10 comments
Labels
bug Something isn't working freebsd Needs FreeBSD expertise and/or work help wanted Extra attention is needed

Comments

@probonopd
Member

Experimental Live ISO: pkg remove reboots the system.
Tested with 12.2 and 13.0.
Related to the use of unionfs? (Need to try without, once we can easily monkey patch ISOs.)

Possibly FreeBSD can be set up to send out crash dumps over the serial console, which can be viewed in QEMU?

@probonopd probonopd added the bug Something isn't working label Nov 14, 2021
@probonopd
Member Author

Same when trying to delete any file in /usr/local/ on the Live system.
Probably because we are currently mounting the unionfs under the real tree.

@probonopd probonopd pinned this issue Nov 22, 2021
@probonopd
Member Author

probonopd commented Nov 22, 2021

pkg remove leads to the same result when using

mkdir -p /tmp/unionfs/usr/local
mount -t nullfs /media/uzip/usr/local /usr/local
mount -t unionfs /tmp/unionfs/usr/local /usr/local

instead of

mount -t tmpfs tmpfs /usr/local
mount -t unionfs -o below /media/uzip/usr/local /usr/local

However, it seems to be possible to delete files from /usr/local/bin.
What might be causing this?

Someone with the knowledge to investigate a kernel crash/instant reboot is needed to look into this.

@probonopd probonopd added the freebsd Needs FreeBSD expertise and/or work label Nov 22, 2021
@probonopd
Member Author

Possibly we should make the unionfs mount first and then do the nullfs mount?

@probonopd probonopd changed the title Experimental Live ISO: 'pkg remove' rebots the system Experimental Live ISO: 'pkg remove' reboots the system Dec 1, 2021
@probonopd
Member Author

probonopd commented Dec 1, 2021

FreeBSD can send kernel dumps over the network. This is especially useful for getting kernel dumps from machines running Live ISOs.

On the machine that should act as the server on which the dumps will be stored:

ifconfig
sudo pkg install netdumpd
mkdir -p network/dumps
sudo netdumpd -D -d ./network/dumps

On the machine under test:

ifconfig
sudo dumpon -s 192.168.0.xxx -c 192.168.0.yyy <interface>

Use the ifconfig output to fill in the server's IP address (-s), the local machine's IP address (-c), and the local interface name (e.g., en0).
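
To avoid mixing up the two addresses, the dumpon line can be assembled from named values first. This is only a sketch: `netdump_cmd` is a hypothetical helper name, and the addresses and interface in the usage note are placeholders, not values from this setup. It prints the command instead of running it, so it can be reviewed before use.

```shell
#!/bin/sh
# Sketch: build the dumpon(8) command line for netdump from three values.
# netdump_cmd is a hypothetical helper; it only prints, it does not execute.
netdump_cmd() {
    server_ip=$1   # machine running netdumpd (-s)
    client_ip=$2   # IP address of the machine under test (-c)
    iface=$3       # network interface of the machine under test

    if [ -z "$server_ip" ] || [ -z "$client_ip" ] || [ -z "$iface" ]; then
        echo "usage: netdump_cmd <server-ip> <client-ip> <interface>" >&2
        return 1
    fi

    echo "dumpon -s $server_ip -c $client_ip $iface"
}
```

For example, `netdump_cmd 192.0.2.10 192.0.2.20 em0` prints the line to run with sudo on the machine under test.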

@probonopd
Member Author

probonopd commented Dec 1, 2021

Looking at the resulting vmcore with strings, I see toward the end:

panic: lockmgr_xlock_hard: recursing on non recursive lockmgr
0xfffff801197ce628 @ /usr/src/sys/kern/vfs_subr.c:2974
cpuid = 0
time = 1638351901
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff80bdd114 at lockmgr_xlock_hard+0x484
#4 0xffffffff80cfc1b8 at _vn_lock+0x48
#5 0xffffffff80ce4621 at vget_finish+0x21
#6 0xffffffff80b4437d at tmpfs_alloc_vp+0x12d
#7 0xffffffff80b41d31 at tmpfs_lookup1+0x181
#8 0xffffffff80cc9d4d at vfs_cache_lookup+0xad
#9 0xffffffff80cd8120 at relookup+0x90
#10 0xffffffff82cff189 at unionfs_relookup+0xf9
#11 0xffffffff82cff31d at unionfs_relookup_for_delete+0x4d
#12 0xffffffff82d03b05 at unionfs_rmdir+0xa5
#13 0xffffffff8114d717 at VOP_RMDIR_APV+0x27
#14 0xffffffff80cf825d at kern_frmdirat+0x2ed
#15 0xffffffff8108ba8c at amd64_syscall+0x10c
#16 0xffffffff810620ce at fast_syscall_common+0xf8
Uptime: 2m48s

Full vmcore:
https://github.com/helloSystem/ISO/releases/download/assets/vmcore.192.168.0.208.0.tar.bz2

Possibly this already gives some insight into what is going on? It clearly seems to be tripping over something unionfs-related while trying to remove a directory.

A quick search brings up
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242369
which might be related, as it is also about the combination of tmpfs and unionfs, like the helloSystem Live ISO is using.
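
For anyone repeating this, the panic message and backtrace can be cut out of the strings output with a small filter. A minimal sketch, assuming the panic text starts a line with `panic:` and the block ends at the `Uptime:` line, as in the excerpt above. `panic_excerpt` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print only the lines from "panic:" through "Uptime:" read on stdin.
# Assumes both markers begin a line, as in the excerpt shown in this comment.
panic_excerpt() {
    sed -n '/^panic:/,/^Uptime:/p'
}
```

Usage would be something like `strings vmcore | panic_excerpt`, where `vmcore` is a placeholder for the path of the extracted dump.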

@probonopd
Member Author

@michaeldexter says at https://twitter.com/michaeldexter/status/1586800937859158018

One rule of unionfs is: change the upper levels all you want but not the lower ones. Upper has been known to break too.

Is this the culprit?

mount -t unionfs -o below /media/.uzip/usr/local /usr/local

Would we be better off without -o below, and then mounting tmpfs atop?
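
To make the question concrete, here is a dry-run sketch of one possible layering without -o below. It only prints the mount commands. The read-only nullfs step and the tmpfs-backed upper directory are assumptions for illustration, not a tested fix, and `plan_union_mounts` is a hypothetical helper name.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) a layering where the writable layer is
# mounted on top of the read-only tree instead of using "-o below".
# Arguments: <lower read-only tree> <upper scratch dir> <target mount point>
plan_union_mounts() {
    lower=$1
    upper=$2
    target=$3

    # Expose the read-only ISO content at the target (assumption: ro nullfs).
    echo "mount -t nullfs -o ro $lower $target"
    # Back the writable layer with tmpfs (assumption).
    echo "mkdir -p $upper"
    echo "mount -t tmpfs tmpfs $upper"
    # Without "-o below", the given directory becomes the upper layer.
    echo "mount -t unionfs $upper $target"
}

plan_union_mounts /media/.uzip/usr/local /tmp/unionfs/usr/local /usr/local
```

Whether this ordering avoids the panic is exactly the open question here.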

@probonopd
Member Author

probonopd commented Oct 31, 2022

@Stefar77 says at https://twitter.com/Stefar77/status/1587200402512019456

It's a bit less likely to put you in the debugger (panic) instantly when the lower layer is read-only. - I think -

@probonopd
Member Author

@DarkHelmet433 says at https://twitter.com/karinjiri/status/1587509601011830784

I still wonder about the old altroot code. It was done at an entirely different layer (vfs_lookup) and is basically unionfs-lite. It adds a second / (root) fallback. Eg: put a jail OS layer in a common directory. It was the backend of geocities hosting. Vastly simpler code.

Sounds intriguing. I like simple. Does it still exist? Can it still be built?
Where does "the old altroot code" live, any pointers?

@probonopd
Member Author

probonopd commented Nov 6, 2022

@michaeldexter: Your observation was spot on...
Let's see whether this fixes it.

@probonopd
Member Author

Well... halfway:

You can now pkg remove packages that you installed during the Live session. But you still can't delete or remove files that are part of the ISO without crashing.

@probonopd probonopd unpinned this issue Nov 10, 2022