Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rump_server: make nuse-syscalls.c and rump_syscalls.c Linux-ize #39

Closed
thehajime opened this issue Feb 9, 2015 · 1 comment
Closed

Comments

@thehajime
Copy link
Member

current system call proxy uses rumpkernel system call template, which derived from NetBSD system calls. It should be from Linux's one.

@thehajime
Copy link
Member Author

This issue was moved to libos-nuse/linux-libos-tools#14

thehajime pushed a commit that referenced this issue May 26, 2015
memset() to 0 interfaces array before reusing
usb_configuration structure.

This commit fix bug:

ln -s functions/acm.1 configs/c.1
ln -s functions/acm.2 configs/c.1
ln -s functions/acm.3 configs/c.1
echo "UDC name" > UDC
echo "" > UDC
rm configs/c.1/acm.*
rmdir functions/*
mkdir functions/ecm.usb0
ln -s functions/ecm.usb0 configs/c.1
echo "UDC name" > UDC

[   82.220969] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   82.229009] pgd = c0004000
[   82.231698] [00000000] *pgd=00000000
[   82.235260] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[   82.240638] Modules linked in:
[   82.243681] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc2 #39
[   82.249926] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   82.256003] task: c07cd2f0 ti: c07c8000 task.ti: c07c8000
[   82.261393] PC is at composite_setup+0xe3c/0x1674
[   82.266073] LR is at composite_setup+0xf20/0x1674
[   82.270760] pc : [<c03510d4>]    lr : [<c03511b8>]    psr: 600001d3
[   82.270760] sp : c07c9df0  ip : c0806448  fp : ed8c9c9c
[   82.282216] r10: 00000001  r9 : 00000000  r8 : edaae918
[   82.287425] r7 : ed551cc0  r6 : 00007fff  r5 : 00000000  r4 : ed799634
[   82.293934] r3 : 00000003  r2 : 00010002  r1 : edaae918  r0 : 0000002e
[   82.300446] Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[   82.307910] Control: 10c5387d  Table: 6bc1804a  DAC: 00000015
[   82.313638] Process swapper/0 (pid: 0, stack limit = 0xc07c8210)
[   82.319627] Stack: (0xc07c9df0 to 0xc07ca000)
[   82.323969] 9de0:                                     00000000 c06e65f4 00000000 c07c9f68
[   82.332130] 9e00: 00000067 c07c59ac 000003f7 edaae918 ed8c9c98 ed799690 eca2f140 200001d3
[   82.340289] 9e20: ee79a2d8 c07c9e88 c07c5304 ffff55db 00010002 edaae810 edaae860 eda96d50
[   82.348448] 9e40: 00000009 ee264510 00000007 c07ca444 edaae860 c0340890 c0827a40 ffff55e0
[   82.356607] 9e60: c0827a40 eda96e40 ee264510 edaae810 00000000 edaae860 00000007 c07ca444
[   82.364766] 9e80: edaae860 c0354170 c03407dc c033db4c edaae810 00000000 00000000 00000010
[   82.372925] 9ea0: 00000032 c0341670 00000000 00000000 00000001 eda96e00 00000000 00000000
[   82.381084] 9ec0: 00000000 00000032 c0803a23 ee1aa840 00000001 c005d54c 249e2450 00000000
[   82.389244] 9ee0: 200001d3 ee1aa840 ee1aa8a0 ed84f4c0 00000000 c07c9f68 00000067 c07c59ac
[   82.397403] 9f00: 00000000 c005d688 ee1aa840 ee1aa8a0 c07db4b4 c006009c 00000032 00000000
[   82.405562] 9f20: 00000001 c005ce20 c07c59ac c005cf34 f002000c c07ca780 c07c9f68 00000057
[   82.413722] 9f40: f0020000 413fc090 00000001 c00086b4 c000f804 60000053 ffffffff c07c9f9c
[   82.421880] 9f60: c0803a20 c0011fc0 00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.430040] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.438199] 9fa0: c000f800 c000f804 60000053 ffffffff 00000000 c0050e70 c0803bc0 c0783bd8
[   82.446358] 9fc0: ffffffff ffffffff c0783664 00000000 00000000 c07b13e8 00000000 c0803e54
[   82.454517] 9fe0: c07ca480 c07b13e4 c07ce40c 4000406a 00000000 40008074 00000000 00000000
[   82.462689] [<c03510d4>] (composite_setup) from [<c0340890>] (s3c_hsotg_complete_setup+0xb4/0x418)
[   82.471626] [<c0340890>] (s3c_hsotg_complete_setup) from [<c0354170>] (usb_gadget_giveback_request+0xc/0x10)
[   82.481429] [<c0354170>] (usb_gadget_giveback_request) from [<c033db4c>] (s3c_hsotg_complete_request+0xcc/0x12c)
[   82.491583] [<c033db4c>] (s3c_hsotg_complete_request) from [<c0341670>] (s3c_hsotg_irq+0x4fc/0x558)
[   82.500614] [<c0341670>] (s3c_hsotg_irq) from [<c005d54c>] (handle_irq_event_percpu+0x50/0x150)
[   82.509291] [<c005d54c>] (handle_irq_event_percpu) from [<c005d688>] (handle_irq_event+0x3c/0x5c)
[   82.518145] [<c005d688>] (handle_irq_event) from [<c006009c>] (handle_fasteoi_irq+0xd4/0x18c)
[   82.526650] [<c006009c>] (handle_fasteoi_irq) from [<c005ce20>] (generic_handle_irq+0x20/0x30)
[   82.535242] [<c005ce20>] (generic_handle_irq) from [<c005cf34>] (__handle_domain_irq+0x6c/0xdc)
[   82.543923] [<c005cf34>] (__handle_domain_irq) from [<c00086b4>] (gic_handle_irq+0x2c/0x6c)
[   82.552256] [<c00086b4>] (gic_handle_irq) from [<c0011fc0>] (__irq_svc+0x40/0x74)
[   82.559716] Exception stack(0xc07c9f68 to 0xc07c9fb0)
[   82.564753] 9f60:                   00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.572913] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.581069] 9fa0: c000f800 c000f804 60000053 ffffffff
[   82.586113] [<c0011fc0>] (__irq_svc) from [<c000f804>] (arch_cpu_idle+0x30/0x3c)
[   82.593491] [<c000f804>] (arch_cpu_idle) from [<c0050e70>] (cpu_startup_entry+0x128/0x1a4)
[   82.601740] [<c0050e70>] (cpu_startup_entry) from [<c0783bd8>] (start_kernel+0x350/0x3bc)
[   82.609890] Code: 0a000002 e3530005 05975010 15975008 (e5953000)
[   82.615965] ---[ end trace f57d5f599a5f1bfa ]---

Most of kernel code assume that interface array in
struct usb_configuration is NULL terminated.

When gadget is composed with configfs configuration
structure may be reused for different functions set.

This bug happens because purge_configs_funcs() sets
only next_interface_id to 0. Interface array still
contains pointers to already freed interfaces. If in
second try we add less interfaces than earlier we
may access unallocated memory when trying to get
interface descriptors.

Signed-off-by: Krzysztof Opasiak <[email protected]>
Cc: <[email protected]> # 3.10+
Signed-off-by: Felipe Balbi <[email protected]>
thehajime pushed a commit that referenced this issue Aug 14, 2015
Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  #3 io_schedule_timeout at ffffffff815aad6a
  #4 bit_wait_io at ffffffff815abfc6
  #5 __wait_on_bit at ffffffff815abda5
  #6 wait_on_page_bit at ffffffff8111fd4f
  #7 shrink_page_list at ffffffff81135445
  #8 shrink_inactive_list at ffffffff81135845
  #9 shrink_lruvec at ffffffff81135ead
 #10 shrink_zone at ffffffff811360c3
 #11 shrink_zones at ffffffff81136eff
 #12 do_try_to_free_pages at ffffffff8113712f
 #13 try_to_free_mem_cgroup_pages at ffffffff811372be
 #14 try_charge at ffffffff81189423
 #15 mem_cgroup_try_charge at ffffffff8118c6f5
 #16 __add_to_page_cache_locked at ffffffff8112137d
 #17 add_to_page_cache_lru at ffffffff81121618
 #18 pagecache_get_page at ffffffff8112170b
 #19 grow_dev_page at ffffffff811c8297
 #20 __getblk_slow at ffffffff811c91d6
 #21 __getblk_gfp at ffffffff811c92c1
 #22 ext4_ext_grow_indepth at ffffffff8124565c
 #23 ext4_ext_create_new_leaf at ffffffff81246ca8
 #24 ext4_ext_insert_extent at ffffffff81246f09
 #25 ext4_ext_map_blocks at ffffffff8124a848
 #26 ext4_map_blocks at ffffffff8121a5b7
 #27 mpage_map_one_extent at ffffffff8121b1fa
 #28 mpage_map_and_submit_extent at ffffffff8121f07b
 #29 ext4_writepages at ffffffff8121f6d5
 #30 do_writepages at ffffffff8112c490
 #31 __filemap_fdatawrite_range at ffffffff81120199
 #32 filemap_flush at ffffffff8112041c
 #33 ext4_alloc_da_blocks at ffffffff81219da1
 #34 ext4_rename at ffffffff81229b91
 #35 ext4_rename2 at ffffffff81229e32
 #36 vfs_rename at ffffffff811a08a5
 #37 SYSC_renameat2 at ffffffff811a3ffc
 #38 sys_renameat2 at ffffffff811a408e
 #39 sys_rename at ffffffff8119e51e
 #40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: [email protected] # 3.9+
[[email protected]: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant