[Bug] z_slice_init is being called with capacity == 0 during z_open scout process #876

Open
lukebayes opened this issue Feb 1, 2025 · 2 comments
Labels
bug Something isn't working

@lukebayes

Describe the bug

I'm building for pico2_w on Ubuntu 24.04.1 using arm-none-eabi-gcc.

I've tried both the 1.1.1 branch and master; they behave the same.

Something in the z_open process attempts to allocate a z_slice with zero capacity. The resulting malloc failure prevents initialization, so no connection can be opened on the Pico2.
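To illustrate the failure mode, here is a minimal sketch of an allocator that treats a zero-byte request as a failure and fires a hook, which is what the traces in this issue suggest is happening. The names strict_malloc, malloc_failed_hook and malloc_failed_hook_calls are hypothetical stand-ins, not FreeRTOS or zenoh-pico symbols, and the real allocator may differ in detail.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter standing in for the application's malloc-failed hook. */
static int malloc_failed_hook_calls = 0;

static void malloc_failed_hook(void) { malloc_failed_hook_calls++; }

/* Hypothetical allocator (assumption): a request for 0 bytes is never
 * served, so it returns NULL and is reported as a failure. */
static void *strict_malloc(size_t size) {
    void *p = NULL;
    if (size > 0) {
        p = malloc(size);
    }
    if (p == NULL) {
        /* Fires for real exhaustion and for size == 0 alike. */
        malloc_failed_hook();
    }
    return p;
}
```

With an allocator shaped like this, a caller that copies an empty slice would report an out-of-memory condition even though the heap has plenty of space.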

I'm running a zenohd router on my laptop with 'RUST_LOG=info', and it shows no logs unless the patch below is applied.

I'm quite sure this patch is not healthy, but it does allow the connection to proceed and messages to get published as expected.

I have added the following to _z_slice_init:

z_result_t _z_slice_init(_z_slice_t *bs, size_t capacity) {
  if (capacity == 0) {
    printf("!!!!!!!!!!!!!!!!!!!!!!!!!! DANGER DANGER DANGER !!!!!!!!!!!!!!!!!!!!!!!!!!\n");
    capacity = 1;
  }
  ...
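A possibly cleaner alternative to bumping the capacity to 1 would be to treat a zero-capacity slice as a valid empty slice and skip the allocation entirely. The sketch below uses a hypothetical, simplified sketch_slice_t rather than the real _z_slice_t (which has more fields, such as the delete context), so it only shows the shape of the idea and is not a tested patch for zenoh-pico.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for _z_slice_t (assumption). */
typedef struct {
    size_t len;
    uint8_t *start;
} sketch_slice_t;

/* Returns 0 on success, -1 on allocation failure. */
static int sketch_slice_init(sketch_slice_t *bs, size_t capacity) {
    if (capacity == 0) {
        /* Empty slice: nothing to allocate, so the allocator is never called. */
        bs->start = NULL;
        bs->len = 0;
        return 0;
    }
    bs->start = (uint8_t *)malloc(capacity);
    if (bs->start == NULL) {
        bs->len = 0;
        return -1;
    }
    bs->len = capacity;
    return 0;
}

/* Releases the buffer; free(NULL) is a no-op, so empty slices are safe. */
static void sketch_slice_clear(sketch_slice_t *bs) {
    free(bs->start);
    bs->start = NULL;
    bs->len = 0;
}
```

Code that later reads from the slice would need to tolerate start == NULL when len is 0, which is the main cost of this approach compared with allocating a dummy byte.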

When I break on this printf statement, here's what GDB reports:

─── Output/messages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[rp2350.dap.core1] external reset detected
[New Thread 1]

Thread 1 "rp2350.dap.core1" hit Breakpoint 1, _z_slice_init (bs=0x2000dfc8 <ucHeap+2388>, capacity=0) at $PROJ/vendor/zenoh-pico/src/collections/slice.c:42
42          printf("!!!!!!!!!!!!!!!!!!!!!!!!!! DANGER DANGER DANGER !!!!!!!!!!!!!!!!!!!!!!!!!!\n");
─── Assembly ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 0x1002b1ee  _z_slice_init+6  str       r0, [r7, #12]
 0x1002b1f0  _z_slice_init+8  str       r1, [r7, #8]
 0x1002b1f2  _z_slice_init+10 ldr       r3, [r7, #8]
 0x1002b1f4  _z_slice_init+12 cmp       r3, #0
 0x1002b1f6  _z_slice_init+14 bne.n     0x1002b202 <_z_slice_init+26>
!0x1002b1f8  _z_slice_init+16 ldr       r0, [pc, #104]  @ (0x1002b264 <_z_slice_init+124>)
 0x1002b1fa  _z_slice_init+18 bl        0x100067de <__wrap_puts>
 0x1002b1fe  _z_slice_init+22 movs      r3, #1
 0x1002b200  _z_slice_init+24 str       r3, [r7, #8]
 0x1002b202  _z_slice_init+26 ldr       r0, [r7, #8]
─── Breakpoints ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[1] break at 0x1002b1f8 in $PROJ/vendor/zenoh-pico/src/collections/slice.c:42 for slice.c:42 hit 1 time
─── Expressions ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─── History ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─── Memory ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─── Registers ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
          r0 0x2000dfc8           r1 0x00000000           r2 0x2000dfc8           r3 0x00000000            r4 0x00000800          r5 0x2000e024           r6 0x00000000           r7 0x2000de78             r8 0x08080808           r9 0x09090909
         r10 0x10101010          r11 0x11111111          r12 0x20012a62           sp 0x2000de78            lr 0x1002b37b          pc 0x1002b1f8         xpsr 0x61000000        fpscr 0x00000000            msp 0x20080f98          psp 0x2000de78
      msp_ns 0x00000000       psp_ns 0xfffffffc        msp_s 0x20080f98        psp_s 0x2000de78       primask 0x00           basepri 0x00          faultmask 0x00            control 0x06             msplim_s 0x00000000     psplim_s 0x2000d6a8
   msplim_ns 0x00000000    psplim_ns 0x00000000    primask_s 0x00          basepri_s 0x00         faultmask_s 0x00         control_s 0x06         primask_ns 0x00         basepri_ns 0x00         faultmask_ns 0x00         control_ns 0x04      
─── Source ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 37  }
 38  
 39  /*-------- Slice --------*/
 40  z_result_t _z_slice_init(_z_slice_t *bs, size_t capacity) {
 41      if (capacity == 0) {
!42        printf("!!!!!!!!!!!!!!!!!!!!!!!!!! DANGER DANGER DANGER !!!!!!!!!!!!!!!!!!!!!!!!!!\n");
 43        capacity = 1;
 44      }
 45      bs->start = (uint8_t *)z_malloc(capacity);
 46      if (bs->start == NULL) {
─── Stack ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[0] from 0x1002b1f8 in _z_slice_init+16 at $PROJ/vendor/zenoh-pico/src/collections/slice.c:42
[1] from 0x1002b37a in _z_slice_copy+22 at $PROJ/vendor/zenoh-pico/src/collections/slice.c:99
[2] from 0x1004076c in _z_t_msg_copy_open+44 at $PROJ/vendor/zenoh-pico/src/protocol/definitions/transport.c:342
[3] from 0x10040844 in _z_t_msg_copy+116 at $PROJ/vendor/zenoh-pico/src/protocol/definitions/transport.c:372
[4] from 0x10042be6 in _z_link_recv_t_msg+362 at $PROJ/vendor/zenoh-pico/src/transport/common/rx.c:79
[5] from 0x100388d6 in _z_unicast_handshake_client+826 at $PROJ/vendor/zenoh-pico/src/transport/unicast/transport.c:217
[6] from 0x1003899c in _z_unicast_open_client+24 at $PROJ/vendor/zenoh-pico/src/transport/unicast/transport.c:328
[7] from 0x10035ae8 in _z_new_transport_client+106 at $PROJ/vendor/zenoh-pico/src/transport/manager.c:38
[8] from 0x10035cac in _z_new_transport+30 at $PROJ/vendor/zenoh-pico/src/transport/manager.c:114
[9] from 0x1002d970 in _z_open_inner+100 at $PROJ/vendor/zenoh-pico/src/net/session.c:123
[+]
─── Threads ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[2] id 1 name rp2350.dap.core0 from 0x1000b8c0 in prvIdleTask+16 at $PROJ/vendor/freertos/tasks.c:5846
[1] id 2 name rp2350.dap.core1 from 0x1002b1f8 in _z_slice_init+16 at $PROJ/vendor/zenoh-pico/src/collections/slice.c:42
─── Variables ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
arg bs = 0x2000dfc8 <ucHeap+2388>: {len = 0,start = 0x200109e8 <ucHeap+13172> "",_delete_context = {…, capacity = 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Any tips, pointers or help would be greatly appreciated!

To reproduce

  1. Build and run an application using zenoh-pico on the 1.1.1 branch (or master branch on Feb 1, 2025).
  2. Attempt to z_open a new connection with an empty / default config.
  3. The call trips the FreeRTOS vApplicationMallocFailedHook (if that setting is enabled; otherwise it panics).

System info

  • Host Platform Ubuntu 24.04.1
  • Client Platform Pico2 W
  • Pico SDK 2.1.0
  • CYW43 / LWIP (Provided by Pico SDK)
  • FreeRTOS SMP - from origin/main branch (>V11.1.0) with submodules and CMAKE configs loaded from here:
        portable/ThirdParty/Community-Supported-Ports
        portable/ThirdParty/Partner-Supported-Ports
@lukebayes added the bug label on Feb 1, 2025
@sashacmc
Member

sashacmc commented Feb 1, 2025

Hello @lukebayes, thanks for the report.
It looks like the zenoh handshake message received is incorrect (maybe empty zid?).
Could you please send the zenohd router config file as well as the zenohd log with RUST_LOG=trace?
The zenoh-pico log might also be useful (enabled with the cmake parameters -DCMAKE_BUILD_TYPE=Debug -DZENOH_DEBUG=3).

@lukebayes
Author

Sure, thanks!

FWIW, the failure happens every single time for me.

I've attached the output of the zenohd with trace logging enabled.

Here is the printf log from my app (a mix of your log statements and mine):

[INFO] (main.c:302) carz reset with branch: main and hash: 69a516c
[INFO] (main.c:306) cyw43_task created
[INFO] (main.c:311) blink_task created
[INFO] (main.c:315) starting scheduler
[INFO] (main.c:95) --------------------
[INFO] (main.c:96) cyw43_task started with priority 15 and core 0
[INFO] (main.c:106) -----------------------
[INFO] (main.c:109) cyw43_task initialized
Version: 7.95.49 (2271bb6 CY) CRC: b7a28ef3 Date: Mon 2021-11-29 22:50:27 PST Ucode Ver: 1043.2162 FWID 01-c51d9400
cyw43 loaded ok, mac 2c:cf:67:b5:2a:cb
API: 12.2
Data: RaspberryPi.PicoW
Compiler: 1.29.4
ClmImport: 1.47.1
Customization: v5 22/06/24
Creation: 2022-06-24 06:55:08

[INFO] (main.c:113) cyw43 connecting to: Monstera with password? 1
connect status: joining
[INFO] (main.c:145) --------------------
[INFO] (main.c:146) blink_task started on core: 1
[INFO] (main.c:152) blink_task begins loop
connect status: no ip
connect status: link up
[INFO] (main.c:76) -----------------------
[INFO] (main.c:77) Connected to wifi
[INFO] (main.c:121) WIFI CONNECTED
[INFO] (main.c:64) IP Address: 192.168.50.237
[INFO] (main.c:65) Netmask: 255.255.255.0
[INFO] (main.c:66) Gateway: 192.168.50.1
[INFO] (main.c:125) zenoh ready to begin
[INFO] (z_pub.c:58) Opening client session
[1970-01-01T00:00:11Z DEBUG ::_z_scout_encode] Encoding _Z_MID_SCOUT
[1970-01-01T00:00:11Z DEBUG ::_z_hello_decode_na] Decoding _Z_MID_HELLO
[1970-01-01T00:00:11Z DEBUG ::_z_locators_decode_na] Decoding _LOCATORS
[1970-01-01T00:00:11Z DEBUG ::__z_scout_loop] Received _Z_HELLO message
[1970-01-01T00:00:12Z DEBUG ::_z_unicast_handshake_client] Sending Z_INIT(Syn)
[1970-01-01T00:00:12Z DEBUG ::_z_init_encode] Encoding _Z_MID_T_INIT


[ERROR] (z_pub.c:32: errno: None) Unable to open session! Retrying in a moment...
[1970-01-01T00:00:30Z DEBUG ::_z_scout_encode] Encoding _Z_MID_SCOUT
[1970-01-01T00:00:30Z DEBUG ::_z_hello_decode_na] Decoding _Z_MID_HELLO
[1970-01-01T00:00:30Z DEBUG ::_z_locators_decode_na] Decoding _LOCATORS
[1970-01-01T00:00:30Z DEBUG ::__z_scout_loop] Received _Z_HELLO message
[1970-01-01T00:00:31Z DEBUG ::_z_unicast_handshake_client] Sending Z_INIT(Syn)
[1970-01-01T00:00:31Z DEBUG ::_z_init_encode] Encoding _Z_MID_T_INIT
[1970-01-01T00:00:31Z DEBUG ::_z_init_decode] Decoding _Z_MID_T_INIT
[1970-01-01T00:00:31Z DEBUG ::_z_unicast_handshake_client] Received Z_INIT(Ack)
[1970-01-01T00:00:31Z DEBUG ::_z_unicast_handshake_client] Sending Z_OPEN(Syn)
[1970-01-01T00:00:31Z DEBUG ::_z_open_encode] Encoding _Z_MID_T_OPEN
[1970-01-01T00:00:31Z DEBUG ::_z_open_decode] Decoding _Z_MID_T_OPEN

Here's the gdb output:

[rp2350.dap.core1] external reset detected
[New Thread 2]

Thread 1 "rp2350.dap.core0" hit Breakpoint 1, vApplicationMallocFailedHook () at /$PROJECT_DIR/apps/carz/main.c:50
50        log_err("Malloc failed");
─── Assembly ────────────────────────────────────────────────────────────────────────────────────
~
~
 0x10000364  vApplicationMallocFailedHook+0  push       {r4, r5, r7, lr}
 0x10000366  vApplicationMallocFailedHook+2  sub        sp, #8
 0x10000368  vApplicationMallocFailedHook+4  add        r7, sp, #8
!0x1000036a  vApplicationMallocFailedHook+6  ldr        r3, [pc, #68]   @ (0x100003b0 <vApplicationMallocFailedHook+76>)
 0x1000036c  vApplicationMallocFailedHook+8  ldr        r3, [r3, #0]
 0x1000036e  vApplicationMallocFailedHook+10 ldr        r4, [r3, #12]
 0x10000370  vApplicationMallocFailedHook+12 ldr        r0, [pc, #64]   @ (0x100003b4 <vApplicationMallocFailedHook+80>)
 0x10000372  vApplicationMallocFailedHook+14 bl 0x10000290 <render_file_name>
─── Breakpoints ─────────────────────────────────────────────────────────────────────────────────
[1] break at 0x1000036a in /$PROJECT_DIR/apps/carz/main.c:50 for vApplicationMallocFailedHook hit 1 time
─── Expressions ─────────────────────────────────────────────────────────────────────────────────
─── History ─────────────────────────────────────────────────────────────────────────────────────
─── Memory ──────────────────────────────────────────────────────────────────────────────────────
─── Registers ───────────────────────────────────────────────────────────────────────────────────
          r0 0x00000000           r1 0x00000000            r2 0x2000145b          r3 0x00000000
          r4 0x00000800           r5 0x2000e3ec            r6 0x00000000          r7 0x2000e1f8
          r8 0x08080808           r9 0x09090909           r10 0x10101010         r11 0x11111111
         r12 0x01010101           sp 0x2000e1f0            lr 0x100082e7          pc 0x1000036a
        xpsr 0x61000000        fpscr 0x00000000           msp 0x20081f00         psp 0x2000e1f0
      msp_ns 0x00000000       psp_ns 0xfffffffc         msp_s 0x20081f00       psp_s 0x2000e1f0
     primask 0x00            basepri 0x00           faultmask 0x00           control 0x06
    msplim_s 0x00000000     psplim_s 0x2000d858     msplim_ns 0x00000000   psplim_ns 0x00000000
   primask_s 0x00          basepri_s 0x00         faultmask_s 0x00         control_s 0x06
  primask_ns 0x00         basepri_ns 0x00        faultmask_ns 0x00        control_ns 0x04
─── Source ──────────────────────────────────────────────────────────────────────────────────────
 45      tight_loop_contents();
 46    }
 47  }
 48
 49  void vApplicationMallocFailedHook(void) {
!50    log_err("Malloc failed");
 51    taskDISABLE_INTERRUPTS();
 52    while (true) {
 53      tight_loop_contents();
 54    }
─── Stack ───────────────────────────────────────────────────────────────────────────────────────
[0] from 0x1000036a in vApplicationMallocFailedHook+6 at /$PROJECT_DIR/apps/carz/main.c:50
[1] from 0x100082e6 in pvPortMalloc+474 at /$PROJECT_DIR/vendor/freertos/portable/MemMang/heap_4.c:340
[2] from 0x1003b2de in z_malloc+14 at /$PROJECT_DIR/vendor/zenoh-pico/src/system/rpi_pico/system.c:44
[3] from 0x1002bd12 in _z_slice_init+16 at /$PROJECT_DIR/vendor/zenoh-pico/src/collections/slice.c:41
[4] from 0x1002be7e in _z_slice_copy+22 at /$PROJECT_DIR/vendor/zenoh-pico/src/collections/slice.c:95
[5] from 0x10043124 in _z_t_msg_copy_open+44 at /$PROJECT_DIR/vendor/zenoh-pico/src/protocol/definitions/transport.c:344
[6] from 0x100431fc in _z_t_msg_copy+116 at /$PROJECT_DIR/vendor/zenoh-pico/src/protocol/definitions/transport.c:374
[7] from 0x10045b42 in _z_link_recv_t_msg+362 at /$PROJECT_DIR/vendor/zenoh-pico/src/transport/common/rx.c:79
[8] from 0x1003a91e in _z_unicast_handshake_client+898 at /$PROJECT_DIR/vendor/zenoh-pico/src/transport/unicast/transport.c:217
[9] from 0x1003aa26 in _z_unicast_open_client+24 at /$PROJECT_DIR/vendor/zenoh-pico/src/transport/unicast/transport.c:328
[+]
─── Threads ─────────────────────────────────────────────────────────────────────────────────────
[2] id 2 name rp2350.dap.core1 from 0x1000c38c in prvPassiveIdleTask+12 at /$PROJECT_DIR/vendor/freertos/tasks.c:5757
[1] id 1 name rp2350.dap.core0 from 0x1000036a in vApplicationMallocFailedHook+6 at /$PROJECT_DIR/apps/carz/main.c:50
─── Variables ───────────────────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────────────────────────────────

The router config is copied from the example code (maybe a minor modification):

{
  plugins: {
    rest: {                        // activate and configure the REST plugin
      http_port: 8000              // with HTTP server listening on port 8000
    },
    storage_manager: {             // activate and configure the storage_manager plugin
      storages: {
        demo: {                    // configure a "demo" storage
          key_expr: "**",          // which subscribes and replies to query on demo/**
          volume: {                // and using the "memory" volume (always present by default)
            id: "memory"
          }
        }
      }
    }
  }
}

zenohd.log
