
drivers: video: API enhancement #72959

Open
josuah opened this issue May 17, 2024 · 17 comments

josuah commented May 17, 2024

Introduction

Continue the work on the video API beyond the initial Video 4 Zephyr support added in #17194.

Problem description

I would like to implement UVC and enable more complex video pipelines to be written without rewriting the application from scratch every time the devicetree is updated.

Proposed change

Incrementally rework the devicetree bindings and the video driver API.

Detailed RFC

The existing API leaves some corners open for future specification.

The goal is:

  • Offer a way to fully access the video-related devicetree content from the application.
  • Connect video elements together in code as specified in the devicetree, without having to do it manually.
  • Define enough semantics to cover the most common use-cases completely.
  • Reuse ideas and code already widespread across other Zephyr drivers.

Proposed change (Detailed)


remote-endpoint is not used by Zephyr drivers. RTIO can be leveraged to replace the many FIFOs typically defined in each driver (1, 2, 3) and provide a way to connect drivers directly, while still allowing the application to be interleaved in the pipeline. This would act as a splice(2) for Zephyr video drivers, triggered by the devicetree `remote-endpoint` property.

[UPDATE: it might be possible to use RTIO on top of the current API instead of replacing it]
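To make the splice idea concrete, here is a minimal sketch of the "RTIO on top of the current API" variant. Everything named video_iodev_* is hypothetical glue, not an existing API; only video_enqueue() and the RTIO types are real:

#include <zephyr/device.h>
#include <zephyr/drivers/video.h>
#include <zephyr/rtio/rtio.h>

/* Hypothetical adapter exposing one video endpoint as an RTIO iodev,
 * forwarding submissions to the existing video_enqueue() FIFO.
 */
struct video_iodev_data {
	const struct device *vdev;
	enum video_endpoint_id ep;
};

static void video_iodev_submit(struct rtio_iodev_sqe *iodev_sqe)
{
	struct video_iodev_data *data = iodev_sqe->sqe.iodev->data;
	/* The video buffer travels in userdata; see the discussion on
	 * combining rtio_sqe and video_buffer further below.
	 */
	struct video_buffer *vbuf = iodev_sqe->sqe.userdata;
	int ret = video_enqueue(data->vdev, data->ep, vbuf);

	if (ret < 0) {
		rtio_iodev_sqe_err(iodev_sqe, ret);
		return;
	}
	/* A real driver would complete the SQE asynchronously once the
	 * frame is filled; completing inline keeps the sketch short.
	 */
	rtio_iodev_sqe_ok(iodev_sqe, 0);
}

static const struct rtio_iodev_api video_iodev_api = {
	.submit = video_iodev_submit,
};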


video_endpoint_id has unclear semantics in some cases (doc fix only) and can be adjusted so that positive numbers refer to an individual endpoint number or address.

[UPDATE: a documentation fix, mostly]


Following this, devicetree macros can be introduced to refer to a particular endpoint without hard-coding the endpoint number in the application.
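For illustration, such a macro could be assembled from existing devicetree.h helpers; the name and the exact traversal below are hypothetical, assuming the standard port/endpoint/remote-endpoint layout:

#include <zephyr/device.h>
#include <zephyr/devicetree.h>

/* Hypothetical helper: resolve the device at the other end of a link by
 * walking endpoint -> remote-endpoint -> port -> device node.
 */
#define VIDEO_DT_REMOTE_DEVICE_GET(ep_node_id) \
	DEVICE_DT_GET(DT_GPARENT(DT_PHANDLE(ep_node_id, remote_endpoint)))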


video_get_caps() fills a struct with all the video caps at once. This works well, but is less flexible than an enumerator-style API such as the one proposed in #72254, which allows drivers to be a bit more generic: a device doing software video processing can filter the format capabilities of its sources this way.
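A sketch of what the enumerator-style alternative might look like; video_enum_cap() is hypothetical (the exact shape of #72254 may differ), while struct video_format_cap and VIDEO_EP_OUT are the existing definitions:

#include <zephyr/drivers/video.h>
#include <zephyr/sys/printk.h>

/* Hypothetical enumerator: visit capabilities one at a time instead of
 * receiving the whole array, so a processing device can filter as it goes.
 */
int video_enum_cap(const struct device *dev, enum video_endpoint_id ep,
		   uint32_t index, struct video_format_cap *cap);

void list_caps(const struct device *dev)
{
	struct video_format_cap cap;

	for (uint32_t i = 0; video_enum_cap(dev, VIDEO_EP_OUT, i, &cap) == 0; i++) {
		printk("fourcc %08x: %ux%u to %ux%u\n",
		       (unsigned int)cap.pixelformat,
		       cap.width_min, cap.height_min,
		       cap.width_max, cap.height_max);
	}
}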


video_enqueue(): "an addition to the video API to enable streaming partial frames within one video buffer" is missing. A new API is being introduced as part of the Arducam Mega sensor support.


Introduce directories for the video drivers, as image sensors, MIPI/DMA controllers [...] are currently mixed under one same driver API. A sensor directory was the chosen approach and will be merged once the ongoing PRs are completed.


Add a drivers/video_*_skeleton.c in the same style as drivers/usb/udc/udc_skeleton.c that can speed up development and help with understanding the semantics.

[UPDATE: proposed as an emulated/fake driver instead]


The devicetree can be flattened to enable more uniform processing, such as DT macros that reduce boilerplate code and homogenize drivers.

[UPDATE: no API change needed]


VIDEO_BUF_FRAG: Add fragmentation support to allow partial/fragmented frames to be returned from video devices.
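A hedged sketch of what consuming fragmented frames could look like, assuming the uint32_t flags field proposed in #66994 and a hypothetical VIDEO_BUF_FRAG flag; process_fragment() is illustrative:

#include <zephyr/drivers/video.h>
#include <zephyr/kernel.h>
#include <zephyr/sys/util.h>

#define VIDEO_BUF_FRAG BIT(0)	/* hypothetical: more fragments follow */

extern void process_fragment(const uint8_t *data, uint32_t len);

void receive_frame(const struct device *dev)
{
	struct video_buffer *vbuf;
	bool last = false;

	while (!last && video_dequeue(dev, VIDEO_EP_OUT, &vbuf, K_FOREVER) == 0) {
		process_fragment(vbuf->buffer, vbuf->bytesused);
		/* the last fragment of a frame carries no FRAG flag */
		last = (vbuf->flags & VIDEO_BUF_FRAG) == 0;
		video_enqueue(dev, VIDEO_EP_OUT, vbuf); /* recycle the buffer */
	}
}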


video-interfaces.yaml: Introduce common devicetree bindings for all video devices.


<zephyr/drivers/video-controls.h>: Add flags to the video CIDs that select which variant of a control to query: its current value (the default), or its minimum/maximum/default value.
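One possible shape for this, with hypothetical flag names ORed into the CID passed to the existing video_get_ctrl(); VIDEO_CID_CAMERA_BRIGHTNESS is an existing control:

#include <zephyr/drivers/video.h>
#include <zephyr/drivers/video-controls.h>
#include <zephyr/sys/util.h>

/* Hypothetical variants selecting which value of a control to query. */
#define VIDEO_CTRL_GET_MIN BIT(29)
#define VIDEO_CTRL_GET_MAX BIT(30)
#define VIDEO_CTRL_GET_DEF BIT(31)

void query_brightness_range(const struct device *dev)
{
	int min_val, max_val;

	video_get_ctrl(dev, VIDEO_CID_CAMERA_BRIGHTNESS | VIDEO_CTRL_GET_MIN, &min_val);
	video_get_ctrl(dev, VIDEO_CID_CAMERA_BRIGHTNESS | VIDEO_CTRL_GET_MAX, &max_val);
}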


Dependencies

Annotated inline above.

Concerns and Unresolved Questions

  • A few modifications to existing drivers are needed to support all proposed changes.
  • Many of the "rework" proposals are a matter of deciding what is most comfortable to work with, and are a call for discussion more than fixed solution proposals.

Alternatives

Leaving it up to the application developer to resolve these issues is the current approach, and the demos and samples submitted in #71463 show that it effectively works end-to-end with the current API.

For the case of RTIO, it is also possible to introduce a "genderless" interconnect between sources and sinks (so that the application or another driver can use it) by replacing enqueue()/dequeue() with a public FIFO integrated directly into the API.

@josuah josuah added the RFC Request For Comments: want input from the community label May 17, 2024

josuah commented May 17, 2024

Extra material to investigate the rationale of the existing APIs, so that their semantics can be kept: https://static.linaro.org/connect/san19/presentations/san19-503.pdf


josuah commented May 18, 2024

Thanks to @loicpoulain I rediscovered GStreamer, and it is great!

The above is also the foundation for a zephyr-shell pipeline, which is not possible if the application must be rewritten every time. The difficulty, though, is that resolving <&phandles> from remote-endpoint at compile time will not allow them to be reconnected later from the zephyr-shell.

Maybe it is enough to have "gst-launch pipeline, but all devicetree-made" as a first step...


teburd commented May 20, 2024

Right, you could treat an rtio_iodev as an endpoint (input or output stream), which is what sensors do today. There are effectively two output streams for sensors: one which is polled, and one which is driven by the sensor itself with an event.

The sensing subsystem provides what amounts to a statically defined pipeline with fanout/fanin possibilities and builds on top of this.

I could imagine video input/output endpoints being similar, though perhaps needing some additional information about image encoding and frame size to describe the metadata.

Image sensors (video) are after all sensors :-)


josuah commented May 21, 2024

So let's give it a try! :) I will pick some hardware known to work first and move on from there.

To everyone working on a sample or a driver: I wish to avoid disturbing any ongoing effort, so in case I missed an issue, feel free to ping me about it. I can provide a patch to merge into the PR, or delay another PR so that the actual driver contribution gets merged first!


josuah commented May 24, 2024

Some insightful overview of how the libcamera and V4L2 APIs came to be:
https://archive.fosdem.org/2022/schedule/event/mobile_camera/
Key takeaways (from the slides):

  • An application might want to configure each point in the pipeline itself
  • A system should not hard-code use-cases
  • There will be vendor-specific controls

Prior work:

@josuah josuah added the priority: low Low impact/importance bug label Jul 22, 2024
@josuah josuah self-assigned this Jul 22, 2024

teburd commented Aug 12, 2024

@josuah any progress on doing video with RTIO? Any help needed? I have an interest in seeing this work


josuah commented Aug 12, 2024

@teburd many thanks!

Not direct progress, but still interesting insights:

  • USB Video Class with a slow CPU: context-switching seems to impact FPS.
  • There are independent parts to video.h: control (no change needed?), data (to integrate with RTIO).
  • RTIO could be built as a separate API acting as a replacement for enqueue()/dequeue()
  • RTIO could be built on top of enqueue()/dequeue() at the cost of extra FIFO -> RTIO -> FIFO traffic, but useful for migration

Where the most help is needed is probably in designing and discussing an API that gracefully fits the <video.h> model.
Then I'd be quite glad to try to convert the existing drivers. I started gathering boards for that (NXP's FRDM, Some ST's...).

@josuah josuah added the Meta A collection of features, enhancements or bugs label Aug 12, 2024

teburd commented Aug 12, 2024

@teburd many thanks!

Not direct progress, but still interesting insights:

* USB Video Class with a slow CPU: context-switching seems to impact FPS.

This is pretty common. If interrupts are involved, you can reduce the cost by using direct interrupts and only calling into the scheduler when you know things need to be rescheduled, e.g. on a completed transfer rather than on every little peripheral FIFO ding to do some refilling.
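A sketch of that pattern with Zephyr's direct ISRs; all camera_* names and the IRQ number are hypothetical:

#include <zephyr/irq.h>
#include <zephyr/kernel.h>

#define CAMERA_IRQN 25	/* hypothetical IRQ line */

extern bool camera_transfer_done(void);
extern void camera_refill_fifo(void);
extern void camera_complete_transfer(void);

ISR_DIRECT_DECLARE(camera_isr)
{
	if (!camera_transfer_done()) {
		camera_refill_fifo();	/* small FIFO "ding": stay out of the scheduler */
		return 0;		/* no reschedule */
	}
	camera_complete_transfer();	/* whole transfer done */
	return 1;			/* reschedule so the consumer thread runs */
}

void camera_irq_init(void)
{
	IRQ_DIRECT_CONNECT(CAMERA_IRQN, 0, camera_isr, 0);
	irq_enable(CAMERA_IRQN);
}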

* There are independent parts to `video.h`: control (no change needed?), data (to integrate with RTIO).

* RTIO could be built as a separate API acting as a replacement for enqueue()/dequeue()

Those APIs could be replaced with the read/write ops for RTIO. Ideally we'd then benefit from some common infrastructure around tracing/statistics/timeouts/etc.

* RTIO could be built top of enqueue()/dequeue() at the cost of extra FIFO -> RTIO -> FIFO traffic but useful for migration

It can be, but it would perhaps be an added cost instead of a benefit; I don't know enough about the video API to make a judgement on that. It's quite possible the API and implementations already do what's needed.

Where the most help is needed is probably in designing and discussing an API that gracefully fits the <video.h> model. Then I'd be quite glad to try to convert the existing drivers. I started gathering boards for that (NXP's FRDM, Some ST's...).

I don't have many boards that support video other than the imxrt1060 evk, which I do have the camera and screen for.


josuah commented Aug 13, 2024

on a completed transfer rather than on every little peripheral FIFO ding to do some refilling

Precious insight!

Those APIs could be replaced with the read/write ops for RTIO.

The API introduced by @loicpoulain looks similar to RTIO in practice:

  • video_enqueue() submits to a fifo_in (like a submission queue).
  • video_dequeue() pulls from a fifo_out (like a completion queue) with a timeout parameter.
  • video_set_signal() permits watching multiple video devices, like POSIX select(2) (see the sketch below).
  • Some drivers (1, 2, 3) have an extra work queue.

7 drivers do I/O in total.
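For reference, the select(2)-like path with the current API looks like this (plain usage of existing calls, no new API):

#include <zephyr/drivers/video.h>
#include <zephyr/kernel.h>

void wait_for_frame(const struct device *dev)
{
	struct video_buffer *vbuf;
	struct k_poll_signal sig;
	struct k_poll_event evt = K_POLL_EVENT_INITIALIZER(
		K_POLL_TYPE_SIGNAL, K_POLL_MODE_NOTIFY_ONLY, &sig);

	k_poll_signal_init(&sig);
	video_set_signal(dev, VIDEO_EP_OUT, &sig);	/* watch this device */
	k_poll(&evt, 1, K_FOREVER);			/* block like select(2) */
	video_dequeue(dev, VIDEO_EP_OUT, &vbuf, K_NO_WAIT);
}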


teburd commented Aug 13, 2024

Sounds like it solved a very similar problem.

Waiting for a completion could be done with a timeout; io_uring has a similar wait_cqe_timeout type of API call (https://man.archlinux.org/man/io_uring_wait_cqe_timeout.3.en) that would be easy to replicate, I'm sure.
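A rough approximation over today's RTIO primitives, since rtio_cqe_consume() is non-blocking; the helper name mirrors io_uring's and is hypothetical:

#include <zephyr/kernel.h>
#include <zephyr/rtio/rtio.h>

/* Hypothetical wait-with-timeout: poll for a completion until a deadline
 * expires. A real implementation would block on RTIO's completion
 * semaphore rather than sleeping.
 */
struct rtio_cqe *rtio_wait_cqe_timeout(struct rtio *r, k_timeout_t timeout)
{
	k_timepoint_t end = sys_timepoint_calc(timeout);
	struct rtio_cqe *cqe;

	while ((cqe = rtio_cqe_consume(r)) == NULL) {
		if (sys_timepoint_expired(end)) {
			return NULL;	/* timed out */
		}
		k_sleep(K_MSEC(1));	/* crude back-off, sketch only */
	}
	return cqe;
}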


josuah commented Aug 14, 2024

Another challenge is to combine the RTIO and video buffers together.

struct rtio_sqe {
	uint8_t op;			/**< Op code */
	uint8_t prio;			/**< Op priority */
	uint16_t flags;			/**< Op Flags */
	uint16_t iodev_flags;		/**< Op iodev flags */
	uint16_t _resv0;
	const struct rtio_iodev *iodev;	/**< Device to operate on */
	/**
	 * User provided data which is returned upon operation completion.
	 * Could be a pointer or integer.
	 *
	 * If unique identification of completions is desired this should be
	 * unique as well.
	 */
	void *userdata;
	union {
		/** OP_TX, OP_RX */
		struct {
			uint32_t buf_len;	/**< Length of buffer */
			uint8_t *buf;		/**< Buffer to use */
		};
		/* ... other op payloads elided ... */
	};
};

struct video_buffer {
	/** pointer to driver specific data. */
	void *driver_data;
	/** pointer to the start of the buffer. */
	uint8_t *buffer;
	/** size of the buffer in bytes. */
	uint32_t size;
	/** number of bytes occupied by the valid data in the buffer. */
	uint32_t bytesused;
	/** time reference in milliseconds at which the last data byte was
	 * actually received for input endpoints or to be consumed for output
	 * endpoints.
	 */
	uint32_t timestamp;
};

Also, uint32_t flags; and uint32_t bytesframe; are introduced in #66994.

Is it planned to just wrap the current video struct?

rtio_sqe.userdata = &video_buffer;

Or use RTIO instead of video_buffer, mapping the fields like this, maybe?

rtio_sqe.buf = video_buffer_buffer;
rtio_sqe.buf_len = video_buffer_size; /* if OP_RX */
rtio_sqe.buf_len = video_buffer_bytesused; /* if OP_TX */
rtio_sqe.userdata = video_buffer_driver_data;
(void *)video_buffer_bytesframe; /* does not fit */
(void *)video_buffer_timestamp; /* does not fit */

Or change the video buffer to only contain what does not fit into RTIO buffers, to avoid too much copying?

struct video_buffer_extra_info {
	uint32_t timestamp;
	uint32_t bytesframe;
} video_buffer;

rtio_sqe.userdata = &video_buffer;

Or modify RTIO to support the extra fields so that rtio_sqe can replace video_buffer completely?

If RTIO replaces struct video_buffer, then the buffer allocation would need to be handled too.

Tagging @loicpoulain who introduced the API and contributors/reviewers @ArduCAM @CharlesDias @danieldegrasse @decsny @epc-ake @erwango @ngphibang (alphabetical order) in case anyone is interested in seeing this happen.


teburd commented Aug 14, 2024

The way sensing deals with this is by encoding the extra info in the buffer itself and providing functions to get it back. Video could perhaps do something similar. It would require over-allocating the buffer itself to account for the metadata and not just the frame data then.

E.g. something like...

rtio_sqe_prep_read(video_input, buf, buf_len);
rtio_submit(r, 1);

struct video_metadata *metadata = video_buf_metadata(buf);
struct video_frame *frames = video_buf_frames(metadata, buf);
size_t frame_count = video_buf_frame_count(metadata, buf, buf_len);
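One way the hypothetical accessors above could be implemented: reserve a header at the start of the over-allocated buffer and let the frame data follow it (layout purely illustrative; raw bytes are returned here where a struct video_frame view would work the same way):

#include <stdint.h>
#include <zephyr/toolchain.h>

struct video_metadata {
	uint32_t timestamp;	/* e.g. the fields that do not fit in rtio_sqe */
	uint32_t bytesframe;
	uint32_t frame_count;
};

static inline struct video_metadata *video_buf_metadata(uint8_t *buf)
{
	return (struct video_metadata *)buf;
}

static inline uint8_t *video_buf_frames(struct video_metadata *meta, uint8_t *buf)
{
	ARG_UNUSED(meta);
	return buf + sizeof(struct video_metadata);
}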


XenuIsWatching commented Aug 16, 2024

I have worked quite a bit with cameras, to the point where we made our own "internal" API for cameras due to some shortcomings of the current implementation of the video API. It's good to see some work on this now to improve it.

Some APIs that I've noticed are lacking in the current implementation are ways of setting the physical link rate and type, such as CPHY or DPHY, along with their rate in sps for CPHY and bps for DPHY. This would be a good feature that could be implemented somehow.
Some cameras may be a bit 'advanced' and be able to stream out multiple frame types at once, such as an IR image along with an RGB image. I'm not sure how it would currently be possible to select which frame type to enable or to set a ctrl for. Maybe enum video_endpoint_id ep could be used for it?


josuah commented Aug 16, 2024

@XenuIsWatching

ways of setting the Physical Link Rate and Type such as CPHY or DPHY along with their rate in sps for CPHY and bps for DPHY

I think this PR addresses it, leveraging port { endpoint { link-properties-here; }; };

stream out multiple frame types out at once such as a IR image along with a RGB image

This PR would allow specifying endpoint numbers in addition to the generic VIDEO_EP_IN/OUT/ANY/NONE.
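A hypothetical multi-stream example once positive endpoint numbers are accepted next to VIDEO_EP_IN/OUT/ANY/NONE; the 0 = RGB / 1 = IR mapping is purely illustrative:

#include <zephyr/drivers/video.h>
#include <zephyr/kernel.h>

void dequeue_both_streams(const struct device *dev)
{
	struct video_buffer *rgb_buf, *ir_buf;

	video_stream_start(dev);
	video_dequeue(dev, 0, &rgb_buf, K_FOREVER);	/* RGB endpoint */
	video_dequeue(dev, 1, &ir_buf, K_FOREVER);	/* IR endpoint */
}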

I will comment on each PR...


teburd commented Aug 16, 2024

Sensing deals with multiplexed fifo buffers today and could be used as a model to follow perhaps. Or you could have multiple output streams, each one acting like an IO device (struct rtio_iodev) by moving to that API.


josuah commented Aug 16, 2024

@teburd

I did not think about placing the metadata in the buffer! I think all elements are in place and I will be able to test now.

It would require over allocating the buffer itself to account for the metadata and not just the frame data then.

This could be integrated into the video allocation functions then.

Sensing deals with multiplexed fifo buffers today and could be used as a model to follow perhaps.
Or you could have multiple output streams, each one acting like an IO device (struct rtio_iodev) by moving to that API.

Time for experimentation. Thank you.


teburd commented Aug 16, 2024

I have worked quite a bit with cameras, to the point where we made our own "internal" API for cameras due to some shortcomings of the current implementation of the video API. It's good to see some work on this now to improve it.

Some APIs that I've noticed are lacking in the current implementation are ways of setting the physical link rate and type, such as CPHY or DPHY, along with their rate in sps for CPHY and bps for DPHY. This would be a good feature that could be implemented somehow. Some cameras may be a bit 'advanced' and be able to stream out multiple frame types at once, such as an IR image along with an RGB image. I'm not sure how it would currently be possible to select which frame type to enable or to set a ctrl for. Maybe enum video_endpoint_id ep could be used for it?

Right, this is generally an issue in Zephyr today because we don't have a way of doing device-specific behaviors akin to ioctl. Sensing sort of deals with this with its attributes API, but there are still oddities with it: because each call to set an attribute is a partial reconfiguration of the device, and ordering can matter, there can be invalid configurations.

For example, many devices offer low-power modes which toggle the MEMS device on/off at the cost of noise. Usually these modes are limited in sampling rate. Sometimes the sample rates overlap, but frequently some sample rates only work in a low-noise (always-on) mode.

So now you have this quirk of ordering where you may wish to change both the power mode and the sample rate at the same time, but the API has no way of allowing this. Drivers then have to work out what a given sample rate setting implies.

There's also the dai interface, which fully gave up on trying to provide structured configuration and takes a void *config defined by each dai implementation.
