Introduce libMP (Media Pipe library) #98514
I would start with the name (both libMP & MediaPipe): not great, as some existing projects already use it: Also, shouldn't this be hosted as an external RTOS-agnostic library, like libmpix & LVGL? It would see wider adoption that way, in my opinion, and would have better APIs. |
|
|
||
| source "lib/min_heap/Kconfig" | ||
|
|
||
| source "lib/libmp/Kconfig" |
Can it be shortened to "mp" instead of "libmp"?
It's placed in the "lib" folder, so it's clear that it's a library.
Yes, we can. Apart from libc, which I think has the lib prefix for historical reasons, the others don't have it. Noted; we will change it once we have settled on the project name.
|
|
||
| static MpCaps *mp_caps_new_empty() | ||
| { | ||
| MpCaps *caps = k_malloc(sizeof(MpCaps)); |
Is it possible to avoid the dynamic memory allocation in the library?
For this place, it's possible because sizeof(MpCaps) is fixed. But caps also need to be set into an MpQuery and passed across functions, so maybe we can use a static array or a memslab (?), but then we need to specify a maximum number for it. This can only be done in specific places like this one, though, as we cannot avoid dynamic allocation in the whole library. For example, the items (structure, value) in caps are dynamic and known only at runtime, when querying the HW. Likewise, when creating elements, the element sizes are not known beforehand because the sizes of plugin elements vary.
One possible way to dodge k_malloc() is k_heap_alloc() with a heap local to libMP, sized through Kconfig.
Functions that need to return void could then possibly use K_FOREVER instead of K_NO_WAIT.
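A minimal sketch of the static-pool idea raised in this thread, written in portable C so that it compiles outside Zephyr. It models a fixed-block pool in the spirit of a memory slab. `struct caps`, `CAPS_POOL_MAX`, `caps_pool_alloc()` and `caps_pool_free()` are hypothetical names, not libMP or Zephyr APIs; in-tree, a `k_mem_slab` or a library-local `k_heap` would presumably take the place of this hand-rolled pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MpCaps: only its fixed size matters here. */
struct caps {
	int n_structures;
};

/* Upper bound that would come from Kconfig in a real integration (assumption). */
#define CAPS_POOL_MAX 4

static struct caps caps_pool[CAPS_POOL_MAX];
static bool caps_used[CAPS_POOL_MAX];

/* Return a free slot, or NULL when the pool is exhausted. */
struct caps *caps_pool_alloc(void)
{
	for (size_t i = 0; i < CAPS_POOL_MAX; i++) {
		if (!caps_used[i]) {
			caps_used[i] = true;
			caps_pool[i].n_structures = 0;
			return &caps_pool[i];
		}
	}
	return NULL;
}

/* Give a slot back to the pool; the pointer must come from caps_pool_alloc(). */
void caps_pool_free(struct caps *caps)
{
	ptrdiff_t idx = caps - caps_pool;

	assert(idx >= 0 && idx < CAPS_POOL_MAX);
	caps_used[idx] = false;
}
```

The sketch is not thread-safe and only covers the fixed-size case; the variable-size items mentioned above (caps contents, plugin elements) would still need a heap of some kind.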
Indeed, there are existing projects with the same name as libMP. In fact, we tried to change the name several times, but it is difficult to find an intuitive name that does not overlap with existing ones ... What about libMPL? I do not see it used elsewhere. Do you know if we also need to change the prefix (mp_) in the code when changing the project name?
I think it's a bit different compared to libmpix & LVGL. AFAIK, libmpix is mainly about math and algorithms (color / format conversions, etc.), which are rather OS-agnostic. LVGL has existed for about as long as Zephyr and has its own life-cycle outside Zephyr. Moreover, LVGL has its own ecosystem and does not interact much with the OS, except when it comes to touch (input) and HW accelerators such as PxP and VGLite. For these, LVGL calls directly into the low-level drivers and bypasses Zephyr subsystems and drivers. (About this, I don't know how it works when we want to use both LVGL and Zephyr code on the PxP in the same application: does it lead to conflicts, since the Zephyr path goes through the video subsystem and the PxP Zephyr driver to the PxP low-level driver, while the LVGL path goes directly to the PxP low-level driver? ...) So, to port LVGL to Zephyr, we mainly need an OSA and some glue code.

BTW, libMP uses (and will use) Zephyr mechanisms heavily, such as devicetree, iterable sections, work queues, RTIO, etc., to optimize the implementation. If we make an OS-agnostic version, we need to create something equivalent, or the implementation cannot be optimized. The media components (plugins, elements) in libMP interact deeply with the OS: they call directly into the (Zephyr) video, audio, display, vision, ... subsystems. So even in a generic libMP version, these components need to be created separately for each OS. And to support FreeRTOS, as an example, where there are no such subsystems, we would need to create all of them (a kind of HAL layer that reproduces the APIs somewhat like in Zephyr).

Another reason is that, as an external module, libMP would be required to have its own life-cycle outside the Zephyr Project, that is, reside in its own repository and have its own contribution, testing and maintenance workflow and release process. We would need to integrate it into Zephyr regularly (like LVGL) and review all code from contributors (who may come from many different domains: video, audio, display, vision, NPU, etc.), and we don't have enough resources to do that. Given that such a unified multimedia framework does not exist yet in Zephyr (there are some frameworks for audio such as Maestro, but when integrated into Zephyr it bypasses all Zephyr subsystems, so it is not a real integration), by making it inside Zephyr we expect much more contribution and help from the Zephyr community, and we benefit from the Zephyr infrastructure (the current code base is just an initiative).

So, IMHO, if we support FreeRTOS in the future, we could port it or maintain two versions, where the generic version may not be optimized and the Zephyr version may grow much faster and have its own development cycle. |
|
Also we need to understand if this is a Zephyr Subsystem or a Zephyr Library. |
It seems to me that it's a Zephyr library (?) |
|
For this message, I only look at the content of It seems like there is some RTOS abstraction layer, which needs to stay if this is not meant as Zephyr-first/only implementation:
And then a very small core on top of it, bringing the bulk of what a media subsystem would need to do in an RTOS.
So +1 to try to reduce the number of elements to integrate and abstraction layers:
|
|
Some "complex" or "multi-component" camera/video hardware is arriving to Zephyr:
Depending on the hardware, a different application has to be written (currently managed with a growing number of In that sense, libMP can also be seen as an essential part of video hardware integration as it enables writing the basic video samples without hundreds of lines of copy-pasta boilerplate. For instance, here is Zephyr implementation of libMP's zephyr/samples/subsys/usb/uvc/src/main.c Lines 46 to 53 in a8bf08b
This encourages adding a dependency from Zephyr video samples to libMP, whichever way it is integrated. |
|
Thanks for the comment.
In fact, these are not RTOS abstraction layer but the "components" that we built to use in libMP. But you are right, there are things that we could (change /)move to other places to lighten the library. The "bus" and "message" concept in libMP are much lighter than Zbus. Basically it's just a FIFO containing messages from the pipeline sent to the application (one way) so I think using Zbus is a bit overkill. Event in libMP is different from the generic event mechanism and event in Zephyr. As seen in the code, it's simply a structure that contains a pointer to a data structure. There is no mechanism for "listening" or "broadcasting" the event. Element sends an event to downstream or upstream by simply putting it in the function parameters, and the element can handle the event or propagate it but this is implemented inside the element itself. mp_event and mp_query are nearly the same and should be refactored (will do).
Actually, we use k_thread and just refactored it into functions to avoid duplicating code. Task will be extended further in the future.
mp_pixel_format is just an enum that unifies formats from different domains (video, display, vision, etc.) so that they can understand each other. So an enum is sufficient; I don't see why we need a FOURCC ... and there are some formats (in display) that don't have a FOURCC.
Yes, that's right. Instead of calling mp_init() in each application, libMP can be initialized with SYS_INIT(). I will do that. So, by this, it turns out that libMP should be a subsys rather than a lib.
Currently, the libMP buffer pool is just an array of buffer structures that map to the real data buffers coming from each subsystem: no FIFO and no handling mechanism are required (this is already done differently in each subsystem, e.g. the video subsystem already uses RTIO - ongoing work). Elements push buffers downstream one by one after processing them. So I am not sure RTIO can be used for this, although I thought of it. I will rethink this once we have finished switching the video subsystem to RTIO.
That's right. This can be taken out and upstream to sys/utils.
That's right too. These can be taken out from libMP. But currently I don't know where to put it in Zephyr.
Both are used for caps and query. Basically they are generic and can be used outside libMP but it's hard to find another usage than this one. |
|
Thank you for walking through these points. This helps to estimate the overlap with Zephyr features and to figure out how to reuse existing Zephyr code to lighten libMP, and where it is not useful/possible to do so. |
|
This could act as integration layer to all of these?
Maybe even sensors: combine temperature data with an audio recording of engine noise and send both to an NPU. |
|
Fixed part of the Compliance and SonarQube failures. |
It's more like an application layer, yes, whenever we can add a plugin / element for these, it will help.
It seems we need an
It seems we need kind of
For NPU, the problem is that we don't have a subsystem. So, to support NPUs in libMP, it seems we need to go through the low-level drivers and create custom elements for each vendor ...
We will need parser elements like: matroskademux, tsdemux, h265parse, etc.
I am not familiar with SOF, but it seems audio topology is also part of SOF, so I am not sure whether there is any overlap with libMP. |
I must say I am impressed to see GStreamer in such a "minimal" and "reduced" form.
I am trying to find places where it is possible to reuse existing Zephyr native constructs instead of those introduced, but it is difficult as the implementation of everything is very small, compact.
Here I try to spot all opportunities for reducing/simplifying things even further, and reduce memory footprint to hit smallest targets which would usually not be considered for "multimedia" (i.e. audio-only MCUs repurposed to also be video-capable).
Sometimes, I might be completely mistaken, and hopefully sometimes too it might be worth considering some (hopefully light) change. Furthermore, not all improvement opportunities will be worth the conversion effort.
Some high-level call-graph (libmp_graph.sh | dot -Tpng >libmp_graph.png):
| const char *v_cstring; | ||
| MpObject *v_obj; | ||
| void *v_ptr; |
- For v_cstring, this is a pointer to const, no need to allocate/free it.
- For v_ptr, this is externally managed memory, no need to allocate/free it.
- For v_obj, extra care is needed for allocating/freeing it, hence the idea of replacing this field by a dedicated MpValueObject.

By removing v_obj and introducing an MpValueObject that replaces MpObject, the whole memory model would be flattened, with no more nested pointers, and memory overhead would be reduced.
Is this worth considering?
For MP_TYPE_LIST it might not be feasible to turn it into a "flat" type as it is a complex and extensible type by nature, so developers would expect to have to free every element to begin with: no developer surprise = no footgun.
| /* Discard unmatched message */ | ||
| mp_message_destroy(message); |
If a message queue contains A, B, C, D, and someone requests a message C, this means A and B will never be handled?
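To make the question concrete, here is a small portable C sketch of a filtered pop that discards every unmatched message, as the reviewed snippet appears to do. The queue layout and the names `struct msg_queue`, `msg_queue_put()` and `msg_queue_pop_filtered()` are hypothetical and are not taken from libMP.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8

/* Hypothetical message queue: a ring buffer of message type identifiers. */
struct msg_queue {
	int items[QUEUE_LEN];
	size_t head;
	size_t count;
};

/* Append one message type; returns 0 on success, -1 when the queue is full. */
int msg_queue_put(struct msg_queue *q, int type)
{
	assert(q != NULL);
	if (q->count == QUEUE_LEN) {
		return -1;
	}
	q->items[(q->head + q->count) % QUEUE_LEN] = type;
	q->count++;
	return 0;
}

/*
 * Pop messages until one of the wanted type is found, dropping the others.
 * Returns the number of dropped messages, or -1 if no message matched.
 */
int msg_queue_pop_filtered(struct msg_queue *q, int wanted)
{
	int dropped = 0;

	assert(q != NULL);
	while (q->count > 0) {
		int type = q->items[q->head];

		q->head = (q->head + 1) % QUEUE_LEN;
		q->count--;
		if (type == wanted) {
			return dropped;
		}
		dropped++; /* unmatched message is discarded for good */
	}
	return -1;
}
```

With A, B, C, D queued, asking for C drops A and B and leaves only D, so a later request for A can never succeed. An alternative design would skip over unmatched messages and leave them queued for other consumers.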
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_UPSTREAM) { | ||
| pad_list = &element->sinkpads; | ||
| } | ||
|
|
||
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_DOWNSTREAM) { | ||
| pad_list = &element->srcpads; | ||
| } | ||
|
|
||
| SYS_DLIST_FOR_EACH_CONTAINER(pad_list, obj, node) { | ||
| ret |= mp_pad_send_event(MP_PAD(obj), event); | ||
| } |
It looks like it is possible, at least at the flag level, to send an event in both directions:
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_UPSTREAM) { | |
| pad_list = &element->sinkpads; | |
| } | |
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_DOWNSTREAM) { | |
| pad_list = &element->srcpads; | |
| } | |
| SYS_DLIST_FOR_EACH_CONTAINER(pad_list, obj, node) { | |
| ret |= mp_pad_send_event(MP_PAD(obj), event); | |
| } | |
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_UPSTREAM) { | |
| SYS_DLIST_FOR_EACH_CONTAINER(&element->sinkpads, obj, node) { | |
| ret |= mp_pad_send_event(MP_PAD(obj), event); | |
| } | |
| } | |
| if (MP_EVENT_DIRECTION(event) & MP_EVENT_DIRECTION_DOWNSTREAM) { | |
| SYS_DLIST_FOR_EACH_CONTAINER(&element->srcpads, obj, node) { | |
| ret |= mp_pad_send_event(MP_PAD(obj), event); | |
| } | |
| } |
Or maybe this is not expected to happen?
| } | ||
| } | ||
|
|
||
| return ret ? MP_BUS_DROP : MP_BUS_PASS; |
If this library is meant to be used with Zephyr only, then it could become standard -ERRNO error codes.
However, that also gets in your way, as ret |= ... would not be possible anymore (unless return ret == 0 ? 0 : -ESOMETHING is used).
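One way to keep the "call everything, report failure at the end" pattern with negative errno values is to remember the first error instead of OR-ing results together. The sketch below assumes the per-pad return codes have already been collected into an array; `first_error()` is a hypothetical helper, not a libMP or Zephyr function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Scan the return codes of several calls (0 on success, negative errno on
 * failure) and return the first failure, or 0 if every call succeeded.
 */
int first_error(const int *results, size_t n)
{
	int ret = 0;

	assert(results != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (results[i] < 0 && ret == 0) {
			ret = results[i]; /* remember the first failure only */
		}
	}
	return ret;
}
```

A bitwise OR of two different negative errno values generally produces a value that matches neither of them, which is why `ret |= ...` only works for boolean-style results. In a loop over pads, the same `if (err < 0 && ret == 0)` test can be applied directly to each call's return value.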
| return; | ||
| } | ||
|
|
||
| field = k_malloc(sizeof(MpStructureField)); |
Possibly some missing if (field == NULL)... which requires making the function return int instead of void and handle the error everywhere.
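A minimal sketch of what the suggested change could look like, using plain `malloc()` as a stand-in for `k_malloc()` so that it builds outside Zephyr. `struct field`, `struct structure`, `structure_add_field()` and `structure_clear()` are simplified, hypothetical stand-ins for the libMP types and functions under review.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for MpStructureField and MpStructure. */
struct field {
	int id;
	struct field *next;
};

struct structure {
	struct field *fields;
};

/* Prepend a field; returns 0 on success or -ENOMEM if allocation fails. */
int structure_add_field(struct structure *s, int id)
{
	struct field *field;

	assert(s != NULL);
	field = malloc(sizeof(*field)); /* k_malloc() in the reviewed code */
	if (field == NULL) {
		return -ENOMEM;
	}
	field->id = id;
	field->next = s->fields;
	s->fields = field;
	return 0;
}

/* Free every field and leave the structure empty. */
void structure_clear(struct structure *s)
{
	assert(s != NULL);
	while (s->fields != NULL) {
		struct field *next = s->fields->next;

		free(s->fields);
		s->fields = next;
	}
}
```

Callers then have to check the return value and propagate it, which is the "handle the error everywhere" cost mentioned in the comment.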
| typedef struct { | ||
| MpValue base; | ||
| union { | ||
| int32_t v_int; | ||
| uint32_t v_uint; | ||
| } num, denom; | ||
| } MpValueFraction; | ||
|
|
||
| typedef struct { | ||
| MpValue base; | ||
| MpValueFraction min, max, step; | ||
| } MpValueFractionRange; |
Fractions give exact representation for framerates but introduce other issues (#85697) and the solution proposed was to just use floats/doubles.
The alternative is to use an arbitrarily small enough unit like nanosecond/microseconds, and always compare using "close enough" or "closest" rather than "exact match".
However, you explored this problem much more in depth than me, and I might have missed important facts.
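As an illustration of the "small unit plus closest match" alternative, the sketch below converts a frame rate fraction to a frame interval in nanoseconds and picks the nearest supported interval. `frame_interval_ns()` and `closest_interval()` are hypothetical helpers written for this comment; they are not part of libMP or the Zephyr video API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC_U64 UINT64_C(1000000000)

/* Frame interval in nanoseconds for a rate of num/denom frames per second. */
uint64_t frame_interval_ns(uint32_t num, uint32_t denom)
{
	assert(num != 0);
	return (NSEC_PER_SEC_U64 * denom) / num;
}

/* Index of the supported interval closest to the requested one. */
size_t closest_interval(const uint64_t *supported, size_t n, uint64_t wanted)
{
	size_t best = 0;
	uint64_t best_diff = UINT64_MAX;

	assert(supported != NULL && n > 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t diff = supported[i] > wanted ? supported[i] - wanted
						      : wanted - supported[i];

		if (diff < best_diff) {
			best_diff = diff;
			best = i;
		}
	}
	return best;
}
```

With this approach a request for 30000/1001 fps resolves to a 30 fps mode when that is the nearest one available, at the cost of losing the exact rational representation that the fraction types preserve.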
|
Cc @hfruchet-st |
| SYS_DLIST_FOR_EACH_CONTAINER(otherpad_list, obj, node) { | ||
| ret |= mp_pad_send_event(MP_PAD(obj), event); | ||
| } |
It seems so convenient to use boolean for error checking, as then all the "forbidden" constructs become possible:
if (!function_that_fails()) {
	return false;
}

ret |= function_that_fails();

It diverges a bit from the Zephyr -errno style. Is there any plan to move this to the somewhat more Zephyr-native style if it is imported directly?
|
Updates:
|
Sorry that I replied to this without a real search. It seems that NPU plugins / elements will be built on top of an existing NN framework like TFLite Micro, GLOW, etc., so there is no need for libMP to deal with vendor-specific details. And TFLite Micro is already working under Zephyr : |
| add_subdirectory(os) | ||
| add_subdirectory(utils) | ||
| add_subdirectory_ifdef(CONFIG_SMF smf) | ||
| add_subdirectory_ifdef(CONFIG_MIN_HEAP min_heap) |
The commit message should at least give a brief overview of libMP, similar to the PR description. Keep in mind that once the PR is merged, the PR description will not be very useful: people will still look at git log.
| * | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
|
|
A short description at the beginning of the file saying what bin is would be useful.
| * constitute a partial frame. | ||
| */ | ||
| uint16_t line_offset; | ||
| } MpBuffer; |
Inconsistent style: line_offset vs. bytesused. I would use bytes_used.
| /** Size of each buffer in bytes */ | ||
| size_t size; | ||
| } MpBufferPoolConfig; | ||
|
|
I would honestly avoid mixing CamelCase with snake_case. Also, I would avoid opaque data types.
So this should become struct mp_buffer_pool_config, unless there are strong reasons not to do so. This will be more Zephyr-like.
|
|
||
| bool mp_bus_post(MpBus *bus, MpMessage *message) | ||
| { | ||
| MpBusSyncReply reply = MP_BUS_PASS; |
Here you can see a mix of CamelCase and snake_case, which I find all over your implementation.
This should be:
bool mp_bus_post(struct mp_bus *bus, struct mp_message *message)
and typedef should be used only for complex types like function types.
Introduce libMP (MediaPipe library), a gstreamer-like multimedia framework for Zephyr. Signed-off-by: Phi Bang Nguyen <[email protected]> Signed-off-by: Trung Hieu Le <[email protected]>
Add plugin for video which includes source and transform elements. Signed-off-by: Phi Bang Nguyen <[email protected]>
Add plugin for display which includes a display sink element. Signed-off-by: Phi Bang Nguyen <[email protected]>
Add plugin for audio which includes source, sink and a gain transform elements. Signed-off-by: Michal Chvatal <[email protected]>
Add video examples for libMP which includes two pipelines: - camera source and display sink - camera source, video transform and display sink Signed-off-by: Phi Bang Nguyen <[email protected]> Signed-off-by: Trung Hieu Le <[email protected]>
Add example for audio with a pipeline consists of a dmic source, a gain transform and a i2s sink element. Signed-off-by: Michal Chvatal <[email protected]>
|
Updates:
|
|



In Zephyr today, multimedia applications—such as those involving video, audio, display, vision, and graphics—are typically implemented as simple, domain-specific sample applications. While these are sufficient for basic use cases, they quickly become inadequate when dealing with:
In such cases, application complexity increases significantly. Developers must manually manage buffer allocation, queuing and dequeuing for each component as well as synchronization between components across the pipeline. Moreover, similar functionality often needs to be reimplemented across projects, leading to duplicated effort. Applications also tend to require extensive customization for each use case and become fragile to even minor changes in requirements.
To address these challenges, this PR introduces libMP (MediaPipe library)—a lightweight multimedia framework designed specifically for Zephyr.
This PR depends on the following PRs:
libMP aims to simplify the development of multimedia applications by:
It also streamlines the development of multimedia components (plugins) by:
• Offering a consistent, well-defined framework for plugin developers.
• Enabling reuse across different multimedia components.
libMP reuses many concepts from GStreamer—such as elements, pads, caps negotiation, and buffer negotiation—and adopts a pipeline-based architecture that decomposes multimedia processing into discrete, interconnected elements.
Applications simply select the built-in elements suited to their purpose to construct a pipeline, and it just works. This design promotes modularity, reusability, and efficient resource management (e.g., zero-copy data flow), which are critical for resource-constrained embedded systems.
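The following portable C sketch illustrates the general idea of chaining elements and passing a buffer descriptor by pointer, so that the payload is not copied between stages. It is a deliberately simplified model: pads, caps negotiation and threading are left out, and `struct buffer`, `struct element`, `struct gain_element`, `gain_element_init()` and `pipeline_push()` are hypothetical names rather than the libMP API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor: points at data owned elsewhere. */
struct buffer {
	uint8_t *data;
	size_t size;
};

/* Base element: processes a buffer in place, then hands it downstream. */
struct element {
	void (*process)(struct element *self, struct buffer *buf);
	struct element *next;
};

/* "Derived" element: embeds the base as its first member. */
struct gain_element {
	struct element base;
	uint8_t gain;
};

static void gain_process(struct element *self, struct buffer *buf)
{
	/* Valid because base is the first member of struct gain_element. */
	struct gain_element *g = (struct gain_element *)self;

	for (size_t i = 0; i < buf->size; i++) {
		buf->data[i] = (uint8_t)(buf->data[i] * g->gain);
	}
}

/* Set up a gain element that is not yet linked to anything. */
void gain_element_init(struct gain_element *g, uint8_t gain)
{
	assert(g != NULL);
	g->base.process = gain_process;
	g->base.next = NULL;
	g->gain = gain;
}

/* Push one buffer through the chain; only the descriptor pointer travels. */
void pipeline_push(struct element *first, struct buffer *buf)
{
	assert(buf != NULL);
	for (struct element *e = first; e != NULL; e = e->next) {
		e->process(e, buf);
	}
}
```

Linking two gain elements with `a.base.next = &b.base` and calling `pipeline_push(&a.base, &buf)` applies both gains to the same memory. New element types can be added by embedding `struct element` in the same way, without touching `pipeline_push()`.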
libMP features a highly modular, inheritance-based architecture inspired by GStreamer, ensuring modularity, scalability, and maintainability. For example, new custom elements can be easily added via plugins by extending existing elements—without requiring modifications to the core components. Additionally, plugins are selectively built by enabling their corresponding Kconfig options, helping to minimize memory footprint. Key design highlights include:
Currently, libMP is provided with proof-of-concept (PoC) examples for both video and audio pipelines:
Additional plugins and example pipelines can be added in the future. Among them, the prioritized TODOs are: