This year’s Linaro Connect took place in May in Madrid, Spain’s capital, and several members of The Good Penguin attended for the conference side of the gathering. The event is a little different from your usual open source conference, as Linaro Connect also serves as the annual face-to-face get-together for Linaro employees.
If you are like us, you might have first seen the name Linaro in one of your toolchains, very likely a GCC build for the ARM processor. Linaro was launched in 2010 partly in response to problems with merging patches into the Linux kernel. At the time most vendors basing their products on ARM IP CPUs had their own platform tree within the kernel, with a great deal of overlap and conflicting ideas about how and why things should be done, which led to Linus Torvalds, the creator of Linux, threatening to stop accepting patches for ARM-based platforms altogether. Linaro was initially a non-profit organisation created to clean up that codebase, which it did, and the fruits of that work had a positive effect on the use of ARM CPUs in Linux and thus in embedded devices around the world. Linaro has grown since then: its client base includes Google and Qualcomm, to name a few, and its expertise now extends well beyond the Linux kernel.
Regarding the event itself, here is a list of the talks that we (Pawel Zalewski and Philip Boucherat) found particularly interesting, together with a short summary of each.
Open Source Under Threat – Amanda Brock, OpenUK
OpenUK describes itself as an open technology industry body that advocates for the use of open technology (software, hardware, data, standards, open AI etc.) in the UK. Unlike many other organisations in the open source community it deals mainly with open source policy rather than the technology itself. The talk was delivered by Amanda Brock, CEO of OpenUK and a lawyer with more than two decades of experience focused on the legal side of open source. She is the author of Open Source Law, Policy and Practice (2nd edn).
The talk dealt with the general state of open source today, including economic and political pressures. In her own words, the most important aspect of FOSS is that it can be used by anyone, anywhere, for any purpose. The talk gave a brief history of open source and how far it has come, from the days when Microsoft explicitly called Linux a cancer to its full embrace of FOSS today, with FOSS now present in 96% of software stacks around the world. It was pointed out that nearly 5% of the UK population has a Git account, well done everyone! Some typical examples of the struggles of FOSS within the wider world economy were also analysed. One issue is that open source tends to disrupt the status quo in a given sector of the economy, so the natural reaction is for incumbent businesses to fight back. Another pattern seen in the industry is that companies initially built on open source tend to move towards proprietary licensing in order to make themselves more saleable.
Now that open source is one of the forces driving the world economy, it has attracted some unwanted attention, and worries have started to emerge about security and responsibility. The speaker outlined the issues coming our way due to the EU’s Cyber Resilience Act and the US government’s cybersecurity strategy: chiefly that system integrators, or any businesses or individuals that profit from open source software in any way, will become liable for the libraries they use, rather than the end user as is currently the case. The talk made the point that an unwanted side effect of the act might be FOSS simply not being used in the EU, as nobody will want to carry the liability. An interesting detail is that the very first distributor to make money out of a particular FOSS library will be the one held liable. The next step for the EU is to change the legal status of software via the Product Liability Directive, making it a product rather than a service. We recommend watching this talk in full.
So, you want to encrypt? – David Brown, Linaro
This was a short talk which was basically a reminder of the old advice not to invent your own encryption methods and your own protocols.
The typical requirements when sending a message are that the recipient can be sure who sent it, that the message has not been tampered with, and that a third party is not able to read the contents or even infer anything about them.
The main cryptographic tools available are:
- Symmetric cipher
- Hash function
- Key agreement
- Digital signature
David emphasised that for each of these tools there is already a huge range of implementations to choose from, many of which have been around for years and whose vulnerabilities are well known. Over time development moves on in an effort to improve security: new encryption methods are developed, as well as new secure protocols. But where these methods become standards, many people are able to debate their merits and possible vulnerabilities, and perhaps thousands of people on the internet will be trying to break them – so they are well tested.
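To make the “do not roll your own” advice concrete, here is a minimal Python sketch using the standard library’s hashlib and the widely reviewed cryptography package to cover three of the tools listed above; the message and the key handling are simplified placeholders, not a complete protocol.

```python
import os
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.asymmetric import ed25519

message = b"firmware v1.2.3"                 # placeholder payload

# Hash function: a fixed-size digest of the message
digest = hashlib.sha256(message).hexdigest()

# Symmetric cipher (AES-GCM): confidentiality plus integrity in one primitive
key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)                        # never reuse a nonce with the same key
ciphertext = AESGCM(key).encrypt(nonce, message, None)

# Digital signature (Ed25519): proves who sent the message
private_key = ed25519.Ed25519PrivateKey.generate()
signature = private_key.sign(message)
private_key.public_key().verify(signature, message)   # raises InvalidSignature if tampered with
```

Every primitive here is an existing, heavily scrutinised implementation; the hard part in practice is combining them into a protocol, which is exactly where standards such as those below come in.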
Even so, the example of TLS shows, just from the number of versions, that vulnerabilities are constantly being discovered even in a major protocol, so it has to keep evolving, to the extent that you really should not be using older versions of TLS. Further examples were hash functions like MD5, which is now deprecated, and the original Diffie-Hellman key exchange, where the same two parties would always end up with the same shared key for every session, which is not ideal. The point is that making something really secure is really hard, and you are unlikely to succeed on your own.
After a summary of the above tools, David went on to introduce a relatively new set of standards being developed by the IETF called CBOR Object Signing and Encryption (COSE, RFC 9052), which describes how to create and process signatures, message authentication codes and encryption using the Concise Binary Object Representation (CBOR, RFC 8949), which is a bit like a binary form of JSON designed for small code and message size. In addition, there is SUIT (Software Update for IoT, RFC 9019 and RFC 9124), which is currently mostly concerned with firmware updates for IoT devices and is built around COSE and CBOR.
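To give a feel for what CBOR looks like, here is a tiny sketch using the cbor2 Python package; the field names are made-up illustrations rather than part of any SUIT manifest, and the point is simply the size comparison with JSON.

```python
import json
import cbor2   # pip install cbor2

# A made-up metadata record, purely for illustration
record = {"device": "sensor-01", "fw_version": [1, 2, 3], "signed": True}

as_json = json.dumps(record).encode()
as_cbor = cbor2.dumps(record)

print(len(as_json), len(as_cbor))        # the CBOR encoding comes out noticeably smaller
print(cbor2.loads(as_cbor) == record)    # and round-trips losslessly -> True
```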
COSE is still at a fairly early stage and has so far mostly been used just for signing; the encryption aspects are still under development. SUIT, however, has developed further on the encryption front, and essentially provides profiles consisting of a set of cryptographic choices taken from COSE, i.e. which key derivation function, which symmetric cipher, and the parameters for all of these. As these profiles have been designed by cryptographic experts, using them greatly reduces the chance of making a mistake in your own choices and inadvertently introducing a vulnerability.
However, the conclusion is that people will always find vulnerabilities in any security method, and that security is really a best-effort endeavour. In the questions, David was asked whether being told not to invent security mechanisms yourself meant an end to innovation; the answer was definitely not, as there are, for instance, cryptographic competitions running all the time. But rather than trying to develop something on your own, do it in conjunction with working groups so that you get as much feedback as possible, with as many minds as possible trying to find weaknesses in your approach; you may even discover that what you are doing has already been done.
Pioneering Edge AI Platforms – Sergi Mansilla, Edge Impulse
Edge Impulse is a leading company in the area of embedded machine learning, known for a platform that enables developers to accelerate the delivery of their ML solutions to market. The talk was interesting as it described the typical pitfalls of trying to deliver embedded ML in production, some of which we have seen and experienced ourselves. Sergi Mansilla is VP of Engineering at Edge Impulse with two decades of experience.
Running pre-trained ML models for inference on energy-, compute- and storage-constrained devices, which is what we deal with in our daily jobs as embedded software engineers, can be challenging. AI compute is resource hungry in general: the raw data needs to be processed, features need to be extracted and then passed through several layers for the actual inference, often within a set deadline (a minimal inference sketch follows the list below). But there are benefits to making this effort rather than going for the usual cloud-based solution; as Sergi points out, the main arguments are:
- bandwidth and latency: do not have to wait for cloud or need a fast connection
- privacy: data stays on the device
- power: can stay operational longer as no constant use of wifi is required
- economics: save on cloud services
- reliability: can operate with low connectivity
- innovation: as it can be more flexible
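To make the on-device inference step concrete, here is a minimal sketch using the TensorFlow Lite Python interpreter. It is a desktop-side stand-in for the C++ TFLite Micro runtime discussed in the talk, and the model path and input contents are made-up assumptions.

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter   # or tf.lite.Interpreter from full TensorFlow

# "model.tflite" is a placeholder for a quantised model exported from training
interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Dummy input window; a real application would feed pre-processed sensor features
window = np.zeros(inp["shape"], dtype=inp["dtype"])
interpreter.set_tensor(inp["index"], window)
interpreter.invoke()

scores = interpreter.get_tensor(out["index"])   # class probabilities or regression output
print(scores)
```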
The talk described how embedded AI is already in most devices across a variety of sectors (industrial, health etc.) and the prediction is that this will keep on growing, with some CPUs already including an NPU sub-module to accelerate AI workloads. One of the interesting aspects of the embedded AI area is that usually no one is an expert in both fields at once: AI-knowledgeable companies do not know how to implement their products in hardware and do not understand the hardware constraints, while embedded-knowledgeable companies do not understand the art of training a model correctly and are unaware that it can require thousands of labelled examples. Such companies are often confused about AI and need to be educated that data is key; collecting, labelling and curating data is a whole pipeline that takes significant effort, expertise and many person-hours to deliver.
The presenter also described struggles with the open source stack being used, mainly due to poor support from Google for the TensorFlow Lite Micro (TF Micro) project, which is used as the inference backend on the device. The TF Micro GitHub project has stopped accepting PRs, new versions are not backwards compatible, and no release or stable branches are maintained, nor are there even tags. So it is more of an available codebase still owned by Google than a high quality FOSS repo. In response, Edge Impulse had to create their own fork of the project and verify that it still works and is stable.
We learnt that highly optimised C++ is king for embedded AI and that the “desktop land” Python models can be converted into C++ (usually with reduced bit widths for representing weights, to shrink the size). Edge Impulse have created their own EON compiler (based on the TFLite Micro compiler) to help optimise for RAM, which is the main constraint. Another trick to aid embedded AI is to take full advantage of the DSP, especially for computing the FFT spectrum of time-series signals (which is often one of the features, or the basis for other features deeper in the network). NPUs can be slower at such workloads, the difference being O(N^2) for an NPU versus O(N log N) for the fast FFT, so it is beneficial to preprocess the data without using the NPU. The other benefit of using the DSP is that a plain FFT does not need lots of space for storing weights, and space is at a premium in embedded systems; by comparison, an NPU-based 256-point FFT would require 65,536 weights. However, using the DSP is not a hard rule and sometimes the NPU can outperform it, so the performance can be hardware-dependent, something very typical in the embedded world. Thus one sometimes has to arrive at the best solution heuristically, which their framework can make less painful. The conclusion is that it is tricky to have a well trained and optimised model running on constrained hardware, as it needs to be optimised specifically for the platform being used, and the whole process calls for expertise in both areas.
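As an illustration of the kind of FFT-based preprocessing described above (not Edge Impulse’s actual DSP block), here is a small NumPy sketch that turns a window of time-series samples into a magnitude spectrum feature vector; the window length and sample rate are arbitrary assumptions.

```python
import numpy as np

def fft_features(window, sample_rate_hz=100):
    """Magnitude spectrum of one window of sensor samples, O(N log N) via the FFT."""
    window = window - np.mean(window)          # remove the DC offset
    spectrum = np.abs(np.fft.rfft(window))     # real-input FFT, N/2 + 1 bins
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate_hz)
    return freqs, spectrum

# 256 samples of a made-up 10 Hz vibration signal plus noise
t = np.arange(256) / 100.0
signal = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.randn(256)

freqs, spectrum = fft_features(signal)
print(freqs[np.argmax(spectrum)])              # peak near 10 Hz
```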
Extending libcamera to provide a fully open camera SoftISP stack – Bryan O’Donoghue, Linaro
Libcamera is an open source user space library and framework for the control of cameras and processing of camera data. Normally the camera data would be processed in a hardware ISP (Image Signal Processor), but there are also systems that do not contain an ISP or where the camera itself does not contain an ISP, and that is the subject of this talk.
Bryan started the talk with a brief explanation of the processing required to obtain usable camera data. The individual sensors in the camera sensor array are actually monochrome, so the array is overlaid with a mosaic of red, green and blue filters. Since the eye (or rather the cones in the eye) is most sensitive to light around the 550 nm wavelength, i.e. greenish-yellow colours, the pattern contains more green filters than red or blue ones, as there is no point capturing information the eye won’t really be able to see. The pattern in which the filters are laid out is known as a Bayer pattern, named after Bryce Bayer, who invented the method at Eastman Kodak in 1974.
The raw data coming from the camera is therefore called Bayer encoded data, and each pixel has just a single value. To get an RGB value for each pixel, various algorithms exist that use some kind of interpolation over the values and filter colours of adjacent pixels to estimate the RGB value of the current pixel. Some well known Bayer decoding algorithms are Nearest, Bilinear and Malvar-He-Cutler, the last being the best of those three but also the most computationally expensive.
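To illustrate the idea (and emphatically not libcamera’s actual implementation), here is a deliberately crude NumPy sketch that recovers a half-resolution RGB image from an RGGB mosaic by 2x2 binning; real SoftISP code would use one of the interpolating algorithms named above.

```python
import numpy as np

def debayer_rggb_2x2(raw):
    """Half-resolution debayer of an RGGB mosaic by simple 2x2 binning.

    raw: 2D array of single-valued Bayer pixels (even height and width).
    Returns an (H/2, W/2, 3) RGB image; a crude stand-in for the
    Nearest/Bilinear/Malvar-He-Cutler algorithms mentioned in the talk.
    """
    raw = raw.astype(np.float32)
    r = raw[0::2, 0::2]                              # red sites of each 2x2 quad
    g = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0    # average the two green sites
    b = raw[1::2, 1::2]                              # blue sites
    return np.dstack([r, g, b])
```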
In addition to debayering, further processing is required to deal with things like Autofocus (AF), Auto White Balance (AWB) and Auto Exposure/Gain – Bryan calls these the 3As. Typically a hardware ISP will contain blocks to accelerate all of this, plus additional manufacturer-specific processing to differentiate the hardware. Bryan showed us an interesting photo of a bride reflected in two mirrors that was in fact three different poses merged into one image by an Apple phone’s ISP (the so-called black mirror photo).
However, for reasons of cost, camera sensors that provide raw Bayer encoded data are proliferating in PCs and embedded equipment, as this eliminates the expense of an ISP in the sensor hardware. This is where libcamera comes in: the underlying hardware and drivers deliver raw Bayer encoded data to libcamera in userspace, where it can be processed in software, i.e. by a SoftISP within libcamera.
Bryan explained that the 3As have been implemented in libcamera, and for more complex applications such as video meetings, PipeWire integration has been added: Firefox, for instance, can interact with PipeWire cameras, so there is no need for Firefox to know about libcamera. Because of the complex pipeline management this requires, libcamera takes care of it, and it is no longer possible to directly open /dev/video0; apps have to go through libcamera. He then discussed future plans, which involve GPU acceleration through OpenGL, GPU acceleration through OpenCL/Vulkan, and various types of image quality enhancement in the SoftISP.
Bryan finished the talk with a demonstration of the software running on an ARM-based laptop. In typical demo fashion he couldn’t get the HDMI to work with the screen in the room, so the laptop was passed around. It showed the video from the camera going through the libcamera SoftISP, and the results were surprisingly good: a 5MP sensor taking only about 5% of CPU, with the data processed entirely in software at a very respectable-looking frame rate (which I forgot to ask about).
Implementing an OpenChain compliant policy and best practice at Linaro – Carlo Piana and Alberto Pianon, Array
Carlo Piana and Alberto Pianon, from the Italian IT law firm Array (an OpenChain partner), gave a talk outlining their compliance journey on the Eclipse Foundation’s Oniro platform (an operating system based on the Linux or Zephyr kernels), making this another talk on the theme of software liability in FOSS and its distribution. OpenChain is a Linux Foundation project that provides open source with consistent process management and publishes ISO standards for license compliance and security compliance in FOSS. The talk pointed out that using FOSS in a product that is distributed triggers obligations, as one has to comply with the OSS licenses. One is required to be aware of the software used in one’s products and the risks it may entail in terms of IP rights and cybersecurity regulations like the incoming EU CRA. Ideally one needs a process in place that ensures continuous compliance and can demonstrate that compliance to one’s customers or the authorities. One of the artefacts that can easily be delivered here is a Software Bill of Materials in a machine-readable format such as SPDX (so that it scales; you do not want to do it manually).
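As an example of how readily such an SBOM can be produced, recent Yocto Project releases can emit SPDX documents as part of a normal build simply by enabling the relevant class; a minimal local.conf sketch (assuming a reasonably recent release such as kirkstone or later) might look like this.

```
# local.conf: generate SPDX SBOM documents alongside the built images
INHERIT += "create-spdx"
SPDX_PRETTY = "1"   # human-readable JSON output, optional
```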
Alberto and Carlo also described how they approached the problem of Software Composition Analysis on the Oniro project, highlighting the fact that license metadata is often incomplete and unrefined and requires human review, which is why they used FOSSology for validation. This is all a lot of manual work, so they decided to reuse Debian’s package metadata, which is detailed, and developed an automated toolchain, Aliens4Friends, that does most of the work and can be integrated into Yocto CI pipelines, aiding the whole SCA process by simply making it part of the build. Moving forward, they have contributed to the Yocto Project itself to improve its handling of compliance tasks and added an Unpack Tracer API that can track each source file back to its upstream URL; a meta-bbtracer Yocto layer is also a work in progress. The speakers are currently working on being able to map binary files back to source files, the argument being that different source files might carry different licenses, and they also plan to extend Aliens4Friends to tackle the security aspects needed to comply with the CRA, in addition to providing a detailed SBOM.
From the talk we learnt that three years from now all software will need to be CRA compliant to be placed on the EU market, with a CE mark for Products with Digital Elements. An SBOM for OSS components will be mandatory, as will providing vulnerability fixes throughout the whole product lifecycle. This due diligence will apply not only to end products, but also to the components a product uses! Alberto and Carlo made the point that this is a lot of work, that one needs to start now to be ready in three years, and that one should follow the OpenChain ISO standards. Interesting times ahead…
Simple, Yocto, Secure Boot: Using new systemd features to pick all three! – Erik Schilling, Linaro
For me this was a particularly interesting talk as I was about to start a project involving encrypted user partitions on Nvidia devices. What this talk is really about is extending secure boot to the user partition. Typically secure boot gets as far as verifying the bootloader, kernel and device tree, but the user partition is not verified and needs additional consideration. Part of the reason is that data on the user partition is altered at runtime, so any original signature of it would no longer be valid.
Anyway, the essence of this talk is to use systemd-repart to automatically create an encrypted user data partition with a per-device private key.
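To give a flavour of how systemd-repart is driven, here is a minimal sketch of a repart.d drop-in that creates an encrypted data partition on first boot; the path, label, filesystem and the choice of TPM2-bound encryption are illustrative assumptions rather than the configuration Erik showed.

```
# /usr/lib/repart.d/50-data.conf (hypothetical example)
[Partition]
Type=linux-generic
Label=data
Format=ext4
# Generate the LUKS key on the device and bind it to the local TPM2
Encrypt=tpm2
```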
Erik outlined what he considers to be five of the problems faced in trying to turn this into a turnkey solution, and pointed out areas where there is still work to do; whilst a lot of the work is in a usable state, it still needs tweaks. The five problems he covered were:
- How to create a secure system partition – systemd-repart
- How to secure /usr – dm-verity
- How to mount the secure partitions – uapi group – discoverable partition specification, unified kernel images
- Which disk to boot – EFI variables defined by systemd boot loader interface
- Where to implement the boot logic – an initramfs that includes systemd is required, made easier with a Unified Kernel Image, which is a single EFI binary containing the kernel, initramfs, splash screen and device tree.
The question and answer session after the talk was particularly good, but it also made it clear that this is still very much a work in progress. A lot of the questions and discussion were to do with the details of how to deal with TPMs, which are not so relevant on Nvidia systems.
To sum up, these were the talks that we enjoyed the most. Madrid is an interesting city with an efficient metro system, and they do serve a bag of crisps or some olives with every beer; what is there not to love!