This is the draft text – Work in progress. It will be synchronized with the document format occasionally.
The Graphics Sharing and Distributed HMI project was set up to explore technologies for complex graphics interaction in distributed and consolidated automotive systems. Here, Graphics Sharing refers to “Graphics in the form of bitmap, scene graph or drawing commands generated on one ECU, transferred for display from another ECU (or between virtual machine instances)”. Distributed HMI compositing technologies refer to “methods to turn a multi-ECU system into what appears and acts as a single user-experience”.
The underlying challenge is producing a consistent distributed HMI experience across distributed and diverse multi-ECU systems:
- Top-level compositing across domains for the same physical display, with or without mixed safety levels
- Graphics on virtualized / consolidated systems.
- Diverse operating systems and HMI technologies
- Distributed HMIs that appear unified
- Graphics transfer encoding/technology and composition (Wayland/Waltham, etc.)
Technologies to address the following graphical use cases are discussed:
- Multiple ECUs or operating systems sharing a single display.
- Multiple operating systems sharing a single GPU.
- Multiple ECUs or operating systems working together to create a UI in unison.
- One ECU or operating system providing graphical content to multiple ECUs or operating systems.
Five primary solution categories for distributing graphics and HMI between cooperating systems have been identified in the project, and each is exemplified and studied for its relative strengths and weaknesses.
This paper provides an overview of these technologies and of the use cases to which they can be applied. The technologies are compared, and guidelines are provided to help choose the right one for a given use case. The project ultimately aims to produce guidelines and shared standards, described using specifications and code with open licenses. We hope to confirm implementation of open specifications among multiple platform and graphics technology vendors. The technologies discussed here are in practical use, and case studies are highlighted to prove the concepts. They include real-world demos of open source technologies and, for illustration purposes, a few proprietary implementations. Using these technologies, it is possible to create a complex automotive multi-ECU HMI which appears and acts as a single ECU/system HMI.
Graphics Sharing Categories
Complex automotive HMIs already have a long history, during which many different technologies and systems were created and integrated. New driver assistance systems, new infotainment features and handheld devices such as mobile phones continue to appear in the automotive environment and need to be integrated into one system to provide a harmonious experience for the driver and passengers. Along with other requirements, this integration needs technologies that allow different devices and systems to communicate, share state and exchange graphical content. This is a challenge, especially because some of the devices were not originally developed for use in a car.
To master this challenge today, several solutions, communication protocols and frameworks have been created. Some are proprietary and closed source; others are open source and also used in non-IVI environments. To handle the challenge in the future, standardization and robust implementation of sharing technologies are required, and categorizing the existing technologies is a good step in this direction.
It may sometimes be difficult to assign a concrete sharing technology to a single category, because implementation details can make it fit several categories at once, but this is not a concern of this white paper. First and foremost, the categories should provide an abstraction of complex technologies, so that a system can be designed without a detailed understanding of the concrete implementation.
The following five categories are suggested:
- API Remoting
- Surface Sharing
  - Sub-category: "Virtual Display" – Full display transfer by encoding as a video stream.
- Shared state, independent rendering
- GPU Sharing
- Display Sharing
They are described and compared with each other in the following chapters.
API Remoting involves taking an existing graphics API and making it accessible through a network protocol, thus enabling one system to effectively call the graphics/drawing API of another system. In some cases an existing programming API such as that of a local graphics library (e.g. OpenGL), is simply extended to provide network-capability turning it into an RPC (Remote Procedure Call) interface. In other cases, new custom programming interfaces might be created that are specially tailored for distributed graphics.
Use cases :
- A device with no graphical capability wants to show graphical content on a display connected to a remote server.
- In a hypervised system, an operating system with no access to GPU wants to render graphical content.
Design considerations :
- Depending on the available bandwidth, HMI/animation performance can vary.
- Smoothness of animation and UI response additionally depends on network latency.
- If the API remoting libraries need an update, the update must be rolled out to all relevant nodes in the network.
- If a standard API (e.g. OpenGL) is converted to RPC, application compatibility is maintained; otherwise applications must be changed to use the custom APIs.
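As a minimal sketch of the API remoting idea, the Python snippet below turns local drawing calls into serialised messages that a rendering node executes. The class names, the two drawing calls and the JSON message layout are all hypothetical and chosen only for illustration; a real system would use an actual graphics API and a network transport instead of a direct function call.

```python
import json


class RemoteCanvas:
    """Client-side stub: every drawing call becomes a serialised message."""

    def __init__(self, send):
        # 'send' delivers bytes to the rendering node (a socket in practice).
        self._send = send

    def __getattr__(self, name):
        # Called only for attributes that do not exist locally, so any
        # drawing method name is forwarded as a remote call.
        def call(*args):
            message = {"call": name, "args": list(args)}
            self._send(json.dumps(message).encode("utf-8"))
        return call


class Renderer:
    """Rendering node: owns the (simulated) display and executes calls."""

    ALLOWED = ("clear", "draw_rect")

    def __init__(self):
        self.executed = []

    def clear(self, r, g, b):
        self.executed.append(("clear", r, g, b))

    def draw_rect(self, x, y, w, h):
        self.executed.append(("draw_rect", x, y, w, h))

    def dispatch(self, payload):
        message = json.loads(payload.decode("utf-8"))
        if message["call"] not in self.ALLOWED:
            raise ValueError("unsupported remote call: " + message["call"])
        getattr(self, message["call"])(*message["args"])


renderer = Renderer()
canvas = RemoteCanvas(renderer.dispatch)  # direct call stands in for the network
canvas.clear(0, 0, 0)
canvas.draw_rect(10, 20, 100, 50)
```

Because every call crosses the link, both the number of calls per frame and the link latency show up directly in UI responsiveness, which is the concern raised in the design considerations above.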
RAMSES is a framework for defining, storing and distributing 3D scenes for HMIs. From a user’s perspective, RAMSES looks like a thin C++ scene abstraction on top of OpenGL. RAMSES supports higher level relationships between objects such as a hierarchical scene graph, content grouping, off-screen rendering, animations, text, and compositing. All of those features follow the principle of API remoting, i.e. the commands (that define the state in the RAMSES case) can be sent to another system and the actual rendering is executed at the receiving side.
RAMSES distinguishes between content creation and content distribution; the same content (created with RAMSES) can be rendered locally, on one or more network nodes, or everywhere. RAMSES handles the distribution of the scenes and offers an interface to control the final composition: which scene is sent where, when it is shown, and how multiple scenes and/or videos interact. The control itself is subject to application logic (HMI framework, compositor, smartphone app etc.).
TODO: add link to video
RAMSES is a low-level framework, closely aligned to OpenGL. Migrating from an existing OpenGL application to RAMSES usually involves providing a software wrapper which creates RAMSES objects (shaders, geometry, rendering passes, transformation nodes, etc.) instead of sending OpenGL commands directly to the GPU. The main difference (and thus the cause of migration effort) is that OpenGL sends a command stream per rendered frame, whereas RAMSES requires a scene definition which is updated only on change. For example, with OpenGL a static image has to be re-rendered in every frame, starting with glClear(), setting all the necessary states and commands, and finishing with eglSwapBuffers(). With RAMSES, the scene creation code looks similar to an OpenGL command stream, but once the scene has been defined, its states and properties do not have to be re-applied every frame; only changes to it are sent. Migrating from OpenGL to RAMSES is comparable to migrating from an OpenGL-centric rendering engine to a scene-graph-centric rendering engine. Depending on the application, this effort may vary.
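The difference between a per-frame command stream and a scene that is updated only on change can be illustrated with a small Python sketch. It does not use the RAMSES API; the scene is modelled as a plain dictionary of properties (an assumption made for illustration), and removed properties are not handled.

```python
_MISSING = object()  # sentinel: property has never been sent


def scene_delta(previous, current):
    """Return only the properties that differ from the previously sent state."""
    return {key: value for key, value in current.items()
            if previous.get(key, _MISSING) != value}


def simulate(frames):
    """Count transmitted items: per-frame command stream vs. scene updates."""
    stream_items = 0
    scene_items = 0
    sent_state = {}
    for scene in frames:
        stream_items += len(scene)       # every state is re-applied each frame
        delta = scene_delta(sent_state, scene)
        scene_items += len(delta)        # only changes are transmitted
        sent_state.update(delta)
    return stream_items, scene_items


# A static image shown for 60 frames: three properties that never change.
static_frames = [{"image": "logo.png", "x": 0, "y": 0} for _ in range(60)]
print(simulate(static_frames))  # prints "(180, 3)"
```

For a static image the scene-based approach transmits the three properties once, while the command-stream approach repeats them in each of the 60 frames.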
Surface sharing distributes already rendered graphical content, representing the intended graphics from the application. A surface is represented as a two-dimensional image in memory, which can be described with width, height, pixel format and some additional meta-data. Along with the image data, other information, e.g. touch events, can be shared but in terms of size, image data would have by far the biggest share. Therefore, sharing of image data should be the driving point for optimization during the definition and implementation of the sharing mechanisms.
When possible, shared memory between systems should be used. On distributed systems without access to common memory all data needs to be shared via network. To reduce the bandwidth usage, video encoding and decoding hardware can be used with reasonable performance.
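To show why image data dominates, the short Python calculation below estimates the bandwidth needed to send a surface uncompressed. The resolution, pixel size and frame rate are assumed example values, not figures from the project.

```python
def raw_surface_bandwidth_bps(width, height, bytes_per_pixel, frames_per_second):
    """Bits per second needed to send every frame of a surface uncompressed."""
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * frames_per_second * 8


# Assumed example: a 1920x1080 surface, 4 bytes per pixel, 60 frames per second.
full_hd = raw_surface_bandwidth_bps(1920, 1080, 4, 60)
print(f"{full_hd / 1e9:.2f} Gbit/s")  # prints "3.98 Gbit/s"
```

Under these assumptions a single uncompressed surface needs close to 4 Gbit/s, which is why shared memory or hardware video encoding is attractive.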
Use cases :
- A navigation surface rendered by the infotainment unit needs to be shown on the instrument cluster.
- In a hypervised environment, the operating system controlling the display wants to show a surface from another operating system.
Surface sharing requires a communication protocol to request or announce newly available graphical content in the system, forward touch events and control the sharing in general. This results in modification of standard graphical applications. Virtual Display and Waltham, described below, avoid such modification.
Wayland provides a set of libraries designed to abstract the messaging between client and server. Wayland protocols are the high-level definitions of the messages exchanged between client and server. The server that uses Wayland to listen to clients is a Wayland compositor, and the clients are graphical applications. Clients send the surfaces they want shown to the Wayland compositor, which controls the display and shows the client surfaces on it.
Waltham is similar to Wayland but can communicate over TCP/IP, i.e. it is designed to work with distributed systems. Wayland lacks this capability. One reason is that Wayland uses file descriptors to share client surfaces with the server, and file descriptors cannot be passed to another machine.
Design considerations :
- In distributed systems, a large amount of bandwidth is consumed transporting pixel data. Pixel data can be encoded to save bandwidth, but lossy encoding reduces image quality.
- Depending on the available bandwidth, HMI/animation performance can vary.
- Smoothness of animation and UI response additionally depends on network latency.
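The surface description used in this chapter (width, height, pixel format and image data) can be made concrete with a small framing sketch in Python. The header layout, the magic value and the format codes are hypothetical; this is not the Waltham or Wayland wire format.

```python
import struct

# Hypothetical header: magic, width, height, pixel format code, payload size.
# "!" selects network byte order without padding, so the header is 14 bytes.
HEADER = struct.Struct("!4sHHHI")
MAGIC = b"SURF"
FORMATS = {"ARGB8888": 1, "RGB565": 2}
BYTES_PER_PIXEL = {1: 4, 2: 2}


def pack_surface(width, height, pixel_format, pixels):
    """Prefix raw pixel data with the meta-data the receiver needs."""
    code = FORMATS[pixel_format]
    expected = width * height * BYTES_PER_PIXEL[code]
    if len(pixels) != expected:
        raise ValueError("pixel data does not match the surface dimensions")
    return HEADER.pack(MAGIC, width, height, code, len(pixels)) + pixels


def unpack_surface(message):
    """Recover (width, height, pixel format name, pixels) from a message."""
    magic, width, height, code, size = HEADER.unpack_from(message)
    if magic != MAGIC:
        raise ValueError("not a surface message")
    pixels = message[HEADER.size:HEADER.size + size]
    if len(pixels) != size:
        raise ValueError("truncated surface message")
    name = next(n for n, c in FORMATS.items() if c == code)
    return width, height, name, pixels


message = pack_surface(2, 2, "ARGB8888", bytes(16))
print(len(message))  # prints "30": 14 header bytes plus 16 bytes of pixels
```

Even for this tiny 2x2 surface the pixel payload is larger than the meta-data, and the ratio grows with the surface area.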
Virtual Display describes a concept which can be used to realize distributed HMI use cases. An important characteristic of the Virtual Display concept is that the entire display content is transferred, instead of content from dedicated applications. The system which provides the content should be presented with a display which acts like a physical display but is not necessarily linked to one, so that the middleware and applications can use it as usual. Such a display can be called a Virtual Display. The implementation of the Virtual Display on the producer system should be generic enough to look like a normal display and should take care of transferring the display content to another HMI unit or another system.
The following diagram illustrates a full solution using the virtual display concept and surface sharing, with communication protocols defined using the Waltham libraries.
Weston is the reference implementation of a Wayland compositor. Weston provides an example of a Virtual Display implementation. Weston can be configured to create a Virtual Display which is not linked to any physical display. Properties like resolution, timing, display name, IP address and port can be defined in the weston.ini file. From this configuration, Weston creates a virtual display "area", and all content which appears in this area is encoded in the M-JPEG video compression format and transmitted to the configured IP address. The transmitter plugin mentioned in the above diagram creates the virtual display and handles input events arriving over Waltham.
All sources required to realize a virtual display on Weston using Waltham can be found at waltham, wayland, wayland-ivi-extension. [TODO] Add repositories for transmitter plugin and Waltham server.
Android supports the creation of multiple virtual displays. Applications can render to a virtual display, and Android provides capabilities to access the framebuffer of the virtual display. Sharing this framebuffer over the network or by means of memory sharing is left to individual implementations. One such solution on Android is described below:
The above solution from Allgo is explained in this video.
Shared State, Independent Rendering
The shared state category refers to multiple ECUs or operating systems rendering independent but coordinated UIs by sharing only the state and data that define the HMI, rather than graphical elements themselves. UI state such as window position, other data defining the displayed content, and input events are shared over a communication channel. Because both sides implement the same graphical elements and look-and-feel, this can still provide a synchronised user experience in which displays connected to different ECUs or operating systems appear to belong to a single system.
In the above diagram, the HMI application on ECU1 is able to render to a display on ECU2 by sharing state through the HMI framework, which synchronizes the states on both ECUs. This way the HMI application can control where its content is rendered and shown. The system can also be configured so that the HMI application does not even know where it is rendered and displayed. Events from input devices change the state in the HMI framework on ECU2, and the states are synchronized between the HMI frameworks on ECU1 and ECU2. Thus, a completely interactive UI can be realized.
Use cases :
- The user wants to drag and drop a phone contact list from the multimedia system display to the instrument cluster display.
- The user wants to drag and drop the navigation application from the multimedia system display to the instrument cluster display.
Design considerations :
- A seamless user experience depends on network bandwidth and latency.
- Using only shared state results in duplication of UI resources (images, scene graph, navigation data). On the other hand, sharing these resources over the network impacts performance.
- Synchronized software updates are required between ECUs using the shared state concept.
- Shared state is a feature of the UI toolkit. If the same UI toolkit is available on different operating systems, this approach is attractive in terms of portability.
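The shared state idea can be sketched in a few lines of Python: two nodes hold replicas of the same state, a change on either side is propagated as a small key/value update, and each node renders on its own. The class, the node names and the property names are hypothetical, and a direct method call stands in for the communication channel.

```python
class SharedState:
    """One node's replica of the HMI state, rendered independently."""

    def __init__(self, name):
        self.name = name
        self.values = {}
        self.peers = []
        self.rendered = []  # log of local redraws, stands in for real rendering

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def set(self, key, value):
        # Apply locally first, then forward the change to every peer.
        if self._apply(key, value):
            for peer in self.peers:
                peer._apply(key, value)  # stands in for a network message

    def _apply(self, key, value):
        if key in self.values and self.values[key] == value:
            return False  # nothing changed, so nothing to redraw or forward
        self.values[key] = value
        self.rendered.append(f"{self.name}: {key}={value}")
        return True


ivi = SharedState("ivi")
cluster = SharedState("cluster")
ivi.connect(cluster)

ivi.set("navigationMode", True)   # e.g. the user taps a button on the IVI
cluster.set("zoom", 3)            # input on the cluster flows back the same way
```

Only the changed property crosses the channel; the images and scene data needed to draw it must already exist on both nodes, which is the resource duplication noted above.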
Using Qt Remote Objects for shared graphics state
Qt Remote Objects (https://doc.qt.io/qt-5.12/qtremoteobjects-index.html) is an inter-process and inter-domain communication module developed for Qt. It can be used to implement shared state between two or more objects. Below is an example diagram showing Qt Remote Objects used to share navigation state between the IVI and the cluster.
1) A .rep file for the QtRO compiler is shared between the entities and defines the API to be shared (properties, signals and slots).
2) The QtRO compiler generates a SimpleSource class that can be inherited or used as is, by creating source and replica objects and exposing them as QML context properties.
3) Exposed properties can be used in QML (https://doc.qt.io/qt-5/qmlapplications.html) application code, and changes made to properties are propagated to all replica objects. The cluster switches to the navigation state when the IVI changes the navigationMode property in its onClicked() handler.
Harman shared state concept:
TODO: Contribution here from Harman.
GPU sharing refers to sharing GPU hardware between multiple operating systems running on a hypervisor. When the GPU is shared, each operating system works as though the GPU hardware were assigned to it alone.
GPU sharing can be implemented by:
- Providing virtualization capabilities in the GPU hardware.
- Using a virtual device which schedules render jobs on the GPU by communicating with the operating system that holds GPU hardware access.
GPU sharing using hardware virtualization capability:
Modern GPUs are equipped with their own MMU. Additionally, GPU hardware can implement functionality to identify the operating system which is scheduling a render job. One way of doing this is to use the domain ID of the operating system to identify its render jobs. When a particular render job is executed, the GPU can load the page tables corresponding to that domain (operating system), thus ensuring that a render job from one operating system does not access memory dedicated to other operating systems. Additional capabilities may be needed: for example, if one of the domains is safety relevant, GPU faults in non-safe domains should not affect it, and it should be possible to prioritize the render jobs from different domains.
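The isolation mechanism described above can be modelled with a toy Python simulation. It assumes a simplified GPU with one page table per domain ID; the class, the domain names and the page numbers are hypothetical and do not correspond to any real GPU or driver interface.

```python
class GpuFault(Exception):
    """Raised when a render job touches a page its domain has not mapped."""


class SimulatedGpu:
    def __init__(self):
        self.page_tables = {}  # domain ID -> {virtual page: physical page}
        self.memory = {}       # physical page -> contents

    def map_page(self, domain_id, virtual_page, physical_page):
        self.page_tables.setdefault(domain_id, {})[virtual_page] = physical_page

    def run_job(self, domain_id, writes):
        # "Load" the page tables belonging to the submitting domain only.
        table = self.page_tables.get(domain_id, {})
        for virtual_page, value in writes:
            if virtual_page not in table:
                raise GpuFault(f"domain {domain_id!r}: page {virtual_page} unmapped")
            self.memory[table[virtual_page]] = value


gpu = SimulatedGpu()
gpu.map_page("cluster", 0, 100)  # both domains use virtual page 0 ...
gpu.map_page("ivi", 0, 200)      # ... but it resolves to different memory

gpu.run_job("cluster", [(0, "speedometer")])
gpu.run_job("ivi", [(0, "map tile")])
```

A job from the "ivi" domain that addresses an unmapped page raises GpuFault and leaves the memory of the "cluster" domain untouched. The sketch covers only memory isolation, not the fault containment and job prioritization also mentioned above.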
GPU sharing using virtual device:
Domains running on a hypervisor can be categorized as host and guest. The host domain has full access to the GPU hardware, whereas the guests use a different GPU driver which communicates with the host. A guest's render job is scheduled on the GPU by the host: the guest uses a virtual device which communicates with a server on the host domain to execute the render job.
Use cases :
- An instrument cluster and an infotainment unit running as different domains on a hypervisor. Both need access to the GPU hardware.
Design considerations :
- GPU sharing requires a lot of attention in terms of security aspects, specifically when one or more of the domains are safety relevant.
- GPU sharing without GPU virtualization capabilities introduces communication and memory copy overhead. This affects performance.
virtio-gpu 3D is a virtual device with an open source implementation, in which some modifications to the Mesa library (an open-source implementation of many graphics APIs, including OpenGL, OpenGL ES (versions 1, 2, 3) and OpenCL) have been made on the guest side. Applications on the guest side still speak unmodified OpenGL to the Mesa library. But instead of Mesa handing commands over to the hardware, they are channeled through virtio-gpu to the backend on the host. The backend receives the raw graphics stack state (Gallium state) and uses virglrenderer to translate it into OpenGL, which can be executed as entirely normal OpenGL on the host machine. The host also translates shaders from the TGSI format used by Gallium into the GLSL format used by OpenGL. The OpenGL stack on the host side does not even have to be Mesa; it could be a proprietary graphics stack.
In display sharing, a display is shared between multiple operating systems, each of which behaves as if it had complete access to the display. The operating systems cooperate so that the content on the screen is meaningful. Specific areas of the display may also be dedicated to each operating system.
Realizing display sharing requires a "hardware layer" feature in the display controller or in linked image processing hardware. Each hardware layer holds a framebuffer and supports operations (e.g. scaling, color keying and so on) on the pixels. The framebuffers of the hardware layers are blitted together to form the final image which goes to the display. This blitting is carried out by the hardware itself.
Each operating system renders to one or more hardware layers. Each hardware layer is associated with one operating system.
[TODO]: Choose better pictures e.g. navigation inside a cluster.
In the above diagram, each black slit in the display controller hardware can be thought of as a hardware layer that accepts a framebuffer. In this conceptual diagram, the display controller has composition capability: it combines the hardware layers and sends the final image to the display.
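The way hardware layers combine into one image can be sketched in Python for a single scanline. The layer contents, the Z-order values and the use of None as a transparent (color-keyed) pixel are assumptions made for illustration; real display controllers perform this step in hardware.

```python
TRANSPARENT = None  # stands in for a color-keyed or fully transparent pixel


def compose(layers, width, background="."):
    """Combine one scanline from several layers; higher z_order is in front.

    layers: list of (z_order, pixels) tuples, one per hardware layer.
    """
    line = [background] * width
    # Draw back to front so that nearer layers overwrite farther ones.
    for _, pixels in sorted(layers, key=lambda layer: layer[0]):
        for x, pixel in enumerate(pixels[:width]):
            if pixel is not TRANSPARENT:
                line[x] = pixel
    return "".join(line)


linux_layer = (0, list("LLLLLLLL"))                       # rendered by one OS
cluster_layer = (1, ["C", "C", None, None, None, None, "C", "C"])  # by another

print(compose([cluster_layer, linux_layer], 8))  # prints "CCLLLLCC"
```

Neither operating system needs to know about the other: each fills its own layer, and the layer in front can never be overwritten by the one behind it.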
Use cases :
- A display shared by an instrument cluster and a multimedia system. If each of these is controlled by a different operating system in a virtualized environment, display sharing can be used.
If each operating system renders to a specific area of the display, display sharing lets the operating systems work without knowing about each other, i.e. no communication between them is needed to share the display. However, if one operating system presents content that depends on another operating system, communication is required.
Where hardware display layers are not available, a form of display sharing may be provided in software by the Display Manager of certain hypervisors. The Display Manager provides a mechanism for guest VMs to pass display buffers, which it then combines.
Design considerations :
- In a hypervised system, if multiple operating systems each render to a fixed area of the display, display sharing can be used without any communication between the operating systems.
- Requires implementation in the hypervisor. Specific implementations are often closed source.
The Renesas R-Car Canvas Demo demonstrates consolidation of cockpit functionality onto a single H3 SoC running on a Salvator-X board, driving three displays and multiple operating systems. In the version discussed here, the Integrity Multivisor hypervisor is used to virtualise an Integrity RTOS instance running a cluster demo and a Linux instance running an IVI stack and other functionality. The first display is shared between Integrity and Linux, whilst the second and third displays are dedicated to Linux. For display one, Integrity and Linux are assigned different hardware display layers in the VSPD hardware block, with Linux placed behind the Integrity cluster in the Z-ordering to ensure that the Linux applications cannot overwrite the cluster graphics.
[TODO] : Provide a video link.
Choosing a strategy
Comparison of approaches
Display Sharing
- The physical display can be shared across multiple operating systems or major functional units of a single operating system.
- A hardware compositor block provides multiple layers that can be assigned to different functional units. The hardware then combines (composites) them to create the final display buffer.
- The compositor hardware block is typically close to or part of the display unit hardware block.
- The different systems do not need to know about each other.
- No need for protocols to pass graphics between operating systems.
Surface Sharing
- Little to no impact for the content producer (depends on implementation)
- Integration in existing graphics systems is possible with reasonable effort
- The content of individual graphical applications can be shared
API Remoting
- Bandwidth efficient (for certain cases)
- Offloads rendering effort to receiver
- Flexible, while the controller has detailed control over results
- Potential API complexity – needing synchronized software/API updates
Shared State Independent Rendering
- Low inter-domain data channel bandwidth usage
- Applicability to mid/low performant SoC
- Operating System - agnostic approach
GPU Sharing
- Access to the GPU hardware from several OSs: cost saving.
- Safety concerns: forced preemption needs to be implemented to support safety use cases
Hardware may provide additional features that enhance isolation of the different functional units... characteristics of each - trade-offs, ... "dry data"
... Describe each example and then the recommended approach and why. This makes the dry comparison of the previous chapter more understandable.