The Current State of the System (As I see it)
The graphics drivers for Haiku (which reside in kernel space) are minimal: they exist only to transfer information to the card, while the accelerant, which resides in user space, is meant to do the heavy lifting. In theory, if a crash happens it will be in the accelerant, which would be better for the system. In practice, the accelerant is an app_server add-on and thus resides in the same memory space as the app_server, which is a system-critical team. If the app_server crashes, the OS crashes. Therefore, if an accelerant crashes, the whole system crashes, which entirely defeats the purpose of splitting code between the accelerant and driver. For more information on how drawing occurs in the app_server (and system-wide in general), please see: https://www.haiku-os.org/articles/2011-06-15_how_transform_app_server_code_use_compositing
The Proposed New System
In the new system, some code is executed on the client side to prepare the data to be sent to the graphics server. The graphics server executes in its own team, isolating it from any system-critical teams and allowing for better crash recovery. The primary means of information transport are shared memory regions, called... well... "areas" in Haiku. Initial setup is performed through normal message passing, which in this instance is a slower means of communication and limits how much data can be sent (though in some cases data sent through messaging is ultimately transported via an area anyway).
Drawing commands in the client start off as function calls into a shared library. This library breaks the function calls down into opcodes and inserts those into a ringbuffer read by the server. The server uses a virtual machine to act on those opcodes and, in the case of software rendering, draw to a framebuffer. In the case of hardware rendering, the accelerant, which is now a graphics server add-on, runs its VM over the ringbuffer and sends the processed commands to a second ringbuffer to be consumed by the driver and sent to the hardware. (In reality the software renderer is itself an accelerant.) The one exception to this method is an API function that must return data to the caller: in that case the ringbuffer(s) have to be flushed before returning the value, since pending operations may affect the value ultimately returned.
The system just described is indirect rendering, which is a bit dated and does incur some performance cost. However, it shouldn't be too difficult to convert this design to direct rendering in the future.
Misc Notes and Thoughts
Looking at the idea of crash recovery, there are some specific problems that crop up:
- If the server handles all context information, then that information is lost upon crash.
- The client can store context information, but how can the server query the clients for that info?
- Also, what about the ringbuffers?
The data in the ringbuffers doesn't matter, as it can be redrawn once the server restarts (since the clients will keep drawing). That brings up another point: what if the server is in an indeterminate (crashed) state when drawing commands are issued? Maybe the ringbuffers could be stored in the app_server. The app_server can come into play here and act as a mediator for fixing up data, but only when that data is in shared memory. I suppose there could be an area just for holding pointers, but that would add an additional layer of indirection.
I'm wondering if it might not be beneficial to have the primary ringbuffer on the client side as well. The app_server would then just hold references to the clients, which the server itself could query upon restart. That way the clients stay entirely stable even if the server crashes. The problem then becomes stability of the server should a client crash. Or (again) the ringbuffers could live in the app_server, which would act only as a data storage area and thus not be prone to crashes caused by invalid data inside the areas. This would isolate the server and clients from each other and so mitigate crash propagation should one or the other fail catastrophically.
"Crash recovery" has been implemented with the new test app_server. The framebuffers *can* be stored in the app_server, but the ringbuffers almost certainly *will* be stored there. Also as far as context data, there will be two copies of the data, one in the gfx_server so it can use that data, and the other in the app_server as a backup so the gfx_server can retrieve it in the event it crashes.