Waiting for I/O is typical for microservices: the response of a microservice is often assembled from several responses from other microservices and databases.
In the classical approach, the problem of waiting for I/O efficiently is solved with callbacks: a function (called a callback) is passed to the method that performs I/O, and the method calls it when the wait is complete. If you need to perform several I/O operations sequentially, the callback from the first I/O method calls the method for the next I/O and passes the next callback to it. As a result, you get code that is unpleasant to write and difficult to maintain because of the many nested functions and the non-obvious control flow.
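To make the nesting concrete, here is a minimal sketch of the callback style. `FakeAsyncRead`, `RunEventLoop` and `LoadUserAndOrders` are hypothetical names invented for this illustration: the "I/O" is simulated by a toy event loop that runs queued callbacks later, and none of it is userver API.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Toy event loop: pending callbacks run later, when the "I/O completes".
std::queue<std::function<void()>> g_pending;

// Hypothetical async I/O method: returns immediately and invokes
// `on_done` with the "read" value once the event loop gets to it.
void FakeAsyncRead(const std::string& key,
                   std::function<void(std::string)> on_done) {
  g_pending.push([key, on_done = std::move(on_done)] {
    on_done("value_of_" + key);
  });
}

void RunEventLoop() {
  while (!g_pending.empty()) {
    auto callback = std::move(g_pending.front());
    g_pending.pop();
    callback();
  }
}

// Two sequential reads: the second read has to be started from inside
// the callback of the first one, so every extra step adds a nesting level.
std::string LoadUserAndOrders() {
  std::string result;
  FakeAsyncRead("user", [&result](std::string user) {
    FakeAsyncRead("orders", [&result, user](std::string orders) {
      result = user + "+" + orders;
    });
  });
  assert(result.empty());  // nothing has completed yet, only been scheduled
  RunEventLoop();
  return result;
}
```

Even with only two steps and no error handling, the control flow no longer reads top to bottom: the line that produces `result` runs long after `FakeAsyncRead("user", ...)` has returned.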
The userver framework with stackful coroutines comes to the rescue.
For the user of the framework, the code becomes simple and linear, but everything works efficiently:
To keep the example simple, all lines where the coroutine can be paused are marked with `// 🚀`. In other frameworks or programming languages it is often necessary to explicitly mark the context switch of the coroutine: on each line with the `// 🚀` comment you would have to write a keyword like `await`. In userver you do not need to do this: switching occurs automatically, and you do not need to think about the implementation details of the framework's methods.
Unlike in Python, in userver multiple coroutines can be executed simultaneously on different processor cores. For example, `View::Handle` may be called in parallel for different requests.
Now compare the above userver code with the classic callback approach:
The classical approach is almost twice as long, and it is difficult to read and maintain because of the deep nesting of the lambda functions. Moreover, the time-consuming handling of error codes is completely omitted, while in the first example all errors are automatically reported through the exception mechanism.
In simple terms, coroutines are lightweight threads that are managed by the application itself, not by the operating system.
Coroutines provide an additional benefit in high-load applications:
However, cooperative multitasking has a disadvantage:
The user of the framework does not work directly with coroutines, but rather with tasks that are executed on coroutines. Creating and destroying a coroutine for each asynchronous operation is very expensive, so coroutines are reused between tasks. When a new task starts, a free coroutine is selected to host it. After the task finishes, the coroutine is released and can be used by another task.
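The reuse scheme can be illustrated with a toy model. This is only a sketch of the idea, not userver's implementation: `FakeCoroutine`, `CoroutinePool` and `CoroutinesCreatedFor` are hypothetical names, the "coroutine" is just a buffer standing in for a stack, and tasks run to completion synchronously instead of being suspended and resumed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a coroutine: in a real engine it would own a stack,
// which is what makes creation expensive. Here it is just a buffer.
struct FakeCoroutine {
  std::vector<char> stack = std::vector<char>(64 * 1024);
};

class CoroutinePool {
 public:
  // Runs `task` on a free coroutine, creating a new one only if none is idle.
  void RunTask(const std::function<void()>& task) {
    std::unique_ptr<FakeCoroutine> coro;
    if (free_.empty()) {
      coro = std::make_unique<FakeCoroutine>();
      ++created_;
    } else {
      coro = std::move(free_.back());
      free_.pop_back();
    }
    task();                            // the task "executes on" the coroutine
    free_.push_back(std::move(coro));  // release it for the next task
  }

  std::size_t CreatedCount() const { return created_; }

 private:
  std::vector<std::unique_ptr<FakeCoroutine>> free_;
  std::size_t created_ = 0;
};

// Demonstration helper: runs `n` tasks one after another and reports
// how many coroutines had to be created to host them.
std::size_t CoroutinesCreatedFor(int n) {
  CoroutinePool pool;
  int completed = 0;
  for (int i = 0; i < n; ++i) pool.RunTask([&completed] { ++completed; });
  assert(completed == n);
  return pool.CreatedCount();
}
```

In this model a thousand sequential tasks are hosted by a single coroutine, so the allocation cost is paid once rather than once per task.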
The main purpose of userver is to be an efficient solution for I/O-bound applications. Coroutines help to achieve that purpose.
It takes less than 1 µs to invoke and wait for a no-op task `engine::Async([]() {}).Wait()`. In this case, the task execution time is automatically measured inside `engine::Async()`, the task is linked to the parent task to simplify tracing, and all of this information is recorded in the logs.
For comparison, `std::thread([]() {}).join()` takes ~17 µs and does not provide or log any information.
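The `std::thread` side of this comparison is easy to reproduce approximately. The sketch below is not the benchmark behind the figure above: `AverageThreadSpawnJoinMicros` is a hypothetical helper that averages the wall-clock cost of spawning and joining an empty thread, and the result depends heavily on the machine, the operating system and the build flags.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Average wall-clock cost, in microseconds, of creating an OS thread
// that does nothing and then joining it.
double AverageThreadSpawnJoinMicros(int iterations) {
  assert(iterations > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    std::thread([] {}).join();
  }
  const auto elapsed = std::chrono::steady_clock::now() - start;
  return std::chrono::duration<double, std::micro>(elapsed).count() /
         iterations;
}
```

Calling `AverageThreadSpawnJoinMicros(10000)` in an optimized build gives a rough per-thread figure to set against the sub-microsecond task cost quoted above.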
The userver synchronization primitives are comparable in performance to the standard primitives, and we are constantly working to improve them. For example, concurrent `lock()` and `unlock()` calls on the same mutex from different threads on a 2-core system with Hyper-threading produce the following timings:
Competing threads | std::mutex | Mutex |
---|---|---|
1 | 22 ns | 19 ns |
2 | 205 ns | 154 ns |
4 | 403 ns | 669 ns |
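A rough version of the `std::mutex` column can be measured with a few lines of standard C++. This is a sketch under stated assumptions, not the benchmark that produced the table: `MeasureStdMutex` and `ContentionResult` are hypothetical names, the measured interval includes thread start-up and join, and only the standard mutex is covered, so the absolute numbers will not match the table.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

struct ContentionResult {
  long long counter;      // total number of lock()/unlock() pairs performed
  double nanos_per_lock;  // wall-clock time divided by that total
};

// Starts `threads` threads that all lock and unlock the same std::mutex
// `iterations_per_thread` times, and reports the average cost per pair.
ContentionResult MeasureStdMutex(int threads, int iterations_per_thread) {
  assert(threads > 0 && iterations_per_thread > 0);
  std::mutex mutex;
  long long counter = 0;  // protected by `mutex`

  const auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&] {
      for (int i = 0; i < iterations_per_thread; ++i) {
        mutex.lock();
        ++counter;
        mutex.unlock();
      }
    });
  }
  for (auto& worker : workers) worker.join();
  const auto elapsed = std::chrono::steady_clock::now() - start;

  const double nanos =
      std::chrono::duration<double, std::nano>(elapsed).count();
  return {counter, nanos / static_cast<double>(counter)};
}
```

Running it with 1, 2 and 4 threads and a large iteration count shows the same general trend as the table: the cost per `lock()`/`unlock()` pair grows as more threads compete for the mutex.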