Like most drivers in 🐙 userver the uMySQL driver consists of three logically separate layers:
Briefly, all these layers are glued like this:
There actually isn't that much to tell about the infrastructure layer: it's just a bunch of connection pools, defined by topology, where every pool maintains some lock-free storage of connections and rents them to the user-facing layer.
The communication layer and the user-facing layer, however, are a different story, since they have to do some magic to not only make user types work out of the box, but also make containers of user types work seamlessly as well.
MYSQL_BIND is a structure used by mariadbclient and mysqlclient for binding prepared statement parameters/results from/to C types. When a statement is executed, for every parameter marker in it mariadbclient (we'll only talk about mariadbclient here, since that is what the driver uses) feeds the data from the corresponding MYSQL_BIND to the server, and when statement execution results are fetched from the server, mariadbclient populates MYSQL_BIND with data.
These processes are different, and we'll talk about the input and output parts separately.
For input parameter binding, MYSQL_BIND resembles this:
which is self-explanatory: to bind NULL one should set buffer_type to MYSQL_TYPE_NULL; to bind int a one should set buffer_type to MYSQL_TYPE_LONG and buffer to &a; to bind std::string str one should set buffer_type to MYSQL_TYPE_STRING and buffer/buffer_length to str.data()/str.size() accordingly. You get the idea. For types that have no corresponding counterpart (say, std::chrono::... or userver::formats::json::Value) one should first serialize them to something mariadbclient understands and bind that serialized representation.
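Those three cases can be sketched roughly as follows. To stay self-contained the sketch does not include the mariadbclient headers: InputBind and FieldType are hypothetical stand-ins that mirror only the MYSQL_BIND fields and MYSQL_TYPE_* constants named above, and BindNull/BindInt/BindString are illustrative helper names, not functions of the driver.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for MYSQL_TYPE_NULL / MYSQL_TYPE_LONG / MYSQL_TYPE_STRING.
enum class FieldType { kNull, kLong, kString };

// Hypothetical stand-in mirroring the MYSQL_BIND fields used for input.
struct InputBind {
  FieldType buffer_type = FieldType::kNull;
  const void* buffer = nullptr;
  std::size_t buffer_length = 0;
};

// NULL: only the type matters, there is no data to point at.
inline InputBind BindNull() { return InputBind{}; }

// int: point the bind at the value itself.
inline InputBind BindInt(const int& a) {
  InputBind bind;
  bind.buffer_type = FieldType::kLong;
  bind.buffer = &a;
  bind.buffer_length = sizeof(a);
  return bind;
}

// std::string: point the bind at the characters and record their count.
inline InputBind BindString(const std::string& str) {
  InputBind bind;
  bind.buffer_type = FieldType::kString;
  bind.buffer = str.data();
  bind.buffer_length = str.size();
  return bind;
}
```

Note that a bind only stores pointers, so in this sketch the bound int or std::string has to outlive the bind.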
This is cool and all, and it should be pretty straightforward by now how to specialize some templates for a selected subset of types (ints, strings and so on) for binding, but how does one make it work with user-defined aggregates that are unknown upfront?
And this is the "some magic" part. Written by none other than the lead maintainer of userver, Antony Polukhin himself, the boost::pfr library provides "std::tuple like methods for user defined types without any macro or boilerplate code", which allows one to do this:
It has some limitations, but in general it "just works", and with it the driver can:
1. allocate an array of MYSQL_BIND of the size needed;
2. for every field F of an aggregate T, find an appropriate bind function;
3. bind F into the corresponding MYSQL_BIND.

C++ templates are something else indeed.
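The three steps above can be sketched without any external dependency. This is not how the driver is written: Bind and FieldType are hypothetical stand-ins for MYSQL_BIND and its type constants, UserRow is a made-up aggregate, and TieThreeFields is a fixed-arity stand-in (structured bindings for exactly three fields) for the tuple-like view that boost::pfr derives for any arity.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <tuple>

enum class FieldType { kLong, kString };  // hypothetical type constants

// Hypothetical stand-in for MYSQL_BIND.
struct Bind {
  FieldType buffer_type;
  const void* buffer;
  std::size_t buffer_length;
};

// Step 2: one bind function per supported field type, chosen by overload resolution.
inline Bind BindField(const int& field) {
  return Bind{FieldType::kLong, &field, sizeof(field)};
}
inline Bind BindField(const std::string& field) {
  return Bind{FieldType::kString, field.data(), field.size()};
}

// Stand-in for the boost::pfr view: works only for three-field aggregates.
template <typename T>
auto TieThreeFields(const T& row) {
  const auto& [f0, f1, f2] = row;
  return std::tie(f0, f1, f2);
}

// Steps 1 and 3: an array of the size needed, one bind per field.
template <typename T>
std::array<Bind, 3> BindAggregate(const T& row) {
  return std::apply(
      [](const auto&... fields) {
        return std::array<Bind, 3>{BindField(fields)...};
      },
      TieThreeFields(row));
}

// Hypothetical user-defined aggregate, unknown to the code above.
struct UserRow {
  int id;
  std::string name;
  int age;
};
```

BindAggregate never mentions UserRow, which is the whole point: any three-field aggregate of ints and strings works without extra code from the user.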
MariaDB 10.2.6+ supports batched inserts, and mariadbclient supports them too (in general it supports an array of params in a prepared statement), as one could expect.
The API is pretty straightforward:
Generate some binds, set STMT_ATTR_ARRAY_SIZE to specify the number of rows, set STMT_ATTR_ROW_SIZE to specify the row size so mariadbclient can do its pointer arithmetic, and you are good to... Hold up.
Pointer arithmetic, right, that's the catch. Mariadbclient, being a C library, expects (and rightfully so) user data to be a C struct as well, and the API expects an array of these structs to be contiguous in memory, so it can add STMT_ATTR_ROW_SIZE to MYSQL_BIND.buffer to get the field buffer of the next row.
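To make the catch concrete, here is a small sketch of that stride arithmetic. CRow and FieldOfRow are hypothetical names used only for illustration; the sketch just shows why the rows have to be plain structs laid out back to back.

```cpp
#include <cstddef>

// A plain C-style row, the shape the array-binding API expects.
struct CRow {
  int id;
  double score;
};

// Given the address of a field in row 0 and the row size, computes the
// address of the same field in row `index` by stepping whole rows forward.
inline const void* FieldOfRow(const void* first_row_field,
                              std::size_t row_size, std::size_t index) {
  return static_cast<const char*>(first_row_field) + row_size * index;
}
```

With an array CRow rows[3], FieldOfRow(&rows[0].score, sizeof(CRow), 2) lands exactly on rows[2].score. The same computation is meaningless for, say, a std::list of rows or for rows holding a std::string, whose characters live elsewhere.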
We could generate a std::vector of tuples of conforming C types, but that means potentially heavy allocations and a lot of difficult-to-understand metaprogramming, and it turns out there is a better way: STMT_ATTR_CB_PARAM.
This is an undocumented feature (https://jira.mariadb.org/browse/CONC-348) and it is used in only 2 repositories across all of GitHub (guess where: mariadb-connector-c++ and mariadb-connector-python), but it has been there since Dec 3, 2018, so I consider it stable enough.
Basically, it allows one to specify a callback which will be called before mariadbclient copies data from MYSQL_BIND into its internals. With it, instead of transforming the user container into a std::vector, we can maintain a mere iterator into that user container, and when the prebind callback fires, just populate the binds and advance the iterator.
Voila: no heavy metaprogramming (to transform a user type into a tuple of C types), no intermediate allocations; a thing of beauty IMO.
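The iterator trick can be sketched like this. Nothing here is the real mariadbclient API: Bind is a hypothetical stand-in for MYSQL_BIND, ExecuteBatch is a made-up stand-in for the client library that simply invokes a callback before it copies each row, and the real callback has a different signature.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for MYSQL_BIND (one string parameter per row).
struct Bind {
  const void* buffer = nullptr;
  std::size_t buffer_length = 0;
};

// Hypothetical stand-in for the client library: before consuming a row it
// fires the callback, then copies whatever the bind currently points at.
inline std::vector<std::string> ExecuteBatch(
    Bind& bind, std::size_t rows, const std::function<void()>& before_row) {
  std::vector<std::string> sent;
  for (std::size_t i = 0; i < rows; ++i) {
    before_row();
    sent.emplace_back(static_cast<const char*>(bind.buffer),
                      bind.buffer_length);
  }
  return sent;
}

// Works for any container of strings: only an iterator is kept, no copy of
// the container is made and no contiguity is required.
template <typename Container>
std::vector<std::string> InsertAll(const Container& rows) {
  Bind bind;
  auto it = rows.begin();
  return ExecuteBatch(bind, rows.size(), [&bind, &it] {
    bind.buffer = it->data();         // populate the bind from the current row
    bind.buffer_length = it->size();
    ++it;                             // and advance to the next one
  });
}
```

Calling InsertAll on a std::list<std::string> works just as well as on a std::vector, which is exactly what the row-size approach could not offer.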
This is the part responsible for the .AsVector<T> API. It uses the same machinery with boost::pfr, but binding is done differently: you see, for input params we already know everything necessary (whether a std::optional contains a value, what the length of the std::string we are binding is, etc.), but for output we don't until the data is actually fetched; luckily, MYSQL_BIND has us covered.
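A rough sketch of what "not knowing until the data is fetched" means in practice, under loud assumptions: OutputBind is a hypothetical stand-in for the output-related part of MYSQL_BIND, and FetchMeta/FetchColumn are made-up functions imitating a client that first reports the null flag and length and only then copies the payload. The destination can be sized only between those two phases.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical stand-in for the output-related fields of MYSQL_BIND.
struct OutputBind {
  void* buffer = nullptr;
  std::size_t buffer_length = 0;
  bool is_null_value = false;
};

// Hypothetical "server side" value of a nullable text column.
using ColumnValue = std::optional<std::string>;

// Phase 1 (made-up): only the null flag and the length are reported.
inline void FetchMeta(const ColumnValue& column, OutputBind& bind) {
  bind.is_null_value = !column.has_value();
  bind.buffer_length = column ? column->size() : 0;
}

// Phase 2 (made-up): the payload is copied into the prepared buffer.
inline void FetchColumn(const ColumnValue& column, OutputBind& bind) {
  if (column && bind.buffer != nullptr) {
    std::memcpy(bind.buffer, column->data(), bind.buffer_length);
  }
}

// Fetches one nullable string column into a user-side std::optional.
inline void FetchInto(const ColumnValue& column,
                      std::optional<std::string>& out) {
  OutputBind bind;
  FetchMeta(column, bind);
  if (bind.is_null_value) {
    out.reset();  // NULL in the DB: nothing to size, nothing to copy
    return;
  }
  out.emplace(bind.buffer_length, '\0');  // storage of exactly the right size
  bind.buffer = out->data();
  FetchColumn(column, bind);
}
```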
For output fields, MYSQL_BIND resembles this:
When initializing a bind for an output field one can set length to &buffer_length and is_null to &is_null_value, and then follow this process:
1. Fetch the next row; if the fetch reports MYSQL_NO_DATA, we are done.
2. Prepare the storage for every field: .resize for strings, .emplace for optionals, allocate intermediate buffers if needed (there are some types, say, userver::formats::json::Value, for which we can't just .resize()).

Note that mariadbclient does not account for buffer being nullptr for numeric types and just writes into it if a value is present in the DB, so for optionals of numeric types we .emplace() them at the bind stage and .reset() them at the fetch stage if the value is null.

User types are created and stored in a container in the user-facing layer, which implements AsContainer<Container> like this:
1. Create a Container::value_type{}.
2. Populate its fields from the fetched row via boost::pfr.
3. Move the Container::value_type instance into the result Container and go to step 1.

There are some optimizations implemented:
* Instead of creating T on the stack and then moving it into the vector, we can just .emplace() into the vector and operate on its .back(); but since that doesn't work in general (say, unordered_set), we only do that for white-listed types.
* Instead of calling mysql_stmt_bind_result for every new row (it copies all the binds supplied, which is 112 bytes per bind and a huge overhead), we can just reuse the binds mariadbclient has copied after the first call. There is no API to extract them (but a field is there, and C structs aren't that private), so strictly speaking it might break one day... but it likely won't. https://jira.mariadb.org/browse/CONC-620
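The first optimization can be sketched as below. This is an illustration, not the driver's code: IsEmplaceInPlace is a hypothetical white-list trait, and fetch_row is a hypothetical callable of the form bool(value_type&) that fills a row and returns false when there are no more rows.

```cpp
#include <string>
#include <type_traits>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical white-list: containers where "emplace, then fill back()" works.
template <typename Container>
struct IsEmplaceInPlace : std::false_type {};
template <typename T, typename Alloc>
struct IsEmplaceInPlace<std::vector<T, Alloc>> : std::true_type {};

template <typename Container, typename FetchRow>
Container AsContainer(FetchRow fetch_row) {
  using Row = typename Container::value_type;
  Container result;
  if constexpr (IsEmplaceInPlace<Container>::value) {
    // Optimized path: the row is constructed and filled in place.
    for (;;) {
      result.emplace_back();
      if (!fetch_row(result.back())) {
        result.pop_back();  // the last slot was never filled
        break;
      }
    }
  } else {
    // Generic path: fill a temporary, then move it into the container.
    for (;;) {
      Row row{};
      if (!fetch_row(row)) break;
      result.insert(result.end(), std::move(row));
    }
  }
  return result;
}
```

The generic path is needed for containers like unordered_set, where an element cannot be modified after insertion because its position depends on its value.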