🟨 Serialize to and from a binary format using Reflection
This is a versioned binary serializer / deserializer built on top of Reflection.
It uses struct member iterators on the Reflection schema to serialize all members, and the recursively `Packed` property for optimizations, reducing the number of reads / writes (or `memcpy` calls) needed.
🟨 MVP
Within the limitations described here, the library should be usable, but it needs more testing and support for all of the relevant additional container data-types.
SC::SerializationBinary::write is used to serialize data. The schema itself is not used at all, but it could be written along with the binary data so that, when reading the data back in a later version of the program, the correct choice can be made between deserializing with SerializationBinary::loadVersioned (slower, but allows for missing fields and conversion) and deserializing with SerializationBinary::loadExact (faster, but the schema must match 1:1).
Parameter | Description |
---|---|
T | Type of object to be serialized (must be described by Reflection) |
value | The object to be serialized |
buffer | The buffer that will receive serialized bytes |
numberOfWrites | If provided, will return the number of serialization operations |

Returns `true` if serialization succeeded.

Assuming the following struct:

`PrimitiveStruct` can be written to a binary buffer with the following code:
SC::SerializationBinary::loadExact can deserialize binary data into a struct whose schema has not changed since SerializationBinary::write was used to generate that binary data. In other words, the schema of the type passed to SerializationBinary::write must match the schema of the type currently being deserialized. If the hashes of the two schemas match, this fast path can be used, skipping all versioning checks.
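
The choice between the exact and the versioned path can be sketched independently of the library. The snippet below is a minimal illustration, not the library's API: `hashSchema`, `LoadPath` and `chooseLoadPath` are hypothetical names, and hashing a textual schema description with FNV-1a is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical schema fingerprint: FNV-1a over a textual schema description.
// (Assumption: the real library derives its schema from Reflection instead.)
constexpr uint64_t hashSchema(std::string_view description)
{
    uint64_t hash = 14695981039346656037ull; // FNV-1a 64-bit offset basis
    for (char c : description)
    {
        hash ^= static_cast<uint8_t>(c);
        hash *= 1099511628211ull; // FNV-1a 64-bit prime
    }
    return hash;
}

// Hypothetical names for the two deserialization strategies.
enum class LoadPath
{
    Exact,    // fast path, schemas must match 1:1
    Versioned // slower path, tolerates schema differences
};

// Pick the fast path only when the stored and current schema hashes agree.
constexpr LoadPath chooseLoadPath(uint64_t storedHash, uint64_t currentHash)
{
    return storedHash == currentHash ? LoadPath::Exact : LoadPath::Versioned;
}
```

Storing such a hash next to the serialized bytes would let a later version of the program decide cheaply which of the two loading functions to call.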
Parameter | Description |
---|---|
T | Type of object to be deserialized (must be described by Reflection) |
value | The object to be deserialized |
buffer | The buffer holding actual bytes for deserialization |
numberOfReads | If provided, will return the number of deserialization operations |

Returns `true` if deserialization succeeded.

Assuming the following structs:

`TopLevelStruct` can be serialized and deserialized with the following code:
The versioned read serializer SC::SerializationBinary::loadVersioned must be used when the source and destination schemas do not match. Compatibility flags can be customized through an SC::SerializationBinaryOptions object, which allows remapping data coming from an older (or just different) version of the schema to the current one. SerializationBinary::loadVersioned uses the `memberTag` field specified in [Reflection](Reflection) to match fields between the source and destination schemas. When the types of the fields differ, a few options control the behaviour.
Parameter | Description |
---|---|
T | Type of object to be deserialized (must be described by Reflection) |
value | The object to deserialize |
buffer | The buffer holding the bytes to be used for deserialization |
schema | The schema used to serialize data in the buffer |
numberOfReads | If provided, will return the number of deserialization operations |
options | Options for data conversion (allow dropping fields, array items etc) |

Returns `true` if deserialization succeeded.
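
Matching fields by `memberTag` rather than by position can be illustrated with a small library-independent sketch. Everything in it is hypothetical (`TaggedField`, `PointV2`, `loadByTag`), and the source data is modelled as an already decoded list of tag/value pairs, which is an assumption made for brevity and not how the library stores data.

```cpp
#include <cassert>
#include <vector>

// Hypothetical decoded view of one serialized field: a member tag plus a value.
struct TaggedField
{
    int    memberTag;
    double value;
};

// Hypothetical destination struct of the current program version.
struct PointV2
{
    double x = 0; // memberTag 0
    double y = 0; // memberTag 1
};

// Copies source fields into the destination by memberTag, not by position.
// Returns false when a source tag has no destination and dropping is disabled.
inline bool loadByTag(const std::vector<TaggedField>& source, PointV2& destination,
                      bool allowDropExcessStructMembers)
{
    for (const TaggedField& field : source)
    {
        switch (field.memberTag)
        {
        case 0: destination.x = field.value; break;
        case 1: destination.y = field.value; break;
        default:
            if (!allowDropExcessStructMembers)
                return false; // unknown member and dropping is not allowed
            break;            // skip the unmatched member
        }
    }
    return true;
}
```

Because the lookup is driven by the tag, reordering members between versions does not affect the result, and a member removed from the destination is either skipped or reported as an error depending on the flag.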
Assuming the following structs:

`VersionedStruct2` can be deserialized from `VersionedStruct1` in the following way:

`Packed` structs by `offsetInBytes`.

Conversion options for the binary versioned deserializer.
Option | Description |
---|---|
allowFloatToIntTruncation | Can truncate a float to get an integer value. |
allowDropExcessArrayItems | Can drop array items if destination array is smaller. |
allowDropExcessStructMembers | Can drop fields not matching any memberTag in destination struct. |
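
How such flags might gate individual conversions can be sketched as follows. This is a simplified illustration under stated assumptions: `ConversionOptions` only mirrors the option names from the table and is not SC::SerializationBinaryOptions, the default values are chosen arbitrarily for the sketch, and `convertFloatToInt` / `convertArray` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the option names in the table above.
// The default values are an assumption of this sketch.
struct ConversionOptions
{
    bool allowFloatToIntTruncation    = true;
    bool allowDropExcessArrayItems    = true;
    bool allowDropExcessStructMembers = true;
};

// Converts a float from the source schema into an int32_t destination field.
// Assumes the value fits in the int32_t range. Fails if truncation is disabled.
inline bool convertFloatToInt(float source, int32_t& destination, const ConversionOptions& options)
{
    if (!options.allowFloatToIntTruncation)
        return false;
    destination = static_cast<int32_t>(source); // drops the fractional part
    return true;
}

// Copies a source array into a destination array of possibly smaller capacity.
// Excess source items are dropped only when the option allows it.
inline bool convertArray(const int32_t* source, std::size_t sourceCount, int32_t* destination,
                         std::size_t destinationCount, const ConversionOptions& options)
{
    if (sourceCount > destinationCount && !options.allowDropExcessArrayItems)
        return false;
    const std::size_t count = sourceCount < destinationCount ? sourceCount : destinationCount;
    for (std::size_t i = 0; i < count; ++i)
        destination[i] = source[i];
    return true;
}
```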
The binary format is defined as follows:

- Primitive types (`int`, `float` etc) are `Packed` by definition and get dumped to binary stream as is (with their native endian-ness)
- `struct` are serialized:
    - `Packed` == True: In a single `memcpy`-dump (as if sorted by `offsetInBytes`)
    - `Packed` == False: Serializing each field sorted by their visit order (and not the `memberTag`)
- `T[N]` arrays (fixed number of elements) are serialized:
    - `Packed` == True: In a single `memcpy`-dump
    - `Packed` == False: Serializing each array item in sequence
- `Vector<T>` (variable number of elements) are serialized:
    - `uint64_t`
    - `Packed` == True: In a single `memcpy`-dump
    - `Packed` == False: Serializing each vector item in sequence

If a `struct`, `T[N]` array or content of `Vector<T>` is made of a recursively `Packed` type (i.e. no padding bytes at any level inside the given type) it will be serialized in a single operation.
This can condense a large number of operations into a single one on types obeying the `Packed` property.
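
The difference between the two code paths can be shown with a small standalone sketch. It does not use the library: `ByteWriter`, `PackedPair`, `PaddedPair` and the `serialize` overloads are hypothetical, and the `static_assert` on `sizeof` is a hand-written stand-in for the `Packed` property that assumes a common ABI where `int32_t` is 4-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// No padding expected: two 4-byte members, 8 bytes in total.
struct PackedPair
{
    int32_t a;
    int32_t b;
};

// Padding expected on common ABIs: bytes are inserted after 'small' to align 'large'.
struct PaddedPair
{
    uint8_t small;
    int32_t large;
};

// Stand-in for the Packed property: the struct is exactly as large as the
// sum of its members, so it contains no padding bytes.
static_assert(sizeof(PackedPair) == 2 * sizeof(int32_t), "PackedPair must not contain padding");

// Hypothetical sink that appends bytes and counts write operations.
struct ByteWriter
{
    std::vector<uint8_t> bytes;
    int                  numberOfWrites = 0;

    void write(const void* data, std::size_t size)
    {
        const std::size_t offset = bytes.size();
        bytes.resize(offset + size);
        std::memcpy(bytes.data() + offset, data, size);
        numberOfWrites++;
    }
};

// Packed type: the whole object is copied in one operation.
inline void serialize(const PackedPair& value, ByteWriter& writer) { writer.write(&value, sizeof(value)); }

// Non-packed type: each member is written separately, so padding is never copied.
inline void serialize(const PaddedPair& value, ByteWriter& writer)
{
    writer.write(&value.small, sizeof(value.small));
    writer.write(&value.large, sizeof(value.large));
}
```

With this layout, `PackedPair` costs one write of 8 bytes while `PaddedPair` costs two writes totalling 5 bytes. The saving in operations grows with the number of members and with array length.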
It is possible to `static_assert` the `Packed` property of a type, if one wants to be sure not to accidentally introduce padding bytes in serialized types.

Some relevant blog posts are:
The binary serializer has an additional parallel implementation, see SerializationBinaryTypeErased in Libraries Extra.
This is just an experiment to check whether more runtime code and less compile-time code can bring compile times down further; a proper benchmark for it still needs to be built.
The binary serializer is not a streaming one, so loading a large data structure will at some point need twice the required space in memory.
This can be solved by implementing a streaming binary serializer or by experimenting with memory-mapped files to support really large binary data structures.
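
As a rough idea of what a streaming variant could look like, the sketch below writes items straight to a `std::ostream` instead of first collecting the whole payload in a buffer. It is not part of the library: `streamWrite` and `streamRead` are hypothetical, and the layout (a `uint64_t` item count followed by native-endian items) is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Writes an item count followed by each item directly to the stream, so no
// intermediate buffer holding the whole serialized payload is required.
inline bool streamWrite(std::ostream& out, const std::vector<int32_t>& items)
{
    const uint64_t count = items.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    for (int32_t item : items)
        out.write(reinterpret_cast<const char*>(&item), sizeof(item));
    return out.good();
}

// Reads the count, then consumes one item at a time from the stream.
inline bool streamRead(std::istream& in, std::vector<int32_t>& items)
{
    uint64_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof(count)))
        return false;
    items.clear();
    for (uint64_t i = 0; i < count; ++i)
    {
        int32_t item = 0;
        if (!in.read(reinterpret_cast<char*>(&item), sizeof(item)))
            return false; // stream ended before all items were read
        items.push_back(item);
    }
    return true;
}
```

The same pair of functions works with a file stream, in which case only the destination container and one item at a time are held in memory during loading.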
🟩 Usable
🟦 Complete Features:
💡 Unplanned Features: