🟩 Describe C++ types at compile time for serialization
Reflection generates compile-time information about the fields of a structure or class.
Typically this library is used together with one of the serialization libraries (Serialization Binary or Serialization Text).
- Note
- Reflection uses more complex C++ constructs than other libraries in this repository. To limit the issue, effort has been spent on avoiding obscure C++ meta-programming techniques: the library uses only template partial specialization and constexpr.
Features
- Reflection info is built at compile time
- Free of heap allocations
- Describe primitive types
- Describe C-Arrays
- Describe SC::Vector, SC::VectorMap, SC::Array, SC::String
- Describe structs composed of any supported type
- Identify types that can be serialized with a single memcpy
Status
🟩 Usable
Within the limitations described below, the library should be usable.
Roadmap
🟦 Complete Features:
💡 Unplanned Features:
Description
The main target use case is generating reflection information to be used for automatic serialization. There are many rules and limitations so far, the main one being that references and pointers of any kind are not supported. The output of the process is a schema of the reflected type. This schema is an array of SC::Reflection::TypeInfo entries, each tracking the type of a field and its offset (in bytes) in the structure it belongs to.
Fields that refer to non-primitive types (other structs, for example) carry a link index pointing to the place in the schema where that type is described.
Packed attribute
The schema contains information about the types of all fields of the structure, together with its packing state.
A packed struct is made of primitive types described to the Reflection system in such a way that no padding bytes are left in the struct.
Example of packed struct:
struct Vec3
{
    float x;
    float y;
    float z;
};
Example of non-packed struct:
struct Vec3
{
    float    x;
    uint16_t y; // 2 bytes, followed by 2 padding bytes so that z stays 4-byte aligned
    float    z;
};
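The difference between the two examples can be checked at compile time. The following is a minimal sketch, not the library's implementation: `SizeSum`, `hasNoPadding`, `PackedVec3` and `PaddedVec3` are hypothetical names, and the check assumes a mainstream ABI where `float` is 4 bytes and 4-byte aligned. It relies only on template partial specialization and constexpr, in the spirit of the library.

```cpp
#include <cstddef>
#include <cstdint>

// Sum of sizeof(...) over a list of field types (hypothetical helper).
template <typename... Fields>
struct SizeSum;

template <>
struct SizeSum<>
{
    static constexpr std::size_t value = 0;
};

template <typename First, typename... Rest>
struct SizeSum<First, Rest...>
{
    static constexpr std::size_t value = sizeof(First) + SizeSum<Rest...>::value;
};

// A struct has no padding when its size equals the sum of its field sizes.
template <typename T, typename... Fields>
constexpr bool hasNoPadding()
{
    return sizeof(T) == SizeSum<Fields...>::value;
}

struct PackedVec3
{
    float x;
    float y;
    float z;
};

struct PaddedVec3
{
    float         x;
    std::uint16_t y; // the compiler inserts padding after this member
    float         z;
};

static_assert(hasNoPadding<PackedVec3, float, float, float>(), "three floats leave no gaps");
static_assert(!hasNoPadding<PaddedVec3, float, std::uint16_t, float>(), "padding follows the uint16_t");
```

The field types are passed by hand here; in the library this information comes from the reflection description of the struct.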
A recursively packed struct is a struct made of other structs or arrays of structs without any padding bytes inside of themselves. Example of recursively packed struct:
struct ArrayOfVec3
{
Vec3 array[10];
};
struct Vec3
{
ArrayOfVec3 array;
};
The recursively packed property allows binary serializers and deserializers (for example Serialization Binary) to optimize reading / writing with a single memcpy.
- Note
- This means that serializers like Serialization Binary will not invoke the type's constructor when deserializing a Packed type, as all members are explicitly written by serialization.
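As an illustration of why the packed property matters, here is a minimal sketch of a single-memcpy round trip. It is not the Serialization Binary code: `writePacked`, `readPacked` and `roundTripVec3` are hypothetical helpers, and `std::is_trivially_copyable` is used as a stand-in for the library's own packed check (it does not detect padding, which the caller is assumed to have ruled out).

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

struct Vec3
{
    float x;
    float y;
    float z;
};

// Writes the whole value as raw bytes with one memcpy.
template <typename T>
void writePacked(const T& value, unsigned char* destination)
{
    static_assert(std::is_trivially_copyable<T>::value, "memcpy requires a trivially copyable type");
    std::memcpy(destination, &value, sizeof(T));
}

// Overwrites every byte of the value; no constructor runs for it.
template <typename T>
void readPacked(T& value, const unsigned char* source)
{
    static_assert(std::is_trivially_copyable<T>::value, "memcpy requires a trivially copyable type");
    std::memcpy(&value, source, sizeof(T));
}

// Serializes a Vec3 into a byte buffer and reads it back.
inline bool roundTripVec3()
{
    const Vec3    original = {1.0f, 2.0f, 3.0f};
    unsigned char bytes[sizeof(Vec3)];
    writePacked(original, bytes);

    Vec3 restored = {0.0f, 0.0f, 0.0f};
    readPacked(restored, bytes);
    return restored.x == 1.0f && restored.y == 2.0f && restored.z == 3.0f;
}
```

A struct with padding could still be copied this way, but the padding bytes would end up in the serialized data, which is why the packed state is tracked in the schema.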
How to use it
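Describing a structure is done externally to the struct itself, by specializing the SC::Reflection::Reflect<> struct.
For example:
struct TestNamespace::SimpleStructure
{
    // Base types
    uint8_t  f0 = 0;
    uint16_t f1 = 1;
    uint32_t f2 = 2;
    uint64_t f3 = 3;
    int8_t   f4 = 4;
    int16_t  f5 = 5;
    int32_t  f6 = 6;
    int64_t  f7 = 7;
    float    f8 = 8;
    double   f9 = 9;
    int      arrayOfInt[3] = {1, 2, 3};
};
namespace SC
{
namespace Reflection
{
template <>
struct Reflect<TestNamespace::SimpleStructure> : ReflectStruct<TestNamespace::SimpleStructure>
{
    template <typename Visitor>
    static constexpr bool visit(Visitor&& visitor)
    {
        // One visitor call per field goes here, each passing the MemberTag, the pointer to member,
        // the field name and SC_COMPILER_OFFSETOF(TestNamespace::SimpleStructure, field)
        // (see "Struct member info").
    }
};
} // namespace Reflection
} // namespace SC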
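To make the mechanism concrete, the following self-contained sketch mimics the same pattern outside of the library. Everything in it is hypothetical: the `sketch` namespace, the `Point` struct, the `FieldCounter` visitor and the argument order of the visitor call are assumptions chosen for illustration, and the standard `offsetof` stands in for SC_COMPILER_OFFSETOF. It requires C++14 for the relaxed constexpr rules.

```cpp
#include <cstddef>

namespace sketch
{
struct Point
{
    float x;
    float y;
    int   id;
};

// Primary template is left undefined: only described types can be reflected.
template <typename T>
struct Reflect;

// External description of Point: one visitor call per field.
template <>
struct Reflect<Point>
{
    template <typename Visitor>
    static constexpr bool visit(Visitor&& visitor)
    {
        return visitor(0, "x", &Point::x, offsetof(Point, x)) &&
               visitor(1, "y", &Point::y, offsetof(Point, y)) &&
               visitor(2, "id", &Point::id, offsetof(Point, id));
    }
};

// A visitor that only counts how many fields it is shown.
struct FieldCounter
{
    int count = 0;

    template <typename Class, typename Field>
    constexpr bool operator()(int, const char*, Field Class::*, std::size_t)
    {
        ++count;
        return true; // returning false would stop the chain of && early
    }
};

constexpr int countFields()
{
    FieldCounter counter{};
    return Reflect<Point>::visit(counter) ? counter.count : -1;
}

static_assert(countFields() == 3, "Point is described with three fields");
} // namespace sketch
```

Different visitors can reuse the same description: one could build a schema, another could serialize the fields.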
Struct member info
- These fields are required for binary and text serialization with versioning support:
  - MemberTag (integer)
  - Pointer to Member
  - Field Name (string)
  - Field Byte Offset in its parent struct
- Versioning here means being able to deserialize data from an older version of the program:
  - For binary formats: retaining data in struct members with matching MemberTag
  - For textual formats: retaining data in struct members with matching Field Name
- Specifying both of them allows refactoring the names of C++ struct members without breaking serialization formats
- The Field Byte Offset is necessary to generate a unique versioning signature of a given struct
- The Pointer to Member allows serializing / deserializing without reinterpret_cast<> (Field Byte Offset could be used as an alternative)
- Note
- Additional considerations regarding the level of repetition:
  - There are techniques to get the field name as a string from a member pointer on all compilers, but they all require C++20 or later.
  - There are techniques to get the compile-time offset of a field from a member pointer, but they are complex and increase compile time unnecessarily.
  - We could hash the Field Name to obtain the MemberTag, but an explicit integer has been preferred to allow breaking textual formats and binary formats independently.
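The difference between matching on MemberTag and matching on Field Name can be shown with a small sketch. The `MemberDesc` struct, the two version tables and the lookup helpers are hypothetical and only model the idea; they are not the library's data structures.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, reduced member description: just a tag and a name.
struct MemberDesc
{
    int         memberTag;
    const char* fieldName;
};

// Version 1 of a struct, and version 2 where "width" was renamed to "sizeX".
// The MemberTag of the renamed field is deliberately kept at 0.
static const MemberDesc versionOne[] = {{0, "width"}, {1, "height"}};
static const MemberDesc versionTwo[] = {{0, "sizeX"}, {1, "height"}};

// Binary formats: locate a member by its integer tag.
template <std::size_t N>
int findByTag(const MemberDesc (&members)[N], int memberTag)
{
    for (std::size_t i = 0; i < N; ++i)
    {
        if (members[i].memberTag == memberTag)
            return static_cast<int>(i);
    }
    return -1; // not present in this version
}

// Textual formats: locate a member by its name.
template <std::size_t N>
int findByName(const MemberDesc (&members)[N], const char* fieldName)
{
    for (std::size_t i = 0; i < N; ++i)
    {
        if (std::strcmp(members[i].fieldName, fieldName) == 0)
            return static_cast<int>(i);
    }
    return -1; // not present in this version
}
```

With these tables, data written for tag 0 by version 1 still finds its destination in version 2, while a lookup of the old name "width" fails. Keeping tag and name separate is what lets each kind of format evolve independently.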
Reflection Macros
A few handy macros save most of the typing, and using them is generally preferable to writing the specialization by hand.
SC_REFLECT_STRUCT_VISIT(TestNamespace::SimpleStructure)
SC_REFLECT_STRUCT_FIELD(0, f0)
SC_REFLECT_STRUCT_FIELD(1, f1)
SC_REFLECT_STRUCT_FIELD(2, f2)
SC_REFLECT_STRUCT_FIELD(3, f3)
SC_REFLECT_STRUCT_FIELD(4, f4)
SC_REFLECT_STRUCT_FIELD(5, f5)
SC_REFLECT_STRUCT_FIELD(6, f6)
SC_REFLECT_STRUCT_FIELD(7, f7)
SC_REFLECT_STRUCT_FIELD(8, f8)
SC_REFLECT_STRUCT_FIELD(9, f9)
SC_REFLECT_STRUCT_FIELD(10, arrayOfInt);
SC_REFLECT_STRUCT_LEAVE()
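How macros of this kind can generate a visit function is sketched below. These are not the library's macro definitions: `SKETCH_REFLECT_VISIT`, `SKETCH_REFLECT_FIELD`, `SKETCH_REFLECT_LEAVE`, the `sketch` namespace, the `Color` struct and the `TagSum` visitor are all hypothetical, and the argument order of the visitor call is an assumption.

```cpp
#include <cstddef>

namespace sketch
{
template <typename T>
struct Reflect;
} // namespace sketch

// Opens the specialization and starts a chain of && that each field extends.
#define SKETCH_REFLECT_VISIT(StructName)                                                          \
    namespace sketch                                                                              \
    {                                                                                             \
    template <>                                                                                   \
    struct Reflect<StructName>                                                                    \
    {                                                                                             \
        using T = StructName;                                                                     \
        template <typename Visitor>                                                               \
        static constexpr bool visit(Visitor&& visitor)                                            \
        {                                                                                         \
            return true

// Appends one visitor call; #MEMBER turns the member name into a string literal.
#define SKETCH_REFLECT_FIELD(TAG, MEMBER) &&visitor(TAG, #MEMBER, &T::MEMBER, offsetof(T, MEMBER))

// Terminates the return statement, the function, the struct and the namespace.
#define SKETCH_REFLECT_LEAVE()                                                                    \
    ;                                                                                             \
    }                                                                                             \
    }                                                                                             \
    ;                                                                                             \
    }

struct Color
{
    unsigned char r;
    unsigned char g;
    unsigned char b;
};

SKETCH_REFLECT_VISIT(Color)
SKETCH_REFLECT_FIELD(0, r)
SKETCH_REFLECT_FIELD(1, g)
SKETCH_REFLECT_FIELD(2, b)
SKETCH_REFLECT_LEAVE()

// Visitor that adds up the tags it receives, to show every field was visited.
struct TagSum
{
    int sum = 0;

    template <typename Class, typename Field>
    constexpr bool operator()(int tag, const char*, Field Class::*, std::size_t)
    {
        sum += tag;
        return true;
    }
};

constexpr int sumOfTags()
{
    TagSum visitor{};
    return sketch::Reflect<Color>::visit(visitor) ? visitor.sum : -1;
}

static_assert(sumOfTags() == 0 + 1 + 2, "all three fields were visited");
```

The member name is written only once per field, which removes most of the repetition of a hand-written specialization.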
Example (print schema)
To better understand how a serialization library can use this information, let's try to print the schema.
The compile time flat schema can be obtained by calling SC::Reflection::Schema::compile:
using namespace SC;
constexpr auto SimpleStructureFlatSchema = Reflection::Schema::compile<TestNamespace::SimpleStructure>();
For example we could print the schema with the following code:
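#include "../../Strings/Console.h"
#include "../../Strings/String.h"
#include "../../Strings/StringBuilder.h"
#include "../Reflection.h"
namespace SC
{
// Maps a TypeCategory to the name printed in the schema.
inline StringView typeCategoryToStringView(Reflection::TypeCategory type)
{
    switch (type)
    {
    case Reflection::TypeCategory::TypeInvalid: return "TypeInvalid";
    case Reflection::TypeCategory::TypeUINT8: return "TypeUINT8";
    case Reflection::TypeCategory::TypeUINT16: return "TypeUINT16";
    case Reflection::TypeCategory::TypeUINT32: return "TypeUINT32";
    case Reflection::TypeCategory::TypeUINT64: return "TypeUINT64";
    case Reflection::TypeCategory::TypeINT8: return "TypeINT8";
    case Reflection::TypeCategory::TypeINT16: return "TypeINT16";
    case Reflection::TypeCategory::TypeINT32: return "TypeINT32";
    case Reflection::TypeCategory::TypeINT64: return "TypeINT64";
    case Reflection::TypeCategory::TypeFLOAT32: return "TypeFLOAT32";
    case Reflection::TypeCategory::TypeDOUBLE64: return "TypeDOUBLE64";
    case Reflection::TypeCategory::TypeStruct: return "TypeStruct";
    case Reflection::TypeCategory::TypeArray: return "TypeArray";
    case Reflection::TypeCategory::TypeVector: return "TypeVector";
    }
    Assert::unreachable();
}
// Prints every type of the flat schema, one block per struct / array / vector.
template <int NUM_TYPES>
inline void printFlatSchema(Console& console, const Reflection::TypeInfo (&type)[NUM_TYPES],
                            const Reflection::TypeStringView (&names)[NUM_TYPES])
{
    // Assumption: String and StringBuilder are constructed as shown here.
    String buffer(StringEncoding::Ascii);
    int    typeIndex = 0;
    while (typeIndex < NUM_TYPES)
    {
        StringBuilder builder(buffer, StringBuilder::Clear); // buffer is cleared before appending
        typeIndex += printTypes(builder, typeIndex, type + typeIndex, names + typeIndex) + 1;
        console.print(buffer.view());
    }
}
// Prints one type and its children, returning the number of children printed.
inline int printTypes(StringBuilder& builder, int typeIndex, const Reflection::TypeInfo* types,
                      const Reflection::TypeStringView* typeNames)
{
    // The results of the append calls are deliberately ignored (placement of these macros is assumed).
    SC_COMPILER_WARNING_PUSH_UNUSED_RESULT;
    const StringView typeName({typeNames[0].data, typeNames[0].length}, false, StringEncoding::Ascii);
    builder.append("[{:02}] {}", typeIndex, typeName);
    switch (types[0].type)
    {
    case Reflection::TypeCategory::TypeStruct:
        builder.append(" (Struct with {} members - Packed = {})", types[0].getNumberOfChildren(),
                       types[0].structInfo.isPacked ? "true" : "false");
        break;
    case Reflection::TypeCategory::TypeArray:
        builder.append(" (Array of size {} with {} children - Packed = {})", types[0].arrayInfo.numElements,
                       types[0].getNumberOfChildren(), types[0].arrayInfo.isPacked ? "true" : "false");
        break;
    case Reflection::TypeCategory::TypeVector:
        builder.append(" (Vector with {} children)", types[0].getNumberOfChildren());
        break;
    default: break;
    }
    builder.append("\n{\n");
    for (int idx = 0; idx < types[0].getNumberOfChildren(); ++idx)
    {
        const Reflection::TypeInfo& field = types[idx + 1];
        builder.append("[{:02}] ", typeIndex + idx + 1);
        const StringView fieldName({typeNames[idx + 1].data, typeNames[idx + 1].length},
                                   false,
                                   StringEncoding::Ascii);
        // Only members of a struct have a meaningful byte offset.
        if (types[0].type == Reflection::TypeCategory::TypeStruct)
        {
            builder.append("Type={}\tOffset={}\tSize={}\tName={}", typeCategoryToStringView(field.type),
                           field.memberInfo.offsetInBytes, field.sizeInBytes, fieldName);
        }
        else
        {
            builder.append("Type={}\t         \tSize={}\tName={}", typeCategoryToStringView(field.type),
                           field.sizeInBytes, fieldName);
        }
        if (field.hasValidLinkIndex())
        {
            builder.append("\t[LinkIndex={}]", field.getLinkIndex());
        }
        builder.append("\n");
    }
    builder.append("}\n");
    SC_COMPILER_WARNING_POP;
    return types[0].getNumberOfChildren();
}
} // namespace SC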
When called with the following code:
printFlatSchema(report.console, SimpleStructureFlatSchema.typeInfos.values, SimpleStructureFlatSchema.typeNames.values);
It will print the following output for the above struct:
[00] TestNamespace::SimpleStructure (Struct with 11 members - Packed = false)
{
[01] Type=TypeUINT8 Offset=0 Size=1 Name=f0
[02] Type=TypeUINT16 Offset=2 Size=2 Name=f1
[03] Type=TypeUINT32 Offset=4 Size=4 Name=f2
[04] Type=TypeUINT64 Offset=8 Size=8 Name=f3
[05] Type=TypeINT8 Offset=16 Size=1 Name=f4
[06] Type=TypeINT16 Offset=18 Size=2 Name=f5
[07] Type=TypeINT32 Offset=20 Size=4 Name=f6
[08] Type=TypeINT64 Offset=24 Size=8 Name=f7
[09] Type=TypeFLOAT32 Offset=32 Size=4 Name=f8
[10] Type=TypeDOUBLE64 Offset=40 Size=8 Name=f9
[11] Type=TypeArray Offset=48 Size=12 Name=arrayOfInt [LinkIndex=12]
}
[12] Array (Array of size 3 with 1 children)
{
[13] Type=TypeINT32 Size=4 Name=int
}
Another example with a more complex structure building on top of the simple one:
struct TestNamespace::IntermediateStructure
{
    Vector<int>     vectorOfInt;
    SimpleStructure simpleStructure;
};
SC_REFLECT_STRUCT_VISIT(TestNamespace::IntermediateStructure)
SC_REFLECT_STRUCT_FIELD(1, vectorOfInt)
SC_REFLECT_STRUCT_FIELD(0, simpleStructure)
SC_REFLECT_STRUCT_LEAVE()
struct TestNamespace::ComplexStructure
{
    uint8_t                 f1;
    SimpleStructure         simpleStructure;
    SimpleStructure         simpleStructure2;
    uint16_t                f4;
    IntermediateStructure   intermediateStructure;
    Vector<SimpleStructure> vectorOfStructs;
};
SC_REFLECT_STRUCT_VISIT(TestNamespace::ComplexStructure)
SC_REFLECT_STRUCT_FIELD(0, f1)
SC_REFLECT_STRUCT_FIELD(1, simpleStructure)
SC_REFLECT_STRUCT_FIELD(2, simpleStructure2)
SC_REFLECT_STRUCT_FIELD(3, f4)
SC_REFLECT_STRUCT_FIELD(4, intermediateStructure)
SC_REFLECT_STRUCT_FIELD(5, vectorOfStructs)
SC_REFLECT_STRUCT_LEAVE()
- Note
- Packed structs will get their members sorted by offsetInBytes. For regular structs, they are left in the same order as the visit sequence. This allows some substantial simplifications in the Serialization Binary implementation.
Printing the schema of ComplexStructure outputs the following:
[00] TestNamespace::ComplexStructure (Struct with 6 members - Packed = false)
{
[01] Type=TypeUINT8 Offset=0 Size=1 Name=f1
[02] Type=TypeStruct Offset=8 Size=64 Name=simpleStructure [LinkIndex=7]
[03] Type=TypeStruct Offset=72 Size=64 Name=simpleStructure2 [LinkIndex=7]
[04] Type=TypeUINT16 Offset=136 Size=2 Name=f4
[05] Type=TypeStruct Offset=144 Size=72 Name=intermediateStructure [LinkIndex=19]
[06] Type=TypeVector Offset=216 Size=8 Name=vectorOfStructs [LinkIndex=22]
}
[07] TestNamespace::SimpleStructure (Struct with 11 members - Packed = false)
{
[08] Type=TypeUINT8 Offset=0 Size=1 Name=f0
[09] Type=TypeUINT16 Offset=2 Size=2 Name=f1
[10] Type=TypeUINT32 Offset=4 Size=4 Name=f2
[11] Type=TypeUINT64 Offset=8 Size=8 Name=f3
[12] Type=TypeINT8 Offset=16 Size=1 Name=f4
[13] Type=TypeINT16 Offset=18 Size=2 Name=f5
[14] Type=TypeINT32 Offset=20 Size=4 Name=f6
[15] Type=TypeINT64 Offset=24 Size=8 Name=f7
[16] Type=TypeFLOAT32 Offset=32 Size=4 Name=f8
[17] Type=TypeDOUBLE64 Offset=40 Size=8 Name=f9
[18] Type=TypeArray Offset=48 Size=12 Name=arrayOfInt [LinkIndex=24]
}
[19] TestNamespace::IntermediateStructure (Struct with 2 members - Packed = false)
{
[20] Type=TypeVector Offset=0 Size=8 Name=vectorOfInt [LinkIndex=26]
[21] Type=TypeStruct Offset=8 Size=64 Name=simpleStructure [LinkIndex=7]
}
[22] SC::Vector (Vector with 1 children)
{
[23] Type=TypeStruct Size=64 Name=TestNamespace::SimpleStructure [LinkIndex=7]
}
[24] Array (Array of size 3 with 1 children - Packed = true)
{
[25] Type=TypeINT32 Size=4 Name=int
}
[26] SC::Vector (Vector with 1 children)
{
[27] Type=TypeINT32 Size=4 Name=int
}
Implementation
As already said in the introduction, effort has been put into keeping the library as readable as possible, within the limits of C++.
The only techniques used are template partial specialization and some care in writing functions that are valid in a constexpr context.
For example, generation of the schema is done through partial specialization of the SC::Reflection::Reflect template, by redefining the visit constexpr static member function for a given type. Inside the visit function, it's possible to let the Reflection system know about a given field.
The output of reflection is an array of SC::Reflection::TypeInfo, referred to as the Flat Schema, available at compile time.
Such compile-time information is used when serializing and deserializing data that has missing fields.
struct TypeInfo
{
bool hasLink : 1;
union
{
};
struct EmptyInfo
{
};
struct MemberInfo
{
: memberTag(memberTag), offsetInBytes(offsetInBytes)
{}
};
struct StructInfo
{
bool isPacked : 1;
constexpr StructInfo(bool isPacked) : isPacked(isPacked) {}
};
struct ArrayInfo
{
constexpr ArrayInfo(
bool isPacked,
uint32_t numElements) : isPacked(isPacked), numElements(numElements) {}
};
union
{
EmptyInfo emptyInfo;
MemberInfo memberInfo;
StructInfo structInfo;
ArrayInfo arrayInfo;
};
The flat schema is generated by SC::Reflection::SchemaCompiler, which walks the structures and builds an array of SC::Reflection::TypeInfo describing each field.
For example, a struct is described by a SC::Reflection::TypeInfo that records its numberOfChildren, corresponding to the number of fields. That many SC::Reflection::TypeInfo entries follow immediately after the struct's own entry in the flat schema array.
If the type of one of these fields is complex (for example it refers to another struct), it contains a link.
A link is an index into the flat schema array (the linkIndex field).
This simple data structure describes all types in a structure hierarchically, decomposing it into its primitive types or into special classes that must be handled specifically (SC::Vector, SC::Array etc.).
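The layout described above can be modeled with a few lines of code. This is a simplified sketch with hypothetical names (`Kind`, `Entry`, `countPrimitives`); in particular it stores numberOfChildren and linkIndex as two separate integers for readability, which is an assumption and not the layout of SC::Reflection::TypeInfo.

```cpp
#include <cstddef>

enum class Kind
{
    Struct,
    Float,
    Int32
};

// Simplified schema entry: linkIndex is -1 when the entry has no link.
struct Entry
{
    Kind kind;
    int  numberOfChildren;
    int  linkIndex;
};

// Flat schema for:
//   struct Inner { float a; float b; };
//   struct Outer { int32_t id; Inner inner; };
constexpr Entry schema[] = {
    {Kind::Struct, 2, -1}, // [0] Outer, followed by its 2 children
    {Kind::Int32, 0, -1},  // [1] Outer::id
    {Kind::Struct, 0, 3},  // [2] Outer::inner, described at index 3
    {Kind::Struct, 2, -1}, // [3] Inner, followed by its 2 children
    {Kind::Float, 0, -1},  // [4] Inner::a
    {Kind::Float, 0, -1},  // [5] Inner::b
};

// Counts the primitive fields reachable from the struct at structIndex,
// following links into other parts of the same array.
constexpr int countPrimitives(const Entry* entries, int structIndex)
{
    int total = 0;
    for (int child = 1; child <= entries[structIndex].numberOfChildren; ++child)
    {
        const Entry& field = entries[structIndex + child];
        total += field.linkIndex >= 0 ? countPrimitives(entries, field.linkIndex) : 1;
    }
    return total;
}

static_assert(countPrimitives(schema, 0) == 3, "id, a and b");
static_assert(countPrimitives(schema, 3) == 2, "a and b");
```

Because a link is just an index, a type used by several fields is described once and referenced from each of them.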
It also has the nice property of being trivially serializable: this (compile-time known) array can be dumped with a single memcpy, or hashed to obtain a unique signature of the entire type (for versioning purposes).
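A signature of this kind can be computed at compile time. The sketch below is an assumption-laden illustration: `Entry`, `mixValue` and `schemaSignature` are hypothetical, the entry holds only a few of the fields mentioned in this document, and the mixing step is FNV-1a-style applied to whole field values instead of single bytes. The hash function used by the library is not specified here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, reduced schema entry.
struct Entry
{
    std::uint8_t  type;
    std::uint8_t  numberOfChildren;
    std::uint16_t sizeInBytes;
    std::uint16_t offsetInBytes;
};

// One FNV-1a-style step: xor the value in, then multiply by the 32-bit FNV prime.
constexpr std::uint32_t mixValue(std::uint32_t hash, std::uint32_t value)
{
    return (hash ^ value) * 16777619u;
}

// Folds every field of every entry into a single 32-bit signature.
template <std::size_t N>
constexpr std::uint32_t schemaSignature(const Entry (&entries)[N])
{
    std::uint32_t hash = 2166136261u; // 32-bit FNV offset basis
    for (std::size_t i = 0; i < N; ++i)
    {
        hash = mixValue(hash, entries[i].type);
        hash = mixValue(hash, entries[i].numberOfChildren);
        hash = mixValue(hash, entries[i].sizeInBytes);
        hash = mixValue(hash, entries[i].offsetInBytes);
    }
    return hash;
}

// Two layouts of the same struct: in the second one a field moved from offset 4 to offset 8.
constexpr Entry layoutV1[] = {{0, 2, 8, 0}, {1, 0, 4, 0}, {1, 0, 4, 4}};
constexpr Entry layoutV2[] = {{0, 2, 8, 0}, {1, 0, 4, 0}, {1, 0, 4, 8}};

static_assert(schemaSignature(layoutV1) == schemaSignature(layoutV1), "same schema, same signature");
static_assert(schemaSignature(layoutV1) != schemaSignature(layoutV2), "a moved field changes the signature");
```

This is also why the byte offset of each field is part of the description: a layout change shows up in the signature even when names and types stay the same.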
It's possible to associate a string literal with each type or member, so that text-based serializers can refer to it, as done by the JSON Serialization Text.
Lastly, it's also possible to associate a MemberTag integer with a field, which binary serializers can leverage to keep track of the field's position in binary formats. This allows binary formats to avoid using strings at all for format versioning/evolution, with some executable size benefits. It also means that binary file formats do not break just because a field / member was renamed. This is leveraged by Serialization Binary.
- Note
- It's also possible to try the experimental SC_REFLECT_AUTOMATIC mode by using the Reflection Auto library, which automatically lists struct members. This makes sense only if used with Serialization Binary, as SC_REFLECT_AUTOMATIC cannot obtain field names as strings, so text-based serialization formats like Serialization Text cannot work.
Reflection Auto is an experimental library, part of Libraries Extra, that unfortunately uses some more obscure C++ meta-programming techniques.