The core QBind implementation (excluding IBind implementations) is a few hundred lines of C++11 using templates defined
in the headers, an abstract IBind class, and IWriter/IReader base classes.
The core QBind implementation (excluding QAbstractValue implementations) is a few hundred lines of C++11 using templates defined
in the headers, an abstract QAbstractValue class, and QAbstractValueWriter/QAbstractValueReader base classes.
## The key idea
...
...
@@ -16,7 +16,7 @@ another generic dataset, binding the related parts together. In effect:
Hence, from now on, we will use the term *bind* instead of the more restricted *(de)serialization* term.
This traversal is driven by QBind<T> methods which may use a BindMode (Read,Write,...) to determine whether to read the generic dataset or write it according to the C++ one.
This traversal is driven by QBind<T> methods which may use a QValueMode (Read,Write,...) to determine whether to read the generic dataset or write it according to the C++ one.
[^1]: *traverse* meaning to go through without going back
...
...
@@ -46,42 +46,42 @@ ways of representing data structures such as XML (e.g. binding the Person type w
The QBind traversal is formally described by the following recursive automaton:
```mermaid
graph LR
subgraph Val
subgraph QVal
i((start))--"null()" --> x((end))
i((start))--"bind#lt;T>()"--> x((end))
i((start))--"sequence()" --> Seq
i((start))--"record()" --> Rec
Seq --"out()" --> x((end))
Rec --"out()" --> x((end))
Seq --"item()" --> vs["Val#lt;Seq>"]
Rec --"item(name)" --> vr["Val#lt;Rec>"]
i((start))--"sequence()" --> QSeq
i((start))--"record()" --> QRec
QSeq --"out()" --> x((end))
QRec --"out()" --> x((end))
QSeq --"item()" --> vs["QVal#lt;QSeq>"]
QRec --"item(name)" --> vr["QVal#lt;QRec>"]
end
```
- Boxes (nested) represent possible states when traversing the data, the automaton is always in a single valid state
- Edges represent possible state transitions that translate to specific data format read/write actions
The automaton is implemented as follows:
- `Val<_>`, `Rec<_>` and `Seq<_>` types implement possible states where _ denotes the type of outer state
- `QVal<_>`, `QRec<_>` and `QSeq<_>` types implement possible states where _ denotes the type of outer state
- State types only expose public methods corresponding to possible transitions, that return the destination state type
- The initial state type is `Val<Cursor>` and the final state is `Cursor` (for instance: `Val<Cursor>::null()` returns a `Cursor`)
- Returning a `Rec<_>` or `Seq<_>` type automatically converts to the final `Cursor` type, invoking as many `out()` transitions as required
- `Cursor` is a non-owning pointer to an `IBind` interface whose implementations translate data traversal into specific
- The initial state type is `QValue` and the final state is `QValueStatus` (for instance: `QValue::null()` returns a `QValueStatus`)
- Returning a `QRec<_>` or `QSeq<_>` type automatically converts to the final `QValueStatus` type, invoking as many `out()` transitions as required
- `QValueStatus` is a non-owning pointer to a `QAbstractValue` interface whose implementations translate data traversal into specific
data format read/write actions
- A `Cursor` instance is moved from the start state type to the end state type only for successful transitions, which allows testing
- A `QValueStatus` instance is moved from the start state type to the end state type only for successful transitions, which allows testing
alternatives before proceeding with the traversal
- Transitions may fail for various reasons specific to `IBind` implementations:
- Transitions may fail for various reasons specific to `QAbstractValue` implementations:
- builders may not be able to allocate new items
- readers may read data not matching the expected transition
- ...
- In case of an unsuccessful transition the returned state type receives a null `Cursor` that transparently bypasses calls to `IBind`
- `bind<T>()` calls are forwarded to the actual `IBind` or generic `QBind` depending on `BindSupport<T>`:
- BindNative : **IBind** interface method
- In case of an unsuccessful transition the returned state type receives a null `QValueStatus` that transparently bypasses calls to `QAbstractValue`
- `bind<T>()` calls are forwarded to the actual `QAbstractValue` or generic `QBind` depending on `BindSupport<T>`:
- BindGeneric : **QBind** template specialization for T
- Every `bind<T>()` starts from a Val<Cursor> which is an *unsafe* Cursor copy wrt well-formedness (these `unsafeItem()` copies are protected from incorrect use)
- Every `bind<T>()` starts from a QValue which is an *unsafe* QValueStatus copy wrt well-formedness (these `unsafeItem()` copies are protected from incorrect use)
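The typed-state idea above can be illustrated with a minimal sketch. It is not the QBind implementation: `Writer`, `Status`, `Val`, `Seq` and `Rec` are simplified, hypothetical stand-ins, only `int` values are bound, the writer never fails, and the automatic conversion of an open `Seq`/`Rec` to the final status is left out. It only shows how each transition returns the destination state type, so that an ill-formed traversal does not compile.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for the abstract interface: emits JSON-like text.
struct Writer {
    std::string text;
    bool needComma = false;
    void comma() { if (needComma) text += ','; needComma = false; }
    void sequence() { text += '['; }
    void record() { text += '{'; }
    void item() { comma(); }
    void item(const char* name) { comma(); text += '"'; text += name; text += "\":"; }
    void bind(int v) { text += std::to_string(v); needComma = true; }
    void null() { text += "null"; needComma = true; }
    void outSequence() { text += ']'; needComma = true; }
    void outRecord() { text += '}'; needComma = true; }
};

// Non-owning, move-only handle: a moved-from Status is null and bypasses all calls.
class Status {
    Writer* writer_;
public:
    explicit Status(Writer* w = nullptr) : writer_(w) {}
    Status(Status&& other) : writer_(other.writer_) { other.writer_ = nullptr; }
    Status& operator=(Status&& other) {
        if (this != &other) { writer_ = other.writer_; other.writer_ = nullptr; }
        return *this;
    }
    Writer* writer() const { return writer_; }
    explicit operator bool() const { return writer_ != nullptr; }
};

template<class Outer> class Seq;
template<class Outer> class Rec;

// Value state: every transition consumes the state and returns the next one.
template<class Outer>
class Val {
    Outer outer_;
public:
    explicit Val(Outer&& outer) : outer_(std::move(outer)) {}
    Outer null() { if (Writer* w = outer_.writer()) w->null(); return std::move(outer_); }
    Outer bind(int v) { if (Writer* w = outer_.writer()) w->bind(v); return std::move(outer_); }
    Seq<Outer> sequence() { if (Writer* w = outer_.writer()) w->sequence(); return Seq<Outer>(std::move(outer_)); }
    Rec<Outer> record() { if (Writer* w = outer_.writer()) w->record(); return Rec<Outer>(std::move(outer_)); }
};

template<class Outer>
class Seq {
    Outer outer_;
public:
    explicit Seq(Outer&& outer) : outer_(std::move(outer)) {}
    Writer* writer() const { return outer_.writer(); }
    Val<Seq<Outer>> item() { if (Writer* w = writer()) w->item(); return Val<Seq<Outer>>(std::move(*this)); }
    Seq<Outer> bind(int v) { return item().bind(v); }  // shortcut for item().bind(v)
    Outer out() { if (Writer* w = writer()) w->outSequence(); return std::move(outer_); }
};

template<class Outer>
class Rec {
    Outer outer_;
public:
    explicit Rec(Outer&& outer) : outer_(std::move(outer)) {}
    Writer* writer() const { return outer_.writer(); }
    Val<Rec<Outer>> item(const char* name) { if (Writer* w = writer()) w->item(name); return Val<Rec<Outer>>(std::move(*this)); }
    Outer out() { if (Writer* w = writer()) w->outRecord(); return std::move(outer_); }
};

// The only way to get a Status back is to close everything that was opened.
inline std::string demo() {
    Writer w;
    Status s = Val<Status>(Status(&w))
        .sequence()
            .bind(1)
            .item().record()
                .item("x").bind(2)
            .out()
            .bind(3)
        .out();
    return s ? w.text : std::string("<failed>");
}

inline void check() { assert(demo() == "[1,{\"x\":2},3]"); }

} // namespace sketch
```

In this sketch, omitting the last `.out()` leaves a `Seq<Status>` that cannot initialize `s`, which is the compile-time well-formedness check in miniature.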
## C++ types extensibility
QBind is a functor templated on T type receiving a Value and T reference (either lvalue or rvalue reference) and returning the Cursor.
QBind is a functor templated on T type receiving a Value and T reference (either lvalue or rvalue reference) and returning the QValueStatus.
Template specializations can be defined for any T and optionally refined for specific Cur<TImpl> with different sets of BindNative types.
A default QBind specialization attempts to call `T::bind(...)` to conveniently bind `T* this` without having to understand template syntax,
...
...
@@ -95,35 +95,35 @@ editor will propose to either `bind(myData.item)`, or to construct a `sequence()
## Well-formedness guarantees
Thanks to this design, the compiler will make sure that the only possibility to return a Cursor from a `Val<Cursor>` is to traverse
the data without backtracking, calling only and all necessary IBind virtual methods.
Thanks to this design, the compiler will make sure that the only possibility to return a QValueStatus from a `QValue` is to traverse
the data without backtracking, calling only and all necessary QAbstractValue virtual methods.
The addition of default and optional values takes into account most data schema evolutions in a purely declarative fluent interface without
having to test schema versions and the like. The benefit is that it is not possible to introduce bugs using just the fluent interface.
The downside is that writing loops with the fluent interface is unnatural as one must never forget to follow the valid Cursor.
The downside is that writing loops with the fluent interface is unnatural as one must never forget to follow the valid QValueStatus.
For instance:
```cpp
auto seq(v.sequence());
for (auto&& t : ts) {
    seq = seq.bind(t); // do not forget to reassign seq, or subsequent items will be bound to the moved-from Cursor and automatically ignored
    seq = seq.bind(t); // do not forget to reassign seq, or subsequent items will be bound to the moved-from QValueStatus and automatically ignored
}
```
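The effect of forgetting the reassignment can be reproduced with a small self-contained sketch. `Sink` and `Seq` below are hypothetical stand-ins rather than QBind types; the only property they share with the description above is that a transition moves the handle out of the state it was called on.

```cpp
#include <cassert>
#include <utility>
#include <vector>

namespace sketch_loop {

struct Sink { std::vector<int> items; };  // hypothetical output target

// Move-only sequence state: bind() hands the valid handle to its return value.
class Seq {
    Sink* sink_;
public:
    explicit Seq(Sink* s) : sink_(s) {}
    Seq(Seq&& other) : sink_(other.sink_) { other.sink_ = nullptr; }
    Seq& operator=(Seq&& other) {
        if (this != &other) { sink_ = other.sink_; other.sink_ = nullptr; }
        return *this;
    }
    Seq bind(int v) {
        if (sink_) sink_->items.push_back(v);  // a moved-from Seq silently ignores the call
        return std::move(*this);
    }
};

inline std::vector<int> writeAll(const std::vector<int>& ts, bool reassign) {
    Sink sink;
    Seq seq(&sink);
    for (int t : ts) {
        if (reassign) seq = seq.bind(t);  // follows the valid handle
        else seq.bind(t);                 // handle is lost after the first call
    }
    return sink.items;
}

inline void check() {
    assert(writeAll({1, 2, 3}, true).size() == 3);   // all items written
    assert(writeAll({1, 2, 3}, false).size() == 1);  // only the first item written
}

} // namespace sketch_loop
```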
## Write performance
Since `Cursor`, `Val`, `Rec` and `Seq` have no data member other than outer types and `IBind*`, calling their methods can be
Since `QValueStatus`, `QVal`, `QRec` and `QSeq` have no data member other than outer types and `QAbstractValue*`, calling their methods can be
optimized and replaced with just the following operations:
1. test the IBind pointer validity [^1]
2. call the IBind virtual method corresponding to the possible transitions
3. return the resulting Cursor, Val, Rec or Seq with a valid or invalid Cursor depending on IBind method success or failure
1. test the QAbstractValue pointer validity [^1]
2. call the QAbstractValue virtual method corresponding to the possible transitions
3. return the resulting QValueStatus, QVal, QRec or QSeq with a valid or invalid QValueStatus depending on QAbstractValue method success or failure
[^1]: Experiments using constexpr to bypass this step for writers that always return true did not seem to improve performance.
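As a rough illustration of these three operations, the sketch below reduces a transition to a pointer test, one virtual call and the construction of a valid or null handle. `Abstract`, `Counting` and `Handle` are hypothetical names, and the handle is copyable here only to keep the example short.

```cpp
#include <cassert>

namespace sketch_cost {

struct Abstract {                       // hypothetical stand-in for the abstract interface
    virtual ~Abstract() {}
    virtual bool tryBind(int v) = 0;
};

struct Counting : Abstract {            // accepts `limit` values, then fails
    int calls = 0;
    int limit;
    explicit Counting(int l) : limit(l) {}
    bool tryBind(int) override { return ++calls <= limit; }
};

class Handle {
    Abstract* impl_;
public:
    explicit Handle(Abstract* impl) : impl_(impl) {}
    bool valid() const { return impl_ != nullptr; }
    Handle bind(int v) const {
        if (!impl_) return Handle(nullptr);   // 1. test the pointer validity
        bool ok = impl_->tryBind(v);          // 2. call the virtual method
        return Handle(ok ? impl_ : nullptr);  // 3. return a valid or invalid handle
    }
};

inline int callsAfterFailure() {
    Counting impl(2);
    Handle h(&impl);
    for (int i = 0; i < 5; ++i) h = h.bind(i);
    assert(!h.valid());
    return impl.calls;  // the third call fails, the last two are bypassed
}

} // namespace sketch_cost
```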
`QBind<T>` can define up to 3 bind() overloads to efficiently and conveniently handle lvalue references, const lvalue references, and
rvalue references depending on T characteristics (which of copy/move is 1/possible and 2/efficient).
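The way three such overloads are selected can be checked with plain overload resolution. The sketch below uses a hypothetical `Binder` for `std::string`; what each overload would actually do (read into the value, write a copy, or move from it) is an assumption noted in the comments, not taken from the QBind sources.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch_overloads {

enum class Path { Lvalue, ConstLvalue, Rvalue };

// Hypothetical binder: each overload only reports which one was selected.
struct Binder {
    Path bind(std::string&) { return Path::Lvalue; }             // assumption: could read into or write from the value
    Path bind(const std::string&) { return Path::ConstLvalue; }  // assumption: could only write from the value
    Path bind(std::string&&) { return Path::Rvalue; }            // assumption: could move from the value
};

inline void check() {
    Binder b;
    std::string s("a");
    const std::string c("b");
    assert(b.bind(s) == Path::Lvalue);
    assert(b.bind(c) == Path::ConstLvalue);
    assert(b.bind(std::string("c")) == Path::Rvalue);
    assert(b.bind(std::move(s)) == Path::Rvalue);
}

} // namespace sketch_overloads
```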
Compared to manually calling non-virtual, format-specific implementations, the overhead of always testing the validity of IBind*
Compared to manually calling non-virtual, format-specific implementations, the overhead of always testing the validity of QAbstractValue*
and calling virtual methods is around 20% in our benchmark, with a maximum of 100% for trivial format-specific implementations
like copying a single char to a pre-allocated buffer.
...
...
@@ -135,31 +135,31 @@ Other than that, write performance depends on several factors:
distinguishable from QByteArray binary data
- Using QData<TContent> classes to tag string encodings allowed to pinpoint unnecessary encoding conversions, notably in QVariant handling
- In the end, directly using QByteArray buffers instead of using QIODevice can amount to ~ 2x better write performance
- IBind implementations need to use efficient data structures for storing state. For instance, using an optimized std::vector<bool>
- QAbstractValue implementations need to use efficient data structures for storing state. For instance, using an optimized std::vector<bool>
to memorize opened JSON objects/arrays usually keeps the state in a single byte and avoids memory allocations, resulting in ~ 10x better
write performance compared to QVector<bool>
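To make the idea of allocation-free nesting state concrete, here is a minimal sketch of a fixed-capacity bit stack. It is a hand-rolled substitute for illustration, not `std::vector<bool>` and not the structure used by the QBind writers; `BitStack` and `closeAll` are hypothetical names, and the 64-level limit is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace sketch_state {

// One bit per open container: 1 = JSON object, 0 = JSON array.
class BitStack {
    std::uint64_t bits_ = 0;
    unsigned depth_ = 0;
public:
    bool push(bool isObject) {
        if (depth_ >= 64) return false;  // nesting deeper than the sketch supports
        bits_ = (bits_ << 1) | (isObject ? 1u : 0u);
        ++depth_;
        return true;
    }
    bool top() const { assert(depth_ > 0); return (bits_ & 1u) != 0; }
    bool pop() {
        if (depth_ == 0) return false;
        bits_ >>= 1;
        --depth_;
        return true;
    }
    unsigned depth() const { return depth_; }
};

// Emits the closing brackets for everything still open, innermost first.
inline std::string closeAll(BitStack s) {
    std::string out;
    while (s.depth() > 0) {
        out += s.top() ? '}' : ']';
        s.pop();
    }
    return out;
}

inline std::string demo() {
    BitStack s;
    s.push(true);   // {
    s.push(false);  // [
    s.push(false);  // [
    return closeAll(s);
}

} // namespace sketch_state
```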
## Read robustness
Performance is not so important for Read. But compared to manually calling non-virtual, format-specific implementations, QBind
enforces well-formedness checks necessary to reliably read data coming from unknown sources (IBind implementations being responsible
enforces well-formedness checks necessary to reliably read data coming from unknown sources (QAbstractValue implementations being responsible
for low-level checks).
All errors are reported as `QIdentifierLiteral` to the `IBind` implementations that will decide what to do with them:
- ignoring the details and setting a global error status enumeration is usually appropriate for IWriter implementations
- storing all read mismatches is usually more appropriate for world-facing IReader implementations
All errors are reported as `QIdentifierLiteral` to the `QAbstractValue` implementations that will decide what to do with them:
- ignoring the details and setting a global error status enumeration is usually appropriate for QAbstractValueWriter implementations
- storing all read mismatches is usually more appropriate for world-facing QAbstractValueReader implementations
Standardizing error literals allows efficient reporting and analysis while ensuring that various libraries can define new ones independently.
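The two reporting policies can be sketched as follows. `ErrorLiteral` is a plain `const char*` standing in for `QIdentifierLiteral`, and `Handler`, `FlagOnly`, `Collecting` and the literal names are hypothetical; the sketch only shows that the same reported literal can be reduced to a flag or kept in full.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch_errors {

typedef const char* ErrorLiteral;  // stand-in for QIdentifierLiteral
static const ErrorLiteral ExpectedSequence = "ExpectedSequence";  // hypothetical literal
static const ErrorLiteral ExpectedInteger = "ExpectedInteger";    // hypothetical literal

struct Handler {
    virtual ~Handler() {}
    virtual void report(ErrorLiteral error) = 0;
};

// Writer-style policy: drop the details, keep a global status.
struct FlagOnly : Handler {
    bool failed = false;
    void report(ErrorLiteral) override { failed = true; }
};

// Reader-style policy: keep every mismatch for later analysis.
struct Collecting : Handler {
    std::vector<std::string> errors;
    void report(ErrorLiteral error) override { errors.push_back(error); }
};

inline void check() {
    FlagOnly flag;
    Collecting all;
    Handler* handlers[] = { &flag, &all };
    for (Handler* h : handlers) {
        h->report(ExpectedSequence);
        h->report(ExpectedInteger);
    }
    assert(flag.failed);
    assert(all.errors.size() == 2);
    assert(all.errors[0] == "ExpectedSequence");
}

} // namespace sketch_errors
```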
## Data types extensibility
IBind is an abstract base class for translating fluent interface calls to format-specific read or write actions.
The fluent interface guarantees that IBind virtual methods are always called at appropriate times, so IBind implementations do not
QAbstractValue is an abstract base class for translating fluent interface calls to format-specific read or write actions.
The fluent interface guarantees that QAbstractValue virtual methods are always called at appropriate times, so QAbstractValue implementations do not
have to check well-formedness. It defines a set of BindNative types and default textual representations and encodings for non-textual
native types simplifying again the implementations (see TextWriter example).
IWriter and IReader provide partial IBind implementations simplifying the work of implementors and offering default textual representations.
QAbstractValueWriter and QAbstractValueReader provide partial QAbstractValue implementations simplifying the work of implementors and offering default textual representations.
*NB:* BindNative types could be extended by specifying BindSupport<T> trait but then, (de)serialization code must be specialized
for each TImpl. For instance, a QBind<QColor,QDataStream> may be implemented differently from QBind<QColor,IBind> but QBind<QColor>
for each TImpl. For instance, a QBind<QColor,QDataStream> may be implemented differently from QBind<QColor,QAbstractValue> but QBind<QColor>
example shows that meta() can also be used to avoid specialized serialization code that breaks RW2 requirement. If meta() is deemed
enough, the BindSupport trait and TImpl template parameters can be removed.
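The dispatch between native and generic binding can be pictured with a small tag-dispatch sketch. `Support`, `NativeTag`, `GenericTag`, `Generic` and `Writer` are hypothetical simplifications of the `BindSupport<T>` / `QBind<T>` mechanism described above, with `int` as the only native type.

```cpp
#include <cassert>
#include <string>

namespace sketch_traits {

struct NativeTag {};
struct GenericTag {};

// Trait in the spirit of BindSupport<T>: generic unless declared native.
template<class T> struct Support { typedef GenericTag type; };
template<> struct Support<int> { typedef NativeTag type; };

template<class T> struct Generic;  // specialized per user type, like QBind<T>

struct Writer {
    std::string text;
    void native(int v) { text += std::to_string(v); }  // format-specific method
    template<class T> void bind(const T& v) { dispatch(v, typename Support<T>::type()); }
private:
    template<class T> void dispatch(const T& v, NativeTag) { native(v); }
    template<class T> void dispatch(const T& v, GenericTag) { Generic<T>::bind(*this, v); }
};

struct Point { int x; int y; };

// Generic binding expressed only in terms of Writer::bind, so it is format-agnostic.
template<> struct Generic<Point> {
    static void bind(Writer& w, const Point& p) {
        w.text += '(';
        w.bind(p.x);
        w.text += ',';
        w.bind(p.y);
        w.text += ')';
    }
};

inline std::string demo() {
    Writer w;
    Point p = { 1, 2 };
    w.bind(p);
    return w.text;
}

inline void check() { assert(demo() == "(1,2)"); }

} // namespace sketch_traits
```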
//! \warning As with QDataStream, QDataWriter and QDataReader must bind compatible C++ data types, and QDataStream ByteOrder, FloatingPointPrecision and Version
//! \remark When not statically known, such information can be transmitted using meta("type",...) although some IBind implementations may not support it
class QDataWriter : public IWriter // allows runtime flexibility but requires QBind<T> in addition to T::operator<<(QDataStream&) and does not warrant QDataStream operators compatibility if QBind<T> ignores qmDataStreamVersion in meta()
//! \remark When not statically known, such information can be transmitted using meta("type",...) although some QAbstractValue implementations may not support it
class QDataWriter : public QAbstractValueWriter // allows runtime flexibility but requires QBind<T> in addition to T::operator<<(QDataStream&) and does not warrant QDataStream operators compatibility if QBind<T> ignores qmDataStreamVersion in meta()