Skip to content

Annotating FlatBuffers

This provides a way to annotate flatbuffer binary data, byte-by-byte, with a schema. It is useful for development purposes and understanding the details of the internal format.

Annotating

Given a schema, as either a plain-text (.fbs) or a binary schema (.bfbs), and binary file(s) that were created by the schema. You can annotate them using:

flatc --annotate SCHEMA -- BINARY_FILES...

This will produce a set of annotated files (.afb Annotated FlatBuffer) corresponding to the input binary files.

Example

Taken from the tests/annotated_binary.

cd tests/annotated_binary
../../flatc --annotate annotated_binary.fbs -- annotated_binary.bin

Which will produce a annotated_binary.afb file in the current directory.

The annotated_binary.bin is the flatbufer binary of the data contained within annotated_binary.json, which was made by the following command:

..\..\flatc -b annotated_binary.fbs annotated_binary.json

.afb Text Format

Currently there is a built-in text-based format for outputting the annotations. A full example is shown here: annotated_binary.afb

The data is organized as a table with fixed columns grouped into Binary sections and regions, starting from the beginning of the binary (offset 0).

Columns

The columns are as follows:

  1. The offset from the start of the binary, expressed in hexadecimal format (e.g. +0x003c).

    The prefix + is added to make searching for the offset (compared to some random value) a bit easier.

  2. The raw binary data, expressed in hexadecimal format.

    This is in the little endian format the buffer uses internally and what you would see with a normal binary text viewer.

  3. The type of the data.

    This may be the type specified in the schema or some internally defined types:

    Internal Type Purpose
    VOffset16 Virtual table offset, relative to the table offset
    UOffset32 Unsigned offset, relative to the current offset
    SOffset32 Signed offset, relative to the current offset
  4. The value of the data.

    This is shown in big endian format that is generally written for humans to consume (e.g. 0x0013). As well as the "casted" value (e.g. 0x0013is 19 in decimal) in parentheses.

  5. Notes about the particular data.

    This describes what the data is about, either some internal usage, or tied to the schema.

Binary Sections

The file is broken up into Binary Sections, which are comprised of contiguous binary regions that are logically grouped together. For example, a binary section may be a single instance of a flatbuffer Table or its vtable. The sections may be labelled with the name of the associated type, as defined in the input schema.

An example of a vtable Binary Section that is associated with the user-defined AnnotateBinary.Bar table.

vtable (AnnotatedBinary.Bar):
  +0x00A0 | 08 00      | uint16_t   | 0x0008 (8)   | size of this vtable
  +0x00A2 | 13 00      | uint16_t   | 0x0013 (19)  | size of referring table
  +0x00A4 | 08 00      | VOffset16  | 0x0008 (8)   | offset to field `a` (id: 0)
  +0x00A6 | 04 00      | VOffset16  | 0x0004 (4)   | offset to field `b` (id: 1)

These are purely annotative, there is no embedded information about these regions in the flatbuffer itself.

Binary Regions

Binary regions are contiguous bytes regions that are grouped together to form some sort of value, e.g. a scalar or an array of scalars. A binary region may be split up over multiple text lines, if the size of the region is large.

Annotation Example

Looking at an example binary region:

vtable (AnnotatedBinary.Bar):
  +0x00A0 | 08 00      | uint16_t   | 0x0008 (8)   | size of this vtable

The first column (+0x00A0) is the offset to this region from the beginning of the buffer.

The second column are the raw bytes (hexadecimal) that make up this region. These are expressed in the little-endian format that flatbuffers uses for the wire format.

The third column is the type to interpret the bytes as. For the above example, the type is uint16_t which is a 16-bit unsigned integer type.

The fourth column shows the raw bytes as a compacted, big-endian value. The raw bytes are duplicated in this fashion since it is more intuitive to read the data in the big-endian format (e.g., 0x0008). This value is followed by the decimal representation of the value (e.g., (8)). For strings, the raw string value is shown instead.

The fifth column is a textual comment on what the value is. As much metadata as known is provided.

Offsets

If the type in the 3rd column is of an absolute offset (SOffet32 or Offset32), the fourth column also shows an Loc: +0x025A value which shows where in the binary this region is pointing to. These values are absolute from the beginning of the file, their calculation from the raw value in the 4th column depends on the context.