Evolution

FlatBuffers enables the schema to evolve over time while still maintaining forwards and backwards compatibility with old flatbuffers.

Some rules must be followed to ensure the evolution of a schema is valid.

Rules

Adding new tables, vectors, or structs to the schema is always allowed. It's only when you add a new field to a table that certain rules must be followed.

Addition

New fields MUST be added to the end of the table definition.

This allows older data to still be read correctly (giving you the default value of the added field if accessed).

Older code will simply ignore the new field in the flatbuffer.

Use id attributes

You can ignore this rule if you use the id attribute on all the fields of a table.

Removal

You MUST NOT remove a field from the schema, even if you don't use it anymore. You simply stop writing it to the buffer.

It's encouraged to mark the field deprecated by adding the deprecated attribute. This skips the generation of accessors and setters in the code, which enforces that the field is no longer used.

Name Changes

It's generally OK to change the names of tables and fields, as these are not serialized to the buffer. Doing so may break code, which then has to be refactored to use the updated names.

Examples

The following examples use a base schema and attempt to evolve it a few times. The versions are tracked as V1, V2, etc., and CodeV1 means code compiled against the V1 schema.

Table Evolution

Let's start with a simple table T with two fields.

Schema V1
table T {
  a:int;
  b:int;
}

First, let's extend the table with a new field.

Schema V2
table T {
  a:int;
  b:int;
  c:int;
}

This is OK. CodeV1 reading V2 data will simply ignore the presence of the new field c. CodeV2 reading V1 data will get a default value (0) when reading c.
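As a rough illustration of why appending is safe, the following Python sketch models a serialized table as a list of slots indexed by field id. It is a toy model, not the FlatBuffers API or binary format: `write_table` and `read_field` are hypothetical helpers, and ids are assumed to follow declaration order because no id attribute is used.

```python
# Toy model only: a serialized table is a list of slots indexed by field id.
# None in a slot stands for "this field was not written".

V1 = ["a", "b"]        # field names in declaration order (Schema V1)
V2 = ["a", "b", "c"]   # Schema V2 appends c


def write_table(fields, values):
    """Store each value in the slot given by the field's declaration index."""
    slots = [None] * len(fields)
    for name, value in values.items():
        slots[fields.index(name)] = value
    return slots


def read_field(fields, slots, name, default=0):
    """Return the stored value, or the default if the slot is missing or empty."""
    slot = fields.index(name)
    if slot >= len(slots) or slots[slot] is None:
        return default
    return slots[slot]


v1_data = write_table(V1, {"a": 10, "b": 20})
v2_data = write_table(V2, {"a": 10, "b": 20, "c": 30})

# CodeV2 reading V1 data: c was never written, so the default comes back.
c_from_v1_data = read_field(V2, v1_data, "c")

# CodeV1 reading V2 data: it only knows a and b, and never looks at slot 2.
a_b_from_v2_data = [read_field(V1, v2_data, name) for name in V1]
```

In this model the existing fields keep their slots, so both directions of compatibility hold.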

Schema V3
table T {
  a:int (deprecated);
  b:int;
  c:int;
}

This is OK, removing field a via deprecation. CodeV1 and CodeV2 reading V3 data will now always get the default value of a, since it is not present. CodeV3 cannot write a anymore. CodeV3 reading old data (V1 or V2) will not be able to access the field anymore, since the generated accessors are omitted.

Add a new field, but this time at the beginning.

Schema V2
table T {
  c:int;
  a:int;
  b:int;
}

This is NOT OK, as it makes V2 incompatible. CodeV1 reading V2 data will access a but will read c data.

CodeV2 reading V1 data will access c but will read a data.

Remove a field from the schema.

Schema V2
table T {
  b:int;
}

This is NOT OK. CodeV1 reading V2 data will access a but read b data.

CodeV2 reading V1 data will access b but will read a data.
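The difference between deprecating a field and deleting it can be sketched in Python. This is a toy model under the assumption that ids follow declaration order (no id attribute); `accessor_ids` is a hypothetical helper, not part of any FlatBuffers tooling.

```python
def accessor_ids(declarations):
    """Map each usable field name to its id.

    declarations is a list of (name, deprecated) pairs in schema order.
    Every declaration consumes an id, but deprecated fields get no accessor.
    """
    return {
        name: field_id
        for field_id, (name, deprecated) in enumerate(declarations)
        if not deprecated
    }


v1_slots = [10, 20]  # V1 data: a=10 in slot 0, b=20 in slot 1

deprecated_a = accessor_ids([("a", True), ("b", False)])
removed_a = accessor_ids([("b", False)])

b_with_deprecation = v1_slots[deprecated_a["b"]]  # b keeps id 1
b_with_removal = v1_slots[removed_a["b"]]         # b slides down to id 0
```

With deprecation, b still resolves to its original slot. With removal, b resolves to the slot that holds a's data.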

Let's add a new field to the beginning, but use id attributes.

Schema V2
table T {
  c:int (id: 2);
  a:int (id: 0);
  b:int (id: 1);
}

This is OK. This adds a new field at the beginning, but because every field has an explicit id attribute, the existing fields keep their ids.

Let's change the types of the fields.
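To contrast the two ways of adding c at the beginning, here is a small Python sketch. It is a toy slot model, not the real binary layout, and `ids_by_position`, `write_table`, and `read_field` are hypothetical names introduced only for illustration.

```python
def ids_by_position(fields):
    """Ids implied by declaration order when no id attribute is given."""
    return {name: i for i, name in enumerate(fields)}


def write_table(ids, values):
    """Place each value in the slot named by its field id."""
    slots = [None] * (max(ids.values()) + 1)
    for name, value in values.items():
        slots[ids[name]] = value
    return slots


def read_field(ids, slots, name, default=0):
    """Return the stored value, or the default if the slot is missing or empty."""
    slot = ids[name]
    if slot >= len(slots) or slots[slot] is None:
        return default
    return slots[slot]


V1 = ids_by_position(["a", "b"])
V2_NO_IDS = ids_by_position(["c", "a", "b"])   # c silently takes id 0
V2_WITH_IDS = {"c": 2, "a": 0, "b": 1}         # ids pinned by the id attribute

values = {"a": 10, "b": 20, "c": 30}

# CodeV1 asks for a in data written by each V2 variant.
a_seen_without_ids = read_field(V1, write_table(V2_NO_IDS, values), "a")
a_seen_with_ids = read_field(V1, write_table(V2_WITH_IDS, values), "a")
```

Without ids, CodeV1 finds c's value in the slot it expects for a. With pinned ids, the declaration order no longer matters.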

Schema V2
table T {
  a:uint;
  b:uint;
}

This is MAYBE OK, and only when the old and new types have the same width. It is tricky if the V1 data contains any negative numbers, since CodeV2 will read them as large positive values. So this should be done with care.

Let's change the default values of the existing fields.
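The effect on negative numbers can be checked with Python's standard `struct` module. The sketch below only reinterprets four little-endian bytes and does not involve FlatBuffers itself; `reinterpret_int32_as_uint32` is a hypothetical helper name.

```python
import struct


def reinterpret_int32_as_uint32(value):
    """Encode value as a little-endian int32, then decode the same bytes as uint32."""
    raw = struct.pack("<i", value)
    return struct.unpack("<I", raw)[0]


positive = reinterpret_int32_as_uint32(7)    # same bytes, same meaning
negative = reinterpret_int32_as_uint32(-1)   # same bytes, very different meaning
```

Here `positive` stays 7, while `negative` becomes 4294967295, so the change is only safe if no negative values were ever written.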

Schema V2
table T {
  a:int = 1;
  b:int = 2;
}

This is NOT OK. Any V1 data that did not have a value written to the buffer relied on generated code to provide the default value.

There MAY be cases where this is OK, if you control all the producers and consumers, and you can update them in tandem.
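A minimal Python sketch of the problem, assuming the writer omits a scalar whose value equals the schema default (the usual behavior unless defaults are forced). `write_field` and `read_field` are hypothetical helpers, and a slot of `None` stands for "not written".

```python
def write_field(value, schema_default):
    """Return the stored slot, or None when the value equals the default and is omitted."""
    return None if value == schema_default else value


def read_field(slot, schema_default):
    """Return the stored value, or the reader's own default if nothing was stored."""
    return schema_default if slot is None else slot


V1_DEFAULT_A = 0
V2_DEFAULT_A = 1

stored = write_field(0, V1_DEFAULT_A)          # the V1 writer meant a == 0
seen_by_code_v1 = read_field(stored, V1_DEFAULT_A)
seen_by_code_v2 = read_field(stored, V2_DEFAULT_A)
```

The buffer is unchanged, yet CodeV2 reports a different value for a than the writer intended, because the default lives in the generated code rather than in the data.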

Let's change the names of the fields.

Schema V2
table T {
  aa:int;
  bb:int;
}

This is generally OK. Renaming the fields will break all code and JSON files that use this schema, but you can refactor those without affecting the binary data, since the binary only addresses fields by id and offset, not by name.

Union Evolution

Let's start with a simple union U with two members.

Schema V1
union U {
  A,
  B
}

Let's add another variant to the end.

Schema V2
union U {
  A,
  B,
  another_a: A
}

This is OK. CodeV1 will not recognize another_a.

Let's add another variant to the middle.

Schema V2
union U {
  A,
  another_a: A,
  B
}

This is NOT OK. The new variant takes over the discriminant value that B had in V1. CodeV1 reading V2 data will interpret a stored another_a as B. CodeV2 reading V1 data will interpret a stored B as another_a.

Let's add another variant to the middle, this time adding a union "discriminant".

Schema V2
union U {
  A = 1,
  another_a: A = 3,
  B = 2
}

This is OK. It's like you added it to the end, but the explicit discriminant value lets you place it elsewhere in the union declaration.
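The two middle-insertion variants can be compared with a short Python sketch. It assumes that, without explicit values, union members are numbered from 1 in declaration order with 0 reserved for NONE. `tags_by_position` and `decode` are hypothetical helpers, and the string "unknown" is just a placeholder for a tag the reader does not know.

```python
def tags_by_position(members):
    """Number members from 1 in declaration order; 0 is reserved for NONE."""
    return {name: tag for tag, name in enumerate(members, start=1)}


def decode(tags, tag):
    """Return the member name a reader with these tags sees for a stored tag."""
    names = {value: name for name, value in tags.items()}
    return names.get(tag, "unknown")


V1 = tags_by_position(["A", "B"])
V2_MIDDLE = tags_by_position(["A", "another_a", "B"])
V2_EXPLICIT = {"A": 1, "another_a": 3, "B": 2}

# CodeV1 decoding an another_a value written under each V2 variant.
v1_sees_middle = decode(V1, V2_MIDDLE["another_a"])
v1_sees_explicit = decode(V1, V2_EXPLICIT["another_a"])

# CodeV2 decoding a B value written by V1.
v2_middle_sees = decode(V2_MIDDLE, V1["B"])
v2_explicit_sees = decode(V2_EXPLICIT, V1["B"])
```

With positional numbering, the tag 2 changes meaning between versions. With explicit discriminants, A and B keep their V1 tags and another_a gets a tag that V1 never used.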

Version Control

FlatBuffers relies on new field declarations being added at the end, and on earlier declarations not being removed but marked deprecated when needed. We think this is an improvement over the manual number assignment that happens in Protocol Buffers (and which is still an option using the id attribute mentioned above).

One place where this is possibly problematic, however, is source control. Suppose user A adds a field and generates new binary data with the new schema, then tries to commit both to source control after user B has already committed a new field of their own. If user A just auto-merges the schema, the binary files are now invalid with respect to the merged schema.

The solution of course is that you should not be generating binary data before your schema changes have been committed, ensuring consistency with the rest of the world. If this is not practical for you, use explicit field ids, which should always generate a merge conflict if two people try to allocate the same id.