Building an ECS #2: Archetypes and Vectorization

Sander Mertens
11 min readMar 14, 2020

--

This is the second in a series of posts about the guts of Flecs, an Entity Component System for C and C++. Each post will cover a different part of the design and turn it inside out, so that at the end of each post you will have enough information to implement it yourself.

The first entry explained how we can efficiently keep track of which components an entity has. In this second entry we will explore one way we can store the component data. ECS is often touted for its performance benefits, and after reading this entry you will know why.

Arrays, array, arrays

As a rule of thumb, if you want to get things done fast, use arrays. The reason for this is that iterating over an array has a very predictable memory access pattern. CPUs take advantage of this by prefetching data from RAM into the CPU cache that it thinks you will access next.

If memory access is randomized, like what happens in object oriented code when each object is allocated separately, a CPU can’t predict which data is needed next. This leads to loading junk data into the CPU cache, and in turn more RAM roundtrips. RAM access is much slower than the CPU cache, which can cause measurable slowdowns in application code.

For a sense of scale, the following diagram shows approximately how many bunnies can travel from different storages to the CPU in the same time:

Needless to say, we want to avoid fetching memory from RAM if we can.

Code that iterates arrays has another benefit, which is that under certain conditions compilers can insert special “vector” or SIMD instructions. SIMD (Single Instruction, Multiple Data) instructions are neat, because they run the same operation on multiple values, in the time it takes to do a single instruction! This can make code run anywhere from 2–16x faster than the same, non vectorized code.

(note: vectorized code has nothing to do with using an std::vector)

When building our ECS, we of course want to be able to leverage both fast access and vectorization. This means we need our data to be stored in arrays and contiguous.

Contiguous means that our data shouldn’t contain any gaps. We can have an array with 1000 elements, but if only 3 of those elements contain useful data, we’re no better off than where we started.

The ABC problem

Unfortunately storing all data in perfect contiguous arrays is not trivial. To see why, let’s look at a simple example first. Imagine we have an application with 3 entities that all have component A. We can store this in a single array:

A a[3];

We can use the array index for entity id, so that a[2] gives us the value of A for entity 2. Now imagine that two of these entities also have component B. For this we need to add an array for B, but here we run into a problem. We could create an array with the same size of A, like this:

B b[3];

This is easy, as it still allows us to use the entity id as the index (A and B for entity 2 are stored in A[2] and B[2]). But, because we have an array of 3 and only two entities with B, we cannot guarantee the array is contiguous! This means that we cannot vectorize our code, which is no good.

Instead, let’s try to only store as many B’s as we need by creating an array with just two elements:

B b[2];

This is a contiguous array for which we can write vectorized code, but now we run into a different problem. Imagine our entities look like this:

0: [A B]
1: [A ]
2: [A B]

Even though both A and B are stored as contiguous arrays, the entity ids don’t match up with the array indices! This means that we cannot write vectorized code that requires both A and B. This is still no good, but we can use a trick here, which is to store all entities with B at the top of array A:

0: [A B]
2: [A B]
1: [A ]

Now A and B line up nicely. We will have to come up with a mapping from entity ids to array indices, since they no longer are the same, but that is something we can live with as long as we can write fast code. This approach even works if some entities only have B:

3: [  B]
0: [A B]
2: [A B]
1: [A ]

Problem solved? Unfortunately not. Let’s see what happens if we add another component into the mix:

0: [  B  ]
1: [ B C]
2: [A B C]
3: [A B ]
4: [A ]
5: [A C]
6: [ C]

There is no way we can order our entities so that we get three perfectly contiguous arrays for A, B and C (try it!). Did we chase an impossible goal? Not quite, but we need a relaxation in order to support vectorization.

Archetypes, revisited

What if we stop trying to create a single array per component, but instead create an array per component, per type? Imagine we have 2 entities with [A], 2 entities with [A, B] and 2 entities with [A, C]. We could store them like this:

// Type [A]
A a[2];
// Type [A, B]
A a[2];
B b[2];
// Type [A, C]
A a[2];
C c[2];

Our entities now look like this:

0: [A]
1: [A]
2: [A B]
3: [A B]
4: [A C]
5: [A C]

We now have 3 separate arrays for A which introduces a little bit of overhead for when we only want to iterate A, but other than that we fixed all of our previous problems. All of the arrays are contiguous, and the array indices correspond with a single entity. If we wanted to iterate [A, B], we could do so with a loop that looks like this (and can be vectorized):

for (int i = 0; i < 10; i ++) {
a[i] += b[i];
}

To make this work, we need to group all entities with the same components together. As it turns out, we already have something that keeps track of unique component combinations: the “Archetype” from the previous blog post! To recap, the type looks like this:

// Type used to store each unique component list only once
struct Archetype {
ArchetypeId id;
Type type;
};

We can extend this type so that it can also store components:

using Column = vector<T>; // vector for a component type Tstruct Archetype {
ArchetypeId id;
Type type;
vector<Column> components; // one vector for each component
};

Note that this is pseudo code as an ECS usually does not know the component type T beforehand. That is why an ECS usually needs to apply a technique called type erasure, which allows components to be stored without knowing what type it is. Actual code could do something like:

struct Column {
void *elements; // buffer with component data
size_t element_size; // size of a single element
size_t count; // number of elements
}

Now that we can store and iterate components in a way that is cache efficient and allows for things like vectorization, let’s look at how we can actually find the value of a component for a specific entity.

Getting a component

To find a component value for an entity, we need to extend the entity index, a data structure described in the previous post. Here is a recap of what that data structure looked like:

unordered_map<EntityId, Archetype&> entity_index;

While this lets us find the archetype for an entity, it doesn’t tell us at which index in the component arrays we can find its values. We can fix this by adding this index, or row, to the payload of the entity index:

struct Record {
Archetype& archetype;
size_t row;
}
unordered_map<EntityId, Record> entity_index;

With this in place we can now get the value for a specific component:

void* get_component(EntityId entity, ComponentId component) {
Record& record = entity_index[entity];
Archetype& archetype = r.archetype;
// First check if archetype has component
ArchetypeSet archetypes = component_index[component];
if (archetypes.count(archetype.id) == 0) {
return nullptr;
}
for (int i = 0; i < archetype.type.count(); i ++) {
ComponentId type_id = archetype.type[i];
if (type_id == component) {
return archetype.columns[i][record.row];
}
}
return nullptr;
}

We’re starting to get somewhere, but this code has a linear search to get the component value for an entity, which is not ideal. A get operation is very common in ECS applications, so if we can get rid of this search we can speed up applications by a lot!

Turns out there is a simple way to do this, which requires extending the component_index data structure described in the previous post. Here is a recap of what it looked like:

using ArchetypeSet = set<ArchetypeId>;
unordered_map<ComponentId, ArchetypeSet> component_index;

We used this in our previous example to test if an archetype has a component, but what if instead of just telling us if an archetype has a component, it could also tell us in which column the component is stored? We can make it do this by simply adding a payload to the ArchetypeSet, turning it into a map:

struct ArchetypeRecord {
size_t column;
};
using ArchetypeMap = unordered_map<ArchetypeId, ArchetypeRecord>;unordered_map<ComponentId, ArchetypeMap> component_index;

With this in place, we can rewrite our get_component function like this:

void* get_component(EntityId entity, ComponentId component) {
Record& record = entity_index[entity];
Archetype& archetype = r.archetype;
ArchetypeMap archetypes = component_index[component];
if (archetypes.count(archetype.id) == 0) {
return nullptr;
}
ArchetypeRecord& a_record = archetypes[archetype.id];
return archetype.columns[a_record.column][record.row];
}

This removes the linear search from the code, and turns it into a fast O(1) operation, which makes the get_component function a lot faster. Note that in an actual implementation you could combine testing for the record and getting the record, which would remove an additional lookup. Pretty neat!

Adding components

If a component is added to an entity, its archetype changes. This means that to make this work, we need to do two things:

  • Find the archetype that has the component we want to add in addition to the components the entity already had
  • Insert a new row into the destination archetype
  • Move overlapping components over to the destination archetype
  • Remove the entity from the current archetype

Say we have an entity with archetype [Position, Velocity], and we want to add Health to it. The first thing we need to do is find (or create) archetype [Position, Velocity, Health]. We then need to copy the Position and Velocity components over to [Position, Velocity, Health]. Then we need to remove the row from the [Position, Velocity] archetype.

This sequence is the same for removing components.

Finding an archetype

A big factor in how add and remove operations perform is how fast you can find the next archetype. In the previous post we used a data structure that used the type vector (the list of component ids) as key in a map:

// A list of component ids
using Type = vector<ComponentId>;

// Find an archetype by its list of component ids
unordered_map<Type, Archetype> archetype_index;

While this approach works, it is not very fast. Consider the steps that we would have to go through:

  • Create a new vector with enough space to contain the component ids of the old archetype + the new component id
  • Populate the array with the old and new component ids, while making sure the vector is sorted
  • Hash the vector
  • Use the hash to lookup the archetype in the archetype_index
  • Compare the list of the found archetype(s) with the vector to deal with possible hash collisions

This is not going to be fast enough for operations we want to do often in the main loop of a game. If entities don’t change components often this may be fine, but one of the big benefits of ECS is that it lets us add capabilities to entities on the fly. It would be a shame if our ECS couldn’t handle that.

Fortunately there is a better way.

The Archetype Graph

In a typical ECS application entities will move between the same archetypes many times. This means that we need many archetype searches that have the same parameters: if there is one entity that had Position, Velocity to which Health was added, there will likely be more.

We can use this knowledge by caching the result of an archetype search. Before we can start thinking about how to cache a result, let’s first take a look at what the search function looks like:

Archetype& add_to_archetype(Archetype& src, ComponentId id);

Given that we already know the source archetype from the entity index, what we could do is add a data structure to the archetype that returns the destination archetype for a given component id. Let’s extend our archetype struct to see what this looks like:

struct Archetype {
ArchetypeId id;
Type type;
vector<Column> components;
unordered_map<ComponentId, Archetype&> add_archetypes;
};

We can initialize this map lazily, the first time an entity in the archetype adds a component we will have to initialize the entry, which we can do with the hash-based approach mentioned earlier. Once the map contains an entry for the component id though, finding the next archetype becomes easy. Let’s take a look at what an add_component operation could look like:

void add_component(EntityId entity, ComponentId component) {
Record& record = entity_index[entity];
Archetype& archetype = record.archetype;
Archetype& next_archetype = archetype.add_archetypes[component];
move_entity(archetype, record.row, next_archetype);
}

We also need to be able to remove components. We could support this by simply adding another map to the archetype:

struct Archetype {
ArchetypeId id;
Type type;
vector<Column> components;
unordered_map<ComponentId, Archetype&> add_archetypes;
unordered_map<ComponentId, Archetype&> remove_archetypes;
};

Though a more efficient way is to store a single map with a payload that has both an add and remove archetype:

struct ArchetypeEdge {
Archetype& add;
Archetype& remove;
};
struct Archetype {
ArchetypeId id;
Type type;
vector<Column> components;
unordered_map<ComponentId, ArchetypeEdge> edges;
};

This effectively organizes our archetypes in a graph, where the archetypes are the graph nodes, and the component ids are the edges between the archetypes. The following diagram shows an example of what an archetype graph could look like:

An example archetype graph

This greatly improves the add/remove operations of our ECS. Instead of creating a component id vector with the component to add, sorting it, hashing it and using it in a map lookup, we do one quick O(1) lookup!

Summary

We now have an ECS that is starting to become more useful. We can do the following things:

  • We can keep track of entities in the entity_index
  • We can inspect the components of an entity by looking at its Archetype
  • We can check if an entity has a component with the component_index
  • We can quickly find all archetypes for a set of components
  • We can quickly add/remove components from an entity

This is the set of data structures we’ve created so far:

// A list of component ids
using Type = vector<ComponentId>;
// Component vector
using Column = vector<T>;
// Edges to other archetypes
struct ArchetypeEdge {
Archetype& add;
Archetype& remove;
};
// Type used to store each unique component list only once
struct
Archetype {
ArchetypeId id;
Type type;
vector<Column> components;
unordered_map<ComponentId, ArchetypeEdge> edges;
};
// Find an archetype by its list of component ids
unordered_map<Type, Archetype> archetype_index;
// Find the archetype for an entity
unordered_map<EntityId, Archetype&> entity_index;
// Record in component index with component column for archetype
struct ArchetypeRecord {
size_t column;
};
// Used to lookup components in archetypes
using ArchetypeMap = unordered_map<ArchetypeId, ArchetypeRecord>;
unordered_map<ComponentId, ArchetypeMap> component_index;

The next blog, Making the most of Entity Identifiers, goes into detail on how we can implement things like liveliness checking and entity relationships by reserving a few bits in the entity identifier.

Building Games with Entity Relationships describes ECS relationships, which is a feature that turns an ECS storage into a graph database.

Hope this helped! If you’re looking for an ECS that already implements the data structures described here, take a look at Flecs:

If you have other ideas that you’d like to discuss with others that are also building ECS implementations, join the Flecs discord!

--

--

Sander Mertens

Author of Flecs, an Entity Component System for C and C++