Building an ECS #1: Where are my Entities and Components

Sander Mertens
6 min readAug 6, 2022

This is the first in a series of posts about how to build an Entity Component System. Each post will cover a different part of the design and turn it inside out, so that at the end of each post you will have enough information to implement one yourself.

In this first entry I will go over the data structures required to find entities and which components they have. The next entry, Building an ECS: Archetypes and Vectorization, goes over how components can be stored in a cache efficient way.

The Entity Index

Entities are unique “things” in your game, often represented as a unique integer value. Entities can have components, which can be represented as a vector of component identifiers. The total set of components is the entity’s type (sometimes called an archetype).

For example, a first person shooter could have a “Player” entity that has components “Position” and “Health”. The type of this entity would be [Position, Health].

In an ECS we often need to know which components an entity has. A simple data structure that would allow for that is a map that stores a vector of component ids for each entity:

using Type = vector<ComponentId>;
unordered_map<EntityId, Type> entity_index;

The entity index lets us quickly query for a given entity id which components it has. With just the entity index, we can implement a simplehas_component function:

bool has_component(EntityId entity, ComponentId component) {
vector<ComponentId>& type = entity_index[entity];
for (auto c : type) {
if (c == component) return true;
}
return false;
}

While pretty simple, there are a few problems with this approach. The first one is that we have to store a vector per entity, which can be expensive if we have lots of entities.

The second problem is that we need an O(n) search to see if an entity has a component. This is a very common operation in an ECS, so an O(n) operation is going to be too slow for many applications, especially since we would also have to do this for almost any operation, like get, add, remove, and so on.

Let’s look at a possible solution for this.

The Archetype

First, we need to get rid of storing a vector per entity, and one way to do that is to introduce a new data structure called an “archetype”. An archetype is something that stores all the entities that have the same components. Because of this, all entities stored in an archetype can share the same component id vector which reduces memory overhead by a lot.

Let’s add the archetype to the data structures for our ECS, and create a new struct for it as we’ll be adding more members later:

using Type = vector<ComponentId>;struct Archetype {
Type type;
};
unordered_map<EntityId, Archetype&> entity_index;

We need to make sure that the type vector is sorted, so that an entity that first adds component Position and then Health, and an entity that first adds Health and then Position both end up in the archetype Position, Health. We also need a data structure to find an archetype for a list of components. A simple approach would be to add another map:

unordered_map<Type, Archetype> archetype_index;

Whenever we’d want to find an archetype we would create a type vector which has the component ids we want, in sorted order, and then use this vector with this map which would then hash the vector. This works for now, but there are way more efficient ways to do this. The next blog, Building an ECS: Archetypes and Vectorization goes over a much faster way to do archetype lookups.

OK, so now we have a way to lookup our entities in the entity index, and to find the list of component ids by looking at the archetype. Let’s see how we can get rid of the O(n) lookup.

The Component Index

Having a vector of component ids is good if we want to iterate through all the components of an entity, but having to loop through the vector if we want to test if an archetype has a specific component is not ideal. One thing we could do is add a set to the archetype that has an entry for each component id:

struct Archetype {
Type type;
unordered_set<ComponentId> type_set;
};

When an archetype is created we populate the set with each component id. We still need the vector for things like creating the component hash, but the set does give us a fast O(1) test to check if the archetype has a component. Here is our updated (and much faster!) has_component function:

bool has_component(EntityId entity, ComponentId component) {
Archetype& archetype = entity_index[entity];
return archetype->type_set.count(component) != 0;
}

While this solves the problem, it does add quite a bit of overhead to each archetype. This is not something we can ignore, because in our ECS we will end up with an archetype for each unique combination of components. Entities with components Position, Health will end up in a different archetype than entities with Position, Velocity, Health . We should expect applications to have hundreds, if not thousands of archetypes.

There is a simple solution to this, which is to flip it around: instead of having a set per archetype with component ids, we’ll have a set per component that has archetype ids. Why is this better? The number of components is always way less than the number of archetypes, so this reduces the overhead by a lot!

Let’s update the data structures with this approach:

struct Archetype {
ArchetypeId id; // unique integer identifier for an archetype
Type type;
};
using ArchetypeSet = unordered_set<ArchetypeId>;
unordered_map<ComponentId, ArchetypeSet> component_index;

Our has_component function now looks like this:

bool has_component(EntityId entity, ComponentId component) {
Archetype& archetype = entity_index[entity];
ArchetypeSet& archetype_set = component_index[component];
return archetype_set.count(archetype.id) != 0;
}

While this is a little bit more expensive than our previous function (it has one additional lookup) it is still a fast constant time operation, and we reduced the overhead of our archetypes by a lot.

Another big advantage of this approach is that we can now quickly find all archetypes for a component. To find all archetypes with the Position component, we can simply do:

ArchetypeSet& position_archetypes = component_index[Position];for (auto& archetype : position_archetypes) {
// archetype has Position component!
}

We can easily extend this to find all archetypes with both Position and Velocity:

ArchetypeSet& position_archetypes = component_index[Position];
ArchetypeSet& velocity_archetypes = component_index[Velocity];
for (auto& archetype : position_archetypes) {
if (velocity_archetypes.count(archetype.id) != 0) {
// archetype has Position and Velocity!
}
}

Not only did we create a data structure to quickly test if an entity has a component, we also created the foundation for what will become ECS queries!

Summary

Our ECS is starting to shape up nicely:

  • We can keep track of entities in the entity_index
  • We can inspect the components of an entity by looking at its Archetype
  • We can check if an entity has a component with the component_index
  • We can quickly find all archetypes for a set of components

This is what the data structures for the ECS look so far:

// A list of component ids
using Type = vector<ComponentId>;
// Type used to store each unique component list only once
struct Archetype {
ArchetypeId id;
Type type;
};
// Find an archetype by its list of component ids
unordered_map<Type, Archetype> archetype_index;
// Find the archetype for an entity
unordered_map<EntityId, Archetype&> entity_index;
// Find the archetypes for a component
using ArchetypeSet = unordered_set<ArchetypeId>;
unordered_map<ComponentId, ArchetypeSet> component_index;

We don’t have a functional ECS yet though. Before we can actually start using it, we need to address a few more things:

  • How to store components
  • How to add/remove components from an entity

The next blog, Building an ECS: Archetypes and Vectorization, describes a few data structures that can do this efficiently.

Hope this helped! If you’re looking for an ECS that already implements the data structures described here, take a look at Flecs:

If you have other ideas that you’d like to discuss with others that are also building ECS implementations, join the Flecs discord!

Further reading

--

--

Sander Mertens

Author of Flecs, an Entity Component System for C and C++