Why Vanilla ECS Is Not Enough

Disclaimer: I am the author of Flecs, an Entity Component System for C99. Discord: https://discord.gg/ZSSyqty

When I started writing my first ECS a year ago, I was excited. It seemed to do something unique by offering a developer more flexibility and more performance at the same time. Also, having high-level design primitives that translate well to cache- and vectorization-friendly code sounded great.

This has all proven to be true, at least for me. And yet.

First I have to preface this blog with something. There are two ways you can look at ECS. One is that it is a data container, like a vector or a hashmap. The other is that of a design pattern. The line between the two is often blurred as ECS heavily relies on data, and this data needs to be stored somewhere. This is why ECS implementations put a lot of emphasis on how data is stored.

Much has been said (and will be said) about what are the best and most performant ways to store your data in ECS. In this blog I want to spend some time reflecting on the design aspects, and in particular where I think ECS as a design pattern is lacking.

If you’re short on time, scroll to the end of this blog to see the changes I would make to the definition of vanilla ECS.

To know why ECS is lacking, we need to know what its purpose is. This is already a contentious question. When approaching this question from the “ECS as a data structure” perspective, there isn’t necessarily a single purpose, besides offering a performant way to store and retrieve data. When we treat ECS as a design pattern though, we naturally need to ask the question, “a design pattern for what?”.

Most often ECS is mentioned as a pattern for implementing game logic, where if done right, it produces code that is easier to extend, refactor, and maintain (and yes, it will probably run faster too). But what about a game engine? Could we implement core engine features like an input manager or renderer in ECS? What about a user interface? Or a low-level data structure like a quad tree? If not, why not?

Over the past year I have experimented with implementing several things such as an HTTP wrapper and REST endpoint, a reflection framework, a metrics collection backend and several core engine systems in ECS. So far I am not unhappy with the results. There is one big takeaway from these projects, and also one big caveat.

Let’s start with the takeaway: coding these features in ECS made them easier to extend, easier to build and made them plug & play, which I will talk about later. It also made the code much more data-driven, which I believe helped in keeping things simple and readable. Did it make the code faster? Possibly, but that wasn’t the goal, as ECS was not in the critical path for any of the projects.

Now the caveat: I did not just use “vanilla ECS”. Vanilla ECS can be summarized in the following four rules:

  • An entity is a unique identifier
  • A component is a plain old datatype
  • An entity can have 0 .. N components
  • A system is logic matched with entities based on their components
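
As a point of reference, the four rules can be written down in a few lines of C++. This is only a minimal sketch: the names (World, move_system) and the map-based storage are assumptions made for illustration, not how Flecs or any particular library stores its data.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Rule 1: an entity is a unique identifier.
using Entity = std::uint64_t;

// Rule 2: a component is a plain old datatype.
struct Position { float x, y; };
struct Velocity { float x, y; };

// Rule 3: an entity can have 0 .. N components. This toy world keeps one
// map per component type; real implementations use denser storage.
struct World {
    Entity next_id = 1;
    std::unordered_map<Entity, Position> positions;
    std::unordered_map<Entity, Velocity> velocities;

    Entity create() { return next_id++; }
};

// Rule 4: a system is logic matched with entities based on their components.
// This one matches every entity that has both Position and Velocity and
// returns how many entities it matched.
inline int move_system(World& world, float dt) {
    assert(dt >= 0.0f && "time step must not be negative");
    int matched = 0;
    for (auto& kv : world.positions) {
        auto vel = world.velocities.find(kv.first);
        if (vel == world.velocities.end()) continue;  // no Velocity: skip
        kv.second.x += vel->second.x * dt;
        kv.second.y += vel->second.y * dt;
        ++matched;
    }
    return matched;
}
```

An entity that only has a Position is simply skipped by move_system. Everything the rest of this post asks for is something these four rules do not express.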

These four rules proved to be insufficient for my purposes. So I started tweaking ECS. Why bother, you may ask, and not just implement them in a different way? In short: because I like the style of ECS, and I think it has measurable benefits over non-ECS code. With a few tweaks I was able to at least overcome some of the shortcomings of vanilla ECS, which are:

  • Some data cannot be instantiated in a scalable or performant way
  • The semantics of systems are underspecified

Ok, enough high-level fluff, let’s get into some specifics.

Hierarchies

Despite how common hierarchies in games are, vanilla ECS provides no facilities out of the box for specifying hierarchies. To make matters worse: it is actually impossible to implement a performant hierarchy in vanilla ECS, where by “performant” I mean an approach that lets us iterate over coordinates in a contiguous array with code that can be vectorized (if you think otherwise and you have a solution, do let me know ;).
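
To make the difficulty a bit more concrete, here is a rough sketch of what happens when a parent link is stored as just another component value. The struct layout, the row-index encoding of the parent and the function name are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Position { float x, y; };

// Hypothetical storage: one row per entity. The parent is an ordinary
// component value (a row index, or -1 for a root).
struct Transforms {
    std::vector<Position> local;   // contiguous local coordinates
    std::vector<int>      parent;  // index into the same arrays, -1 = none
    std::vector<Position> world;   // output: world coordinates
};

// Computes world positions. It only works if every parent is stored before
// its children, an ordering the four vanilla rules cannot express.
inline void update_world(Transforms& t) {
    assert(t.local.size() == t.parent.size());
    t.world.resize(t.local.size());
    for (std::size_t i = 0; i < t.local.size(); ++i) {
        Position p = t.local[i];
        if (t.parent[i] >= 0) {
            const std::size_t parent_row = static_cast<std::size_t>(t.parent[i]);
            assert(parent_row < i && "parent must come before child");
            // Data-dependent lookup: each row may read a different, earlier
            // row, so this is no longer a straight pass over the array.
            p.x += t.world[parent_row].x;
            p.y += t.world[parent_row].y;
        }
        t.world[i] = p;
    }
}
```

The local coordinates are contiguous, but the indexed read of the parent's result on every row is the kind of access that stands in the way of a simple vectorized loop.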

Component sharing

Multiple component instances

Runtime Tags

This would not be a problem if we only ever had to query subsets of entities that are known before we start the game, but this is not the case. What if a game allows you to create platoons dynamically, and you want to get all of the entities for a platoon? What if you have different teams in your game and you want to tag the players in each team? This is only possible if a game can create tags at runtime, which is not possible in vanilla ECS.

State machines

This is a poor man’s approach for a few reasons. First of all, there is no inherent relationship between the tags associated with a state, which makes the state machine implicit and error-prone. Secondly, there is no way to prevent two states from being added to the same entity, or the entity from being in no state at all. This is all a step back from a state machine that does not rely on ECS, where we can group states as constants in an enumeration. Yet it would be nice to have the state machine represented in ECS, as this likely impacts the kinds of systems we want to run on our entities.
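
The contrast can be sketched in a few lines. The state names (Idle, Walking, Running) and the string-based tag set are hypothetical and chosen only to keep the example short.

```cpp
#include <cassert>
#include <set>
#include <string>

// Tag-per-state approach: each state is an independent tag on the entity.
struct TaggedEntity {
    std::set<std::string> tags;
};

// Counts how many of the state tags are present. Only this function "knows"
// that the three tags belong together.
inline int active_states(const TaggedEntity& e) {
    static const char* const kStates[] = {"Idle", "Walking", "Running"};
    int n = 0;
    for (const char* state : kStates) {
        n += static_cast<int>(e.tags.count(state));
    }
    return n;
}

// With tags, "exactly one state" can only be checked after the fact.
inline void check_single_state(const TaggedEntity& e) {
    assert(active_states(e) == 1 && "entity must be in exactly one state");
}

// Plain enumeration: exactly one state at any time, by construction.
enum class Movement { Idle, Walking, Running };

struct EnumEntity {
    Movement state = Movement::Idle;
};
```

Nothing stops a caller from inserting both "Walking" and "Running" into the tag set, or neither. The enumeration rules that out, but it is invisible to systems that match on components.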

Extreme declarative programming

Or is it? Vanilla ECS says notoriously little about this. The most common interpretation of a system is that it is a function that runs periodically in the main loop. But what if I want a system that runs when I set the component? What about a system when I unset the component? Consider this example:

entity.set<Window>({.width = 800, .height = 600});

This is a declarative statement (“there shall be a component Window on this entity with this value”) that is an example of what I’ve coined “extreme declarative programming”. The code clearly wants there to be a window, but in vanilla ECS no such thing would happen. Ideally there would be a well-defined construct in ECS so that we can describe these kinds of interactions, and have our window be created.
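
One possible shape for such a construct is a pair of hooks that run when a component is set or unset. The sketch below is an assumption about what this could look like, not the Flecs API: World, on_set and on_unset are made-up names, and the world only knows about Window to keep things short.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using Entity = std::uint64_t;

struct Window { int width; int height; };

// Hypothetical world that stores Window components and lets callers register
// logic that runs whenever a Window is set or unset.
class World {
public:
    using Handler = std::function<void(Entity, const Window&)>;

    void on_set(Handler h)   { on_set_.push_back(std::move(h)); }
    void on_unset(Handler h) { on_unset_.push_back(std::move(h)); }

    // Declarative: "there shall be a Window on this entity with this value".
    // Every registered on_set handler runs as a consequence.
    void set(Entity e, Window w) {
        assert(w.width > 0 && w.height > 0);
        windows_[e] = w;
        for (auto& h : on_set_) h(e, w);
    }

    // Removes the component, if present, and notifies on_unset handlers.
    void unset(Entity e) {
        auto it = windows_.find(e);
        if (it == windows_.end()) return;
        const Window old = it->second;
        windows_.erase(it);
        for (auto& h : on_unset_) h(e, old);
    }

private:
    std::unordered_map<Entity, Window> windows_;
    std::vector<Handler> on_set_;
    std::vector<Handler> on_unset_;
};
```

A windowing module would register an on_set handler that opens the actual window and an on_unset handler that closes it. The application code then stays a single declarative set call.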

System execution order

For all the promises ECS makes about code reusability, none of that will come to pass if we cannot specify execution order. If a “Move” system progresses Position with Velocity, but Velocity isn’t set yet for the current frame, the system will not work.

Any design language that does not allow me to specify the preconditions for something to work correctly is in my humble opinion flawed. ECS implementations have gotten “around” this problem by either ignoring it or providing overly complex or broken solutions. That’s strong language, so let’s qualify this a bit more.

ECS promises that systems are decoupled through components. Specifying direct dependencies between systems is in direct violation of this. It makes code fragile as systems are prone to change during refactoring. So don’t do that.

Another approach is to let the game developer specify system order. This may work for small projects, but what if you just imported 100 systems from an asset store that you did not write yourself? Long story short: this doesn’t work either.

So how should system dependencies be defined? Just like with everything else in ECS, we should return to the data. Our Move system needs to run after Velocity is set, not after the system that writes Velocity (there can be many). In order to schedule a Move system, we need to be able to identify a part of our frame where we can safely assume that Velocity has been set, and assign our system to that part of the frame.

A new definition for ECS

Entities & Components

  • A component can optionally be associated with a plain old datatype
  • A component identifier is an entity
  • An entity can have 0 .. N components
  • A component can be annotated with a role
  • An <entity, component> tuple can have 0 .. N components

I did a few things here. Entities are still just simple unique identifiers. Components can now optionally be associated with a datatype, which releases them from the constraint of having to be defined in advance.

The next part is where it gets interesting: “A component identifier is an entity”. This allows us to treat entities and components in the same way in many cases, and more importantly, it lets us add entities to entities. The first problem this solves is that we can now generate tags on the fly, and we can add tags to entities for things that aren’t known in advance.
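
A minimal sketch of this idea, assuming a toy World whose add, has and query functions are invented for the example: a tag is created at runtime like any other entity and then added to other entities by id.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <unordered_map>
#include <vector>

using Entity = std::uint64_t;

// Because a component identifier is itself an entity, "adding a component"
// is just associating one id with another. No type has to exist up front.
class World {
public:
    Entity create() { return next_id_++; }

    void add(Entity e, Entity component) {
        assert(e != 0 && component != 0);
        has_[e].insert(component);
    }

    bool has(Entity e, Entity component) const {
        auto it = has_.find(e);
        return it != has_.end() && it->second.count(component) > 0;
    }

    // All entities that have the given component (tag) id, in no
    // particular order.
    std::vector<Entity> query(Entity component) const {
        std::vector<Entity> out;
        for (const auto& kv : has_) {
            if (kv.second.count(component) > 0) out.push_back(kv.first);
        }
        return out;
    }

private:
    Entity next_id_ = 1;
    std::unordered_map<Entity, std::set<Entity>> has_;
};
```

A platoon created halfway through a game is then just `Entity platoon = world.create();` followed by `world.add(soldier, platoon);`, and `world.query(platoon)` returns its members.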

The next part “A component can be annotated with a role” is a catch-all mechanism that lets us specify what the “role” of a component (entity) is for an entity. Here we can specify things like, this entity is a parent, or I want to share components from this entity.

The last rule lets us add components multiple times. The aforementioned “Timer” component could be added to “entity,HealthBuff” and “entity,StaminaBuff”. Because a component is simply an identifier, we could even do things like, add “Timer” to “entity, 1000” and “entity, 1001”.
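
In code, this can be as simple as keying the component storage on the <entity, component> tuple instead of on the entity alone. The sketch below assumes a std::map keyed on a pair of ids; the storage choice and the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

using Entity = std::uint64_t;

struct Timer { float remaining; };

// The key is the <entity, component> tuple; both halves are plain ids.
using Pair = std::pair<Entity, Entity>;

struct World {
    std::map<Pair, Timer> timers;

    // Adds (or overwrites) the Timer attached to <e, component>.
    void set_timer(Entity e, Entity component, Timer t) {
        timers[Pair(e, component)] = t;
    }

    // Counts down every timer and returns how many expired this step.
    // Expired timers are removed.
    int tick(float dt) {
        assert(dt >= 0.0f);
        int expired = 0;
        for (auto it = timers.begin(); it != timers.end();) {
            it->second.remaining -= dt;
            if (it->second.remaining <= 0.0f) {
                it = timers.erase(it);
                ++expired;
            } else {
                ++it;
            }
        }
        return expired;
    }
};
```

The same entity can now carry one Timer for HealthBuff and another for StaminaBuff, and a single system (tick) handles both without knowing what the timers are attached to.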

Systems

  • A system is logic matched with entities based on their components
  • A system is invoked as result of an event
  • A component mutation is an event
  • Computing a simulation frame is an event
  • A frame is divided into N phases
  • Each system is assigned to a phase

This is a much more concise definition than the original one. It recognizes that systems can be run as a result of the simulation progressing (what many people would consider the default) and as a result of “mutations”, which essentially means adding, removing and setting components.

The “phase” construct is introduced, which splits up the frame into several different parts. This idea is not new, as many engines have similar concepts that have proven to work well. Each of these phases is associated with a specific state the frame is in, and this provides the right kind of context to ensure the preconditions for our systems are met.
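
A small sketch of phase-based scheduling, under a few assumptions: the phase names (OnInput, OnUpdate, OnRender) are made up, the whole world is reduced to a single State struct, and Pipeline is not an existing API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical phase names; the definition only says a frame has N phases.
enum class Phase : std::size_t { OnInput = 0, OnUpdate, OnRender, Count };

struct Position { float x; };
struct Velocity { float x; };
struct State {
    Position pos{0.0f};
    Velocity vel{0.0f};
};

class Pipeline {
public:
    using System = std::function<void(State&, float)>;

    // Each system is assigned to a phase, not ordered against other systems.
    void add(Phase phase, System s) {
        assert(phase != Phase::Count);
        phases_[static_cast<std::size_t>(phase)].push_back(std::move(s));
    }

    // Runs one frame: phases in order, and within a phase the systems in
    // the order they were added.
    void progress(State& state, float dt) {
        for (auto& systems : phases_) {
            for (auto& s : systems) s(state, dt);
        }
    }

private:
    std::array<std::vector<System>, static_cast<std::size_t>(Phase::Count)> phases_;
};
```

A Move system assigned to OnUpdate can rely on Velocity having been written during OnInput, regardless of which module registered first or how many systems write Velocity.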

Conclusion

Do I think that an ECS framework has to conform to all of the above points in order to be ECS? Obviously not, nor do I think this should ever be the case. I can write code in C that uses inheritance, even though the language doesn’t support it, and still call it inheritance. Ultimately what is going to move ECS adoption forward is establishing a set of patterns, and for this to emerge it helps to have a richer set of primitives than what vanilla ECS provides.

As a final note, all of this is solely based on my own experiences of the last year, and may be completely different from your own. If you have insights to share, feel like I misrepresented things or just feel like having a more in-depth discussion on any of this, feel free to join the Flecs discord: https://discord.gg/ZSSyqty

If you’re curious about what an implementation of these concepts looks like, check out Flecs v2, which is about to launch: https://github.com/SanderMertens/flecs/tree/v2_in_progress
