Thanks for the reply, good point!
Yes, this approach slightly slows down iterating a single component. The cost is usually modest, since you basically pay one cache miss per component/archetype switch, so you need a pretty dramatic level of fragmentation (the term people usually use for this) before it becomes significant.
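To make the cost concrete, here is a minimal sketch of what iterating one component across archetypes looks like. This is not the code of any particular ECS library. `Table`, `IterStats` and `sum_positions` are hypothetical names made up for illustration, and each table only shows its Position column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Position { float x, y; };

// Hypothetical archetype table: one contiguous column per component.
// Only the Position column is shown; other columns are omitted.
struct Table {
    std::vector<Position> positions;
};

// Hypothetical result type, used to count how often we switch tables.
struct IterStats {
    float sum_x = 0.0f;
    std::size_t table_switches = 0;  // non-empty tables visited
    std::size_t entities = 0;        // Position values visited
};

// Sums Position.x over every table. The inner loop walks contiguous memory.
// The outer loop jumps to a different allocation once per table, which is
// where the extra cache miss per archetype switch comes from.
IterStats sum_positions(const std::vector<Table>& tables) {
    IterStats stats;
    for (const Table& table : tables) {
        if (table.positions.empty()) continue;  // nothing to iterate here
        ++stats.table_switches;
        for (const Position& p : table.positions) {
            stats.sum_x += p.x;
            ++stats.entities;
        }
    }
    // Every visited table contributed at least one entity.
    assert(stats.entities >= stats.table_switches);
    return stats;
}
```

The number to watch is the ratio of `entities` to `table_switches`. When it is large, the per-table jump is amortized over many contiguous reads. When it gets close to 1, nearly every entity lives in its own table, and that is the fragmented case.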
My reasoning for this has been that if you have high levels of fragmentation, you likely don't have many entities, and iteration performance isn't a significant contributor to overall performance. If you have many entities, fragmentation level is likely to be low, and its overhead isn't a significant contributor to overall performance. This makes it a good default (I think) for complex apps where performance matters.
I'm intrigued by what you said about iterating components. I rarely (if ever) find myself iterating just one component. Take this tower defense example: https://github.com/SanderMertens/tower_defense/blob/master/src/main.cpp#L1065
The "FindTarget" system needs the entity's own Position, a "Target" component (for storing the target) and a "Turret" component (for rotating the turret). Such systems are the default rather than the exception, at least in the way I write code.
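As a rough illustration of the shape of such a system, here is a simplified, self-contained sketch. It is not the code from the linked file. The field layouts of `Target` and `Turret`, the `TurretTable` type, the `find_target` function and the brute-force nearest-enemy search are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Position { float x, y; };
struct Target   { int enemy = -1; };      // hypothetical: index into enemies, -1 = none
struct Turret   { float angle = 0.0f; };  // hypothetical: facing angle in radians

// Hypothetical archetype table for entities that have all three components.
// Row i of each column belongs to the same entity.
struct TurretTable {
    std::vector<Position> position;
    std::vector<Target>   target;
    std::vector<Turret>   turret;
};

// For each turret, pick the nearest enemy within `range`, store it in Target
// and rotate the Turret towards it. Three columns are read or written per
// entity, so the per-table switch cost is shared across all of them.
void find_target(std::vector<TurretTable>& tables,
                 const std::vector<Position>& enemies, float range) {
    for (TurretTable& t : tables) {
        assert(t.position.size() == t.target.size());
        assert(t.position.size() == t.turret.size());
        for (std::size_t i = 0; i < t.position.size(); ++i) {
            float best = range * range;  // compare squared distances
            t.target[i].enemy = -1;      // no target until one is found
            for (std::size_t e = 0; e < enemies.size(); ++e) {
                float dx = enemies[e].x - t.position[i].x;
                float dy = enemies[e].y - t.position[i].y;
                float d2 = dx * dx + dy * dy;
                if (d2 <= best) {
                    best = d2;
                    t.target[i].enemy = static_cast<int>(e);
                    t.turret[i].angle = std::atan2(dy, dx);
                }
            }
        }
    }
}
```

The point of the sketch is only that the inner loop touches Position, Target and Turret for the same row, which is the access pattern that storing components together per archetype is meant to serve.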
I'd be interested to hear which approach usually ends up iterating a single component, as I'm not familiar with it.
Also, feel free to join the Discord, where we have a bunch of people discussing ECS design! https://discord.gg/BEzP5Rgrrp