For 10-million-cell problems my compute server (with 128 GB RAM)
starts to swap when I use debugging tools in parallel runs. I assume
that this might become an issue for others, too.
Now we consistently use unordered_map for the mapping.
- the diffusion one is basically done at runtime anyway
- the energy one gives some small code-elimination gains;
  however, it complicates the writing of downstream templates.