SoA和AoS

In computing, array of structures (AoS), structure of arrays (SoA) and array of structures of arrays (AoSoA) refer to contrasting ways to arrange a sequence of records in memory, with regard to interleaving, and are of interest in SIMD and SIMT programming.

根据Wikipedia的介绍，SoA是一种并行友好的layout，这种数据结构是我在学习如何用OpenGL向GPU加载顶点数据时时了解到的。比如在编辑顶点数据时，我们常常会分别对顶点的不同属性单独处理，因此当使用SoA时，能保证缓存友好。打个比方，当我们想要批量修改顶点的法向量时，使用SoA这种layout，能够保证不会载入其它属性（位置、颜色）的数据到Cache中。

当然，除了顶点，还有许多地方也会用到，比如在ECS架构，大多数`System`只处理一种类型的`Component`，这类似于对顶点的属性进行处理。而`Entity`则类似于顶点的概念。

实现

这里我们想要实现SoA的容器，类似于google/filament中的SoA模板容器filament/StructureOfArrays.h at main · google/filament (github.com)，模板参数为属性。这里大部分接口都是类似的，具体实现因为数据存储方式不同而有所区别。

在google/filament中，数据存在raw buffer中，并由Allocator进行分配容量，访问接口时，利用类型大小和位置计算偏移，并进行类型转换。这里为了利用 C++17 的 PMR 特性，我们尝试使用标准库的容器进行实现。

SoA一个大致的样子如下

cpp

template <typename... Types>
class SoA {
/* Some things */
};

`std::tuple`

在实现时的第一个问题是，我们如何根据Types构建出struct。这里我们可以用std::tuple来表示struct。std::tuple - cppreference.com中的描述如下：

Class templateis a fixed-size collection of heterogeneous values.

然而std::tuple并不遵循Standar Layout，因此无法完全替代struct。因此我们不能利用指针偏移等方式来处理std::tuple中的元素。一个例子如下

cpp

struct A {
	int x = 1 , y = 2;
};

A a;
auto b = std::make_tuple<int, int>(1, 2);

其内存排布如下:

==========  Struct  ==========
Buffer Address Start: 0x7ffd4dff6018
            0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f
(dec) 00:   1   0   0   0   2   0   0   0  32  97 255  77 253 127   0   0
(hex) 00:  01  00  00  00  02  00  00  00  20  61  ff  4d  fd  7f  00  00


==========  Tuple   ==========
Buffer Address Start: 0x7ffd4dff6018
            0   1   2   3   4   5   6   7   8   9   a   b   c   d   e   f
(dec) 00:   2   0   0   0   1   0   0   0  32  97 255  77 253 127   0   0
(hex) 00:  02  00  00  00  01  00  00  00  20  61  ff  4d  fd  7f  00  00

可以看到，在 Clang 14下，A的成员顺序按照声明顺序排布，而在std::tuple中则是逆序排序（完整代码）。

数据结构

那么内部的Array就可以用标准库中的std::vector当然，这里为了配合 PMR ，我们使用std::pmr::vector，因此一个没有任何接口的SoA容器就完成了。关于 PMR 的知识可参考，和

cpp

template <typename... Types>
class SoA {
public:
    using allocator_type = std::pmr::polymorphic_allocator<std::byte>;

    explicit SoA(allocator_type allocator = {})
        : m_Data{std::allocator_arg, allocator},
          m_Allocator(allocator) {}

    allocator_type get_allocator() const noexcept { return m_Allocator; }

private:
    std::tuple<std::pmr::vector<Types>...> m_Data;
    std::size_t                            m_Size = 0;

    allocator_type m_Allocator;
};

访问

SoA的访问方式一般有：

访问某个 Array
访问某个 Struct
Iterator访问

第一个是使用SoA的主要目的，而其余的则是按照AoS的访问方式进行，更方便程序员使用。由于使用了std::tuple，这里使用std::get<>就很方便访问对应的数组。这里需要注意的是，std::get<>不应该使用类型作为模板参数，这是由于SoA是允许类型重复的，因此只能使用常量整数来访问。类似filament那样，可以按照如下方式访问

cpp

enum {
    Position = 0,
    Velocity,
};

SoA<vec3f, vec3f> data;
auto position_array = data.elements<Position>();

至于后两种，需要考虑亮点，一是放回值是包含所有属性的单个元素，二是限定符。因此我们需要定义4种类型: 值类型，引用类型，const 引用类型，以及右值类型(转发)

cpp

template <typename... Types>
class SoA {
public:
// ...
    using Structure         = std::tuple<Types...>;
    using StructureRef      = std::tuple<Types&...>;
    using StructureConstRef = std::tuple<const Types&...>;
    using StructureForward  = std::tuple<Types&&...>;
// ...
};

在访问某个struct时，根据SoA自身的修饰符，来限定返回的类型。比如const的成员方法。一个最简单的下标访问函数如下

cpp

template <typename... Types>
class SoA {
public:
// ...
    StructureConstRef operator[](std::size_t i) const noexcept {
        return [&]<std::size_t... I>(std::index_sequence<I...>) {
            return std::tie(std::get<I>(m_Data)[i]...);
        }
        (std::index_sequence_for<Types...>{});
    }
    StructureRef operator[](std::size_t i) noexcept {
        return [&]<std::size_t... I>(std::index_sequence<I...>) {
            return std::tie(std::get<I>(m_Data)[i]...);
        }
        (std::index_sequence_for<Types...>{});
    }
// ...
};

这里用到了一些参数包的技巧，值得注意的是，我们没有使用 C++ Effective中所描述的那种const成员方法和no const成员方法的类型转换技巧来减少重复代码。这是因为StructureConstRef和StructureRef之间的转换对象是std::tuple的模板参数。

之后的访问方法，修改方法就是类似的，当这些都实现完成后。那么Iterator也就自然而然地变得简单了。

完整代码

https://github.com/L-Sun/HitagiEngine/blob/xmake/hitagi/utils/include/hitagi/utils/soa.hpp

局限

由于使用的是std::pmr::vector，根据标准库的vector成倍容量增长策略，当SoA中元素变得足够多时，数据在内存上的移动会导致比较大的性能开销。解决方法是将std::pmr::vector改为其它顺序容器；或则在使用时，人为限定其大小，利用std::prm::list来管理多个SoA达到一定的取舍。

基于tuple的SoA容器实现

SoA和AoS

实现

std::tuple

数据结构

访问

完整代码

局限

参考

`std::tuple`