第3章: Poolアロケータの深層

学習目標

この章を終えると、以下ができるようになります:

Poolアロケータの内部実装を完全に理解する
フリーリストの高度な管理技法を習得する
キャッシュ効率を最大化する実装ができる
実世界のアプリケーションに最適なPoolを設計できる

Poolアロケータの本質

メモリアクセスパターンの分析

プログラムのメモリアクセスには明確なパターンがあります:

パターン1: 短命オブジェクトの大量生成
  例: パーティクル、UIイベント、一時バッファ
  寿命: ミリ秒〜秒

パターン2: 固定サイズの頻繁な割り当て
  例: リンクリストノード、ツリーノード
  寿命: プログラム実行中

パターン3: プーリング可能なリソース
  例: スレッド、データベース接続、ソケット
  寿命: 再利用により永続的

これらのパターンに対して、汎用アロケータは非効率です:

汎用アロケータの問題:

1. システムコールのオーバーヘッド
   malloc/free → カーネルへの遷移（数千サイクル）

2. フラグメンテーション
   様々なサイズの割り当てが混在 → メモリ効率悪化

3. キャッシュミス
   割り当てられた領域が物理的に離れている → CPU性能低下

4. 予測不可能性
   割り当て時間が状況により変動 → リアルタイム性なし

Poolアロケータはこれらを解決します:

Poolアロケータの解決策:

1. 事前割り当て
   起動時にメモリプールを確保 → システムコール最小化

2. 固定サイズ
   同じサイズのみ → フラグメンテーションゼロ

3. 連続配置
   プール内のオブジェクトが物理的に近い → キャッシュヒット率向上

4. O(1)操作
   割り当て・解放が常に一定時間 → リアルタイム保証

標準ライブラリのMemoryPool実装

ソースコード解析

Zigのstd.heap.MemoryPoolを詳しく見ていきます:

const std = @import("std");

// std.heap.MemoryPoolの簡略版
pub fn MemoryPool(comptime Item: type) type {
    return struct {
        const Self = @This();

        const Node = struct {
            next: ?*Node,

            fn data(node: *Node) *Item {
                return @as(*Item, @ptrCast(node));
            }

            fn fromData(item: *Item) *Node {
                return @as(*Node, @ptrCast(item));
            }
        };

        arena: std.heap.ArenaAllocator,
        free_list: ?*Node,

        pub fn init(allocator: std.mem.Allocator) Self {
            return .{
                .arena = std.heap.ArenaAllocator.init(allocator),
                .free_list = null,
            };
        }

        pub fn deinit(self: Self) void {
            self.arena.deinit();
        }

        pub fn create(self: *Self) !*Item {
            const node = if (self.free_list) |n| blk: {
                self.free_list = n.next;
                break :blk n;
            } else try self.createNode();

            return node.data();
        }

        pub fn destroy(self: *Self, item: *Item) void {
            const node = Node.fromData(item);
            node.next = self.free_list;
            self.free_list = node;
        }

        fn createNode(self: *Self) !*Node {
            const ptr = try self.arena.allocator().create(Item);
            return Node.fromData(ptr);
        }
    };
}

重要な設計決定

1. Intrusive vs Non-Intrusive

// Non-Intrusive (メタデータ分離)
const NonIntrusiveNode = struct {
    next: ?*NonIntrusiveNode,
    data: Item,
};

// メモリレイアウト:
// [next: 8B][data: sizeof(Item)]
// オーバーヘッド: 8バイト/アイテム

// Intrusive (メタデータ埋め込み)
const IntrusiveNode = struct {
    // データそのものをnextポインタとして使用
    // 未使用時のみ
};

// メモリレイアウト:
// [data: sizeof(Item)]  ← 使用時
// [next: 8B]            ← 未使用時（最初の8バイトを再利用）
// オーバーヘッド: 0バイト/アイテム

2. ArenaAllocatorとの組み合わせ

// std.heap.MemoryPoolの戦略:
// - 新しいアイテムはArenaAllocatorから確保
// - 削除されたアイテムはフリーリストで管理
// - deinit()でArenaごと一括解放

利点:
  ✓ シンプルな実装
  ✓ メモリリークが起きにくい
  ✓ 個別の解放が不要

欠点:
  ✗ メモリを返却できない（成長のみ）
  ✗ 長時間稼働でメモリ増加の可能性

高度な実装パターン

パターン1: ブロックベースPool（固定容量）

メモリ使用量を厳密に制御する実装:

const std = @import("std");

pub fn FixedCapacityPool(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        const Node = struct {
            next: ?*Node,
            data: T,
        };

        buffer: [capacity]Node,
        free_list: ?*Node,
        allocated_count: usize,

        pub fn init() Self {
            var self = Self{
                .buffer = undefined,
                .free_list = null,
                .allocated_count = 0,
            };

            // 全てのノードをフリーリストに追加
            for (&self.buffer, 0..) |*node, i| {
                node.next = if (i + 1 < capacity) &self.buffer[i + 1] else null;
            }
            self.free_list = &self.buffer[0];

            return self;
        }

        pub fn create(self: *Self) ?*T {
            const node = self.free_list orelse return null;
            self.free_list = node.next;
            self.allocated_count += 1;
            return &node.data;
        }

        pub fn destroy(self: *Self, item: *T) void {
            const node = @as(
                *Node,
                @ptrFromInt(@intFromPtr(item) - @offsetOf(Node, "data")),
            );

            // デバッグ: ダブルフリー検出
            if (std.debug.runtime_safety) {
                var current = self.free_list;
                while (current) |n| : (current = n.next) {
                    if (n == node) {
                        @panic("Double free detected");
                    }
                }
            }

            node.next = self.free_list;
            self.free_list = node;
            self.allocated_count -= 1;
        }

        pub fn available(self: *Self) usize {
            return capacity - self.allocated_count;
        }

        pub fn utilization(self: *Self) f32 {
            return @as(f32, @floatFromInt(self.allocated_count)) /
                   @as(f32, @floatFromInt(capacity));
        }
    };
}

// 使用例
pub fn example() !void {
    const Point = struct { x: f32, y: f32 };
    var pool = FixedCapacityPool(Point, 100).init();

    // プールから取得
    const p1 = pool.create() orelse return error.OutOfMemory;
    p1.* = .{ .x = 10, .y = 20 };

    const p2 = pool.create() orelse return error.OutOfMemory;
    p2.* = .{ .x = 30, .y = 40 };

    std.debug.print("Utilization: {d:.1}%\n", .{pool.utilization() * 100});
    std.debug.print("Available: {}\n", .{pool.available()});

    // 解放
    pool.destroy(p1);
    pool.destroy(p2);
}

利点:

ヒープ割り当て不要（スタックやグローバル領域に配置可能）
メモリ使用量が予測可能
組み込みシステムに最適

パターン2: サイズクラスベースPool（Slab Allocator）

複数のサイズをサポートする高度な実装:

const std = @import("std");

pub const SlabAllocator = struct {
    const SIZE_CLASSES = [_]usize{
        16, 32, 64, 128, 256, 512, 1024, 2048, 4096,
    };

    const Slab = struct {
        const Node = struct {
            next: ?*Node,
        };

        block: []u8,
        item_size: usize,
        capacity: usize,
        free_list: ?*Node,
        allocated: usize,

        fn init(allocator: std.mem.Allocator, item_size: usize, capacity: usize) !Slab {
            const block = try allocator.alloc(u8, item_size * capacity);

            var slab = Slab{
                .block = block,
                .item_size = item_size,
                .capacity = capacity,
                .free_list = null,
                .allocated = 0,
            };

            // フリーリストを構築
            var i: usize = 0;
            while (i < capacity) : (i += 1) {
                const offset = i * item_size;
                const node = @as(*Node, @ptrCast(@alignCast(&block[offset])));
                node.next = slab.free_list;
                slab.free_list = node;
            }

            return slab;
        }

        fn deinit(self: *Slab, allocator: std.mem.Allocator) void {
            allocator.free(self.block);
        }

        fn alloc(self: *Slab) ?[*]u8 {
            const node = self.free_list orelse return null;
            self.free_list = node.next;
            self.allocated += 1;
            return @as([*]u8, @ptrCast(node));
        }

        fn free(self: *Slab, ptr: [*]u8) void {
            const node = @as(*Node, @ptrCast(@alignCast(ptr)));
            node.next = self.free_list;
            self.free_list = node;
            self.allocated -= 1;
        }
    };

    parent_allocator: std.mem.Allocator,
    slabs: [SIZE_CLASSES.len]std.ArrayList(Slab),

    pub fn init(allocator: std.mem.Allocator) SlabAllocator {
        var self: SlabAllocator = .{
            .parent_allocator = allocator,
            .slabs = undefined,
        };

        for (&self.slabs) |*slab_list| {
            slab_list.* = std.ArrayList(Slab).init(allocator);
        }

        return self;
    }

    pub fn deinit(self: *SlabAllocator) void {
        for (&self.slabs) |*slab_list| {
            for (slab_list.items) |*slab| {
                slab.deinit(self.parent_allocator);
            }
            slab_list.deinit();
        }
    }

    fn sizeClassIndex(size: usize) ?usize {
        for (SIZE_CLASSES, 0..) |class_size, i| {
            if (size <= class_size) return i;
        }
        return null;
    }

    pub fn allocator(self: *SlabAllocator) std.mem.Allocator {
        return .{
            .ptr = self,
            .vtable = &.{
                .alloc = alloc,
                .resize = resize,
                .free = free,
            },
        };
    }

    fn alloc(
        ctx: *anyopaque,
        len: usize,
        ptr_align: u8,
        ret_addr: usize,
    ) ?[*]u8 {
        _ = ptr_align;
        _ = ret_addr;
        const self = @as(*SlabAllocator, @ptrCast(@alignCast(ctx)));

        // サイズクラスを決定
        const class_idx = sizeClassIndex(len) orelse {
            // 大きすぎる場合は親アロケータに委譲
            const buf = self.parent_allocator.alloc(u8, len) catch return null;
            return buf.ptr;
        };

        const class_size = SIZE_CLASSES[class_idx];
        var slab_list = &self.slabs[class_idx];

        // 既存のスラブから割り当てを試みる
        for (slab_list.items) |*slab| {
            if (slab.alloc()) |ptr| {
                return ptr;
            }
        }

        // 新しいスラブを作成
        const new_slab = Slab.init(
            self.parent_allocator,
            class_size,
            64,  // 64アイテム/スラブ
        ) catch return null;

        slab_list.append(new_slab) catch return null;

        const last_slab = &slab_list.items[slab_list.items.len - 1];
        return last_slab.alloc();
    }

    fn resize(
        ctx: *anyopaque,
        buf: []u8,
        buf_align: u8,
        new_len: usize,
        ret_addr: usize,
    ) bool {
        _ = ctx;
        _ = buf;
        _ = buf_align;
        _ = new_len;
        _ = ret_addr;
        return false;
    }

    fn free(
        ctx: *anyopaque,
        buf: []u8,
        buf_align: u8,
        ret_addr: usize,
    ) void {
        _ = buf_align;
        _ = ret_addr;
        const self = @as(*SlabAllocator, @ptrCast(@alignCast(ctx)));

        const class_idx = sizeClassIndex(buf.len) orelse {
            // 大きすぎる場合は親アロケータに委譲
            self.parent_allocator.free(buf);
            return;
        };

        const slab_list = &self.slabs[class_idx];

        // どのスラブに属するか検索
        for (slab_list.items) |*slab| {
            const block_start = @intFromPtr(slab.block.ptr);
            const block_end = block_start + slab.block.len;
            const ptr_addr = @intFromPtr(buf.ptr);

            if (ptr_addr >= block_start and ptr_addr < block_end) {
                slab.free(buf.ptr);
                return;
            }
        }
    }

    /// 統計情報
    pub fn dumpStats(self: *SlabAllocator) void {
        std.debug.print("=== Slab Allocator Statistics ===\n", .{});
        for (SIZE_CLASSES, 0..) |size, i| {
            const slab_list = &self.slabs[i];
            if (slab_list.items.len == 0) continue;

            var total_capacity: usize = 0;
            var total_allocated: usize = 0;

            for (slab_list.items) |*slab| {
                total_capacity += slab.capacity;
                total_allocated += slab.allocated;
            }

            const utilization = if (total_capacity > 0)
                @as(f32, @floatFromInt(total_allocated)) / @as(f32, @floatFromInt(total_capacity)) * 100.0
            else
                0.0;

            std.debug.print("  Size {: >4}: {} slabs, {}/{} items ({d:.1}%)\n", .{
                size,
                slab_list.items.len,
                total_allocated,
                total_capacity,
                utilization,
            });
        }
    }
};

// 使用例
pub fn demonstrateSlab() !void {
    var slab = SlabAllocator.init(std.heap.page_allocator);
    defer slab.deinit();

    const allocator = slab.allocator();

    // 様々なサイズの割り当て
    var allocations = std.ArrayList([]u8).init(std.heap.page_allocator);
    defer allocations.deinit();

    // 小さい割り当て
    for (0..100) |_| {
        const buf = try allocator.alloc(u8, 20);
        try allocations.append(buf);
    }

    // 中程度の割り当て
    for (0..50) |_| {
        const buf = try allocator.alloc(u8, 200);
        try allocations.append(buf);
    }

    // 大きい割り当て
    for (0..10) |_| {
        const buf = try allocator.alloc(u8, 2000);
        try allocations.append(buf);
    }

    slab.dumpStats();

    // 解放
    for (allocations.items) |buf| {
        allocator.free(buf);
    }
}

パターン3: 世代別Pool（Generational Pool）

世代管理によるuse-after-freeの検出:

const std = @import("std");

pub fn GenerationalPool(comptime T: type, comptime capacity: usize) type {
    return struct {
        const Self = @This();

        const Handle = struct {
            index: u32,
            generation: u32,
        };

        const Slot = struct {
            generation: u32,
            data: T,
            occupied: bool,
        };

        slots: [capacity]Slot,
        free_list: std.ArrayList(u32),
        allocator: std.mem.Allocator,

        pub fn init(allocator: std.mem.Allocator) !Self {
            var self = Self{
                .slots = undefined,
                .free_list = std.ArrayList(u32).init(allocator),
                .allocator = allocator,
            };

            // 初期化
            for (&self.slots, 0..) |*slot, i| {
                slot.* = .{
                    .generation = 0,
                    .data = undefined,
                    .occupied = false,
                };
                try self.free_list.append(@intCast(i));
            }

            return self;
        }

        pub fn deinit(self: *Self) void {
            self.free_list.deinit();
        }

        pub fn create(self: *Self) !Handle {
            const index = self.free_list.popOrNull() orelse
                return error.OutOfMemory;

            const slot = &self.slots[index];
            slot.occupied = true;

            return .{
                .index = index,
                .generation = slot.generation,
            };
        }

        pub fn get(self: *Self, handle: Handle) ?*T {
            if (handle.index >= capacity) return null;

            const slot = &self.slots[handle.index];

            // 世代チェック
            if (slot.generation != handle.generation) return null;
            if (!slot.occupied) return null;

            return &slot.data;
        }

        pub fn destroy(self: *Self, handle: Handle) !void {
            if (handle.index >= capacity) return error.InvalidHandle;

            const slot = &self.slots[handle.index];

            // 世代チェック
            if (slot.generation != handle.generation) {
                return error.InvalidHandle;
            }

            if (!slot.occupied) {
                return error.DoubleFree;
            }

            // 世代を進める
            slot.generation +%= 1;
            slot.occupied = false;

            try self.free_list.append(handle.index);
        }
    };
}

// 使用例
pub fn demonstrateGenerational() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    const Entity = struct {
        name: []const u8,
        health: f32,
    };

    var pool = try GenerationalPool(Entity, 100).init(gpa.allocator());
    defer pool.deinit();

    // エンティティ作成
    const handle1 = try pool.create();
    const entity1 = pool.get(handle1).?;
    entity1.* = .{ .name = "Player", .health = 100 };

    // エンティティ削除
    try pool.destroy(handle1);

    // 古いハンドルでアクセス（エラー）
    const entity_invalid = pool.get(handle1);
    try std.testing.expect(entity_invalid == null);

    std.debug.print("Generational pool prevents use-after-free!\n", .{});
}

キャッシュ最適化

データ指向設計（Data-Oriented Design）

const std = @import("std");

// 従来のOOP（キャッシュ非効率）
const TraditionalEntity = struct {
    id: u32,
    position: [3]f32,
    velocity: [3]f32,
    health: f32,
    name: []const u8,
    // ... 多くのフィールド
};

// データ指向設計（キャッシュ効率的）
const EntityPool = struct {
    const MAX_ENTITIES = 10000;

    // SoA (Structure of Arrays)
    positions: [MAX_ENTITIES][3]f32,
    velocities: [MAX_ENTITIES][3]f32,
    healths: [MAX_ENTITIES]f32,
    count: usize,

    pub fn init() EntityPool {
        return .{
            .positions = undefined,
            .velocities = undefined,
            .healths = undefined,
            .count = 0,
        };
    }

    pub fn add(self: *EntityPool, pos: [3]f32, vel: [3]f32, health: f32) !usize {
        if (self.count >= MAX_ENTITIES) return error.OutOfMemory;

        const index = self.count;
        self.positions[index] = pos;
        self.velocities[index] = vel;
        self.healths[index] = health;
        self.count += 1;

        return index;
    }

    pub fn updatePositions(self: *EntityPool, dt: f32) void {
        // SIMDフレンドリー: 連続メモリアクセス
        for (0..self.count) |i| {
            self.positions[i][0] += self.velocities[i][0] * dt;
            self.positions[i][1] += self.velocities[i][1] * dt;
            self.positions[i][2] += self.velocities[i][2] * dt;
        }
    }
};

// ベンチマーク
pub fn benchmarkCacheEfficiency() !void {
    var traditional_entities: [10000]TraditionalEntity = undefined;
    var entity_pool = EntityPool.init();

    // 初期化
    for (&traditional_entities, 0..) |*e, i| {
        e.* = .{
            .id = @intCast(i),
            .position = .{ 0, 0, 0 },
            .velocity = .{ 1, 1, 1 },
            .health = 100,
            .name = "Entity",
        };
    }

    for (0..10000) |_| {
        _ = try entity_pool.add(.{ 0, 0, 0 }, .{ 1, 1, 1 }, 100);
    }

    // ベンチマーク: 従来
    {
        var timer = try std.time.Timer.start();
        for (0..1000) |_| {
            for (&traditional_entities) |*e| {
                e.position[0] += e.velocity[0] * 0.016;
                e.position[1] += e.velocity[1] * 0.016;
                e.position[2] += e.velocity[2] * 0.016;
            }
        }
        const elapsed = timer.read();
        std.debug.print("Traditional (AoS): {} ns\n", .{elapsed});
    }

    // ベンチマーク: SoA
    {
        var timer = try std.time.Timer.start();
        for (0..1000) |_| {
            entity_pool.updatePositions(0.016);
        }
        const elapsed = timer.read();
        std.debug.print("Data-Oriented (SoA): {} ns\n", .{elapsed});
    }
}

// 典型的な結果:
// Traditional (AoS):     15,000,000 ns
// Data-Oriented (SoA):    3,000,000 ns
// Speedup: 5x

実世界のケーススタディ

ケース1: Unity ECS (Entity Component System)

Unityの最新ECSアーキテクチャはPoolパターンを活用:

const std = @import("std");

// Unity ECS風の実装
pub const ECSWorld = struct {
    const MAX_ENTITIES = 100000;
    const MAX_COMPONENTS_PER_TYPE = 100000;

    // Component Pools
    const PositionPool = FixedCapacityPool(Position, MAX_COMPONENTS_PER_TYPE);
    const VelocityPool = FixedCapacityPool(Velocity, MAX_COMPONENTS_PER_TYPE);
    const HealthPool = FixedCapacityPool(Health, MAX_COMPONENTS_PER_TYPE);

    position_pool: PositionPool,
    velocity_pool: VelocityPool,
    health_pool: HealthPool,

    // Sparse Set for Entity-Component mapping
    entities: [MAX_ENTITIES]?Entity,
    entity_count: usize,

    pub fn init() ECSWorld {
        return .{
            .position_pool = PositionPool.init(),
            .velocity_pool = VelocityPool.init(),
            .health_pool = HealthPool.init(),
            .entities = [_]?Entity{null} ** MAX_ENTITIES,
            .entity_count = 0,
        };
    }

    pub fn createEntity(self: *ECSWorld) !u32 {
        if (self.entity_count >= MAX_ENTITIES) return error.OutOfEntities;

        const id = @as(u32, @intCast(self.entity_count));
        self.entities[id] = Entity{ .id = id };
        self.entity_count += 1;

        return id;
    }

    pub fn addPosition(self: *ECSWorld, entity_id: u32, pos: Position) !void {
        const position = self.position_pool.create() orelse
            return error.OutOfMemory;
        position.* = pos;

        var entity = &self.entities[entity_id].?;
        entity.position = position;
    }

    // Systems
    pub fn movementSystem(self: *ECSWorld, dt: f32) void {
        // 位置と速度の両方を持つエンティティのみ処理
        for (self.entities[0..self.entity_count]) |maybe_entity| {
            const entity = maybe_entity orelse continue;
            const pos = entity.position orelse continue;
            const vel = entity.velocity orelse continue;

            pos.x += vel.x * dt;
            pos.y += vel.y * dt;
            pos.z += vel.z * dt;
        }
    }
};

const Entity = struct {
    id: u32,
    position: ?*Position = null,
    velocity: ?*Velocity = null,
    health: ?*Health = null,
};

const Position = struct { x: f32, y: f32, z: f32 };
const Velocity = struct { x: f32, y: f32, z: f32 };
const Health = struct { current: f32, max: f32 };

ケース2: Linuxカーネル - Slab Allocator

Linuxカーネルは1990年代からSlab Allocatorを使用:

Linuxカーネルのメモリ管理:

1. ページアロケータ（低レベル）
   4KB単位でメモリを管理

2. Slab Allocator（高レベル）
   頻繁に使われるオブジェクト用のキャッシュ
   - task_struct (プロセス記述子)
   - inode (ファイルシステムノード)
   - dentry (ディレクトリエントリ)
   - socket buffers

利点:
  ✓ フラグメンテーション最小化
  ✓ 初期化済みオブジェクトの再利用
  ✓ キャッシュカラーリング（CPUキャッシュ最適化）

性能:
  - 1秒間に数百万のオブジェクト割り当て可能
  - 割り当て時間: 数十ナノ秒

ケース3: Nginx - メモリプール

高性能WebサーバーNginxのメモリ管理:

Nginxのプール戦略:

1. コネクションプール
   - クライアント接続ごとに専用プール
   - リクエスト終了時に一括解放

2. リクエストプール
   - HTTPリクエストのパース結果
   - ヘッダー、ボディ、URIなど

3. バッファプール
   - 読み書きバッファ
   - サイズクラス別に管理

性能:
  - C10K問題の解決（10,000同時接続）
  - メモリ効率: 1接続あたり約1-2KB
  - レイテンシ: ミリ秒以下

まとめ

Poolアロケータの深層を学びました:

本質: メモリアクセスパターンの最適化
実装技法: フリーリスト、Slab、世代管理
キャッシュ最適化: データ指向設計、SoA
実世界の応用: Unity ECS、Linux、Nginx

Poolアロケータは、固定サイズのオブジェクトを大量に扱うシステムで極めて効果的です。次の章では、これまで学んだ全てのアロケータパターンを統合した実践的なメモリ管理システムの設計を学びます。

参考文献

Zig Standard Library - MemoryPool: https://github.com/ziglang/zig/blob/master/lib/std/heap/memory_pool.zig
Linux Kernel - Slab Allocator: https://www.kernel.org/doc/gorman/html/understand/understand011.html
Data-Oriented Design: https://www.dataorienteddesign.com/dodbook/
Game Programming Patterns: https://gameprogrammingpatterns.com/
Unity ECS: https://docs.unity3d.com/Packages/com.unity.entities@latest