Netty Memory Pool Layered Design Architecture - EW帮帮网

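Netty's memory pool borrows its design from jemalloc. A thread-local cache (PoolThreadCache), memory arenas (PoolArena), memory chunks (PoolChunk), subpages (PoolSubpage), and chunk lists (PoolChunkList) work together to provide fine-grained, efficient memory management.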

Overall architecture diagram

PoolArena (memory management domain)
├── PoolChunkList[] (chunks grouped by usage)
│   ├── PoolChunkList(0-25%)
│   ├── PoolChunkList(1-50%)
│   ├── PoolChunkList(25-75%)
│   ├── PoolChunkList(50-100%)
│   ├── PoolChunkList(75-100%)
│   └── PoolChunkList(100%)
│       └── PoolChunk[] (16MB memory chunks)
│           ├── runs[] (larger allocations)
│           └── PoolSubpage[] (small allocations)
└── PoolThreadCache (thread-local cache)
    ├── SmallSubPageCaches[]
    └── NormalCaches[]

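The four core components (PoolThreadCache, PoolArena, PoolChunkList, PoolChunk, plus PoolSubpage, which is derived from PoolChunk) cooperate closely and form a complete three-level memory management hierarchy:

PoolThreadCache: the top layer. It provides a fast thread-local cache, is key to performance, and greatly reduces lock contention.
PoolArena: the coordinator in the middle. It executes the allocation strategy and coordinates PoolChunkList, PoolChunk, and PoolSubpage.
PoolChunkList: assists PoolArena by grouping PoolChunk instances by usage, which improves chunk selection and reuse.
PoolChunk and PoolSubpage: the bottom layer, which manages the backing memory directly. PoolChunk allocates larger blocks as runs of contiguous pages (Netty versions before the run-based rewrite used a buddy-style binary tree instead), and PoolSubpage subdivides memory taken from a PoolChunk for fine-grained small allocations.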

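Design highlights

Layer isolation:
- The three-level structure PoolThreadCache (thread level) -> PoolArena (arena level) -> PoolChunk / PoolSubpage (chunk/subpage level) balances global management against local efficiency.
Multi-level caching strategy:
- Thread level: PoolThreadCache avoids lock contention.
- Arena level: PoolChunkList groups PoolChunk instances by usage.
- Chunk level: two allocation paths, run (for larger requests) and subpage (for smaller requests).
Fragmentation control:
- Usage grouping (PoolChunkList): skips chunks that are already nearly full and prefers a chunk that fits.
- Buddy-style splitting and merging (PoolChunk): splits and coalesces memory blocks to reduce external fragmentation. In the run-based PoolChunk listed later, this means splitting and merging runs of pages rather than walking a binary buddy tree.
- Coalescing: when memory is freed, adjacent free runs are merged automatically.
- Offset ordering: in PoolSubpage, lower-offset elements are used first, which helps reduce fragmentation.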

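Performance optimizations:

Lock avoidance (PoolThreadCache): thread-local storage keeps the common path free of cross-thread contention.
Compact bit-packed encoding: a long handle packs allocation metadata (such as the offset within the chunk and the subpage bitmap index), which lowers memory overhead and makes encoding and decoding cheap.
Priority queues: used (runsAvail) to find a run of a suitable size quickly.

Through this layered design and these optimizations, Netty achieves a high-performance, low-fragmentation, thread-safe memory pool. Under highly concurrent network workloads it keeps allocation latency low and memory utilization high, which removes the bottleneck that frequent allocation would otherwise cause.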

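PoolThreadCache - thread-local cache

For a more detailed analysis, see:

PoolThreadCache class structure and source implementation (CSDN blog)

Netty's high-performance thread-local storage mechanism: FastThreadLocal (CSDN blog)

PoolThreadCache is a per-thread private cache that holds a small number of memory regions recently released by PooledByteBuf instances. When the thread asks for the same size again, the region comes straight from the cache, which avoids both the cost of going to PoolArena and potential lock contention.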

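Structure:

Holds references to a PoolArena<byte[]> (heap arena) and a PoolArena<ByteBuffer> (direct arena).
Maintains MemoryRegionCache arrays per size class: smallSubPageHeapCaches, smallSubPageDirectCaches, normalHeapCaches, and normalDirectCaches, one for each combination of size (Small/Normal) and memory type (Heap/Direct).
MemoryRegionCache stores cached regions in a queue (Queue<Entry<T>>); each Entry records the PoolChunk, the handle, and related fields.
freeSweepAllocationThreshold: once the number of allocations reaches this threshold, trim() runs and evicts cache entries that are not being used.
allocations: the allocation count of the current thread's cache, compared against the threshold.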

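Interaction:

An allocation first tries PoolThreadCache.
On a cache miss, the request goes to PoolArena.
When a PooledByteBuf is released, its memory is offered to PoolThreadCache if the size and type are cacheable and the corresponding cache is not full.

Optimizations:

Pre-allocation: regions already held by the thread avoid frequent contention on the arena
Size tiers: exact size-class matching reduces wasted memory
free() prefers the thread-local cache.
Once the allocation count reaches freeSweepAllocationThreshold, trim() releases unused cache entries back to the arena.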
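The cache behavior described above can be sketched in a few lines. This is a simplified model and not Netty's MemoryRegionCache: the class name ThreadCacheSketch, the use of ArrayDeque, and the representation of a cached region as a bare long handle are assumptions made for illustration. It shows a bounded queue that accepts freed handles until it is full, serves allocations from the queue, and evicts entries that went unused once the number of allocation attempts reaches a threshold.

```java
import java.util.ArrayDeque;

// Hypothetical, simplified model of one per-thread cache queue (not Netty's class).
final class ThreadCacheSketch {
    private final ArrayDeque<Long> queue = new ArrayDeque<>();
    private final int capacity;        // maximum number of cached handles
    private final int sweepThreshold;  // plays the role of freeSweepAllocationThreshold
    private int attempts;              // allocation attempts since the last trim
    private int hits;                  // attempts served from the cache since the last trim

    ThreadCacheSketch(int capacity, int sweepThreshold) {
        this.capacity = capacity;
        this.sweepThreshold = sweepThreshold;
    }

    /** Offers a freed handle; returns false when the cache is full (the caller frees to the arena). */
    boolean add(long handle) {
        if (queue.size() >= capacity) {
            return false;
        }
        queue.addLast(handle);
        return true;
    }

    /** Returns a cached handle, or -1 on a miss (the caller falls back to the arena). */
    long allocate() {
        Long handle = queue.pollFirst();
        if (handle != null) {
            hits++;
        }
        if (++attempts >= sweepThreshold) {
            trim();
        }
        return handle == null ? -1L : handle;
    }

    /** Evicts up to (capacity - hits) entries, i.e. the slots that were not used recently. */
    int trim() {
        int toFree = capacity - hits;
        attempts = 0;
        hits = 0;
        int freed = 0;
        while (freed < toFree && queue.pollFirst() != null) {
            freed++;
        }
        return freed;
    }

    int size() {
        return queue.size();
    }
}
```

A cache that is hit on every attempt keeps its entries; one that sits idle while the thread allocates other sizes is gradually emptied, which is the trade-off the threshold controls.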

// ... existing code ...
final class PoolThreadCache {

    private static final InternalLogger logger = InternalLoggerFactory.getInstance(PoolThreadCache.class);
    private static final int INTEGER_SIZE_MINUS_ONE = Integer.SIZE - 1;

    final PoolArena<byte[]> heapArena;
    final PoolArena<ByteBuffer> directArena;

    // Hold the caches for the different size classes, which are small and normal.
    private final MemoryRegionCache<byte[]>[] smallSubPageHeapCaches;
    private final MemoryRegionCache<ByteBuffer>[] smallSubPageDirectCaches;
    private final MemoryRegionCache<byte[]>[] normalHeapCaches;
    private final MemoryRegionCache<ByteBuffer>[] normalDirectCaches;

    private final int freeSweepAllocationThreshold;
    private final AtomicBoolean freed = new AtomicBoolean();
    @SuppressWarnings("unused") // Field is only here for the finalizer.
    private final FreeOnFinalize freeOnFinalize;

    private int allocations;

    // TODO: Test if adding padding helps under contention
    //private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;

    PoolThreadCache(PoolArena<byte[]> heapArena, PoolArena<ByteBuffer> directArena,
                    int smallCacheSize, int normalCacheSize, int maxCachedBufferCapacity,
                    int freeSweepAllocationThreshold, boolean useFinalizer) {
        checkPositiveOrZero(maxCachedBufferCapacity, "maxCachedBufferCapacity");
        this.freeSweepAllocationThreshold = freeSweepAllocationThreshold;
        this.heapArena = heapArena;
        this.directArena = directArena;
        if (directArena != null) {
            smallSubPageDirectCaches = createSubPageCaches(smallCacheSize, directArena.sizeClass.nSubpages);
            normalDirectCaches = createNormalCaches(normalCacheSize, maxCachedBufferCapacity, directArena);
            directArena.numThreadCaches.getAndIncrement();
        } else {
            // No directArea is configured so just null out all caches
            smallSubPageDirectCaches = null;
            normalDirectCaches = null;
        }
        if (heapArena != null) {
            // Create the caches for the heap allocations
            smallSubPageHeapCaches = createSubPageCaches(smallCacheSize, heapArena.sizeClass.nSubpages);
            normalHeapCaches = createNormalCaches(normalCacheSize, maxCachedBufferCapacity, heapArena);
            heapArena.numThreadCaches.getAndIncrement();
        } else {
            // No heapArea is configured so just null out all caches
            smallSubPageHeapCaches = null;
            normalHeapCaches = null;
        }

        // Only check if there are caches in use.
        if ((smallSubPageDirectCaches != null || normalDirectCaches != null
                || smallSubPageHeapCaches != null || normalHeapCaches != null)
                && freeSweepAllocationThreshold < 1) {
            throw new IllegalArgumentException("freeSweepAllocationThreshold: "
                    + freeSweepAllocationThreshold + " (expected: > 0)");
        }
        freeOnFinalize = useFinalizer ? new FreeOnFinalize(this) : null;
    }

// ... existing code ...
    private abstract static class MemoryRegionCache<T> {
        private final int size;
        private final Queue<Entry<T>> queue;
        private final SizeClass sizeClass;
        private int allocations;

        MemoryRegionCache(int size, SizeClass sizeClass) {
            this.size = MathUtil.safeFindNextPositivePowerOfTwo(size);
            queue = PlatformDependent.newFixedMpscUnpaddedQueue(this.size);
            this.sizeClass = sizeClass;
        }

// ... existing code ...
        static final class Entry<T> {
            final EnhancedHandle<Entry<?>> recyclerHandle;
            PoolChunk<T> chunk;
            ByteBuffer nioBuffer;
            long handle = -1;
            int normCapacity;

            Entry(Handle<Entry<?>> recyclerHandle) {
                this.recyclerHandle = (EnhancedHandle<Entry<?>>) recyclerHandle;
            }

// ... existing code ...
        }

// ... existing code ...
    }

// ... existing code ...
}

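PoolArena - memory management domain

For a more detailed analysis, see: PoolArena source walkthrough, the core of the Netty memory pool (CSDN blog)

Responsibility: the entry point for allocation; the top-level manager that coordinates all allocation strategies

PoolArena is the core of memory allocation. It manages a set of PoolChunk instances and allocates from them. There are usually several PoolArena instances (for example, on the order of the number of CPU cores) to reduce contention between threads. PooledByteBufAllocator binds each thread to the least-used PoolArena, meaning the one with the fewest attached thread caches (see numThreadCaches below).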
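Arena selection can be illustrated with a small sketch. The names ArenaSelectionSketch, leastUsed, and bind are hypothetical, and each arena is reduced to its numThreadCaches counter; the assumption is that "least used" means the fewest attached thread caches, with the first arena winning ties.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: each arena is represented only by its thread-cache counter.
final class ArenaSelectionSketch {

    /** Returns the index of the arena with the fewest attached thread caches, or -1 if there are none. */
    static int leastUsed(AtomicInteger[] numThreadCaches) {
        if (numThreadCaches.length == 0) {
            return -1;
        }
        int best = 0;
        for (int i = 1; i < numThreadCaches.length; i++) {
            if (numThreadCaches[i].get() < numThreadCaches[best].get()) {
                best = i;
            }
        }
        return best;
    }

    /** Binds a new thread cache to the least-used arena and bumps that arena's counter. */
    static int bind(AtomicInteger[] numThreadCaches) {
        int idx = leastUsed(numThreadCaches);
        if (idx >= 0) {
            numThreadCaches[idx].getAndIncrement();
        }
        return idx;
    }
}
```

Because the counter is bumped on every binding, successive threads spread across the arenas instead of piling onto one.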

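Structure:

parent: the owning PooledByteBufAllocator.
sizeClass: the size-class definitions (SizeClasses).
smallSubpagePools: an array of PoolSubpage, one entry per Small size class. Each element is the head node of a PoolSubpage linked list.
Six PoolChunkList lists (qInit/q000/q025/q050/q075/q100) that group chunks by usage.
numThreadCaches: an atomic counter of how many PoolThreadCache instances use this arena.
allocationsNormal, allocationsSmall, allocationsHuge: allocation counts per category.
DirectArena and HeapArena are the two concrete implementations, for direct memory and heap memory respectively.

Interaction:

On a cache miss, PoolThreadCache asks its associated PoolArena for memory.
PoolArena picks a PoolChunk from a suitable PoolChunkList according to the requested size.
If no existing PoolChunk can satisfy the request, PoolArena creates a new one.
It manages the lifecycle of PoolSubpage, which serves Small allocations.

Allocation strategy:

Small requests (< 8KB): handed to PoolSubpage.
Normal requests (8KB-16MB): served from a suitable chunk found in the PoolChunkList lists.
Huge requests (> 16MB): allocated directly as unpooled memory.
Thread safety: a ReentrantLock serializes concurrent allocations.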

Key fields:

// ... existing code ...
abstract class PoolArena<T> implements PoolArenaMetric {
    private static final boolean HAS_UNSAFE = PlatformDependent.hasUnsafe();

    enum SizeClass {
        Small,
        Normal
    }

    final PooledByteBufAllocator parent;

    final PoolSubpage<T>[] smallSubpagePools;

    private final PoolChunkList<T> q050;
    private final PoolChunkList<T> q025;
    private final PoolChunkList<T> q000;
    private final PoolChunkList<T> qInit;
    private final PoolChunkList<T> q075;
    private final PoolChunkList<T> q100;

    private final List<PoolChunkListMetric> chunkListMetrics;

    // Metrics for allocations and deallocations
    private long allocationsNormal;
    // We need to use the LongCounter here as this is not guarded via synchronized block.
    private final LongAdder allocationsSmall = new LongAdder();
    private final LongAdder allocationsHuge = new LongAdder();
// ... existing code ...
    protected PoolArena(PooledByteBufAllocator parent, SizeClasses sizeClass) {
        assert null != sizeClass;
        this.parent = parent;
        this.sizeClass = sizeClass;
        smallSubpagePools = newSubpagePoolArray(sizeClass.nSubpages);
        for (int i = 0; i < smallSubpagePools.length; i ++) {
            smallSubpagePools[i] = newSubpagePoolHead(i);
        }

// ... existing code ...
    }
// ... existing code ...
    static final class DirectArena extends PoolArena<ByteBuffer> {

        DirectArena(PooledByteBufAllocator parent, SizeClasses sizeClass) {
            super(parent, sizeClass);
        }

        @Override
        boolean isDirect() {
// ... existing code ...
    }

// ... existing code ...
    static final class HeapArena extends PoolArena<byte[]> {
        private final AtomicReference<PoolChunk<byte[]>> lastDestroyedChunk;


        HeapArena(PooledByteBufAllocator parent, SizeClasses sizeClass) {
            super(parent, sizeClass);
            lastDestroyedChunk = new AtomicReference<>();
        }

        private static byte[] newByteArray(int size) {
// ... existing code ...
    }
// ... existing code ...
}

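Allocation strategy:

Small allocation (< 8KB): served from PoolThreadCache first, otherwise from smallSubpagePools
Normal allocation (8KB - 16MB): served from PoolThreadCache first, otherwise from the PoolChunkList lists, tried in the order q050, q025, q000, qInit, q075
Huge allocation (> 16MB): creates an unpooled chunk directly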

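PoolChunk - 16MB memory chunk

For a more detailed analysis, see: PoolChunk in depth, the core of the Netty memory pool (CSDN blog)

Memory organization: precise bookkeeping through handle encoding

PoolChunk represents a large contiguous block of memory (16MB by default). It is the basic unit from which memory is allocated, and it is divided into pages (8KB by default).

Structure:

arena: the owning PoolArena.
memory: the actual memory (byte[] or ByteBuffer).
unpooled: whether this is an unpooled chunk.
subpages: an array of PoolSubpage for subpage allocations inside this chunk, indexed by page index.
pageSize: the page size.
pageShifts: log2 of the page size.
chunkSize: the total size of the chunk.
freeBytes: the number of free bytes left in the chunk.
runsAvailMap, runsAvail: track and look up free runs of contiguous pages. runsAvailMap is a LongLongHashMap from run offset to handle (runOffset --> handle); runsAvail is an array of IntPriorityQueue organized by run size.
handle: a long that encodes an allocation: run offset, size in pages, whether it is in use, whether it is a subpage, and so on.

Interaction:

Created and managed by PoolArena.
Performs the actual allocation, both Normal (one or more contiguous pages) and Small (through PoolSubpage).
As allocations and frees change a PoolChunk's usage, it moves between the PoolChunkList lists of its PoolArena.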

Handle encoding:

// 64-bit handle layout (most significant bit first)
// oooooooo ooooooos ssssssss ssssssue bbbbbbbb bbbbbbbb bbbbbbbb bbbbbbbb
// o: runOffset (15 bits) | s: pages (15 bits) | u: isUsed (1 bit)
// e: isSubpage (1 bit) | b: bitmapIdx (32 bits)
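To make the layout concrete, here is a minimal sketch that packs and unpacks the five fields with shifts and masks. The shift values follow the bit widths listed above (32 + 1 + 1 + 15 + 15 = 64); the class name HandleSketch and the method names are hypothetical and are not claimed to match PoolChunk's private helpers.

```java
// Hypothetical sketch of the 64-bit handle layout: runOffset | pages | isUsed | isSubpage | bitmapIdx.
final class HandleSketch {
    static final int IS_SUBPAGE_SHIFT = 32;  // bitmapIdx occupies the low 32 bits
    static final int IS_USED_SHIFT = 33;
    static final int SIZE_SHIFT = 34;        // 15 bits for the page count
    static final int RUN_OFFSET_SHIFT = 49;  // 15 bits for the run offset

    static long toHandle(int runOffset, int pages, boolean used, boolean subpage, int bitmapIdx) {
        return ((long) runOffset << RUN_OFFSET_SHIFT)
                | ((long) pages << SIZE_SHIFT)
                | ((used ? 1L : 0L) << IS_USED_SHIFT)
                | ((subpage ? 1L : 0L) << IS_SUBPAGE_SHIFT)
                | (bitmapIdx & 0xFFFFFFFFL);
    }

    static int runOffset(long handle) {
        // Unsigned shift so a set top bit cannot sign-extend into the result.
        return (int) ((handle >>> RUN_OFFSET_SHIFT) & 0x7FFF);
    }

    static int pages(long handle) {
        return (int) ((handle >>> SIZE_SHIFT) & 0x7FFF);
    }

    static boolean isUsed(long handle) {
        return ((handle >>> IS_USED_SHIFT) & 1L) == 1L;
    }

    static boolean isSubpage(long handle) {
        return ((handle >>> IS_SUBPAGE_SHIFT) & 1L) == 1L;
    }

    static int bitmapIdx(long handle) {
        return (int) handle; // the low 32 bits
    }
}
```

With these shifts, the initial free run of a 16MB chunk with 8KB pages (offset 0, 2048 pages, unused) encodes as 2048L << 34, which has the same shape as the initHandle computed in the PoolChunk constructor listed below.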

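Allocation mechanisms:

Run allocation: managed through the runsAvail[] priority queues, which order runs by offset to reduce fragmentation
Subpage allocation: small requests are tracked with a bitmap, which makes slot allocation cheap
allocateRun(): serves Normal requests by splitting a free run of pages; freed runs are merged with adjacent free runs.
allocateSubpage(): serves small requests by delegating to PoolSubpage.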
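Splitting and merging runs can be sketched without any of PoolChunk's machinery. The version below is an illustration under stated assumptions, not the actual implementation: it keeps free runs in a TreeMap keyed by page offset instead of runsAvail and runsAvailMap, picks the smallest run that fits (lowest offset on ties), and coalesces a freed run with its free neighbors. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of run allocation over a chunk of pages (offsets and sizes are in pages).
final class RunAllocatorSketch {
    private final TreeMap<Integer, Integer> freeRuns = new TreeMap<>(); // offset -> pages

    RunAllocatorSketch(int totalPages) {
        freeRuns.put(0, totalPages); // one initial run covering the whole chunk
    }

    /** Allocates a run of the given length; returns its page offset, or -1 if nothing fits. */
    int allocate(int pages) {
        int bestOffset = -1;
        int bestPages = Integer.MAX_VALUE;
        // Ascending-offset iteration plus a strict '<' picks the lowest offset among equal sizes.
        for (Map.Entry<Integer, Integer> e : freeRuns.entrySet()) {
            if (e.getValue() >= pages && e.getValue() < bestPages) {
                bestOffset = e.getKey();
                bestPages = e.getValue();
            }
        }
        if (bestOffset < 0) {
            return -1;
        }
        freeRuns.remove(bestOffset);
        if (bestPages > pages) {
            // Split: the tail of the run stays free.
            freeRuns.put(bestOffset + pages, bestPages - pages);
        }
        return bestOffset;
    }

    /** Frees a run and merges it with adjacent free runs on both sides. */
    void free(int offset, int pages) {
        Map.Entry<Integer, Integer> prev = freeRuns.lowerEntry(offset);
        if (prev != null && prev.getKey() + prev.getValue() == offset) {
            freeRuns.remove(prev.getKey());
            offset = prev.getKey();
            pages += prev.getValue();
        }
        Integer nextPages = freeRuns.remove(offset + pages);
        if (nextPages != null) {
            pages += nextPages;
        }
        freeRuns.put(offset, pages);
    }

    int freeRunCount() {
        return freeRuns.size();
    }

    int pagesAt(int offset) {
        Integer p = freeRuns.get(offset);
        return p == null ? 0 : p;
    }
}
```

The linear scan keeps the sketch short; per-size priority queues serve the same purpose without scanning every free run.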

// ... existing code ...
final class PoolChunk<T> implements PoolChunkMetric {
    private static final int SIZE_BIT_LENGTH = 15;
    private static final int INUSED_BIT_LENGTH = 1;
    private static final int SUBPAGE_BIT_LENGTH = 1;
    private static final int BITMAP_IDX_BIT_LENGTH = 32;

    static final int IS_SUBPAGE_SHIFT = BITMAP_IDX_BIT_LENGTH;
    static final int IS_USED_SHIFT = SUBPAGE_BIT_LENGTH + IS_SUBPAGE_SHIFT;
    static final int SIZE_SHIFT = INUSED_BIT_LENGTH + IS_USED_SHIFT;
    static final int RUN_OFFSET_SHIFT = SIZE_BIT_LENGTH + SIZE_SHIFT;

    final PoolArena<T> arena;
    final CleanableDirectBuffer cleanable;
    final Object base;
    final T memory;
    final boolean unpooled;

    /**
     * store the first page and last page of each avail run
     */
    private final LongLongHashMap runsAvailMap;

    /**
     * manage all avail runs
     */
    private final IntPriorityQueue[] runsAvail;

    private final ReentrantLock runsAvailLock;

    /**
     * manage all subpages in this chunk
     */
    private final PoolSubpage<T>[] subpages;

    /**
     * Accounting of pinned memory – memory that is currently in use by ByteBuf instances.
     */
    private final LongAdder pinnedBytes = new LongAdder();

    final int pageSize;
    final int pageShifts;
    final int chunkSize;
    final int maxPageIdx;

    // Use as cache for ByteBuffer created from the memory. These are just duplicates and so are only a container
// ... existing code ...
    @SuppressWarnings("unchecked")
    PoolChunk(PoolArena<T> arena, CleanableDirectBuffer cleanable, Object base, T memory, int pageSize, int pageShifts,
              int chunkSize, int maxPageIdx) {
        unpooled = false;
        this.arena = arena;
        this.cleanable = cleanable;
        this.base = base;
        this.memory = memory;
        this.pageSize = pageSize;
        this.pageShifts = pageShifts;
        this.chunkSize = chunkSize;
        this.maxPageIdx = maxPageIdx;
        freeBytes = chunkSize;

        runsAvail = newRunsAvailqueueArray(maxPageIdx);
        runsAvailLock = new ReentrantLock();
        runsAvailMap = new LongLongHashMap(-1);
        subpages = new PoolSubpage[chunkSize >> pageShifts];

        //insert initial run, offset = 0, pages = chunkSize / pageSize
        int pages = chunkSize >> pageShifts;
        long initHandle = (long) pages << SIZE_SHIFT;
// ... existing code ...
    }
// ... existing code ...
}

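PoolSubpage

For a detailed analysis, see: PoolSubpage core principles in Netty's high-performance memory pool (CSDN blog)

PoolSubpage manages small allocations (the Small category, smaller than the page size). A page is split into equally sized elements, which reduces internal fragmentation for small requests.

Structure:

chunk: the owning PoolChunk.
elemSize: the size of each element in this PoolSubpage.
maxNumElems: the maximum number of elements this PoolSubpage can hold.
bitmap: marks which elements are free and which are allocated.
nextAvail: the index of the next available element.
numAvail: the number of elements currently available.
prev, next: link PoolSubpage instances with the same elemSize into a doubly linked list under the matching index of PoolArena's smallSubpagePools.

Interaction:

For a Small allocation, PoolArena looks up the PoolSubpage list for the matching elemSize.
If a usable PoolSubpage exists, one element is allocated from it.
Otherwise PoolChunk allocates a new page, initializes it as a PoolSubpage, and links it into the PoolArena list.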
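The bitmap bookkeeping can be sketched as follows. This is a simplified illustration, not PoolSubpage itself: the class name SubpageBitmapSketch is hypothetical, the nextAvail shortcut is omitted, and double frees are not detected. Each bit of a long[] marks one element, and allocation always takes the lowest free index.

```java
// Hypothetical sketch: one bit per element, lowest free index first.
final class SubpageBitmapSketch {
    private final long[] bitmap;
    private final int maxNumElems;
    private int numAvail;

    SubpageBitmapSketch(int runSize, int elemSize) {
        maxNumElems = runSize / elemSize;
        bitmap = new long[(maxNumElems + 63) >>> 6]; // 64 elements per long
        numAvail = maxNumElems;
    }

    /** Returns the index of the allocated element, or -1 when the subpage is full. */
    int allocate() {
        if (numAvail == 0) {
            return -1;
        }
        for (int i = 0; i < bitmap.length; i++) {
            long free = ~bitmap[i];
            if (free != 0) {
                int bit = Long.numberOfTrailingZeros(free); // lowest clear bit of bitmap[i]
                bitmap[i] |= 1L << bit;
                numAvail--;
                return (i << 6) | bit;
            }
        }
        return -1;
    }

    /** Marks an element as free again (the caller must pass an index it previously allocated). */
    void free(int idx) {
        bitmap[idx >>> 6] &= ~(1L << (idx & 63));
        numAvail++;
    }

    int numAvail() {
        return numAvail;
    }
}
```

The byte offset of an element inside the chunk would then be the run's start plus idx * elemSize.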

// ... existing code ...
final class PoolSubpage<T> implements PoolSubpageMetric {

    static final PoolSubpage<?> UNAVAILABLE = null; // TODO: make this an actual sentinel value.

    final PoolChunk<T> chunk;
    private final int pageShifts;
    private final int runOffset;
    private final int pageSize;
    private final long[] bitmap;

    PoolSubpage<T> prev;
    PoolSubpage<T> next;

    boolean doNotDestroy;
    int elemSize;
    private int maxNumElems;
    private int bitmapLength;
    private int nextAvail;
    private int numAvail;

    private final int headIndex; // The index of the head of the PoolSubPage pool in the PoolArena.

    // TODO: Test if adding padding helps under contention
    //private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;

    /** Special constructor for the head of the PoolSubpage pool. */
    PoolSubpage(int headIndex) {
        chunk = null;
        pageShifts = -1;
        runOffset = -1;
        elemSize = -1;
        pageSize = 0;
        bitmap = null;
        this.headIndex = headIndex;
    }

    PoolSubpage(PoolSubpage<T> head, PoolChunk<T> chunk, int pageShifts, int runOffset, int pageSize, int elemSize) {
        this.chunk = chunk;
        this.pageShifts = pageShifts;
        this.runOffset = runOffset;
        this.pageSize = pageSize;
        this.headIndex = head.headIndex;
        bitmap = new long[pageSize >>> 10]; // pageSize / 16 / 64
        init(head, elemSize);
    }

// ... existing code ...
}

PoolChunkList - grouping by usage

For a detailed analysis, see: PoolChunkList explained (CSDN blog)

Design idea: group chunks by usage to make allocation more efficient

PoolChunkList organizes PoolChunk instances by memory usage. PoolArena maintains several PoolChunkList instances, one per usage range.

qInit(0-25%) → q000(1-50%) → q025(25-75%) → q050(50-100%) → q075(75-100%) → q100(100%)

// Links set up at initialization
qInit.next = q000;
q000.next = q025;
q025.next = q050;
q050.next = q075;
q075.next = q100;

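Structure:

arena: the owning PoolArena.
nextList, prevList: the next and previous PoolChunkList. Together they form a linked list along which a PoolChunk moves as its usage changes.
minUsage, maxUsage: the minimum and maximum usage of the chunks this list accepts.
head: the first PoolChunk in the list. The chunks in a list also form a doubly linked list.

Interaction:

When a PoolChunk's usage changes (after an allocation or a free), it may move to the PoolChunkList that better matches its current usage.
When allocating, PoolArena walks the lists in a fixed order (q050, q025, q000, qInit, q075), starting from moderately used chunks, and tries to find a PoolChunk that can serve the request.
Fragmentation control: chunks in low-usage lists are reclaimed first, so memory does not linger.

Core rules:

Promotion: a chunk whose usage rises above maxUsage moves to the next list
Demotion: a chunk whose usage falls below minUsage moves to the previous list
Allocation priority: start at q050 and work down through q025, q000, and qInit, then try q075, to limit fragmentation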
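The promotion and demotion rules can be sketched as a pure function over usage percentages. The thresholds below come from the ranges listed earlier in this section. Two details are assumptions rather than statements from the text: promotion is triggered at usage >= maxUsage, and a chunk that drops below q000's minimum is destroyed instead of returning to qInit. The class name ChunkListSketch is hypothetical.

```java
// Hypothetical sketch: lists are indexes 0..5 in the order qInit, q000, q025, q050, q075, q100.
final class ChunkListSketch {
    static final String[] NAMES = {"qInit", "q000", "q025", "q050", "q075", "q100"};
    static final int[] MIN_USAGE = {Integer.MIN_VALUE, 1, 25, 50, 75, 100};
    static final int[] MAX_USAGE = {25, 50, 75, 100, 100, Integer.MAX_VALUE};
    static final int DESTROYED = -1;

    /** After an allocation: promote while usage has reached the current list's maxUsage. */
    static int afterAllocate(int list, int usagePercent) {
        while (list < NAMES.length - 1 && usagePercent >= MAX_USAGE[list]) {
            list++;
        }
        return list;
    }

    /** After a free: demote while usage is below the current list's minUsage. */
    static int afterFree(int list, int usagePercent) {
        while (usagePercent < MIN_USAGE[list]) {
            if (list == 1) {
                return DESTROYED; // assumption: q000 has no previous list, so the chunk is released
            }
            list--;
        }
        return list;
    }
}
```

The overlapping ranges (for example 25-75% and 50-100%) give hysteresis: a chunk hovering around a boundary does not bounce between lists on every allocation and free.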

// ... existing code ...
final class PoolChunkList<T> implements PoolChunkListMetric {
    private static final Iterator<PoolChunkMetric> EMPTY_METRICS = Collections.emptyIterator();
    private final PoolArena<T> arena;
    private final PoolChunkList<T> nextList;
    private final int minUsage;
    private final int maxUsage;
    private final int maxCapacity;

    private PoolChunk<T> head;
    private PoolChunkList<T> prevList; // not final: the constructor below assigns it on another instance

    // TODO: Test if adding padding helps under contention
    //private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;

    PoolChunkList(PoolArena<T> arena, PoolChunkList<T> nextList, int minUsage, int maxUsage, int chunkSize) {
        assert minUsage <= maxUsage;
        this.arena = arena;
        this.nextList = nextList;
        this.minUsage = minUsage;
        this.maxUsage = maxUsage;
        maxCapacity = calculateMaxCapacity(minUsage, chunkSize);

        if (nextList == null) {
            prevList = null;
        } else {
            prevList = nextList.prevList; // This is not really used but just to make it consistent.
            nextList.prevList = this;
        }
    }

// ... existing code ...
}

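Supporting components

IntPriorityQueue: speeds up the search for free pages (min-heap).

LongLongHashMap: compact storage for free-run metadata (open-addressing hash map).

SizeClasses: defines the size-class scheme (sizeIdx→size and pageIdx→pages mappings).

See:

IntPriorityQueue and LongLongHashMap, the custom data structures behind Netty's PoolChunk (CSDN blog)

SizeClasses, the core of the Netty memory pool (CSDN blog)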
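As a rough illustration of what a size-class scheme does, the sketch below rounds a request up to the next class in a jemalloc-style table. The parameters are assumptions and not values taken from this article: a 16-byte quantum and four classes per power-of-two doubling, which gives 16, 32, 48, 64, 80, 96, 112, 128, 160, 192, and so on. The class name SizeClassSketch is hypothetical, and the method is not claimed to be identical to SizeClasses.normalizeSize.

```java
// Hypothetical sketch of jemalloc-style size normalization (assumed parameters, see above).
final class SizeClassSketch {
    static final int LOG2_QUANTUM = 4; // smallest spacing: 16 bytes
    static final int LOG2_GROUP = 2;   // four size classes per doubling

    static int log2(int v) {
        return 31 - Integer.numberOfLeadingZeros(v);
    }

    /** Rounds size up to the nearest size class. */
    static int normalize(int size) {
        if (size <= 0 || size > (1 << 29)) {
            throw new IllegalArgumentException("size: " + size);
        }
        int x = log2((size << 1) - 1); // ceil(log2(size))
        // Small sizes are spaced by the quantum; larger ones by 1/4 of their power-of-two group.
        int log2Delta = x < LOG2_GROUP + LOG2_QUANTUM + 1 ? LOG2_QUANTUM : x - LOG2_GROUP - 1;
        int mask = (1 << log2Delta) - 1;
        return (size + mask) & ~mask;
    }
}
```

Under these assumptions a 100-byte request lands in the 112-byte class and an 8193-byte request in the 10KB class, so the space wasted by rounding up stays below roughly a quarter of the request.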

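How the components cooperate

Allocation flow (PoolArena.allocate())

A request enters PoolArena and is classified by size (Small/Normal/Huge).
Small requests: try PoolThreadCache → on a miss, search the smallSubpagePools list → if no subpage is usable, create a new PoolSubpage.
Normal requests: try PoolThreadCache → on a miss, walk the PoolChunkList lists → allocate contiguous pages in a PoolChunk → update the chunk's list.

1. Size check and cache lookup
   if (size <= maxCachedBufferCapacity) {
       // Try PoolThreadCache first: allocateSmall(...) or allocateNormal(...), depending on the size class
       if (cache.allocateSmall/Normal()) return;
   }

2. Arena-level allocation
   if (sizeIdx <= smallMaxSizeIdx) {
       // Small: allocate from smallSubpagePools
       allocateSmall();
   } else if (sizeIdx < nSizes) {
       // Normal: walk the PoolChunkList lists
       allocateNormal();
   } else {
       // Huge: allocate an unpooled chunk directly
       allocateHuge();
   }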
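The two steps above can be folded into one runnable routing function. This is a sketch under assumptions: sizes are compared in bytes rather than through sizeIdx, the cache lookup is reduced to a boolean, and the three thresholds are parameters because their actual values depend on the Netty version and configuration. The names RoutingSketch, Path, and route are hypothetical.

```java
// Hypothetical sketch of the allocation routing decision.
final class RoutingSketch {
    enum Path { THREAD_CACHE, SUBPAGE_POOL, CHUNK_LIST, UNPOOLED }

    /**
     * @param size                    requested size in bytes
     * @param cacheHit                whether the thread cache holds a region of this size class
     * @param smallMaxSize            largest size served through subpages
     * @param maxCachedBufferCapacity largest size the thread cache will hold
     * @param chunkSize               size of one pooled chunk
     */
    static Path route(long size, boolean cacheHit,
                      long smallMaxSize, long maxCachedBufferCapacity, long chunkSize) {
        if (size > chunkSize) {
            return Path.UNPOOLED;     // Huge: bypasses the pool entirely
        }
        if (size <= maxCachedBufferCapacity && cacheHit) {
            return Path.THREAD_CACHE; // step 1: served without touching the arena
        }
        if (size <= smallMaxSize) {
            return Path.SUBPAGE_POOL; // step 2, Small
        }
        return Path.CHUNK_LIST;       // step 2, Normal
    }
}
```

For example, with illustrative thresholds of 28KB, 32KB, and 16MB, a 1KB request goes to the thread cache on a hit and to the subpage pool on a miss, while a 64KB request always reaches the chunk lists because it exceeds the cacheable size.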

ChunkList traversal order

// Key code: allocation starts at q050
private void allocateNormal(PooledByteBuf<T> buf, int reqCapacity, int sizeIdx, PoolThreadCache cache) {
    if (q050.allocate(buf, reqCapacity, sizeIdx, cache) ||
        q025.allocate(buf, reqCapacity, sizeIdx, cache) ||
        q000.allocate(buf, reqCapacity, sizeIdx, cache) ||
        qInit.allocate(buf, reqCapacity, sizeIdx, cache) ||
        q075.allocate(buf, reqCapacity, sizeIdx, cache)) {
        return;
    }
    // No existing chunk could serve the request: create a new one
    PoolChunk<T> c = newChunk(pageSize, nPSizes, pageShifts, chunkSize);
    boolean success = c.allocate(buf, reqCapacity, sizeIdx, cache);
    assert success; // a fresh chunk can always serve a Normal request
    qInit.add(c);
}

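Release and reclamation flow

Freed memory goes to PoolThreadCache first → periodic trimming evicts cache entries → evicted memory is finally returned to PoolArena.

1. Cache on release
   if (cache.add(chunk, handle, normCapacity)) {
       // Cached in PoolThreadCache
       return;
   }

2. Chunk-level free
   chunk.free(handle, normCapacity, nioBuffer);

3. Regrouping in PoolChunkList
   // Usage fell below minUsage, i.e. free bytes exceed (100 - minUsage)% of the chunk
   if (chunk.freeBytes > chunk.chunkSize * (100 - minUsage) / 100) {
       // Move to the lower-usage list
       remove(chunk);
       prevList.add(chunk);
   }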

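Where does the memory actually live, and how is it reused?

Callers usually request memory through the public methods of the ByteBufAllocator interface, for example:

allocator.heapBuffer(initialCapacity, maxCapacity)
The public API for heap memory; internally it calls newHeapBuffer.
allocator.buffer(...)
Calls heapBuffer or directBuffer depending on the allocator's default (whether it prefers direct memory).

The actual memory (a byte[] for heap memory, a ByteBuffer for direct memory) is held and managed by PoolChunk objects.
A PoolChunk represents one large contiguous block, 16MB by default.

Memory reuse flow (condensed)

Request memory
The caller invokes allocator.buffer() or allocator.heapBuffer() / allocator.directBuffer() to obtain a ByteBuf. The call ends up in newHeapBuffer/newDirectBuffer.
Select an arena
PooledByteBufAllocator keeps a PoolThreadLocalCache (built on FastThreadLocal) that holds one PoolThreadCache per thread, bound to one heap PoolArena and one direct PoolArena. The allocator first fetches the PoolArena bound to the current thread from it.
Allocate from the thread cache
PoolArena first delegates the request to PoolThreadCache.
PoolThreadCache keeps a cache queue (MemoryRegionCache) per size class (Small/Normal). It tries to take a previously released region of a suitable size from these queues. If that succeeds, the allocation is complete, and it is very fast.
Allocate from the arena
If the thread cache has nothing suitable, PoolArena allocates:
- It manages one or more PoolChunk instances.
- Normal sizes (> pageSize) are allocated directly from a PoolChunk.
- Small sizes first take a page (8KB by default) from a PoolChunk, which is then split into smaller subpage elements.
Create the ByteBuf object
After the memory is allocated, a PooledByteBuf instance (such as PooledHeapByteBuf or PooledDirectByteBuf) is taken from an object pool, initialized with the region's metadata (handle, offset, length, and so on), and returned.
Release memory
The caller invokes ByteBuf.release().
Reclaim memory
- When the reference count reaches zero, release() returns the region to PoolArena, where it is marked free.
- The PooledByteBuf object itself is recycled into the object pool.
- The released region may instead be cached in PoolThreadCache for fast reuse.
Reuse memory
Memory marked free, whether in a PoolChunk or in PoolThreadCache, can serve later allocation requests.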
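The reuse cycle can be demonstrated with a deliberately tiny model. It is not how PooledByteBuf is implemented: the classes ReuseSketch.Pool and ReuseSketch.Buf are hypothetical, every region has the same size, and the free list stands in for both PoolThreadCache and PoolChunk. The sketch shows only the reference-counting contract: a region returns to the pool when the count reaches zero, and the next request gets the same backing memory.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of reference-counted reuse of pooled regions.
final class ReuseSketch {

    static final class Pool {
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();
        private final int regionSize;
        int fresh; // how many regions had to be newly allocated

        Pool(int regionSize) {
            this.regionSize = regionSize;
        }

        Buf acquire() {
            byte[] memory = free.pollFirst();
            if (memory == null) {
                memory = new byte[regionSize];
                fresh++;
            }
            return new Buf(this, memory);
        }
    }

    static final class Buf {
        final byte[] memory;
        private final Pool pool;
        private int refCnt = 1;

        Buf(Pool pool, byte[] memory) {
            this.pool = pool;
            this.memory = memory;
        }

        void retain() {
            refCnt++;
        }

        /** Returns true when the last reference was dropped and the region went back to the pool. */
        boolean release() {
            if (refCnt <= 0) {
                throw new IllegalStateException("already released");
            }
            if (--refCnt == 0) {
                pool.free.addFirst(memory);
                return true;
            }
            return false;
        }
    }
}
```

Acquiring, releasing, and acquiring again yields the same byte[] and leaves the fresh counter at 1, which is the reuse the flow above describes.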
