
FastThreadLocal

FastThreadLocal is implemented much like ThreadLocal. Netty built two supporting classes specifically for it: FastThreadLocalThread and InternalThreadLocalMap. Let's look at how these two classes work.

FastThreadLocalThread is a thin wrapper around Thread, and each such thread owns one InternalThreadLocalMap instance. FastThreadLocal only delivers its performance advantage when it is used together with FastThreadLocalThread. First, the source definition of FastThreadLocalThread:

```java
public class FastThreadLocalThread extends Thread {
    private InternalThreadLocalMap threadLocalMap;
    // other code omitted
}
```

As you can see, FastThreadLocalThread mainly adds an InternalThreadLocalMap field. We can infer that FastThreadLocalThread stores its data in InternalThreadLocalMap rather than in the ThreadLocalMap inside Thread. So to understand why FastThreadLocalThread is fast, we need to understand how InternalThreadLocalMap is designed.

Earlier we noted a major weakness of ThreadLocal: ThreadLocalMap resolves hash collisions with linear probing, which is slow. How does InternalThreadLocalMap improve on this? Let's start with its internal structure.

```java
class UnpaddedInternalThreadLocalMap {
    static final ThreadLocal<InternalThreadLocalMap> slowThreadLocalMap = new ThreadLocal<InternalThreadLocalMap>();
    static final AtomicInteger nextIndex = new AtomicInteger();

    Object[] indexedVariables;

    UnpaddedInternalThreadLocalMap(Object[] indexedVariables) {
        this.indexedVariables = indexedVariables;
    }
    // other code omitted
}

public final class InternalThreadLocalMap extends UnpaddedInternalThreadLocalMap {
    private static final int DEFAULT_ARRAY_LIST_INITIAL_CAPACITY = 8;
    private static final int STRING_BUILDER_INITIAL_SIZE;
    private static final int STRING_BUILDER_MAX_SIZE;
    public static final Object UNSET = new Object();
    private BitSet cleanerFlags;

    private InternalThreadLocalMap() {
        super(newIndexedVariableTable());
    }

    private static Object[] newIndexedVariableTable() {
        Object[] array = new Object[32];
        Arrays.fill(array, UNSET);
        return array;
    }

    public static int nextVariableIndex() {
        int index = nextIndex.getAndIncrement();
        if (index < 0) {
            nextIndex.decrementAndGet();
            throw new IllegalStateException("too many thread-local indexed variables");
        }
        return index;
    }
    // other code omitted
}
```

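Judging from its implementation, InternalThreadLocalMap stores data in an array, just like ThreadLocalMap. However, it does not use linear probing to resolve hash collisions. Instead, each FastThreadLocal is assigned an array index, index, when it is constructed. The index comes from InternalThreadLocalMap.nextVariableIndex(), which uses an AtomicInteger to hand out strictly increasing values. Reads and writes then go straight to position index in the array, in O(1) time. If the index grows very large, the array grows with it, so FastThreadLocal trades space for time. The figure below shows the relationship between InternalThreadLocalMap, index, and FastThreadLocal.

Looking at the internal structure of FastThreadLocal in the figure above, how does it differ from ThreadLocal? FastThreadLocal uses an Object array in place of the Entry array. Object[0] holds a Set<FastThreadLocal<?>> collection, and every slot from index 1 onward stores a value directly, rather than the key-value pairs that ThreadLocal uses.

Suppose we have a batch of values to add to the array, value1, value2, value3, and value4, whose FastThreadLocal instances were assigned indexes 1, 2, 3, and 4 at construction. The result is shown in the figure below.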
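The layout described above can also be sketched in plain Java. This is a simplified, hypothetical model rather than Netty's code: the names IndexedSlotsDemo, Slot, and NEXT_INDEX are invented for illustration, a single object stands in for one thread's table, and the counter starts at 1 on the assumption that slot 0 is reserved.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of index-based storage.
// Not Netty's code: no per-thread lookup and no resizing.
class IndexedSlotsDemo {
    static final Object UNSET = new Object();
    // Global counter shared by all variables; slot 0 is assumed reserved,
    // so this model starts handing out indexes at 1.
    static final AtomicInteger NEXT_INDEX = new AtomicInteger(1);

    // Stands in for one thread's table; every slot starts as UNSET.
    final Object[] slots = newTable(32);

    static Object[] newTable(int capacity) {
        Object[] table = new Object[capacity];
        Arrays.fill(table, UNSET);
        return table;
    }

    // Stands in for a FastThreadLocal: it only remembers its own index.
    static final class Slot<V> {
        final int index = NEXT_INDEX.getAndIncrement();

        void set(IndexedSlotsDemo map, V value) {
            map.slots[index] = value; // direct array write, no hashing
        }

        @SuppressWarnings("unchecked")
        V get(IndexedSlotsDemo map) {
            Object v = map.slots[index]; // direct array read
            return v == UNSET ? null : (V) v;
        }
    }

    public static void main(String[] args) {
        IndexedSlotsDemo map = new IndexedSlotsDemo();
        Slot<String> first = new Slot<>();
        Slot<String> second = new Slot<>();
        first.set(map, "value1");
        second.set(map, "value2");
        System.out.println(first.index + " -> " + first.get(map));
        System.out.println(second.index + " -> " + second.get(map));
    }
}
```

Because each variable owns a unique index, two variables can never collide, which is why no probing is needed.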

We now have a basic picture of FastThreadLocal. Next, let's walk through the source code to see how it is implemented.

FastThreadLocal example

Before diving into the source, let's revisit the ThreadLocal example from earlier. How would it look if ThreadLocal were replaced with FastThreadLocal?

```java
package io.netty.example.chapter1.echo;

import io.netty.util.concurrent.FastThreadLocal;
import io.netty.util.concurrent.FastThreadLocalThread;

public class FastThreadLocalTest {
    private static final FastThreadLocal<String> THREAD_NAME_LOCAL
        = new FastThreadLocal<>();
    private static final FastThreadLocal<String> TRADE_THREAD_LOCAL
        = new FastThreadLocal<>();

    public static void main(String[] args) {
        for (int i = 0; i < 2; i++) {
            String threadName = "thread-" + i;
            new FastThreadLocalThread(() -> {
                THREAD_NAME_LOCAL.set(threadName);
                // each thread stores its own trade status string
                String tradeInfo = "unpaid " + Thread.currentThread().getName();
                TRADE_THREAD_LOCAL.set(tradeInfo);
                System.out.println("threadName: " + THREAD_NAME_LOCAL.get());
                System.out.println("trade info: " + TRADE_THREAD_LOCAL.get());
            }, threadName).start();
        }
    }
}
```

One possible output (the order depends on thread scheduling):

```
threadName: thread-1
trade info: unpaid thread-1
threadName: thread-0
trade info: unpaid thread-0
```

As you can see, FastThreadLocal is used almost exactly like ThreadLocal: you only need to replace Thread and ThreadLocal with FastThreadLocalThread and FastThreadLocal. Netty has done well on ease of use. Next, we take a closer look at the FastThreadLocal.set() and get() methods used in the example.

FastThreadLocal constructor analysis

```java
public FastThreadLocal() {
    // take the next index from the global counter
    index = InternalThreadLocalMap.nextVariableIndex();
}

public static int nextVariableIndex() {
    int index = nextIndex.getAndIncrement();
    if (index < 0) {
        nextIndex.decrementAndGet();
        throw new IllegalStateException("too many thread-local indexed variables");
    }
    return index;
}
```

FastThreadLocal set source analysis

```java
public final void set(V value) {
    // 1. is value the default (UNSET) object?
    if (value != InternalThreadLocalMap.UNSET) {
        // 2. get the current thread's InternalThreadLocalMap
        InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
        // 3. replace the data in InternalThreadLocalMap with the new value
        setKnownNotUnset(threadLocalMap, value);
    } else {
        remove();
    }
}

// how setKnownNotUnset() adds data to InternalThreadLocalMap
private void setKnownNotUnset(InternalThreadLocalMap threadLocalMap, V value) {
    // 1. find array position index and store the new value
    //    true means the slot was UNSET before (first store);
    //    repeated stores at the same index do not register again
    if (threadLocalMap.setIndexedVariable(index, value)) {
        // 2. save this FastThreadLocal in the Set of variables to clean up
        addToVariablesToRemove(threadLocalMap, this);
    }
}

public boolean setIndexedVariable(int index, Object value) {
    Object[] lookup = indexedVariables;
    if (index < lookup.length) {
        Object oldValue = lookup[index];
        // write value directly at position index, O(1)
        lookup[index] = value;
        return oldValue == UNSET;
    } else {
        // not enough capacity: grow first, then store the value
        expandIndexedVariableTableAndSet(index, value);
        return true;
    }
}
```

indexedVariables is the array in which InternalThreadLocalMap stores its data. If the array's length is greater than the FastThreadLocal's index, the new value is written directly at position index, in O(1) time. Before the write, the element previously at index is read; if that old element is still the UNSET default object, the method returns true.

What if the array is too small? InternalThreadLocalMap grows it automatically and then stores the value. Here is the growth logic in expandIndexedVariableTableAndSet():

```java
private void expandIndexedVariableTableAndSet(int index, Object value) {
    Object[] oldArray = indexedVariables;
    final int oldCapacity = oldArray.length;
    int newCapacity = index;
    newCapacity |= newCapacity >>> 1;
    newCapacity |= newCapacity >>> 2;
    newCapacity |= newCapacity >>> 4;
    newCapacity |= newCapacity >>> 8;
    newCapacity |= newCapacity >>> 16;
    newCapacity++;

    Object[] newArray = Arrays.copyOf(oldArray, newCapacity);
    Arrays.fill(newArray, oldCapacity, newArray.length, UNSET);
    newArray[index] = value;
    indexedVariables = newArray;
}
```

Do the shift operations above look familiar? The JDK's HashMap contains this code for sizing its table:

```java
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
```

As you can see, InternalThreadLocalMap grows its array in almost exactly the same way as HashMap, so reading source code really does pay off. InternalThreadLocalMap bases the new capacity on index, rounding up to the next power of two above index. It then copies the old array into the new one, fills the remaining slots with the default object UNSET, and finally assigns the new array to indexedVariables.

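Why does InternalThreadLocalMap size the new array from index rather than from the old array length? Suppose 70 FastThreadLocal instances have been created but none has ever called set(), so the array still has its default length of 32. When the FastThreadLocal with index = 70 calls set(), doubling the old capacity of 32 would give 64, which still cannot hold index 70. Sizing from index solves this, although with a very large number of FastThreadLocal instances the array also becomes very large.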
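To make the arithmetic concrete, here is a small sketch that repeats the same bit-smearing steps in isolation. CapacityDemo, capacityFor, and doubled are hypothetical names used only for this illustration, and the helper assumes 0 <= index < 2^30 so that the final increment cannot overflow.

```java
// Hypothetical helper that mirrors the bit-smearing steps in isolation.
class CapacityDemo {
    // Returns the smallest power of two strictly greater than index.
    // Assumes 0 <= index < 2^30, otherwise n + 1 would overflow.
    static int capacityFor(int index) {
        int n = index;
        n |= n >>> 1;  // copy the highest set bit one position down
        n |= n >>> 2;  // ...then two more positions
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16; // every bit below the highest set bit is now 1
        return n + 1;  // all-ones plus one is the next power of two
    }

    // The naive alternative: double the old capacity once.
    static int doubled(int oldCapacity) {
        return oldCapacity << 1;
    }

    public static void main(String[] args) {
        int index = 70;
        System.out.println("doubling 32 gives " + doubled(32));
        System.out.println("sizing from index " + index + " gives " + capacityFor(index));
    }
}
```

For index = 70 the smearing steps produce 127, so the new capacity is 128, whereas a single doubling of 32 stops at 64. Note that, unlike HashMap's tableSizeFor, there is no initial subtraction of 1: an index of 64 needs a length of at least 65, so the result must be strictly greater than index.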

Back in the main flow of setKnownNotUnset(): once the data has been added to InternalThreadLocalMap, the next step is to save the FastThreadLocal object in the Set of variables to clean up. Here is how addToVariablesToRemove() does it.

```java
private static void addToVariablesToRemove(InternalThreadLocalMap threadLocalMap, FastThreadLocal<?> variable) {
    // read the element at array position 0
    Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
    Set<FastThreadLocal<?>> variablesToRemove;
    if (v == InternalThreadLocalMap.UNSET || v == null) {
        // create a Set of FastThreadLocal
        variablesToRemove = Collections.newSetFromMap(new IdentityHashMap<FastThreadLocal<?>, Boolean>());
        // store the Set at array position 0
        threadLocalMap.setIndexedVariable(variablesToRemoveIndex, variablesToRemove);
    } else {
        // not UNSET, so the Set already exists: cast to get it
        variablesToRemove = (Set<FastThreadLocal<?>>) v;
    }
    // add the FastThreadLocal itself to the Set so it can be released later
    variablesToRemove.add(variable);
}
```

variablesToRemoveIndex is a static final variable that is assigned 0 when the FastThreadLocal class is initialized. InternalThreadLocalMap first looks up the element at array position 0. If that element is the default object UNSET or is missing, a Set of FastThreadLocal is created and stored at position 0. If the first element is not UNSET, the Set has already been stored, and a cast is enough to retrieve it. This explains why value data in InternalThreadLocalMap starts at index 1: position 0 is already taken by the Set.

Why does InternalThreadLocalMap keep a Set of FastThreadLocal at array position 0? To answer that, let's go back to the remove() method.

```java
public final void remove() {
    remove(InternalThreadLocalMap.getIfSet());
}

public static InternalThreadLocalMap getIfSet() {
    Thread thread = Thread.currentThread();
    if (thread instanceof FastThreadLocalThread) {
        return ((FastThreadLocalThread) thread).threadLocalMap();
    }
    return slowThreadLocalMap.get();
}

public final void remove(InternalThreadLocalMap threadLocalMap) {
    if (threadLocalMap == null) {
        return;
    }
    // remove the value at array position index
    Object v = threadLocalMap.removeIndexedVariable(index);
    // fetch the Set from array position 0 and remove this FastThreadLocal from it
    removeFromVariablesToRemove(threadLocalMap, this);
    if (v != InternalThreadLocalMap.UNSET) {
        try {
            onRemoval((V) v); // empty by default; subclasses may override it
        } catch (Exception e) {
            PlatformDependent.throwException(e);
        }
    }
}
```

Before removing anything, remove() calls InternalThreadLocalMap.getIfSet() to obtain the current InternalThreadLocalMap. With the background so far, getIfSet() is easy to follow: if the thread is a FastThreadLocalThread, it reads the threadLocalMap field of that thread directly; if it is an ordinary Thread, it fetches the map from slowThreadLocalMap, which is a ThreadLocal.

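Once the InternalThreadLocalMap is found, it locates the element at position index in the array and overwrites it with the default object UNSET. Next, the current FastThreadLocal object itself must be cleaned up, and this is where the Set comes in: InternalThreadLocalMap fetches the Set at array position 0 and removes the current FastThreadLocal from it. Finally, what is onRemoval() for? Netty leaves it as an extension point with no implementation; if you need to run some follow-up work on removal, subclass FastThreadLocal and override it.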
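The bookkeeping described above (a tracking Set in slot 0, values in the other slots) can be sketched with plain JDK collections. This is a simplified single-table model under stated assumptions, not Netty's implementation: TrackedSlotsDemo, Variable, and tracked() are hypothetical names, the table has a fixed size of 8, and there is no per-thread lookup.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Simplified single-table model; names are hypothetical, not Netty's API.
class TrackedSlotsDemo {
    static final Object UNSET = new Object();
    static final int TRACKING_INDEX = 0; // slot 0 holds the Set of live variables

    final Object[] slots = new Object[8];

    TrackedSlotsDemo() {
        Arrays.fill(slots, UNSET);
    }

    // A variable is identified only by its fixed index (1..7 in this model).
    static final class Variable {
        final int index;

        Variable(int index) {
            this.index = index;
        }
    }

    // Returns the tracking Set, creating it in slot 0 on first use.
    @SuppressWarnings("unchecked")
    Set<Variable> tracked() {
        Object v = slots[TRACKING_INDEX];
        if (v == UNSET) {
            Set<Variable> created =
                    Collections.newSetFromMap(new IdentityHashMap<Variable, Boolean>());
            slots[TRACKING_INDEX] = created;
            return created;
        }
        return (Set<Variable>) v;
    }

    void set(Variable var, Object value) {
        Object old = slots[var.index];
        slots[var.index] = value;
        if (old == UNSET) {
            tracked().add(var); // first store: remember the variable for cleanup
        }
    }

    void remove(Variable var) {
        slots[var.index] = UNSET;
        tracked().remove(var);
    }

    // Clears every value that was set, without scanning the whole array.
    void removeAll() {
        for (Variable var : tracked().toArray(new Variable[0])) {
            remove(var);
        }
    }

    public static void main(String[] args) {
        TrackedSlotsDemo map = new TrackedSlotsDemo();
        Variable a = new Variable(1);
        Variable b = new Variable(2);
        map.set(a, "x");
        map.set(b, "y");
        map.removeAll();
        System.out.println("slot 1 cleared: " + (map.slots[1] == UNSET));
        System.out.println("slot 2 cleared: " + (map.slots[2] == UNSET));
    }
}
```

The Set makes a bulk cleanup proportional to the number of values actually stored rather than to the length of the array, which matters when the array has grown large but is sparsely used.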

That completes the walk-through of FastThreadLocal.set(). With that background, the implementation of FastThreadLocal.get() is easy to follow.

FastThreadLocal get source analysis

```java
public final V get() {
    InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
    Object v = threadLocalMap.indexedVariable(index); // read the element at position index
    if (v != InternalThreadLocalMap.UNSET) {
        return (V) v;
    }
    return initialize(threadLocalMap); // the element is the default object, so initialize it
}

public Object indexedVariable(int index) {
    Object[] lookup = indexedVariables;
    return index < lookup.length ? lookup[index] : UNSET;
}

private V initialize(InternalThreadLocalMap threadLocalMap) {
    V v = null;
    try {
        v = initialValue();
    } catch (Exception e) {
        PlatformDependent.throwException(e);
    }
    threadLocalMap.setIndexedVariable(index, v);
    addToVariablesToRemove(threadLocalMap, this);
    return v;
}
```

First, the InternalThreadLocalMap is located according to whether the current thread is a FastThreadLocalThread. Then the element at position index is read from the array. If it is not the default object UNSET, the slot has already been filled and the value is returned directly.

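If the element at position index is the default object UNSET, initialization is required. As shown above, initialize() calls the user-overridden initialValue() method to build the object to store, for example:

```java
private final FastThreadLocal<String> threadLocal = new FastThreadLocal<String>() {
    @Override
    protected String initialValue() {
        return "hello world";
    }
};
```

Once the user object has been built, it is stored at position index in the array, and the current FastThreadLocal is saved in the Set of variables to clean up. All of this was covered in the analysis of FastThreadLocal.set(), so we will not repeat it.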
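The read path with lazy initialization can be modeled in a few lines. This is a hedged sketch, not Netty's code: LazyInitDemo and initializerCalls are hypothetical, and a java.util.function.Supplier stands in for the overridden initialValue() method.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Simplified model of the lazy-initialization read path; not Netty's code.
class LazyInitDemo {
    static final Object UNSET = new Object();

    final Object[] slots = new Object[4];
    int initializerCalls = 0; // counts how often the initializer actually ran

    LazyInitDemo() {
        Arrays.fill(slots, UNSET);
    }

    // The Supplier plays the role of an overridden initialValue().
    @SuppressWarnings("unchecked")
    <V> V get(int index, Supplier<V> initialValue) {
        Object v = slots[index];
        if (v != UNSET) {
            return (V) v; // fast path: the slot is already filled
        }
        V created = initialValue.get();
        initializerCalls++;
        slots[index] = created; // cache so later reads skip the initializer
        return created;
    }

    public static void main(String[] args) {
        LazyInitDemo map = new LazyInitDemo();
        String first = map.get(1, () -> "hello world");
        String second = map.get(1, () -> "never used");
        System.out.println(first + " / " + second);
        System.out.println("initializer calls: " + map.initializerCalls);
    }
}
```

Using a dedicated UNSET sentinel instead of null means that a null initial value is still cached: the slot holds null, which is different from UNSET, so the initializer is not run again.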

That covers set() and get(), the two core methods of FastThreadLocal.

FastThreadLocalThread

No-argument constructor: same as a plain Thread

```java
public class FastThreadLocalThread extends Thread {
    // No-argument constructor: behaves like a plain Thread
    public FastThreadLocalThread() {
        // FastThreadLocals do not need to be cleaned up
        cleanupFastThreadLocals = false;
    }
}

// Using the no-argument constructor
FastThreadLocalThread fastThreadLocalThread = new FastThreadLocalThread();
// This actually invokes run() inherited from Thread:
//
//     @Override
//     public void run() {
//         if (target != null) {
//             target.run();
//         }
//     }
fastThreadLocalThread.run();
```

Constructor with arguments: wraps the Runnable

```java
public class FastThreadLocalThread extends Thread {
    // Constructor that takes a Runnable
    public FastThreadLocalThread(Runnable target) {
        // wrap the target in a FastThreadLocalRunnable
        super(FastThreadLocalRunnable.wrap(target));
        // FastThreadLocals need to be cleaned up
        cleanupFastThreadLocals = true;
    }
}

final class FastThreadLocalRunnable implements Runnable {
    private final Runnable runnable;

    // Wrapper factory: if the runnable passed in is already a
    // FastThreadLocalRunnable, return it as is; otherwise wrap it in a new one
    static Runnable wrap(Runnable runnable) {
        return runnable instanceof FastThreadLocalRunnable ? runnable : new FastThreadLocalRunnable(runnable);
    }

    // store the runnable in the field
    private FastThreadLocalRunnable(Runnable runnable) {
        this.runnable = ObjectUtil.checkNotNull(runnable, "runnable");
    }

    // The key point is this wrapper: once the business runnable's run()
    // has finished, FastThreadLocal.removeAll() is called
    @Override
    public void run() {
        try {
            runnable.run();
        } finally {
            FastThreadLocal.removeAll();
        }
    }
}

// in FastThreadLocal
public static void removeAll() {
    InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.getIfSet();
    if (threadLocalMap == null) {
        return;
    }
    try {
        Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
        if (v != null && v != InternalThreadLocalMap.UNSET) {
            @SuppressWarnings("unchecked")
            Set<FastThreadLocal<?>> variablesToRemove = (Set<FastThreadLocal<?>>) v;
            FastThreadLocal<?>[] variablesToRemoveArray =
                    variablesToRemove.toArray(new FastThreadLocal[0]);
            for (FastThreadLocal<?> tlv : variablesToRemoveArray) {
                tlv.remove(threadLocalMap);
            }
        }
    } finally {
        InternalThreadLocalMap.remove();
    }
}
```
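The effect of the wrap-and-clean pattern is easiest to see on a pooled thread that outlives its tasks. The sketch below is an analogue built on the JDK's own ThreadLocal, not on Netty classes: CleanupRunnableDemo, withCleanup, and valueSeenBySecondTask are hypothetical names, and a single-thread executor is used so that both tasks are guaranteed to run on the same worker.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical analogue of the wrap-and-clean idea using a JDK ThreadLocal.
class CleanupRunnableDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Wraps a task so the thread-local is cleared even if the task throws.
    static Runnable withCleanup(Runnable task) {
        return () -> {
            try {
                task.run();
            } finally {
                CONTEXT.remove();
            }
        };
    }

    // Runs two tasks on the same pooled thread and returns what the
    // second task reads from the thread-local.
    static String valueSeenBySecondTask(boolean wrap) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> CONTEXT.set("from-first-task");
            pool.submit(wrap ? withCleanup(first) : first).get();
            Callable<String> second = CONTEXT::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without wrapper: " + valueSeenBySecondTask(false));
        System.out.println("with wrapper:    " + valueSeenBySecondTask(true));
    }
}
```

Without the wrapper, the value set by the first task is still visible to the second task on the reused thread; with the wrapper, the second task finds nothing. Putting the cleanup in a finally block is what makes it run even when the task fails.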

Checking whether cleanup is automatic

```java
@UnstableApi
public static boolean willCleanupFastThreadLocals(Thread thread) {
    // true only if the thread is a FastThreadLocalThread
    // and its cleanupFastThreadLocals flag is true
    return thread instanceof FastThreadLocalThread &&
            ((FastThreadLocalThread) thread).willCleanupFastThreadLocals();
}
```

Is FTL always faster than ThreadLocal?

Not necessarily. It is faster only on threads of type FastThreadLocalThread; on ordinary threads it is actually slower.
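The reason follows from the getIfSet() code shown earlier: a FastThreadLocalThread reads its map from a field, while an ordinary thread must first find the map through a regular ThreadLocal and only then index into it. The sketch below models the two paths with plain JDK types; FastSlowPathDemo, MapCarryingThread, and the other names are hypothetical, and a HashMap stands in for the indexed array.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two lookup paths; names are illustrative only.
class FastSlowPathDemo {
    // A thread type that carries its own map as a plain field.
    static class MapCarryingThread extends Thread {
        Map<String, Object> localMap;

        MapCarryingThread(Runnable body) {
            super(body);
        }
    }

    // Fallback for ordinary threads: the map itself lives in a JDK ThreadLocal.
    static final ThreadLocal<Map<String, Object>> SLOW = new ThreadLocal<>();

    static String pathFor(Thread thread) {
        return thread instanceof MapCarryingThread ? "fast" : "slow";
    }

    static Map<String, Object> currentMap() {
        Thread thread = Thread.currentThread();
        if (thread instanceof MapCarryingThread) {
            MapCarryingThread carrier = (MapCarryingThread) thread;
            if (carrier.localMap == null) {
                carrier.localMap = new HashMap<>();
            }
            return carrier.localMap; // fast path: one field read
        }
        Map<String, Object> map = SLOW.get(); // slow path: ThreadLocal lookup first
        if (map == null) {
            map = new HashMap<>();
            SLOW.set(map);
        }
        return map;
    }

    // Starts a thread of the requested kind and reports which path it took.
    static String pathOnNewThread(boolean carrying) {
        String[] result = new String[1];
        Runnable body = () -> {
            currentMap().put("key", "value");
            result[0] = pathFor(Thread.currentThread());
        };
        Thread thread = carrying ? new MapCarryingThread(body) : new Thread(body);
        thread.start();
        try {
            thread.join(); // join makes result[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("map-carrying thread: " + pathOnNewThread(true));
        System.out.println("ordinary thread:     " + pathOnNewThread(false));
    }
}
```

On the slow path every access pays for the ordinary ThreadLocal lookup and then for the extra indirection, which is why using the fast variant on plain threads costs more than using ThreadLocal directly.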

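FTL does not waste much space

Although FastThreadLocal trades space for time, it was designed on the assumption that there would never be a very large number of FastThreadLocal objects. Moreover, unused array slots all hold a reference to the same default object, so they do not take up much memory.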
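A quick way to check the shared-reference point is to count how many distinct objects a freshly filled table refers to. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Illustrative check that empty slots share one sentinel object.
class SharedSentinelDemo {
    static final Object UNSET = new Object();

    static Object[] newTable(int capacity) {
        Object[] table = new Object[capacity];
        Arrays.fill(table, UNSET); // every slot points at the same object
        return table;
    }

    // Counts distinct object identities referenced by the table.
    static int distinctObjects(Object[] table) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        seen.addAll(Arrays.asList(table));
        return seen.size();
    }

    public static void main(String[] args) {
        Object[] table = newTable(1024);
        System.out.println("distinct objects in empty table: " + distinctObjects(table));
        table[5] = "value";
        System.out.println("after one write: " + distinctObjects(table));
    }
}
```

An empty slot therefore costs only the array element itself, one reference (typically 4 or 8 bytes depending on the JVM configuration), and no additional object per slot.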

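Summary

In this lesson we compared ThreadLocal and FastThreadLocal. Here is a brief summary of FastThreadLocal's advantages.

Efficient lookup.

FastThreadLocal locates data directly by array index, in O(1) time. The JDK's native ThreadLocal, by contrast, is prone to hash collisions when it holds many entries, and linear probing has to keep stepping to the next slot to resolve them, which is inefficient.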
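To illustrate the cost of probing, here is a toy open-addressing table that counts how many slots a lookup inspects. It is a deliberately minimal sketch and not the JDK's ThreadLocalMap: LinearProbingDemo is a hypothetical class, keys are non-zero ints, the capacity must be a power of two, and the table is assumed never to fill up.

```java
// Toy open-addressing table with linear probing; not the JDK's ThreadLocalMap.
class LinearProbingDemo {
    final int[] keys;      // 0 means "empty" here, so keys must be non-zero
    final String[] values;
    int probes;            // slots inspected by the most recent get()

    // capacity must be a power of two; the table must never become full
    LinearProbingDemo(int capacity) {
        keys = new int[capacity];
        values = new String[capacity];
    }

    void put(int key, String value) {
        int i = key & (keys.length - 1);
        while (keys[i] != 0 && keys[i] != key) {
            i = (i + 1) & (keys.length - 1); // collision: step to the next slot
        }
        keys[i] = key;
        values[i] = value;
    }

    String get(int key) {
        probes = 0;
        int i = key & (keys.length - 1);
        while (keys[i] != 0) {
            probes++;
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & (keys.length - 1);
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbingDemo table = new LinearProbingDemo(8);
        // 1, 9 and 17 all map to slot 1 in a table of 8
        table.put(1, "a");
        table.put(9, "b");
        table.put(17, "c");
        System.out.println(table.get(17) + " found after " + table.probes + " probes");
        System.out.println(table.get(1) + " found after " + table.probes + " probes");
    }
}
```

With three keys that land on the same home slot, the last one inserted needs three probes to find, while an index-based table always inspects exactly one slot.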

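In addition, growing the storage is simpler and cheaper for FastThreadLocal than for ThreadLocal. FastThreadLocal rounds index up to a power of two to get the new capacity and then copies the old data into the new array. ThreadLocal uses a hash table, so it must rehash every entry after growing.

Better safety.

Misusing the JDK's native ThreadLocal can cause memory leaks that are only released when the thread is destroyed. In thread-pool scenarios, ThreadLocal can only guard against leaks through active detection, which adds overhead.

FastThreadLocal, however, not only provides remove() for explicitly clearing a value; for thread-pool scenarios Netty also wraps tasks in FastThreadLocalRunnable, which finishes by calling FastThreadLocal.removeAll() to clean up every FastThreadLocal object in the Set.

FastThreadLocal reflects Netty's relentless pursuit of performance, and it is only the tip of the iceberg. In the next lesson we will continue exploring other efficient data-structure techniques in Netty.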