ThreadLocal in Depth

2022-02-06

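There are several ways to achieve thread safety. (Thread safety is meant here in the broad sense of safe access to shared resources: thread confinement gives each thread its own copy, which makes that thread's access safe, but it does not provide safety in the narrow sense for data that threads still genuinely share.) The main approaches are:

Mutual-exclusion synchronization: synchronized, ReentrantLock, and so on.
Non-blocking synchronization: CAS, AtomicXXXX.
No synchronization: stack confinement, thread-local storage (ThreadLocal), reentrant code.

ThreadLocal, from the "no synchronization" group, is the subject of this article.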

Introduction to ThreadLocal

ThreadLocal is a class in the java.lang package. The official documentation describes it like this:

This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).

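In other words, this class provides thread-local variables. They differ from ordinary variables in that a thread accessing one always works on its own copy. ThreadLocal instances are usually private static fields of classes that want to associate some state with a thread (for example, a user ID or a transaction ID).

To sum up: ThreadLocal provides thread isolation. When a variable is maintained through a ThreadLocal, every thread has its own copy and threads do not affect one another, which avoids the data inconsistency that arises when multiple threads operate on a shared variable.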

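How ThreadLocal works under the hood

ThreadLocal data structure

Now that we know what ThreadLocal is and what it is for, how is it implemented underneath? How does it maintain a separate copy of a variable for every thread?

The Thread class has an instance field threadLocals of type ThreadLocal.ThreadLocalMap, which means every thread has its own ThreadLocalMap.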

/* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

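The overall data storage structure is shown in the figure below:

When a thread puts a value into a ThreadLocal, the value is stored in that thread's own ThreadLocalMap. Reads work the same way: the ThreadLocal instance serves as the key, and the lookup happens in the current thread's own map. That is how thread isolation is achieved.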
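The isolation described above can be observed with a small experiment. The sketch below is illustrative only: the class name IsolationDemo and the field CURRENT_USER are made up for this example. Two threads write to the same ThreadLocal, wait until both writes have happened, and then read back.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class IsolationDemo {
    // One ThreadLocal instance shared by all threads (hypothetical example field).
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two threads and returns what each one read back from CURRENT_USER.
    static Map<String, String> run() throws InterruptedException {
        Map<String, String> seen = new ConcurrentHashMap<>();
        CountDownLatch bothHaveSet = new CountDownLatch(2);
        Runnable task = () -> {
            String name = Thread.currentThread().getName();
            CURRENT_USER.set("user-of-" + name);
            bothHaveSet.countDown();
            try {
                bothHaveSet.await(); // wait until the other thread has also called set()
                seen.put(name, CURRENT_USER.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                CURRENT_USER.remove(); // clean up this thread's copy
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start();
        b.start();
        a.join();
        b.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
        // The main thread never called set(), so its own copy is still null.
        System.out.println(CURRENT_USER.get());
    }
}
```

Each thread reads back only the value it wrote itself, even though both went through the same ThreadLocal object and the other thread's write had already happened.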

The ThreadLocalMap class

ThreadLocalMap is a static nested class of ThreadLocal and is intended for use by ThreadLocal only. Its documentation comment describes it as follows:

ThreadLocalMap is a customized hash map suitable only for maintaining thread local values.

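In essence it is a Map, but it differs from the Map implementations we normally see in a few ways:

It does not implement the Map interface.
It has no public methods; at most its constructors have default (package-private) access, because its methods are only called from within the ThreadLocal class.
Its Entry class extends WeakReference<ThreadLocal<?>>.
It stores keys and values in a single Entry array. Entries are not chained into linked lists; each bucket holds at most one Entry.

The excerpt below shows some of the fields of ThreadLocalMap and its static nested class Entry.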

static class ThreadLocalMap {
        static class Entry extends WeakReference<ThreadLocal<?>> {
            Object value;
            Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
            }
        }
        private static final int INITIAL_CAPACITY = 16;

        private Entry[] table;

        private int size = 0;

        private int threshold; // Default to 0

        private void setThreshold(int len) {
            threshold = len * 2 / 3;
        }
}

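Since ThreadLocalMap is a Map-like structure, what we care about most is its hash algorithm and its get and set methods.

The hash algorithm

// key is a ThreadLocal<?>, i is the array index, len is the array length
int i = key.threadLocalHashCode & (len-1);

threadLocalHashCode is a field of the ThreadLocal class, generated as follows: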

private final int threadLocalHashCode = nextHashCode();
// starts at 0
private static AtomicInteger nextHashCode = new AtomicInteger();
private static final int HASH_INCREMENT = 0x61c88647;
// every new ThreadLocal object advances nextHashCode by 0x61c88647
private static int nextHashCode() {
    return nextHashCode.getAndAdd(HASH_INCREMENT);
}

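Every time a ThreadLocal object is created, ThreadLocal.nextHashCode grows by 0x61c88647.

This value is special: it is derived from the golden ratio (it equals 2^32 minus 2^32 divided by the golden ratio, rounded to an integer) and is the increment used in what is often called Fibonacci hashing. With this increment, consecutive hash codes spread very evenly over a table whose length is a power of two.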
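To see how evenly the increment spreads indexes, here is a minimal sketch that replays the index calculation outside the JDK. The class FibonacciHashDemo and its slots helper are hypothetical names for this example; only HASH_INCREMENT and the `hash & (len - 1)` formula come from the source above.

```java
import java.util.Arrays;

class FibonacciHashDemo {
    private static final int HASH_INCREMENT = 0x61c88647;

    // Slot indexes of the first `count` hash codes in a table of length `len`
    // (len is assumed to be a power of two, as in ThreadLocalMap).
    static int[] slots(int count, int len) {
        int[] result = new int[count];
        int hash = 0; // nextHashCode starts at 0
        for (int i = 0; i < count; i++) {
            result[i] = hash & (len - 1);
            hash += HASH_INCREMENT; // int overflow wraps around, as in AtomicInteger
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
        System.out.println(Arrays.toString(slots(16, 16)));
        System.out.println(Arrays.toString(slots(32, 32)));
    }
}
```

For a table of length 16, the first 16 hash codes land in 16 different slots, and the same holds for length 32. This only models the ideal case in which the ThreadLocal instances used by a thread were created consecutively; in a real program the hash codes seen by one map need not be consecutive, which is why collisions still have to be handled.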

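Hash collisions

Although ThreadLocalMap uses the golden-ratio number as its hash increment, which greatly reduces the probability of collisions, collisions can still occur.

ThreadLocalMap has no linked-list structure, so it resolves collisions with linear probing: on a collision it scans forward linearly until it finds a slot whose Entry is null, and puts the element there.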
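As a rough illustration of linear probing on its own, here is a deliberately simplified table. It is not the JDK code: LinearProbeDemo is a hypothetical class, keys are plain strings with a caller-supplied hash, and there is no resizing or stale-entry handling, so it assumes the table never fills up.

```java
class LinearProbeDemo {
    private final String[] keys;
    private final Object[] values;

    LinearProbeDemo(int capacity) { // capacity is assumed to be a power of two
        keys = new String[capacity];
        values = new Object[capacity];
    }

    // Stores the value and returns the slot index that was finally used.
    int put(String key, int hash, Object value) {
        int len = keys.length;
        int i = hash & (len - 1);
        // Probe forward until we find the same key (update) or an empty slot (insert).
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1 < len) ? i + 1 : 0; // wrap around at the end of the array
        }
        keys[i] = key;
        values[i] = value;
        return i;
    }

    public static void main(String[] args) {
        LinearProbeDemo table = new LinearProbeDemo(8);
        System.out.println(table.put("a", 6, 1)); // 6: home slot is free
        System.out.println(table.put("b", 6, 2)); // 7: collision, next slot
        System.out.println(table.put("c", 6, 3)); // 0: collision, wraps around
        System.out.println(table.put("a", 6, 9)); // 6: same key, update in place
    }
}
```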

The set() method

Straight to the source:

private void set(ThreadLocal<?> key, Object value) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    // resolve hash collisions by linear probing
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
        // same key: this is an update
        if (k == key) {
            e.value = value;
            return;
        }
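        // key is null: the entry is stale (its ThreadLocal was collected), so reuse this slot;
        // otherwise it is a collision and the loop keeps probing forward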
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    tab[i] = new Entry(key, value);
    int sz = ++size;
    // after inserting, clean up some stale entries; if none were removed
    // and the size has reached the threshold, rehash
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
// next slot to probe on a collision, wrapping around at the end of the array
private static int nextIndex(int i, int len) {
    return ((i + 1 < len) ? i + 1 : 0);
}

The get() method

get calls the getEntry method, whose source is as follows:

private Entry getEntry(ThreadLocal<?> key) {
    // compute the index
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    // if the slot is occupied and its key is the one we want, return it directly
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}
/**
 * Version of getEntry method for use when key is not found in
 * its direct hash slot.
 */
private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;
    
    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i); // remove the stale entry along the way
        else
            i = nextIndex(i, len); // probe forward
        e = tab[i];
    }
    return null;
}

Resizing of ThreadLocalMap and the cleanup of stale keys are not covered in detail in this article.

ThreadLocal's get and set methods

The get() method

/**
 * Returns the value in the current thread's copy of this
 * thread-local variable.  If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    // get the current thread
    Thread t = Thread.currentThread();
    // get the current thread's map
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // look up the entry using this ThreadLocal object as the key
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    // no map yet, or no entry for this key: set the initial value (creating the map if needed)
    return setInitialValue();
}

The set() method

/**
 * Sets the current thread's copy of this thread-local variable
 * to the specified value.  Most subclasses will have no need to
 * override this method, relying solely on the {@link #initialValue}
 * method to set the values of thread-locals.
 *
 * @param value the value to be stored in the current thread's copy of
 *        this thread-local.
 */
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

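The set method is simple: if the map is not null it inserts the value, otherwise it creates the map.

Summary

At this point some things may still be fuzzy: how are Thread, ThreadLocal, and ThreadLocalMap related? How does ThreadLocal give every thread an isolated copy of a variable? How does a thread actually set and get the data? If you feel the same way, this summary should clear up those doubts.

The relationship between Thread, ThreadLocal, and ThreadLocalMap

Thread needs little explanation: every thread is a Thread instance.
ThreadLocalMap is a static nested class of ThreadLocal. It is what a thread uses to store its copies of variables (the Thread class has the field ThreadLocal.ThreadLocalMap threadLocals = null;).
From these two points, a Thread that owns a ThreadLocalMap already achieves isolation of the variables, so what is ThreadLocal for?

There are two reasons:
1. ThreadLocalMap is a nested class of ThreadLocal, so it can only be used through ThreadLocal (not very convincing).
2. The keys in a ThreadLocalMap are references to ThreadLocal instances. When there are several ThreadLocal instances, the value has to be looked up in the map by that reference.
So the purpose of ThreadLocal is to serve as the key. When a thread holds several thread-local variables (several ThreadLocal instances), the reference to the ThreadLocal instance is the key that determines which variable we set or get.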
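The three-way relationship can be modelled in a few lines. The toy classes below (ToyThread, ToyThreadLocal, ToyModelDemo) are hypothetical stand-ins, not the JDK implementation: the "current thread" is passed in explicitly instead of coming from Thread.currentThread(), and an ordinary HashMap with strong keys replaces the weak-keyed open-addressing table.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Thread: each thread owns exactly one map, like Thread.threadLocals.
class ToyThread {
    final Map<ToyThreadLocal<?>, Object> locals = new HashMap<>();
}

// Stand-in for ThreadLocal: it holds no data itself, it only acts as the key.
class ToyThreadLocal<T> {
    void set(ToyThread current, T value) {
        current.locals.put(this, value);
    }

    @SuppressWarnings("unchecked")
    T get(ToyThread current) {
        return (T) current.locals.get(this);
    }
}

class ToyModelDemo {
    public static void main(String[] args) {
        ToyThread t1 = new ToyThread();
        ToyThread t2 = new ToyThread();
        ToyThreadLocal<String> userId = new ToyThreadLocal<>();
        ToyThreadLocal<Integer> txId = new ToyThreadLocal<>();

        userId.set(t1, "alice");
        txId.set(t1, 42);
        userId.set(t2, "bob");

        System.out.println(userId.get(t1)); // alice
        System.out.println(txId.get(t1));   // 42
        System.out.println(userId.get(t2)); // bob
        System.out.println(txId.get(t2));   // null
    }
}
```

The data lives in the per-thread map, and the ThreadLocal-like object only selects which entry of that map is read or written. That is why one thread can hold several independent thread-local variables at once.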

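Getting and setting thread-local variables in a thread

With the relationship between Thread, ThreadLocal, and ThreadLocalMap sorted out, here is a summary of how a thread accesses a thread-local variable:

The figure above shows the flow of the set method. In short: use the reference to the ThreadLocal instance as the key and call set on the map.

The get method is similar and is not described in detail here.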

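Memory leaks

Why memory leaks happen

First, look at the schematic of how ThreadLocal works:

Every Thread has its own ThreadLocalMap. The key in the map is the ThreadLocal instance itself, and the value holds the data. As mentioned when discussing the relationship between Thread, ThreadLocal, and ThreadLocalMap, the ThreadLocalMap alone is enough to store the data; ThreadLocal exists to serve as the key. When a thread has several thread-local variables, the ThreadLocal instance is the key used to find the one we need.

The thing to note in the figure above is the dashed line. It indicates that ThreadLocalMap uses a weak reference to the ThreadLocal as the key, and a weakly referenced object is reclaimed at GC (provided no strong reference to it exists).

With the mechanism understood, let us analyze why a memory leak can happen:

ThreadLocalMap uses a weak reference to the ThreadLocal as the key. If a ThreadLocal has no external strong reference pointing to it, it will be reclaimed when the GC runs. The ThreadLocalMap then contains an Entry whose key is null, and there is no way to access the value of such an Entry. If the current thread keeps running for a long time, a strong reference chain to that value remains: Thread Ref -> Thread -> ThreadLocalMap -> Entry -> value. The value cannot be reclaimed, which is a memory leak.

In fact, the design of ThreadLocalMap already takes this situation into account and adds some safeguards: get(), set(), and remove() on a ThreadLocal clear entries with null keys that they come across in the thread's ThreadLocalMap. But these passive precautions cannot guarantee that nothing leaks.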

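Why weak references are used

On the surface the root cause of the leak appears to be the weak reference. Many articles online focus on how ThreadLocal's use of weak references causes memory leaks, but that is not really the case. A different question deserves thought: why use a weak reference rather than a strong one?

Let's first see what the documentation comment says: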

To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys.

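We discuss two cases:

Key held by a strong reference: the object that referenced the ThreadLocal has been reclaimed, but the ThreadLocalMap still holds a strong reference to the ThreadLocal. Unless the entry is removed manually, the ThreadLocal is never reclaimed, and the Entry leaks.
Key held by a weak reference: the object that referenced the ThreadLocal has been reclaimed, and because the ThreadLocalMap holds only a weak reference to the ThreadLocal, the ThreadLocal is reclaimed even without manual removal. The value can then be cleared on a later call to set, get, or remove on that ThreadLocalMap.

Comparing the two cases: because a ThreadLocalMap lives as long as its Thread, failing to remove the key manually can cause a memory leak either way. Weak references add one more layer of protection, though: the weakly referenced ThreadLocal itself does not leak, and its value is cleared on a later call to set, get, or remove on the ThreadLocalMap.

So the root cause of ThreadLocal memory leaks is that a ThreadLocalMap lives as long as its Thread, and a key that is not removed manually leaks. The weak reference is not the cause.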

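Best practices

With the analysis above we understand the causes of ThreadLocal memory leaks. So how do we avoid them?

Call remove() on the ThreadLocal every time you are done using it, to clear the data.
Declare ThreadLocal objects as static, that is, as class variables. A reference to the ThreadLocal object is then always held, so its entries never turn into unreachable null-key entries (you may have noticed that the discussion of weak references and memory leaks above assumed that business code drops its reference to the ThreadLocal object). Values on long-lived threads still stay in place until remove() is called.

When a thread pool is used, failing to clean up a ThreadLocal promptly is not just a memory-leak problem. Worse, it can cause business-logic bugs, because a pooled thread carries the old value over to the next task. So treat ThreadLocal like a lock that must be unlocked after use: clean up when you are done.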
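The thread-pool problem is easy to reproduce. The following is a minimal sketch, with the hypothetical names PoolCleanupDemo, REQUEST_USER, and secondTaskSees: two tasks run back to back on the single worker thread of a single-thread executor, and the second task reads the ThreadLocal without ever setting it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolCleanupDemo {
    private static final ThreadLocal<String> REQUEST_USER = new ThreadLocal<>();

    // Returns what the second task sees on the reused pool thread.
    static String secondTaskSees(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstTask = () -> {
                REQUEST_USER.set("alice");
                try {
                    // ... handle the request using REQUEST_USER.get() ...
                } finally {
                    if (cleanUp) {
                        REQUEST_USER.remove(); // the "unlock" step
                    }
                }
            };
            Callable<String> secondTask = REQUEST_USER::get;

            pool.submit(firstTask).get(); // wait for the first task to finish
            return pool.submit(secondTask).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondTaskSees(false)); // alice (stale value from the previous task)
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Putting remove() in a finally block mirrors the lock/unlock habit mentioned above: the cleanup runs even when the task body throws.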

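ThreadLocal applications

We now know that the core function of ThreadLocal is to isolate variables between threads. So what are its application scenarios in real development?

Sequence numbers

If we want a class to associate state (such as a user ID or a transaction ID) with a thread, we usually define a private static ThreadLocal instance in that class.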

package com.test;  
  
public class TestNum {  
    // ① Override ThreadLocal's initialValue() in an anonymous inner class to specify the initial value
    private static ThreadLocal<Integer> seqNum = new ThreadLocal<Integer>() {  
        public Integer initialValue() {  
            return 0;  
        }  
    };  
  
    // ② Get the next sequence value
    public int getNextNum() {  
        seqNum.set(seqNum.get() + 1);  
        return seqNum.get();  
    }  
  
    public static void main(String[] args) {  
        TestNum sn = new TestNum();  
        // ③ Three threads share sn, each generating its own sequence numbers
        TestClient t1 = new TestClient(sn);  
        TestClient t2 = new TestClient(sn);  
        TestClient t3 = new TestClient(sn);  
        t1.start();  
        t2.start();  
        t3.start();  
    }  
  
    private static class TestClient extends Thread {  
        private TestNum sn;  
  
        public TestClient(TestNum sn) {  
            this.sn = sn;  
        }  
  
        public void run() {  
            for (int i = 0; i < 3; i++) {  
                // ④ Each thread prints 3 sequence values
                System.out.println("thread[" + Thread.currentThread().getName() + "] --> sn["  
                         + sn.getNextNum() + "]");  
            }  
        }  
    }  
}

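Usually we define a ThreadLocal subclass as an anonymous inner class that supplies the initial value, as shown at ① in the example. Each TestClient thread generates a series of sequence numbers; at ③ we create three TestClient instances that share the same TestNum instance. Running the code prints output like the following to the console (the interleaving of lines varies between runs):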

thread[Thread-0] --> sn[1]
thread[Thread-1] --> sn[1]
thread[Thread-2] --> sn[1]
thread[Thread-1] --> sn[2]
thread[Thread-0] --> sn[2]
thread[Thread-1] --> sn[3]
thread[Thread-2] --> sn[2]
thread[Thread-0] --> sn[3]
thread[Thread-2] --> sn[3]

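Looking at the output, we can see that although all the threads share the same TestNum instance, the sequence numbers they produce do not interfere with one another. Each thread generates its own independent sequence, because ThreadLocal gives every thread a separate copy.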
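Since Java 8 the initial value can also be supplied without subclassing, through the static factory ThreadLocal.withInitial. The sketch below is a variant of the sequence-number idea under that assumption; the class SequenceWithInitial and its methods are names invented for this example.

```java
import java.util.Arrays;

class SequenceWithInitial {
    // withInitial takes a Supplier that is called once per thread on first access.
    private static final ThreadLocal<Integer> SEQ = ThreadLocal.withInitial(() -> 0);

    static int next() {
        int n = SEQ.get() + 1;
        SEQ.set(n);
        return n;
    }

    // Starts a fresh thread, takes three sequence values on it, and returns them.
    static int[] firstThreeOnNewThread() throws InterruptedException {
        int[] out = new int[3];
        Thread t = new Thread(() -> {
            for (int i = 0; i < out.length; i++) {
                out[i] = next();
            }
        });
        t.start();
        t.join(); // join() makes the worker's writes to out visible here
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(firstThreeOnNewThread())); // [1, 2, 3]
        System.out.println(Arrays.toString(firstThreeOnNewThread())); // [1, 2, 3]
    }
}
```

Every new thread starts again from the supplier's value, so both calls produce the same sequence.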

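Session management

// Session, HibernateException, InfrastructureException and getSessionFactory() come from the surrounding Hibernate project
private static final ThreadLocal<Session> threadSession = new ThreadLocal<>();

public static Session getSession() throws InfrastructureException {
    Session s = threadSession.get();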
    try {  
        if (s == null) {  
            s = getSessionFactory().openSession();  
            threadSession.set(s);  
        }  
    } catch (HibernateException ex) {  
        throw new InfrastructureException(ex);  
    }  
    return s;  
}

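Creating a ThreadLocal inside a thread class

Another usage is to create the ThreadLocal inside the thread class. The basic steps are:

In the multithreaded class (such as a ThreadDemo class), create a ThreadLocal object threadXxx to hold the object xxx that must be kept isolated between threads.
In the ThreadDemo class, create a method getXxx() that returns the isolated data. In it, check whether the ThreadLocal's get() returns null; if so, create a new object of the isolated type with new and store it in the ThreadLocal (with a raw ThreadLocal, the result must also be cast to the required type).
In the run() method of the ThreadDemo class, call getXxx() to obtain the data to operate on. This guarantees that each thread has its own data object and always operates on that object.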

import java.util.Random;

public class ThreadLocalTest implements Runnable {

    // One ThreadLocal per ThreadLocalTest instance; each thread running it gets its own Student
    private final ThreadLocal<Student> studentThreadLocal = new ThreadLocal<Student>();

    @Override
    public void run() {
        String currentThreadName = Thread.currentThread().getName();
        System.out.println(currentThreadName + " is running...");
        Random random = new Random();
        int age = random.nextInt(100);
        System.out.println(currentThreadName + " is set age: " + age);
        // getStudent() creates a separate Student object for each thread, so each thread can set its own value
        Student student = getStudent();
        student.setAge(age);
        System.out.println(currentThreadName + " is first get age: " + student.getAge());
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(currentThreadName + " is second get age: " + student.getAge());
    }

    // Lazily create the current thread's Student on first access
    private Student getStudent() {
        Student student = studentThreadLocal.get();
        if (null == student) {
            student = new Student();
            studentThreadLocal.set(student);
        }
        return student;
    }

    public static void main(String[] args) {
        ThreadLocalTest t = new ThreadLocalTest();
        Thread t1 = new Thread(t,"Thread A");
        Thread t2 = new Thread(t,"Thread B");
        t1.start();
        t2.start();
    }
    
}

class Student{
    int age;
    public int getAge() {
        return age;
    }
    public void setAge(int age) {
        this.age = age;
    }
    
}

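SimpleDateFormat

SimpleDateFormat is not thread-safe, because it uses a Calendar object internally, and accessing that shared object from multiple threads causes problems. The Alibaba Java development manual recommends the following ThreadLocal usage: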

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateUtils {
    // Keep the DateFormat in a ThreadLocal so that every thread has its own DateFormat instance
    public static final ThreadLocal<DateFormat> threadLocal = new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd");
        }
    };

    public static String format(Date date) {
        return threadLocal.get().format(date);
    }

    public static Date parse(String dateStr) throws ParseException {
        return threadLocal.get().parse(dateStr);
    }
}

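Here is a demonstration of what goes wrong with multiple threads when ThreadLocal is not used:

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.CountDownLatch;

public class DateUtil {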
    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    public static String format(Date date) {
        return dateFormat.format(date);
    }
    public static Date parse(String dateStr) throws ParseException {
        return dateFormat.parse(dateStr);
    }
    public static void main(String[] args) {
        final CountDownLatch latch = new CountDownLatch(1);
        final String[] strs = new String[] {"2016-01-01 10:24:00", "2016-01-02 20:48:00", "2016-01-11 12:24:00"};
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        latch.await();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    for (int j = 0; j < 10; j++) {
                        try {
                            System.out.println(Thread.currentThread().getName() + "\t" + parse(strs[j % strs.length]));
                            Thread.sleep(100);
                        } catch (ParseException e) {
                            e.printStackTrace();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }
        latch.countDown();
    }
}

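Copyright notice: unless otherwise stated, all articles on this blog are copyrighted by the author. Please credit the source when reposting!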