Distributed Locks: A Summary

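Contents

Distributed locks
- What is a distributed lock?
- Ways to implement a distributed lock
  - Database-based (MySQL)
  - Cache-based (Redis)
    - Demonstrating the multi-instance concurrency problem
      - Project code (using Redis)
      - Configuring nginx.conf
      - Reproducing the problem with a JMeter load test
        - Concurrency 1: no concurrency problem
        - Concurrency 30: the concurrency problem appears (each instance uses `synchronized`, but this is a distributed deployment with multiple instances)
  - Redis distributed lock: SETNX implementation
    - Lock expiry and the watchdog
    - Appendix: Redis SETNX-related commands and distributed locks
    - Redisson
      - Code & test
      - How Redisson works internally
      - Implementing a reentrant lock
      - Problems with Redis distributed locks
      - The Redis master-slave problem
      - Redlock (the lock is acquired only if more than half of the nodes grant it)
      - Implementing a distributed lock for high concurrency
  - ZooKeeper-based
    - ZooKeeper node types
    - ZooKeeper's watch mechanism
    - ZooKeeper lock
      - Plain ephemeral node (herd effect)
      - Sequential nodes (fair, avoids the herd effect)
    - Curator InterProcessMutex (reentrant fair lock)
      - Code & test
      - InterProcessMutex internals
        - Initialization
        - Locking
        - watch
        - Releasing the lock
  - Redis vs ZooKeeper (AI-generated answer)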

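Distributed locks

What is a distributed lock?

Lock: a lock is needed when a shared resource must be accessed mutually exclusively in a multi-task environment.
Distributed lock: when the tasks are separate JVM processes, the locks provided by the JDK only work inside one process, so an external lock is required.

In a distributed deployment, a lock mechanism makes multiple clients access a shared resource mutually exclusively. Such a lock should have the following properties:

Exclusivity: at any moment only one client can hold the lock; no other client can acquire it at the same time.
Deadlock avoidance: the lock is always released within a bounded time, whether by a normal release or after a failure.
High availability: the mechanism for acquiring and releasing the lock must be highly available and perform well.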

Ways to implement a distributed lock

Database-based (MySQL)

Create a lock table:

CREATE TABLE `methodLock` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'name of the locked method',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'remarks',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'time the row was saved, generated automatically',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='methods currently locked';

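Acquire the lock by inserting a row and release it by deleting the row; the unique constraint on method_name guarantees that only one insert for a given method can succeed.
Drawbacks:
- A single database instance is a single point of failure: if it goes down, the business is unavailable.
- The lock has no expiry: if the unlock fails, the row stays in the table and no other thread can acquire the lock.
- The lock is not reentrant: a thread cannot acquire it again before releasing it, because the row already exists.
- The lock is not fair.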
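
The insert/delete approach and two of its drawbacks can be sketched as follows. To stay self-contained, the sketch does not talk to MySQL: the hypothetical `MethodLockTable` class is an in-memory stand-in for the `methodLock` table, with a concurrent set playing the role of the unique index on `method_name`. With a real database, `insert` would be an INSERT that fails with a duplicate-key error and `delete` would be a DELETE. All class and method names here are assumptions made for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the methodLock table (hypothetical, for illustration only). */
final class MethodLockTable {
    // Plays the role of the unique index on method_name.
    private final Set<String> rows = ConcurrentHashMap.newKeySet();

    /** Models INSERT: returns false where the database would report a duplicate key. */
    boolean insert(String methodName) {
        return rows.add(methodName);
    }

    /** Models DELETE ... WHERE method_name = ?. */
    void delete(String methodName) {
        rows.remove(methodName);
    }
}

/** Lock built on the table: acquiring inserts a row, releasing deletes it. */
final class TableLock {
    private final MethodLockTable table;
    private final String methodName;

    TableLock(MethodLockTable table, String methodName) {
        this.table = table;
        this.methodName = methodName;
    }

    boolean tryLock() {
        return table.insert(methodName);
    }

    void unlock() {
        table.delete(methodName);
    }
}

final class TableLockDemo {
    public static void main(String[] args) {
        MethodLockTable table = new MethodLockTable();
        TableLock first = new TableLock(table, "deductStock");
        TableLock second = new TableLock(table, "deductStock");
        System.out.println(first.tryLock());   // true: the row is inserted
        System.out.println(second.tryLock());  // false: the row already exists
        System.out.println(first.tryLock());   // false: the lock is not reentrant
        first.unlock();
        System.out.println(second.tryLock());  // true: the row was deleted
    }
}
```

Because nothing ever expires a row, a holder that crashes before calling `unlock()` blocks every later `tryLock()`, which is the missing-expiry drawback listed above.
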
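Using the database's exclusive lock

Appending FOR UPDATE to a query makes the database take an exclusive lock during the query (with InnoDB, a row lock when the lookup uses an index, otherwise effectively a table lock). Once a row is locked exclusively, no other thread can take an exclusive lock on it. The thread that obtains the exclusive lock can therefore be treated as the holder of the distributed lock: it runs the business logic of the method and then releases the lock with connection.commit().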

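// Assumes two fields: an open java.sql.Connection `connection` and the String `methodName` to lock.
public boolean lock() throws SQLException, InterruptedException {
    connection.setAutoCommit(false);
    while (true) {
        try (PreparedStatement ps = connection.prepareStatement(
                "SELECT id FROM methodLock WHERE method_name = ? FOR UPDATE")) {
            ps.setString(1, methodName);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    return true; // the row exists and this transaction now holds its exclusive lock
                }
            }
        } catch (SQLException e) {
            // for example a lock wait timeout: fall through and retry
        }
        Thread.sleep(1000);
    }
}

public void unlock() throws SQLException {
    connection.commit(); // committing the transaction releases the row lock
}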

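Cache-based (Redis)

Demonstrating the multi-instance concurrency problem

Project code (using Redis)

See the project code: the stock-deduction example.

Start two instances of the Spring Boot project (that is, two JVM processes), then call the stock-deduction endpoint on each instance to test it:

curl -X POST \
  http://localhost:8088/deduct_stock_sync \
  -H 'Content-Type: application/json'

curl -X POST \
  http://localhost:8089/deduct_stock_sync \
  -H 'Content-Type: application/json'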

Configuring nginx.conf

http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    upstream redislock {
        server localhost:8088 weight=1;
        server localhost:8089 weight=1;
    }

    server {
        listen       8080;
        server_name  localhost;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location / {
            root   html;
            index  index.html index.htm;
            proxy_pass  http://redislock;
        }
    }
}

Starting and stopping nginx

mubi@mubideMacBook-Pro nginx $ sudo nginx
mubi@mubideMacBook-Pro nginx $ ps -ef | grep nginx
    0 47802     1   0  1:18PM ??         0:00.00 nginx: master process nginx
   -2 47803 47802   0  1:18PM ??         0:00.00 nginx: worker process
  501 47835 20264   0  1:18PM ttys001    0:00.00 grep --color=always nginx
mubi@mubideMacBook-Pro nginx $

sudo nginx -s stop

Access test (through nginx on port 8080)

curl -X POST \
  http://localhost:8080/deduct_stock_sync \
  -H 'Content-Type: application/json'

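Reproducing the problem with a JMeter load test

Set stock to 100 in Redis.

Concurrency 1: no concurrency problem

After 30 requests sent one at a time, the stock has decreased by 30 and a Redis GET returns 70, as expected.

Concurrency 30: the concurrency problem appears (each instance uses `synchronized`, but this is a distributed deployment with multiple instances)

With 30 concurrent requests, the final stock is not 70.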
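
The lost update behind this result can be replayed deterministically. The sketch below is an in-memory model rather than the project code: an `AtomicInteger` stands in for the stock key in Redis, the hypothetical `StockInstance` class stands in for one Spring Boot JVM, and a latch forces both instances to read the stock before either writes it back. Each instance holds its own monitor for the whole read-modify-write, yet the two monitors never exclude each other.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Models one application instance; its monitor exists only inside "its own JVM". */
final class StockInstance {
    private final Object localMonitor = new Object();
    private final AtomicInteger sharedStock; // stand-in for the stock key in Redis

    StockInstance(AtomicInteger sharedStock) {
        this.sharedStock = sharedStock;
    }

    /** Read-modify-write under the local monitor; the latch makes both instances read before either writes. */
    void deduct(CountDownLatch bothHaveRead) throws InterruptedException {
        synchronized (localMonitor) {
            int stock = sharedStock.get();
            bothHaveRead.countDown();
            bothHaveRead.await();
            sharedStock.set(stock - 1);
        }
    }
}

final class LostUpdateDemo {
    /** Runs one deduction on each of two instances and returns the final stock. */
    static int replay() {
        AtomicInteger stock = new AtomicInteger(100);
        CountDownLatch bothHaveRead = new CountDownLatch(2);
        Thread a = worker(new StockInstance(stock), bothHaveRead);
        Thread b = worker(new StockInstance(stock), bothHaveRead);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stock.get();
    }

    private static Thread worker(StockInstance instance, CountDownLatch latch) {
        return new Thread(() -> {
            try {
                instance.deduct(latch);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("final stock = " + replay()); // 99, although two deductions ran
    }
}
```

Inside a single instance the second thread could not enter the synchronized block until the first had written, so this interleaving would be impossible there; across instances nothing prevents it.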

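Redis distributed lock: SETNX implementation

With 30 concurrent requests the failure rate is 60%, so only 12 requests succeed, and the final stock value in Redis is 88, as expected.

Most requests fail to grab the Redis lock and get a "system busy" error back.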

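Lock expiry and the watchdog

If the machine crashes, the finally block that releases the lock may never run, so the Redis key must be given an expiry time. But how long should it be?

The timeout is hard to choose because the duration of the business logic is not known in advance: if the timeout is too short and the business logic runs long, the key expires and is deleted, and another business thread grabs the lock.
A thread may also delete a lock held by another thread, because the lock carries no marker identifying its owner.

Improvement 1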

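@PostMapping(value = "/deduct_stock_lock")
public String deductStockLock() throws Exception {
    // SETNX; Redis executes commands on a single thread
    String lockKey = "lockKey";
    String clientId = UUID.randomUUID().toString();

    // The following two calls would have to be one atomic operation:
    //     Boolean setOk = stringRedisTemplate.opsForValue().setIfAbsent(lockKey, clientId);
    //     stringRedisTemplate.expire(lockKey, 10, TimeUnit.SECONDS); // set the expiry
    // so set the value and the expiry in a single call instead
    Boolean setOk = stringRedisTemplate.opsForValue().setIfAbsent(lockKey, clientId, 10, TimeUnit.SECONDS);
    if (!Boolean.TRUE.equals(setOk)) {
        throw new Exception("Service busy, please try again later");
    }

    String retVal;
    try {
        // Only one thread gets here at a time. The business logic may throw, the machine may crash,
        // and so on; the lock must be released regardless
        retVal = stockReduce();
    } finally {
        // This release can still fail: GET and DELETE are two separate commands, not one atomic step
        if (clientId.equals(stringRedisTemplate.opsForValue().get(lockKey))) {
            stringRedisTemplate.delete(lockKey);
        }
    }
    return retVal;
}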

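A timeout that is too short can be handled by renewing the lock periodically, which is the watchdog: start a separate thread that checks at a fixed interval whether the lock is still held and, if so, resets its expiry.
When setting the key, use a value that identifies the owner (the code above uses a UUID; a bare thread ID is not unique across JVMs) and compare it before deleting, so that only the owner deletes the lock.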
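
The effect of the owner check can be sketched with a small in-memory model. `FakeRedis` below is hypothetical and only imitates two operations: a set-if-absent with expiry, and a compare-and-delete that is assumed to be atomic (with real Redis this is commonly done in a Lua script). The clock is advanced by hand so that the scenario is reproducible: client A overruns its 10-second expiry, client B takes the lock, and A's late release must not remove B's lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Tiny in-memory model of a Redis key space with expiry (hypothetical, for illustration). */
final class FakeRedis {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();
    private long nowMillis = 0; // simulated clock, advanced by hand

    synchronized void advance(long millis) {
        nowMillis += millis;
    }

    /** Models SET key value NX PX ttl: succeeds only if the key is absent or already expired. */
    synchronized boolean setIfAbsent(String key, String value, long ttlMillis) {
        Entry current = data.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false;
        }
        data.put(key, new Entry(value, nowMillis + ttlMillis));
        return true;
    }

    /** Models an atomic compare-and-delete: true only if the caller's token still owns the key. */
    synchronized boolean deleteIfOwner(String key, String token) {
        Entry current = data.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis || !current.value.equals(token)) {
            return false;
        }
        data.remove(key);
        return true;
    }
}

final class TokenLockDemo {
    /** Returns true if client B still holds the lock after client A's late release attempt. */
    static boolean lateReleaseLeavesNewOwnerIntact() {
        FakeRedis redis = new FakeRedis();
        boolean aLocked = redis.setIfAbsent("lockKey", "token-A", 10_000);
        redis.advance(11_000); // A's business logic overruns the 10 s expiry
        boolean bLocked = redis.setIfAbsent("lockKey", "token-B", 10_000);
        boolean aReleased = redis.deleteIfOwner("lockKey", "token-A"); // refused: A no longer owns the key
        boolean cLocked = redis.setIfAbsent("lockKey", "token-C", 10_000); // refused: B still holds it
        return aLocked && bLocked && !aReleased && !cLocked;
    }

    public static void main(String[] args) {
        System.out.println(lateReleaseLeavesNewOwnerIntact()); // true
    }
}
```

Without the token comparison, A's release would delete B's key and let a third client in while B is still working.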

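Appendix: Redis SETNX-related commands and distributed locks

SETNX (SET if Not eXists): sets the key only if it does not already exist.
EXPIRE key seconds: sets the key's time to live; when the key expires (its time to live reaches 0), it is deleted automatically.

A single atomic command can set key:value together with a 10-second timeout: SET key value NX EX 10.

boolean lock() {
    // value identifies the owner, for example a unique client or thread ID
    ret = SET key value NX EX 10;
    return ret;
}

void unlock() {
    val = GET key;
    if (val != null && val.equals(value)) {
        DEL key;
    }
}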

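Redisson

Redisson is a Redis-based Java client that provides a distributed lock implementation. At its core it relies on Redis Lua scripts and atomic operations to guarantee mutual exclusion, and it supports reentrant locks, fair locks, lock renewal and more.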

Code & test

@Bean
public Redisson redisson() {
    Config config = new Config();
    config.useSingleServer()
          .setAddress("redis://localhost:6379")
          .setDatabase(0);
    return (Redisson) Redisson.create(config);
}

@Autowired
private Redisson redisson;

@PostMapping(value = "/deduct_stock_redisson")
public String deductStockRedisson() throws Exception {
    String lockKey = "lockKey";
    RLock rLock = redisson.getLock(lockKey);
    String retVal;
    // Acquire before entering try, so that finally never unlocks a lock that was not acquired
    rLock.lock();
    try {
        // Only one thread gets here at a time. The business logic may throw, the machine may crash,
        // and so on; the lock must be released regardless
        retVal = stockReduce();
    } finally {
        rLock.unlock();
    }
    return retVal;
}

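The same concurrent requests now complete without any problem.

How Redisson works internally

The key and its expiry are set atomically by a Lua script, which replaces the separate SETNX and EXPIRE calls.
Once the key is set, its expiry defaults to 30 seconds; a background task then runs every 10 seconds (one third of the original expiry), checks that the lock is still held, and extends the expiry.
Threads that fail to acquire the lock keep retrying until the holder releases it (in Redisson the wait is driven by release notifications rather than a tight busy loop).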
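
The renewal arithmetic can be checked with a small model. `WatchdogLeaseModel` below is a hypothetical, single-threaded simulation with a hand-driven clock; it is not Redisson code and uses no Redisson API. It only encodes the two numbers from the text, a 30-second lease renewed every 10 seconds, and shows what happens when the holder keeps running and when its process dies.

```java
/** Single-threaded model of a lease that a watchdog keeps extending (hypothetical, simulated clock). */
final class WatchdogLeaseModel {
    static final long LEASE_MILLIS = 30_000;                 // default expiry mentioned above
    static final long RENEW_EVERY_MILLIS = LEASE_MILLIS / 3; // renewal interval: 10 s

    private long nowMillis = 0;
    private long expiresAtMillis = 0;
    private long nextRenewalMillis = 0;
    private boolean holderAlive = false;

    /** Acquires the lease and starts the watchdog. */
    void lock() {
        expiresAtMillis = nowMillis + LEASE_MILLIS;
        nextRenewalMillis = nowMillis + RENEW_EVERY_MILLIS;
        holderAlive = true;
    }

    /** The holder's process dies: the key stays behind, but the watchdog stops with the process. */
    void holderCrashes() {
        holderAlive = false;
    }

    /** Advances the clock, running every renewal that falls due while the holder is alive. */
    void advance(long millis) {
        long target = nowMillis + millis;
        while (holderAlive && nextRenewalMillis <= target) {
            nowMillis = nextRenewalMillis;
            expiresAtMillis = nowMillis + LEASE_MILLIS;
            nextRenewalMillis = nowMillis + RENEW_EVERY_MILLIS;
        }
        nowMillis = target;
    }

    boolean isLocked() {
        return nowMillis < expiresAtMillis;
    }
}

final class WatchdogDemo {
    public static void main(String[] args) {
        WatchdogLeaseModel lease = new WatchdogLeaseModel();
        lease.lock();
        lease.advance(120_000);               // business logic runs for two minutes
        System.out.println(lease.isLocked()); // true: the watchdog kept renewing
        lease.holderCrashes();
        lease.advance(30_000);                // no renewals any more
        System.out.println(lease.isLocked()); // false: the last lease ran out
    }
}
```

The second half is the reason the expiry is kept at all: a crashed holder cannot block other clients for longer than one lease period.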

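Implementing a reentrant lock

Store more globally identifying information in the value, together with the reentrancy count:

{
  "count": 1,
  "expireAt": 147506817232,
  "jvmPid": 22224,              // JVM process ID
  "mac": "28-D2-44-0E-0D-9A",   // MAC address
  "threadId": 14                // thread ID
}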
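
A minimal sketch of the counting logic, assuming the owner is identified by a single string (for example the JVM process ID, MAC address and thread ID from the value above joined together). `ReentrantLockModel` is a hypothetical in-memory model; in a real deployment the same check-and-update would have to run atomically on the lock server.

```java
/** In-memory model of a reentrant lock entry: one owner plus a hold count (hypothetical, for illustration). */
final class ReentrantLockModel {
    private String owner;   // null while the lock is free
    private int holdCount;

    /** Acquires the lock if it is free, or increments the count if the caller already owns it. */
    synchronized boolean tryLock(String ownerId) {
        if (owner == null) {
            owner = ownerId;
            holdCount = 1;
            return true;
        }
        if (owner.equals(ownerId)) {
            holdCount++;
            return true;
        }
        return false;
    }

    /** Decrements the count; the lock becomes free only when the count reaches zero. */
    synchronized boolean unlock(String ownerId) {
        if (!ownerId.equals(owner)) {
            return false; // only the owner may release
        }
        holdCount--;
        if (holdCount == 0) {
            owner = null;
        }
        return true;
    }

    synchronized int holdCount() {
        return holdCount;
    }
}

final class ReentrantLockModelDemo {
    public static void main(String[] args) {
        ReentrantLockModel lock = new ReentrantLockModel();
        String me = "22224:28-D2-44-0E-0D-9A:14";
        System.out.println(lock.tryLock(me));           // true
        System.out.println(lock.tryLock(me));           // true: re-entered, count is now 2
        System.out.println(lock.tryLock("other:1"));    // false: held by someone else
        lock.unlock(me);
        System.out.println(lock.tryLock("other:1"));    // false: count is still 1
        lock.unlock(me);
        System.out.println(lock.tryLock("other:1"));    // true: the lock is free again
    }
}
```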

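Problems with Redis distributed locks

Redis distributed locks have a flaw in Sentinel (or plain master-slave) deployments:

Client 1 writes a Redisson lock to a master node, and the write is replicated asynchronously to the corresponding slave. If the master crashes before replication completes, a failover promotes the slave to master, but the new master has no record of the lock. When client 2 then tries to acquire the lock, it succeeds on the new master, so several clients hold the same distributed lock at once.

At that point the business semantics are broken and dirty data of all kinds can result.

The Redis master-slave problem

Background: a single Redis instance supports on the order of 100,000 QPS.

In a Redis master-slave setup the master replicates to the slave. If the master sets the key successfully but goes down before replication finishes, the slave becomes the master without the key, and another thread can acquire the lock. This is the same lock-loss problem described above.

Redis cannot guarantee strong consistency. ZooKeeper solves this, but its performance is lower than that of Redis.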

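Redlock (the lock is acquired only if more than half of the nodes grant it)

A failed acquisition has to be rolled back on the nodes where the key was set.
Locking on more Redis nodes hurts performance.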
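
The majority rule and the rollback can be sketched with in-memory nodes. This is a simplified model, not the full Redlock algorithm: it leaves out key expiry and the check that acquisition finished within the lock's validity time. `LockNode` and `QuorumLock` are hypothetical names; each `LockNode` stands in for one independent Redis instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for one independent Redis instance (hypothetical, no expiry modelled). */
final class LockNode {
    private final Map<String, String> data = new HashMap<>();
    private boolean up = true;

    void setUp(boolean up) {
        this.up = up;
    }

    /** Set-if-absent; an unreachable node counts as a failed attempt. */
    boolean setIfAbsent(String key, String token) {
        return up && data.putIfAbsent(key, token) == null;
    }

    /** Deletes the key only if it still carries the caller's token. */
    void deleteIfOwner(String key, String token) {
        if (up) {
            data.remove(key, token);
        }
    }

    boolean holds(String key) {
        return data.containsKey(key);
    }
}

final class QuorumLock {
    private final List<LockNode> nodes;

    QuorumLock(List<LockNode> nodes) {
        this.nodes = nodes;
    }

    /** Succeeds only if more than half of the nodes accept the key; otherwise rolls back. */
    boolean tryLock(String key, String token) {
        int acquired = 0;
        for (LockNode node : nodes) {
            if (node.setIfAbsent(key, token)) {
                acquired++;
            }
        }
        if (acquired >= nodes.size() / 2 + 1) {
            return true;
        }
        unlock(key, token); // roll back the nodes where the set succeeded
        return false;
    }

    void unlock(String key, String token) {
        for (LockNode node : nodes) {
            node.deleteIfOwner(key, token);
        }
    }
}

final class QuorumLockDemo {
    public static void main(String[] args) {
        List<LockNode> nodes = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            nodes.add(new LockNode());
        }
        nodes.get(4).setUp(false); // one node is unreachable
        QuorumLock lock = new QuorumLock(nodes);
        System.out.println(lock.tryLock("lockKey", "A")); // true: 4 of 5 nodes accepted
        System.out.println(lock.tryLock("lockKey", "B")); // false: no node accepted, nothing left behind
    }
}
```

Every acquisition touches all nodes, which is the performance cost noted above.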

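Implementing a distributed lock for high concurrency

Use the idea of segmented locking: split the resource into segments, each protected by its own lock, so that concurrent requests contend for different locks.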
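
One way to picture the segmented idea, applied to the stock example: the total stock is split across several segments, and a request locks only the segment it deducts from. In the sketch below, local `ReentrantLock` objects stand in for per-segment distributed locks and an `int[]` stands in for per-segment stock keys; the class name, the `hint` parameter and the even split are all assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Stock split into segments, each guarded by its own lock (hypothetical, local locks as stand-ins). */
final class SegmentedStock {
    private final int[] stock;
    private final ReentrantLock[] locks;

    SegmentedStock(int total, int segments) {
        stock = new int[segments];
        locks = new ReentrantLock[segments];
        for (int i = 0; i < segments; i++) {
            // Spread the total as evenly as possible across the segments.
            stock[i] = total / segments + (i < total % segments ? 1 : 0);
            locks[i] = new ReentrantLock();
        }
    }

    /** Deducts one unit, starting at the segment chosen by hint and moving on if a segment is empty. */
    boolean deduct(int hint) {
        for (int i = 0; i < stock.length; i++) {
            int segment = Math.floorMod(hint + i, stock.length);
            locks[segment].lock();
            try {
                if (stock[segment] > 0) {
                    stock[segment]--;
                    return true;
                }
            } finally {
                locks[segment].unlock();
            }
        }
        return false; // every segment is empty
    }

    /** Sums the stock over all segments, locking one segment at a time. */
    int remaining() {
        int sum = 0;
        for (int i = 0; i < stock.length; i++) {
            locks[i].lock();
            try {
                sum += stock[i];
            } finally {
                locks[i].unlock();
            }
        }
        return sum;
    }
}

final class SegmentedStockDemo {
    public static void main(String[] args) {
        SegmentedStock stock = new SegmentedStock(100, 10);
        for (int request = 0; request < 30; request++) {
            stock.deduct(request); // e.g. hint derived from a request or user ID
        }
        System.out.println(stock.remaining()); // 70
    }
}
```

With ten segments, up to ten requests can deduct at the same time instead of queueing on a single lock key; the price is the extra logic for moving to another segment when one runs empty.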

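ZooKeeper-based

A quick review of ZooKeeper: it is a file system plus a watch/notification mechanism.

ZooKeeper node types

PERSISTENT: persistent node

The node stays in ZooKeeper until it is deleted explicitly, and it is restored after ZooKeeper restarts.

EPHEMERAL: ephemeral node

The lifetime of an ephemeral node is bound to the client session. Once the session expires, all ephemeral nodes created by that client are removed. Note that a dropped connection between the client and ZooKeeper does not necessarily mean the session has expired.

PERSISTENT_SEQUENTIAL: persistent sequential node

Same as a persistent node, plus ordering: a monotonically increasing integer maintained by the parent node is appended to the node name.

EPHEMERAL_SEQUENTIAL: ephemeral sequential node

Same as an ephemeral node, plus ordering: a monotonically increasing integer maintained by the parent node is appended to the node name.

ZooKeeper's watch mechanism

Push-based: when a watch fires, ZooKeeper pushes the notification to the client; the client does not need to poll.
One-shot: a watch fires only once when the data changes; to be notified of later updates, the client must register a new watch after the previous one fires.
Visibility: if a client attaches a watch to a read request, it cannot see the updated data before it receives the watch event, even if it reads again at the moment the watch fires. In other words, the notification arrives before the updated result.
Ordering: if several updates trigger several watches, the watches fire in the same order as the updates.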

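ZooKeeper lock

Plain ephemeral node (herd effect)

Take 1000 concurrent clients: only one acquires the lock, and the other 999 watch the node and wait. When the lock is released, all 999 clients are notified and all of them try to create the lock node again.

So every change to the lock makes nearly all clients react, which puts enormous pressure on the cluster. This is the herd effect.

Sequential nodes (fair, avoids the herd effect)

First create a parent node, preferably a persistent one (PERSISTENT type).
Every thread that wants the lock creates an ephemeral sequential node under that parent.
Because the sequence numbers increase, the rule can be that the node with the smallest number holds the lock.
So before taking the lock, each thread checks whether its own number is currently the smallest; if so, it holds the lock.

Using the ordering: each thread watches only the node immediately before its own, and a release notifies only the one thread immediately after it instead of all of them, which avoids the herd effect.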
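
The rule "smallest number holds the lock, everyone else watches only its predecessor" can be sketched as a pure function over the list of child node names. The sketch assumes the usual ZooKeeper naming for sequential nodes, a 10-digit zero-padded counter appended to the name; `SequentialLockQueue` and the `lock-` prefix are hypothetical, and no ZooKeeper client is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SequentialLockQueue {
    /** Extracts the sequence number, assuming a 10-digit zero-padded suffix on the node name. */
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.length() - 10));
    }

    /**
     * Returns null if ownNode has the smallest sequence number (it holds the lock),
     * otherwise the name of the node immediately before it, which is the only one to watch.
     */
    static String nodeToWatch(List<String> children, String ownNode) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(SequentialLockQueue::sequenceOf));
        int ownIndex = sorted.indexOf(ownNode);
        if (ownIndex < 0) {
            throw new IllegalArgumentException("own node is not among the children: " + ownNode);
        }
        return ownIndex == 0 ? null : sorted.get(ownIndex - 1);
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "lock-0000000002", "lock-0000000000", "lock-0000000001");
        System.out.println(nodeToWatch(children, "lock-0000000000")); // null: holds the lock
        System.out.println(nodeToWatch(children, "lock-0000000002")); // lock-0000000001
    }
}
```

When `lock-0000000001` disappears, only the client behind `lock-0000000002` is notified; it re-runs the check and now watches `lock-0000000000`, so a release or crash wakes exactly one waiter.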

Curator InterProcessMutex (reentrant fair lock)

See the official Curator documentation.

Code & test

(Link to the practice code.)

@Component
public class CuratorConfiguration {
    @Bean(initMethod = "start")
    public CuratorFramework curatorFramework() {
        RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
        return CuratorFrameworkFactory.newClient("127.0.0.1:2181", retryPolicy);
    }
}

@Autowired
private CuratorFramework curatorFramework;

@PostMapping(value = "/deduct_stock_zk")
public String deductStockZk() throws Exception {
    String path = "/stock";
    InterProcessMutex interProcessMutex = new InterProcessMutex(curatorFramework, path);
    // Acquire before entering try, so that finally never releases a lock that was not acquired
    interProcessMutex.acquire();
    try {
        return stockReduce();
    } finally {
        interProcessMutex.release();
    }
}

The load test results are normal.
InterProcessMutex internals

Initialization

/**
 * @param client client
 * @param path   the path to lock
 * @param driver lock driver
 */
public InterProcessMutex(CuratorFramework client, String path, LockInternalsDriver driver) {
    this(client, path, LOCK_NAME, 1, driver);
}

/**
 * Returns a facade of the current instance that tracks
 * watchers created and allows a one-shot removal of all watchers
 * via {@link WatcherRemoveCuratorFramework#removeWatchers()}
 *
 * @return facade
 */
public WatcherRemoveCuratorFramework newWatcherRemoveCuratorFramework();

Locking

private boolean internalLock(long time, TimeUnit unit) throws Exception {
    /*
       Note on concurrency: a given lockData instance
       can be only acted on by a single thread so locking isn't necessary
    */

    Thread currentThread = Thread.currentThread();

    // Look up this thread's lock data; if it exists, this is a re-entrant acquire
    LockData lockData = threadData.get(currentThread);
    if ( lockData != null ) {
        // re-entering
        lockData.lockCount.incrementAndGet();
        return true;
    }

    // Try to acquire the lock
    String lockPath = internals.attemptLock(time, unit, getLockNodeBytes());
    if ( lockPath != null ) {
        // Acquired: record the lock data in the `threadData` map
        LockData newLockData = new LockData(currentThread, lockPath);
        threadData.put(currentThread, newLockData);
        return true;
    }

    // Lock not acquired
    return false;
}

String attemptLock(long time, TimeUnit unit, byte[] lockNodeBytes) throws Exception {
    final long      startMillis = System.currentTimeMillis();
    final Long      millisToWait = (unit != null) ? unit.toMillis(time) : null;
    final byte[]    localLockNodeBytes = (revocable.get() != null) ? new byte[0] : lockNodeBytes;
    int             retryCount = 0;

    String          ourPath = null;
    boolean         hasTheLock = false;
    boolean         isDone = false;
    while ( !isDone ) {
        isDone = true;

        try {
            ourPath = driver.createsTheLock(client, path, localLockNodeBytes);
            hasTheLock = internalLockLoop(startMillis, millisToWait, ourPath);
        } catch ( KeeperException.NoNodeException e ) {
            // gets thrown by StandardLockInternalsDriver when it can't find the lock node
            // this can happen when the session expires, etc. So, if the retry allows, just try it all again
            if ( client.getZookeeperClient().getRetryPolicy().allowRetry(retryCount++, System.currentTimeMillis() - startMillis, RetryLoop.getDefaultRetrySleeper()) ) {
                isDone = false;
            } else {
                throw e;
            }
        }
    }

    if ( hasTheLock ) {
        return ourPath;
    }

    return null;
}

The lock node that gets created is an ephemeral sequential node:

@Override
public String createsTheLock(CuratorFramework client, String path, byte[] lockNodeBytes) throws Exception {
    String ourPath;
    if ( lockNodeBytes != null ) {
        ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path, lockNodeBytes);
    } else {
        ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path);
    }
    return ourPath;
}

watch

private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception {
    boolean     haveTheLock = false;
    boolean     doDelete = false;
    try {
        if ( revocable.get() != null ) {
            client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
        }

        while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock ) {
            // Fetch all children of the lock node, sorted
            List<String>        children = getSortedChildren();
            String              sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash

            // Decide whether we hold the lock
            PredicateResults    predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
            if ( predicateResults.getsTheLock() ) {
                haveTheLock = true;
            } else {
                String  previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();

                synchronized(this) {
                    try {
                        // use getData() instead of exists() to avoid leaving unneeded watchers which is a type of resource leak
                        // Watch the previous node and wait
                        client.getData().usingWatcher(watcher).forPath(previousSequencePath);
                        if ( millisToWait != null ) {
                            millisToWait -= (System.currentTimeMillis() - startMillis);
                            startMillis = System.currentTimeMillis();
                            if ( millisToWait <= 0 ) {
                                doDelete = true;    // timed out - delete our node
                                break;
                            }

                            wait(millisToWait);
                        } else {
                            wait();
                        }
                    } catch ( KeeperException.NoNodeException e ) {
                        // it has been deleted (i.e. lock released). Try to acquire again
                    }
                }
            }
        }
    } catch ( Exception e ) {
        ThreadUtils.checkInterrupted(e);
        doDelete = true;
        throw e;
    } finally {
        if ( doDelete ) {
            deleteOurPath(ourPath);
        }
    }
    return haveTheLock;
}

Whether the lock is acquired depends on whether our node is the smallest one, or more precisely on whether its index in the sorted children is below maxLeases, which is 1 for a mutex:

@Override
public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception {
    int             ourIndex = children.indexOf(sequenceNodeName);
    validateOurIndex(sequenceNodeName, ourIndex);

    boolean         getsTheLock = ourIndex < maxLeases;
    String          pathToWatch = getsTheLock ? null : children.get(ourIndex - maxLeases);

    return new PredicateResults(pathToWatch, getsTheLock);
}

Releasing the lock

Check the reentrancy count first; once it reaches zero, remove the watchers and delete the node.

/**
 * Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
 * thread had made multiple calls to acquire, the mutex will still be held when this method returns.
 *
 * @throws Exception ZK errors, interruptions, current thread does not own the lock
 */
@Override
public void release() throws Exception {
    /*
        Note on concurrency: a given lockData instance
        can be only acted on by a single thread so locking isn't necessary
     */

    Thread currentThread = Thread.currentThread();
    LockData lockData = threadData.get(currentThread);
    if ( lockData == null ) {
        throw new IllegalMonitorStateException("You do not own the lock: " + basePath);
    }

    int newLockCount = lockData.lockCount.decrementAndGet();
    if ( newLockCount > 0 ) {
        return;
    }
    if ( newLockCount < 0 ) {
        throw new IllegalMonitorStateException("Lock count has gone negative for lock: " + basePath);
    }
    try {
        internals.releaseLock(lockData.lockPath);
    } finally {
        threadData.remove(currentThread);
    }
}

final void releaseLock(String lockPath) throws Exception {
    client.removeWatchers();
    revocable.set(null);
    deleteOurPath(lockPath);
}

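Redis vs ZooKeeper (AI-generated answer)

Consistency model
ZooKeeper provides strong consistency guarantees: writes go through a quorum and are applied in a single global order, so once an update is acknowledged, the other servers reflect it in that same order. (A read from a lagging follower can still be slightly stale unless the client calls sync first.) This makes ZooKeeper a good fit for scenarios that need strict consistency.

By contrast, Redis replication is asynchronous and offers only eventual consistency by default. The Redlock algorithm can strengthen the guarantees, but in extreme cases such as network partitions or high master-slave replication lag, brief inconsistencies remain possible.

Reliability and fault tolerance
ZooKeeper builds a highly available cluster on the ZAB protocol (a Paxos-like atomic broadcast protocol) and keeps serving as long as a majority of nodes is alive, so the failure of a minority of nodes does not stop the system.

A standalone Redis instance, however, is a single point of failure. Sentinel or Cluster mode mitigates this to some extent, but if the master crashes and the failover does not complete in time, locks can be lost. Asynchronous master-slave replication makes this kind of problem more likely.

Performance
For high-frequency, low-latency workloads Redis has a clear advantage. It is an in-memory database, and common operations such as GET and SET complete in O(1) time, which makes it an ideal choice for high-performance scenarios.

ZooKeeper emphasizes stability and consistency over raw speed. For workloads that do not need real-time responses but do need reliability, ZooKeeper is the better choice.

Features and extensibility
With the Redisson library, developers get reentrant locks, automatic renewal, fair locks and similar features on top of Redis with little effort, which makes Redis locks far more practical.

ZooKeeper's native API is simple and low-level, but it lets users build custom logic for specific needs. Compared with the out-of-the-box solution that Redisson offers, the development cost is somewhat higher.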