[Linux] Thread Concepts and Control

I. Thread Concepts

1. What Is a Thread

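A thread is a single flow of execution inside a process and is the basic unit of CPU scheduling. Threads are lightweight: creating and destroying one costs fewer resources than doing the same for a process, and switching between threads of the same process is cheaper than switching between processes. All threads of a process share one virtual address space, and the process is the unit through which resources are allocated to them. At the same time, each thread runs independently, with its own program counter, register context, and stack.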

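Put simply, on Linux a process consists of one or more task_struct structures, one virtual address space, and its page table. A thread is one task_struct, that is, one execution flow inside the process. Every task_struct of the process points to the same virtual address space, which the process manages as a whole. The resources in that address space are divided into regions such as the heap, the stack, and the shared (memory-mapped) region, and a single page table maps them to physical memory.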

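In short, a thread is a task_struct. Because all the threads of a process point to the same virtual address space, they are managed together, which lowers the cost of creating, scheduling, and destroying them.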
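The claim that threads are separate kernel tasks inside one process can be checked from user space. The following is a minimal sketch, assuming Linux and glibc: it reads the kernel thread ID through the raw `SYS_gettid` system call and compares it with `getpid`. The names `Ids`, `current_ids`, `report_ids`, and `same_process_different_task` are hypothetical helpers introduced only for this illustration.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Identity of one execution flow as the kernel sees it.
struct Ids {
    pid_t pid; // process ID, shared by every thread of the process
    pid_t tid; // kernel thread ID, unique per task_struct
};

// Linux-specific: the kernel thread ID is read through the raw system call.
static Ids current_ids() {
    return Ids{getpid(), static_cast<pid_t>(syscall(SYS_gettid))};
}

// Thread routine: records the IDs seen from inside the new thread.
static void *report_ids(void *arg) {
    *static_cast<Ids *>(arg) = current_ids();
    return nullptr;
}

// Returns true when a new thread reports the same PID as its creator
// but a different kernel thread ID.
bool same_process_different_task() {
    Ids creator_ids = current_ids();
    Ids child_ids{};
    pthread_t tid;
    if (pthread_create(&tid, nullptr, report_ids, &child_ids) != 0)
        return false;
    pthread_join(tid, nullptr); // child_ids is fully written after the join
    return creator_ids.pid == child_ids.pid && creator_ids.tid != child_ids.tid;
}
```

Calling `same_process_different_task()` from any thread should return true on Linux: the two flows share one process but are two distinct schedulable tasks.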

2. A Closer Look at the Virtual Address Space

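The page table maps the virtual address space onto physical memory. Addresses in the virtual address space are contiguous, while the physical memory behind them can be scattered and fragmented. An object must occupy contiguous virtual addresses, but the kernel is free to place the page frames that back it wherever is convenient, and it can share one physical copy of identical content (for example, the code of a shared library) to save space. Whichever thread accesses the data, it goes through the same mapping and sees the same data.

With both a virtual and a physical address space, how are the two connected? By adding one more layer: the page table.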

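On Linux the page table is multi-level. This section uses the classic 32-bit x86 layout with 4 KB pages (64-bit x86 uses four or five levels), in which a lookup takes three steps. The first level is the page directory, whose entries hold the addresses of page tables. The second level is the page tables themselves, whose entries hold the addresses of page frames. The third step is the page frame, a 4 KB block of physical memory. Every piece of data has a 32-bit virtual address, which is read as 10 + 10 + 12 bits: the top 10 bits index the page directory to find a page table, the middle 10 bits index that page table to find a page frame, and the low 12 bits are the offset from the start of the page frame. In this way a virtual address is translated to the corresponding physical address.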
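The 10 + 10 + 12 split can be written out as plain bit arithmetic. This is only a sketch of the address decomposition, not of how the kernel or the MMU implements it; `VirtualAddressParts` and `split_address` are hypothetical names chosen for the example.

```cpp
#include <cstdint>

// The three fields of a 32-bit virtual address under the 10 + 10 + 12 layout.
struct VirtualAddressParts {
    std::uint32_t directory_index; // top 10 bits: entry in the page directory
    std::uint32_t table_index;     // middle 10 bits: entry in the page table
    std::uint32_t offset;          // low 12 bits: byte offset inside the 4 KB frame
};

// Splits a virtual address with shifts and masks.
constexpr VirtualAddressParts split_address(std::uint32_t va) {
    return VirtualAddressParts{va >> 22, (va >> 12) & 0x3FFu, va & 0xFFFu};
}
```

For example, `split_address(0x12345678)` gives directory index 72, table index 837 (0x345), and offset 1656 (0x678). The 12-bit offset covers exactly 4096 bytes, which is why the page frame size is 4 KB.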

[Figure: simplified diagram of the lookup]

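Given page frames, how are they managed? Describe first, then organize: the operating system introduces the struct page structure, and every page frame has a struct page that manages it.

Here is a brief look at what struct page contains.

The structure records and tracks the state of a page frame: how it is being used, who owns it, how it is mapped, and whether it can be reclaimed.

First come the status flags (flags), which record the basic state of the page frame, such as whether it is locked, modified (dirty), or reserved by the kernel. The reference count (_refcount) records how many references there are to the page frame. The mapping information (mapping + index) ties the page frame to what it holds: mapping points to the address space that owns the page (for a file-backed page, the file's page cache), and index is the page's offset within that mapping.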

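Summary:
1. Virtual and physical addresses are managed separately and mapped through the page table, which decouples the two.
2. Creating page tables on demand, together with paging, saves a significant amount of memory.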

3. Advantages and Disadvantages of Threads

(1) Advantages

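Creating a thread costs far less than creating a process and uses fewer resources: a thread only needs a new task_struct attached to the existing virtual address space, whereas creating a process also involves a new virtual address space, page tables, and other resources. Switching between threads is more efficient than switching between processes: a process switch has to switch the virtual address space and related state as well, while a thread switch within the same process only switches the task_struct and the register context. Threads can also run in parallel on multiple processors, which improves both I/O throughput and computational efficiency.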

(2) Disadvantages

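Threads share the address space of their process, so every thread can access the shared resources. As a result, without synchronization, threads can race on shared data and make the program misbehave. Too many threads also run into resource limits. First, each thread has its own stack in memory, by default usually 8 MB of reserved address space on Linux, so a very large number of threads can exhaust memory. Second, scheduling overhead grows with the number of threads: the CPU spends its time switching between threads, and useful CPU utilization drops. Finally, every thread creation allocates a TCB in the kernel, so creating and destroying threads at a high rate burdens the kernel. Processes are highly independent: if a program fails and its process crashes, other processes keep running. If a thread crashes, however, the whole process may exit.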
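The data-race drawback is usually handled with a mutex. The sketch below is one possible illustration, not something taken from the original post: several threads increment one shared counter, and `pthread_mutex_lock` / `pthread_mutex_unlock` make each increment atomic with respect to the others. The names `g_lock`, `g_counter`, `add_many`, and `count_with_mutex` are hypothetical.

```cpp
#include <pthread.h>
#include <vector>

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static long g_counter = 0; // shared by every thread in the process

// Thread routine: increments the shared counter `iterations` times.
static void *add_many(void *arg) {
    long iterations = *static_cast<long *>(arg);
    for (long i = 0; i < iterations; ++i) {
        pthread_mutex_lock(&g_lock);   // only one thread at a time past this point
        ++g_counter;
        pthread_mutex_unlock(&g_lock);
    }
    return nullptr;
}

// Runs `thread_count` threads and returns the final counter value.
long count_with_mutex(int thread_count, long iterations) {
    g_counter = 0;
    std::vector<pthread_t> tids;
    for (int i = 0; i < thread_count; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, nullptr, add_many, &iterations) == 0)
            tids.push_back(tid);
    }
    for (pthread_t tid : tids)
        pthread_join(tid, nullptr); // iterations must stay alive until here
    return g_counter;
}
```

With the lock in place, `count_with_mutex(4, 100000)` returns 400000. Removing the lock and unlock calls turns the increment into a data race, and the result typically comes out smaller.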

II. Thread Control

1. Creating a Thread

pthread_create:

#include <pthread.h>
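int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
Parameters:
thread: output parameter that receives the ID of the newly created thread.
attr: sets the thread's attributes; pass NULL to use the defaults.
start_routine: a function pointer whose parameter and return value are both void*. This is the thread routine, the function the new thread will execute.
arg: the argument passed to the thread routine.
Return value:
0 on success, an error number on failure.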
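The parameters above can be sketched in a small example that passes a struct through `arg` and checks the return value. Note that `pthread_create` reports failure through its return value rather than through `errno`. The `Task` struct and the functions `add_task` and `run_add_task` are hypothetical names used only here, and the example uses `pthread_join` to wait for the result.

```cpp
#include <pthread.h>
#include <cstring>
#include <iostream>

// Input and output of the thread routine travel in one struct.
struct Task {
    int a;
    int b;
    int sum; // filled in by the thread
};

// Thread routine: matches void *(*)(void *) as pthread_create requires.
static void *add_task(void *arg) {
    Task *task = static_cast<Task *>(arg);
    task->sum = task->a + task->b;
    return nullptr;
}

// Runs add_task in a new thread and returns the sum (or -1 if creation fails).
int run_add_task(int a, int b) {
    Task task{a, b, 0};
    pthread_t tid;
    int err = pthread_create(&tid, nullptr, add_task, &task);
    if (err != 0) {
        std::cerr << "pthread_create: " << std::strerror(err) << '\n';
        return -1;
    }
    // task lives in this stack frame, so wait for the thread before returning.
    pthread_join(tid, nullptr);
    return task.sum;
}
```

Packing arguments into a struct is a common way around the single `void*` parameter; the caller only has to make sure the struct outlives the thread that uses it.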

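When a program starts, the operating system creates a process, and at the same moment one thread begins to run. That thread is the main thread.

Here is an example in which the main thread creates a new thread.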

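#include <iostream>
#include <cstring>
#include <pthread.h>
#include <unistd.h>
using namespace std;

// Thread routine: runs in the new thread until the process exits.
void *startRoutine(void *args)
{
    while (true)
    {
        cout << "thread is running" << endl;
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, startRoutine, (void *)"thread-1");
    if (n != 0) // pthread_create returns an error number instead of setting errno
    {
        cerr << "pthread_create: " << strerror(n) << endl;
        return 1;
    }
    cout << "new thread id:" << tid << endl;
    while (true)
    {
        cout << "main thread is running" << endl;
        sleep(1);
    }
    return 0;
}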

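When run, the program prints the new thread's ID once, and then the main thread and the new thread each print a line roughly once per second.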

To get the ID of the calling thread, use the pthread_self function, as in the following code:

#include <iostream>
#include <cstring>
#include <pthread.h>
#include <unistd.h>
using namespace std;

static void Print(const char *name, pthread_t tid)
{
    cout << name << " is running " << tid << endl;
}

void *routine(void *argv)
{
    const char *name = static_cast<const char *>(argv);
    while (true)
    {
        Print(name, pthread_self()); // pthread_self returns the calling thread's ID
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, routine, (void *)"thread");
    if (n != 0)
    {
        cerr << "pthread_create: " << strerror(n) << endl;
        return 1;
    }
    while (true)
    {
        cout << "main thread run" << endl;
        sleep(1);
    }
    return 0;
}

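When run, the new thread prints its name and its thread ID once per second, while the main thread prints its own line. Inside the thread, pthread_self returns the calling thread's ID, which is then passed to Print.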
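The ID returned by `pthread_self` inside a thread is the same ID that `pthread_create` hands back to the creator. Since `pthread_t` is an opaque type, the portable way to compare two IDs is `pthread_equal`. The sketch below assumes POSIX unnamed semaphores are available (as on Linux); they keep the new thread alive while the IDs are compared, because a thread ID is only valid until the thread has been joined. `Probe`, `store_self`, and `ids_match` are hypothetical names.

```cpp
#include <pthread.h>
#include <semaphore.h>

struct Probe {
    pthread_t seen_inside; // value of pthread_self() inside the new thread
    sem_t stored;          // posted once seen_inside has been written
    sem_t may_exit;        // posted when the creator has finished comparing
};

static void *store_self(void *arg) {
    Probe *probe = static_cast<Probe *>(arg);
    probe->seen_inside = pthread_self();
    sem_post(&probe->stored);
    sem_wait(&probe->may_exit); // stay alive so the thread ID remains valid
    return nullptr;
}

// Returns true when the ID from pthread_create equals the ID the thread saw
// via pthread_self, and differs from the ID of the calling thread.
bool ids_match() {
    Probe probe;
    sem_init(&probe.stored, 0, 0);
    sem_init(&probe.may_exit, 0, 0);
    pthread_t created;
    bool ok = false;
    if (pthread_create(&created, nullptr, store_self, &probe) == 0) {
        sem_wait(&probe.stored);
        ok = pthread_equal(created, probe.seen_inside) != 0 &&
             pthread_equal(created, pthread_self()) == 0;
        sem_post(&probe.may_exit);
        pthread_join(created, nullptr);
    }
    sem_destroy(&probe.stored);
    sem_destroy(&probe.may_exit);
    return ok;
}
```

Printing a `pthread_t` with `cout`, as the examples in this post do, works on Linux because the type happens to be an integer there; `pthread_equal` avoids relying on that.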

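2. Terminating a Thread

There are three ways to terminate a thread:
1. Return from the thread function.
2. Call pthread_exit in the thread.
3. Call pthread_cancel from one thread to terminate another thread in the same process.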

Method 1: (return from the thread function)

This method is straightforward and is not covered in detail here.

Method 2: (pthread_exit)

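pthread_exit terminates the calling thread.
#include <pthread.h>
void pthread_exit(void *retval);
Parameters:
retval: the thread's exit value, which a thread that joins it can retrieve.
Note:
The memory pointed to by the pointer passed to pthread_exit or returned with return must be global or allocated with malloc. Returning a pointer to a variable local to the thread function leads to unpredictable results, because the data stored on the thread's stack is destroyed when the thread exits.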

Here is a correct usage:

#include <iostream>
#include <cstring>
#include <pthread.h>
#include <unistd.h>
using namespace std;

static void printTid(const char *name, const pthread_t &tid)
{
    cout << name << " is running " << tid << endl;
}

void *Routine(void *argv)
{
    const char *name = static_cast<const char *>(argv);
    int cnt = 5;
    while (true)
    {
        printTid(name, pthread_self());
        if (!(cnt--))
        {
            break;
        }
        sleep(1);
    }
    cout << "thread exiting" << endl;
    pthread_exit((void *)11111); // the exit code is carried in the pointer value itself
}

int main()
{
    pthread_t tid;
    int n = pthread_create(&tid, nullptr, Routine, (void *)"thread");
    if (n != 0)
    {
        cerr << "pthread_create: " << strerror(n) << endl;
        return 1;
    }
    void *ret = nullptr;
    pthread_join(tid, &ret); // blocks until the new thread has exited
    cout << "main thread join success   " << (long long)ret << endl;
    sleep(5);
    while (true)
    {
        printTid("main thread", pthread_self());
        sleep(2);
    }
    return 0;
}

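When run, the new thread prints its ID six times, prints its exit message, and exits with the value 11111. pthread_join in the main thread then returns, and the main thread prints that value before entering its own loop.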
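The third method from the list, `pthread_cancel`, is not shown above, so here is a minimal sketch of it. It assumes the default cancellation settings (enabled, deferred), under which the target thread is cancelled when it reaches a cancellation point such as `sleep`. A cancelled thread that is joined yields the special value `PTHREAD_CANCELED`. The names `sleep_forever` and `cancel_and_check` are hypothetical.

```cpp
#include <pthread.h>
#include <unistd.h>

// Thread routine: never returns on its own.
static void *sleep_forever(void *) {
    while (true)
        sleep(1); // sleep() is a cancellation point
}

// Creates a thread, cancels it, and reports whether the join result
// is PTHREAD_CANCELED.
bool cancel_and_check() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, sleep_forever, nullptr) != 0)
        return false;
    if (pthread_cancel(tid) != 0) // only sends the request; it does not wait
        return false;
    void *ret = nullptr;
    if (pthread_join(tid, &ret) != 0) // wait until the cancellation took effect
        return false;
    return ret == PTHREAD_CANCELED;
}
```

`pthread_cancel` only queues a request, so a join is still needed to know that the target thread has actually stopped and to reclaim its resources.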

3. Waiting for a Thread

pthread_join:

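Just as a process must be waited for, a created thread must be waited for as well. If a new (joinable) thread is created and no thread ever waits for it, its resources cannot be reclaimed after it exits, which causes a resource leak. The function used to wait for a thread is pthread_join.
#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
Parameters:
thread: the ID of the thread to wait for.
retval: output parameter that receives the thread's exit value (the pointer it returned or passed to pthread_exit); pass NULL if the value is not needed.
Return value:
0 on success, an error number on failure.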

Sample code:

#include <cstdio>
#include <pthread.h>
#include <unistd.h>

void *thread_func(void *arg)
{
    printf("child thread finished its task\n");
    pthread_exit((void *)100);
    sleep(200); // never reached: pthread_exit does not return
}

int main()
{
    pthread_t tid;
    void *ret;
    pthread_create(&tid, NULL, thread_func, NULL);
    pthread_join(tid, &ret); // blocks until the child thread exits
    printf("child thread exit status: %ld\n", (long)ret);
    return 0;
}

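When run, the main thread blocks in pthread_join until the child thread has finished its task; only then does it continue and print the exit status 100.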
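Besides small integer codes, `pthread_join` can hand back a pointer to real data, as long as that data does not live on the exiting thread's stack. The sketch below illustrates this with a heap-allocated `std::string`; it is one possible pattern, and `build_message` and `join_and_collect` are hypothetical names.

```cpp
#include <pthread.h>
#include <string>

// Thread routine: returns a heap-allocated result, which outlives the
// thread's stack. Ownership passes to whoever joins the thread.
static void *build_message(void *arg) {
    const char *name = static_cast<const char *>(arg);
    return new std::string(std::string("hello from ") + name);
}

// Creates the thread, joins it, and takes ownership of the result.
// Returns an empty string if the thread cannot be created or joined.
std::string join_and_collect(const char *name) {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, build_message, const_cast<char *>(name)) != 0)
        return "";
    void *ret = nullptr;
    if (pthread_join(tid, &ret) != 0)
        return "";
    std::string *message = static_cast<std::string *>(ret);
    std::string copy = *message;
    delete message; // the joiner is responsible for freeing the result
    return copy;
}
```

For instance, `join_and_collect("worker")` returns `"hello from worker"`. Returning the address of a local `std::string` from `build_message` instead would leave the joiner with a dangling pointer.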

4. Detaching a Thread

pthread_detach:

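Detaching and joining are mutually exclusive: a detached thread cannot be joined. When the main thread does not care about a child thread's return value, it can detach the child (a thread can also detach itself), and the detached thread keeps running its own work. A detached thread is still part of the process and still uses the process's resources, so if it fails it can still affect the other threads or the whole process. Detaching relieves the burden of join: the main thread no longer has to wait for the child, and the child releases its own resources automatically when it finishes.
#include <pthread.h>
int pthread_detach(pthread_t thread);
Parameters:
thread: the ID of the thread to detach.
Return value:
0 on success, an error number on failure.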
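A thread detaching itself can be sketched as follows. Since a detached thread cannot be joined, this illustration uses a `std::atomic<bool>` flag so the creator can observe that the work was done; that flag, the polling loop, and the names `g_done`, `detached_worker`, and `run_detached` are assumptions made for the example, not part of the pthread API.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>

static std::atomic<bool> g_done{false};

// Thread routine: detaches itself, does its work, and exits.
static void *detached_worker(void *) {
    pthread_detach(pthread_self()); // no one needs to join this thread
    g_done.store(true);             // stands in for the thread's real work
    return nullptr;                 // resources are released automatically
}

// Starts a detached worker and waits (up to about one second) for it to finish.
bool run_detached() {
    g_done.store(false);
    pthread_t tid;
    if (pthread_create(&tid, nullptr, detached_worker, nullptr) != 0)
        return false;
    // No pthread_join here: joining a detached thread is an error.
    for (int i = 0; i < 1000 && !g_done.load(); ++i)
        usleep(1000);
    return g_done.load();
}
```

If the main thread returns from `main` while detached threads are still running, the whole process exits and takes them with it, so a long-lived program still needs some way to know when its detached work has finished.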

5. The POSIX Thread Library

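On Linux, the kernel itself has no dedicated notion of a thread and offers no thread-specific interface; it only provides lightweight processes (through the clone system call). Users, however, want a convenient interface for creating threads. A user-space thread library fills this gap: built on top of what the kernel provides, it gives users a complete set of thread interfaces.

Most of these interfaces start with pthread_. To use them, include the header <pthread.h> and link with the -lpthread option (or compile with -pthread).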

6. Thread Stacks and pthread_t

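A thread is an independent execution flow and produces its own data as it runs, so each thread has its own stack. The thread's stack is reclaimed when the thread is destroyed.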

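On Linux, the thread interfaces are all called through a user-space library. pthread_t is a thread's identity card, used to identify and operate on the thread. Inside the library, each thread is described by a control structure; the thread_info below is a simplified illustration of it (glibc's real structure is struct pthread and has many more fields).
struct thread_info
{
    pthread_t tid;
    void *stack;
};
The thread stack is managed alongside it. Each time the user creates a thread, the library creates a thread control block (thread_info here) and returns a pthread_t to the user, which is the starting virtual address of that structure.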

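The main thread uses the stack region of the process's address space, while each created thread uses a stack provided by the library (allocated in the shared region, for example with mmap).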
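Because the library allocates the stack of each created thread, its size can be chosen through the attribute object passed to `pthread_create`. The following sketch sets a stack size with `pthread_attr_setstacksize` and reads it back with `pthread_attr_getstacksize`; the helper names `noop` and `create_with_stack_size` are hypothetical.

```cpp
#include <pthread.h>
#include <cstddef>

static void *noop(void *) { return nullptr; }

// Creates (and joins) a thread whose stack size is `bytes`.
// Returns the size stored in the attribute object, or 0 on any failure.
std::size_t create_with_stack_size(std::size_t bytes) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return 0;
    std::size_t actual = 0;
    pthread_t tid;
    if (pthread_attr_setstacksize(&attr, bytes) == 0 &&
        pthread_attr_getstacksize(&attr, &actual) == 0 &&
        pthread_create(&tid, &attr, noop, nullptr) == 0) {
        pthread_join(tid, nullptr);
    } else {
        actual = 0; // too small a size, or thread creation failed
    }
    pthread_attr_destroy(&attr);
    return actual;
}
```

For example, `create_with_stack_size(1024 * 1024)` runs a thread on a 1 MiB stack instead of the default. Sizes below `PTHREAD_STACK_MIN` are rejected by `pthread_attr_setstacksize`.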

7. Thread-Local Storage

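Global variables are shared between threads: all threads in a process use the same copy. To give each thread a private copy of a global variable, use thread-local storage.

First, let's verify that threads really do share one global variable.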

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int global_val = 100; // one copy, shared by every thread

void *Routine(void *argv)
{
    const char *name = static_cast<const char *>(argv);
    while (true)
    {
        // Unsynchronized on purpose: this only demonstrates that the variable is shared.
        cout << "thread: " << name << "  global_value: " << global_val
             << " new: " << global_val++ << " address: " << &global_val << endl;
        sleep(1);
    }
}

int main()
{
    pthread_t tid1;
    pthread_t tid2;
    pthread_t tid3;
    pthread_create(&tid1, nullptr, Routine, (void *)"thread1");
    pthread_create(&tid2, nullptr, Routine, (void *)"thread2");
    pthread_create(&tid3, nullptr, Routine, (void *)"thread3");
    pthread_join(tid1, nullptr);
    pthread_join(tid2, nullptr);
    pthread_join(tid3, nullptr);
    return 0;
}

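In the output, all three threads print the same address for the global variable and see each other's increments: they are all using one and the same variable.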

To give every thread its own copy of the variable, do the following.

Sample code:

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

__thread int global_val = 100; // each thread gets its own copy

void *Routine(void *argv)
{
    const char *name = static_cast<const char *>(argv);
    while (true)
    {
        cout << "thread: " << name << "  global_value: " << global_val
             << " new: " << global_val++ << " address: " << &global_val << endl;
        sleep(1);
    }
}

int main()
{
    pthread_t tid1;
    pthread_t tid2;
    pthread_t tid3;
    pthread_create(&tid1, nullptr, Routine, (void *)"thread1");
    pthread_create(&tid2, nullptr, Routine, (void *)"thread2");
    pthread_create(&tid3, nullptr, Routine, (void *)"thread3");
    pthread_join(tid1, nullptr);
    pthread_join(tid2, nullptr);
    pthread_join(tid3, nullptr);
    return 0;
}

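Once __thread is placed in front of the global variable, every thread gets its own copy of it in thread-local storage. In the output, each thread prints a different address for the variable, and each copy is incremented independently.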
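`__thread` is a GCC/Clang extension; standard C++11 offers the `thread_local` keyword for the same purpose. The sketch below uses `thread_local` to check the two properties described above: each thread starts from its own copy of the initial value, and increments in one thread are invisible to the others. `tls_counter`, `bump`, and `tls_is_per_thread` are hypothetical names.

```cpp
#include <pthread.h>

thread_local int tls_counter = 100; // every thread starts with its own 100

// Thread routine: increments this thread's copy five times and reports it.
static void *bump(void *arg) {
    for (int i = 0; i < 5; ++i)
        ++tls_counter;
    *static_cast<int *>(arg) = tls_counter;
    return nullptr;
}

// Returns true when two threads each end at 105 and the caller's own copy
// is left untouched.
bool tls_is_per_thread() {
    int before = tls_counter;
    int seen1 = 0;
    int seen2 = 0;
    pthread_t t1;
    pthread_t t2;
    if (pthread_create(&t1, nullptr, bump, &seen1) != 0)
        return false;
    if (pthread_create(&t2, nullptr, bump, &seen2) != 0) {
        pthread_join(t1, nullptr);
        return false;
    }
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    return seen1 == 105 && seen2 == 105 && tls_counter == before;
}
```

If `tls_counter` were an ordinary global, the two threads would share it and the final values would depend on how their increments interleaved.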

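III. Wrapping Threads in a Class

Thread wrapper

Here is a simple wrapper around a thread that supports creating, detaching, joining, and terminating it.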

#include <iostream>
#include <string>
#include <cstdint>
#include <cstring>
#include <functional>
#include <pthread.h>
using namespace std;

static uint32_t number = 1; // counter used to build thread names

template <typename T>
class Thread
{
    using func_t = function<void(T)>;

private:
    void EnableDetach()
    {
        cout << "thread is detached" << endl;
        _isdetach = true;
    }

    void EnableRunning()
    {
        _isrunning = true;
    }

    // pthread_create needs a plain function pointer, so the entry point is
    // static and receives the object through its void* argument.
    static void *Routine(void *argv)
    {
        Thread<T> *self = static_cast<Thread<T> *>(argv);
        self->_func(self->_Data); // invoke the callback
        return nullptr;
    }

public:
    Thread(func_t func, T Data)
        : _tid(0), _isrunning(false), _isdetach(false), _Data(Data), res(nullptr), _func(func)
    {
        _name = "Thread - " + to_string(number++);
    }

    void Detach()
    {
        if (_isdetach)
            return;
        if (!_isrunning) // not started yet: Start() will detach the thread
        {
            EnableDetach();
            return;
        }
        int n = pthread_detach(_tid);
        if (n != 0)
        {
            cerr << "fail to detach: " << strerror(n) << endl;
        }
        else
        {
            cout << "success to detach" << endl;
            _isdetach = true;
        }
    }

    void Join()
    {
        if (_isdetach)
        {
            cout << "thread is detached and cannot be joined" << endl;
            return;
        }
        int n = pthread_join(_tid, &res);
        if (n != 0)
        {
            cerr << "fail to join: " << strerror(n) << endl;
        }
        else
        {
            cout << "success to join" << endl;
        }
    }

    bool Start()
    {
        if (_isrunning)
            return false;
        int n = pthread_create(&_tid, nullptr, Routine, this);
        if (n != 0)
        {
            cerr << "fail to create pthread: " << strerror(n) << endl;
            return false;
        }
        EnableRunning();
        if (_isdetach) // detach was requested before the thread existed
            pthread_detach(_tid);
        cout << "success to create pthread" << endl;
        return true;
    }

    bool Stop()
    {
        if (!_isrunning)
            return false;
        int n = pthread_cancel(_tid);
        if (n != 0)
        {
            cerr << "fail to Stop: " << strerror(n) << endl;
            return false;
        }
        else
        {
            cout << "success to Stop" << endl;
            _isrunning = false;
            return true;
        }
    }

    ~Thread() {}

private:
    string _name;
    pthread_t _tid;
    bool _isrunning;
    bool _isdetach;
    T _Data;
    void *res; // exit value collected by Join
    func_t _func;
};
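The core trick in the wrapper is the static member function used as the thread entry point: `pthread_create` only accepts a plain function pointer, so the object is passed through the `void*` argument and cast back inside. The sketch below isolates that technique in a much smaller class that joins in its destructor. It is a separate illustration, not a replacement for the class above, and `JoiningThread`, `Trampoline`, and `sum_in_thread` are hypothetical names.

```cpp
#include <pthread.h>
#include <functional>
#include <utility>

// Minimal wrapper: runs a callable in a pthread and joins it on destruction.
class JoiningThread {
public:
    explicit JoiningThread(std::function<void()> fn) : _fn(std::move(fn)) {}
    JoiningThread(const JoiningThread &) = delete;
    JoiningThread &operator=(const JoiningThread &) = delete;

    bool Start() {
        if (_started)
            return false;
        // `this` travels through the void* argument to the static entry point.
        _started = (pthread_create(&_tid, nullptr, Trampoline, this) == 0);
        return _started;
    }

    void Join() {
        if (_started && !_joined) {
            pthread_join(_tid, nullptr);
            _joined = true;
        }
    }

    ~JoiningThread() { Join(); } // never leaves a joinable thread behind

private:
    // Static, so it has the void *(*)(void *) type pthread_create expects.
    static void *Trampoline(void *self) {
        static_cast<JoiningThread *>(self)->_fn();
        return nullptr;
    }

    std::function<void()> _fn;
    pthread_t _tid{};
    bool _started = false;
    bool _joined = false;
};

// Usage: computes a + b in a separate thread.
int sum_in_thread(int a, int b) {
    int result = 0;
    JoiningThread worker([&] { result = a + b; });
    worker.Start();
    worker.Join(); // result is safe to read after the join
    return result;
}
```

Joining in the destructor is a design choice that trades flexibility for safety: the object cannot be destroyed while its thread is still using it.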

Thanks for reading, and thank you for your support!