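For data with spatial variation, if you do not know how the features relate to each other or what they mean, fitting a spatial Durbin model directly is a fairly general approach, although many follow-up tests are then needed to show that the results are interpretable and statistically sound.
But if we already know what the features mean, we can be more direct. Take data on firms' economic development that includes employees' research ability, company culture, current policy reforms, and changes in the external economy. We can split these into individual effects (characteristics that do not change over time, treated here as research ability and company culture) and time effects (trends that all individuals experience together, such as policy reforms and external economic changes). This lets us analyze how firms in each region are developing quickly and directly.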
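Written out, the decomposition described above is the two-way fixed effects model. The notation below is a sketch chosen to match the variable names commonly used for this setup (`alpha_i`, `gamma_t`, `epsilon`); the symbols are assumptions for illustration, not a definition taken from a specific reference.

```latex
Y_{it} = \beta X_{it} + \alpha_i + \gamma_t + \varepsilon_{it},
\qquad i = 1, \dots, n, \quad t = 1, \dots, T
```

Here $\alpha_i$ is the individual effect (constant over time for individual $i$), $\gamma_t$ is the time effect (common to all individuals in period $t$), and $\varepsilon_{it}$ is the idiosyncratic error.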
Here is an example:
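# Load the required packages
library(plm)
library(lmtest)
library(dplyr)

# Generate a simulated data set
set.seed(123)
n <- 100 # number of individuals
t <- 5   # number of time periods

# Build the panel data structure
data <- expand.grid(id = 1:n, time = 1:t) %>%
  mutate(
    # Individual fixed effect (constant over time)
    alpha_i = rnorm(n, mean = 0, sd = 2)[id],
    # Time fixed effect (constant across individuals)
    gamma_t = rnorm(t, mean = 0, sd = 1)[time],
    # Explanatory variable
    X = rnorm(n * t, mean = 5, sd = 2),
    # Error term
    epsilon = rnorm(n * t, mean = 0, sd = 1),
    # Dependent variable (true coefficient beta = 0.8)
    Y = 0.8 * X + alpha_i + gamma_t + epsilon
  )

# Inspect the first few rows
head(data)

# Two-way fixed effects model
twoway_model <- plm(Y ~ X, data = data, index = c("id", "time"), model = "within", effect = "twoways")

# Pooled model (no fixed effects)
pooled_model <- plm(Y ~ X, data = data, index = c("id", "time"), model = "pooling")

# Individual fixed effects model
individual_model <- plm(Y ~ X, data = data, index = c("id", "time"), model = "within", effect = "individual")

# Time fixed effects model
time_model <- plm(Y ~ X, data = data, index = c("id", "time"), model = "within", effect = "time")

# Show the two-way fixed effects results
summary(twoway_model)

# F tests (the fixed effects model goes first, the restricted model second)
# 1. Is the two-way fixed effects model better than the pooled model?
pFtest(twoway_model, pooled_model)

# 2. Are the individual fixed effects significant?
pFtest(individual_model, pooled_model)

# 3. Are the time fixed effects significant?
pFtest(time_model, pooled_model)

# 4. Is the two-way model better than individual fixed effects alone?
pFtest(twoway_model, individual_model)

# 5. Is the two-way model better than time fixed effects alone?
pFtest(twoway_model, time_model)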
Output:

Twoways effects Within Model

Call:
plm(formula = Y ~ X, data = data, effect = "twoways", model = "within", index = c("id", "time"))

Balanced Panel: n = 100, T = 5, N = 500

Residuals:
     Min.   1st Qu.    Median   3rd Qu.      Max.
-3.224723 -0.583125 -0.010202  0.599678  2.960869

Coefficients:
  Estimate Std. Error t-value  Pr(>|t|)
X 0.778466   0.026107  29.818 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 1329.9
Residual Sum of Squares: 409.08
R-Squared: 0.6924
Adj. R-Squared: 0.61141
F-statistic: 889.127 on 1 and 395 DF, p-value: < 2.22e-16

F test for twoways effects

data: Y ~ X
F = 17.185, df1 = 103, df2 = 395, p-value < 2.2e-16
alternative hypothesis: significant effects

F test for individual effects

data: Y ~ X
F = 13.23, df1 = 99, df2 = 399, p-value < 2.2e-16
alternative hypothesis: significant effects

F test for time effects

data: Y ~ X
F = 6.673, df1 = 4, df2 = 494, p-value = 3.094e-05
alternative hypothesis: significant effects

F test for twoways effects

data: Y ~ X
F = 27.637, df1 = 4, df2 = 395, p-value < 2.2e-16
alternative hypothesis: significant effects

F test for twoways effects

data: Y ~ X
F = 16.759, df1 = 99, df2 = 395, p-value < 2.2e-16
alternative hypothesis: significant effects
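To read the degrees of freedom in these tests, it helps to write out the statistic. The sketch below assumes `pFtest` computes the textbook F test for nested linear models, which the reported degrees of freedom are consistent with. The symbols $RSS_R$ (restricted model), $RSS_U$ (unrestricted model) and $K$ (number of regressors, here 1) are notation introduced for illustration.

```latex
F = \frac{(RSS_R - RSS_U)/df_1}{RSS_U/df_2}

% Test 1, two-way effects vs. pooled (n = 100, T = 5, K = 1):
df_1 = (n - 1) + (T - 1) = 99 + 4 = 103, \qquad
df_2 = nT - n - T + 1 - K = 500 - 100 - 5 + 1 - 1 = 395

% Test 2, individual effects vs. pooled:
df_1 = n - 1 = 99, \qquad df_2 = nT - n - K = 500 - 100 - 1 = 399

% Test 3, time effects vs. pooled:
df_1 = T - 1 = 4, \qquad df_2 = nT - T - K = 500 - 5 - 1 = 494
```

Tests 4 and 5 use the two-way model as the unrestricted model, so $df_2 = 395$ in both, and $df_1$ counts only the effects being added: 4 time effects in test 4 and 99 individual effects in test 5.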
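The output shows that fixed effects belong in the model. The individual effects are highly significant (F = 13.23, p < 2.2e-16) and so are the time effects (F = 6.673, p = 3.094e-05). Tests 4 and 5 show that the two-way model also improves on either one-way model, and every F test has a p-value below 0.001, so both individual and time fixed effects should be controlled for together. The estimated coefficient on X is 0.778, within one standard error of the true value of 0.8 used in the simulation, and the small standard error (0.026) indicates a precise estimate. Note that a two-way fixed effects estimate only removes time-invariant individual heterogeneity and shocks common to all individuals. It recovers the true effect here because the simulated X is generated independently of the error term; with real data, a causal reading still requires that no time-varying confounders remain.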
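To make concrete what a two-way "within" estimate does, the sketch below reproduces it in base R in two ways: by demeaning Y and X over individuals and over time, and by regressing on dummy variables (LSDV). It is an illustration under stated assumptions, not a description of how `plm` is implemented. The names `panel`, `demean2`, `beta_within` and `beta_lsdv` are hypothetical, the panel is assumed balanced, and the random draws are generated in a different order from the example above, so the estimate will not equal 0.778 exactly.

```r
# Hypothetical simulated balanced panel, same design as above (true beta = 0.8)
set.seed(123)
n_id <- 100  # number of individuals
n_t  <- 5    # number of time periods
panel <- expand.grid(id = 1:n_id, time = 1:n_t)
alpha <- rnorm(n_id, mean = 0, sd = 2)  # individual effects
gamma <- rnorm(n_t, mean = 0, sd = 1)   # time effects
panel$X <- rnorm(n_id * n_t, mean = 5, sd = 2)
panel$Y <- 0.8 * panel$X + alpha[panel$id] + gamma[panel$time] +
  rnorm(n_id * n_t, mean = 0, sd = 1)

# Two-way within transformation (valid in this form for balanced panels):
# subtract the individual mean and the time mean, add back the grand mean
demean2 <- function(v, id, time) {
  v - ave(v, id) - ave(v, time) + mean(v)
}
Y_w <- demean2(panel$Y, panel$id, panel$time)
X_w <- demean2(panel$X, panel$id, panel$time)

# OLS on the transformed variables, without an intercept
beta_within <- unname(coef(lm(Y_w ~ X_w - 1)))

# Least squares dummy variables: one dummy per individual and per period
lsdv_fit <- lm(Y ~ X + factor(id) + factor(time), data = panel)
beta_lsdv <- unname(coef(lsdv_fit)["X"])

# The two coefficients agree up to numerical precision
print(c(within = beta_within, lsdv = beta_lsdv))
```

Both routes give the same coefficient on X, which is why the fixed effects never have to be estimated explicitly: demeaning removes `alpha` and `gamma` before the regression is run.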