The Levenberg-Marquardt Algorithm Explained, with a C++ Code Example

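The Levenberg-Marquardt (LM) algorithm is a widely used optimization method for nonlinear least-squares problems. It combines the strengths of the Gauss-Newton method and gradient descent, and it is common in numerical computing, SLAM, image registration, and machine learning.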

1. Basic Principles of the Levenberg-Marquardt Algorithm

1.1 Problem Definition

We want to minimize a nonlinear objective given by a sum of squared residuals:

$\min_{\mathbf{x}} \, f(\mathbf{x}) = \frac{1}{2} \sum_{i=1}^m r_i(\mathbf{x})^2 = \frac{1}{2} \| \mathbf{r}(\mathbf{x}) \|^2$

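where

$\mathbf{x} \in \mathbb{R}^n$: the parameter vector
$\mathbf{r}(\mathbf{x}) = [r_1(\mathbf{x}), \ldots, r_m(\mathbf{x})]^T$: the residual vector, with $m$ residuals

The quantity being minimized is the sum of squared residuals. The factor $\frac{1}{2}$ only simplifies the derivatives and does not change the minimizer.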
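As a small illustration of the objective above, the sketch below evaluates $f$ from observations and model predictions. It assumes each residual is an observation minus a prediction (the form used in the fitting example later in this article); the name `HalfSquaredCost` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns f = 1/2 * sum_i r_i^2 with r_i = observed[i] - predicted[i].
// HalfSquaredCost is a hypothetical helper name used only in this sketch.
double HalfSquaredCost(const std::vector<double>& predicted,
                       const std::vector<double>& observed) {
    assert(predicted.size() == observed.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double r = observed[i] - predicted[i];
        sum += r * r;
    }
    return 0.5 * sum;
}
```

For predictions (1, 2, 3) and observations (2, 4, 6) the residuals are (1, 2, 3), so the cost is 0.5 * 14 = 7.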

2. Review of the Gauss-Newton Method

At the current point $\mathbf{x}_k$, take a first-order Taylor expansion of the residual function:

$\mathbf{r}(\mathbf{x}_k + \Delta \mathbf{x}) \approx \mathbf{r}(\mathbf{x}_k) + J(\mathbf{x}_k) \Delta \mathbf{x}$

where $J(\mathbf{x}_k) \in \mathbb{R}^{m \times n}$ is the Jacobian of the residuals:

$J_{ij} = \frac{\partial r_i}{\partial x_j}$

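Substituting this linearization into the objective gives a linear least-squares problem in $\Delta \mathbf{x}$: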

$\min_{\Delta \mathbf{x}} \frac{1}{2} \| \mathbf{r} + J \Delta \mathbf{x} \|^2$

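Setting the gradient with respect to $\Delta \mathbf{x}$, namely $J^T (\mathbf{r} + J \Delta \mathbf{x})$, to zero yields the normal equations: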

$J^T J \Delta \mathbf{x} = - J^T \mathbf{r}$

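Solving this system for $\Delta \mathbf{x}$ and updating $\mathbf{x}_{k+1} = \mathbf{x}_k + \Delta \mathbf{x}$ is the Gauss-Newton method.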
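To make the normal equations concrete, here is a minimal sketch of one Gauss-Newton step for a one-parameter model. The model $y = a x$ and the function name `GaussNewtonStep` are assumptions chosen for illustration; with a single parameter, $J^T J$ and $J^T \mathbf{r}$ reduce to scalars.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One Gauss-Newton step for the hypothetical one-parameter model y = a * x.
// Residual: r_i = y_i - a * x_i, Jacobian entry: J_i = -x_i.
double GaussNewtonStep(const std::vector<double>& xs,
                       const std::vector<double>& ys, double a) {
    assert(xs.size() == ys.size());
    double JtJ = 0.0;  // J^T J (a scalar here)
    double Jtr = 0.0;  // J^T r (a scalar here)
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double r = ys[i] - a * xs[i];
        const double J = -xs[i];
        JtJ += J * J;
        Jtr += J * r;
    }
    assert(JtJ > 0.0);     // otherwise the normal equation is singular
    return a - Jtr / JtJ;  // a + dx, where JtJ * dx = -Jtr
}
```

Because this model is linear in `a`, a single step from any starting value lands on the least-squares solution; for a nonlinear model the step would be repeated.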

3. Deriving LM: Damped Gauss-Newton

LM introduces a damping factor $\lambda$ that interpolates between Gauss-Newton and gradient descent:

$(J^T J + \lambda I) \Delta \mathbf{x} = - J^T \mathbf{r}$

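As $\lambda \to 0$, the step approaches the Gauss-Newton step;
as $\lambda \to \infty$, the step tends to $\Delta \mathbf{x} \approx -\frac{1}{\lambda} J^T \mathbf{r}$, a gradient descent step with a small step size.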
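The two limits can be checked numerically in the scalar case, where the damped system is just $(h + \lambda)\,\Delta x = -g$ with $h = J^T J$ and $g = J^T \mathbf{r}$. The function name `DampedStep` is an assumption of this sketch.

```cpp
#include <cassert>

// Damped step for a single parameter: (h + lambda) * dx = -g,
// where h = J^T J >= 0 and g = J^T r are scalars.
double DampedStep(double h, double g, double lambda) {
    assert(h + lambda > 0.0);
    return -g / (h + lambda);
}
```

With `h = 4` and `g = 2`, `lambda = 0` gives the Gauss-Newton step `-0.5`, while `lambda = 1e6` gives roughly `-2e-6`, which is close to the gradient step `-g / lambda`.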

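To make the damping respect the scale of each parameter, the identity can be replaced by the diagonal of $J^T J$ (Marquardt's modification):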

$(J^T J + \lambda \cdot \text{diag}(J^T J)) \Delta \mathbf{x} = - J^T \mathbf{r}$

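With this scaling, the damping applied to each parameter is proportional to its curvature, which makes LM less sensitive to badly scaled parameters and more stable numerically.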
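The effect of the diagonal scaling is easiest to see when the approximate Hessian is itself diagonal. The sketch below compares the two damping schemes in that special case; the restriction to a diagonal $J^T J$ and both function names are assumptions made to keep the arithmetic visible, not a general solver.

```cpp
#include <array>
#include <cassert>

using Vec2 = std::array<double, 2>;

// Solves (H + lambda * I) dx = -g for H = diag(h[0], h[1]).
Vec2 StepIdentityDamping(const Vec2& h, const Vec2& g, double lambda) {
    assert(h[0] + lambda > 0.0 && h[1] + lambda > 0.0);
    Vec2 dx;
    dx[0] = -g[0] / (h[0] + lambda);
    dx[1] = -g[1] / (h[1] + lambda);
    return dx;
}

// Solves (H + lambda * diag(H)) dx = -g for H = diag(h[0], h[1]).
Vec2 StepDiagonalDamping(const Vec2& h, const Vec2& g, double lambda) {
    assert(h[0] > 0.0 && h[1] > 0.0 && lambda >= 0.0);
    Vec2 dx;
    dx[0] = -g[0] / (h[0] * (1.0 + lambda));
    dx[1] = -g[1] / (h[1] * (1.0 + lambda));
    return dx;
}
```

Take `h = (1e6, 1e-2)`, `g = h` (so the undamped step is `-1` for both parameters) and `lambda = 1`. Identity damping barely touches the first parameter (about `-1`) but nearly freezes the second (about `-0.0099`), whereas diagonal damping shortens both steps by the same factor, to `-0.5`.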

4. LM Pseudocode

x = x0
lambda = lambda_init
while not converged:
    r = residual(x)
    J = jacobian(x)
    H = J^T * J
    g = J^T * r
    solve (H + lambda * diag(H)) * dx = -g
    if cost(x + dx) < cost(x):
        x = x + dx
        lambda = lambda / factor
    else:
        lambda = lambda * factor

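5. Implementation Steps of the Levenberg-Marquardt Algorithm

Step 1: Initialization

Initialize the parameter vector $\mathbf{x}_0$
Set the initial damping coefficient $\lambda$, typically in the range $10^{-3}$ to $10^{-1}$
Set the adjustment factor used to grow or shrink $\lambda$ (e.g. $\nu = 2$; called factor below)
Set the convergence criteria (e.g. maximum number of iterations, step-size threshold, error threshold)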

Step 2: Compute the Current Residual and Jacobian

Evaluate the residual vector $\mathbf{r}(\mathbf{x})$ at the current parameters $\mathbf{x}$
Evaluate the Jacobian matrix of the residuals, $J(\mathbf{x})$
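When the Jacobian is derived by hand, it is common to cross-check it against a finite-difference approximation. The following is a minimal sketch using central differences; the function name `NumericalJacobian`, the type aliases, and the default step `h = 1e-6` are assumptions of this example, and the standard library is used instead of Eigen only to keep it dependency-free.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: J[i][j] = d r_i / d x_j

// Central-difference approximation of the Jacobian of `residual` at `x`.
Mat NumericalJacobian(const std::function<Vec(const Vec&)>& residual,
                      const Vec& x, double h = 1e-6) {
    assert(h > 0.0);
    const std::size_t m = residual(x).size();
    Mat J(m, Vec(x.size(), 0.0));
    for (std::size_t j = 0; j < x.size(); ++j) {
        Vec xp = x, xm = x;
        xp[j] += h;  // perturb parameter j forward
        xm[j] -= h;  // and backward
        const Vec rp = residual(xp);
        const Vec rm = residual(xm);
        for (std::size_t i = 0; i < m; ++i) {
            J[i][j] = (rp[i] - rm[i]) / (2.0 * h);
        }
    }
    return J;
}
```

Comparing this matrix entry by entry with the analytic Jacobian is a cheap way to catch sign or indexing mistakes before running the optimizer.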

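Step 3: Build the LM-Damped Normal Equations

Construct the damped linear system:

$(J^T J + \lambda \cdot \text{diag}(J^T J)) \Delta \mathbf{x} = -J^T \mathbf{r}$

$J^T J$: the Gauss-Newton approximation of the Hessian
$\lambda \cdot \text{diag}(J^T J)$: the damping term, which keeps the step from becoming too large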
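The system in step 3 can be assembled directly from $J$ and $\mathbf{r}$. Below is a minimal sketch with plain standard-library containers; `DampedSystem` and `BuildDampedSystem` are hypothetical names, and dense row-major storage is an assumption suited only to small problems.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major

struct DampedSystem {
    Mat A;  // J^T J + lambda * diag(J^T J)
    Vec b;  // -J^T r
};

DampedSystem BuildDampedSystem(const Mat& J, const Vec& r, double lambda) {
    assert(!J.empty() && J.size() == r.size());
    const std::size_t m = J.size();
    const std::size_t n = J[0].size();
    DampedSystem s{Mat(n, Vec(n, 0.0)), Vec(n, 0.0)};
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t a = 0; a < n; ++a) {
            s.b[a] -= J[i][a] * r[i];             // accumulate -J^T r
            for (std::size_t c = 0; c < n; ++c) {
                s.A[a][c] += J[i][a] * J[i][c];   // accumulate J^T J
            }
        }
    }
    for (std::size_t a = 0; a < n; ++a) {
        s.A[a][a] *= (1.0 + lambda);  // same as adding lambda * diag(J^T J)
    }
    return s;
}
```

For example, with rows `J = [[1, 0], [0, 2]]`, `r = [1, 1]` and `lambda = 0.5`, this gives `A = [[1.5, 0], [0, 6]]` and `b = [-1, -2]`.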

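Step 4: Solve for the Increment $\Delta \mathbf{x}$

Solve the linear system with a Cholesky / LDLT factorization to obtain the parameter increment $\Delta \mathbf{x}$
Optional: check the condition number of the system to guard against an ill-conditioned solve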
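For readers who want to see what the factorization in step 4 does, here is a minimal sketch of a dense Cholesky solve ($A = L L^T$) using only the standard library. The name `CholeskySolve` is hypothetical; in practice a library routine such as Eigen's `ldlt().solve(...)`, used in the implementation later in this article, is the more robust choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major

// Solves A x = b for symmetric positive definite A via A = L L^T.
// Returns false if a non-positive pivot shows A is not positive definite.
bool CholeskySolve(const Mat& A, const Vec& b, Vec& x) {
    const std::size_t n = A.size();
    assert(b.size() == n);
    Mat L(n, Vec(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double s = A[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
            if (i == j) {
                if (s <= 0.0) return false;
                L[i][i] = std::sqrt(s);
            } else {
                L[i][j] = s / L[j][j];
            }
        }
    }
    Vec y(n, 0.0);  // forward substitution: L y = b
    for (std::size_t i = 0; i < n; ++i) {
        double s = b[i];
        for (std::size_t k = 0; k < i; ++k) s -= L[i][k] * y[k];
        y[i] = s / L[i][i];
    }
    x.assign(n, 0.0);  // back substitution: L^T x = y
    for (std::size_t i = n; i-- > 0;) {
        double s = y[i];
        for (std::size_t k = i + 1; k < n; ++k) s -= L[k][i] * x[k];
        x[i] = s / L[i][i];
    }
    return true;
}
```

A `false` return is a useful signal in an LM loop: it suggests the damped matrix is not positive definite, and one reasonable reaction (an assumption here, not something the steps above prescribe) is to increase $\lambda$ and retry.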

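Step 5: Evaluate the New Parameter Point

Form the candidate parameters: $\mathbf{x}_{\text{new}} = \mathbf{x} + \Delta \mathbf{x}$
Compute the new error $\| \mathbf{r}(\mathbf{x}_{\text{new}}) \|^2$
If the error decreases, accept the update and reduce $\lambda$:

$\lambda \leftarrow \lambda / \text{factor}$
Otherwise, reject the update and increase the damping coefficient to shrink the step:

$\lambda \leftarrow \lambda \times \text{factor}$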
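The accept/reject rule of step 5 is small enough to isolate in a helper. This is a minimal sketch; `DampingUpdate` and `UpdateDamping` are hypothetical names, and a real implementation might also clamp $\lambda$ to a safe range.

```cpp
#include <cassert>

struct DampingUpdate {
    bool accepted;  // whether x_new should replace x
    double lambda;  // damping to use in the next iteration
};

// Accept the step and relax the damping if the cost went down;
// otherwise reject the step and increase the damping.
DampingUpdate UpdateDamping(double old_cost, double new_cost,
                            double lambda, double factor) {
    assert(lambda > 0.0 && factor > 1.0);
    if (new_cost < old_cost) {
        return {true, lambda / factor};
    }
    return {false, lambda * factor};
}
```

With `factor = 2`, a successful step from `lambda = 1` continues with `lambda = 0.5`, while a failed step retries from the same `x` with `lambda = 2`.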

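Step 6: Check the Convergence Criteria

Terminate if any of the following holds:
- The step is very small: $\| \Delta \mathbf{x} \| < \epsilon$
- The maximum number of iterations has been reached
- The gradient is small enough: $\| J^T \mathbf{r} \| < \epsilon$
Otherwise, return to step 2 and continue iterating.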
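The three stopping tests can be combined into one predicate. The sketch below assumes a single tolerance `eps` for both norms, as in the list above; `Norm` and `ShouldStop` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Euclidean norm of a vector.
double Norm(const std::vector<double>& v) {
    double sum = 0.0;
    for (double vi : v) sum += vi * vi;
    return std::sqrt(sum);
}

// True if the step is tiny, the gradient J^T r is tiny,
// or the iteration budget is exhausted.
bool ShouldStop(const std::vector<double>& dx,
                const std::vector<double>& grad,
                int iter, int max_iter, double eps) {
    assert(eps > 0.0);
    return Norm(dx) < eps || Norm(grad) < eps || iter >= max_iter;
}
```

In practice the step and gradient thresholds are often chosen separately, since the two quantities have different units.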

6. The Steps Summarized as a Flowchart

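Initialize x, lambda
        ↓
Compute residual r(x) and Jacobian J
        ↓
Build system (JᵀJ + λ·diag(JᵀJ)) Δx = -Jᵀr
        ↓
Solve for Δx
        ↓
Compute new error cost(x + Δx)
        ↓
Error decreased?
   ┌──────────────┐
   ↓              ↓
  Yes             No
   ↓              ↓
x ← x + Δx     λ ← λ × factor
λ ← λ / factor    ↓
   ↓              ↓
Termination criteria met?
   Yes → exit
   No  → back to the start of the iteration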

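7. Application Example: Fitting the Quadratic $y = ax^2 + bx + c$

As an example, we fit a quadratic function: given data points $(x_i, y_i)$, we minimize the following residuals: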

$r_i(a, b, c) = y_i - (a x_i^2 + b x_i + c)$

The $i$-th row of the Jacobian, i.e. the derivatives of $r_i$ with respect to $(a, b, c)$:

$J_i = \left[ -x_i^2, -x_i, -1 \right]$
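The residual and Jacobian row above translate almost literally into code. The sketch below evaluates both for one data point; `ResidualRow` and `QuadraticResidual` are hypothetical names, and `std::array` stands in for a fixed-size vector type to keep the example dependency-free.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct ResidualRow {
    double r;                  // r_i = y_i - (a x_i^2 + b x_i + c)
    std::array<double, 3> J;   // [d r_i/da, d r_i/db, d r_i/dc]
};

// p holds the parameters (a, b, c).
ResidualRow QuadraticResidual(double xi, double yi,
                              const std::array<double, 3>& p) {
    assert(std::isfinite(xi) && std::isfinite(yi));
    ResidualRow out;
    out.r = yi - (p[0] * xi * xi + p[1] * xi + p[2]);
    out.J = {-xi * xi, -xi, -1.0};
    return out;
}
```

For instance, at `(a, b, c) = (2, 3, 1)` the point `(x, y) = (2, 15)` lies exactly on the curve, so the residual is 0 and the Jacobian row is `[-4, -2, -1]`. Note that the Jacobian does not depend on the parameters, because the model is linear in `a`, `b`, and `c`.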

8. C++ Implementation

#include <cstdlib>
#include <iostream>
#include <vector>
#include <Eigen/Dense>

using namespace Eigen;
using namespace std;

struct DataPoint {
    double x, y;
};

struct LMResult {
    Vector3d params;
    double final_cost;
    int iterations;
};

// Fits y = a*x^2 + b*x + c to the data; x holds the parameters (a, b, c).
LMResult LevenbergMarquardt(const vector<DataPoint>& data, Vector3d init, int max_iter = 100) {
    Vector3d x = init;
    double lambda = 1e-3;  // initial damping
    double v = 2.0;        // damping adjustment factor
    int n = static_cast<int>(data.size());

    // Sum of squared residuals at parameters p
    auto cost = [&data](const Vector3d& p) {
        double c = 0.0;
        for (const auto& d : data) {
            double ri = d.y - (p(0) * d.x * d.x + p(1) * d.x + p(2));
            c += ri * ri;
        }
        return c;
    };

    double last_cost = cost(x);  // cost at the initial guess
    int iterations = 0;

    for (int iter = 0; iter < max_iter; ++iter) {
        ++iterations;

        // Residuals and Jacobian at the current estimate
        MatrixXd J(n, 3);
        VectorXd r(n);
        for (int i = 0; i < n; ++i) {
            double xi = data[i].x;
            double yi = data[i].y;
            double yi_est = x(0) * xi * xi + x(1) * xi + x(2);
            r(i) = yi - yi_est;
            J(i, 0) = -xi * xi;
            J(i, 1) = -xi;
            J(i, 2) = -1.0;
        }

        Matrix3d H = J.transpose() * J;
        Vector3d g = J.transpose() * r;

        // Damped system: H_lm = H + lambda * diag(H)
        Matrix3d H_lm = H;
        H_lm.diagonal() += lambda * H.diagonal();
        Vector3d dx = H_lm.ldlt().solve(-g);
        Vector3d x_new = x + dx;

        double new_cost = cost(x_new);
        if (new_cost < last_cost) {
            // Accept the step and relax the damping
            x = x_new;
            lambda /= v;
            last_cost = new_cost;
        } else {
            // Reject the step and increase the damping
            lambda *= v;
        }

        if (dx.norm() < 1e-6) break;
    }
    return {x, last_cost, iterations};
}

9. Output and Testing

int main() {
    vector<DataPoint> data;
    for (int i = 0; i <= 10; ++i) {
        double x = i;
        // True model y = 2x^2 + 3x + 1, plus noise in [-1, 1)
        double y = 2.0 * x * x + 3.0 * x + 1.0 + ((rand() % 100) / 50.0 - 1.0);
        data.push_back({x, y});
    }
    Vector3d init(0.0, 0.0, 0.0);
    auto result = LevenbergMarquardt(data, init);
    cout << "Estimated parameters: " << result.params.transpose() << endl;
    cout << "Final cost: " << result.final_cost << endl;
    return 0;
}

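10. Summary

Method	Characteristics
Gradient descent	Converges reliably, but slowly
Gauss-Newton	Fast near the solution, but can diverge
Levenberg-Marquardt	Combines the two; adjusts the damping automatically and converges reliably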

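Practical Tips

Initial damping $\lambda$: set it to a fraction of the largest diagonal element of the initial approximate Hessian (e.g. $\lambda = \tau \cdot \max(\text{diag}(J^T J))$)
Stopping criteria on the gradient and the step size:
- Use $\| \Delta \mathbf{x} \| < 10^{-6}$ or $\| \mathbf{g} \| < 10^{-6}$, where $\mathbf{g} = J^T \mathbf{r}$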
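The initialization rule for $\lambda$ only needs the diagonal of $J^T J$, which is the squared norm of each Jacobian column. A minimal sketch follows; `InitialLambda` is a hypothetical name, and the value of `tau` is left to the caller since the tip above does not fix one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // row-major Jacobian

// lambda_0 = tau * max_j (J^T J)_jj, where (J^T J)_jj = sum_i J_ij^2.
double InitialLambda(const Mat& J, double tau) {
    assert(!J.empty() && tau > 0.0);
    const std::size_t n = J[0].size();
    double max_diag = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        double d = 0.0;
        for (const auto& row : J) d += row[j] * row[j];
        max_diag = std::max(max_diag, d);
    }
    return tau * max_diag;
}
```

For rows `J = [[1, 0], [0, 2]]` the diagonal of $J^T J$ is `(1, 4)`, so `tau = 0.5` yields an initial damping of `2`.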
