[Machine Learning] Nonlinear Independent Component Analysis

[Machine Learning] Nonlinear Independent Component Analysis - Aapo Hyvärinen

$x_i(k) = \sum_{j=1}^{n} a_{ij} s_j(k) \quad \text{for all } i = 1, \ldots, n, \; k = 1, \ldots, K$

$x_i(k)$ is the $i$-th observed signal at sample point $k$ (possibly time)
$a_{ij}$ are constant parameters describing the “mixing”
The latent “sources” $s_j$ are assumed to be independent and non-Gaussian
ICA is identifiable, i.e. well-defined: observing only $x_i$, we can recover both $a_{ij}$ and $s_j$ (up to the ordering and scaling of the components).
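
As a minimal sketch of the mixing model above, the following NumPy snippet draws independent non-Gaussian sources and mixes them linearly. The Laplace source distribution, the random matrix `A`, and the sizes `n` and `K` are illustrative assumptions, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 3, 1000                     # number of signals and sample points (illustrative)

s = rng.laplace(size=(n, K))       # independent, non-Gaussian sources s_j(k)
A = rng.normal(size=(n, n))        # constant mixing coefficients a_ij (assumed invertible)

x = A @ s                          # x_i(k) = sum_j a_ij * s_j(k) for all i and k

# Check one entry against the explicit sum in the model equation.
i, k = 1, 42
explicit = sum(A[i, j] * s[j, k] for j in range(n))

# With A known, the sources follow by solving the linear system.
# The identifiability claim is stronger: ICA recovers them from x alone.
s_rec = np.linalg.solve(A, x)
```

Here `s_rec` only shows that the model is invertible once `A` is given; an actual ICA algorithm has to estimate `A` as well.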


PCA and Gaussian factor analysis are not identifiable:
- Any orthogonal rotation is equivalent: $s' = Us$ has the same distribution as $s$ for any orthogonal $U$.
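
The rotation argument can be checked numerically. The sketch below rotates two white Gaussian variables and two unit-variance uniform variables by 45 degrees and compares the excess kurtosis of the first component before and after. The helper `excess_kurtosis`, the rotation angle, and the choice of uniform sources are assumptions made for this illustration.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis (zero for a Gaussian)."""
    v = (v - v.mean()) / v.std()
    return np.mean(v**4) - 3.0

rng = np.random.default_rng(0)
N = 200_000
theta = np.pi / 4
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # orthogonal: U.T @ U = I

gauss = rng.normal(size=(2, N))                             # white Gaussian
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, N))    # unit variance, non-Gaussian

k_gauss_before = excess_kurtosis(gauss[0])
k_gauss_after = excess_kurtosis((U @ gauss)[0])    # still about 0: rotation is invisible
k_unif_before = excess_kurtosis(unif[0])           # about -1.2 for a uniform variable
k_unif_after = excess_kurtosis((U @ unif)[0])      # about -0.6: rotation changes the marginal
```

For Gaussian data the rotated variables are statistically indistinguishable from the originals, whereas for non-Gaussian sources the rotation leaves a visible trace, which is what linear ICA exploits.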

Can ICA be extended to the nonlinear case to get general disentanglement?
Unfortunately, “basic” nonlinear ICA is not identifiable:
If we define the nonlinear ICA model for random variables $x_i$ as

$x_i = f_i(s_1, \ldots, s_n), \quad i = 1, \ldots, n$

we cannot recover the original sources (Darmois, 1952; Hyvärinen & Pajunen, 1999)

Darmois (1952) showed the impossibility of nonlinear ICA:
For any $x_1, x_2$, we can always construct $y = g(x_1, x_2)$ independent of $x_1$ as

$g(\xi_1, \xi_2) = P(x_2 < \xi_2 \mid x_1 = \xi_1)$
Independence alone is too weak for identifiability:
- We could take $x_1$ itself as an independent component, which is absurd
Looking at non-Gaussianity is equally absurd:
- A scalar transform $h(x_1)$ can give any distribution
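
The Darmois construction can be tried on a case where the conditional CDF has a closed form. The sketch below assumes a jointly Gaussian pair with correlation `rho` (an assumption made here only so that $P(x_2 < \xi_2 \mid x_1 = \xi_1)$ can be written down directly); the helper `std_normal_cdf` is likewise introduced for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
N, rho = 50_000, 0.8                    # sample size and correlation (illustrative)

# Two dependent observed variables: jointly Gaussian with correlation rho.
x1 = rng.normal(size=N)
x2 = rho * x1 + math.sqrt(1 - rho**2) * rng.normal(size=N)

def std_normal_cdf(z):
    """Standard normal CDF, applied element-wise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2)))

# Darmois construction: y = P(x2 < xi2 | x1 = xi1), evaluated at the data points.
# For this Gaussian pair, x2 given x1 is N(rho * x1, 1 - rho^2).
y = std_normal_cdf((x2 - rho * x1) / math.sqrt(1 - rho**2))

corr_x1_x2 = np.corrcoef(x1, x2)[0, 1]  # the inputs are strongly dependent
corr_x1_y = np.corrcoef(x1, y)[0, 1]    # y is uniform on (0, 1) and independent of x1
```

The pair $(x_1, y)$ is independent even though neither variable is an original source in any meaningful sense, which is why independence alone cannot identify the model.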

Observe an $n$-dimensional time series $x(t)$
Divide $x(t)$ into $T$ segments (e.g., bins of equal size)
Train an MLP to tell which segment a single data point comes from
- Number of classes is $T$
- Labels are given by the index of the segment
- Multinomial logistic regression
In the hidden layer $h$, the NN should learn to represent nonstationarity (= differences between segments)
Could this really do nonlinear ICA?
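
The training setup described above can be sketched in plain NumPy. This is not the author's implementation: the Laplace sources, the mixing function, the network width `H`, the learning rate, and the use of plain full-batch gradient descent are all assumptions chosen to keep the example short, and such a brief training run is not claimed to recover the sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, seg_len, H = 2, 8, 256, 32       # illustrative sizes; H is the hidden width

# Nonstationary sources: every segment has its own scale per component.
scales = rng.uniform(0.2, 2.0, size=(T, n))
s = np.concatenate([rng.laplace(size=(seg_len, n)) * scales[t] for t in range(T)])

# Hypothetical smooth, invertible mixing f (an assumption for this demo).
M = np.array([[1.0, 0.6], [-0.4, 1.0]])
z = s @ M.T
x = np.tanh(z) + 0.5 * z
x = (x - x.mean(0)) / x.std(0)         # standardize the observations

# Self-supervised labels: index of the segment each point belongs to.
y = np.repeat(np.arange(T), seg_len)
N = len(y)
onehot = np.eye(T)[y]

# MLP: x -> tanh layer -> feature layer h (dim n) -> softmax over T segments.
W1, b1 = rng.normal(scale=0.5, size=(n, H)), rng.normal(scale=0.5, size=H)
W2, b2 = rng.normal(scale=0.5, size=(H, n)), np.zeros(n)
W3, b3 = rng.normal(scale=0.5, size=(n, T)), np.zeros(T)

def forward(inputs):
    a1 = np.tanh(inputs @ W1 + b1)
    h = a1 @ W2 + b2
    logits = h @ W3 + b3
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return a1, h, p

lr, losses = 0.05, []
for step in range(500):
    a1, h, p = forward(x)
    losses.append(-np.log(p[np.arange(N), y]).mean())     # multinomial cross-entropy
    # Backpropagation; all gradients are computed before any weight is updated.
    d_logits = (p - onehot) / N
    d_h = d_logits @ W3.T
    d_z1 = (d_h @ W2.T) * (1.0 - a1**2)
    W3 -= lr * (h.T @ d_logits)
    b3 -= lr * d_logits.sum(0)
    W2 -= lr * (a1.T @ d_h)
    b2 -= lr * d_h.sum(0)
    W1 -= lr * (x.T @ d_z1)
    b1 -= lr * d_z1.sum(0)

features = forward(x)[1]               # h(x(t)), with dim(h) = dim(x)
```

The array `features` plays the role of $h(x(t))$; in practice one would use a deep learning framework and a stronger optimizer rather than this hand-written loop.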

Assume the data follows the nonlinear ICA model $x(t) = f(s(t))$ with
- smooth, invertible nonlinear mixing $f: \mathbb{R}^n \rightarrow \mathbb{R}^n$
- components $s_i(t)$ that are nonstationary, e.g., in their variances
Assume we apply time-contrastive learning (TCL) on $x(t)$
- using an MLP whose hidden layer $h(x(t))$ has $\text{dim}(h) = \text{dim}(x)$
Then TCL will find $s(t)^2 = Ah(x(t))$ for some linear mixing matrix $A$ (squaring is element-wise).
I.e., TCL demixes the nonlinear ICA model up to a linear mixing (which can be estimated by linear ICA) and up to squaring.
This is a constructive proof of identifiability.
Imposing independence in every segment -> more constraints -> unique solution. The added constraints guarantee identifiability.
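
One way to see where the squares come from is the following sketch. It is not taken from the notes above and rests on extra assumptions: segments of equal size, a classifier that reaches its optimum, and sources that within segment $\tau$ are independent with log-density $\log p_\tau(s) = \sum_{i=1}^{n} [\lambda_i(\tau) s_i^2 + q_0(s_i)] - \log Z(\tau)$, so that only the coefficients $\lambda_i(\tau)$ change between segments. The symbols $\lambda_i$, $q_0$, $Z$, $w_\tau$, $b_\tau$, $\tilde{W}$, $\tilde{b}$, $L$ and $c$ are introduced here for illustration.

Because $f$ is invertible, $\log p_\tau(x) = \log p_\tau(s) + \log |\det J_{f^{-1}}(x)|$ with $s = f^{-1}(x)$, and the Jacobian term does not depend on $\tau$. An optimal softmax classifier with logits $w_\tau^T h(x) + b_\tau$ reproduces the log-ratios of the segment densities, so relative to segment 1:

$(w_\tau - w_1)^T h(x) + (b_\tau - b_1) = \log p_\tau(x) - \log p_1(x) = \sum_{i=1}^{n} [\lambda_i(\tau) - \lambda_i(1)] s_i^2 - \log Z(\tau) + \log Z(1)$

Stacking these equations for $\tau = 2, \ldots, T$ gives $\tilde{W} h(x) + \tilde{b} = L s^2 + c$, where row $\tau$ of $L$ is $\lambda(\tau) - \lambda(1)$. If $L$ has full column rank, $s^2$ is a linear function of $h(x)$ plus a constant offset, which agrees with the statement above up to that offset.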

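An MLP is trained by self-supervised classification (which time segment does a given signal come from?). In this way the MLP learns to represent the differences between the signals in different time segments. The original signal $s^2$ can then be expressed as a linear combination of the hidden-layer outputs that the MLP produces from the observations $x$.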

General framework with observed data vector $x$ and latent $s$:
$\quad p(x; \theta) = \int p(x, s; \theta)\,ds$
where $\theta$ is a vector of parameters, e.g., in a neural network
In variational autoencoders (VAE):
- Define the prior so that $s$ is white Gaussian (thus the $s_i$ are all independent)
- Define the conditional distribution of $x$ given $s$ so that $x = f(s) + n$, with $n$ a noise term
Looks like nonlinear ICA, but not identifiable
- By Gaussianity, any orthogonal rotation is equivalent:
  $s' = Ms \text{ has exactly the same distribution as } s \text{ if } M^TM = I$
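
The rotation argument can be written out in two lines. This is a sketch for a white Gaussian prior and the model $x = f(s) + n$, with a square matrix $M$ satisfying $M^TM = I$; the names $s'$ and $f'$ are introduced here for illustration.

$s \sim \mathcal{N}(0, I), \quad s' = Ms \;\Rightarrow\; s' \sim \mathcal{N}(0, M I M^T) = \mathcal{N}(0, I)$
$x = f(s) + n = f(M^T s') + n = f'(s') + n, \quad f'(\cdot) = f(M^T \cdot)$

So $(f, s)$ and $(f', s')$ give exactly the same distribution of $x$ under the same prior, and the data alone cannot tell $s$ from its rotation $s'$.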

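The problem is solved by introducing a new variable $u$. For example, when looking for the relationship between video and audio, time $t$ can serve as the auxiliary variable. The solution then relies on conditional independence of the sources given $u$.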
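
What conditional independence given an auxiliary variable means can be illustrated with a small simulation. The sketch below assumes a binary auxiliary variable `u` that sets a common scale for two Gaussian components; this generative choice, the values in `sigma`, and the helper `corr_of_squares` are assumptions made for the example, not the model used in the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
sigma = np.array([0.5, 2.0])           # hypothetical scale for each value of u

u = rng.integers(0, 2, size=N)         # observed auxiliary variable (e.g., a segment index)
s = rng.normal(size=(N, 2)) * sigma[u][:, None]   # given u, s_1 and s_2 are independent

def corr_of_squares(a):
    """Correlation between the squared components, a simple dependence measure."""
    return np.corrcoef(a[:, 0] ** 2, a[:, 1] ** 2)[0, 1]

marginal = corr_of_squares(s)                                  # clearly positive if u is ignored
conditional = [corr_of_squares(s[u == v]) for v in (0, 1)]     # near zero within each value of u
```

Ignoring `u`, the two components look dependent through their shared scale; conditioning on `u` removes that dependence, and this extra structure is what the auxiliary-variable approach exploits.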

Typical deep learning needs class labels, or some targets
If there are no class labels: unsupervised learning
Independent component analysis is a principled approach
- can be made nonlinear
Identifiable: can recover the components that actually created the data (unlike PCA, VAE, etc.)
Special assumptions are needed for identifiability, one of:
- Nonstationarity (“time-contrastive learning”)
- Temporal dependencies (“permutation-contrastive learning”)
- Existence of an auxiliary (conditioning) variable (e.g., “iVAE”)
Self-supervised methods are easy to implement
A connection to deep latent variable models (DLVMs) can be made → iVAE
Principled framework for “disentanglement”

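In summary, linear ICA is identifiable, whereas nonlinear ICA needs additional assumptions to become identifiable (i.e., for the original signals to be separable). The ideas behind nonlinear ICA can also be applied to other deep learning models.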
