Bayesian Linear Regression - Predictive Distribution - Machine Learning Fundamentals Self-Study

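The predictive distribution is the distribution of the target $t$ for a new input ${\bf x}$, conditioned on the training data ${\bf X},{\bf t}$.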

$\begin{eqnarray} &&p(t|{\bf x},{\bf X},{\bf t})\tag{1} \end{eqnarray}$

Expression $(1)$ is the predictive distribution.

The predictive distribution is obtained by marginalizing the conditional joint distribution of $t$ and ${\bf w}$ over ${\bf w}$.

$\begin{eqnarray} &&p(t|{\bf x},{\bf X},{\bf t})=\int p(t,{\bf w}|{\bf x},{\bf X},{\bf t}){\rm d}{\bf w}\tag{2} \end{eqnarray}$

We now rewrite $p(t,{\bf w}|{\bf x},{\bf X},{\bf t})$.

$\begin{eqnarray} p(t,{\bf w}|{\bf x},{\bf X},{\bf t})&=&p(t|{\bf w},{\bf x},{\bf X},{\bf t})p({\bf w}|{\bf x},{\bf X},{\bf t})\\ &=&p(t|{\bf w},{\bf x})p({\bf w}|{\bf X},{\bf t})\tag{3} \end{eqnarray}$

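The first line of $(3)$ is the product rule. The second line uses two conditional independence properties:
given ${\bf w}$ and ${\bf x}$, $t$ is independent of $\{{\bf X},{\bf t}\}$, and
given ${\bf X}$ and ${\bf t}$, ${\bf w}$ is independent of ${\bf x}$.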

Substituting $(3)$ into $(2)$ gives:

$\begin{eqnarray} &&p(t|{\bf x},{\bf X},{\bf t})=\int p(t|{\bf w},{\bf x})p({\bf w}|{\bf X},{\bf t}){\rm d}{\bf w}\tag{4} \end{eqnarray}$

Here,

$\begin{eqnarray} &&p(t|{\bf w},{\bf x})=\mathcal{N}(t|{\bf w}^\top{\boldsymbol\phi}({\bf x}),\beta^{-1})\tag{5}\\ &&p({\bf w}|{\bf X},{\bf t})=\mathcal{N}({\bf w}|{\bf m}_N,{\bf S}_N)\tag{6}\\ \end{eqnarray}$

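$(5)$ is a modeling assumption (Gaussian noise with precision $\beta$), and
$(6)$ is the posterior over ${\bf w}$ derived in the article on MAP estimation.
Substituting $(5)$ and $(6)$ into $(4)$ gives: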

$\begin{eqnarray} &&p(t|{\bf x},{\bf X},{\bf t})=\int \mathcal{N}(t|{\bf w}^\top{\boldsymbol\phi}({\bf x}),\beta^{-1})\mathcal{N}({\bf w}|{\bf m}_N,{\bf S}_N){\rm d}{\bf w}\tag{7} \end{eqnarray}$
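As a sanity check on $(7)$, the integral can be evaluated numerically when ${\bf w}$ is a scalar. The following is a minimal sketch under that assumption; the values of `phi`, `beta`, `m_N`, and `S_N` are arbitrary illustrative numbers, not taken from the article. It compares a grid-based approximation of the integral against a Gaussian with mean $\phi m_N$ and variance $\beta^{-1}+\phi^2 S_N$, which is the scalar case of the closed form derived in the remainder of this section.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Illustrative scalar values (assumptions made for this sketch only).
phi, beta = 1.5, 4.0   # feature value phi(x) and noise precision
m_N, S_N = 0.8, 0.3    # posterior mean and variance of the scalar weight w

# Grid over w covering +/- 10 posterior standard deviations.
w = np.linspace(m_N - 10 * np.sqrt(S_N), m_N + 10 * np.sqrt(S_N), 20001)
dw = w[1] - w[0]

def predictive_density_numeric(t):
    """Riemann-sum approximation of integral (7) for scalar w."""
    integrand = normal_pdf(t, w * phi, 1.0 / beta) * normal_pdf(w, m_N, S_N)
    return np.sum(integrand) * dw

t_values = np.array([0.0, 1.2, 2.5])
numeric = np.array([predictive_density_numeric(t) for t in t_values])
closed = normal_pdf(t_values, phi * m_N, 1.0 / beta + phi ** 2 * S_N)

print(numeric)
print(closed)
```

The two printed arrays should agree closely, which suggests that the integral of the product of the two Gaussians is itself Gaussian in $t$.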

$(7)$ can be evaluated without carrying out the integration explicitly by using the following formula for linear Gaussian models.

$\begin{eqnarray} p(\mathbf x) &=& \mathcal{N}(\mathbf x | \boldsymbol\mu, \mathbf\Lambda^{-1})\tag{8}\\ p(\mathbf y | \mathbf x) &=& \mathcal{N}(\mathbf y | \mathbf A \mathbf x + \mathbf b, \mathbf{L}^{-1}) \tag{9} \end{eqnarray}$

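Given $(8)$ and $(9)$, the marginal distribution of $\mathbf y$ is

$\begin{eqnarray} p(\mathbf y) = \mathcal{N}(\mathbf y | \mathbf A {\boldsymbol\mu} + \mathbf b , \mathbf{L}^{-1} + \mathbf A \mathbf \Lambda^{-1} \mathbf A^{\top}) \tag{10} \end{eqnarray}$

That is, the mean is pushed through the linear map, and the covariance is the noise covariance plus the prior covariance transformed by $\mathbf A$.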
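Formula $(10)$ can be checked empirically by sampling. The sketch below, using NumPy, draws samples of $\mathbf x$ from $(8)$, generates $\mathbf y$ according to $(9)$, and compares the sample mean and variance with the values predicted by $(10)$. The function name `marginal_of_linear_gaussian` and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def marginal_of_linear_gaussian(mu, Lambda_inv, A, b, L_inv):
    """Mean and covariance of p(y) according to formula (10)."""
    mean = A @ mu + b
    cov = L_inv + A @ Lambda_inv @ A.T
    return mean, cov

# Illustrative parameters (assumptions for this sketch only).
mu = np.array([1.0, -2.0])
Lambda_inv = np.array([[2.0, 0.5],
                       [0.5, 1.0]])   # covariance of x
A = np.array([[1.0, 3.0]])            # maps 2-dimensional x to 1-dimensional y
b = np.array([0.5])
L_inv = np.array([[0.25]])            # covariance of y given x

mean, cov = marginal_of_linear_gaussian(mu, Lambda_inv, A, b, L_inv)

# Monte Carlo check: sample x ~ (8), then y | x ~ (9).
rng = np.random.default_rng(0)
n = 200_000
x = rng.multivariate_normal(mu, Lambda_inv, size=n)
noise = rng.multivariate_normal(np.zeros(1), L_inv, size=n)
y = x @ A.T + b + noise

print("formula mean, variance:", mean[0], cov[0, 0])
print("sample  mean, variance:", y.mean(), y.var())
```

With these numbers the formula gives mean $-4.5$ and variance $14.25$; the sample statistics should land near those values, up to Monte Carlo error.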

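Apply this formula with the following correspondence, where the formula's $\mathbf x$ (not the input ${\bf x}$) plays the role of ${\bf w}$: $\mathbf x\to{\bf w}$, ${\boldsymbol\mu}={\bf m}_N$, ${\boldsymbol\Lambda}^{-1}={\bf S}_N$, $\mathbf y=t$, ${\bf A}={\boldsymbol\phi}({\bf x})^\top$ (so that ${\bf A}{\bf w}={\boldsymbol\phi}({\bf x})^\top{\bf w}$), ${\bf L}^{-1}=\beta^{-1}$, ${\bf b}=0$.
Then $(7)$ becomes: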

$\begin{eqnarray} &&p(t|{\bf x},{\bf X},{\bf t})=\mathcal{N}(t|m({\bf x}),s^2({\bf x}))\tag{11}\\ \end{eqnarray}$

where

$\begin{eqnarray} &&m({\bf x})={\boldsymbol\phi}({\bf x})^\top{\bf m}_N\tag{12}\\ &&s^2({\bf x})=\beta^{-1}+{\boldsymbol\phi}({\bf x})^\top{\bf S}_N{\boldsymbol\phi}({\bf x})\tag{13}\\ \end{eqnarray}$
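Equations $(12)$ and $(13)$ translate directly into a few lines of NumPy. The sketch below is illustrative only. It assumes a zero-mean isotropic prior $\mathcal{N}({\bf w}|{\bf 0},\alpha^{-1}{\bf I})$, for which the posterior parameters are ${\bf S}_N^{-1}=\alpha{\bf I}+\beta{\boldsymbol\Phi}^\top{\boldsymbol\Phi}$ and ${\bf m}_N=\beta{\bf S}_N{\boldsymbol\Phi}^\top{\bf t}$; this prior, the polynomial basis `poly_features`, and the synthetic data are assumptions not stated in this section.

```python
import numpy as np

def predictive(phi_x, m_N, S_N, beta):
    """Predictive mean (12) and variance (13) for a feature vector phi(x)."""
    mean = phi_x @ m_N
    var = 1.0 / beta + phi_x @ S_N @ phi_x
    return mean, var

def poly_features(x, degree=3):
    """Hypothetical basis: phi(x) = (1, x, x^2, ..., x^degree)."""
    return np.array([x ** j for j in range(degree + 1)])

# Assumed hyperparameters: prior precision alpha and noise precision beta.
alpha, beta = 2.0, 25.0

# Synthetic training data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=20)
t = np.sin(np.pi * X) + rng.normal(0.0, beta ** -0.5, size=20)

# Posterior parameters under the assumed prior N(w | 0, alpha^{-1} I).
Phi = np.stack([poly_features(x) for x in X])
S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t

mean, var = predictive(poly_features(0.3), m_N, S_N, beta)
print("predictive mean:", mean)
print("predictive variance:", var)
```

Because ${\bf S}_N$ is positive definite, the predictive variance always exceeds the noise variance $\beta^{-1}$; the second term in $(13)$ reflects the remaining uncertainty about ${\bf w}$.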