[Work Log] Branching ML, debugging, training

September 03, 2013
Project Tulips
Subproject Data Association v3
Working path projects/tulips/trunk/src/matlab/data_association_3
SVN Revision 15229
Unless otherwise noted, all filesystem paths are relative to the "Working path" named above.

Blockwise precision decomposition issue

Bug: when attaching, we were getting incorrect results for the estimated location of the attachment point. Our computation uses a subset of the data points on the parent curve near the branch point to estimate the distribution of the branch-point position. We were simply taking a submatrix of the full (decomposed) precision matrix, which isn't correct, due to the way the full decomposed matrix is constructed.

Submatrix of decomposition != decomposition of submatrix

When using a subset of the variables, we need to be able to take the corresponding submatrix of the noise precision and of the prior.

Let \(s' s = S\), and let \(P\) be a matrix that selects rows of \(S\). If you want to decompose the submatrix \(P S P'\), in general you cannot simply take the submatrix \(P s P'\) (a quick numerical check appears after the list). There are at least two special cases where this is valid:

  1. \(s\) is the Cholesky factor of \(S\). But since \(S\) is degenerate (rank-deficient) in our case, we can't use Cholesky.
  2. \(S\) is block-diagonal, and \(P\) doesn't split up the blocks.
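
A quick numerical check of the general case (a minimal, hypothetical 4×4 example, not from our codebase):

    % Submatrix of a decomposition vs. decomposition of a submatrix.
    rng(0);
    A = randn(4);  S = A' * A;             % a generic PSD matrix
    [V, D] = eig(S);  s = sqrt(D) * V';    % one valid decomposition: s' * s == S
    P = eye(4);  P = P([1 3], :);          % selects variables 1 and 3

    % Recomposing the submatrix of the factor does not recover the submatrix of S:
    norm((P * s * P')' * (P * s * P') - P * S * P')   % nonzero in general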

I thought condition 2 was satisfied, because I foolishly assumed the eigenvector matrix produced by eigenvalue decomposition preserved the block structure. (It doesn't in general: eig reorders eigenvalues, which interleaves the eigenvectors of different blocks.) I had to modify correspondence/flatten_sort_and_reverse.m to perform the eigenvalue decomposition on each block individually. I had previously determined that operating on groups of five blocks at once is faster than operating on them individually, but we'll have to sacrifice that speed-up for now.
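
The per-block version amounts to something like the following (a sketch with hypothetical variable names, not the actual contents of flatten_sort_and_reverse.m):

    % Decompose a block-diagonal precision matrix block by block, so the
    % factor s is itself block-diagonal with s' * s equal to the full matrix.
    % 'blocks' is a hypothetical cell array holding the diagonal blocks.
    factors = cell(size(blocks));
    for i = 1:numel(blocks)
        [V, D] = eig(blocks{i});           % eigendecomposition of one block
        d = max(diag(D), 0);               % clamp tiny negative eigenvalues (S is degenerate)
        factors{i} = diag(sqrt(d)) * V';   % per-block factor
    end
    s = blkdiag(factors{:});               % block structure preserved by construction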


Okay, fixed the precision problem. After attaching stem 2 to stem 1, Corrs{2}.mu_b now looks reasonable.

Still getting really low MLs...


Bingo! Found a bug in curve_tree_ml_2.m, introduced when we started allowing nonzero means when evaluating our conditional normal distribution. Recall the equation for the marginal likelihood (reproduced from this post):

\[
\begin{align}
p(y) &= \frac{Z_{l,3d}}{Z_l} \frac{1}{Z} \exp\left\{ -\tfrac{1}{2}\, x^\top S (S + S K S)^{-1} S x \right\} \\
     &= \frac{Z_{l,3d}}{Z_l} \frac{1}{Z} \exp\left\{ -\tfrac{1}{2}\, x^\top s^\top (I + s K s^\top)^{-1} s x \right\} \\
     &= \frac{Z_{l,3d}}{Z_l} \, \mathcal{N}\!\left( s x ;\, 0,\, I + s K s^\top \right)
\end{align}
\]
Aside: in the second line we've substituted the decomposition \(s^\top s = S\), which is what we use in practice.

However, this equation assumes a zero mean. If we include a nonzero mean, the \(s\) matrix distributes onto both \(x\) and \(\mu\):

\[
\begin{align}
p(y) &= \frac{Z_{l,3d}}{Z_l} \frac{1}{Z} \exp\left\{ -\tfrac{1}{2}\, (x - \mu)^\top s^\top (I + s K s^\top)^{-1} s (x - \mu) \right\} \\
     &= \frac{Z_{l,3d}}{Z_l} \, \mathcal{N}\!\left( s x ;\, s \mu,\, I + s K s^\top \right)
\end{align}
\]

In our implementation, we weren't multiplying \(\mu\) by \(s\); fixing this seems to fix the low ML issue.
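
The corrected term looks roughly like this (a minimal sketch with hypothetical variable names, assuming \(s\), \(K\), \(x\), and \(\mu\) are as above and \(I + s K s^\top\) is positive definite):

    % Log of the Gaussian term, with the mean passed through s like the data.
    C = eye(size(s, 1)) + s * K * s';   % covariance of the transformed data
    r = s * x - s * mu;                 % residual: s*mu, NOT mu (this was the bug)
    L = chol(C, 'lower');               % safe: C is positive definite
    z = L \ r;                          % whitened residual
    log_N = -0.5 * (z' * z) - sum(log(diag(L))) - 0.5 * numel(r) * log(2*pi);

With the fix in place: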

    Corrs = attach(Corrs, 2, 1, 12, -12, params);
    curve_tree_ml_2(Corrs, params, data_)

    ans =

       2.0903e+04

    Corrs_ind = attach(Corrs, 2, 0, 0, 0, params);
    curve_tree_ml_2(Corrs_ind, params, data_)

    ans =

       2.0804e+04

In the case above, the ML is roughly maximized when the branch index is -12 and the start index is 12, which seems to agree with the data visually. Note that the attached configuration (2.0903e+04) also scores higher than the independent one (2.0804e+04).
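
A brute-force way to find such indices would be a sweep like the following (hypothetical search ranges, assuming the attach argument order used above):

    % Sweep start/branch indices for attaching stem 2 to stem 1, keeping
    % the pair that maximizes the marginal likelihood.
    best_ml = -inf;  best = [0 0];
    for start_idx = 1:20                  % hypothetical ranges
        for branch_idx = -20:-1
            C  = attach(Corrs, 2, 1, start_idx, branch_idx, params);
            ml = curve_tree_ml_2(C, params, data_);
            if ml > best_ml
                best_ml = ml;  best = [start_idx, branch_idx];
            end
        end
    end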


Next:

  * get attachment, reversal, and branch points from ground truth
  * store ml_2d with corr.; update on merge; use during ML computation instead of data_
  * branching in training ML
  * training with branching
  * branch-wise reconstruction

Posted by Kyle Simek