Note: This entry marks the end of significant work on the new likelihood function. The SVN revision is noted in the meta-box to the right.
curve_ml5.m
Attempting to use sparsity to speed up curve_ml5.m.
First attempt: try building only the relevant elements of the prior matrix, K.
Result: Gains in multiplication are lost in the construction of K. Insignificant speedup. Rolling back.
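For the record, a minimal sketch of the shape of that attempt; kfun, mask, and n are stand-ins, not names from curve_ml5.m:

% Build only the prior-covariance entries the later sparse products touch.
[ii, jj] = find(mask);                        % (i,j) pairs of needed entries
vals = arrayfun(@(a, b) kfun(a, b), ii, jj);  % evaluate the covariance entrywise
K = sparse(ii, jj, vals, n, n);
% Per-entry evaluation like this is where the construction cost piles up.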
Second attempt: convert block-diagonal matrix S to a sparse matrix.
Result: Significant gains in multiplication; tolerable losses when constructing S. ~8x speedup.
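Roughly, with S and K as in the expressions later in this entry:

S = sparse(S);    % one-time conversion; its cost is the tolerable loss noted above
M = S * K * S';   % sparse-aware products skip the zeros off the diagonal blocks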
Third attempt: convert the individual blocks S_i to sparse, then construct S from them.
Result: Further speedup of ~2x.
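This works because blkdiag preserves sparsity when its inputs are sparse, so the dense intermediate is never built. A sketch, with S1..S3 standing in for however the blocks are actually stored:

blocks = cellfun(@sparse, {S1, S2, S3}, 'UniformOutput', false);
S = blkdiag(blocks{:});   % sparse inputs => sparse output, no dense detour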
Fourth attempt: change (eye(size(K)) + S * K * S') to (speye(size(K)) + S * K * S'); i.e., changing eye to speye.
Result: moderate speedup, ~1.5x.
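The likely reason even this helped: in MATLAB, adding a full matrix to a sparse one returns a full matrix, so the dense identity both costs an n-by-n allocation and densifies the whole sum. A quick check:

n = 1000;
issparse(eye(n)   + speye(n))   % 0: the full identity forces a full result
issparse(speye(n) + speye(n))   % 1: all-sparse arithmetic stays sparse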
I think we've squeezed all we can from this function. It's now 10x faster than the naive method on a 4000-dimensional problem.
Running test/test_ml_end_to_end_2.m
Computing marginal likelihood (old way)...
Done. (6362.9 ms)
Old ML: 207.041742
Computing marginal likelihood (new way, legacy correspondence)...
Done. (7655.2 ms)
New ML (Legacy corr): 207.700616
Computing marginal likelihood (new way)...
Done. (6537.7 ms)
New ML (New corr): 793.585038
Computing marginal likelihood (new way, no MIL)...
Done. (6330.3 ms)
New ML (New corr, no MIL): 793.600835
Computing marginal likelihood (direct method)...
Done. (667.9 ms) <------------------- 10x speedup over previous
New ML (direct method): 793.300407 <-- 0.04% error!
Note the speedup and negligible error compared to the previous method.
curve_ml5.m
I've noticed that markov_order must be much larger than I expected to avoid approximation error.
Recall that in curve_ml2.m, we use the Markov assumption to break down the prior covariance into small pieces, and then combine those pieces with the likelihood cliques. In that case, we could use a Markov order between 2 and 5 without significant approximation error.
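For context, here is the idea in sketch form (not the actual implementation in curve_ml2.m; function and variable names are mine). Each point is conditioned only on its m predecessors, so every term involves at most an (m+1)-dimensional marginal:

function ll = markov_gauss_ll(y, mu, Sigma, m)
% Order-m Markov approximation to a joint Gaussian log-density:
%   log p(y) ~= log p(y_1) + sum_i log p(y_i | y_{max(1,i-m)..i-1}),
% each conditional computed as a ratio of two small marginals.
ll = gauss_logpdf(y(1), mu(1), Sigma(1, 1));
for i = 2:numel(y)
    lo = max(1, i - m);
    ll = ll + gauss_logpdf(y(lo:i),   mu(lo:i),   Sigma(lo:i,   lo:i)) ...
            - gauss_logpdf(y(lo:i-1), mu(lo:i-1), Sigma(lo:i-1, lo:i-1));
end
end

function lp = gauss_logpdf(x, mu, C)
% Multivariate normal log-density via a Cholesky factor, for stability.
R  = chol(C);                      % C = R' * R
z  = R' \ (x(:) - mu(:));
lp = -0.5 * (z' * z) - sum(log(diag(R))) - 0.5 * numel(x) * log(2 * pi);
end

The cost is O(n * m^3) rather than cubic in n, which is why a small Markov order matters.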
In curve_ml5.m, using a Markov order less than 100 results in unacceptable error.
There are two reasons for this. First, we're decomposing the marginal likelihood Gaussian, not the prior. The prior is explicitly Markovian, whereas the marginal likelihood is not. The ML Gaussian adds extra uncertainty to every point, which means we need to consider more nearby points to avoid erroneous conclusions (the toy check below illustrates this).
Second, because the new approach creates a distinct index set for each view, indices are often repeated (multiple views of the same point), and considering them doesn't tell you anything about the direction the curve is heading.
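To see the first point concretely: a Markovian Gaussian has a banded inverse covariance, but adding per-point noise (which is roughly what the ML Gaussian does) destroys that bandedness. A toy check, with all quantities illustrative:

n = 8;
L = spdiags(repmat([-1 2 -1], n, 1), -1:1, n, n);  % order-1 Markov precision
P = inv(full(L));                                  % the prior covariance
Q = inv(P + 0.1 * eye(n));                         % precision after adding noise
max(max(abs(triu(Q, 2))))                          % clearly nonzero: no longer banded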
The first point will likely be mitigated when we start using the new curve models, which will allow the likelihood variance to decrease dramatically. However, the Markov assumption in the prior is less likely to hold under these models, so some experimentation will be needed.
The code for the new likelihood, curve_ml5.m, is now stable. It is fast and accurate in its current implementation, assuming a reasonable value for markov_order.
Medium-term goal: calibration/training for all curve models.
TODO