No work logs this week. All week I've been splitting time between research and packing for the move to Phoenix on Monday. I've made some progress on the training framework.
Due to periodic crashing of Matlab (user error!), I added a script tmp_setup_workspace.m, which is intended to restore all the needed variables for whatever task I'm currently working on.
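For the record, a minimal sketch of the kind of thing tmp_setup_workspace.m does; the file names, variables, and parameter below are placeholders, not the actual contents of the script.

    % tmp_setup_workspace.m -- rebuild the workspace after a Matlab crash.
    % Sketch only: file names and variables are placeholders.
    clear; clc;

    % Re-add project folders to the path.
    addpath('train');

    % Reload training data and any cached quantities previously saved to disk.
    if exist('training_data.mat', 'file')
        load('training_data.mat');          % e.g. inputs/targets
    end
    if exist('cached_components.mat', 'file')
        load('cached_components.mat');      % e.g. precomputed component matrices
    end

    % Re-set loose parameters normally tweaked at the prompt.
    noise_var = 1e-3;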
Spent some time thinking about how to design a version of the marginal likelihood computation that is optimized for training.
Some time also went into deriving the analytical ML gradient w.r.t. the training parameters. I believe the result won't be too complicated, but the derivation was taking up too much time, so we'll use a numerical gradient for now and return to the analytical gradient if the numerical one proves to be lacking.
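For reference, the numerical gradient is just central differences over the training parameters; a minimal sketch, where ml_fn stands in for whatever function evaluates the marginal likelihood at a given parameter vector (the function name and step size are placeholders).

    % Numerical gradient of the marginal likelihood via central differences.
    % ml_fn: handle mapping a parameter vector to the (log) marginal likelihood.
    function g = numerical_ml_grad(ml_fn, theta, h)
        if nargin < 3, h = 1e-6; end
        g = zeros(size(theta));
        for i = 1:numel(theta)
            dtheta = zeros(size(theta));
            dtheta(i) = h;
            % Central difference: O(h^2) accurate, two ML evaluations per parameter.
            g(i) = (ml_fn(theta + dtheta) - ml_fn(theta - dtheta)) / (2*h);
        end
    end

Each gradient costs two ML evaluations per parameter, which makes the per-evaluation cost the thing worth optimizing.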
To save some computation during training, I precomputed the component matrices for the prior and likelihood. Constructing the prior and likelihood covariance matrices will involve scaling and summing the component parts. These cached matrices consume about 740 MB with the current training set.
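In sketch form, the caching idea is: build the parameter-independent component matrices once, so that each covariance construction inside the training loop is only a cheap scale-and-sum (toy sizes below; the real components come from the prior/likelihood structure, not random matrices).

    % One-time setup: component matrices are built once and cached.
    n = 4;                                   % toy size for illustration
    comps = {randn(n), randn(n), randn(n)};  % placeholder component matrices
    w = [0.5, 1.2, 0.1];                     % placeholder parameter-dependent weights

    % Per training step: only this cheap scale-and-sum is redone.
    K = zeros(n);
    for i = 1:numel(comps)
        K = K + w(i) * comps{i};
    end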
However, one unavoidable bottleneck continues to be the Cholesky decomposition, which isn't improved by this precomputation. I was hoping there would be some matrix inversion tricks involving linear combinations of matrices, but my research on this came up empty. My last hope is to run the Cholesky on the GPU (Matlab makes this trivial), but if that doesn't speed things up, I'll resign myself to waiting 10 hours for training.
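The GPU experiment amounts to something like the following sketch (illustrative size only; requires the Parallel Computing Toolbox).

    % Compare CPU vs GPU Cholesky on a synthetic SPD matrix.
    n = 4000;                              % illustrative size only
    A = randn(n);  K = A*A' + n*eye(n);    % make a symmetric positive definite matrix

    tic; Lc = chol(K, 'lower'); t_cpu = toc;

    Kg = gpuArray(K);                      % push the matrix to the GPU
    tic; Lg = chol(Kg, 'lower'); wait(gpuDevice); t_gpu = toc;

    fprintf('CPU: %.2fs   GPU: %.2fs\n', t_cpu, t_gpu);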
Nevertheless, the component caching still gives a 1.5x end-to-end speedup on the no_purturb_kernel case, compared to the direct implementation.
...
Training-optimized marginal likelihood is finished; in train/tr_curves_ml.m. Results confirmed against the reference implementation: train/tr_curves_ml_ref.m.
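For illustration, a check against the reference might look like the following; the call signature here is a guess, not the actual interface of the two functions.

    % Compare the training-optimized ML against the reference implementation
    % at a handful of random parameter settings (signature is hypothetical).
    num_params = 5;                        % placeholder
    for trial = 1:10
        theta = randn(num_params, 1);
        ml_fast = tr_curves_ml(theta);
        ml_ref  = tr_curves_ml_ref(theta);
        assert(abs(ml_fast - ml_ref) < 1e-8 * max(1, abs(ml_ref)));
    end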