Rewrote the training procedure from scratch and am now getting much more sensible results. The overall goal is to isolate the parameters so they are as independent as possible, and then use one- or two-dimensional optimization to estimate each of them.
First, I hard-coded several variables.
Second, I estimated perturb_position_variance directly by measuring the variance across all groups of initial points.
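A minimal sketch of what "measuring the variance of all groups of initial points" might look like, assuming it amounts to a pooled variance over the groups. The function name and data layout are my own illustration, not the actual training code:

```python
import numpy as np

def pooled_variance(groups):
    """Pooled variance across groups: squared deviations from each
    group's own mean, divided by the total degrees of freedom."""
    ss = sum(((g - g.mean(axis=0)) ** 2).sum() for g in groups)
    dof = sum((g.shape[0] - 1) * g.shape[1] for g in groups)  # (n-1)*d per group
    return ss / dof

# Toy data: two groups of 2-D initial points.
groups = [
    np.array([[0.0, 0.0], [2.0, 2.0], [1.0, 1.0]]),
    np.array([[5.0, 5.0], [7.0, 7.0]]),
]
perturb_position_variance = pooled_variance(groups)
```

Pooling deviations around each group's own mean keeps differences in group location from inflating the estimate, which matters if the groups sit at different positions along the curve.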
Third, I estimated the noise and smoothness variances under the assumption of near-infinite position and rate variance. This follows from the theory of cubic-spline models, which places no penalty on initial position and rate. Used the gradient-free minimizer fminsearch, which works well for two-dimensional problems.
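The 2-D gradient-free search can be sketched as follows. This is a Python analog: scipy's Nelder-Mead method plays the role of MATLAB's fminsearch, and the quadratic objective below is a stand-in for the actual spline-model objective, which is not shown in the log:

```python
import numpy as np
from scipy.optimize import minimize

def objective(log_params):
    # Optimize in log-space so both variances stay positive.
    noise_var, smooth_var = np.exp(log_params)
    # Stand-in objective with a known minimum at (1.0, 0.5);
    # the real code would evaluate the spline-model fit here.
    return (noise_var - 1.0) ** 2 + (smooth_var - 0.5) ** 2

res = minimize(objective, x0=np.log([0.3, 0.3]), method="Nelder-Mead")
noise_var, smooth_var = np.exp(res.x)
```

Nelder-Mead needs no gradients and converges quickly in two dimensions, which matches the reasoning for using fminsearch here; it scales poorly beyond a handful of parameters, which is another argument for isolating parameters into one- or two-dimensional subproblems.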
Fourth, I estimated perturb smoothing and perturb scale, using the noise and smoothness variances from the previous step.
Fifth, I estimated position mean and variance directly.
Note: all of this is done using heuristic indexing rather than marginal-likelihood optimization. This is another key change from the earlier training.
Result: a much more sensible perturb smoothing parameter (~1e-5 vs. 1e-4) and perturb scale (2.06 vs. 0.9). End-to-end reconstruction is much more sensible; the curve is centered among the triangulated points.
TODO:
PERTURB_POSITION_VARIANCE MAY BE THE OVERFITTER!
Posted by Kyle Simek