The reversed curve issue only really matters during training. Our tests show that curve-flip moves will mix well, even if the maximum isn't always correct. Adding connections between parent and child curves should resolve these issues.
During training, derive curve direction from ground-truth.
Realized that my last attempt at this had two bugs, one of which was using rand() instead of randn(). Fixing this:
dir = randn(3,10000000);
dir = bsxfun(@times, dir, 1./sqrt(sum(dir.^2)));
var(dir(:))
ans =
0.3332
Compare this to earlier theoretical results of ~0.23.
This new result is interesting, because it is 25% higher than the empirical results we've been getting. I'm guessing that the fact that all of the curves point upward reduces the variance. To test this, we'll force all directions to lie in the top hemisphere:
dir = randn(3,10000000);
dir(3,:) = abs(dir(3,:));
dir = bsxfun(@times, dir, 1./sqrt(sum(dir.^2)));
var(dir(:))
ans =
0.3149
Yep. And in practice, our values take on an even smaller range of directions.
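As an independent re-check of the hemisphere effect (a Python stdlib sketch, not part of the codebase; the function name is mine): folding the third component to be non-negative gives the pooled components a nonzero mean, which should pull the pooled variance below 1/3 (analytically, 1/3 - (1/6)^2 = 11/36, about 0.306).

```python
import math
import random

def pooled_component_variance(n_samples, fold_up=False, seed=1):
    """Draw random unit 3-vectors; optionally fold z into the upper
    hemisphere. Return the variance of all components pooled together,
    mirroring MATLAB's var(dir(:)) up to the n vs n-1 denominator."""
    rng = random.Random(seed)
    comps = []
    for _ in range(n_samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        if fold_up:
            v[2] = abs(v[2])  # force the vector into the top hemisphere
        comps.extend(v)
    mean = sum(comps) / len(comps)
    return sum((c - mean) ** 2 for c in comps) / len(comps)
```

The folded variance comes out around 0.306, below the unrestricted 1/3, consistent with the direction of the drop observed above.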
...
Repeating for the 2D case:
dir = randn(2,10000000);
dir = bsxfun(@times, dir, 1./sqrt(sum(dir.^2)));
var(dir(:))
ans =
0.5000
This strongly suggests that the variance is 1/D. That makes sense: the squared components of a unit vector sum to 1, so by symmetry E[x_i^2] = 1/D, and each component has zero mean, giving var(x_i) = 1/D.
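A quick Python stdlib check of the 1/D pattern across a few dimensions (a sketch independent of the MATLAB runs; the function name is mine):

```python
import math
import random

def unit_component_variance(dim, n_samples, seed=0):
    """Estimate the variance of the components of uniformly random unit
    vectors in R^dim. Normalized Gaussians are uniform on the sphere, so
    by symmetry each component should have variance 1/dim."""
    rng = random.Random(seed)
    comps = []
    for _ in range(n_samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        comps.extend(x / norm for x in v)
    mean = sum(comps) / len(comps)
    return sum((c - mean) ** 2 for c in comps) / len(comps)
```

Running this for dim = 2, 3, 5 gives estimates close to 0.5, 0.333, and 0.2 respectively.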
Does connecting each of the curves result in better ML? Do we need to marginalize?
Construct the training ML for the background curve models.
Result: train/tr_train_bg.m
position_mean_2d: [2x1 double]
position_variance_2d: 1.7837e+04
rate_variance_2d: 0.5000
noise_variance_2d: 9.4597e-04
smoothing_variance_2d: 0.0157
Interesting that noise_variance_2d is so low. We expected it to be on the order of 1 pixel. More discussion on this later.
I retrained the BG model on only the foreground curves, and evaluated the ML under it.
position_mean_2d: [2x1 double]
position_variance_2d: 5.8116e+03
rate_variance_2d: 0.5000
noise_variance_2d: 4.8105e-04
smoothing_variance_2d: 0.0053
Smaller noise variance, smaller smoothing variance. This shrinking of variance with a smaller training set is typical overfitting behavior. Not of much concern.
Here's the comparison against the ML for the trained foreground model.
bg model = 4611.886746
fg model = -8049.097873
Not good. The FG model on true foreground curves should have a better marginal likelihood than the same curves under the BG model.
Some questions remain; this warrants further investigation.
If I force the noise variance to be equal to that of the fg model (0.72), the ML drops significantly (fg results reprinted for convenience):
bg model = -9413.1
fg model = -8049.097873
Now we're back in business. This is a good sanity check, but it doesn't explain why we can't get similar noise variances for both models when training.
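The drop is what a Gaussian noise model predicts: when the residuals are tiny, inflating the noise variance costs log-likelihood through the log(2*pi*var) normalization term, which the near-zero residual term can't buy back. A toy illustration in Python (stdlib only; data and function name are mine, not the project's):

```python
import math

def gauss_loglik(xs, mean, var):
    """Log-likelihood of xs under a Gaussian N(mean, var)."""
    n = len(xs)
    resid = sum((x - mean) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - resid / (2 * var)

# Tiny residuals, as in the low-noise fits above.
xs = [0.01, -0.02, 0.015, 0.0, -0.01]
fitted_var = sum(x * x for x in xs) / len(xs)  # ML estimate given mean 0

# Forcing the variance up to 0.72 drops the log-likelihood sharply.
ll_fitted = gauss_loglik(xs, 0.0, fitted_var)
ll_forced = gauss_loglik(xs, 0.0, 0.72)
```

Here ll_fitted is far above ll_forced, mirroring the bg-model ML drop when the noise variance is pinned to 0.72.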
Possibly the smoothing variance would need to change in this case. Retraining, with noise_variance forced to 0.72:
position_mean_2d: [2x1 double]
position_variance_2d: 5.8116e+03
rate_variance_2d: 0.5000
noise_variance_2d: 0.7204
smoothing_variance_2d: 2.7394e-05
Smoothing variance dropped dramatically. ML comparison (fg results reprinted for convenience):
bg model = -9214.632874
fg model = -8049.097873
Note that the bg ML didn't change much (about -2%) after retraining under the forced noise variance, even though the smoothing variance itself changed a lot.
Might be worthwhile visualizing the optimal fits with these parameters. Are we oversmoothing? Undersmoothing? Either would suggest a bug.
Running reference ML:
data_2 = init_data_curves_ml(data_, bg_train_params_done_force)
sum([data_2.curves_ml{:}])
ans =
-5.1552e+04
Very different from the training implementation. Need to dig deeper to determine the cause.
Do we need to re-estimate the index set during training of the FG model?
Iterate: train, re-index, repeat.
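One way to read this: FG training and index-set re-estimation form an alternation, in the same spirit as k-means/EM, where each half-step should not decrease the objective. A toy 1-D sketch of that kind of alternation (names and data are mine, purely illustrative of the loop structure, not the project's model):

```python
def two_means_1d(xs, init=(0.0, 1.0), iters=20):
    """Toy alternation: assign each point to the nearest center (the
    're-index' step), then refit each center as the mean of its points
    (the 'train' step). Repeat until the iteration budget runs out."""
    c0, c1 = init
    for _ in range(iters):
        a = [x for x in xs if abs(x - c0) <= abs(x - c1)]  # re-index
        b = [x for x in xs if abs(x - c0) > abs(x - c1)]
        if a:
            c0 = sum(a) / len(a)  # retrain center 0
        if b:
            c1 = sum(b) / len(b)  # retrain center 1
    return c0, c1
```

As with the FG model, the result can depend on the initial index assignment, which is one more reason to check whether the index set needs re-estimating during training.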