
[Work Log]

August 03, 2013
Project Tulips
Subproject Data Association v2
Working path projects/tulips/trunk/src/matlab/data_association_2
Unless otherwise noted, all filesystem paths are relative to the "Working path" named above.

Running optimizer...

Issue: Likelihood variance is collapsing to zero.

Guess: Training ML is different from naive ML and isn't penalizing noise correctly.

Test 1: save params, compare training ML and naive ML

Same result.

Discussion

The marginal likelihood is monotonically increasing as the noise variance $\sigma_n^2$ approaches zero. This shouldn't be happening, because as the likelihood variance collapses to a delta, the marginal likelihood should become equal to the prior evaluated at the (virtual) observed locations:

$$
\begin{aligned}
\lim_{\sigma_n \to 0} p(D)
  &= \lim_{\sigma_n \to 0} \int p_0(x)\, p_l(D \mid x)\, dx && \text{(Marginalization)} \\
  &= \lim_{\sigma_n \to 0} \int p_0(x)\, f(D - x)\, dx && \text{(Convolution)} \\
  &= \int p_0(x)\, \delta(D - x)\, dx \\
  &= p_0(D)
\end{aligned}
$$

This implies there's a bug in the code causing this phenomenon; the mathematical model is not the cause.
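
As a sanity check on this argument, here's a minimal 1-D numerical sketch, assuming a Gaussian prior and Gaussian likelihood (an illustration only, not the real model's densities; normpdf is from the Statistics Toolbox). The Gaussian-Gaussian marginal has a closed form and converges to the prior density at D instead of blowing up:

mu0 = 0;  var0 = 2.0;                  % prior p0(x) = N(x; mu0, var0)
D   = 1.3;                             % (virtual) observed location
p0_at_D = normpdf(D, mu0, sqrt(var0));
for noise_var = [1 1e-2 1e-4 1e-6]
    % Gaussian-Gaussian marginal: p(D) = N(D; mu0, var0 + noise_var)
    pD = normpdf(D, mu0, sqrt(var0 + noise_var));
    fprintf('noise_var = %g: p(D) = %.6f  (p0(D) = %.6f)\n', ...
            noise_var, pD, p0_at_D);
end
% p(D) approaches p0(D) instead of growing without bound as noise_var -> 0.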

Solved

Found the bug. When computing the likelihood precision S from the unscaled precision S0, the noise variance $\sigma_n^2$ was multiplied instead of divided:

S = S0 * sigma_n;   % incorrect
S = S0 / sigma_n;   % corrected version
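
(Precision is the reciprocal of variance, so the noise variance must divide the unscaled precision, not multiply it.)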

Resuming Training

Had some trouble with noise sigmas being driven too low. Solved in two ways:

  1. Attempts 1-3: clamping to a minimum value and then adding a penalty depending on the amount that was clamped (see the sketch after this list).
  2. Attempt 4: offset by minimum value before transforming
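
Roughly, the clamp-and-penalize scheme looks like this (the function name and the linear penalty form are schematic, not the actual training code):

function [v_clamped, penalty] = clamp_with_penalty(v, v_min, penalty_scale)
% Clamp v up to v_min and charge a penalty proportional to the amount
% clamped away; the penalty is added to the objective being minimized.
v_clamped = max(v, v_min);
penalty   = penalty_scale * max(v_min - v, 0);
end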

Command:

train_params_done = tr_train(test_Corrs_ll, train_params, data_, 400);

Attempt #1

Handled extreme values by incurring a penalty for variances smaller than 1e-5. Result stored in train_params_done_1.

train_params_done_1 = 

            smoothing_variance: 0.0025
                noise_variance: 0.1231
             position_variance: 1.6676e+04
                 rate_variance: 0.2212
    perturb_smoothing_variance: 1
         perturb_rate_variance: 1
     perturb_position_variance: 1
                 perturb_scale: 1

Surprisingly small smoothness sigma. Surprisingly low rate variance.

Attempt #2

Testing sensitivity to the magnitude of the penalty term. Scaled up penalty by 1000. Results in train_params_done_2:

train_params_done_2 = 

            smoothing_variance: 0.0025
                noise_variance: 0.1231
             position_variance: 1.6676e+04
                 rate_variance: 0.2212
    perturb_smoothing_variance: 1
         perturb_rate_variance: 1
     perturb_position_variance: 1
                 perturb_scale: 1

Summary: no change.

Attempt #3

Testing sensitivity to the threshold where the penalty term starts to be incurred. Set MIN_NOISE_VARIANCE to 1e-3 and MIN_SMOOTHING_VARIANCE to 1e-7 (previously both 1e-5). Results in train_params_done_3:

train_params_done_3 = 

            smoothing_variance: 0.0025
                noise_variance: 0.1231
             position_variance: 1.6676e+04
                 rate_variance: 0.2212
    perturb_smoothing_variance: 1
         perturb_rate_variance: 1
     perturb_position_variance: 1
                 perturb_scale: 1

Summary: no change.

Attempt #4

Handled small variances by offsetting by the minimum value before applying the log transform:

% converting param to state variable
x(1) = log(smoothing_variance - MIN_SMOOTHING_VARIANCE);

% converting state variable to param
smoothing_variance = exp(x(1)) + MIN_SMOOTHING_VARIANCE;
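
A quick round-trip check of the transform (schematic; the floor value here is illustrative):

MIN_SMOOTHING_VARIANCE = 1e-5;
v  = 0.0025;                             % a parameter value above the floor
x  = log(v - MIN_SMOOTHING_VARIANCE);    % param -> state variable
v2 = exp(x) + MIN_SMOOTHING_VARIANCE;    % state variable -> param
assert(abs(v - v2) < 1e-12);             % recovers the original value
% Any real x maps back to v > MIN_SMOOTHING_VARIANCE, so the optimizer
% can never propose a variance below the floor.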

This is more elegant than the penalty hack used in the last three attempts. Results:

train_params_done = 

            smoothing_variance: 0.0025
                noise_variance: 0.1231
             position_variance: 1.6676e+04
                 rate_variance: 0.2212
    perturb_smoothing_variance: 1
         perturb_rate_variance: 1
     perturb_position_variance: 1
                 perturb_scale: 1

Summary: no change.

No-Perturb model summary:

Successfully trained model.
Required 115 function evaluations, taking 112 s.
Optimal marginal likelihood: 23714.937760.
Optimal parameters:

            smoothing_variance: 0.0025
                noise_variance: 0.1231
             position_variance: 1.6676e+04
                 rate_variance: 0.2212

Surprised at how small noise_variance was, considering the calibration noise. However, I guess the maximum-likelihood reconstruction looked pretty good when unity-aspect-ratio axis scaling is used.

Training OU-Perturb model

Getting weird results. Halted after the second iteration, which took 250 function evaluations; the perturb_smoothing_variance gradient is zero. Results:

train_params_done = 

            smoothing_variance: 1.5003e-05
                noise_variance: 1.0087e-04
             position_variance: 7.9393e+04
                 rate_variance: 0.7879
    perturb_smoothing_variance: 0.2500
         perturb_rate_variance: 1.5183
     perturb_position_variance: 2.5159e+04
                 perturb_scale: 48.6163

Recall that the new model's MLs aren't deeply tested; there's probably a bug in there (in the kernel implementation? in the kernel theory? in the conversion from matrices to kernel?). Will continue tomorrow.
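
Before digging into the kernel code, one cheap check on the suspicious zero gradient for perturb_smoothing_variance is to compare the analytic gradient against central finite differences. A generic sketch (f stands for whatever scalar objective the trainer minimizes; that handle is an assumption):

function g_fd = fd_gradient(f, x, h)
% Central-difference gradient of scalar objective f at point x,
% for checking analytic gradients (debugging aid only).
if nargin < 3, h = 1e-6; end
g_fd = zeros(size(x));
for i = 1:numel(x)
    e       = zeros(size(x));
    e(i)    = h;
    g_fd(i) = (f(x + e) - f(x - e)) / (2 * h);
end
end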

Posted by Kyle Simek