Description: Run initial model estimation on real data for the first time.
Results:
Single cluster:
0.00000000e+00 0.00000000e+00
8.73210306e-03 6.75372609e-02 3.38471516e-02
1.88790755e+00 -1.88387427e+01 -9.88494611e+00
-8.09450747e-01 -2.56388154e-01 -3.61218358e-01 -4.23137123e-02 1.64628674e-02 -1.02962552e-02 -3.82633520e-01
-5.26954930e+01 -9.01436620e+00 -2.14090141e+01 1.88492958e+00 6.61450704e+00 8.49971831e+00 -2.38576056e+01
24.6497
Multiple cluster:
num_clusters:3
log weights: 0 0 0
cluster #1
0.00000000e+00 0.00000000e+00
-2.68876589e-01 -7.04390256e-02 1.45450526e-02
6.16258430e+03 6.13067965e+03 6.12930057e+03
-1.08771413e-02 5.76005556e-04 -1.04761899e-02 -1.69093096e-03 9.36991863e-04 -3.78348959e-04 1.27422332e-03
-4.89152880e+01 -9.21454922e+00 -1.77681544e+01 2.47259001e+00 6.28886806e+00 8.63120843e+00 -2.43004449e+01
26.2893
cluster #2
0.00000000e+00 0.00000000e+00
1.20215477e-01 0.00000000e+00 2.24383835e-01
6.11014579e+03 0.00000000e+00 5.96700128e+03
-1.08771413e-02 5.76005556e-04 -1.04761899e-02 -1.69093096e-03 9.36991863e-04 -3.78348959e-04 1.27422332e-03
-4.89152880e+01 -9.21454922e+00 -1.77681544e+01 2.47259001e+00 6.28886806e+00 8.63120843e+00 -2.43004449e+01
26.2893
cluster #3
0.00000000e+00 0.00000000e+00
-1.52115349e-01 2.77516102e+00 1.82492585e+00
-3.84569849e+00 -7.18131761e+02 -1.63056790e+02
-1.08771413e-02 5.76005556e-04 -1.04761899e-02 -1.69093096e-03 9.36991863e-04 -3.78348959e-04 1.27422332e-03
-4.89152880e+01 -9.21454922e+00 -1.77681544e+01 2.47259001e+00 6.28886806e+00 8.63120843e+00 -2.43004449e+01
26.2893
Discussion:
Surprisingly high epsilon (~26). This is far beyond the dynamic range of the data, suggesting either (a) a bug, (b) a terrible model, or (c) failure of the analytical estimation method to find a good result. Option (a) seems most likely, since even a flat line gives a lower error variance than this. Perhaps our observation basis A was poorly estimated. Let's re-run with the PCA method.
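The flat-line comparison above can be made explicit. This is a minimal sketch, not the code used for these runs: it assumes epsilon is reported on the same scale as a residual variance (if it is a standard deviation, square it first), and the function names and toy readings are hypothetical.

```python
from statistics import pvariance


def flat_line_error_variance(readings):
    """Residual variance when every reading is predicted by the series mean."""
    # The best constant predictor is the mean, so its residual variance
    # is simply the population variance of the readings.
    return pvariance(readings)


def model_beats_flat_line(readings, model_error_variance):
    """True if the fitted model's error variance is below the flat-line baseline."""
    return model_error_variance < flat_line_error_variance(readings)


# Toy readings (hypothetical, not from the data set).
toy_readings = [1.0, 2.0, 3.0, 4.0]
baseline = flat_line_error_variance(toy_readings)
sane = model_beats_flat_line(toy_readings, 24.6497)
```

If this check fails on the real data, the fit is worse than predicting the mean, which points at a bug rather than at a merely weak model.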
Description: Re-run, but use PCA instead of regression to estimate the observation transformation, A (i.e. set the constant use_regression_method to false).
Results:
Single cluster:
0.00000000e+00 0.00000000e+00
8.73210306e-03 6.75372609e-02 3.38471516e-02
1.88790755e+00 -1.88387427e+01 -9.88494611e+00
-8.09450747e-01 -2.56388154e-01 -3.61218358e-01 -4.23137123e-02 1.64628674e-02 -1.02962552e-02 -3.82633520e-01
-5.26954930e+01 -9.01436620e+00 -2.14090141e+01 1.88492958e+00 6.61450704e+00 8.49971831e+00 -2.38576056e+01
24.6497
Multiple cluster:
num_clusters:3
log weights: 0 0 0
cluster #1
0.00000000e+00 0.00000000e+00
1.13041314e-02 6.75372609e-02 3.41203573e-02
3.56421036e-01 -1.88387427e+01 -1.11922319e+01
-8.09450747e-01 -2.56388154e-01 -3.61218358e-01 -4.23137123e-02 1.64628674e-02 -1.02962552e-02 -3.82633520e-01
-5.26954930e+01 -9.01436620e+00 -2.14090141e+01 1.88492958e+00 6.61450704e+00 8.49971831e+00 -2.38576056e+01
24.6497
cluster #2
0.00000000e+00 0.00000000e+00
1.05708618e-03 0.00000000e+00 0.00000000e+00
5.35534897e+01 0.00000000e+00 0.00000000e+00
-8.09450747e-01 -2.56388154e-01 -3.61218358e-01 -4.23137123e-02 1.64628674e-02 -1.02962552e-02 -3.82633520e-01
-5.26954930e+01 -9.01436620e+00 -2.14090141e+01 1.88492958e+00 6.61450704e+00 8.49971831e+00 -2.38576056e+01
24.6497
cluster #3
0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 3.06188063e-02
1.18920520e+02 0.00000000e+00 7.71079754e+01
-8.09450747e-01 -2.56388154e-01 -3.61218358e-01 -4.23137123e-02 1.64628674e-02 -1.02962552e-02 -3.82633520e-01
-5.26954930e+01 -9.01436620e+00 -2.14090141e+01 1.88492958e+00 6.61450704e+00 8.49971831e+00 -2.38576056e+01
24.6497
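Discussion:
No noticeable improvement. Epsilon is unchanged at 24.6497 for the single-cluster fit, and the multiple-cluster fit only drops from 26.2893 to that same value. During K-means, cluster collapse was frequent, which didn't occur in the previous run.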
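To quantify how often collapse happens, the cluster assignments can be checked after each K-means iteration. This is a minimal sketch under assumptions: assignments are taken to be a list of integer cluster indices in the range 0 to num_clusters - 1, and the function name and min_size parameter are hypothetical, not part of the existing code.

```python
from collections import Counter


def collapsed_clusters(assignments, num_clusters, min_size=1):
    """Return the indices of clusters with fewer than min_size members."""
    counts = Counter(assignments)
    # Counter returns 0 for clusters that received no points at all.
    return [c for c in range(num_clusters) if counts[c] < min_size]


# Toy assignment vector (hypothetical): cluster 1 has no members.
empty = collapsed_clusters([0, 0, 2, 2, 0], num_clusters=3)
```

Logging the returned list per iteration would show whether collapse always hits the same cluster, which would suggest a bad initialization rather than a property of the data.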
Below are histograms of raw immunity readings for each marker. TNF-α, IL-2, and IL-6 have sensible distributions.
IL-1b and IL-10 have sharply peaked distributions with heavy tails. Perhaps the outliers are isolated to specific plates (investigated next).
IFN and IL-8 are borderline: peaked with heavy tails, but not as bad as IL-1b and IL-10.
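One way to test the plate hypothesis is to flag outliers with a robust score and count them per plate. This is a sketch only: the (plate, value) record layout and the function name are assumptions, and the modified z-score with the 0.6745 constant and a 3.5 cutoff is a common convention, not something taken from this analysis.

```python
from collections import Counter
from statistics import median


def outliers_per_plate(records, threshold=3.5):
    """Count readings per plate whose modified z-score exceeds threshold.

    records is an iterable of (plate_id, value) pairs for a single marker.
    """
    records = list(records)
    values = [v for _, v in records]
    med = median(values)
    # Median absolute deviation: a spread estimate that heavy tails barely move.
    mad = median(abs(v - med) for v in values)
    counts = Counter()
    if mad == 0:
        return counts  # Degenerate spread; the score is undefined.
    for plate, v in records:
        if 0.6745 * abs(v - med) / mad > threshold:
            counts[plate] += 1
    return counts


# Toy readings (hypothetical): one extreme value on plate P2.
toy = [("P1", 1.0), ("P1", 1.1), ("P1", 0.9),
       ("P2", 1.0), ("P2", 1.2), ("P2", 50.0)]
by_plate = outliers_per_plate(toy)
```

If the counts concentrate on a few plates, those plates are candidates for exclusion or re-normalization; an even spread would instead suggest the heavy tails are a property of the marker itself.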