[Work Log] FIRE - initial analysis

March 05, 2014

Merging datasets into database

To perform interesting queries across datasets, we'll need to merge them into a consistent format. The "structure array" dataset doesn't seem to be suited to complex queries, so we'll use a cell array with rows corresponding to records and columns corresponding to fields. We will merge records from different datasets using (Subject_ID, Visit) as a unique key. We'll also merge the metadata structures of all datasets.

New files:

Can now do rudimentary table queries using matlab operators. For example, the code below loads the database and queries, "how many bool fields have coverage above 95%?"

cd /Users/ksimek/work/src/fire/src
% read data
[data, meta] = fire_read_all('../data');
% merge into single "database" table
[db, db_meta] = merge_datasets(data, meta);
% idenitfy column types and measure coverage
db_meta_2 = fire_classify_data_columns(db, db_meta);
% perform query: number of bool fields with 95% or greater coverage
I = ([db_meta_2.coverage] > 95); 
sum(cellfun(@(x) strcmp(x,'bool'), {db_meta_2(I).subtype}))

Miscellaneous thoughts

If doing dynamical analysis, consider re-aligning time, using one of the following as origin:

Posted by Kyle Simek
blog comments powered by Disqus