Haplo Prediction
predict haplogroups
weka.c File Reference

Definitions for Weka J48 and PART classifiers.

#include <config.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <inttypes.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <libxml/tree.h>
#include <libxml/parser.h>
#include <libxml/valid.h>
#include <jwsc/base/error.h>
#include <jwsc/base/file_io.h>
#include <jwsc/vector/vector.h>
#include <jwsc/matrix/matrix.h>
#include "xml.h"
#include "haplo_groups.h"
#include "weka.h"


Functions

static Error * read_weka_model_labels (char **labels_str_out, const char *labels_fname)
    Reads a list of Weka model labels.
Error * train_weka_j48_model (const Vector_u32 *labels, const Matrix_i32 *markers, const char *labels_fname, const char *model_fname, const char *weka_jar_fname)
    Trains a Weka J48 model and writes it to a file.
Error * train_weka_part_model (const Vector_u32 *labels, const Matrix_i32 *markers, const char *labels_fname, const char *model_fname, const char *weka_jar_fname)
    Trains a Weka PART model and writes it to a file.
Error * predict_labels_with_weka_j48_model (Vector_u32 **labels_out, Vector_d **confs_out, const Matrix_i32 *markers, const char *labels_fname, const char *model_fname, const char *weka_jar_fname)
    Predicts the labels for a set of marker samples using Weka's J48 classifier.
Error * predict_labels_with_weka_part_model (Vector_u32 **labels_out, Vector_d **confs_out, const Matrix_i32 *markers, const char *labels_fname, const char *model_fname, const char *weka_jar_fname)
    Predicts the labels for a set of marker samples using Weka's PART classifier.
static Error * read_weka_xml_doc (xmlDoc **xml_doc_out, const char *xml_fname, const char *dtd_fname)
    Reads and optionally validates an XML document.
static Error * create_weka_model_tree_from_xml_node (Weka_model_tree **tree_out, const Weka_model_tree *parent, uint32_t parent_label, xmlNode *xml_node, const char *model_dirname)
    Allocates and initializes a model tree from an XML node.
static Error * create_model_training_data (Vector_u32 **train_labels_out, Matrix_i32 **train_markers_out, const Vector_u32 *data_labels, const Matrix_i32 *data_markers, const Vector_u32 *model_labels, Vector_u32 *const *model_altlabels)
    Creates a set of training data as a model-specific labeled subset of a larger data set.
static Error * train_weka_j48_model_node (Weka_model_node *node, const Vector_u32 *labels, const Matrix_i32 *markers, const char *weka_jar_fname)
    Trains a model in the tree using node-specific data.
static Error * train_weka_part_model_node (Weka_model_node *node, const Vector_u32 *labels, const Matrix_i32 *markers, const char *weka_jar_fname)
    Trains a model in the tree using node-specific data.
Error * train_weka_j48_model_tree (Weka_model_tree **tree_out, const Vector_u32 *labels, const Matrix_i32 *markers, const char *tree_xml_fname, const char *tree_dtd_fname, const char *model_dirname, const char *weka_jar_fname)
    Trains a Weka J48 model tree.
Error * train_weka_part_model_tree (Weka_model_tree **tree_out, const Vector_u32 *labels, const Matrix_i32 *markers, const char *tree_xml_fname, const char *tree_dtd_fname, const char *model_dirname, const char *weka_jar_fname)
    Trains a Weka PART model tree.
static Error * recursively_predict_j48_labels_in_model_tree (Vector_u32 **labels_out, Vector_d **confs_in_out, const Weka_model_tree *tree, const Matrix_i32 *markers, const char *weka_jar_fname)
    Recursively predicts a label from a model tree.
static Error * recursively_predict_part_labels_in_model_tree (Vector_u32 **labels_out, Vector_d **confs_in_out, const Weka_model_tree *tree, const Matrix_i32 *markers, const char *weka_jar_fname)
    Recursively predicts a label from a model tree.
Error * predict_labels_with_weka_j48_model_tree (Vector_u32 **labels_out, Vector_d **confs_out, const Matrix_i32 *markers, const Weka_model_tree *tree, const char *weka_jar_fname)
    Predicts the labels for a set of marker samples using a Weka J48 model tree.
Error * predict_labels_with_weka_part_model_tree (Vector_u32 **labels_out, Vector_d **confs_out, const Matrix_i32 *markers, const Weka_model_tree *tree, const char *weka_jar_fname)
    Predicts the labels for a set of marker samples using a Weka PART model tree.
Error * read_weka_model_tree (Weka_model_tree **tree_out, const char *tree_xml_fname, const char *tree_dtd_fname, const char *model_dirname)
    Reads a Weka model tree.
void free_weka_model_tree (Weka_model_tree *tree)
    Frees a Weka model tree.

Detailed Description

Definitions for Weka J48 and PART classifiers.

Author:
Joseph Schlecht
License:
Creative Commons BY-NC-SA 3.0

Definition in file weka.c.


Function Documentation

static Error* read_weka_model_labels ( char **  labels_str_out,
const char *  labels_fname 
) [static]

Reads a list of Weka model labels.

Definition at line 76 of file weka.c.

Error* train_weka_j48_model ( const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  labels_fname,
const char *  model_fname,
const char *  weka_jar_fname
)

Trains a Weka J48 model and writes it to a file.

Parameters:
    labels          Labels for training; the ith element corresponds to the ith sample (row) in markers.
    markers         Markers for training; the ith row is a sample corresponding to the ith element of labels.
    labels_fname    Name of the file containing labels for groups in the model.
    model_fname     Name of the file the model is written to.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned.

Definition at line 141 of file weka.c.
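
The weka_jar_fname parameter suggests that training is delegated to Weka's command-line interface. As a minimal sketch, assuming the training data has already been written to an ARFF file, the command line could be assembled as shown below. The helper build_j48_train_cmd and the file names are hypothetical and are not part of weka.c; -t (training file) and -d (output model file) are standard Weka classifier options, and weka.classifiers.trees.J48 is Weka's J48 class (PART lives in weka.classifiers.rules.PART).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of weka.c: formats a Weka J48 training
 * command into buf. Returns 0 on success, -1 if buf is too small. */
static int build_j48_train_cmd(char *buf, size_t buf_len,
                               const char *weka_jar_fname,
                               const char *arff_fname,
                               const char *model_fname)
{
    int n;

    assert(buf != NULL && weka_jar_fname != NULL);
    assert(arff_fname != NULL && model_fname != NULL);

    /* -t names the training data, -d names the file the model is saved to. */
    n = snprintf(buf, buf_len,
                 "java -cp %s weka.classifiers.trees.J48 -t %s -d %s",
                 weka_jar_fname, arff_fname, model_fname);

    /* snprintf reports the length it needed; treat truncation as failure. */
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}
```

The sketch does no shell quoting, so file names containing spaces would need extra care before the string is passed to something like system().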

Error* train_weka_part_model ( const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  labels_fname,
const char *  model_fname,
const char *  weka_jar_fname
)

Trains a Weka PART model and writes it to a file.

Parameters:
    labels          Labels for training; the ith element corresponds to the ith sample (row) in markers.
    markers         Markers for training; the ith row is a sample corresponding to the ith element of labels.
    labels_fname    Name of the file containing labels for groups in the model.
    model_fname     Name of the file the model is written to.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned.

Definition at line 232 of file weka.c.

Error* predict_labels_with_weka_j48_model ( Vector_u32 **  labels_out,
Vector_d **  confs_out,
const Matrix_i32 *  markers,
const char *  labels_fname,
const char *  model_fname,
const char *  weka_jar_fname
)

Predicts the labels for a set of marker samples using Weka's J48 classifier.

Parameters:
    labels_out      Result parameter.
    confs_out       Result parameter.
    markers         Marker data to predict.
    labels_fname    Name of the file containing labels used by the model.
    model_fname     Name of the file containing the model.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned.

Definition at line 323 of file weka.c.

Error* predict_labels_with_weka_part_model ( Vector_u32 **  labels_out,
Vector_d **  confs_out,
const Matrix_i32 *  markers,
const char *  labels_fname,
const char *  model_fname,
const char *  weka_jar_fname
)

Predicts the labels for a set of marker samples using Weka's PART classifier.

Parameters:
    labels_out      Result parameter.
    confs_out       Result parameter.
    markers         Marker data to predict.
    labels_fname    Name of the file containing labels used by the model.
    model_fname     Name of the file containing the model.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned.

Definition at line 487 of file weka.c.

static Error* read_weka_xml_doc ( xmlDoc **  xml_doc_out,
const char *  xml_fname,
const char *  dtd_fname 
) [static]

Reads and optionally validates an XML document.

Definition at line 642 of file weka.c.

static Error* create_weka_model_tree_from_xml_node ( Weka_model_tree **  tree_out,
const Weka_model_tree *  parent,
uint32_t  parent_label,
xmlNode *  xml_node,
const char *  model_dirname 
) [static]

Allocates and initializes a model tree from an xml node.

Definition at line 690 of file weka.c.

static Error* create_model_training_data ( Vector_u32 **  train_labels_out,
Matrix_i32 **  train_markers_out,
const Vector_u32 *  data_labels,
const Matrix_i32 *  data_markers,
const Vector_u32 *  model_labels,
Vector_u32 *const *  model_altlabels 
) [static]

Creates a set of training data as a model-specific labeled subset of a larger data set.

Definition at line 825 of file weka.c.
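
One way to read "model-specific labeled subset" is: keep only the samples whose label appears in the model's label list. The sketch below illustrates that selection step under several assumptions: it uses plain C arrays instead of the jwsc Vector_u32 and Matrix_i32 types, it ignores model_altlabels entirely, and the names label_in_model and select_model_samples are hypothetical. It is not the implementation in weka.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if label occurs in model_labels, 0 otherwise. */
static int label_in_model(uint32_t label, const uint32_t *model_labels,
                          size_t num_model_labels)
{
    size_t i;

    for (i = 0; i < num_model_labels; i++)
    {
        if (model_labels[i] == label) return 1;
    }
    return 0;
}

/* Hypothetical helper: writes the indices of the samples whose label is
 * known to the model into idx_out (capacity num_samples) and returns how
 * many samples were kept. */
static size_t select_model_samples(size_t *idx_out,
                                   const uint32_t *data_labels,
                                   size_t num_samples,
                                   const uint32_t *model_labels,
                                   size_t num_model_labels)
{
    size_t i, kept = 0;

    assert(idx_out != NULL && data_labels != NULL && model_labels != NULL);

    for (i = 0; i < num_samples; i++)
    {
        if (label_in_model(data_labels[i], model_labels, num_model_labels))
        {
            idx_out[kept++] = i;
        }
    }
    return kept;
}
```

The returned indices could then be used to copy the matching elements of the label vector and the matching rows of the marker matrix into the training set.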

static Error* train_weka_j48_model_node ( Weka_model_node *  node,
const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  weka_jar_fname 
) [static]

Trains a model in the tree using node-specific data.

Definition at line 896 of file weka.c.

static Error* train_weka_part_model_node ( Weka_model_node *  node,
const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  weka_jar_fname 
) [static]

Trains a model in the tree using node-specific data.

Definition at line 944 of file weka.c.

Error* train_weka_j48_model_tree ( Weka_model_tree **  tree_out,
const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  tree_xml_fname,
const char *  tree_dtd_fname,
const char *  model_dirname,
const char *  weka_jar_fname
)

Trains a Weka J48 model tree.

Parameters:
    tree_out        Result parameter.
    labels          Sample group labels.
    markers         Sample marker values.
    tree_xml_fname  XML file containing the model tree information.
    tree_dtd_fname  DTD file for validating the XML file; can be NULL.
    model_dirname   Model directory location.
    weka_jar_fname  Weka Java archive (JAR) file.

Definition at line 1000 of file weka.c.

Error* train_weka_part_model_tree ( Weka_model_tree **  tree_out,
const Vector_u32 *  labels,
const Matrix_i32 *  markers,
const char *  tree_xml_fname,
const char *  tree_dtd_fname,
const char *  model_dirname,
const char *  weka_jar_fname
)

Trains a Weka PART model tree.

Parameters:
    tree_out        Result parameter.
    labels          Sample group labels.
    markers         Sample marker values.
    tree_xml_fname  XML file containing the model tree information.
    tree_dtd_fname  DTD file for validating the XML file; can be NULL.
    model_dirname   Model directory location.
    weka_jar_fname  Weka Java archive (JAR) file.

Definition at line 1061 of file weka.c.

static Error* recursively_predict_j48_labels_in_model_tree ( Vector_u32 **  labels_out,
Vector_d **  confs_in_out,
const Weka_model_tree *  tree,
const Matrix_i32 *  markers,
const char *  weka_jar_fname 
) [static]

Recursively predicts a label from a model tree.

Definition at line 1114 of file weka.c.

static Error* recursively_predict_part_labels_in_model_tree ( Vector_u32 **  labels_out,
Vector_d **  confs_in_out,
const Weka_model_tree *  tree,
const Matrix_i32 *  markers,
const char *  weka_jar_fname 
) [static]

Recursively predicts a label from a model tree.

Definition at line 1208 of file weka.c.

Error* predict_labels_with_weka_j48_model_tree ( Vector_u32 **  labels_out,
Vector_d **  confs_out,
const Matrix_i32 *  markers,
const Weka_model_tree *  tree,
const char *  weka_jar_fname
)

Predicts the labels for a set of marker samples using a Weka J48 model tree.

Parameters:
    labels_out      Result parameter. If *labels_out is NULL, it is allocated; otherwise its space is re-used.
    confs_out       Result parameter. If *confs_out is NULL, it is allocated; otherwise its space is re-used.
    markers         Marker data to predict. Each row is a sample for prediction, corresponding to an element in the result parameters.
    tree            Trained model tree to use for predicting.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned, but the result parameters are not freed.

Definition at line 1315 of file weka.c.
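
The allocate-or-re-use rule for the result parameters can be illustrated with a small stand-in type. This is a sketch under stated assumptions: Demo_vec_u32 and demo_prepare_result are hypothetical names, the real Vector_u32 layout and the jwsc allocation functions are not shown in this reference, and the sketch only demonstrates the calling convention described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a jwsc vector; the real Vector_u32 layout is not shown in
 * this reference. */
typedef struct
{
    size_t    num_elts;
    uint32_t *elts;
} Demo_vec_u32;

/* Hypothetical helper: makes *v_out hold n elements (n > 0). Allocates the
 * vector when *v_out is NULL; otherwise re-uses it, resizing its storage
 * only if the element count differs. Returns 0 on success, -1 on
 * allocation failure. */
static int demo_prepare_result(Demo_vec_u32 **v_out, size_t n)
{
    Demo_vec_u32 *v;
    uint32_t     *elts;

    assert(v_out != NULL && n > 0);

    v = *v_out;
    if (v == NULL)
    {
        v = malloc(sizeof(*v));
        if (v == NULL) return -1;
        v->num_elts = 0;
        v->elts     = NULL;
        *v_out      = v;
    }

    if (v->num_elts != n)
    {
        /* realloc(NULL, size) behaves like malloc(size). */
        elts = realloc(v->elts, n * sizeof(*elts));
        if (elts == NULL) return -1;
        v->elts     = elts;
        v->num_elts = n;
    }
    return 0;
}
```

A caller following this convention would initialize its result pointers to NULL before the first call and could pass the same pointers to later calls to avoid reallocating. Because the result parameters are not freed on error, the caller remains responsible for releasing them in both the success and the failure case.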

Error* predict_labels_with_weka_part_model_tree ( Vector_u32 **  labels_out,
Vector_d **  confs_out,
const Matrix_i32 *  markers,
const Weka_model_tree *  tree,
const char *  weka_jar_fname
)

Predicts the labels for a set of marker samples using a Weka PART model tree.

Parameters:
    labels_out      Result parameter. If *labels_out is NULL, it is allocated; otherwise its space is re-used.
    confs_out       Result parameter. If *confs_out is NULL, it is allocated; otherwise its space is re-used.
    markers         Marker data to predict. Each row is a sample for prediction, corresponding to an element in the result parameters.
    tree            Trained model tree to use for predicting.
    weka_jar_fname  Weka Java archive (JAR) file.
Returns:
    On success, NULL is returned; otherwise an error is returned, but the result parameters are not freed.

Definition at line 1352 of file weka.c.

Error* read_weka_model_tree ( Weka_model_tree **  tree_out,
const char *  tree_xml_fname,
const char *  tree_dtd_fname,
const char *  model_dirname 
)

Reads a Weka model tree.

Parameters:
    tree_out        Result parameter.
    tree_xml_fname  XML file containing the model tree information.
    tree_dtd_fname  DTD file for validating the XML file; can be NULL.
    model_dirname   Model directory location.

Definition at line 1381 of file weka.c.

void free_weka_model_tree ( Weka_model_tree *  tree )

Frees a Weka model tree.

Parameters:
    tree  Model tree to free.

Definition at line 1423 of file weka.c.
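
Freeing a tree is typically done bottom-up. The following sketch shows that pattern on a hypothetical node layout; Demo_tree and demo_free_tree are illustrative names only, and the actual fields of Weka_model_tree (declared in weka.h) are not reproduced in this reference, so the real function may release additional per-node data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tree layout; the real Weka_model_tree is declared in weka.h
 * and is not reproduced here. */
typedef struct Demo_tree
{
    size_t             num_children;
    struct Demo_tree **children;
} Demo_tree;

/* Frees a tree bottom-up: each child subtree first, then the child array,
 * then the node itself. Passing NULL is a no-op. */
static void demo_free_tree(Demo_tree *tree)
{
    size_t i;

    if (tree == NULL) return;

    assert(tree->num_children == 0 || tree->children != NULL);

    for (i = 0; i < tree->num_children; i++)
    {
        demo_free_tree(tree->children[i]);
    }
    free(tree->children);
    free(tree);
}
```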