BDVAL/Distribution-difference-by-feature

From Icbwiki

Compares the distribution of each feature used in a specific set of biomarker models. For each feature, the difference in signal distribution between two sample sets (for example, training set vs. validation set) is quantified. A Kolmogorov-Smirnov p-value and a ratio of rank statistics are computed for each feature in each model processed.
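To illustrate what a two-sample Kolmogorov-Smirnov comparison measures for a single feature, here is a minimal Python sketch. It is not BDVAL's implementation: the function names `ks_statistic` and `ks_p_value` are hypothetical, the p-value uses the plain asymptotic Kolmogorov series (an assumption; BDVAL may compute it differently), and the ratio of rank statistics is not covered.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return D, the largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    sorted_a, sorted_b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in sorted_a + sorted_b:
        cdf_a = bisect_right(sorted_a, x) / len(sorted_a)
        cdf_b = bisect_right(sorted_b, x) / len(sorted_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n, m):
    """Asymptotic p-value for statistic d with sample sizes n and m.

    Assumption: uses the limiting Kolmogorov distribution, which is
    only approximate for small samples.
    """
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is ~1 anyway.
        return 1.0
    total = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * total))
```

A small p-value for a feature suggests that its signal is distributed differently in the two sample sets, for instance `ks_p_value(ks_statistic(training, validation), len(training), len(validation))` for two lists of signal values.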

Mode Parameters

The following options are available in this mode:

| Flag | Arguments | Required | Description |
|------|-----------|----------|-------------|
| --maqcii-properties-file | filename | yes | The maqcii properties file, such as 'maqcii-c.properties'. |
| --model-conditions-file | filename | yes | The model conditions file, such as 'model-conditions.txt'. |
| --models-dir | directory | yes | The directory containing the models (they may be within sub-directories). |
| --models-list | comma-separated list | no | Default is 'all'. The models to process (or 'all' to process all models). Comma-separated, such as 'DUDTR,YTNJM'. |
| --model-exclude-list | comma-separated list | no | Default is 'none'. The models to NOT process (or 'none' to process all models). Comma-separated, such as 'DUDTR,YTNJM'. |
| --signal-quality-calc-class | class name | yes | Fully qualified class name of an AbstractSignalQualityCalculator class. |
| --eval-dataset-root | directory | no | Default is '-'. The eval-dataset-root directory, or '-' to use the dataset-root directory specified in the model-conditions file. |
| --properties-training-label | label | no | Default is 'training'. The label used to denote training values in the properties file. |
| --properties-validation-label | label | no | Default is 'validation'. The label used to denote validation values in the properties file. |
| --extended-output | true or false | no | Default is false. If true, extra output will be included. |
| --max-num-classes | number of classes | no | Default is 2. The maximum number of classes (for the output file header). |
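As a rough illustration of how these options fit together, the following Python sketch assembles an argument list for this mode, rejecting calls that omit a required flag and filling in the documented defaults. The helper `build_mode_args` is hypothetical and not part of BDVAL. How the mode itself is launched (the executable and the way the mode is selected) is not described on this page, so it is deliberately left out.

```python
# Flags marked as required in the options for this mode.
REQUIRED = (
    "maqcii-properties-file",
    "model-conditions-file",
    "models-dir",
    "signal-quality-calc-class",
)

# Optional flags with the defaults documented for this mode.
DEFAULTS = {
    "models-list": "all",
    "model-exclude-list": "none",
    "eval-dataset-root": "-",
    "properties-training-label": "training",
    "properties-validation-label": "validation",
    "extended-output": "false",
    "max-num-classes": "2",
}


def build_mode_args(options):
    """Return a flat ['--flag', 'value', ...] list for this mode.

    Hypothetical helper: it only validates and orders the flags; it
    does not run anything.
    """
    missing = [name for name in REQUIRED if name not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    unknown = [
        name for name in options
        if name not in REQUIRED and name not in DEFAULTS
    ]
    if unknown:
        raise ValueError("unknown options: " + ", ".join(unknown))
    merged = dict(DEFAULTS)
    merged.update(options)
    args = []
    for name in REQUIRED + tuple(DEFAULTS):
        args += ["--" + name, str(merged[name])]
    return args


example_args = build_mode_args({
    "maqcii-properties-file": "maqcii-c.properties",
    "model-conditions-file": "model-conditions.txt",
    "models-dir": "models",  # hypothetical directory name
    # Hypothetical class name, for illustration only.
    "signal-quality-calc-class": "org.example.MySignalQualityCalculator",
    "models-list": "DUDTR,YTNJM",
})
```

Passing the defaults explicitly is a design choice of this sketch: it makes the effective configuration visible in logs, at the cost of a longer argument list.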