BDVAL/Distribution-difference-by-feature
Compare the distribution of each feature used in a specific set of biomarker models. Distribution differences in feature signal are quantified between two sample sets (i.e., the training set vs. the validation set). A Kolmogorov-Smirnov P-value and a ratio of rank statistics are evaluated for each feature in each model processed.
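To illustrate the kind of comparison involved, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test on one feature's signal values. It is not BDVAL's implementation: the function names `ks_statistic` and `ks_p_value` are hypothetical, the P-value uses the standard asymptotic Kolmogorov series (an approximation that is rough for small samples), and the ratio of rank statistics is not shown.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_p_value(d, n, m):
    """Asymptotic P-value for statistic d with sample sizes n and m."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the true value is ~1.
        return 1.0
    total = 0.0
    for k in range(1, 101):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical signal values for one feature in the two sample sets.
training = [1.0, 2.0, 3.0, 4.0]
validation = [5.0, 6.0, 7.0, 8.0]
d = ks_statistic(training, validation)
p = ks_p_value(d, len(training), len(validation))
```

A small P-value suggests that the feature's signal is distributed differently in the two sample sets.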
Mode Parameters
The following options are available in this mode:
Flag | Arguments | Required | Description |
---|---|---|---|
--maqcii-properties-file | filename | yes | The maqcii properties file such as 'maqcii-c.properties'. |
--model-conditions-file | filename | yes | The model-conditions-file such as 'model-conditions.txt'. |
--models-dir | directory | yes | The directory containing models (may be within sub-directories). |
--models-list | comma separated list | no | Default is 'all'. The models to process (or 'all' to process all models). Comma separated, such as 'DUDTR,YTNJM'. |
--model-exclude-list | comma separated list | no | Default is 'none'. The models to NOT process (or 'none' to process all models). Comma separated, such as 'DUDTR,YTNJM'. |
--signal-quality-calc-class | class name | yes | Fully qualified classname for an AbstractSignalQualityCalculator class. |
--eval-dataset-root | directory | no | Default is '-'. The eval-dataset-root directory or specify '-' to use the dataset-root directory specified in the model-conditions file. |
--properties-training-label | label | no | Default is 'training'. The label used to denote training values in the properties file. |
--properties-validation-label | label | no | Default is 'validation'. The label used to denote validation values in the properties file. |
--extended-output | true or false | no | Default is 'false'. If true, extra output will be included. |
--max-num-classes | number of classes | no | Default is 2. The maximum number of classes (for the output file header). |
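
As a rough sketch of how the flags in the table fit together, the helper below assembles an argument list, checks that the four required flags are present, and fills in the documented defaults for the rest. The helper `build_args`, the example file names, and the class name `org.example.MySignalQualityCalculator` are hypothetical placeholders; how the resulting arguments are passed to BDVAL is not covered here.

```python
# Flags marked "yes" in the Required column of the table.
REQUIRED = [
    "--maqcii-properties-file",
    "--model-conditions-file",
    "--models-dir",
    "--signal-quality-calc-class",
]

# Optional flags with the defaults stated in the table.
DEFAULTS = {
    "--models-list": "all",
    "--model-exclude-list": "none",
    "--eval-dataset-root": "-",
    "--properties-training-label": "training",
    "--properties-validation-label": "validation",
    "--extended-output": "false",
    "--max-num-classes": "2",
}


def build_args(options):
    """Return a flat [flag, value, ...] list; hypothetical helper."""
    missing = [flag for flag in REQUIRED if flag not in options]
    if missing:
        raise ValueError("missing required flags: " + ", ".join(missing))
    unknown = [f for f in options if f not in REQUIRED and f not in DEFAULTS]
    if unknown:
        raise ValueError("unknown flags: " + ", ".join(unknown))
    merged = {**DEFAULTS, **options}
    args = []
    for flag in REQUIRED + list(DEFAULTS):
        args += [flag, str(merged[flag])]
    return args


# Example values; the directory and class name are placeholders.
args = build_args({
    "--maqcii-properties-file": "maqcii-c.properties",
    "--model-conditions-file": "model-conditions.txt",
    "--models-dir": "models",
    "--signal-quality-calc-class": "org.example.MySignalQualityCalculator",
    "--models-list": "DUDTR,YTNJM",
})
```

Listing the defaults explicitly keeps a run reproducible even if a default changes later.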