AntiNex AI Utilities Docs¶
AntiNex Stack Status¶
AntiNex AI Utilities is part of the AntiNex stack:
Component | Build | Docs Link | Docs Build
--- | --- | --- | ---
REST API | | Docs |
Core Worker | | Docs |
Network Pipeline | | Docs |
AI Utils | | Docs |
Client | | Docs |
Table of Contents¶
These are the docs for the AntiNex AI Utilities repository.
Make Predictions¶
Large helper method for driving all AI-related tasks.
Handles running:
- Building Models
- Compiling Models
- Creating Datasets for Train, Test, and Predictions
- Fitting Models
- Evaluating Models
- Cross Validating Models
- Merging Predictions with Original Records
Here is the file on GitHub in case the automodule failed to process:
antinex_utils.make_predictions.build_regression_dnn(num_features, compile_data, label='', model_json=None, model_desc=None)[source]¶

Parameters:
- num_features – input_dim for the number of features in the data
- compile_data – dictionary of compile options
- label – log label for tracking this method
- model_json – keras model json to build the model
- model_desc – optional dictionary describing the model
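The compile_data dictionary mirrors the keyword arguments commonly passed to a Keras model's compile step. A minimal sketch of what such a request might carry; the exact key names and values here are assumptions for illustration, not confirmed by this doc:

```python
# Hypothetical compile options for build_regression_dnn.
# Key names mirror common keras Model.compile kwargs (an assumption,
# not a documented antinex_utils schema).
compile_data = {
    "loss": "mse",          # a typical regression loss
    "optimizer": "adam",
    "metrics": ["accuracy"],
}

# num_features would match the width of the training dataframe,
# e.g. one column per flattened packet field:
num_features = 68
```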
antinex_utils.make_predictions.build_classification_dnn(num_features, compile_data, label='', model_json=None, model_desc=None)[source]¶

Parameters:
- num_features – input_dim for the number of features in the data
- compile_data – dictionary of compile options
- label – log label for tracking this method
- model_json – keras model json to build the model
- model_desc – optional dictionary describing the model
antinex_utils.make_predictions.check_request(req)[source]¶

Parameters:
- req – dictionary to check values
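check_request validates the incoming request dictionary before any expensive model work starts. As an illustration only (the real keys inspected by antinex_utils are not listed in this doc), a validator of that shape might look like:

```python
def check_request_sketch(req):
    """Return an error string, or None if the request looks usable.

    The required keys below are hypothetical, chosen for illustration;
    the real check_request inspects its own set of values.
    """
    if not isinstance(req, dict):
        return "request must be a dictionary"
    for key in ("label", "dataset"):
        if key not in req:
            return f"missing required key: {key}"
    return None
```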
Convert Records to Scaler Dataset¶
Helper method for converting records into a scaled dataset. This means all data is bounded within a range like [-1, 1].
antinex_utils.build_scaler_dataset_from_records.build_scaler_dataset_from_records(record_list, label='build-scaled-dataset', min_feature=-1, max_feature=1, cast_to_type='float32')[source]¶

Parameters:
- record_list – list of json records to scale between min/max
- label – log label for tracking
- min_feature – min feature range for scale normalization
- max_feature – max feature range for scale normalization
- cast_to_type – cast all of the dataframe to this datatype
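Under the hood this kind of scaling maps each feature linearly into [min_feature, max_feature], which is what sklearn's MinMaxScaler does. A stdlib-only sketch of the math for a single column:

```python
def scale_column(values, min_feature=-1.0, max_feature=1.0):
    """Linearly rescale values into [min_feature, max_feature].

    Pure-Python sketch of min/max scaling; antinex_utils uses
    sklearn-style scalers over a full dataframe instead.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: every value maps to the lower bound.
        return [min_feature for _ in values]
    span = max_feature - min_feature
    return [min_feature + (v - lo) / (hi - lo) * span for v in values]
```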
Convert Records to Scaler Train and Test Datasets¶
Helper method for converting records into scaled datasets that are split using sklearn.model_selection.train_test_split. This means all training and test data is bounded within a range like [-1, 1].
antinex_utils.build_scaler_train_and_test_datasets.build_scaler_train_and_test_datasets(label, train_features, test_feature, df, test_size, seed, scaler_cast_to_type='float32', min_feature_range=-1, max_feature_range=1)[source]¶

Parameters:
- label – log label
- train_features – features to train
- test_feature – target feature name
- df – dataframe used to build the scalers and the train and test datasets
- test_size – percent of rows to hold out for testing
- min_feature_range – min scaler range
- max_feature_range – max scaler range
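The split mirrors sklearn.model_selection.train_test_split: hold out test_size of the rows, shuffling with a fixed seed so runs are reproducible. A stdlib-only sketch of that behavior:

```python
import random

def train_test_split_sketch(rows, test_size=0.2, seed=None):
    """Shuffle rows with the given seed and split off test_size of them.

    Simplified stand-in for sklearn.model_selection.train_test_split.
    Returns (train_rows, test_rows).
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]
```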
Build a Training Request¶
Helper for building a common training request.
antinex_utils.build_training_request.build_training_request(csv_file='/tmp/cleaned_attack_scans.csv', meta_file='/tmp/cleaned_metadata.json', predict_feature='label_value', ignore_features=['label_name', 'ip_src', 'ip_dst', 'eth_src', 'eth_dst'], seed=None, test_size=0.2, preproc_rules=None)[source]¶

Parameters:
- csv_file – csv file built with prepare_dataset.py
- meta_file – metadata file built with prepare_dataset.py
- predict_feature – feature (column) to predict
- ignore_features – features to remove from the csv before the split into test and train data
- seed – integer seed for reproducible splits
- test_size – percent of records to split into test vs train
- preproc_rules – hook for future preprocessing rules
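Before the split, the ignore_features are dropped from every record so identifying fields (MAC and IP addresses, label names) never leak into training. A sketch of that drop step, assuming records are plain dicts:

```python
# Default ignore list from the build_training_request signature above.
DEFAULT_IGNORE = ["label_name", "ip_src", "ip_dst", "eth_src", "eth_dst"]

def drop_ignored(records, ignore_features=DEFAULT_IGNORE):
    """Return copies of the records with ignored feature columns removed."""
    return [
        {k: v for k, v in rec.items() if k not in ignore_features}
        for rec in records
    ]
```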
Constant Values¶
SUCCESS = 0
FAILED = 1
ERR = 2
EX = 3
NOTRUN = 4
INVALID = 5
NOTDONE = 6
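These codes are returned in result dictionaries across the utilities. A small helper for turning a code back into a readable name, using the values above (the helper itself is illustrative, not part of antinex_utils):

```python
# Constant values from the docs above.
SUCCESS = 0
FAILED = 1
ERR = 2
EX = 3
NOTRUN = 4
INVALID = 5
NOTDONE = 6

STATUS_NAMES = {
    SUCCESS: "SUCCESS",
    FAILED: "FAILED",
    ERR: "ERR",
    EX: "EX",
    NOTRUN: "NOTRUN",
    INVALID: "INVALID",
    NOTDONE: "NOTDONE",
}

def status_name(code):
    """Human-readable name for a status code, or UNKNOWN."""
    return STATUS_NAMES.get(code, "UNKNOWN")
```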
Merge Inverse Datasets into Original Records¶
Helper method for merging predictions with the original rows.
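Merging works by pairing each prediction with the row it was made from, typically by positional index since scaling preserves row order. A sketch assuming records are dicts and predictions are scalars; the output key name is an assumption, not the field antinex_utils uses:

```python
def merge_predictions(original_rows, predictions, key="_dnn_predicted"):
    """Attach each prediction to its source row under the given key.

    Assumes predictions[i] corresponds to original_rows[i]; the
    key name is hypothetical, for illustration only.
    """
    merged = []
    for row, pred in zip(original_rows, predictions):
        out = dict(row)  # copy so the original records stay untouched
        out[key] = pred
        merged.append(out)
    return merged
```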
Preparing a New Dataset¶
Helper for preparing a new dataset.
antinex_utils.prepare_dataset_tools.find_all_headers(use_log_id=None, pipeline_files=[], label_rules=None)[source]¶

Parameters:
- use_log_id – label for debugging in logs
- pipeline_files – list of files to prep
- label_rules – dict of rules to apply
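Finding all headers amounts to taking the union of column names across every pipeline CSV, so the merged dataset ends up with one consistent schema. A stdlib-only sketch over in-memory CSV text:

```python
import csv
import io

def find_all_headers_sketch(csv_texts):
    """Union of header columns across CSV documents, in first-seen order.

    Simplified illustration; the real find_all_headers reads files
    from disk and applies label_rules.
    """
    headers = []
    seen = set()
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        for name in next(reader, []):
            if name not in seen:
                seen.add(name)
                headers.append(name)
    return headers
```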
antinex_utils.prepare_dataset_tools.build_csv(pipeline_files=[], fulldata_file=None, clean_file=None, post_proc_rules=None, label_rules=None, use_log_id=None, meta_suffix='metadata.json')[source]¶

Parameters:
- pipeline_files – list of files to process
- fulldata_file – output file for the non-edited merged data
- clean_file – cleaned csv file that should be ready for training
- post_proc_rules – apply these rules during post processing (clean)
- label_rules – apply labeling rules (classification only)
- use_log_id – label for tracking the job in the logs
- meta_suffix – suffix for the output metadata file
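For classification, label_rules map human-readable label names onto the integer label_value column that build_training_request predicts by default. A sketch of a rule set and its application; the key names in this dict are assumptions for illustration, not the documented antinex_utils schema:

```python
# Hypothetical label rules: parallel lists of names and integer values.
label_rules = {
    "labels": ["not_attack", "attack"],
    "label_values": [0, 1],
}

def apply_label_rules(label_name, rules):
    """Map a label name to its integer value, or None if unmapped."""
    try:
        idx = rules["labels"].index(label_name)
    except ValueError:
        return None
    return rules["label_values"][idx]
```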