Implementation on our platform¶
Only those components implemented on our platform to date are mentioned here; they can be inspected in the lists of available executables and flavors under the Unit Editor Interface. Users who wish for additional functionality to be added to our platform in the future should request it via a support request.
PythonML is based on the python executable, through which the implemented ML calculations can be performed.
The following flavors are available within the current ML implementation:

- pyml:setup_variables_packages: contains functions and configuration essential for all Python-ML workflows
- pyml:data_input:read_csv:pandas: reads in CSV data using Pandas
- pyml:pre_processing:standardization:sklearn: scales the data to a mean of 0 and a standard deviation of 1, as implemented in Scikit-Learn
- pyml:model:multilayer_perceptron:sklearn: a multilayer perceptron implemented in Scikit-Learn
- pyml:post_processing:parity_plot:matplotlib: generates a parity plot using Matplotlib
- pyml:custom: contains the basic skeleton needed for custom ML workflow units
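To illustrate what the standardization flavor computes, the sketch below rescales a small dataset to zero mean and unit standard deviation in plain Python. This is an illustrative stand-in only; the platform's flavor uses Scikit-Learn's implementation.

```python
import math

def standardize(column):
    """Rescale values to mean 0 and standard deviation 1, mirroring
    (in plain Python) what the standardization flavor does per feature."""
    mean = sum(column) / len(column)
    # Population standard deviation (divide by N)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]

scaled = standardize([1.0, 3.0])
print(scaled)  # [-1.0, 1.0]
```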
Custom Machine Learning Units¶
Custom machine learning units take advantage of our implementation's ability to mark Python objects as needed for subsequent predict runs. This can be accomplished by calling the save and load methods of the context manager provided in settings. Additionally, by taking advantage of settings.is_workflow_running_to_train and settings.is_workflow_running_to_predict, users can mark certain sections of code to only run during training or prediction.

Overall, this allows workflow units to have divergent behavior based on whether the workflow is in "train" or "predict" mode. For example, a unit can be configured to train (and then save) a model during a training job, and then to load the same model and perform a prediction during a predict job.
```python
import settings

# The context manager exists to facilitate
# saving and loading objects across Python units within a workflow.
# To load an object, simply do `context.load("name-of-the-saved-object")`
# To save an object, simply do `context.save("name-for-the-object", object_here)`
with settings.context as context:
    # Train
    if settings.is_workflow_running_to_train:
        descriptors = context.load("descriptors")
        target = context.load("target")

        # Do some transformations to the data here

        context.save("descriptors", descriptors)
        context.save("target", target)

    # Predict
    else:
        descriptors = context.load("descriptors")

        # Do some predictions or transformation to the data here
```
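The skeleton above relies on the platform-provided settings module. Outside the platform, the same save/load contract can be emulated with a small pickle-backed stand-in; the ToyContext class below is hypothetical and only illustrates the pattern, not the platform's actual implementation.

```python
import os
import pickle
import tempfile

class ToyContext:
    """Hypothetical stand-in for the platform's context manager:
    persists named objects to disk so that later units (or a
    subsequent predict run) can reload them by name."""

    def __init__(self, directory):
        self.directory = directory

    def save(self, name, obj):
        with open(os.path.join(self.directory, name + ".pkl"), "wb") as f:
            pickle.dump(obj, f)

    def load(self, name):
        with open(os.path.join(self.directory, name + ".pkl"), "rb") as f:
            return pickle.load(f)

with tempfile.TemporaryDirectory() as d:
    context = ToyContext(d)
    context.save("descriptors", [[1.0], [2.0]])
    print(context.load("descriptors"))  # [[1.0], [2.0]]
```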
PythonML workflows contain two subworkflows. The first, called "Set Up the Job," performs actions such as copying data and setting environment variables necessary for the job to run. The second, called "Machine Learning," contains the actual machine learning units.
Subworkflow: Set Up the Job¶
This subworkflow facilitates setting up the PythonML job. Currently, the only thing users need to edit in this subworkflow is the names of the data files to be copied in for training or prediction purposes. Reconfiguring this subworkflow to set up a predict job is handled automatically when the predict workflow is generated.
Subworkflow: Machine Learning¶
This subworkflow is where a user's requested machine learning units reside. It is generally the one users are expected to modify in order to add or remove machine learning workflow units.
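Putting the pieces together, a custom unit in this subworkflow can branch on the training flag as described above: fit a model during the train run, persist it, and reload it during the predict run. The sketch below uses a toy one-variable least-squares fit, and FakeSettings and run_unit are hypothetical in-memory stand-ins for the platform's settings module and units.

```python
class FakeSettings:
    """Hypothetical stand-in for the platform's settings module."""
    is_workflow_running_to_train = True
    store = {}  # plays the role of the context manager's storage

def run_unit(settings, descriptors=None, target=None):
    if settings.is_workflow_running_to_train:
        # Train: fit slope/intercept by least squares, then "save" the model
        n = len(descriptors)
        mx = sum(descriptors) / n
        my = sum(target) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(descriptors, target))
                 / sum((x - mx) ** 2 for x in descriptors))
        settings.store["model"] = (slope, my - slope * mx)
    else:
        # Predict: "load" the saved model and apply it to new descriptors
        slope, intercept = settings.store["model"]
        return [slope * x + intercept for x in descriptors]

s = FakeSettings()
run_unit(s, descriptors=[1.0, 2.0, 3.0], target=[2.0, 4.0, 6.0])  # train run
s.is_workflow_running_to_train = False
print(run_unit(s, descriptors=[4.0]))  # predict run -> [8.0]
```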