Machine Learning Units
The following types of Machine Learning-specific units are available.
DataFrame input/output units are used to define the training data, to select the target properties to be predicted as output of the Machine Learning computation, and to select the feature properties used as input. The data is returned in the Pandas DataFrame format [1].
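As a rough illustration of the feature/target split, the sketch below builds a small Pandas DataFrame and separates it into input and output columns. The column names and values are made up for this example and do not reflect the platform's actual data or unit configuration.

```python
import pandas as pd

# Hypothetical training data; names and values are illustrative only.
data = pd.DataFrame(
    {
        "feature_a": [1.0, 2.0, 3.0, 4.0],
        "feature_b": [10.0, 20.0, 30.0, 40.0],
        "target": [0.5, 1.0, 1.5, 2.0],
    }
)

feature_columns = ["feature_a", "feature_b"]  # properties used as input
target_columns = ["target"]                   # properties to be predicted

features = data[feature_columns]
targets = data[target_columns]

print(features.shape, targets.shape)
```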
Processing units are used for general data manipulation, for example cleaning missing data or removing duplicates from the data.
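A minimal sketch of this kind of cleaning, assuming the data is held in a Pandas DataFrame, might look as follows. The table `raw` is a hypothetical example, and dropping incomplete rows is only one possible way to handle missing data.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with one missing value and one repeated row.
raw = pd.DataFrame(
    {
        "feature_a": [1.0, 2.0, 2.0, np.nan, 4.0],
        "feature_b": [10.0, 20.0, 20.0, 30.0, 40.0],
    }
)

cleaned = (
    raw.dropna()              # drop rows that contain missing values
    .drop_duplicates()        # drop repeated rows, keeping the first one
    .reset_index(drop=True)   # renumber the remaining rows from 0
)

print(cleaned)
```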
Data transformation units might, for example, scale or reduce the input data. Scaling standardizes the range of the independent variables (features), for instance by normalizing the data.
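To make the idea of scaling concrete, here is a sketch of min-max normalization, which maps each feature linearly onto the range [0, 1]. The helper `min_max_scale` and the sample values are assumptions for illustration; the scaler actually applied by a unit may differ.

```python
import pandas as pd

def min_max_scale(frame: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column linearly so its minimum is 0 and its maximum is 1.

    A constant column has zero range and would produce NaN values here.
    """
    column_min = frame.min()
    column_range = frame.max() - column_min
    return (frame - column_min) / column_range

# Hypothetical features on very different scales.
features = pd.DataFrame(
    {
        "feature_a": [0.0, 5.0, 10.0],
        "feature_b": [100.0, 150.0, 300.0],
    }
)

scaled = min_max_scale(features)
print(scaled)
```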
Feature selection units set the number of feature properties used for model training. If this number is 0, all available features are used. The feature selection algorithm can also be chosen (e.g. by regression).
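The behaviour of the feature count, including the special value 0, can be sketched as below. The function `select_features` is hypothetical, and ranking by absolute Pearson correlation with the target is a simple stand-in chosen for this example, not necessarily the algorithm a unit uses.

```python
import pandas as pd

def select_features(features: pd.DataFrame, target: pd.Series, n_features: int = 0) -> list:
    """Return the names of the n_features columns most correlated with target.

    n_features == 0 means "use all available features".
    """
    if n_features == 0:
        return list(features.columns)
    # Score each column by the absolute value of its correlation with the target.
    scores = features.corrwith(target).abs()
    return list(scores.sort_values(ascending=False).index[:n_features])

# Hypothetical data: feature_a and feature_c track the target, feature_b does not.
features = pd.DataFrame(
    {
        "feature_a": [1.0, 2.0, 3.0, 4.0],
        "feature_b": [4.0, 1.0, 3.0, 2.0],
        "feature_c": [1.0, 2.0, 3.0, 5.0],
    }
)
target = pd.Series([2.0, 4.0, 6.0, 8.0])

print(select_features(features, target, n_features=0))
print(select_features(features, target, n_features=2))
```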