Automated machine learning (AutoML) automates and eliminates the manual steps required to go from a data set to a predictive model. AutoML also lowers the level of expertise needed to build accurate models, so you can use it whether you are an expert or have limited machine learning experience. By automating repetitive tasks, AutoML streamlines complex phases in the machine learning workflow.
You can use MATLAB with AutoML to support many workflows, such as feature extraction and selection, and model selection and tuning.
Feature extraction reduces the high dimensionality and variability present in the raw data and identifies variables that capture the salient and distinctive parts of the input signal. The process of feature engineering typically progresses from generating initial features from the raw data to selecting a small subset of the most suitable features. But feature engineering is an iterative process, and other methods such as feature transformation and dimensionality reduction can play a role.
Many approaches are available for generating features from raw data, and the right choice depends on the type of data.
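As an illustration, the following sketch generates a few simple statistical features from hypothetical raw signals and then applies principal component analysis (PCA) as a dimensionality reduction step. The variable name signals and the choice of summary statistics are assumptions made for illustration, not a prescribed recipe.

% A minimal sketch, assuming raw sensor signals stored as rows of the
% hypothetical matrix "signals". Each signal is summarized by a few
% statistics, then PCA reduces the resulting feature set.
numObs = size(signals, 1);
rawFeatures = zeros(numObs, 4);
for i = 1:numObs
    s = signals(i, :);
    rawFeatures(i, :) = [mean(s), std(s), skewness(s), kurtosis(s)];
end

% Keep enough principal components to explain 95% of the variance.
[~, score, ~, ~, explained] = pca(rawFeatures);
numComponents = find(cumsum(explained) >= 95, 1);
reducedFeatures = score(:, 1:numComponents);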
Feature selection identifies a subset of features that preserves predictive power while yielding a smaller, simpler model. Various methods for automated feature selection are available, including ranking features by their predictive power and learning feature importance along with the model parameters. Other feature selection methods iteratively determine a set of features that optimizes model performance.
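One possible realization of ranking-based feature selection in MATLAB is sketched below, assuming a predictor table and a response vector with the hypothetical names predictors and labels; the cutoff of ten features is an arbitrary assumption that would normally be validated against model performance.

% Rank features by predictive power using the minimum redundancy
% maximum relevance (MRMR) criterion, then keep the top-ranked ones.
[idx, scores] = fscmrmr(predictors, labels);
topFeatures = predictors(:, idx(1:10));   % cutoff chosen for illustration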
At the core of developing a machine learning model is identifying which of the many available model types performs best for the task at hand, and then tuning its hyperparameters to optimize performance. AutoML can optimize both the model type and its associated hyperparameters in a single step. Efficient implementations of one-step model optimization apply meta-learning to narrow the search to a subset of candidate models based on characteristics of the features, and then tune the hyperparameters of each candidate efficiently with Bayesian optimization rather than the computationally more intensive grid and random searches.
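In MATLAB, one way to perform this one-step optimization is the fitcauto function, sketched below with the same hypothetical predictors and labels; the option values shown are illustrative assumptions.

% Select a model type and tune its hyperparameters in a single step.
% fitcauto narrows the search to suitable learners and tunes them with
% Bayesian optimization by default.
mdl = fitcauto(predictors, labels, ...
    "Learners", "auto", ...
    "HyperparameterOptimizationOptions", struct("MaxObjectiveEvaluations", 50));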
If promising models are identified by other means (e.g., trial and error), their hyperparameters can be optimized individually using grid search, random search, or the Bayesian optimization mentioned above.
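As a sketch of tuning a single model chosen by hand, the example below optimizes the hyperparameters of a support vector machine; the Optimizer field can be set to "bayesopt", "gridsearch", or "randomsearch". A binary classification problem and the same hypothetical predictors and labels are assumed.

% Tune only the hyperparameters of a manually chosen model type.
svmMdl = fitcsvm(predictors, labels, ...
    "OptimizeHyperparameters", "auto", ...
    "HyperparameterOptimizationOptions", ...
        struct("Optimizer", "bayesopt", "MaxObjectiveEvaluations", 30));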
Once you have identified a well-performing model, you can deploy your optimized model without additional coding by applying automated code generation or by integrating it into a simulation environment such as Simulink®.
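A minimal sketch of the code generation path, assuming the model trained above and MATLAB Coder, is shown below; the entry-point function name, file name, and input size are illustrative assumptions.

% Save the trained model in a format compatible with code generation.
saveLearnerForCoder(mdl, "trainedModel");

% Entry-point function (saved as predictLabel.m) that loads the saved
% model and predicts a label for one observation:
%   function label = predictLabel(x) %#codegen
%       mdlForCoder = loadLearnerForCoder("trainedModel");
%       label = predict(mdlForCoder, x);
%   end

% Generate C code for the entry-point function; the example input size
% must match the number of predictors used during training.
codegen predictLabel -args {zeros(1, 10)}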