Editor's note: The following is a guest article from Steven Kursh, president of Software Analysis Group, and Art Schnure, senior consultant at Software Analysis Group.
To understand the skill sets required to create AI and machine learning (ML) models, it helps to understand the model creation process (the gradual learning done by ML software) and the challenges of producing a model that meets predefined success criteria.
ML software uses data to train a model; the model is the AI "product," and it can be reused over time as long as it receives regular updates of input data. ML software has four basic learning types:
- Supervised: The algorithm learns from data that carries the correct answers in the form of labels. The classes or values to be predicted are known and well defined for the algorithm from the very beginning.
- Unsupervised: Unlike supervised methods, the algorithm is given no correct answers at all; it is left to group similar data points together and make sense of them on its own.
- Semi-supervised: A hybrid of supervised and unsupervised learning.
- Reinforcement: The algorithm receives a reward for every correct prediction, driving accuracy higher over time.
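As a quick illustration of the difference between the first two learning types, here is a minimal sketch in Python. It assumes scikit-learn and its bundled iris data set, neither of which is mentioned in the article; the same contrast applies to any labeled versus unlabeled data.

```python
# A minimal sketch contrasting supervised and unsupervised learning
# with scikit-learn on a toy data set (an assumption for illustration).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)   # X: measurements, y: species labels

# Supervised: the algorithm sees the correct answers (labels) up front.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: no labels are provided; the algorithm groups similar rows itself.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print("cluster assignments:", km.labels_[:10])
```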
Data science expertise is needed to determine the best statistical algorithms to use in the ML software to fit a particular data set.
Among the many statistical algorithms, the more popular ones include:
- Naive Bayes, used for sentiment analysis, spam detection and recommendations.
- Decision Tree, used for outcome predictions.
- Random Forest, which merges multiple decision trees to improve predictions.
- Logistic Regression, used for binary classifications (A or B).
- Linear Regression, used for predicting continuous numerical values.
- AdaBoost, Gaussian Mixture, Recommender and K-Means Clustering, used for tasks such as organizing data into groups, as in market segmentation.
There are many other algorithms to consider. The choice will depend on the business use case.
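To make the idea of matching an algorithm to a business use case concrete, here is a hypothetical mapping in Python using scikit-learn class names; the pairings simplify the list above and are illustrative only.

```python
# Hypothetical mapping of business use cases to algorithm families.
# The class names are scikit-learn's; the mapping itself is a simplification.
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.cluster import KMeans

USE_CASE_TO_ALGORITHM = {
    "spam detection / sentiment": MultinomialNB,
    "outcome prediction": DecisionTreeClassifier,
    "ensemble of decision trees": RandomForestClassifier,
    "binary classification (A or B)": LogisticRegression,
    "predicting a numeric value": LinearRegression,
    "boosting weaker models": AdaBoostClassifier,
    "market segmentation": KMeans,
}

def pick_algorithm(use_case: str):
    """Return an unfitted estimator class for a named use case."""
    return USE_CASE_TO_ALGORITHM[use_case]

print(pick_algorithm("binary classification (A or B)"))  # LogisticRegression
```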
Training AI and ML models for use
There are three distinct learning (also known as training) stages for machine learning: training, validation and testing. Before starting, it's necessary to ensure the data is well organized and clean. Though the concept is simple, getting data into that state can be a time-consuming, detail-oriented process that often requires manual work.
The goal is data that is free of duplicates, typos and gaps. After cleansing, the data is divided randomly into three sets, one for each of the three training stages. The random division helps guard against selection bias.
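Here is a minimal sketch of the cleansing and random three-way split, assuming pandas and scikit-learn and a hypothetical customers.csv input file:

```python
# Clean a data set and split it randomly into training, validation and test sets.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("customers.csv")        # hypothetical input file
df = df.drop_duplicates().dropna()       # remove duplicate and incomplete rows

# Carve off 20% for testing, then split the rest 75/25 into training/validation.
train_val, test = train_test_split(df, test_size=0.20, random_state=42)
train, validation = train_test_split(train_val, test_size=0.25, random_state=42)

print(len(train), len(validation), len(test))  # roughly 60/20/20 of the rows
```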
Here are a few definitions relevant to model creation:
- Parameter. Model parameters are values learned automatically by the ML software from the AI input data as training progresses, although a user can manually change a parameter value during the training process. Examples are the maximum number of passes to be made during a session and the maximum model size, in bytes, allowed for the training data.
- Hyperparameter. Hyperparameter values are external to the ML software: they are set beforehand by a data scientist rather than derived from the AI data, and they can be tuned as the training process is repeated. Examples of hyperparameters are the number of clusters to be returned by a clustering algorithm and the number of layers in a neural network.
- Variable. The particular AI data input fields chosen for consideration by the ML software, with the option of adding further variables as training progresses. Examples of variables are age, height and weight.
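The following short sketch ties the three definitions together, using scikit-learn's K-Means as an assumed example: the variables are the input columns, the hyperparameter is set before training, and the parameters are learned from the data.

```python
# Variables, hyperparameters and learned parameters in one small example.
import numpy as np
from sklearn.cluster import KMeans

# Variables: the input fields the model considers (age, height, weight).
X = np.array([[25, 170, 68],
              [42, 180, 90],
              [31, 165, 60],
              [55, 175, 85]], dtype=float)

# Hyperparameter: chosen by the data scientist before training starts.
model = KMeans(n_clusters=2, n_init=10)

# Parameters: the cluster centers are learned automatically from the data.
model.fit(X)
print("learned parameters (cluster centers):\n", model.cluster_centers_)
```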
Before starting training, the first stage, it's important to have labels added to the data so the ML software can draw vital clues from the data to help it learn; unsupervised learning does not need the added labels. The ML software's default parameter values can be used to get started, or individual parameters can be changed.
Testing models for accuracy
When the training stage meets the success criteria, it's on to validation. The first pass uses a new set of data. If the results are good, proceed to the final stage, testing.
If not, it's useful to let the ML software make additional passes through the data, continuing until it finds no new patterns or reaches the maximum number of passes. As training advances, the parameters are modified automatically by the ML software or manually by whoever is managing it.
The testing stage is the "final exam" against a new set of data — but this time lacking the "helper" data labels (for supervised learning only). If the software passes the success criteria test, it's a working model. If not, it's back to training. As before, the team can manually modify parameters or let the ML software automatically modify parameters as training progresses.
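Here is a minimal sketch of that final exam, assuming scikit-learn, a toy data set and a hypothetical success criterion of 90% accuracy:

```python
# Evaluate a trained model on held-out test data against a predefined success criterion.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

SUCCESS_CRITERION = 0.90  # hypothetical predefined success criterion

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))  # labels withheld at predict time

if test_accuracy >= SUCCESS_CRITERION:
    print(f"Working model: test accuracy {test_accuracy:.2f}")
else:
    print(f"Back to training: test accuracy {test_accuracy:.2f}")
```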
Machine learning is, at its core, repeated exposure of the ML software to the data, with parameters changed iteratively by the software (and potentially by humans) so the model gets smarter after each pass. The software keeps making passes over the data until no new patterns are detected or the maximum number of passes is reached, at which point it stops.
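One way to picture those repeated passes is the sketch below. It assumes scikit-learn's SGDClassifier and a simple stop-when-no-improvement rule; the maximum of 50 passes and the patience of five passes are illustrative choices, not values from the article.

```python
# Repeated passes over the data, stopping when validation accuracy stops
# improving or the maximum number of passes is reached.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = SGDClassifier(random_state=0)
classes = np.unique(y_train)

MAX_PASSES = 50
best_score, passes_without_improvement = 0.0, 0

for epoch in range(MAX_PASSES):
    model.partial_fit(X_train, y_train, classes=classes)  # one pass over the data
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, passes_without_improvement = score, 0
    else:
        passes_without_improvement += 1
    if passes_without_improvement >= 5:   # no new patterns: stop early
        break

print(f"stopped after {epoch + 1} passes, best validation accuracy {best_score:.2f}")
```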
AI model ongoing maintenance
Constant vigilance (monitoring) is the price of AI freedom. To determine how well an AI model is doing, an obvious tack is to monitor how closely the actual performance matches the AI prediction. If the AI predictions worsen, it's time to reenter the ML model training process to correct the model using up-to-date data.
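A minimal monitoring sketch, with a hypothetical retraining threshold and made-up outcomes, compares recent predictions against actual results:

```python
# Flag a deployed model for retraining when live accuracy degrades.
from sklearn.metrics import accuracy_score

RETRAIN_THRESHOLD = 0.85  # hypothetical acceptable accuracy

def needs_retraining(actual_outcomes, model_predictions) -> bool:
    """Return True when live accuracy has fallen below the threshold."""
    live_accuracy = accuracy_score(actual_outcomes, model_predictions)
    print(f"live accuracy over the monitoring window: {live_accuracy:.2f}")
    return live_accuracy < RETRAIN_THRESHOLD

# Example with made-up outcomes from the last monitoring window.
if needs_retraining([1, 0, 1, 1, 0, 1], [1, 1, 1, 0, 0, 1]):
    print("Accuracy has degraded: retrain the model on up-to-date data.")
```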
Keep in mind that input data can easily change over time, a phenomenon called data drift in the trade. Data drift can cause the AI model's accuracy to deteriorate, so early data drift warnings are important to stay ahead of problems. Tools such as Fiddler, Neptune and Azure ML can track data drift and find outlier data, supplying early warnings so data problems can be addressed with ML updates sooner rather than later.
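For the drift itself, one lightweight approach is to compare the distribution a feature had at training time with the distribution seen in production. The sketch below uses SciPy's two-sample Kolmogorov-Smirnov test with an assumed 0.05 significance level and synthetic age data; it is an illustration, not how the named tools work internally.

```python
# Detect drift in a single feature by comparing its training-time and
# production distributions with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_ages = rng.normal(loc=35, scale=8, size=1000)   # distribution at training time
live_ages = rng.normal(loc=45, scale=8, size=1000)       # distribution seen in production

statistic, p_value = ks_2samp(training_ages, live_ages)
if p_value < 0.05:
    print(f"Data drift detected (p={p_value:.4f}): schedule an ML update.")
else:
    print(f"No significant drift (p={p_value:.4f}).")
```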