What exactly is AI? There is no straightforward answer to this question, and no generally accepted definition. Indeed, the definition of AI has changed over time.
In this course, we’ll define AI as solutions that use machine learning.
Machine learning algorithms learn from given data without rules being programmed separately. This means they can find structures and connections in the data using their own internal rules. The algorithms can then use these rules to, for example, make sensible predictions or group data.
As an example, the following figure shows the operating principle of a construction waste sorting robot that uses machine vision.
In the picture above, the AI solution should use machine vision to identify the materials passing on the line and sort them into different baskets. To develop the solution, data is needed – in this case, images of materials in which the materials have already been identified. Based on these images, the machine learning algorithm learns to distinguish materials from each other. The algorithm that has been trained to identify materials is programmed to work as part of a software application. The software application coordinates the whole solution and moves a robotic arm that picks up the materials and places them in the right baskets.
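The division of labor between the trained model and the surrounding software application can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the basket numbers, the confidence threshold, and the `dummy_classifier` function (a stand-in for a real trained vision model) are all hypothetical and not part of the actual robot.

```python
from typing import Callable

# Hypothetical mapping from recognised material to basket number.
BASKETS = {"wood": 1, "metal": 2, "stone": 3}
REJECT_BASKET = 0  # for anything the model is not confident about


def choose_basket(image: list[float],
                  classify: Callable[[list[float]], tuple[str, float]],
                  min_confidence: float = 0.8) -> int:
    """Application logic: ask the model, then decide where the arm should go."""
    material, confidence = classify(image)
    if confidence < min_confidence or material not in BASKETS:
        return REJECT_BASKET
    return BASKETS[material]


def dummy_classifier(image: list[float]) -> tuple[str, float]:
    """Stand-in for a trained vision model: 'classifies' by average brightness."""
    brightness = sum(image) / len(image)
    if brightness > 0.7:
        return "metal", 0.95
    if brightness > 0.3:
        return "wood", 0.85
    return "stone", 0.60  # deliberately uncertain


# Three made-up "images", each reduced to a short list of pixel brightness values.
for img in ([0.9, 0.8, 0.9], [0.5, 0.4, 0.6], [0.1, 0.2, 0.1]):
    print(choose_basket(img, dummy_classifier))
```

The point of the sketch is that the model only answers the question "what material is this?", while the application decides what to do with the answer, including what happens when the model is unsure.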
What does AI consist of?
There are many types of AI solutions, but especially in the context of business use they consist of four main components:
1) A use case where the AI solution is needed
2) Data related to the use case
3) Machine learning algorithms
4) A software application that makes use of the machine learning results
Let's take a closer look at these components.
1) Use case
The design of AI solutions starts with the identification of a use case – a clearly defined problem that is to be solved with the help of AI. A good use case is clearly defined and has related data available.
A company can identify a particular problem by first identifying a set of business challenges, such as large variations in quality or productivity, and then analyzing the cause. If better prediction or automatic classification of data could help solve the problem, an AI solution is worth considering.
Finding suitable use cases is one of the biggest challenges for a company looking to use AI. Business solutions generally need to consider a wide variety of factors that can affect the applicability of the use cases, and AI solutions must be both commercially sensible and sufficiently reliable. These conditions can exclude many technically attainable applications.
Later in this course, we will explore in detail how to identify and define use cases in a business environment.
2) Data
Data is the raw material of AI solutions, and it is processed by machine learning algorithms. The quantity, quality, and structure of the data largely determine the success of the end result.
The data provided to machine learning algorithms should contain information about the phenomenon the solution is meant to model. For example, if the training data is pictures of cats and dogs, the solution won't learn how to identify pictures of horses or hippos. If you give such a solution an image of a horse, it will predict that the horse is a cat or a dog. Thus, the training data creates boundaries within which the AI solution is able to operate.
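The cat-and-dog boundary can be demonstrated with a very small classifier. The sketch below is an assumption-laden illustration: instead of images it uses two made-up numeric features (weight and shoulder height), and instead of a real image model it uses a simple nearest-centroid rule. What it shows is that a model trained on two labels can only ever answer with one of those two labels.

```python
import math

# Made-up training data: (weight in kg, shoulder height in cm) -> label.
training_data = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((20.0, 50.0), "dog"),
    ((30.0, 60.0), "dog"),
]


def train(examples):
    """Compute the average feature vector (centroid) of each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(training_data)
print(predict(model, (4.5, 24.0)))     # a typical cat
print(predict(model, (500.0, 160.0)))  # a horse: the answer is still a known label
```

The horse is labeled "dog" simply because a dog is the closest thing the model has ever seen. Nothing in the model can express "this is something else".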
Data quality is important for the end result. If the solution receives poor-quality training data, it won't produce the best results. Poor-quality data can lead to a situation where AI doesn't work as expected when it's used. For example, errors in the data could cause products to be priced incorrectly, with unforeseen consequences.
On the other hand, it is good to remember that AI solutions don't require completely error-free training data – and such data is rarely available.
The amount of data is also important in the training of AI solutions. You may have heard the argument that a lot of data is needed and that the more there is, the better the resulting AI solutions can be.
In reality, the issue is not that clear cut. Imagine a situation where we want to make a prediction model of the need for maintenance of a particular device. If the training data being used doesn’t contain information related to the need for maintenance, adding more of this data won't improve the outcome.
How much data is actually needed? Unfortunately, there is no straightforward answer to this question. The amount of data required depends on several factors which are outside the scope of this course. But as a general rule, more data is better, provided that it contains information relevant to the problem.
The selection of training data is important because it affects the results of AI solutions. Bias in the chosen training data can be either conscious or unconscious, and unconscious bias in particular can lead to distorted results, among other issues.
Suppose that we want to build an AI model that can predict how much training a novice runner would need to do to be able to run a marathon in four hours. For the AI model, we collect data from a club for running enthusiasts.
However, this means the results would probably be biased, as those who actively run are unlikely to be representative of the entire population. Instead, the running enthusiasts are likely to be more talented runners than average. This means that the prediction model might work well for a novice runner who is talented, but for the average beginner the results would be too optimistic.
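The effect of collecting data only from the running club can be illustrated with a few invented numbers. In the sketch below, every value is made up for the example: each record is the weekly training time a person needed to reach a four-hour marathon, and club members are a subset of the whole group.

```python
from statistics import mean

# Invented data: weekly training hours each person needed to reach a
# four-hour marathon. Club members are a subset of the whole population.
population = [
    {"club_member": True, "hours_needed": 4.0},
    {"club_member": True, "hours_needed": 5.0},
    {"club_member": True, "hours_needed": 4.5},
    {"club_member": False, "hours_needed": 7.0},
    {"club_member": False, "hours_needed": 8.0},
    {"club_member": False, "hours_needed": 6.5},
    {"club_member": False, "hours_needed": 9.0},
    {"club_member": False, "hours_needed": 7.5},
]

club_sample = [p["hours_needed"] for p in population if p["club_member"]]
everyone = [p["hours_needed"] for p in population]

club_estimate = mean(club_sample)    # what a model trained on club data "sees"
population_average = mean(everyone)  # what an average beginner actually needs

print(f"Estimate from club data: {club_estimate} hours per week")
print(f"Average over everyone:   {population_average} hours per week")
```

With these numbers, the club-only estimate is noticeably lower than the average over everyone, which is exactly the overly optimistic prediction described above.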
The ethics of AI are closely related to this topic, as bias in the selection of training data can result in discrimination against different groups of people, for example in automated decision-making systems.
3) Machine learning
Machine learning is a subset of AI that has its roots in statistics. As discussed earlier, machine learning methods learn from given data without rules being programmed separately. The result of this learning is stored in a machine learning model.
Pre-trained machine learning models
Pre-trained machine learning models are often used, especially in AI solutions for unstructured data like images or text. Such models are available for tasks like identifying animals in images. The advantage of pre-trained models is that you don’t necessarily need your own data to build a machine learning model. On the other hand, it’s often possible to train these models further with your own data.
We won’t delve into different machine learning methods in more detail in this course, but it’s useful to know that there are many of them, including neural networks, logistic regression, gradient boosting, and decision trees. It should also be noted that many machine learning methods are associated with the term “black box”, which means that we can’t see the exact process by which a machine learning model arrived at a particular outcome.
Elements of AI
If you’re interested in learning more about machine learning and AI in general, check out the Elements of AI course.
Categories of machine learning
Machine learning is divided into three main categories: supervised learning, unsupervised learning, and reinforcement learning. These are worth remembering, as they all have different uses.
A) Supervised learning is basically prediction. The data has explanatory variables (inputs) and a response (label). The function of machine learning algorithms is to use the values of inputs to predict labels.
For example, in the figure below, the machine learning algorithm would use the basic data of the apartment (inputs) to create a model that can predict the price of the apartment (label).
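To make the idea of inputs and labels concrete, here is a minimal sketch of supervised learning with a single input. It assumes invented training data (apartment size as the only input, price as the label) and uses ordinary least squares to fit a straight line. A real solution would use many more inputs and most likely a ready-made machine learning library.

```python
# Invented training data: apartment size in square metres -> price in euros.
sizes = [30.0, 45.0, 60.0, 75.0, 90.0]
prices = [120_000.0, 165_000.0, 210_000.0, 255_000.0, 300_000.0]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# "Training": the algorithm derives the rule from the data.
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Use the learned rule to predict the label for a new input."""
    return slope * size + intercept


print(round(predict_price(50.0)))  # prints 180000
```

Nobody wrote the rule "3,000 euros per square metre plus 30,000 euros" into the program. The algorithm derived it from the examples, and the fitted slope and intercept together form the model.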
In image processing solutions, the inputs are images and the labels are identified objects in them.
Most business AI solutions are based on supervised learning. However, this is often a challenge as the values of the labels aren't always known. For example, imagine that a restaurant’s management would like to predict the amount of food waste (label), but no information on the amount of food that ends up as waste is included in their data. In this situation, the restaurant would be unable to make an accurate prediction.
If the values of the labels aren't known, it may be possible to annotate the data. An example of this could be a company with lots of pictures of products on a production line. The pictures show that some products have defects on the surface material. In this context, annotating would mean that an expert marks the faulty points in the image, for example by drawing a rectangle around them, thus creating the value of the label “faulty”. If there is no such marking in the image, the value of the label is “error-free”.
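The annotation step can be sketched as a small data structure. The class names, field names, and file names below are hypothetical and chosen only for illustration. The one rule taken from the text is that an image with at least one marked rectangle gets the label “faulty”, and an image without markings gets the label “error-free”.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A rectangle drawn by the expert around a defect, in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class AnnotatedImage:
    filename: str
    defect_boxes: list[Box] = field(default_factory=list)

    @property
    def label(self) -> str:
        # Any marked rectangle makes the product "faulty"; none means "error-free".
        return "faulty" if self.defect_boxes else "error-free"


images = [
    AnnotatedImage("product_001.png", [Box(40, 12, 30, 18)]),
    AnnotatedImage("product_002.png"),  # the expert marked nothing
]
labels = {image.filename: image.label for image in images}
print(labels)
```

Once every image has a label like this, the pictures and their labels together form training data for supervised learning.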
Supervised learning is further divided into two subcategories: classification and regression problems.
In classification problems, we predict a label that is a class variable. Here are two examples:
A two-class classification could be based on information about whether the customer bought the apartment after the showing. The answer options are “yes” and “no,” so they form predictable binary classes.
Multi-class classification means that there are more than two predictable classes. An example is the identification of objects in an image, in which case each type of identified object has its own class. If the two-class example above had had a third option in addition to the “yes” and “no” answers, it would have been a multi-class classification.
In regression problems, we predict a label that falls on a continuous scale, such as sales, temperature, or price.
The following figure illustrates the difference between regression and classification problems.
It’s also possible to turn regression problems into classification problems, which in certain situations can help create a better solution. For example, a sales forecast could be made as a multi-class classification instead of a single numerical figure. The classes could be that sales would increase by “less than 5%”, “5-10%”, or “more than 10%”. Instead of a numeric sales forecast, we would then get a forecast that gives class-specific probabilities for sales.
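Turning a regression label into a class label is mostly a matter of relabeling the data. The sketch below uses the three classes from the sales example and a list of invented past growth figures. As an assumption for illustration, it also computes how often each class occurred in the past. A trained classifier would go further and give class probabilities that depend on the inputs.

```python
from collections import Counter

CLASSES = ["less than 5%", "5-10%", "more than 10%"]


def growth_class(growth_percent: float) -> str:
    """Map a numeric sales growth figure to one of the three classes."""
    if growth_percent < 5:
        return CLASSES[0]
    if growth_percent <= 10:
        return CLASSES[1]
    return CLASSES[2]


# Invented history of sales growth, in percent.
past_growth = [2.0, 7.5, 12.0, 4.0, 9.0, 6.0, 3.5, 11.0]

# The numeric regression label becomes a class label.
labels = [growth_class(g) for g in past_growth]

# Share of each class in the history: a crude baseline for class probabilities.
counts = Counter(labels)
shares = {name: counts[name] / len(labels) for name in CLASSES}
print(shares)
```

Where exactly the class boundaries are drawn (for example, whether exactly 10% belongs to the middle class) is a design decision that should follow from how the forecast will be used.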
B) In unsupervised learning, data (inputs) are used, but there is no label variable to predict. Typically, the goal of unsupervised learning is to find useful structures in the data.
Perhaps the best known example of this is customer segmentation, where clustering methods are used to search the data for similar customer groups. Customers belonging to the same segment are grouped (clustered) closer to each other, based on their purchasing behavior, than customers in different segments.
The unsupervised learning solutions used in business are mainly related to clustering methods, meaning we want to find similar sets within the data.
The image above shows a simple example of a clustering method. Here, the algorithm has found three similar sets that differ from each other. Human help is needed to interpret the results – for example, the interpretation of “Set A” could be that it includes expensive apartments that also have a short sale period.
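As a rough sketch of how a clustering method finds such sets, the code below runs a simplified version of the k-means algorithm on invented apartment data (price in thousands of euros and days on the market). K-means is used here only as one common example of a clustering method, and the simplification of starting from the first three data points is an assumption made to keep the example deterministic.

```python
import math

# Invented apartments: (price in thousands of euros, days on the market).
apartments = [
    (450, 15), (150, 90), (250, 40),
    (470, 20), (160, 100), (260, 45),
    (440, 10), (140, 95), (240, 50),
]


def k_means(points, k, iterations=10):
    """Simplified k-means: the first k points serve as starting centroids."""
    centroids = list(points[:k])
    assignment = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the average of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


assignment, centroids = k_means(apartments, k=3)
print(assignment)  # prints [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(centroids)
```

The algorithm only outputs set numbers and center points. Seeing that set 0 contains expensive apartments that sell quickly is the interpretation a person has to add.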
There are other types of unsupervised learning, but we won't cover them in this course.
C) In reinforcement learning, AI makes decisions and then learns to make better future decisions based on the feedback from those decisions. This isn't always easy to accomplish, as it can take a long time to gather feedback. Typical examples of reinforcement learning are online ad targeting, self-driving vehicles, and the most advanced chess programs that learn more and more from the games they play over time.
There are still not many business reinforcement learning solutions, as using this method is challenging for many applications. Although we won't explore reinforcement learning further in this course, it’s still worth keeping in mind as solutions based on it are quite likely to become more common in the future.
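The decide-then-learn-from-feedback loop can be sketched with one of the simplest reinforcement learning setups, often called a multi-armed bandit. In the example below, a program chooses which of three ads to show and learns from simulated clicks. Everything specific is an assumption for illustration: the click rates are invented and deliberately far apart, the clicks are simulated with a random number generator, and the "epsilon-greedy" rule is just one simple strategy among many.

```python
import random


def run_epsilon_greedy(click_rates, rounds=2000, epsilon=0.1, seed=0):
    """Show ads one at a time and learn their click rates from feedback."""
    rng = random.Random(seed)
    n_ads = len(click_rates)
    shown = [0] * n_ads       # how many times each ad has been shown
    estimate = [0.0] * n_ads  # learned click rate of each ad
    for step in range(rounds):
        if step < n_ads:
            ad = step  # show every ad once to get started
        elif rng.random() < epsilon:
            ad = rng.randrange(n_ads)  # explore: try a random ad
        else:
            ad = max(range(n_ads), key=lambda i: estimate[i])  # exploit the best so far
        # Simulated feedback: 1.0 if the user clicked, 0.0 otherwise.
        reward = 1.0 if rng.random() < click_rates[ad] else 0.0
        shown[ad] += 1
        # Update the running average click rate for the chosen ad.
        estimate[ad] += (reward - estimate[ad]) / shown[ad]
    return shown, estimate


# Invented true click rates, unknown to the learning program.
shown, estimate = run_epsilon_greedy([0.1, 0.5, 0.9])
best_ad = max(range(len(estimate)), key=lambda i: estimate[i])
print(shown, best_ad)
```

The program is never told which ad is best. It has to spend some of its decisions on trying out alternatives, which is one reason why gathering enough feedback can take a long time in real applications.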