What exactly is AI? There is no straightforward answer to this question, and no generally accepted definition. Indeed, the definition of AI has changed over time.
In this course, we will refer to the most general definition of AI: AI refers to solutions that enable machines, particularly computer systems, to exhibit intelligence, as opposed to the natural intelligence of living beings. Such AI-based solutions leverage algorithms of different kinds and complexities to exhibit intelligence. AI algorithms that learn from data using statistical methods are called machine learning algorithms.
Machine learning algorithms learn from given data without separately programmed rules. This means they are able to find structures and connections in the data using their own internal rules. The algorithms can then use these rules, for example, to make sensible predictions or to group data.
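As a minimal sketch of this idea, the following Python snippet (using the scikit-learn library, with invented toy data) lets an algorithm derive its own rules from example data rather than being given explicit if-then rules:

```python
# A minimal sketch: the algorithm learns its own rules from example data
# instead of being programmed with explicit if-then rules.
from sklearn.tree import DecisionTreeClassifier

# Toy training data: [size_m2, year_built] of apartments (invented numbers)
X = [[30, 1960], [45, 1975], [80, 2010], [95, 2018]]
y = ["cheap", "cheap", "expensive", "expensive"]  # known outcomes

model = DecisionTreeClassifier().fit(X, y)   # the model learns rules from the data
print(model.predict([[85, 2015]]))           # ['expensive'] -- a prediction for new data
```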
As an example, the following figure shows the operating principle of a construction waste sorting robot that uses machine vision.
In the picture above, the AI solution should use machine vision to identify the materials passing on the line and sort them into different baskets. To develop the solution, data is needed – in this case, images of materials in which the materials have already been identified. Based on these images, the machine learning algorithm learns to distinguish materials from each other. The algorithm that has been trained to identify materials is programmed to work as part of a software application. The software application coordinates the whole solution and moves a robotic arm that picks up the materials and places them in the right baskets.
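The flow described above could be sketched roughly as follows; the camera, model, and robot arm objects are hypothetical placeholders for real hardware interfaces and a trained machine vision model:

```python
# A simplified, hypothetical sketch of the sorting robot's control loop.
# `camera`, `model`, and `robot_arm` stand in for real hardware and a
# trained machine vision model; their interfaces are assumptions.

def sorting_loop(camera, model, robot_arm):
    while True:
        image = camera.capture()              # photograph the item on the line
        material = model.predict(image)       # e.g. "wood", "metal", "plastic"
        robot_arm.pick_and_place(material)    # drop the item into the right basket
```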
What does AI consist of?
There are many types of AI solutions, but especially in a business context they consist of four main components:
1) A use case where the AI solution is needed
2) Data related to the use case
3) Machine learning algorithms
4) A software application that makes use of the machine learning results
Let's take a closer look at these components.
1) Use case
The design of AI solutions starts with the identification of a use case – a clearly defined problem that is to be solved with the help of AI. A good use case is well defined, and related data is available.
A company can identify a particular problem by first identifying a set of business challenges, such as large variations in quality or productivity, and then analyzing the cause. If better prediction or automatic classification of data could help solve the problem, an AI solution is worth considering.
Finding suitable use cases is one of the biggest challenges for a company looking to use AI. Business solutions generally need to consider a wide variety of factors that can affect the applicability of the use cases, and AI solutions must be both commercially sensible and sufficiently reliable. These conditions can exclude many technically attainable applications.
Later in this course, we will explore in detail how to identify and define use cases in a business environment.
2) Data
Data is the raw material for AI solutions that is processed by machine learning algorithms. The quantity, quality, and structure of the data will determine the success of the end result.
The data provided to machine learning algorithms should contain information about the phenomenon the solution seeks to study. For example, if the training data is pictures of cats and dogs, the solution won't learn how to identify pictures of horses or hippos. If you give such a solution an image of a horse, it will predict that the horse is a cat or a dog. Thus, the training data creates boundaries within which the AI solution is able to operate.
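The following sketch (scikit-learn, with made-up feature values) shows how a classifier trained only on cats and dogs is forced to choose one of those two labels for any input, even a horse:

```python
# Sketch: a classifier trained on two classes can never predict a third.
# The feature values here are invented for illustration.
from sklearn.linear_model import LogisticRegression

X = [[4.0, 0.3], [4.5, 0.2], [25.0, 0.9], [30.0, 0.8]]  # e.g. weight_kg, snout_ratio
y = ["cat", "cat", "dog", "dog"]

model = LogisticRegression().fit(X, y)

horse = [[500.0, 0.95]]           # a horse-like input the model has never seen
print(model.predict(horse))       # forced to answer 'cat' or 'dog'
print(model.classes_)             # ['cat' 'dog'] -- the boundaries set by the data
```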
Data quality is important for the end result. If the solution receives poor-quality training data, it won't produce the best results. Poor-quality data can lead to a situation where AI doesn't work as expected when it's used. For example, the data could cause products to be priced incorrectly, leading to unforeseen results.
On the other hand, it is good to remember that AI solutions don't require completely error-free training data – and such data is rarely available.
The amount of data is also important in the training of AI solutions. You may have heard the argument that a lot of data is needed and that the more there is, the better the resulting AI solutions can be.
In reality, the issue is not that clear cut. Imagine a situation where we want to build a model that predicts the maintenance needs of a particular device. If the training data doesn't contain information related to the need for maintenance, adding more of it won't improve the outcome.
How much data is actually needed? Unfortunately, there is no straightforward answer to this question. The amount of data required depends on several factors which are outside the scope of this course. But as a general rule, more data is better.
The selection of training data is important because it affects the results of AI solutions. The bias of the chosen training data can be either conscious or unconscious, and the latter might lead to distorted results, among other issues.
Data examples are also needed to improve the performance of generative AI applications. Such examples usually provide additional context that can help ground or tune the responses of a generative AI application for your specific use case.
Suppose that we want to build an AI model that can predict how much training a novice runner would need to do to be able to run a marathon in four hours. For the AI model, we collect data from a club for running enthusiasts.
However, this means the results would probably be biased as those who actively run are unlikely to be representative of the entire population. Instead, the running enthusiasts are likely to be more talented runners than the average. This means that the prediction model might work well for a novice runner who is talented, but for the average beginner the results would be too optimistic.
The ethics of AI are closely related to this topic, as bias in the selection of training data can result in discrimination against different groups of people, for example in automated decision-making systems. The need for ethics in AI has become more profound with the proliferation of Generative AI due to the potential impact of this technology. Some of the ethical challenges related to Generative AI are the following:
Deepfakes: Generative AI can create highly convincing fake content (called deepfakes), leading to the spread of misinformation.
Bias: Generative AI can generate content that is biased towards certain societies or demographics. This is because of inherent bias in the training data.
Intellectual Property: Generative models can inadvertently replicate copyrighted material, raising concerns about intellectual property rights and fair use.
Privacy: AI models trained on personal data can leak sensitive information, violating privacy norms and regulations.
Accountability: Determining responsibility for the actions of AI systems can be complex, particularly when harm is caused.
The European Union (EU) has been at the forefront of advocating for ethical AI, driven by the recognition of these potential risks and the need to protect fundamental rights. One regulation the EU has adopted is the EU AI Act, a pioneering regulatory framework that governs the development and use of artificial intelligence within the European Union. It adopts a risk-based approach, categorizing AI applications into different risk levels: unacceptable, high, limited, and minimal risk. High-risk applications are subject to stringent requirements, including robust data governance, transparency, human oversight, accuracy, and cybersecurity measures, while certain practices, like social scoring by governments, are outright banned. This legislation aims to ensure that AI technologies are safe and transparent and that they respect fundamental rights, positioning the EU as a leader in ethical AI and fostering public trust in AI innovations.
3) Machine learning
Machine learning is a subset of AI that has its roots in statistics. As discussed earlier, machine learning methods learn from given data without separately programmed rules. The result is stored in a machine learning model.
Pre-trained machine learning models
Especially in AI solutions for unstructured data like images or text, pre-trained machine learning models are often used. These are available for tasks like identifying animals from images, for example. The advantage of pre-trained models is that you don't necessarily need your own data to build a machine learning model. On the other hand, it's often possible to fine-tune these models further with your own data.
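As an illustration, the following sketch loads a pre-trained image classification model via the Hugging Face transformers library; the image file name is a placeholder, and the exact model downloaded depends on the library's defaults:

```python
# A minimal sketch of using a pre-trained model via the Hugging Face
# `transformers` library. A default model is downloaded automatically,
# so no training data of your own is needed.
from transformers import pipeline

classifier = pipeline("image-classification")      # loads a pre-trained vision model
results = classifier("animal_photo.jpg")           # placeholder path to your own image
print(results[:3])                                 # top predicted labels with scores
```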
We won’t delve into different machine learning methods in more detail in this course, but it’s useful to know that there are many of them, including neural networks, logistic regression, gradient boosting, and decision trees. It should also be noted that many machine learning methods are associated with the term “black box”, which means that we can’t see the exact process by which a machine learning model ended up at a particular outcome.
Elements of AI
If you’re interested in learning more about machine learning and AI in general, check out the Elements of AI course.
Categories of machine learning
Machine learning is divided into three main categories: supervised learning, unsupervised learning, and reinforcement learning. These are worth remembering, as they all have different uses.
A) Supervised learning is basically prediction. The data has explanatory variables (inputs) and a response (label). The function of machine learning algorithms is to use the values of inputs to predict labels.
For example, in the figure below, the machine learning model would use the basic data of the apartment (inputs) to create a model that can predict the price of the apartment (label).
In image processing solutions, the inputs are images and the labels are identified objects in them.
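As a minimal illustration of the apartment example, the following scikit-learn sketch learns the relationship between inputs and label from a handful of invented data points:

```python
# A minimal sketch of supervised learning: the inputs describe an
# apartment, the label is its price. All numbers are invented.
from sklearn.linear_model import LinearRegression

X = [[55, 2, 1990], [72, 3, 2005], [40, 1, 1975], [95, 4, 2015]]  # m2, rooms, year
y = [180000, 260000, 120000, 390000]                              # known prices (labels)

model = LinearRegression().fit(X, y)          # learn the input-label relationship
print(model.predict([[60, 2, 2000]]))         # predicted price for a new apartment
```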
Most business AI solutions are based on supervised learning. However, this is often a challenge as the values of the labels aren't always known. For example, imagine that a restaurant’s management would like to predict the amount of food waste (label), but no information on the amount of food that ends up as waste is included in their data. In this situation, the restaurant would be unable to make an accurate prediction.
If the values of the labels aren't known, it may be possible to annotate the data. An example of this could be a company with lots of pictures of products on a production line. The pictures show that some products have defects on the surface material. In this context, annotating would mean that an expert marks the faulty points in the image, for example by drawing a rectangle around them, thus creating the value of the label “faulty”. If there is no such entry in the image then the value of the label is “error-free”.
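An annotation produced this way might be stored as a simple record like the following; the field names and coordinates are purely illustrative:

```python
# A hypothetical annotation record for one production-line image.
# An expert has drawn a rectangle around a surface defect, creating
# the label value "faulty"; the field names are illustrative only.
annotation = {
    "image": "line_camera_0042.jpg",
    "label": "faulty",
    "defect_box": {"x": 120, "y": 80, "width": 40, "height": 25},  # pixels
}

# An image with no marked defects simply gets the other label value:
clean_annotation = {"image": "line_camera_0043.jpg", "label": "error-free"}
```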
Supervised learning is further divided into two subcategories: classification and regression problems.
In classification problems, we predict a label that is a class variable. Here are two examples:
A two-class classification could be based on information about whether the customer bought the apartment after the showing. The answer options are “yes” and “no,” so they form predictable binary classes.
Multi-class classification means that there are more than two predictable classes. This is, for example, the identification of objects from an image, in which case each identified object has its own class. If the two-class example above had had a third option in addition to the “yes” and “no” answers, it would have been a multi-class classification.
In regression problems, we predict a label that falls on a continuous scale, such as sales, temperature, or price.
The following figure illustrates the difference between regression and classification problems.
It's also possible to turn regression problems into classification problems, which in certain situations can help create a better solution. For example, a sales forecast could be made as a multi-class classification instead of a single numerical figure, with classes such as sales increasing by "less than 5%", "5-10%", or "more than 10%". Instead of a numeric sales forecast, we would then get class-specific probabilities for sales, as the sketch below shows.
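In this sketch, the numeric growth figures (invented here) are binned into the three classes with pandas, and a classifier then returns class-specific probabilities:

```python
# Sketch: turning a numeric sales-growth target into classes,
# then predicting class probabilities instead of a single number.
# The growth figures and inputs are invented for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

growth = pd.Series([2.0, 4.5, 6.0, 8.5, 12.0, 15.0])   # sales growth in percent
labels = pd.cut(growth, bins=[-float("inf"), 5, 10, float("inf")],
                labels=["less than 5%", "5-10%", "more than 10%"])

X = [[1], [2], [3], [4], [5], [6]]                     # some explanatory inputs
model = RandomForestClassifier(random_state=0).fit(X, labels)
print(model.classes_)                                  # the three sales classes
print(model.predict_proba([[4]]))                      # probability per class
```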
B) In unsupervised learning, data (inputs) are used, but there is no label variable to predict. Typically, the goal of unsupervised learning is to find structures in the data that can be used.
Perhaps the best known example of this is customer segmentation, where clustering methods are used to search the data for similar customer groups. Thus, customers belonging to the same customer segment are grouped (clustered) closer to each other than customers in different segments based on their purchasing behavior.
The unsupervised learning solutions used in business are mainly related to clustering methods, meaning we want to find similar sets within the data.
The image above shows a simple example of a clustering method. Here, the algorithm has found three similar sets that differ from each other. Human help is needed to interpret the results – for example, the interpretation of “Set A” could be that it includes expensive apartments that also have a short sale period.
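A corresponding sketch with scikit-learn's KMeans might look like this; the apartment figures are invented, and the algorithm is only told how many sets to look for:

```python
# A minimal clustering sketch with KMeans.
# Each row is an apartment: [price in thousands, days on market].
# The values are invented; the algorithm is only told to find 3 sets.
from sklearn.cluster import KMeans

X = [[450, 20], [480, 15], [200, 90], [210, 100], [320, 50], [300, 55]]

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # which set each apartment belongs to
print(kmeans.cluster_centers_)  # a human still has to interpret the sets
```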
There are other types of unsupervised learning, but we won't cover them in this course.
C) In reinforcement learning, AI makes decisions and then learns to make future decisions based upon the feedback from those decisions. This isn't always easy to accomplish, as it can take a long time to gather feedback. Typical examples of reinforcement learning are online advertising targeting, self-driving vehicles, or the most advanced chess programs that learn more and more from the games they play over time.
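The core feedback loop can be illustrated with a tiny simulation of ad targeting, a so-called epsilon-greedy bandit; the click rates below are invented purely for the simulation:

```python
# A tiny sketch of the reinforcement learning idea: an epsilon-greedy
# "bandit" learns which of three ads to show from click feedback.
# The true click probabilities are invented and unknown to the learner.
import random

true_click_rates = [0.02, 0.05, 0.08]
counts = [0, 0, 0]                          # times each ad was shown
values = [0.0, 0.0, 0.0]                    # estimated click rate per ad

for step in range(10000):
    if random.random() < 0.1:               # explore: try a random ad
        ad = random.randrange(3)
    else:                                   # exploit: show the best ad so far
        ad = values.index(max(values))
    reward = 1 if random.random() < true_click_rates[ad] else 0  # feedback
    counts[ad] += 1
    values[ad] += (reward - values[ad]) / counts[ad]  # update the estimate

print(values)  # the estimates approach the true click rates over time
```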
There are still few business solutions based on reinforcement learning, as using this method is challenging for many applications. Although we won't explore reinforcement learning further in this course, it's still worth keeping in mind, as solutions based on it are quite likely to become more common in the future.
D) Generative AI
Generative AI, also referred to as GenAI or GAI, is the domain of AI that is capable of generating data of various types – unstructured data like text, computer code, audio, images, and video, or structured data like tabular data – often in response to a prompt given in natural language. There has also been active research into generating new molecules that could be useful for drug development.
The most common types of GenAI models are:
Large Language Models (LLMs): LLMs are designed to understand and generate human language text. They are trained on vast amounts of text data to learn the patterns of language. Examples include OpenAI's GPT models (which power ChatGPT) and Google's Gemini.
Diffusion Models: these models are generally used for applications like image and video generation.
The most important components of a generative AI system from the end user perspective are the following:
Generative AI model, e.g. an LLM – several models have been developed and published by different companies. Among the most popular are GPT from OpenAI (the model family behind ChatGPT), Claude from Anthropic, Mistral from Mistral AI, Gemini from Google, and Llama from Meta, and newer models are being published at a very rapid pace. The most capable models from these companies, with Meta as an exception, are closed-source, black-box models that can only be accessed via an API (or a web application from the respective vendor). A model is selected based on a number of criteria, which may include the type of output to be generated (text, image, etc.), the complexity of the task to be accomplished, the budgeted cost of use, the tolerable latency of request and response, future maintenance needs, data privacy and security concerns, reliability and consistency needs, customization and fine-tuning needs, and the cloud vs. on-premises infrastructure strategy.
Prompt – the instruction given to the generative AI model to perform a task. A well-written, well-structured prompt increases the quality of the output. A well-structured prompt may include some or all of the following elements: a) the persona to be adopted by the GenAI model, e.g. a teacher, b) an instruction that clarifies the task to be performed, e.g. summarizing a document, c) context, which may include examples of input-output pairs or documents the model can refer to, and d) the format or structure that the output should follow.
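Putting these elements together, a well-structured prompt might be assembled and sent to a model like this; the OpenAI Python SDK is used here as one example vendor, and the model name and document text are placeholders:

```python
# A sketch of a well-structured prompt containing the elements above.
# The OpenAI Python SDK is one example vendor; the model name and
# document text are placeholder assumptions.
from openai import OpenAI

prompt = (
    "You are an experienced teacher.\n"                       # a) persona
    "Summarize the following document in three sentences.\n"  # b) instruction
    "Document: <paste document text here>\n"                  # c) context
    "Return the summary as a bulleted list."                  # d) format
)

client = OpenAI()  # assumes an API key is set in the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",                       # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```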