MANUEL RADOVANOVIĆ : Machine Learning, How Computers Learn on Their Own?

If you're looking to get started with machine learning, this post is a good place to begin. Many people interested in the field give up early, put off by complex explanations, the maze of artificial intelligence subfields, unfamiliar technical terms, and programming code, until they lose motivation and are left wondering what machine learning is in the first place. This text therefore aims to be a simple, friendly introduction for those who are new to the field. We'll break down complex concepts into simpler terms, so you can grasp the core ideas without getting overwhelmed by technical jargon. To start, let's discuss the primary goal of machine learning.

The goal of machine learning is to enable computers to learn from data and make decisions or predictions based on that knowledge, without the need for explicit programming of every step. The idea is for machine learning algorithms to recognize patterns, structures, and relationships within data so that they can:

Predict future outcomes: Algorithms can use existing data to predict what will happen in the future, such as changes in prices, user behavior, or risk in financial sectors.
Classify data: The goal may be for the algorithm to classify data into different categories, such as recognizing objects in images (like facial or object recognition), identifying diseases based on medical scans, or sorting emails into spam and legitimate categories.
Automate processes: Machine learning enables the automation of processes and real-time decision making, such as movie recommendations, speech recognition, self-driving car control, and personalized advertising.
Improve accuracy and efficiency: Through learning from data, systems can improve their accuracy in recognizing and classifying objects, reducing the need for human intervention.

Children are taught machine learning in a playful way

Some people think that artificial intelligence is the same as machine learning, but AI is a broader field that encompasses various techniques, while machine learning is a specific subset of AI that focuses on enabling machines to learn from data. Although machine learning may seem complex at first glance, the basic concept is quite simple: computers learn from data and make decisions without direct human intervention. It is also essential for driving artificial intelligence and technological progress. Thanks to it, we have personalized recommendations on Netflix, efficient search engines, autonomous cars, and much more.

Instead of predefining all the steps to solve a problem, machine learning enables computers to independently recognize patterns in large datasets and draw conclusions based on those patterns. The basic idea is to train algorithms on data, and after training, use them for decision making, image recognition, event prediction, or data classification. For example, a machine learning algorithm can learn to recognize faces based on thousands of sample images, or predict future values based on historical data. There are different types of machine learning, including:

Supervised learning: The algorithm learns from labeled data where inputs and outputs are known to predict outputs for new, unknown inputs.
Unsupervised learning: The algorithm tries to find hidden structures in data without predefined labels or answers.
Reinforcement learning: The algorithm learns through interaction with the environment, trying to maximize a specific goal or reward.
Machine learning is applied in many industries, including medicine, finance, autonomous vehicles, speech and image recognition, content recommenders, and more.
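To make the idea of "train on data, then predict" concrete, here is a minimal sketch in plain Python. It fits a straight line to a handful of historical values and uses it to predict the next one. The data and the names fit_line, months, and prices are made up for illustration; real projects would normally rely on a library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from data using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical data: month number and observed price
months = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

# "Training": the parameters come from the data, not from hand-written rules
slope, intercept = fit_line(months, prices)

# "Prediction": apply the learned parameters to a new, unseen input
predicted_price = slope * 6 + intercept
print(predicted_price)  # 20.0
```

Nobody told the program that the price rises by 2 each month; it recovered that pattern from the examples, which is the core of what "learning from data" means here.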

Machine Learning in Practice: Different Types of Machine Learning Algorithms

To guide you as simply as possible through the divisions of machine learning, we first need to pay close attention to the simplest structure of artificial intelligence. Only after visualizing the AI structure can you identify machine learning's position within this framework and explore its subsequent subdivisions. That's why you should look at the following image.

A simple structure of AI - Artificial Intelligence

With that picture in mind, let's delve deeper and analyze the fundamental structure of artificial intelligence.

AI - Artificial Intelligence: This is the broadest term, encompassing all technologies and methods that enable computers to perform tasks that require intelligence. AI includes systems that mimic human abilities such as speech recognition, decision making, problem solving, and natural language understanding.

ML - Machine Learning: As a subset of AI, machine learning is an approach that relies on the use of statistical models and algorithms that enable computers to learn from data. Instead of being manually programmed with rules, the computer learns patterns from data on its own and adapts to new situations.

DL - Deep Learning: Deep learning is a subset of machine learning. It focuses on using artificial neural networks with many layers (so-called "deep" networks) to process more complex and larger datasets. Deep learning is used for tasks such as image, speech, and natural language recognition.

PA - Predictive Analytics: Predictive analytics is a branch of analytics that uses data, statistical algorithms, and machine learning to predict future outcomes based on historical data. The goal of predictive analytics is to identify patterns and relationships in data to enable informed predictions about future events.

Here's a hierarchical representation:

AI - Artificial Intelligence

ML - Machine Learning

DL - Deep Learning

PA - Predictive Analytics

Therefore, machine learning is a part of AI, while deep learning is a specific type of machine learning used for particularly complex tasks. Now that we understand the foundational concepts of machine learning, let's discuss the different approaches to training algorithms. These include:

Supervised learning: Ideal for tasks like classification (e.g., spam detection) and regression (e.g., predicting house prices), where we have labeled data to guide the learning process.

Unsupervised learning: Commonly used for tasks like clustering (e.g., customer segmentation) and dimensionality reduction, aiming to discover hidden structures in unlabeled data.

Reinforcement learning: This approach is well-suited for tasks that involve decision-making in sequential environments, such as playing games or controlling robots.
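The difference between the first two approaches can be illustrated with a small sketch in plain Python. The supervised part uses the labels to learn one "centre" per class and then classifies a new input; the unsupervised part groups the same numbers into two clusters without ever seeing the labels. The toy data and the function names (fit_centroids, predict, two_means) are assumptions chosen for illustration, not part of any library.

```python
# Hypothetical toy data: hours of a film watched, and whether the viewer liked it
hours = [0.2, 0.4, 0.5, 1.6, 1.8, 2.0]
liked = [0, 0, 0, 1, 1, 1]

def mean(values):
    return sum(values) / len(values)

def fit_centroids(xs, ys):
    """Supervised: use the labels to compute one centroid per class."""
    return {
        label: mean([x for x, y in zip(xs, ys) if y == label])
        for label in sorted(set(ys))
    }

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def two_means(xs, iterations=10):
    """Unsupervised: split xs into two groups without using any labels."""
    low, high = min(xs), max(xs)  # initial cluster centres
    for _ in range(iterations):
        group_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        group_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(group_low), mean(group_high)
    return low, high

centroids = fit_centroids(hours, liked)
print(predict(centroids, 1.7))  # predicted class for a new, unseen input
print(two_means(hours))         # two cluster centres found without labels
```

On this data both approaches end up with the same two groups, but only the supervised one knows that the groups mean "liked" and "did not like"; the clustering merely discovers that two groups exist.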

Different types of machine learning algorithms

But the best way to truly understand this in practice is by creating and examining a basic machine learning example in Python, specifically one that predicts movie ratings on Netflix.

The Netflix Case from Idea to Model: How to Predict Movie Ratings

To maintain consistency throughout this AI tutorial series, we will continue to utilize the Anaconda distribution, Jupyter Notebook, and Python programming language. This decision was made based on the simplicity and effectiveness of this integrated development environment. For those who have yet to install these tools, comprehensive installation instructions can be found here. Once you have successfully set up your environment, we can proceed to the coding exercises.

1. Installing the Surprise Library: Before we begin, let’s install the Surprise library, which is published on PyPI under the name scikit-surprise. Open your terminal or command prompt and enter the following command:

pip install scikit-surprise

2. Loading the Data: First, we’ll load the dataset. In this example, we’re using the built-in ml-100k dataset, that is, MovieLens 100k. This dataset contains information about users, movies, and their ratings.

3. Splitting the Data into Training and Test Sets: We divide the data into two parts: the training set used to train the model and the test set used for model evaluation. Here’s how it looks in code:

# Import necessary modules
from surprise import Dataset, KNNBasic, accuracy
from surprise.model_selection import train_test_split

# Load the MovieLens 100k dataset
# (Surprise offers to download it the first time it is requested)
data = Dataset.load_builtin('ml-100k')

# Split the data into training and test sets (25% of the ratings go to the test set)
trainset, testset = train_test_split(data, test_size=0.25)
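To show what such a split does conceptually, here is a simplified sketch in plain Python. It is not Surprise's implementation: the helper simple_train_test_split and the toy ratings list are hypothetical, and Surprise returns its own trainset object rather than a plain list.

```python
import random

def simple_train_test_split(ratings, test_size=0.25, seed=0):
    """Shuffle the ratings and hold out a fraction of them for testing."""
    shuffled = list(ratings)               # copy, so the input stays untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = int(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (user, movie, rating) triples
ratings = [(f"user{i}", f"movie{i % 10}", 1 + i % 5) for i in range(100)]

train, test = simple_train_test_split(ratings)
print(len(train), len(test))  # 75 25
```

The model only ever sees the training part, so its performance on the held-out part tells us how well it handles ratings it has not encountered before.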

4. Initializing the Collaborative Filtering Algorithm: In this example, we use the KNNBasic algorithm. KNNBasic is a simple collaborative filtering algorithm based on user or item similarity. Here’s how we initialize the algorithm:

# Initialize the KNNBasic collaborative filtering algorithm
algo = KNNBasic()

# Train the algorithm on the training set
algo.fit(trainset)
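The idea behind a user-based nearest-neighbour prediction can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not Surprise's actual code: the ratings are invented, the similarity is a mean-squared-difference measure, and the helper names msd_similarity and predict_rating are hypothetical.

```python
# Hypothetical ratings: user -> {movie: rating}
ratings = {
    "ana":   {"Alien": 5, "Brazil": 4, "Casablanca": 1},
    "boris": {"Alien": 5, "Brazil": 5, "Casablanca": 2, "Dune": 4},
    "ceca":  {"Alien": 1, "Brazil": 2, "Casablanca": 5, "Dune": 2},
}

def msd_similarity(a, b):
    """Similarity from the mean squared difference on movies both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    msd = sum((a[movie] - b[movie]) ** 2 for movie in common) / len(common)
    return 1.0 / (msd + 1.0)

def predict_rating(ratings, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings of item."""
    neighbours = [
        (msd_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    total_similarity = sum(similarity for similarity, _ in top)
    if total_similarity == 0:
        return None  # no usable neighbours, so no prediction in this sketch
    return sum(similarity * rating for similarity, rating in top) / total_similarity

# ana has not rated "Dune"; estimate her rating from similar users
dune_estimate = predict_rating(ratings, "ana", "Dune")
print(round(dune_estimate, 2))  # about 3.77
```

The estimate lands much closer to boris's rating of 4 than to ceca's rating of 2, because ana's past ratings resemble boris's far more than ceca's.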

5. Predicting Ratings: After training the algorithm, we can make predictions on the test set. Here’s how we do it:

# Make predictions on the test set
predictions = algo.test(testset)

# Calculate and print the RMSE (Root Mean Squared Error) of the predictions
accuracy.rmse(predictions)
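RMSE measures how far, on average, the predicted ratings are from the true ones, with larger errors penalized more heavily. A minimal sketch of the calculation in plain Python, using invented (true rating, estimated rating) pairs and a hypothetical helper named rmse, could look like this:

```python
import math

def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    squared_errors = [(true - estimate) ** 2 for true, estimate in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical true ratings and model estimates
pairs = [(4, 3.5), (2, 2.5), (5, 4.0), (3, 3.0)]

error = rmse(pairs)
print(round(error, 2))  # about 0.61
```

A lower value means the predictions are closer to the real ratings; on a 1 to 5 rating scale, an RMSE of 0 would mean every rating was predicted exactly.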

6. Individual Prediction: We can also predict an individual rating for a specific user and movie. For example:

# Define a user ID and an item ID for which we want to make a prediction
# (the raw IDs in this dataset are strings)
user_id = str(196)
item_id = str(302)

# Get the predicted rating for the specified user and item
pred = algo.predict(user_id, item_id)

# Print the prediction
print(pred)

This example demonstrates the basic steps in applying machine learning to predict movie ratings. In real-world projects, we would use more complex models and larger datasets, but this basic example provides a good introduction to the topic. The following video shows how all this was coded and what results we got: