An Introduction to Machine Learning
I'm sure you've heard of artificial intelligence over the past
couple of months. With the rise of products like ChatGPT, Snapchat
AI, and Bing's new AI search assistant, it's kind of hard not to. In
fact, chances are you’ve played around with some of these bots.
However, these trendy new tools are only a small subset of AI.
Artificial intelligence (abbreviated as AI) is any sort of program
that tries to mimic human intelligence and capabilities in order to
solve a problem.
Many of these artificial intelligence systems are built upon machine
learning (ML) algorithms. Machine learning algorithms are trained on
data and then apply what they have “learned” to make predictions or
decisions about new inputs. A wide spectrum of programs can be
considered machine learning programs, from the linear regression
function on your calculator to complex medical diagnosis programs.
Why AI
Now, you might ask, “When would we even need machine learning
algorithms when humans could perform the same functions?” On many
well-defined tasks, AI has two significant advantages over humans:
- It can often process data far faster than humans
- It can often do so more accurately and consistently than humans
Some problems require copious amounts of data to solve and leave a
wide margin for human error, which makes them a good fit for machine
learning.
How They Work
Well, how does an ML model work then? At a high level, a machine
learning model is basically just a regular old function. In math
class, your teacher probably covered functions, written as y = f(x).
Given a value x, the function f(x) outputs a value y. Now imagine
that you know the y-value for only some x-values, but want to
predict the y-value for x-values you haven’t seen. That is where a
basic prediction function comes into play. The goal of machine
learning algorithms is to find the optimal function f(x) such that
our model can predict an accurate value y for any given x. This
function could be anything from a simple linear function to an
extremely complex function.
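To make “finding the optimal f(x)” a little more concrete, here is a
minimal sketch in Python. It assumes a handful of made-up (x, y)
pairs and three hypothetical candidate functions, and it scores each
candidate by mean squared error, which is one common way (not the
only way) to measure how well a function fits known points.

```python
# Known (x, y) pairs. These numbers are invented purely for illustration.
known_points = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]


def mean_squared_error(f, points):
    """Average of the squared gaps between f(x) and the known y."""
    return sum((f(x) - y) ** 2 for x, y in points) / len(points)


# Three hypothetical candidates for f(x).
candidates = {
    "y = x": lambda x: x,
    "y = 2x + 1": lambda x: 2 * x + 1,
    "y = x^2": lambda x: x ** 2,
}

# Score every candidate, then keep the one with the smallest error.
errors = {name: mean_squared_error(f, known_points)
          for name, f in candidates.items()}
best_name = min(errors, key=errors.get)
best_f = candidates[best_name]

print(best_name)    # → y = 2x + 1
print(best_f(5.0))  # → 11.0 (a prediction for an x we never observed)
```

A real learning algorithm searches a far larger space of functions
than three hand-written guesses, but the idea of comparing candidates
by their error on known data is the same.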
Let’s say you have a dataset that pairs the arm lengths of NFL
quarterbacks with their accuracy, with arm length being the x-value
and accuracy being the y-value. If you wanted to predict the accuracy
of a quarterback with an arm length of 3.5 feet, but no such
quarterback exists in the NFL, you could use a machine learning
model to predict the hypothetical quarterback’s accuracy. Let’s
walk through the steps for this.
First, you would split your dataset into a training set and a
testing set. Splitting matters because you need enough data to train
the model accurately, plus separate data to measure its performance
on examples it hasn’t seen before. Then, you would choose between
different types of models to find an f(x) that best fits the trend
between arm lengths and accuracies. After using the training set to
find the best f(x), you would use the test set to evaluate the
function and determine how well it generalizes the trend.
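The steps above can be sketched in plain Python. This is only an
illustration under stated assumptions: the dataset is synthetic
(generated from an invented trend plus noise, not real NFL data), the
model type is a straight line fitted with ordinary least squares, and
the 75/25 split and the helper names fit_line and predict are
arbitrary choices made for this example.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for the quarterback dataset:
# (arm length in feet, accuracy in percent). The trend and the noise
# are invented for illustration; this is NOT real NFL data.
data = []
for _ in range(40):
    arm_length = random.uniform(2.5, 3.2)
    accuracy = 40.0 + 8.0 * arm_length + random.gauss(0, 1.0)
    data.append((arm_length, accuracy))

# Step 1: shuffle, then hold out 25% of the rows as a test set.
random.shuffle(data)
split = int(len(data) * 0.75)
train, test = data[:split], data[split:]


# Step 2: fit f(x) = m*x + b to the training set (ordinary least squares).
def fit_line(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


m, b = fit_line(train)


def predict(x):
    return m * x + b


# Step 3: average absolute error on rows the model never saw.
test_error = sum(abs(predict(x) - y) for x, y in test) / len(test)

print(f"f(x) = {m:.2f}x + {b:.2f}")
print(f"mean absolute error on the test set: {test_error:.2f}")
print(f"predicted accuracy at 3.5 ft: {predict(3.5):.1f}")
```

Note that 3.5 feet lies outside the range of arm lengths in the
synthetic data, so the last prediction is an extrapolation and
deserves less trust than predictions inside that range.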
In the above example, we only have one X variable (arm length) and
one Y variable (accuracy). However, machine learning models applied
in scientific and commercial fields can have hundreds or thousands
of variables. This is where a distinction between real and ideal
data can be made. Ideal data has an easily generalizable trend with
strong correlations between variables. Real data, collected from the
real world, is often noisy and does not always have strong
correlations between variables.
Obviously, we would always prefer ideal data to real data. However,
most machine learning problems have messy data; you can’t always
just use the same model to solve every problem. Different problems
require different models and types of learning. Luckily, if trained
correctly, a model can generally extract patterns from messy data.
Types of Machine Learning
Within machine learning, there are two main ways a model learns:
supervised and unsupervised learning. Imagine you are trying to
classify whether an email message is spam or not. In supervised
learning, each input X in the training data comes labeled with its
correct output Y: spam or not spam. In unsupervised learning, we get
each input X but no labels, so the model has to find structure, such
as groups of similar emails, by itself. You can think of supervised
learning as studying with an answer key, while unsupervised learning
gives the computer more liberty to pick up subtle patterns on its
own.
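Here is a small sketch of that difference, using the spam example.
Everything in it is an assumption made for illustration: each email
is reduced to a single hypothetical feature (a count of “suspicious”
words), the numbers are invented, the supervised part learns a simple
cutoff from the labels, and the unsupervised part is a tiny
one-dimensional k-means with two groups.

```python
# Hypothetical feature: number of "suspicious" words in each email.
# All values are invented for illustration.
counts = [0, 1, 1, 2, 7, 8, 9, 10]
labels = ["not spam"] * 4 + ["spam"] * 4  # only the supervised part sees these


def mean(values):
    return sum(values) / len(values)


# Supervised: labels are available, so learn a cutoff halfway between
# the average count of each labeled class.
clean_mean = mean([c for c, l in zip(counts, labels) if l == "not spam"])
spam_mean = mean([c for c, l in zip(counts, labels) if l == "spam"])
cutoff = (clean_mean + spam_mean) / 2


def classify(count):
    return "spam" if count > cutoff else "not spam"


# Unsupervised: no labels. A tiny 1-D k-means (k = 2) only looks for two
# groups of similar counts; it never learns what the groups mean.
def two_means(values, rounds=10):
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group (if non-empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


print(classify(6))        # → spam
print(two_means(counts))  # → [[0, 1, 1, 2], [7, 8, 9, 10]]
```

The unsupervised version finds the same two groups here, but only a
person (or a labeled example) can say which group is the spam.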
There is also a third type of machine learning called reinforcement
learning. In reinforcement learning, an ML algorithm learns through
trial and error: it is rewarded for some actions and penalized for
others. Reinforcement learning algorithms try to maximize their total
reward over time, so they often have to weigh how a move will pay off
several steps ahead rather than only right away. Reinforcement
learning is used in specialized cases such as video games with AI
opponents, travel planning algorithms, and budget optimization
algorithms.
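The trial-and-error loop can be sketched with one of the simplest
reinforcement learning setups, a two-action “bandit” problem solved
with an epsilon-greedy strategy. This is a deliberately stripped-down
assumption: the environment, the action names, the reward
probabilities, and the 10% exploration rate are all invented, and
because each action is a single step, the sketch shows learning from
rewards but not planning several moves ahead.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical environment: two actions with hidden success rates.
# The agent is never told these numbers; it only sees sampled rewards.
TRUE_SUCCESS_RATE = {"left": 0.2, "right": 0.8}


def take_action(action):
    """Return reward 1 (success) or 0 (failure) for the chosen action."""
    return 1 if random.random() < TRUE_SUCCESS_RATE[action] else 0


estimates = {"left": 0.0, "right": 0.0}  # the agent's current guesses
counts = {"left": 0, "right": 0}         # how often each action was tried
EXPLORE_RATE = 0.1                       # share of moves spent exploring

for _ in range(2000):
    if random.random() < EXPLORE_RATE:
        action = random.choice(["left", "right"])   # explore: try anything
    else:
        action = max(estimates, key=estimates.get)  # exploit: best guess
    reward = take_action(action)
    counts[action] += 1
    # Nudge the running average for this action toward the new reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)
print(counts)
```

After enough tries, the agent’s estimate for the better action should
settle near its true success rate, and that action ends up being
chosen most of the time.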
Parameters
In order to find the optimal function f(x), we need to find the
parameters to use for that function. You can think of parameters as
the adjustable numbers inside a function that control how it turns
an input into an output. For example, the classic equation of a
line, y = mx + b, has the parameters m and b. A machine learning
algorithm typically goes through the data and makes small tweaks to
its parameters until the function fits the data as well as it can.
Different learning algorithms and model types do this in different
ways.
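One common way to make those small tweaks is gradient descent, shown
below as a minimal sketch for the line y = mx + b. The data points,
the starting guess, the learning rate, and the number of steps are
all assumptions chosen for this example; other algorithms adjust
their parameters in quite different ways.

```python
# Points that lie exactly on y = 2x + 1 (invented for illustration).
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

m, b = 0.0, 0.0        # start from an arbitrary guess
learning_rate = 0.05   # size of each tweak (a hand-picked assumption)

for step in range(2000):
    # Gradient of the mean squared error with respect to m and b.
    grad_m = sum(2 * (m * x + b - y) * x for x, y in points) / len(points)
    grad_b = sum(2 * (m * x + b - y) for x, y in points) / len(points)
    # Move each parameter a small step in the direction that lowers the error.
    m -= learning_rate * grad_m
    b -= learning_rate * grad_b

print(round(m, 3), round(b, 3))  # → 2.0 1.0
```

Each pass changes m and b only slightly, but after many passes they
settle on the slope and intercept that the points were generated
from.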
Types of Machine Learning Tasks
There are two main tasks that can be accomplished with ML:
regression and classification. Regression involves making
quantitative predictions on continuous data, such as a quarterback’s
accuracy. Classification involves putting data into qualitative
categories, such as spam or not spam.
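The two tasks can be contrasted with a single simple technique,
k-nearest neighbors, which handles both. The sketch below is an
illustration only: the rows (hours studied, exam score, pass/fail
label) are invented, and k = 3 is an arbitrary choice. The regression
function returns a number, while the classification function returns
a category.

```python
from collections import Counter

# Hypothetical training rows: (hours studied, exam score, pass/fail label).
# All values are invented for illustration.
rows = [
    (1.0, 52.0, "fail"),
    (2.0, 58.0, "fail"),
    (3.0, 65.0, "fail"),
    (5.0, 74.0, "pass"),
    (6.0, 81.0, "pass"),
    (7.0, 88.0, "pass"),
]


def nearest(x, k=3):
    """The k training rows whose input is closest to x."""
    return sorted(rows, key=lambda row: abs(row[0] - x))[:k]


def predict_score(x):
    """Regression: average the neighbors' numeric scores."""
    neighbors = nearest(x)
    return sum(score for _, score, _ in neighbors) / len(neighbors)


def predict_label(x):
    """Classification: majority vote over the neighbors' categories."""
    votes = Counter(label for _, _, label in nearest(x))
    return votes.most_common(1)[0][0]


print(predict_score(6.5))  # → 81.0 (a quantity)
print(predict_label(6.5))  # → pass (a category)
print(predict_label(1.5))  # → fail
```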
Machine learning has a variety of applications in our day-to-day
lives, from social networks and recommendation algorithms to
personalized medicine and navigation. Understanding the
foundations of ML and how to use it will become an increasingly
important skill in years to come. In future articles, we will focus
more on regression and also dive into deep learning, an increasingly
popular subset of machine learning.