I’m an AI researcher, and I’ve received quite a few emails asking me just how much math is required in Artificial Intelligence.
I won’t lie: it’s a lot of math.
And this is one of the reasons AI puts off many beginners. After much research and talks with several veterans in the field, I’ve compiled this no-nonsense guide that covers all of the fundamentals of the math you’ll need to know.
The concepts mentioned below are usually covered over several semesters in college, but I’ve boiled them down to the core principles that you can focus on.
This guide is a life-saver for beginners because it lets you study the topics that matter most. It's an even better resource for practitioners, such as myself, who need a quick refresher on these concepts.
Note: You don’t need to know all of the concepts (below) in order to get your first job in AI. All you need is a firm grasp of the fundamentals. Focus on those and consolidate them.
You can also find these resources on my GitHub: Jason's AI Math Roadmap.
1. Algebra You Need to Know for AI
Knowledge of algebra is perhaps fundamental to math in general. Besides mathematical operations like addition, subtraction, multiplication and division, you’ll need to know the following:
2. Linear Algebra You Need to Know for AI
Linear Algebra is the primary mathematical computation tool in Artificial Intelligence and in many other areas of Science and Engineering. In this field, you need to understand four primary mathematical objects and their properties (scalars, vectors, matrices, and tensors), along with a few key techniques built on them:
- Scalars — a single number (can be real or natural).
- Vectors — a list of numbers, arranged in order. Consider them as points in space with each element representing the coordinate along an axis.
- Matrices — a 2-D array of numbers where each number is identified by 2 indices.
- Tensors — an N-D array (N>2) of numbers, arranged on a regular grid with N-axes. Important in Machine Learning, Deep Learning and Computer Vision.
- Eigenvectors & Eigenvalues — special vectors and their corresponding scalar quantity. Understand the significance and how to find them.
- Singular Value Decomposition — factorization of a matrix into 3 matrices. Understand the properties and applications.
- Principal Component Analysis (PCA) — understand the significance, properties, and applications.
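To make eigenvectors and eigenvalues less abstract, here is a minimal sketch of power iteration, one simple way to approximate the largest eigenvalue of a matrix and its eigenvector. The helper names (`mat_vec`, `norm`, `power_iteration`) and the 2x2 matrix `A` are my own illustrative choices, and the code assumes the starting vector is not orthogonal to the dominant eigenvector. In practice you would likely reach for a library routine such as NumPy's `numpy.linalg.eig` instead.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def power_iteration(matrix, steps=100):
    """Approximate the dominant eigenvalue and a unit-length eigenvector."""
    # Assumed starting guess: the first coordinate axis.
    v = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(steps):
        w = mat_vec(matrix, v)          # stretch v by the matrix
        length = norm(w)
        v = [x / length for x in w]     # rescale back to unit length
    # Rayleigh quotient: v . (A v), valid because v has unit length.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


# Illustrative symmetric matrix whose eigenvalues are 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue, eigenvector)
```

For this matrix the estimate settles close to 3, and the eigenvector points along the diagonal (both components equal). Multiplying `A` by that vector only rescales it, which is exactly what makes it an eigenvector.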
3. Calculus You Need to Know for AI
Calculus deals with changes in parameters, functions, errors and approximations. Working knowledge of multi-dimensional calculus is imperative in Artificial Intelligence.
The following are the most important concepts (albeit non-exhaustive) in Calculus:
- Derivatives — rules (addition, product, chain rule, and so on), derivatives of hyperbolic functions (tanh, cosh, and so on) and partial derivatives.
- Vector/Matrix Calculus — different derivative operators (Gradient, Jacobian, Hessian and Laplacian).
- Gradient Algorithms — local/global maxima and minima, saddle points, convex functions, batches and mini-batches, stochastic gradient descent, and performance comparison.
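The ideas in this list come together in gradient descent. The sketch below is a minimal example under my own assumptions: the convex function `loss`, the learning rate, and the step count are made up for illustration. It computes the partial derivatives by hand, cross-checks them with a finite-difference approximation, and then walks downhill to the minimum.

```python
def loss(x, y):
    """A convex bowl with its minimum at (3, -1). Illustrative only."""
    return (x - 3) ** 2 + 2 * (y + 1) ** 2


def gradient(x, y):
    """Analytic partial derivatives of loss with respect to x and y."""
    return 2 * (x - 3), 4 * (y + 1)


def numerical_gradient(x, y, h=1e-6):
    """Central-difference approximation, handy for checking the math."""
    dx = (loss(x + h, y) - loss(x - h, y)) / (2 * h)
    dy = (loss(x, y + h) - loss(x, y - h)) / (2 * h)
    return dx, dy


def gradient_descent(start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from `start`."""
    x, y = start
    for _ in range(steps):
        dx, dy = gradient(x, y)
        x -= learning_rate * dx
        y -= learning_rate * dy
    return x, y


x_min, y_min = gradient_descent((0.0, 0.0))
print(x_min, y_min)
```

Because this function is convex, there is a single global minimum and no saddle points to get stuck on. Real loss surfaces in deep learning are not this friendly, which is why the other topics in the list (saddle points, mini-batches, stochastic gradient descent) matter.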
4. Statistics & Probability Concepts You Need to Know for AI
This topic will probably take up a significant chunk of your time. Good news: these concepts aren’t difficult, so there’s no reason why you shouldn’t master them.
- Basic Statistics — Mean, median, mode, variance, covariance, and so on.
- Basic rules in probability — events (dependent and independent), sample spaces, conditional probability.
- Random variables — continuous and discrete, expectation, variance, distributions (joint and conditional).
- Bayes’ Theorem — updates the probability of a hypothesis as new evidence arrives. Bayesian methods help machines recognize patterns and make decisions.
- Maximum Likelihood Estimation (MLE) — parameter estimation. Requires knowledge of fundamental probability concepts (joint probability and independence of events).
- Common Distributions — Binomial, Poisson, Bernoulli, Gaussian, Exponential.
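Two of these ideas are easy to try out in a few lines. The sketch below first applies Bayes' Theorem to a hypothetical diagnostic test, then finds the maximum likelihood estimate for a biased coin. All of the numbers (the 1% prior, the test accuracy, the list of coin flips) are invented for illustration, and the function names are my own.

```python
import math


def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | positive evidence) via Bayes' Theorem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical test: 1% of people have the condition, the test catches
# 95% of true cases, and it wrongly flags 5% of healthy people.
p_condition = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(p_condition)


def bernoulli_log_likelihood(p, flips):
    """Log-probability of independent coin flips given P(heads) = p."""
    return sum(math.log(p) if flip == 1 else math.log(1 - p) for flip in flips)


# Made-up data: 7 heads (1) and 3 tails (0).
flips = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# For a Bernoulli variable the MLE has a closed form: the sample mean.
p_hat = sum(flips) / len(flips)

# Cross-check by searching a grid of candidate values for p.
candidates = [i / 100 for i in range(1, 100)]
best = max(candidates, key=lambda p: bernoulli_log_likelihood(p, flips))
print(p_hat, best)
```

Even with a positive result, the posterior probability stays well below 50% in this example, because the condition is rare to begin with. The grid search lands on the same value as the sample mean, which is what the closed-form MLE predicts.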
5. Information Theory Concepts You Need to Know for AI
This is an important field that has made significant contributions to AI and Deep Learning, yet it is still unknown to many. Think of it as an amalgamation of calculus, statistics, and probability.
- Entropy — also called Shannon Entropy. Used to measure the uncertainty in an experiment.
- Cross-Entropy — compares two probability distributions and tells us how similar they are.
- Kullback-Leibler Divergence — a measure of how much one probability distribution differs from another. Note that it is not symmetric.
- Viterbi Algorithm — widely used in Natural Language Processing (NLP) and Speech.
- Encoder-Decoder — used in Machine Translation RNNs and other models.
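The first three items in this list are closely related, and a short sketch makes the connection visible. The code below computes entropy, cross-entropy, and Kullback-Leibler divergence in bits for two made-up coin distributions. The function names and distributions are my own, and the code assumes that `q` is nonzero wherever `p` is nonzero.

```python
import math


def entropy(p):
    """Shannon entropy of distribution p, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Average bits needed to encode samples from p using a code built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Extra bits paid for using q in place of the true distribution p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


fair = [0.5, 0.5]     # a fair coin: maximum uncertainty for two outcomes
biased = [0.9, 0.1]   # a heavily biased coin: much more predictable

print(entropy(fair), entropy(biased))
print(cross_entropy(fair, biased))
print(kl_divergence(fair, biased), kl_divergence(biased, fair))
```

The fair coin has the higher entropy because its outcome is harder to predict. Cross-entropy equals entropy plus KL divergence, so minimizing cross-entropy against fixed true labels (as classification losses do) is the same as minimizing the KL divergence. Swapping the arguments of `kl_divergence` gives a different number, which is why it is not a true distance.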
Math is Fun!
If you are terrified at the mere mention of “math”, you’re probably not going to have much fun in Artificial Intelligence.
But if you’re willing to invest time to improve your familiarity with the principles underlying calculus, linear algebra, stats, and probability, nothing — not even math — should get in the way of you getting into AI.
PS: Math really is fun. As you go deeper, take the time to appreciate the beauty of each concept and to see how it is applied in practice. You’ll soon share the unbridled passion that many mathematicians and AI Scientists have!
A tip: Treat mathematical concepts as pay-as-you-go: whenever a foreign concept pops up, grab it and devour it! The guide above is a minimal, yet comprehensive, starting point for understanding the topics and concepts you will meet in AI.
Be sure to follow me on Twitter for updates on future articles. Happy learning!