NumPy (pronounced "numb pie") is one of the most important packages to grasp when you’re starting to learn Python.
The package is known for a very useful data structure called the NumPy array. NumPy also allows Python developers to quickly perform a wide variety of numerical computations.
This tutorial will teach you the fundamentals of NumPy that you can use to build numerical Python applications today.
Table of Contents
You can skip to a specific section of this NumPy tutorial using the table of contents below:
- Introduction to NumPy
- NumPy Arrays
- NumPy Methods and Operations
- NumPy Indexing and Assignment
- Final Thoughts & Special Offer
Introduction to NumPy
In this section, we will introduce the NumPy library in Python.
What is NumPy?
NumPy is a Python library for scientific computing. NumPy stands for Numerical Python. Here is the official description of the library from its website:
“NumPy is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
NumPy is licensed under the BSD license, enabling reuse with few restrictions.”
NumPy is such an important Python library that there are other libraries (including pandas) that are built entirely on NumPy.
The Main Benefit of NumPy
The main benefit of NumPy is that it allows for extremely fast data generation and handling. NumPy has its own built-in data structure called an array, which is similar to the normal Python list but can store and operate on data much more efficiently.
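To make the efficiency claim concrete, here is a minimal sketch that doubles every number in a plain list and in a NumPy array. The variable names and the array size are illustrative assumptions, and the timings depend on your machine, so no figures are quoted here.

```python
import timeit

import numpy as np

n = 100_000  # illustrative size, chosen arbitrarily
python_list = list(range(n))
numpy_array = np.arange(n)

# Double every element: a Python-level loop versus one vectorized operation
doubled_list = [2 * x for x in python_list]
doubled_array = 2 * numpy_array

# Time each approach over 5 runs; results vary by machine
list_seconds = timeit.timeit(lambda: [2 * x for x in python_list], number=5)
array_seconds = timeit.timeit(lambda: 2 * numpy_array, number=5)
print(f"list: {list_seconds:.4f}s, array: {array_seconds:.4f}s")
```

Both approaches produce the same values; only the time taken differs.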
What We Will Learn About NumPy
Advanced Python practitioners will spend much more time working with pandas than they spend working with NumPy. Still, given that pandas is built on NumPy, it is important to understand the most important aspects of the NumPy library.
Over the next several sections, we will cover the following information about the NumPy library:
- NumPy Arrays
- NumPy Methods and Operations
- NumPy Indexing and Assignment
Moving On
Let’s move on to learning about NumPy arrays, the core data structure that every NumPy practitioner must be familiar with.
NumPy Arrays
In this section, we will be learning about NumPy arrays.
What Are NumPy Arrays?
NumPy arrays are the main way to store data using the NumPy library. They are similar to normal lists in Python, but have the advantage of being faster and having more built-in methods.
NumPy arrays are created by calling the array() function from the NumPy library. You pass a list (or another sequence) to the function as its argument.
An example of a basic NumPy array is shown below. Note that while I run the import numpy as np statement at the start of this code block, it will be excluded from the other code blocks in this section for brevity’s sake.
import numpy as np
sample_list = [1, 2, 3]
np.array(sample_list)
The last line of that code block will result in an output that looks like this.
array([1, 2, 3])
The array() wrapper indicates that this is no longer a normal Python list. Instead, it is a NumPy array.
The Two Different Types of NumPy Arrays
There are two different types of NumPy arrays: vectors and matrices.
Vectors are one-dimensional NumPy arrays, and look like this:
my_vector = np.array(['this', 'is', 'a', 'vector'])
Matrices are two-dimensional arrays and are created by passing a list of lists into the np.array() function. An example is below.
my_matrix = [[1, 2, 3],[4, 5, 6],[7, 8, 9]]
np.array(my_matrix)
You can also expand NumPy arrays to deal with three-, four-, five-, six- or higher-dimensional arrays, but they are rare and largely outside the scope of this course (after all, this is a course on Python programming, not linear algebra).
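As a quick illustration of how dimensions work, the sketch below builds a vector, a matrix, and a three-dimensional array, then checks each one with the ndim and shape attributes. The variable names and values are assumptions chosen for this example.

```python
import numpy as np

my_vector = np.array([1, 2, 3])
my_matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# A three-dimensional array: two stacked 2x2 matrices (illustrative values)
my_cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(my_vector.ndim, my_vector.shape)  # 1 (3,)
print(my_matrix.ndim, my_matrix.shape)  # 2 (3, 3)
print(my_cube.ndim, my_cube.shape)      # 3 (2, 2, 2)
```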
NumPy Arrays: Built-In Methods
NumPy arrays come with a number of useful built-in methods. We will spend the rest of this section discussing these methods in detail.
How To Get A Range Of Numbers in Python Using NumPy
NumPy has a useful function called arange that takes in two numbers and gives you an array of integers that are greater than or equal to (>=) the first number and less than (<) the second number.
An example of the arange function is below.
np.arange(0,5)
#Returns array([0, 1, 2, 3, 4])
You can also include a third argument in the arange function that provides a step size. Passing in 2 as the third argument will return every 2nd number in the range, passing in 5 as the third argument will return every 5th number in the range, and so on.
An example of using the third argument in the arange function is below.
np.arange(1,11,2)
#Returns array([1, 3, 5, 7, 9])
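Two further variations on arange may be worth knowing: calling it with a single argument, and counting downwards with a negative step. The sketch below uses illustrative variable names.

```python
import numpy as np

# With a single argument, arange starts counting at 0
first_five = np.arange(5)
# Returns array([0, 1, 2, 3, 4])

# A negative step counts downwards; the end value is still excluded
countdown = np.arange(10, 0, -3)
# Returns array([10,  7,  4,  1])
```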
How To Generate Ones and Zeros in Python Using NumPy
While programming, you will from time to time need to create arrays of ones or zeros. NumPy has built-in methods that allow you to do either of these.
We can create arrays of zeros using NumPy’s zeros function. You pass in the number of elements you’d like to create as the argument of the function. Note that the values are floating-point numbers by default, which is why they are printed with a trailing decimal point. An example is below.
np.zeros(4)
#Returns array([0., 0., 0., 0.])
You can also do something similar with two-dimensional arrays by passing the shape as a tuple. For example, np.zeros((5, 5)) creates a 5x5 matrix that contains all zeros.
We can create arrays of ones using a similar function named ones. An example is below.
np.ones(5)
#Returns array([1., 1., 1., 1., 1.])
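The following sketch shows two details of zeros and ones that are easy to trip over: the shape of a multi-dimensional array is passed as one tuple, and the optional dtype argument changes the element type. The variable names are illustrative.

```python
import numpy as np

# The shape is passed as a single tuple: 2 rows, 3 columns
zero_matrix = np.zeros((2, 3))
# Returns array([[0., 0., 0.],
#                [0., 0., 0.]])

# dtype=int requests integers instead of the default floats
integer_ones = np.ones(4, dtype=int)
# Returns array([1, 1, 1, 1])
```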
How To Evenly Divide A Range Of Numbers In Python Using NumPy
There are many situations in which you have a range of numbers and you would like to equally divide that range of numbers into intervals. NumPy’s linspace function is designed to solve this problem. linspace takes in three arguments:
- The start of the interval
- The end of the interval (included in the output by default)
- The number of evenly spaced points to return, which is one more than the number of subintervals
An example of the linspace function is below. Dividing the interval from 0 to 1 into 10 subintervals requires 11 points.
np.linspace(0, 1, 11)
#Returns array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
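Because the third argument of linspace counts points rather than gaps, a smaller example can help. This sketch, with illustrative variable names, also shows the optional endpoint argument, which leaves the end value out.

```python
import numpy as np

# 5 points from 0 to 1 give 4 equal subintervals of width 0.25
points = np.linspace(0, 1, 5)
# Returns array([0.  , 0.25, 0.5 , 0.75, 1.  ])

# endpoint=False leaves out the end value, so 4 points are spaced 0.25 apart
open_points = np.linspace(0, 1, 4, endpoint=False)
# Returns array([0.  , 0.25, 0.5 , 0.75])
```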
How To Create An Identity Matrix In Python Using NumPy
Anyone who has studied linear algebra will be familiar with the concept of an ‘identity matrix’, which is a square matrix whose diagonal values are all 1 and whose other values are all 0. NumPy has a built-in function that takes in one argument, the number of rows, for building identity matrices. The function is eye.
Examples are below:
np.eye(1)
#Returns a 1x1 identity matrix
np.eye(2)
#Returns a 2x2 identity matrix
np.eye(50)
#Returns a 50x50 identity matrix
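To see what eye actually returns and why the identity matrix is useful, here is a small sketch. It relies on the @ operator, which performs matrix multiplication on NumPy arrays; the sample matrix is an assumption made up for this example.

```python
import numpy as np

identity = np.eye(2)
# Returns array([[1., 0.],
#                [0., 1.]])

# Multiplying a matrix by the identity matrix leaves it unchanged
sample_matrix = np.array([[2.0, 3.0], [4.0, 5.0]])  # illustrative values
product = sample_matrix @ identity
print(np.array_equal(product, sample_matrix))  # True
```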
How To Create Random Numbers in Python Using NumPy
NumPy has a number of built-in functions that allow you to create arrays of random numbers. Each of these functions lives in the np.random module. A few examples are below:
np.random.rand(sample_size)
#Returns a sample of random numbers between 0 and 1, drawn from the uniform distribution.
#Sample size can either be one integer (for a one-dimensional array) or two integers separated by commas (for a two-dimensional array).
np.random.randn(sample_size)
#Returns a sample of random numbers drawn from the standard normal distribution (mean 0, standard deviation 1). These values are not limited to the range between 0 and 1.
#Sample size can either be one integer (for a one-dimensional array) or two integers separated by commas (for a two-dimensional array).
np.random.randint(low, high, sample_size)
#Returns a sample of integers that are greater than or equal to 'low' and less than 'high'
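The placeholders above can be filled in as follows. This is only a sketch with made-up sizes and variable names; the call to np.random.seed is optional and is included so that repeated runs produce the same numbers.

```python
import numpy as np

np.random.seed(42)  # optional: fixes the sequence so reruns give the same numbers

uniform_sample = np.random.rand(3)        # 3 values in [0, 1)
uniform_matrix = np.random.rand(2, 4)     # 2x4 array of values in [0, 1)
normal_sample = np.random.randn(5)        # 5 draws from the standard normal
dice_rolls = np.random.randint(1, 7, 10)  # 10 integers from 1 to 6

print(uniform_matrix.shape)  # (2, 4)
```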
How To Reshape NumPy Arrays
It is very common to take an array with certain dimensions and transform that array into a different shape. For example, you might have a one-dimensional array with 10 elements and want to switch it to a 2x5 two-dimensional array.
An example is below:
arr = np.array([0,1,2,3,4,5])
arr.reshape(2,3)
The output of this operation is:
array([[0, 1, 2],
[3, 4, 5]])
Note that in order to use the reshape method, the original array must have the same number of elements as the array that you’re trying to reshape it into.
If you’re curious about the current shape of a NumPy array, you can determine its shape using NumPy’s shape attribute. Using our previous arr variable structure, an example of how to call the shape attribute is below:
arr = np.array([0,1,2,3,4,5])
arr.shape
#Returns (6,) - note that there is no second element since it is a one-dimensional array
arr = arr.reshape(2,3)
arr.shape
#Returns (2,3)
You can also combine the reshape method with the shape attribute on one line like this:
arr.reshape(2,3).shape
#Returns (2,3)
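The rule that the element counts must match can be checked directly. The sketch below also uses -1 as a dimension, which asks NumPy to work out that dimension for you; the variable names are illustrative.

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5])

# -1 asks NumPy to work out that dimension from the number of elements
inferred = arr.reshape(3, -1)
print(inferred.shape)  # (3, 2)

# 6 elements cannot fill a 4x2 array, so NumPy raises a ValueError
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```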
How To Find The Maximum and Minimum Value Of A NumPy Array
To conclude this section, let’s learn about four useful methods for identifying the maximum and minimum values within a NumPy array. We’ll be working with this array:
simple_array = np.array([1, 2, 3, 4])
We can use the max method to find the maximum value of a NumPy array. An example is below.
simple_array.max()
#Returns 4
We can also use the argmax method to find the index of the maximum value within a NumPy array. This is useful for when you want to find the location of the maximum value but you do not necessarily care what its value is.
An example is below.
simple_array.argmax()
#Returns 3
Similarly, we can use the min and argmin methods to find the value and index of the minimum value within a NumPy array.
simple_array.min()
#Returns 1
simple_array.argmin()
#Returns 0
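The same methods also work on two-dimensional arrays, where the optional axis argument picks whether to search down the columns or along the rows. The array below is an assumption made up for this sketch.

```python
import numpy as np

scores = np.array([[1, 9, 3],
                   [7, 2, 8]])  # illustrative values

scores.max()
# Returns 9, the largest value in the whole array

scores.max(axis=0)
# Returns array([7, 9, 8]), the maximum of each column

scores.max(axis=1)
# Returns array([9, 8]), the maximum of each row

scores.argmax()
# Returns 1, the position of 9 when the array is read row by row
```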
Moving On
In this section, we discussed various attributes and methods of NumPy arrays. We will follow up by working through some NumPy array practice problems in the next section.
NumPy Methods and Operations
In this section, we will be working through various operations included in the NumPy library.
Throughout this section, we will be assuming that the import numpy as np command has already been run.
The Array Used In This Section
For this section, I will be working with an array of length 4 created using np.arange in all of the examples.
If you’d like to compare my array with the outputs used in this section, here is how I created and printed the array:
arr = np.arange(4)
arr
The array values are below.
array([0, 1, 2, 3])
How To Perform Arithmetic In Python Using NumPy
NumPy makes it very easy to perform arithmetic with arrays. You can either perform arithmetic using the array and a single number, or you can perform arithmetic between two NumPy arrays.
We explore each of the major mathematical operations below.
Addition
When adding a single number to a NumPy array, that number is added to each element in the array. An example is below:
2 + arr
#Returns array([2, 3, 4, 5])
You can add two NumPy arrays using the + operator. The arrays are added on an element-by-element basis (meaning the first elements are added together, the second elements are added together, and so on).
An example is below.
arr + arr
#Returns array([0, 2, 4, 6])
Subtraction
Like addition, subtraction is performed on an element-by-element basis for NumPy arrays. You can find examples for both a single number and another NumPy array below.
arr - 10
#Returns array([-10, -9, -8, -7])
arr - arr
#Returns array([0, 0, 0, 0])
Multiplication
Multiplication is also performed on an element-by-element basis for both single numbers and NumPy arrays.
Two examples are below.
6 * arr
#Returns array([ 0, 6, 12, 18])
arr * arr
#Returns array([0, 1, 4, 9])
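The examples so far combine arr with itself, which can hide what element-by-element really means. Here is a sketch using a second array of the same length; the name other and its values are assumptions for illustration.

```python
import numpy as np

arr = np.arange(4)                  # array([0, 1, 2, 3])
other = np.array([10, 20, 30, 40])  # illustrative values

arr + other
# Returns array([10, 21, 32, 43])

other - arr
# Returns array([10, 19, 28, 37])

arr * other
# Returns array([  0,  20,  60, 120])
```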
Division
By this point, you’re probably not surprised to learn that division performed on NumPy arrays is done on an element-by-element basis. An example of dividing arr by a single number is below:
arr / 2
#Returns array([0. , 0.5, 1. , 1.5])
Division does have one notable exception compared to the other mathematical operations we have seen in this section. Dividing by zero does not raise an error in NumPy. Instead, dividing zero by zero populates the corresponding field with a nan value, which is shorthand for “Not A Number”, while dividing a nonzero number by zero produces inf (infinity). Jupyter Notebook will also print a warning that looks similar to this (the exact wording depends on your NumPy version):
RuntimeWarning: invalid value encountered in true_divide
An example of dividing by zero with a NumPy array is shown below.
arr / arr
#Returns array([nan, 1., 1., 1.])
We will learn how to deal with nan values in more detail later in this course.
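In the meantime, the sketch below shows one way to spot the results of dividing by zero. It uses np.errstate to silence the warnings inside a single block, then np.isnan and np.isinf to locate the affected elements. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(4)

# errstate silences the division warnings inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    zero_over_zero = arr / arr  # first element is 0 / 0
    one_over_arr = 1 / arr      # first element is 1 / 0

np.isnan(zero_over_zero)
# Returns array([ True, False, False, False])

np.isinf(one_over_arr)
# Returns array([ True, False, False, False])
```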
Complex Operations in NumPy Arrays
Many operations cannot simply be performed by applying the normal syntax to a NumPy array. In this section, we will explore several mathematical operations that have built-in methods in the NumPy library.
How To Calculate Square Roots Using NumPy
You can calculate the square root of every element in an array using the np.sqrt function:
np.sqrt(arr)
#Returns array([0., 1., 1.41421356, 1.73205081])
Many other examples are below (note that you will not be tested on these, but it is still useful to see the capabilities of NumPy):
np.exp(arr)
#Returns e^element for every element in the array
np.sin(arr)
#Calculate the trigonometric sine of every value in the array
np.cos(arr)
#Calculate the trigonometric cosine of every value in the array
np.log(arr)
#Calculate the natural (base-e) logarithm of every value in the array; use np.log10 for the base-ten logarithm
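Since the two logarithm functions are easy to mix up, here is a short sketch comparing np.log (natural logarithm) with np.log10 (base ten). The input values are assumptions chosen so the results are round numbers.

```python
import numpy as np

powers_of_ten = np.array([1.0, 10.0, 100.0])  # illustrative values

base_ten = np.log10(powers_of_ten)
# Returns array([0., 1., 2.])

# np.log is the natural logarithm, so log(e) is 1
natural = np.log(np.array([1.0, np.e]))
# Returns array([0., 1.])
```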
Moving On
In this section, we explored the various methods and operations available in the NumPy Python library. We will test your knowledge of these concepts in the practice problems presented next.
NumPy Indexing and Assignment
In this section, we will explore indexing and assignment in NumPy arrays.
The Array I’ll Be Using In This Section
As before, I will be using a specific array throughout this section. This time it will be generated using the np.random.rand function. Here’s how I generated the array:
arr = np.random.rand(5)
Here is the actual array:
array([0.69292946, 0.9365295 , 0.65682359, 0.72770856, 0.83268616])
To make this array easier to look at, I will round every element of the array to 2 decimal places using NumPy’s round function:
arr = np.round(arr, 2)
Here’s the new array:
array([0.69, 0.94, 0.66, 0.73, 0.83])
How To Return A Specific Element From A NumPy Array
We can select (and return) a specific element from a NumPy array in the same way that we could using a normal Python list: using square brackets.
An example is below:
arr[0]
#Returns 0.69
We can also reference multiple elements of a NumPy array using the colon operator. For example, the index [2:] selects every element from index 2 onwards. The index [:3] selects every element up to and excluding index 3. The index [2:4] returns every element from index 2 to index 4, excluding index 4. The higher endpoint is always excluded.
A few examples of indexing using the colon operator are below.
arr[:]
#Returns the entire array: array([0.69, 0.94, 0.66, 0.73, 0.83])
arr[1:]
#Returns array([0.94, 0.66, 0.73, 0.83])
arr[1:4]
#Returns array([0.94, 0.66, 0.73])
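As with Python lists, negative indices count from the end of the array, and a second colon adds a step. The sketch below re-creates the rounded array from this section so that it runs on its own.

```python
import numpy as np

arr = np.array([0.69, 0.94, 0.66, 0.73, 0.83])

arr[-1]
# Returns 0.83, the last element

arr[-2:]
# Returns array([0.73, 0.83]), the last two elements

arr[::2]
# Returns array([0.69, 0.66, 0.83]), every second element
```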
Element Assignment in NumPy Arrays
We can assign new values to an element of a NumPy array using the = operator, just like regular Python lists. A few examples are below (note that this is all one code block, which means that the element assignments are carried forward from step to step).
arr[0] = 0.12
arr
#Returns array([0.12, 0.94, 0.66, 0.73, 0.83])
arr[:] = 0
arr
#Returns array([0., 0., 0., 0., 0.])
arr[2:5] = 0.5
arr
#Returns array([0. , 0. , 0.5, 0.5, 0.5])
Array Referencing in NumPy
NumPy makes use of a concept called ‘array referencing’ which is a very common source of confusion for people that are new to the library.
To understand array referencing, let’s first consider an example:
new_array = np.array([6, 7, 8, 9])
second_new_array = new_array[0:2]
second_new_array
#Returns array([6, 7])
second_new_array[1] = 4
second_new_array
#Returns array([6, 4]), as expected
new_array
#Returns array([6, 4, 8, 9])
#which is DIFFERENT from its original value of array([6, 7, 8, 9])
#What the heck?
As you can see, modifying second_new_array also changed the value of new_array.
Why is this?
By default, NumPy does not create a copy of an array when you slice it and assign the slice to a new variable using the = assignment operator. Instead, the new variable is a view onto the same underlying data as the original array, which allows the second variable to modify the original variable, even if this is not your intention.
This may seem bizarre, but it does have a logical explanation. The purpose of array referencing is to conserve memory and computing time. When working with large data sets, you would quickly run out of RAM if you created a new array every time you wanted to work with a slice of the array.
Fortunately, there is a workaround to array referencing. You can use the copy method to explicitly copy a NumPy array.
An example of this is below.
array_to_copy = np.array([1, 2, 3])
copied_array = array_to_copy.copy()
array_to_copy
#Returns array([1, 2, 3])
copied_array
#Returns array([1, 2, 3])
As you can see below, making modifications to the copied array does not alter the original.
copied_array[0] = 9
copied_array
#Returns array([9, 2, 3])
array_to_copy
#Returns array([1, 2, 3])
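If you are ever unsure whether two variables refer to the same data, NumPy's np.shares_memory function can tell you. The sketch below, with illustrative variable names, compares a plain slice against an explicit copy.

```python
import numpy as np

original = np.array([6, 7, 8, 9])
sliced_view = original[0:2]              # a view onto the same data
independent_copy = original[0:2].copy()  # a separate copy

np.shares_memory(original, sliced_view)
# Returns True

np.shares_memory(original, independent_copy)
# Returns False
```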
So far in the section, we have only explored how to reference one-dimensional NumPy arrays. We will now explore the indexing of two-dimensional arrays.
Indexing Two-Dimensional NumPy Arrays
To start, let’s create a two-dimensional NumPy array named mat:
mat = np.array([[5, 10, 15],[20, 25, 30],[35, 40, 45]])
mat
"""
Returns:
array([[ 5, 10, 15],
[20, 25, 30],
[35, 40, 45]])
"""
There are two ways to index a two-dimensional NumPy array:
mat[row, col]
mat[row][col]
I personally prefer to index using the mat[row][col] nomenclature because it is easier to visualize in a step-by-step fashion. For example:
#First, let's get the first row:
mat[0]
#Next, let's get the last element of the first row:
mat[0][-1]
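You can also generate sub-matrices from a two-dimensional NumPy array by chaining slices with this notation. Note that each slice in the chain is applied to the rows of the previous result, so the second slice below selects rows, not columns. To slice rows and columns at the same time, use the mat[row, col] notation with slices instead, such as mat[1:, :2].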
mat[1:][:2]
"""
Returns:
array([[20, 25, 30],
[35, 40, 45]])
"""
Array referencing also applies to two-dimensional arrays in NumPy, so be sure to use the copy method if you want to avoid inadvertently modifying an original array after saving a slice of it into a new variable name.
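Returning to sub-matrices for a moment, the two indexing styles behave differently once slices are involved. This sketch contrasts mat[1:, :2], which slices rows and columns together, with the chained form mat[1:][:2], which slices rows twice.

```python
import numpy as np

mat = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Rows from index 1 onwards, columns up to (excluding) index 2
row_and_column_slice = mat[1:, :2]
# Returns array([[20, 25],
#                [35, 40]])

# Chained slices both act on rows: [:2] keeps the first two of the remaining rows
chained_slice = mat[1:][:2]
# Returns array([[20, 25, 30],
#                [35, 40, 45]])
```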
Conditional Selection Using NumPy Arrays
NumPy arrays support a feature called conditional selection, which allows you to generate a new array of boolean values that state whether each element within the array satisfies a particular condition.
An example of this is below (I also re-created our original arr variable since it’s been a while since we’ve seen it):
arr = np.array([0.69, 0.94, 0.66, 0.73, 0.83])
arr > 0.7
#Returns array([False, True, False, True, True])
You can also generate a new array of values that satisfy this condition by passing the condition into the square brackets (just like we do for indexing).
An example of this is below:
arr[arr > 0.7]
#Returns array([0.94, 0.73, 0.83])
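Conditions can also be combined. As a minimal sketch, the & operator means "and" and the | operator means "or"; each condition needs its own parentheses. The thresholds and variable names below are arbitrary choices for illustration.

```python
import numpy as np

arr = np.array([0.69, 0.94, 0.66, 0.73, 0.83])

# & means "and": keep values greater than 0.7 and less than 0.9
between = arr[(arr > 0.7) & (arr < 0.9)]
# Returns array([0.73, 0.83])

# | means "or": keep values less than 0.68 or greater than 0.9
outside = arr[(arr < 0.68) | (arr > 0.9)]
# Returns array([0.94, 0.66])
```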
Conditional selection can become significantly more complex than this. We will explore more examples in this section’s associated practice problems.
Moving On
In this section, we explored NumPy array indexing and assignment in thorough detail. We will solidify your knowledge of these concepts further by working through a batch of practice problems in the next section.
Final Thoughts & Special Offer
Thanks for reading this article on NumPy, which is one of my favorite Python packages and a must-know library for every Python developer.
This tutorial is an excerpt from my course Python For Finance and Data Science. If you're interested in learning more core Python skills, the course is 50% off for the first 50 freeCodeCamp readers that sign up - click here to get your discounted course now!