In operations between NumPy arrays (ndarray), the shapes of the operands are automatically converted to a common shape by broadcasting. This article describes the following:
np.broadcast_to()
np.broadcast_arrays()
For the full specification, see the official NumPy documentation on broadcasting.
Use reshape() or np.newaxis if you want to reshape an ndarray to any shape you want.
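For example, either of the following turns a 1D array of shape (3,) into a column of shape (3, 1); the array a here is just a throwaway illustration:
import numpy as np

a = np.arange(3)
print(a.reshape(3, 1).shape)
# (3, 1)
print(a[:, np.newaxis].shape)
# (3, 1)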
Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays:
Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
Note that the number of dimensions of an ndarray can be obtained with the ndim attribute and its shape with the shape attribute.
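For instance (arr is just an arbitrary example array):
arr = np.zeros((2, 3))
print(arr.ndim)
# 2
print(arr.shape)
# (2, 3)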
To make these rules clear, let’s consider a few examples in detail.
Let’s look at adding a two-dimensional array to a one-dimensional array:
M = np.ones((2, 3))
a = np.arange(3)
Let's consider an operation on these two arrays. The shapes of the arrays are:
M.shape = (2, 3)
a.shape = (3,)
We see by rule 1 that the array a has fewer dimensions, so we pad it on the left with ones:
M.shape -> (2, 3)
a.shape -> (1, 3)
By rule 2, we now see that the first dimension disagrees, so we stretch this dimension to match:
M.shape -> (2, 3)
a.shape -> (2, 3)
The shapes match, and we see that the final shape will be (2, 3):
M + a
array([[ 1., 2., 3.],
       [ 1., 2., 3.]])
Let’s take a look at an example where both arrays need to be broadcast:
a = np.arange(3).reshape((3, 1))
b = np.arange(3)
Again, we’ll start by writing out the shape of the arrays:
a.shape = (3, 1)
b.shape = (3,)
Rule 1 says we must pad the shape of b with ones:
a.shape -> (3, 1)
b.shape -> (1, 3)
And rule 2 tells us that we upgrade each of these ones to match the corresponding size of the other array:
a.shape -> (3, 3)
b.shape -> (3, 3)
Because the result matches, these shapes are compatible. We can see this here:
a + b
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
Now let’s take a look at an example in which the two arrays are not compatible:
M = np.ones((3, 2))
a = np.arange(3)
This is just a slightly different situation than in the first example: the matrix M is transposed. How does this affect the calculation? The shapes of the arrays are:
M.shape = (3, 2)
a.shape = (3,)
Again, rule 1 tells us that we must pad the shape of a with ones:
M.shape -> (3, 2)
a.shape -> (1, 3)
By rule 2, the first dimension of a is stretched to match that of M:
M.shape -> (3, 2)
a.shape -> (3, 3)
Now we hit rule 3: the final shapes do not match, so these two arrays are incompatible, as we can observe by attempting this operation:
M + a
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-9e16e9f98da6> in <module>()
----> 1 M + a

ValueError: operands could not be broadcast together with shapes (3,2) (3,)
Note the potential confusion here: you could imagine making a and M compatible by, say, padding a's shape with ones on the right rather than the left. But this is not how the broadcasting rules work! That sort of flexibility might be useful in some cases, but it would lead to potential areas of ambiguity. If right-side padding is what you'd like, you can do this explicitly by reshaping the array (we'll use the np.newaxis keyword):
a[:, np.newaxis].shape
(3, 1)
M + a[:, np.newaxis]
array([[ 1., 1.],
       [ 2., 2.],
       [ 3., 3.]])
Also, note that while we've been focusing on the + operator here, these broadcasting rules apply to any binary ufunc. For example, here is the logaddexp(a, b) function, which computes log(exp(a) + exp(b)) with more precision than the naive approach:
np.logaddexp(M, a[:, np.newaxis])
array([[ 1.31326169, 1.31326169],
       [ 1.69314718, 1.69314718],
       [ 2.31326169, 2.31326169]])
We’ll now take a look at a couple of simple examples of where broadcasting can be useful.
We know that ufuncs allow NumPy users to remove the need to explicitly write slow Python loops. Broadcasting extends this ability. One commonly seen example is centering an array of data. Imagine you have an array of 10 observations, each of which consists of 3 values. Using the standard convention, we'll store this in a 10×3 array:
X = np.random.random((10, 3))
We can compute the mean of each feature using the mean aggregate across the first dimension:
Xmean = X.mean(0)
Xmean
array([ 0.53514715, 0.66567217, 0.44385899])
And now we can center the X array by subtracting the mean (this is a broadcasting operation):
X_centered = X - Xmean
To double-check that we’ve done this correctly, we can check that the centered array has near-zero mean:
X_centered.mean(0)
array([ 2.22044605e-17, -7.77156117e-17, -1.66533454e-17])
To within machine precision, the mean is now zero.
One place that broadcasting is very useful is in displaying images based on two-dimensional functions. If we want to define a function z=f(x,y), broadcasting can be used to compute the function across the grid:
# x and y have 50 steps from 0 to 5
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 50)[:, np.newaxis]
z = np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
We’ll use Matplotlib to plot this two-dimensional array:
%matplotlib inline
import matplotlib.pyplot as plt
plt.imshow(z, origin='lower', extent=[0, 5, 0, 5],
           cmap='viridis')
plt.colorbar();
The result is a compelling visualization of the two-dimensional function.
Next, let's trace exactly how arrays are transformed by broadcasting, step by step. The following 2D and 1D arrays are used as examples. To make it easier to understand the result of the broadcast, one of them uses zeros() to set all the elements to 0.
import numpy as np
a = np.zeros((3, 3), dtype=int)
print(a)
# [[0 0 0]
# [0 0 0]
# [0 0 0]]
print(a.shape)
# (3, 3)
b = np.arange(3)
print(b)
# [0 1 2]
print(b.shape)
# (3,)
The shape of the 1D array is (3,) instead of (3) because a tuple with one element is written with a trailing comma.
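This is plain Python behavior, not something specific to NumPy:
print(type((3)))
# <class 'int'>
print(type((3,)))
# <class 'tuple'>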
The result of adding these two ndarrays is as follows.
print(a + b)
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
Let's transform the array with the smaller number of dimensions (the 1D array b) according to the rules described above.
First, according to rule 1, the array is transformed from shape (3,) to (1, 3) by adding a new dimension of size 1 at the head. The reshape() method is used here.
b_1_3 = b.reshape(1, 3)
print(b_1_3)
# [[0 1 2]]
print(b_1_3.shape)
# (1, 3)
Next, the size of each dimension is stretched according to rule 2. The array is stretched from (1, 3) to (3, 3). The stretched part is a copy of the original part. np.tile() is used here.
print(np.tile(b_1_3, (3, 1)))
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
Note that reshape() and np.tile() are used here for the sake of explanation, but if you want to get the broadcasted array itself, there are the functions np.broadcast_to() and np.broadcast_arrays() for that purpose. See below.
The result of addition with the 2D array of shape (3, 1) is as follows.
b_3_1 = b.reshape(3, 1)
print(b_3_1)
# [[0]
# [1]
# [2]]
print(b_3_1.shape)
# (3, 1)
print(a + b_3_1)
# [[0 0 0]
# [1 1 1]
# [2 2 2]]
In this case, since the number of dimensions is already the same, the array is stretched from (3, 1) to (3, 3) according to rule 2.
print(np.tile(b_3_1, (1, 3)))
# [[0 0 0]
# [1 1 1]
# [2 2 2]]
In the previous examples, only one of the arrays is converted, but there are cases where both are converted by broadcasting.
The following is the result of adding arrays whose shapes are (1, 3) and (3, 1).
print(b_1_3)
# [[0 1 2]]
print(b_1_3.shape)
# (1, 3)
print(b_3_1)
# [[0]
# [1]
# [2]]
print(b_3_1.shape)
# (3, 1)
print(b_1_3 + b_3_1)
# [[0 1 2]
# [1 2 3]
# [2 3 4]]
Both (1, 3) and (3, 1) are stretched to (3, 3).
print(np.tile(b_1_3, (3, 1)))
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
print(np.tile(b_3_1, (1, 3)))
# [[0 0 0]
# [1 1 1]
# [2 2 2]]
print(np.tile(b_1_3, (3, 1)) + np.tile(b_3_1, (1, 3)))
# [[0 1 2]
# [1 2 3]
# [2 3 4]]
The same applies if one of them is a 1D array.
c = np.arange(4)
print(c)
# [0 1 2 3]
print(c.shape)
# (4,)
print(b_3_1)
# [[0]
# [1]
# [2]]
print(b_3_1.shape)
# (3, 1)
print(c + b_3_1)
# [[0 1 2 3]
# [1 2 3 4]
# [2 3 4 5]]
The 1D array is converted as (4,) -> (1, 4) -> (3, 4), and the 2D array as (3, 1) -> (3, 4).
print(np.tile(c.reshape(1, 4), (3, 1)))
# [[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]
print(np.tile(b_3_1, (1, 4)))
# [[0 0 0 0]
# [1 1 1 1]
# [2 2 2 2]]
print(np.tile(c.reshape(1, 4), (3, 1)) + np.tile(b_3_1, (1, 4)))
# [[0 1 2 3]
# [1 2 3 4]
# [2 3 4 5]]
Note that a dimension is stretched only when its original size is 1. Otherwise, it cannot be broadcasted, and an error is raised, as described below.
Rule 1 applies even if the difference in the number of dimensions is two or more.
Using 3D and 1D arrays as examples, the addition results are as follows:
a = np.zeros((2, 3, 4), dtype=int)
print(a)
# [[[0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]
#
# [[0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]]
print(a.shape)
# (2, 3, 4)
b = np.arange(4)
print(b)
# [0 1 2 3]
print(b.shape)
# (4,)
print(a + b)
# [[[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]
#
# [[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]]
The shape is changed as (4,) -> (1, 1, 4) -> (2, 3, 4).
b_1_1_4 = b.reshape(1, 1, 4)
print(b_1_1_4)
# [[[0 1 2 3]]]
print(np.tile(b_1_1_4, (2, 3, 1)))
# [[[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]
#
# [[0 1 2 3]
# [0 1 2 3]
# [0 1 2 3]]]
As mentioned above, a dimension is stretched only if its original size is 1. If the sizes of a dimension differ and neither of them is 1, the arrays cannot be broadcasted, and an error is raised.
a = np.zeros((4, 3), dtype=int)
print(a)
# [[0 0 0]
# [0 0 0]
# [0 0 0]
# [0 0 0]]
print(a.shape)
# (4, 3)
b = np.arange(6).reshape(2, 3)
print(b)
# [[0 1 2]
# [3 4 5]]
print(b.shape)
# (2, 3)
# print(a + b)
# ValueError: operands could not be broadcast together with shapes (4,3) (2,3)
The same applies to the following case.
a = np.zeros((2, 3, 4), dtype=int)
print(a)
# [[[0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]
#
# [[0 0 0 0]
# [0 0 0 0]
# [0 0 0 0]]]
print(a.shape)
# (2, 3, 4)
b = np.arange(3)
print(b)
# [0 1 2]
print(b.shape)
# (3,)
# print(a + b)
# ValueError: operands could not be broadcast together with shapes (2,3,4) (3,)
In this example, if a new dimension is added at the end, the array can be broadcasted.
b_3_1 = b.reshape(3, 1)
print(b_3_1)
# [[0]
# [1]
# [2]]
print(b_3_1.shape)
# (3, 1)
print(a + b_3_1)
# [[[0 0 0 0]
# [1 1 1 1]
# [2 2 2 2]]
#
# [[0 0 0 0]
# [1 1 1 1]
# [2 2 2 2]]]
It is easy to see whether arrays can be broadcasted or not by writing their shapes right-aligned.
NG
(2, 3, 4)
(      3)
OK
(2, 3, 4)
(   3, 1) -> (1, 3, 1) -> (2, 3, 4)
If the sizes are different when right-aligned and compared vertically, one of them must be 1 for broadcasting to work. For example, in the case of images, a color image is a 3D array whose shape is (height, width, 3) (the 3 corresponds to red, green, and blue), while a grayscale image is a 2D array whose shape is (height, width). If you want to compute the values of each color in a color image together with the values of a grayscale image, they cannot be broadcasted even if the height and width are the same. You need to add a dimension of size 1 to the end of the grayscale image with np.newaxis, np.expand_dims(), and so on, as shown in the sketch after the diagram below.
NG
(h, w, 3)
(   h, w)
OK
(h, w, 3)
(h, w, 1) -> (h, w, 3)
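As a minimal sketch of this situation, using tiny dummy arrays instead of real image data (the names color and gray are just for illustration):
color = np.ones((2, 2, 3))         # dummy color image: (height, width, 3)
gray = np.arange(4).reshape(2, 2)  # dummy grayscale image: (height, width)

# print(color * gray)
# ValueError: operands could not be broadcast together with shapes (2,2,3) (2,2)

gray_3d = gray[:, :, np.newaxis]   # or np.expand_dims(gray, -1)
print(gray_3d.shape)
# (2, 2, 1)

print((color * gray_3d).shape)
# (2, 2, 3)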
np.broadcast_to()
Use np.broadcast_to() to broadcast an ndarray to the specified shape.
The first argument is the original ndarray, and the second is a tuple or list indicating the shape. The broadcasted ndarray is returned.
a = np.arange(3)
print(a)
# [0 1 2]
print(a.shape)
# (3,)
print(np.broadcast_to(a, (3, 3)))
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
print(type(np.broadcast_to(a, (3, 3))))
# <class 'numpy.ndarray'>
An error occurs when specifying a shape that cannot be broadcasted.
# print(np.broadcast_to(a, (2, 2)))
# ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,) and requested shape (2,2)
np.broadcast_arrays()
Use np.broadcast_arrays() to broadcast multiple ndarrays together.
Specify multiple arrays separated by commas. A list of ndarray objects is returned.
a = np.arange(3)
print(a)
# [0 1 2]
print(a.shape)
# (3,)
b = np.arange(3).reshape(3, 1)
print(b)
# [[0]
# [1]
# [2]]
print(b.shape)
# (3, 1)
arrays = np.broadcast_arrays(a, b)
print(type(arrays))
# <class 'list'>
print(len(arrays))
# 2
print(arrays[0])
# [[0 1 2]
# [0 1 2]
# [0 1 2]]
print(arrays[1])
# [[0 0 0]
# [1 1 1]
# [2 2 2]]
print(type(arrays[0]))
# <class 'numpy.ndarray'>
An error occurs when specifying a combination of arrays that cannot be broadcasted.
c = np.zeros((2, 2))
print(c)
# [[0. 0.]
# [0. 0.]]
print(c.shape)
# (2, 2)
# arrays = np.broadcast_arrays(a, c)
# ValueError: shape mismatch: objects cannot be broadcast to a single shape
Finally, let's review the rules for broadcasting in NumPy.
1. A new dimension of size 1 is added to the head (left) of the array with the smaller number of dimensions.
2. Dimensions of size 1 are stretched to the size of the corresponding dimension of the other array.
3. If the sizes of a dimension differ and neither of them is 1, the arrays cannot be broadcasted, and an error is raised.
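If you only want to check the resulting shape without creating the arrays, np.broadcast_shapes() (available in NumPy 1.20 and later) applies these rules directly to shape tuples:
print(np.broadcast_shapes((2, 3, 4), (3, 1)))
# (2, 3, 4)

# np.broadcast_shapes((2, 3, 4), (3,))
# ValueError (the trailing sizes 4 and 3 differ and neither is 1)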
Resources:
https://note.nkmk.me/en/python-numpy-broadcasting/
https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html