File Input and Output with Arrays
NumPy is able to save and load data to and from disk either in text or binary format. In this section I only discuss NumPy’s built-in binary format, since most users will prefer pandas and other tools for loading text or tabular data.
np.save and np.load are the two workhorse functions for efficiently saving and loading array data on disk. Arrays are saved by default in an uncompressed raw binary format with file extension .npy:
In [1]: arr = np.arange(10)
In [2]: np.save('some_array', arr)
If the file path does not already end in .npy, the extension will be appended. The array on disk can then be loaded with np.load:
In [3]: np.load('some_array.npy')
Output: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
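The round trip above can be checked end to end with a short script. This is a minimal sketch, not part of the original session: the temporary directory and the names base, saved_path, extension_added, and round_trip_ok are illustrative assumptions chosen so that the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

# Work in a throwaway directory (an assumption made for this sketch).
with tempfile.TemporaryDirectory() as tmpdir:
    base = os.path.join(tmpdir, "some_array")  # note: no .npy suffix
    np.save(base, arr)

    # np.save appended the extension, so the file on disk is some_array.npy.
    saved_path = base + ".npy"
    extension_added = os.path.exists(saved_path)

    # A plain .npy file is read fully into memory by np.load.
    loaded = np.load(saved_path)

# The round trip should preserve values, dtype, and shape.
round_trip_ok = (
    np.array_equal(arr, loaded)
    and arr.dtype == loaded.dtype
    and arr.shape == loaded.shape
)
print(extension_added, round_trip_ok)
```

Because the dtype and shape are stored in the file header, nothing beyond the path has to be passed to np.load.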
You can save multiple arrays in an uncompressed archive using np.savez and passing the arrays as keyword arguments:
In [4]: np.savez('array_archive.npz', a=arr, b=arr)
When loading an .npz file, you get back a dict-like object that loads the individual arrays lazily:
In [5]: arch = np.load('array_archive.npz')
In [6]: arch['b']
Output: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
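To see which arrays an archive holds before reading any of them, the loaded object exposes a files attribute, and it can be used as a context manager so the underlying file is closed afterward. The sketch below assumes a temporary directory and stores arr * 2 under b (rather than arr itself) purely to make the two entries distinguishable.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "array_archive.npz")
    np.savez(path, a=arr, b=arr * 2)

    # The context manager closes the archive file when the block exits.
    with np.load(path) as arch:
        names = sorted(arch.files)  # the keyword names passed to np.savez
        b = arch["b"]               # this array is read only when indexed

print(names)
print(b)
```

Indexing with a name that is not in the archive raises a KeyError, much as it would with a dict.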
If your data compresses well, you may wish to use numpy.savez_compressed instead:
In [7]: np.savez_compressed('arrays_compressed.npz', a=arr, b=arr)
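Whether compression pays off depends on the data. As a rough illustration, the sketch below writes the same array with both functions and compares file sizes. The array of zeros is an assumption picked because it is highly repetitive; real data may compress far less, and compressed archives take longer to write and read.

```python
import os
import tempfile

import numpy as np

# An all-zeros array is an extreme, highly compressible case.
data = np.zeros(100_000)

with tempfile.TemporaryDirectory() as tmpdir:
    plain = os.path.join(tmpdir, "arrays_plain.npz")
    packed = os.path.join(tmpdir, "arrays_compressed.npz")

    np.savez(plain, a=data)
    np.savez_compressed(packed, a=data)

    plain_size = os.path.getsize(plain)
    packed_size = os.path.getsize(packed)

    # Both archives are loaded the same way with np.load.
    with np.load(packed) as arch:
        restored = arch["a"]

print(plain_size, packed_size)
print(np.array_equal(restored, data))
```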
Linear Algebra
Linear algebra operations, such as matrix multiplication, decompositions, determinants, and other square matrix math, are an important part of any array library. Unlike in some languages, such as MATLAB, multiplying two two-dimensional arrays with * in NumPy computes an element-wise product rather than a matrix dot product. For matrix multiplication there is therefore a function dot, available both as an array method and as a function in the numpy namespace:
In [8]: x = np.array([[1., 2., 3.], [4., 5., 6.]])
In [9]: y = np.array([[6., 23.], [-1, 7], [8, 9]])
In [10]: x
Output:
array([[ 1., 2., 3.],
[ 4., 5., 6.]])
In [11]: y
Output:
array([[ 6., 23.],
[ -1., 7.],
[ 8., 9.]])
In [12]: x.dot(y)
Output:
array([[ 28., 64.],
[ 67., 181.]])
x.dot(y) is equivalent to np.dot(x, y):
In [13]: np.dot(x, y)
Output:
array([[ 28., 64.],
[ 67., 181.]])
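The difference between * and dot is easiest to see on two small square arrays, where both operations are defined but give different results. The arrays a and b below are illustrative and do not appear in the session above.

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[10., 20.], [30., 40.]])

# Element-wise product: each entry of a times the matching entry of b.
elementwise = a * b

# Matrix product: rows of a combined with columns of b.
matrix_product = a.dot(b)

print(elementwise)     # [[ 10.  40.] [ 90. 160.]]
print(matrix_product)  # [[ 70. 100.] [150. 220.]]
```

The element-wise product requires compatible (broadcastable) shapes, whereas dot requires the number of columns of the first array to match the number of rows of the second.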
A matrix product between a two-dimensional array and a suitably sized one-dimensional array results in a one-dimensional array:
In [14]: np.dot(x, np.ones(3))
Output: array([ 6., 15.])
The @ symbol (as of Python 3.5) also works as an infix operator that performs matrix multiplication:
In [15]: x @ np.ones(3)
Output: array([ 6., 15.])
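As a quick check of how @ relates to dot, the sketch below reuses the x and y arrays from this section and introduces a vector v (an illustrative name). It also shows that a one-dimensional array on the left of @ is treated as a row vector.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1., 7.], [8., 9.]])
v = np.ones(3)

# For two-dimensional inputs, @ and np.dot give the same result.
same_2d = np.array_equal(x @ y, np.dot(x, y))

# 2D @ 1D: v acts as a column vector, and the result is one-dimensional.
xv = x @ v

# 1D @ 2D: v acts as a row vector, and the result is again one-dimensional.
vy = v @ y

print(same_2d)
print(xv)  # [ 6. 15.]
print(vy)  # [13. 39.]
```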
numpy.linalg provides a standard set of matrix decompositions, along with operations such as inverse and determinant. These are implemented under the hood via the same industry-standard linear algebra libraries used in other languages like MATLAB and R, such as BLAS, LAPACK, or possibly (depending on your NumPy build) the proprietary Intel MKL (Math Kernel Library):
In [16]: from numpy.linalg import inv, qr
In [17]: X = np.random.randn(5, 5)
In [18]: mat = X.T.dot(X)
In [19]: inv(mat)
Output: array([[ 10.98129066, -15.92038594, 15.72674408, 21.1310146 ,
-6.51087108],
[-15.92038594, 30.06284502, -24.84925283, -37.06739528,
11.16662613],
[ 15.72674408, -24.84925283, 23.88020665, 33.07205801,
-10.13130535],
[ 21.1310146 , -37.06739528, 33.07205801, 48.09766757,
-14.48628627],
[ -6.51087108, 11.16662613, -10.13130535, -14.48628627,
4.51183536]])
In [20]: mat.dot(inv(mat))
Output: array([[ 1.00000000e+00, 4.63516236e-15, -7.90247705e-16,
-2.52654829e-14, 3.51039567e-15],
[ 2.37423690e-15, 1.00000000e+00, -4.97047729e-16,
-2.92387271e-15, 3.31878540e-16],
[-2.03815975e-14, -8.25999899e-16, 1.00000000e+00,
1.28952488e-14, 3.55561725e-16],
[-3.09793540e-16, 1.72968381e-14, -1.04505675e-14,
1.00000000e+00, 2.08115082e-15],
[ 1.56525172e-15, 1.46036808e-15, -1.44537894e-14,
1.39818361e-14, 1.00000000e+00]])
In [21]: q, r = qr(mat)
In [22]: r
Output: array([[-5.57483748, -2.84807903, 9.20903703, -4.15877657, 6.39223007],
[ 0. , -1.23458268, 1.32555702, -0.99111824, 2.92884267],
[ 0. , 0. , -1.02302314, -1.18714977, -6.26127069],
[ 0. , 0. , 0. , -1.34927574, -4.44960019],
[ 0. , 0. , 0. , 0. , 0.04472416]])
The expression X.T.dot(X) computes the matrix product of the transpose X.T with X.
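Because the results of inv and qr are subject to floating-point rounding, they are better verified with np.allclose than by comparing against exact values. The following is a minimal sketch under stated assumptions: the seeded generator from np.random.default_rng, the seed value, and the loose tolerance atol=1e-6 are choices made here for reproducibility and are not taken from the session above.

```python
import numpy as np
from numpy.linalg import inv, qr

# Seeded generator (an assumption, used so the run is repeatable).
rng = np.random.default_rng(12345)
X = rng.standard_normal((5, 5))
mat = X.T @ X

# mat times its inverse should be the identity, up to rounding error.
identity_ok = np.allclose(mat @ inv(mat), np.eye(5), atol=1e-6)

q, r = qr(mat)

# q has orthonormal columns, r is upper triangular, and q @ r rebuilds mat.
orthonormal = np.allclose(q.T @ q, np.eye(5))
upper_triangular = np.allclose(r, np.triu(r))
reconstructs = np.allclose(q @ r, mat)

print(identity_ok, orthonormal, upper_triangular, reconstructs)
```

The tiny off-diagonal values in the mat.dot(inv(mat)) output shown earlier are exactly this kind of rounding error.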
Function | Description |
diag | Return the diagonal (or off-diagonal) elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal |
dot | Matrix multiplication |
trace | Compute the sum of the diagonal elements |
det | Compute the matrix determinant |
eig | Compute the eigenvalues and eigenvectors of a square matrix |
inv | Compute the inverse of a square matrix |
pinv | Compute the Moore-Penrose pseudo-inverse of a matrix |
qr | Compute the QR decomposition |
svd | Compute the singular value decomposition (SVD) |
solve | Solve the linear system Ax = b for x, where A is a square matrix |
lstsq | Compute the least-squares solution to Ax = b |
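Two of the functions in the table, solve and lstsq, can be illustrated on small hand-checkable problems. The matrices and data points below are hypothetical values chosen so the answers are easy to verify by hand; rcond=None is passed to lstsq to select the default cutoff for small singular values.

```python
import numpy as np
from numpy.linalg import det, lstsq, solve

# Square system: 3*x0 + x1 = 9 and x0 + 2*x1 = 8, so x = [2, 3].
A = np.array([[3., 1.], [1., 2.]])
b = np.array([9., 8.])
x = solve(A, b)

# A nonzero determinant (3*2 - 1*1 = 5) confirms that A is invertible.
determinant = det(A)

# Overdetermined system: fit y = slope*t + intercept to four points
# that lie exactly on the line y = 2*t + 1.
t = np.array([0., 1., 2., 3.])
y = 2.0 * t + 1.0
design = np.column_stack([t, np.ones_like(t)])
coeffs, residuals, rank, singular_values = lstsq(design, y, rcond=None)

print(x)            # approximately [2. 3.]
print(determinant)  # approximately 5.0
print(coeffs)       # approximately [2. 1.]
```

For a square, well-conditioned system, solve(A, b) is generally preferable to computing inv(A).dot(b), since it avoids forming the inverse explicitly.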