INTRODUCTION TO NUMPY

Ravi shankar
13 min readJun 30, 2021

BY-RAVI SHANKAR

NumPy, which stands for Numerical Python, is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. Using NumPy, mathematical and logical operations on arrays can be performed.

NumPy is a Python package. It stands for ‘Numerical Python’. It is a library consisting of multidimensional array objects and a collection of routines for processing of array.

Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package Numarray was also developed, having some additional functionalities. In 2005, Travis Oliphant created NumPy package by incorporating the features of Numarray into Numeric package. There are many contributors to this open-source project.

Why use NumPy Arrays?

Numpy is one of the most commonly used packages for scientific computing in Python. It provides a multidimensional array object, as well as variations such as masks and matrices, which can be used for various math operations. Numpy is compatible with, and used by many other popular Python packages, including pandas and matplotlib.

Why is numpy so popular? Quite simply, because it’s faster than regular Python arrays, which lack numpy’s optimized and pre-compiled C code that does all the heavy lifting. Another reason is that numpy arrays and operations are vectorized, which means they lack explicit looping or indexing in the code. This makes the code not only more readable, but also more similar to standard mathematical notation.

The following example illustrates the vectorization difference between standard Python and numpy.

For two arrays A and B of the same size, if we wanted to do a vector multiplication in Python:

c = []
for i in range(len(a)):
c.append(a[i]*b[i])

In numpy, this can simply be done with the following line of code:

c = a*b

Numpy makes many mathematical operations used widely in scientific computing fast and easy to use, such as:

  • Vector-Vector multiplication
  • Matrix-Matrix and Matrix-Vector multiplication
  • Element-wise operations on vectors and matrices (i.e., adding, subtracting, multiplying, and dividing by a number )
  • Element-wise or array-wise comparisons
  • Applying functions element-wise to a vector/matrix ( like pow, log, and exp)
  • A whole lot of Linear Algebra operations can be found in NumPy.linalg
  • Reduction, statistics, and much more

Limitations

Inserting or appending entries to an array is not as trivially possible as it is with Python’s lists. The np.pad(...) routine to extend arrays actually creates new arrays of the desired shape and padding values, copies the given array into the new one and returns it. NumPy's np.concatenate([a1,a2]) operation does not actually link the two arrays but returns a new one, filled with the entries from both given arrays in sequence. Reshaping the dimensionality of an array with np.reshape(...) is only possible as long as the number of elements in the array does not change. These circumstances originate from the fact that NumPy's arrays must be views on contiguous memory buffers. A replacement package called Blaze attempts to overcome this limitation.

Algorithms that are not expressible as a vectorized operation will typically run slowly because they must be implemented in “pure Python”, while vectorization may increase memory complexity of some operations from constant to linear, because temporary arrays must be created that are as large as the inputs. Runtime compilation of numerical code has been implemented by several groups to avoid these problems; open source solutions that interoperate with NumPy include scipy.weave, numexpr and Numba. Cython and Pythran are static-compiling alternatives to these.

Many modern large-scale scientific computing applications have requirements that exceed the capabilities of the NumPy arrays. For example, NumPy arrays are usually loaded into a computer’s memory, which might have insufficient capacity for the analysis of large datasets. Further, NumPy operations are executed on a single CPU. However, many linear algebra operations can be accelerated by executing them on clusters of CPUs or of specialized hardware, such as GPUs and TPUs, which many deep learning applications rely on. As a result, several alternative array implementations have arisen in the scientific python ecosystem over the recent years, such as Dask for distributed arrays and TensorFlow or JAX for computations on GPUs. Because of its popularity, these often implement a subset of Numpy’s API or mimic it, so that users can change their array implementation with minimal changes to their code required.A recently introduced library named CUPy, accelerated by Nvidia’s CUDA framework, has also shown potential for faster computing, being a ‘drop-in replacement’ of NumPy.

Building and installing NumPy

Prerequisites

Building NumPy requires the following software installed:

  1. Python 2.6.x, 2.7.x, 3.2.x or newer
  2. On Debian and derivatives (Ubuntu): python, python-dev (or python3-dev)
  3. On Windows: the official python installer at www.python.org is enough
  4. Make sure that the Python package distutils is installed before continuing. For example, in Debian GNU/Linux, installing python-dev also installs distutils.
  5. Python must also be compiled with the zlib module enabled. This is practically always the case with pre-packaged Pythons.
  6. Compilers
  7. To build any extension modules for Python, you’ll need a C compiler. Various NumPy modules use FORTRAN 77 libraries, so you’ll also need a FORTRAN 77 compiler installed.
  8. Note that NumPy is developed mainly using GNU compilers. Compilers from other vendors such as Intel, Absoft, Sun, NAG, Compaq, Vast, Porland, Lahey, HP, IBM, Microsoft are only supported in the form of community feedback, and may not work out of the box. GCC 4.x (and later) compilers are recommended.
  9. Linear Algebra libraries
  10. NumPy does not require any external linear algebra libraries to be installed. However, if these are available, NumPy’s setup script can detect them and use them for building. A number of different LAPACK library setups can be used, including optimized LAPACK libraries such as ATLAS, MKL or the Accelerate/vecLib framework on OS X.

Basic Installation

To install NumPy run:

python setup.py install

To perform an in-place build that can be run from the source folder run:

python setup.py build_ext --inplace

The NumPy build system uses distutils and numpy.distutils. setuptools is only used when building via pip or with python setupegg.py. Using virtualenv should work as expected.

Note: for build instructions to do development work on NumPy itself, see :ref:`development-environment`.

Parallel builds

From NumPy 1.10.0 on it’s also possible to do a parallel build with:

python setup.py build -j 4 install --prefix $HOME/.local

This will compile numpy on 4 CPUs and install it into the specified prefix.

numpy.ndarray

class numpy.ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)

An array object represents a multidimensional, homogeneous array of fixed-size items. An associated data-type object describes the format of each element in the array (its byte-order, how many bytes it occupies in memory, whether it is an integer, a floating point number, or something else, etc.)

Arrays should be constructed using array, zeros or empty (refer to the See Also section below). The parameters given here refer to a low-level method (ndarray(…)) for instantiating an array.

For more information, refer to the numpy module and examine the methods and attributes of an array.

Parameters(for the __new__ method; see Notes below)shapetuple of ints

Shape of created array.

dtypedata-type, optional

Any object that can be interpreted as a numpy data type.

bufferobject exposing buffer interface, optional

Used to fill the array with data.

offsetint, optional

Offset of array data in buffer.

stridestuple of ints, optional

Strides of data in memory.

order{‘C’, ‘F’}, optional

Row-major (C-style) or column-major (Fortran-style) order.

See also

array

Construct an array.

zeros

Create an array, each element of which is zero.

empty

Create an array, but leave its allocated memory unchanged (i.e., it contains “garbage”).

dtype

Create a data-type.

numpy.typing.NDArray

A generic version of ndarray.

Notes

There are two modes of creating an array using __new__:

  1. If buffer is None, then only shape, dtype, and order are used.
  2. If buffer is an object exposing the buffer interface, then all keywords are interpreted.

No __init__ method is needed because the array is fully initialized after the __new__ method.

Examples

These examples illustrate the low-level ndarray constructor. Refer to the See Also section above for easier ways of constructing an ndarray.

First mode, buffer is None:

>>> np.ndarray(shape=(2,2), dtype=float, order='F')
array([[0.0e+000, 0.0e+000], # random
[ nan, 2.5e-323]])

Second mode:

>>> np.ndarray((2,), buffer=np.array([1,2,3]),
... offset=np.int_().itemsize,
... dtype=int) # offset = 1*itemsize, i.e. skip first element
array([2, 3])

AttributesTndarray

The transposed array.

databuffer

Python buffer object pointing to the start of the array’s data.

dtypedtype object

Data-type of the array’s elements.

flagsdict

Information about the memory layout of the array.

flatnumpy.flatiter object

A 1-D iterator over the array.

imagndarray

The imaginary part of the array.

realndarray

The real part of the array.

sizeint

Number of elements in the array.

itemsizeint

Length of one array element in bytes.

nbytesint

Total bytes consumed by the elements of the array.

ndimint

Number of array dimensions.

shapetuple of ints

Tuple of array dimensions.

stridestuple of ints

Tuple of bytes to step in each dimension when traversing an array.

ctypesctypes object

An object to simplify the interaction of the array with the ctypes module.

basendarray

Base object if memory is from some other object.

Basic Slicing and Advanced Indexing in NumPy Python

NumPy or Numeric Python is a package for computation on homogenous n-dimensional arrays. In numpy dimensions are called as axes.

Indexing using index arrays

Indexing can be done in numpy by using an array as an index. In case of slice, a view or shallow copy of the array is returned but in index array a copy of the original array is returned. Numpy arrays can be indexed with other arrays or any other sequence with the exception of tuples. The last element is indexed by -1 second last by -2 and so on.

NumPy Array Indexing

Access Array Elements

Array indexing is the same as accessing an array element.

You can access an array element by referring to its index number.

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

Example

Get the first element from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0])

Example

Get the second element from the following array.

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[1])

Example

Get third and fourth elements from the following array and add them.

import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3])

Access 2-D Arrays

To access elements from 2-D arrays we can use comma separated integers representing the dimension and the index of the element.

Example

Access the 2nd element on 1st dim:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print(‘2nd element on 1st dim: ‘, arr[0, 1])

Example

Access the 5th element on 2nd dim:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print(‘5th element on 2nd dim: ‘, arr[1, 4])

Access 3-D Arrays

To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element.

Example

Access the third element of the second array of the first array:

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr[0, 1, 2])

Example Explained

arr[0, 1, 2] prints the value 6.

And this is why:

The first number represents the first dimension, which contains two arrays:
[[1, 2, 3], [4, 5, 6]]
and:
[[7, 8, 9], [10, 11, 12]]
Since we selected 0, we are left with the first array:
[[1, 2, 3], [4, 5, 6]]

The second number represents the second dimension, which also contains two arrays:
[1, 2, 3]
and:
[4, 5, 6]
Since we selected 1, we are left with the second array:
[4, 5, 6]

The third number represents the third dimension, which contains three values:
4
5
6
Since we selected 2, we end up with the third value:
6

Negative Indexing

Use negative indexing to access an array from the end.

Example

Print the last element from the 2nd dim:

import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print(‘Last element from 2nd dim: ‘, arr[1, -1])

NumPy Array Slicing

Slicing arrays

Slicing in python means taking elements from one given index to another given index.

We pass slice instead of index like this: [start:end].

We can also define the step, like this: [start:end:step].

If we don’t pass start its considered 0

If we don’t pass end its considered length of array in that dimension

If we don’t pass step its considered 1

Example

Slice elements from index 1 to index 5 from the following array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

Example

Slice elements from index 4 to the end of the array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:])

Example

Slice elements from the beginning to index 4 (not included):

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:4])

Negative Slicing

Use the minus operator to refer to an index from the end:

Example

Slice from the index 3 from the end to index 1 from the end:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[-3:-1])

STEP

Use the step value to determine the step of the slicing:

Example

Return every other element from index 1 to index 5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5:2])

Example

Return every other element from the entire array:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[::2])

Slicing 2-D Arrays

Example

From the second element, slice elements from index 1 to index 4 (not included):

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

Note: Remember that second element has index 1.

Example

From both elements, return index 2:

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 2])

Example

From both elements, slice index 1 to index 4 (not included), this will return a 2-D array:

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 1:4])

Internal memory layout of an ndarray

NumPy Copies and Views

ndarray.view() method

The ndarray.view() method returns the new array object which contains the same content as the original array does. Since it is a new array object, changes made on this object do not reflect the original array.

Consider the following example.

Example

import numpy as np

a = np.array([[1,2,3,4],[9,0,2,3],[1,2,3,19]])

print(“Original Array:\n”,a)

print(“\nID of array a:”,id(a))

b = a.view()

print(“\nID of b:”,id(b))

print(“\nprinting the view b”)

print(b)

b.shape = 4,3;

print(“\nChanges made to the view b do not reflect a”)

print(“\nOriginal array \n”,a)

print(“\nview\n”,b)

Output:

Original Array:
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
ID of array a: 140280414447456ID of b: 140280287000656printing the view b
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
Changes made to the view b do not reflect aOriginal array
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
view
[[ 1 2 3]
[ 4 9 0]
[ 2 3 1]
[ 2 3 19]]

ndarray.copy() method

It returns the deep copy of the original array which doesn’t share any memory with the original array. The modification made to the deep copy of the original array doesn’t reflect the original array.

Consider the following example.

Example

import numpy as np

a = np.array([[1,2,3,4],[9,0,2,3],[1,2,3,19]])

print(“Original Array:\n”,a)

print(“\nID of array a:”,id(a))

b = a.copy()

print(“\nID of b:”,id(b))

print(“\nprinting the deep copy b”)

print(b)

b.shape = 4,3;

print(“\nChanges made to the copy b do not reflect a”)

print(“\nOriginal array \n”,a)

print(“\nCopy\n”,b)

Output:

Original Array:
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
ID of array a: 139895697586176ID of b: 139895570139296printing the deep copy b
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
Changes made to the copy b do not reflect aOriginal array
[[ 1 2 3 4]
[ 9 0 2 3]
[ 1 2 3 19]]
Copy
[[ 1 2 3]
[ 4 9 0]
[ 2 3 1]
[ 2 3 19]]

NumPy Creating Arrays

NumPy is used to work with arrays. The array object in NumPy is called ndarray.

We can create a NumPy ndarray object by using the array() function.

Example

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

print(type(arr))

type(): This built-in Python function tells us the type of the object passed to it. Like in above code it shows that arr is numpy.ndarray type.

To create an ndarray, we can pass a list, tuple or any array-like object into the array() method, and it will be converted into an ndarray:

Example

Use a tuple to create a NumPy array:

import numpy as np

arr = np.array((1, 2, 3, 4, 5))

print(arr)

Dimensions in Arrays

A dimension in arrays is one level of array depth (nested arrays).

nested array: are arrays that have arrays as their elements.

0-D Arrays

0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

Example

Create a 0-D array with value 42

import numpy as np

arr = np.array(42)

print(arr)

1-D Arrays

An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.

These are the most common and basic arrays.

Example

Create a 1-D array containing the values 1,2,3,4,5:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrix or 2nd order tensors.

NumPy has a whole sub module dedicated towards matrix operations called numpy.mat

Example

Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

3-D arrays

An array that has 2-D arrays (matrices) as its elements is called 3-D array.

These are often used to represent a 3rd order tensor.

Example

Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and 4,5,6:

import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

NumPy Data Types

--

--