What is NumPy?
NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python, and it is open-source software. Its most important features include:
- A powerful N-dimensional array object
- Sophisticated (broadcasting) functions
- Tools for integrating C/C++ and Fortran code
- Useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined using NumPy, which allows NumPy to integrate seamlessly and quickly with a wide variety of databases.
Installation:
- Mac and Linux users can install NumPy via the pip command:
pip install numpy
- Windows users with a recent Python installation can usually run the same pip command. Alternatively, download a pre-built NumPy package that matches your system configuration and Python version, and install it manually.
Note: The examples discussed below will not run on an online IDE.
1. Arrays in NumPy: NumPy’s main object is the homogeneous multidimensional array.
- It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
- In NumPy, dimensions are called axes. The number of axes is called the rank.
- NumPy’s array class is called ndarray. It is also known by the alias array.
Example :
[[ 1, 2, 3],
 [ 4, 2, 5]]
Here,
rank = 2 (as it is 2-dimensional, i.e. it has 2 axes)
first dimension (axis) length = 2, second dimension length = 3
overall shape can be expressed as: (2, 3)
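The properties described above can be inspected directly on an array. The following is a minimal sketch, assuming NumPy is imported as np; the variable name arr is illustrative. The element type reported by dtype depends on the platform, so it may differ from the output shown next.

```python
import numpy as np

# Build the 2 x 3 array from the example above.
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

print("Array is of type:", type(arr))                # the ndarray class
print("No. of dimensions:", arr.ndim)                # number of axes (rank)
print("Shape of array:", arr.shape)                  # length along each axis
print("Size of array:", arr.size)                    # total number of elements
print("Array stores elements of type:", arr.dtype)   # platform-dependent integer type
```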
Output :
Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64
2. Array creation: There are various ways to create arrays in NumPy.
- For example, you can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequences.
- Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation. For example: np.zeros, np.ones, np.full, np.empty, etc.
- To create sequences of numbers, NumPy provides a function analogous to range that returns arrays instead of lists.
- arange: returns evenly spaced values within a given interval. The step size is specified.
- linspace: returns evenly spaced values within a given interval. The number of elements (num) is specified.
- Reshaping array: We can use the reshape method to reshape an array. Consider an array with shape (a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2, b3, …, bM). The only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM (i.e. the total number of elements remains unchanged).
- Flatten array: We can use the flatten method to get a copy of the array collapsed into one dimension. It accepts an order argument. The default value is ‘C’ (for row-major order). Use ‘F’ for column-major order.
Note: The type of an array can be explicitly specified when creating it.
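The creation routines listed above can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the variable names and sample values are illustrative, chosen to be consistent with the output that follows. The random array differs on every run, and exact print formatting can vary between NumPy versions.

```python
import numpy as np

# Array from a list, with the element type set explicitly to float.
a = np.array([[1, 2, 4], [5, 8, 7]], dtype='float')
print("Array created using passed list:\n", a)

# Array from a tuple; the element type is deduced from the elements.
b = np.array((1, 3, 2))
print("Array created using passed tuple:\n", b)

# 3 x 4 array of zeros.
c = np.zeros((3, 4))
print("An array initialized with all zeros:\n", c)

# 3 x 3 array filled with the constant 6, stored as complex numbers.
d = np.full((3, 3), 6, dtype='complex')
print("An array initialized with all 6s. Array type is complex:\n", d)

# 2 x 2 array of random values in [0, 1).
e = np.random.random((2, 2))
print("A random array:\n", e)

# Values from 0 up to (but not including) 30, in steps of 5.
f = np.arange(0, 30, 5)
print("A sequential array with steps of 5:\n", f)

# 10 evenly spaced values from 0 to 5, both endpoints included.
g = np.linspace(0, 5, 10)
print("A sequential array with 10 values between 0 and 5:\n", g)

# Reshape a 3 x 4 array into 2 x 2 x 3 (12 elements in both shapes).
arr = np.array([[1, 2, 3, 4],
                [5, 2, 4, 2],
                [1, 2, 0, 1]])
newarr = arr.reshape(2, 2, 3)
print("Original array:\n", arr)
print("Reshaped array:\n", newarr)

# Flatten a 2 x 3 array into a one-dimensional copy (row-major order).
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
flarr = arr2.flatten()
print("Original array:\n", arr2)
print("Flattened array:\n", flarr)
```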
Output :
Array created using passed list:
[[ 1. 2. 4.]
 [ 5. 8. 7.]]
Array created using passed tuple:
[1 3 2]
An array initialized with all zeros:
[[ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]
 [ 0. 0. 0. 0.]]
An array initialized with all 6s. Array type is complex:
[[ 6.+0.j 6.+0.j 6.+0.j]
 [ 6.+0.j 6.+0.j 6.+0.j]
 [ 6.+0.j 6.+0.j 6.+0.j]]
A random array:
[[ 0.46829566 0.67079389]
 [ 0.09079849 0.95410464]]
A sequential array with steps of 5:
[ 0 5 10 15 20 25]
A sequential array with 10 values between 0 and 5:
[ 0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778 3.33333333 3.88888889 4.44444444 5. ]
Original array:
[[1 2 3 4]
 [5 2 4 2]
 [1 2 0 1]]
Reshaped array:
[[[1 2 3]
  [4 5 2]]
 [[4 2 1]
  [2 0 1]]]
Original array:
[[1 2 3]
 [4 5 6]]
Flattened array:
[1 2 3 4 5 6]
3. Array Indexing: Knowing the basics of array indexing is important for analysing and manipulating the array object. NumPy offers many ways to do array indexing.
- Slicing: Just like lists in Python, NumPy arrays can be sliced. As arrays can be multidimensional, you need to specify a slice for each dimension of the array.
- Integer array indexing: In this method, a list of indices is passed for each dimension. The lists are paired element by element to select arbitrary elements and build a new array.
- Boolean array indexing: This method is used when we want to pick the elements of an array that satisfy some condition.
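The three indexing methods above can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the variable names and sample values are illustrative, chosen to be consistent with the output that follows.

```python
import numpy as np

# Sample 4 x 4 array of floats.
arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

# Slicing: first 2 rows, every second column (columns 0 and 2).
sliced = arr[:2, ::2]
print("Array with first 2 rows and alternate columns (0 and 2):\n", sliced)

# Integer array indexing: row and column lists are paired one to one,
# selecting the elements at (0, 3), (1, 2), (2, 1) and (3, 0).
picked = arr[[0, 1, 2, 3], [3, 2, 1, 0]]
print("Elements at indices (0, 3), (1, 2), (2, 1), (3, 0):\n", picked)

# Boolean array indexing: keep only the elements greater than 0.
cond = arr > 0
positive = arr[cond]
print("Elements greater than 0:\n", positive)
```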
Output :
Array with first 2 rows and alternate columns (0 and 2):
[[-1. 0.]
 [ 4. 6.]]
Elements at indices (0, 3), (1, 2), (2, 1), (3, 0):
[ 4. 6. 0. 3.]
Elements greater than 0:
[ 2. 4. 4. 6. 2.6 7. 8. 3. 4. 2. ]
4. Basic operations: NumPy provides a plethora of built-in arithmetic functions.
- Operations on a single array: We can use overloaded arithmetic operators to perform element-wise operations on an array and create a new array. In the case of the +=, -=, *= operators, the existing array is modified.
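Element-wise operations on a single array can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the variable names and sample values are illustrative, chosen to be consistent with the output that follows.

```python
import numpy as np

a = np.array([1, 2, 5, 3])

# Each expression below returns a new array; a itself is unchanged.
print("Adding 1 to every element:", a + 1)
print("Subtracting 3 from each element:", a - 3)
print("Multiplying each element by 10:", a * 10)
print("Squaring each element:", a ** 2)

# In-place operator: modifies the existing array.
a *= 2
print("Doubled each element of original array:", a)

# Transpose of a 2-D array.
m = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])
print("Original array:\n", m)
print("Transpose of array:\n", m.T)
```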
Output :
Adding 1 to every element: [2 3 6 4]
Subtracting 3 from each element: [-2 -1 2 0]
Multiplying each element by 10: [10 20 50 30]
Squaring each element: [ 1 4 25 9]
Doubled each element of original array: [ 2 4 10 6]
Original array:
[[1 2 3]
 [3 4 5]
 [9 6 0]]
Transpose of array:
[[1 3 9]
 [2 4 6]
 [3 5 0]]
- Unary operators: Many unary operations are provided as methods of the ndarray class. These include sum, min, max, etc. They can also be applied row-wise or column-wise by setting the axis parameter.
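The role of the axis parameter can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the sample values are illustrative, chosen to be consistent with the output that follows. Here axis=1 works along each row and axis=0 works down each column.

```python
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])

# No axis given: the operation covers all elements.
print("Largest element is:", arr.max())

# axis=1: one result per row.
print("Row-wise maximum elements:", arr.max(axis=1))

# axis=0: one result per column.
print("Column-wise minimum elements:", arr.min(axis=0))

print("Sum of all array elements:", arr.sum())

# Running total along each row.
print("Cumulative sum along each row:\n", arr.cumsum(axis=1))
```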
Output :
Largest element is: 9
Row-wise maximum elements: [6 7 9]
Column-wise minimum elements: [1 1 2]
Sum of all array elements: 38
Cumulative sum along each row:
[[ 1 6 12]
 [ 4 11 13]
 [ 3 4 13]]
- Binary operators: These operations apply to arrays elementwise, and a new array is created. You can use all basic arithmetic operators like +, -, *, /, etc. In the case of the +=, -=, *= operators, the existing array is modified.
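The difference between elementwise multiplication and matrix multiplication can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the sample values are illustrative, chosen to be consistent with the output that follows.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

# Elementwise addition: each element of a is added to the matching element of b.
print("Array sum:\n", a + b)

# Elementwise multiplication, not a matrix product.
print("Array multiplication:\n", a * b)

# Matrix product: rows of a combined with columns of b.
print("Matrix multiplication:\n", a.dot(b))
```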
Output:
Array sum:
[[5 5]
 [5 5]]
Array multiplication:
[[4 6]
 [6 4]]
Matrix multiplication:
[[ 8 5]
 [20 13]]
- Universal functions (ufunc): NumPy provides familiar mathematical functions such as sin, cos, exp, etc. These functions also operate elementwise on an array, producing an array as output.
Note: All the operations we did above using overloaded operators can also be done using functions such as np.add, np.subtract, np.multiply, np.divide, np.sum, etc.
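Universal functions can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the sample values are illustrative, chosen to be consistent with the output that follows. Because of floating-point rounding, the sine of pi is a very small number rather than exactly zero.

```python
import numpy as np

# Sine of each element.
angles = np.array([0, np.pi / 2, np.pi])
print("Sine values of array elements:", np.sin(angles))

# Exponential and square root of each element.
values = np.array([0, 1, 2, 3])
print("Exponent of array elements:", np.exp(values))
print("Square root of array elements:", np.sqrt(values))

# np.add gives the same result as the overloaded + operator.
print("np.add result:", np.add(values, 1))
```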
Output:
Sine values of array elements: [ 0.00000000e+00 1.00000000e+00 1.22464680e-16]
Exponent of array elements: [ 1. 2.71828183 7.3890561 20.08553692]
Square root of array elements: [ 0. 1. 1.41421356 1.73205081]
5. Sorting array: NumPy provides a simple np.sort function for sorting arrays. Let’s explore it a bit.
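Sorting with np.sort can be sketched as follows. This is a minimal illustration, assuming NumPy is imported as np; the field names (name, grad_year, cgpa) and sample values are illustrative, chosen to be consistent with the output that follows. The name field uses a unicode string type here, and exact print formatting can vary between NumPy versions.

```python
import numpy as np

a = np.array([[1, 4, 2],
              [3, 4, 6],
              [0, -1, 5]])

# axis=None flattens the array before sorting.
print("Array elements in sorted order:\n", np.sort(a, axis=None))

# axis=1 sorts each row independently.
print("Row-wise sorted array:\n", np.sort(a, axis=1))

# axis=0 sorts each column; kind selects the sorting algorithm.
print("Column wise sort by applying merge-sort:\n",
      np.sort(a, axis=0, kind='mergesort'))

# Structured array: each element has named fields with their own types.
dtypes = [('name', 'U10'), ('grad_year', int), ('cgpa', float)]
values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7),
          ('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]
students = np.array(values, dtype=dtypes)

# Sort by a single field.
by_name = np.sort(students, order='name')
print("Array sorted by names:\n", by_name)

# Sort by graduation year first, then by cgpa within each year.
by_year_cgpa = np.sort(students, order=['grad_year', 'cgpa'])
print("Array sorted by graduation year and then cgpa:\n", by_year_cgpa)
```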
Output:
Array elements in sorted order:
[-1 0 1 2 3 4 4 5 6]
Row-wise sorted array:
[[ 1 2 4]
 [ 3 4 6]
 [-1 0 5]]
Column wise sort by applying merge-sort:
[[ 0 -1 2]
 [ 1 4 5]
 [ 3 4 6]]
Array sorted by names:
[('Aakash', 2009, 9.0) ('Ajay', 2008, 8.7) ('Hrithik', 2009, 8.5) ('Pankaj', 2008, 7.9)]
Array sorted by graduation year and then cgpa:
[('Pankaj', 2008, 7.9) ('Ajay', 2008, 8.7) ('Hrithik', 2009, 8.5) ('Aakash', 2009, 9.0)]
NumPy Data Types
Data Types in Python
By default Python has these data types:
- strings: used to represent text data; the text is given in quotation marks, e.g. "ABCD"
- integer: used to represent integer numbers, e.g. -1, -2, -3
- float: used to represent real numbers, e.g. 1.2, 42.42
- boolean: used to represent True or False
- complex: used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j
Data Types in NumPy
NumPy has some extra data types and refers to data types with one character, like i for integers, u for unsigned integers, etc.
Below is a list of all data types in NumPy and the characters used to represent them.
- i: integer
- b: boolean
- u: unsigned integer
- f: float
- c: complex float
- m: timedelta
- M: datetime
- O: object
- S: string
- U: unicode string
- V: fixed chunk of memory for other type (void)
Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns the data type of the array:
Example
Get the data type of an array object:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
Example
Get the data type of an array containing strings:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
Creating Arrays With a Defined Data Type
We use the array() function to create arrays. This function can take an optional argument, dtype, that allows us to define the expected data type of the array elements:
Example
Create an array with data type string:
import numpy as np
# dtype='S' stores each value as a byte string
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
For i, u, f, S and U we can define the size as well.
Example
Create an array with data type 4 bytes integer:
import numpy as np
# 'i4' means a 4-byte (32-bit) integer
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)
Example
A non-integer string like 'a' cannot be converted to integer (will raise an error):
import numpy as np
# raises ValueError because 'a' is not a valid integer literal
arr = np.array(['a', '2', '3'], dtype='i')
Converting Data Type on Existing Arrays
The best way to change the data type of an existing array is to make a copy of the array with the astype() method.
The astype() method creates a copy of the array and allows you to specify the data type as a parameter.
The data type can be specified using a string, like 'f' for float, 'i' for integer, etc., or you can use the data type directly, like float for float and int for integer.
Example
Change data type from float to integer by using 'i' as parameter value:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
# the fractional part is discarded during conversion
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)
Example
Change data type from float to integer by using int as parameter value:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)
Example
Change data type from integer to boolean:
import numpy as np
arr = np.array([1, 0, 3])
# zero becomes False, every non-zero value becomes True
newarr = arr.astype(bool)
print(newarr)
print(newarr.dtype)
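The two points above, that astype() returns a copy and that an invalid conversion raises an error, can be checked with a short sketch. This is a minimal illustration, assuming NumPy is imported as np; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

# astype() returns a new array; the original keeps its float type.
newarr = arr.astype('i')
print(newarr)                          # integer values with the fraction dropped
print(arr.dtype)                       # still a float type
print(np.shares_memory(arr, newarr))   # False: the copy has its own memory

# Changing the copy does not affect the original array.
newarr[0] = 99
print(arr[0])

# A non-integer string cannot be converted to an integer type.
try:
    np.array(['a', '2', '3'], dtype='i')
    failed = False
except ValueError as err:
    failed = True
    print("Conversion failed:", err)
```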