The rise of Python as (almost) the language of choice for AI enthusiasts has stirred a lot of interest in data structures like arrays, especially NumPy arrays. I have always struggled with understanding them, and if I don’t use them for a while, I need a refresher. So I decided to write a short intro that I can refer back to down the road.
What are NumPy arrays:
NumPy arrays are data structures that allow for efficient manipulation and computation of numerical data in Python. They are similar to regular Python lists, but they are optimized for numerical operations and can handle large amounts of data more efficiently. What makes them faster is that a NumPy array holds a single datatype for all of its items, so the items must all be integers, or all floats, or all strings, and so on. A list, on the other hand, can hold different data types at the same time. Because every item has the same datatype, an array doesn’t have to store type information for each item, and the values are typically packed side by side in one block of memory. That makes arrays a lot more memory efficient and makes processing their data much faster.
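To make the "one datatype per array" idea concrete, here is a small sketch. The variable names are made up for illustration, and the int64 dtype is requested explicitly so the byte counts are the same on every platform.

```python
import numpy as np

# A list can mix types; each item is its own Python object.
mixed_list = [1, 2.5, "three"]

# An array has exactly one dtype, shared by every item.
int_array = np.array([1, 2, 3], dtype=np.int64)
print(int_array.dtype)     # int64
print(int_array.itemsize)  # 8 (bytes per item)
print(int_array.nbytes)    # 24 (itemsize * number of items)

# Mixing ints and floats does not give a mixed array:
# NumPy converts everything to one common dtype instead.
upcast = np.array([1, 2.5, 3])
print(upcast.dtype)        # float64
```

The last two lines show the flip side of the efficiency: if you hand NumPy mixed values, it quietly picks one datatype that can hold them all.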
Dimensions in NumPy Arrays:
In NumPy, the dimensions of an array refer to the number of axes or “degrees of freedom” that the array has. For example, a 1-dimensional array has only one axis, a 2-dimensional array has two axes (sometimes referred to as rows and columns), and a 3-dimensional array has three axes (sometimes referred to as pages, rows, and columns).
Each axis of an array represents a different dimension of the data that the array contains. For example, in a 2-dimensional array, the first axis represents the rows of the array, and the second axis represents the columns. In a 3-dimensional array, the first axis represents the pages of the array, the second axis represents the rows, and the third axis represents the columns.
The dimensions of an array are important because they determine the shape and size of the array, and they affect how the array can be accessed and manipulated. For example, the dimensions of an array determine the number of indices that are needed to access an element of the array, and they also determine the size and shape of the array when it is displayed or plotted.
Overall, the dimensions of an array are a fundamental property of the array that determine its behavior and how it can be used in numerical computations.
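As a quick illustration of axes, the sketch below builds one array of each kind and shows that the number of indices needed to reach a single element equals the number of dimensions. The names line, table and book are just illustrative.

```python
import numpy as np

line = np.array([10, 20, 30])               # 1 axis
table = np.array([[1, 2, 3], [4, 5, 6]])    # 2 axes: rows, columns
book = np.array([[[1, 2], [3, 4]],
                 [[5, 6], [7, 8]]])         # 3 axes: pages, rows, columns

for name, a in [("line", line), ("table", table), ("book", book)]:
    print(name, "ndim =", a.ndim, "shape =", a.shape)

# One index per axis reaches a single element.
print(line[2])        # 30
print(table[1, 0])    # 4
print(book[1, 0, 1])  # 6
```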
Creating NumPy Arrays:
To create a NumPy array, you can use the np.array() function, which takes a regular Python list or array-like object as its argument and returns a NumPy array. For example:
import numpy as np
# create a 1-dimensional array
array1d = np.array([1, 2, 3, 4, 5])
# create a 2-dimensional array
array2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
# create a 3-dimensional array
array3d = np.array([[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
[[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]])
In these examples, we create three NumPy arrays of different dimensions (1-dimensional, 2-dimensional, and 3-dimensional) using the np.array() function. You can specify the dimensions of the array by providing the appropriate number of nested lists in the argument to the np.array() function.
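A short sketch of what "array-like" means in practice: np.array() also accepts tuples and ranges, and the depth of nesting decides the number of dimensions. The variable names here are only for illustration.

```python
import numpy as np

from_tuple = np.array((1, 2, 3))      # a tuple works like a list
from_range = np.array(range(4))       # so does a range: [0 1 2 3]

# Three inner lists of two items each give 3 rows and 2 columns.
nested = np.array([[1, 2], [3, 4], [5, 6]])

print(from_tuple.shape)  # (3,)
print(from_range.shape)  # (4,)
print(nested.shape)      # (3, 2)
print(nested.ndim)       # 2
```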
Alternatively, you can create an array with a specific shape and fill it with a specific value using the np.full() or np.zeros() functions. For example:
# create a 2x3 array filled with zeros
array2d = np.zeros((2, 3))
# create a 3x4x5 array filled with the value 1
array3d = np.full((3, 4, 5), 1)
In these examples, we create NumPy arrays of a specified shape and fill them with zeros or a specific value using the np.zeros() and np.full() functions, respectively. This can be useful when you want to create an array with a specific shape and initial values, without having to specify each element of the array individually.
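One detail worth checking when using these functions is the datatype you get back. The sketch below assumes default settings: np.zeros() produces floats unless told otherwise, while np.full() picks a datatype based on the fill value.

```python
import numpy as np

zeros = np.zeros((2, 3))
print(zeros.dtype)        # float64 (zeros are floats by default)

int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.dtype)    # an integer dtype (exact width depends on platform)

ones = np.full((3, 4, 5), 1)
print(ones.shape)         # (3, 4, 5)
print(ones.dtype)         # integer, because the fill value 1 is an int

halves = np.full((2, 2), 0.5)
print(halves.dtype)       # float64, because the fill value is a float
```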
Shape and Dimensions of NumPy Arrays:
As described earlier, the number of dimensions of an array is its number of axes: a 1D array has one axis, a 2D array has two, and a 3D array has three.
To create a multidimensional NumPy array without listing its elements, you can use the np.empty() function and pass it the shape of the array. The shape should be a tuple of integers that specifies the size of the array along each dimension. Note that np.array() does not accept a shape argument; it always builds the array from the data you give it.
For example, to create a 3D NumPy array with 2 depths, 3 rows, and 4 columns, you can use the following code:
import numpy as np
arr = np.empty((2, 3, 4))
The arr variable will now contain a 3D NumPy array with the specified dimensions. The new array has uninitialized values (whatever happened to be in that memory), so you will need to populate it before using it. If you want every element to start at 0, use np.zeros((2, 3, 4)) instead. You can also create an array of a specific shape with elements set to random numbers:
arr = np.random.random(size=(2, 3, 4))
If you don’t know the shape or dimensions of an array, you can use the following code to find out:
#this will return the shape of the array
arr.shape
#the following will return the number of dimensions of the array
arr.ndim
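Here is a small runnable sketch of those attributes on a freshly created array. It also shows size, a related attribute that gives the total number of elements.

```python
import numpy as np

arr = np.zeros((2, 3, 4))

print(arr.shape)  # (2, 3, 4)
print(arr.ndim)   # 3
print(arr.size)   # 24, i.e. 2 * 3 * 4
print(len(arr))   # 2, the length of the first axis only
```

Note that shape, ndim and size are attributes, not methods, so they are written without parentheses.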
Accessing Specific Elements of an Array:
To access elements in a NumPy array, you can use square brackets and specify the indices of the elements that you want to access. This is similar to how you access items in a list by index number. However, an array can have multiple dimensions, so you have to take that into account.
For example, suppose you have the following NumPy array:
arr = np.array([[1, 2, 3], [4, 5, 6]])
To access the element in the first row and first column of arr, you can use the following code:
element = arr[0, 0]
The above code will return the value 1, which is the element at the position (0, 0) in arr.
In general, to access an element at the i-th row and j-th column of a NumPy array arr, you can use the following code:
element = arr[i, j]
Note that in NumPy, the indices start at 0, so the first row is at index 0, the second row is at index 1, and so on. The same is true for the columns: the first column is at index 0, the second column is at index 1, and so on.
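The following sketch runs through a few lookups on the same 2D array, including what happens when an index is past the end of an axis.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr[0, 0])  # 1  (first row, first column)
print(arr[1, 2])  # 6  (second row, third column)
print(arr[1][2])  # 6  (same element, fetched in two steps)

# There are only 2 rows, so row index 2 is out of range.
try:
    arr[2, 0]
except IndexError as exc:
    print("IndexError:", exc)
```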
To access elements in a NumPy array with more than 2 dimensions, you can use a comma-separated list of indices to specify the position of the element that you want to access. For example, suppose you have the following 3D NumPy array:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
The above array has 3 dimensions: 2 depths (pages), each with 2 rows and 3 columns, so its shape is (2, 2, 3). To access the element in the first depth, first row, and first column of arr, you can use the following code:
element = arr[0, 0, 0]
The above code will return the value 1, which is the element at the position (0, 0, 0) in arr.
In general, to access the element at the i-th depth, j-th row, and k-th column of a 3D NumPy array arr, you can use the following code:
element = arr[i, j, k]
The indices for each dimension are separated by a comma, and the number of indices that you use should match the number of dimensions in the array.
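To see how the three indices line up with the nesting, here is a short sketch on the same 3D array. The first index picks one of the two outer blocks, the second picks a row inside that block, and the third picks a position inside that row.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr.shape)     # (2, 2, 3)
print(arr[0, 0, 0])  # 1
print(arr[1, 0, 2])  # 9
print(arr[1, 1, 0])  # 10

# Using fewer indices than dimensions returns a smaller array
# instead of a single element.
block = arr[1]
print(block.shape)   # (2, 3)
```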
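You can also set a specific element to a certain value by using the same principle. Each index has to be within the size of its axis; for the array above, with shape (2, 2, 3), the second index can only be 0 or 1. So for example, if we want to set the element at position (0, 1, 0) to 3, the code would be:
arr[0, 1, 0] = 3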
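A small sketch of assignment on the 3D array from above, with two things to watch for: the value is converted to the array's datatype, and every index must fit inside its axis.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

arr[0, 1, 0] = 3
print(arr[0, 1, 0])  # 3

# The array holds integers, so a float is truncated when stored.
arr[1, 1, 2] = 99.7
print(arr[1, 1, 2])  # 99

# The last axis has size 3, so valid indices there are 0, 1 and 2.
try:
    arr[0, 0, 3] = 5
except IndexError as exc:
    print("IndexError:", exc)
```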
Reshaping a NumPy Array:
To change the shape of a NumPy array, you can use the ndarray.reshape() method. This method allows you to specify the new shape of the array, either as a tuple or as separate integers, and it returns an array with the specified shape. The new shape must hold exactly the same number of elements as the original.
For example, suppose you have the following 1D NumPy array:
arr = np.array([1, 2, 3, 4, 5, 6])
To change the shape of arr to a 2D array with 3 rows and 2 columns, you can use the following code:
new_arr = arr.reshape(3, 2)
The new_arr variable will now contain a 2D NumPy array with the same elements as arr, but with a different shape. The new array will look like this:
[[1, 2],
[3, 4],
[5, 6]]
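The sketch below runs the same reshape and adds three things that are easy to trip over: using -1 to let NumPy work out one of the sizes, the error you get when the element counts don't match, and the fact that the reshaped array usually shares its data with the original.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
new_arr = arr.reshape(3, 2)

print(arr.shape)      # (6,)   the original keeps its shape
print(new_arr.shape)  # (3, 2)

# -1 means "work this size out for me": 6 elements / 2 rows = 3 columns.
auto = arr.reshape(2, -1)
print(auto.shape)     # (2, 3)

# 6 elements cannot fill a 4x2 array.
try:
    arr.reshape(4, 2)
except ValueError as exc:
    print("ValueError:", exc)

# Here the reshaped array is a view onto the same data,
# so a change made through one name is visible through the other.
new_arr[0, 0] = 100
print(arr[0])                          # 100
print(np.shares_memory(arr, new_arr))  # True
```

If you need an independent array, calling new_arr.copy() gives you one.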
Note that the ndarray.reshape() method does not change the shape of the original array; it returns an array with the specified shape. This means you have to assign the result to a variable to keep it. Be aware that the returned array is usually a view that shares its data with the original, so changing an element in one changes it in the other. If you want to change the shape of the original array itself, you can use the ndarray.resize() method instead. This method modifies the array in-place, so you don't have to assign the result to a new variable.
For example, to change the shape of arr in-place to a 2D array with 3 rows and 2 columns, you can use the following code:
arr.resize(3, 2)
After running the above code, the arr variable will now contain the following array:
[[1, 2],
[3, 4],
[5, 6]]
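A minimal sketch of the in-place behaviour, using a fresh array so that nothing else refers to it. The result name is only there to show that resize() hands back nothing.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# resize() changes arr itself and returns None.
result = arr.resize(3, 2)

print(result)        # None
print(arr.shape)     # (3, 2)
print(arr.tolist())  # [[1, 2], [3, 4], [5, 6]]
```

Because resize() returns None, writing arr = arr.resize(3, 2) would throw the array away, which is an easy mistake to make when switching over from reshape().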
Now instead of reshaping into different dimensions, what if you want to “flip” the array so that its rows become columns? This is called transposing. To transpose a NumPy array, you can use the ndarray.T attribute of an array. This attribute returns an array with the same elements as the original array, but with the rows and columns swapped.
For example, suppose you have the following 2D NumPy array:
arr = np.array([[1, 2, 3], [4, 5, 6]])
The above array has 2 rows and 3 columns, so it looks like this:
[[1, 2, 3],
[4, 5, 6]]
To transpose the array, you can use the ndarray.T attribute like this:
new_arr = arr.T
The new_arr variable will now contain a new 2D NumPy array with the same elements as arr, but with the rows and columns swapped. The new array will look like this:
[[1, 4],
[2, 5],
[3, 6]]
Note that the ndarray.T attribute does not modify the original array; it returns a view with the rows and columns swapped. The ndarray.transpose() method behaves the same way: it also returns a transposed view and leaves the original untouched, so calling it without keeping the result has no effect. If you want arr itself to refer to the transposed array, assign the result back to the same variable.
For example, to replace arr with its transpose, you can use the following code:
arr = arr.transpose()
After running the above code, the arr variable will now contain the following array:
[[1, 4],
[2, 5],
[3, 6]]
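The sketch below checks how .T and transpose() actually behave: neither changes the array it is called on, both give back a view that shares data with it, and rebinding the name is how you keep the transposed version.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

t = arr.T
print(t.shape)    # (3, 2)
print(arr.shape)  # (2, 3)  the original is unchanged

arr.transpose()   # result is discarded, so nothing changes
print(arr.shape)  # (2, 3)

# t is a view: position (0, 1) in t is position (1, 0) in arr.
t[0, 1] = 40
print(arr[1, 0])  # 40

# Rebind the name to keep the transposed array.
arr = arr.T
print(arr.shape)  # (3, 2)
```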
In part 2, we will talk about slicing NumPy arrays as well as doing arithmetic operations on them.