Notebook

1 Objects and type: a quick recap
2 Vectors, Matrices and Arrays using NumPy
3 Libraries in Python: introducing NumPy for working with arrays
4 Creating vectors, matrices and arrays with NumPy
5 Further ways of creating NumPy arrays
6 Next steps

All content here is under a Creative Commons Attribution CC-BY 4.0 and all source code is released under a BSD-2 clause license. Parts of these materials were inspired by https://github.com/engineersCode/EngComp/ (CC-BY 4.0), L.A. Barba, N.C. Clementi.

Please reuse, remix, revise, and reshare this content in any way, keeping this notice.

Module 4: Overview¶

We cover the following topics here:

Recap of objects and types
What do we mean by vectors, matrices and arrays
Using a Python library: introducing NumPy
Creating vectors, matrices and arrays with NumPy
Special matrices in NumPy (e.g. identity matrices, random numbers)

Preparing for this module¶

Not much, just a general understanding of scalars, matrices and arrays.

This session appears lengthy, but it is a recap of very familiar topics.

Quickly go over what you are comfortable with; we hope to get everyone to the same level of understanding.

Objects and type: a quick recap¶

"Arrays store objects of the same type."

There's a lot in that sentence:

*objects*: do you recall what an object is in Python?
*type*: do you recall what a type is?

A quick recap might be helpful (refer to session 1 for a refresher)

*everything in Python is an object*. For example, a numeric value, a string, a list, a tuple. These are all just objects. Objects can be assigned to a variable, and they can also be the inputs for a function.
type(object) will tell you which type of object you have. For example type(45.2) will give float as a reply.

So now you should understand that an *array* is just a collection of these objects. Let's take a look with an example.

Here is a collection of floating-point objects:

[45.2, 91.2, 67.2, -23.78]

The type of each object is float (we could also have used int, or integer, objects). The 4 objects are collected in a list, and that list is also an object.

Remember you can always confirm the *type* of an *object* as follows. Try it:

type(45.2)
type(42)
type('some text')
type([45.2, 91.2, 67.2, -23.78])

In [ ]:

Vectors, Matrices and Arrays using NumPy¶

Let's quickly get a few definitions out of the way, and start. Start by collecting some objects together, first singly (scalar), then in a list (vector), then as a 'spreadsheet' (matrix), then as an array (3-dimensional, or higher dimensional).

Scalars¶

If our collection of (numeric) objects coincidentally is only a single number, we call that a *scalar*.

scalar_1 = 45.2

scalar_2 = 0

scalar_3 = -12

Vectors¶

A collection of scalars in a single row, or column, is very much like a list in regular Python. This collection we then call a *vector*.

list_1 = [1, 2, 6, -2, 0]

list_2 = [0, 0, 0, 0, 0, 0, 0, 0]

list_3 = [254.2, 501, 368.4, 697, 476.5, 188.2, 525.6, 451, 514]

We say this collection has a single dimension: a single row of numbers, or a single column of numbers. If there coincidentally is 1 number in the collection, we simply call that a scalar. But in theory we can store as many numbers as we like in our vector.

Think, for example, of the impeller speed of a batch reactor, measured every minute over the duration of a batch. This 1-dimensional sequence is called a vector.

Matrix¶

If we take several 1-dimensional vectors, but each one of the same length, and put them together, side-by-side then we get a *matrix*.

matrix_1 = [ [1, 2, 6, -2], [4, 3, 2, 1] ] # has 2 rows and 4 columns

matrix_2 = [ [0, 0, 0], [0, 0, 0], [0, 0, 0] ] # has 3 rows and 3 columns

matrix_3 = [ [9, 8, 7, 6], [5, 4, 4, 3] ] # also has 2 rows and 4 columns

You could crudely store, as we showed above, a matrix by using a list of lists, where the main list (the outside list) contains objects which themselves are lists. This is perfectly valid in Python: remember that a list can contain objects of any type, including other lists. But while this "list-of-lists" approach can store your data, it would not be great for calculations.

Try this (the result is completely unintuitive for mathematical operations):

matrix_1 + matrix_3
matrix_3 + 7
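
To make the point concrete, the short sketch below shows what Python does with these two expressions. It repeats the variables defined earlier so that it runs on its own; the try/except wrapper is only there so that the cell runs to completion instead of stopping at the error.

```python
matrix_1 = [[1, 2, 6, -2], [4, 3, 2, 1]]
matrix_3 = [[9, 8, 7, 6], [5, 4, 4, 3]]

# '+' on two lists concatenates them; it does not add element by element.
combined = matrix_1 + matrix_3
print(combined)        # a list of 4 rows, not a 2 x 4 matrix of sums
print(len(combined))   # 4

# Adding a number to a list is not defined at all.
try:
    matrix_3 + 7
except TypeError as error:
    print('TypeError:', error)
```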

Another point to note is that a vector is simply a matrix, but where one of the dimensions is equal to 1: either 1 row, or 1 column.

Matrices are widely used in engineering and data analysis. Often each row is an object, or a sample, or an observation. And each column represents some sort of value measured on that object or sample. For example:

| | Measurement 1 | Measurement 2 | Measurement 3 | Measurement 4 |
|---|---|---|---|---|
| Sample 1 | 5.5 | 0.55 | -23.4 | 561522.2 |
| Sample 2 | 6.7 | 0.44 | -22.2 | 526616.4 |
| Sample 3 | 4.9 | 0.61 | -38.1 | 612515.7 |

This matrix would have 3 rows and 4 columns.

Array¶

If we take several 2-dimensional matrices, but each one with the same number of rows and columns, and put them together, then we get a *3-dimensional array*.

A matrix was a list-of-lists. We can go up to a third dimension and make a list-of-lists-of-lists.

Why stop there? We can go to higher and higher dimensions. We use a general name for such a collection of (numeric) objects: an *array*.

An array is an n-dimensional structure of numbers. You can therefore say:

a vector is a 1-dimensional data structure
a matrix is a 2-dimensional data structure
an array is an n-dimensional data structure

For example, consider a 3-dimensional array of data collected in a lab, where we perform the same experiment N times. Each experiment is one layer of the array, and each layer is itself a matrix; the N layers lie on top of each other.

In each experiment we collect a matrix of data from several sensors. There are K sensors. We set the sensors to collect data at a regular interval, for example once every 3 seconds, so that we end up with exactly the same number of samples, J values, for each sensor.

Storing the data like this is useful, because now you could perform calculations on all experiments over all time, for all sensors in array X.

For example: you can calculate the average in the direction of arrow J, to reduce the array to a matrix. That matrix would be the average value of the sensor for the experiments. That reduced matrix would have N rows and K columns.
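
As a rough sketch of that reduction, the code below builds a small array and averages it along one axis. The axis order (N experiments, J samples, K sensors) and the sizes are assumptions made purely for illustration, and the values are placeholders rather than real measurements. The .mean() method belongs to the calculations covered in the next module.

```python
import numpy as np

N, J, K = 3, 5, 4     # assumed: 3 experiments, 5 time samples, 4 sensors

# Placeholder data: the numbers 0 to 59 arranged as N x J x K
X = np.arange(N * J * K).reshape((N, J, K))
print(X.shape)                  # (3, 5, 4)

# Average over the J direction (axis 1), leaving an N x K matrix
sensor_means = X.mean(axis=1)
print(sensor_means.shape)       # (3, 4)
```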

Engineering applications benefit from using vectors, or matrices or arrays: they are sequences of data all of the same type. Arrays behave a lot like lists in Python, except for the constraint that all elements have the same type.

Libraries in Python: introducing NumPy for working with arrays¶

There is an important Python library in science and engineering, called NumPy, that provides support for n-dimensional array data structures (a.k.a, ndarray).

Later on we will learn about the library called pandas (Python Data Analysis Library), which is better suited than NumPy for many situations. But underneath each pandas dataframe (we will define that term later), exists a NumPy array. So understanding NumPy is key to understanding pandas. Learning NumPy is also an easy step for people coming from MATLAB.

Let us import NumPy and get started.

Importing libraries¶

First, a word on importing libraries to expand your running Python session. Because libraries are large collections of code and are for special purposes, they are not loaded automatically when you launch Python (or IPython, or Jupyter). You have to import a library using the import command. For example, to import NumPy, you can enter:

import numpy

Once you execute that command, you can call any NumPy function using the dot notation, prepending the library name. For example, some commonly used functions are:

Part of the community effort of creating the Python libraries, is also an effort at maintaining excellent documentation.

To try:¶

Click and read one of those links to explore the documentation - the pages each have the same layout, so once you know where to look, you can quickly search and refer to the documentation for other functions.

Also try: dir(numpy). Do you remember what the dir(...) function does?

The dir(...) function applies to any *object* in Python, and numpy here, once imported, is also an object.

What *type* is numpy ?

In [ ]:

Importing libraries as an alias¶

You will find a lot of source code that uses a different syntax for importing. Most often you will see:

import numpy as np

All this does is create an alias for numpy with the shorter string np, so you then would call a NumPy function like this: np.linspace() instead of the lengthier numpy.linspace().

This is just an alternative way of doing it. It is arguably better that you are explicit (using the full numpy.), but practicality, code reuse, and screen real-estate often dictate that people write it simply as np. Both are fine.

import numpy
import numpy as np    # both do the same
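
A small check, offered as a sketch, shows that the alias is just a second name for the same library. The use of numpy.sqrt here is only an illustrative choice of function.

```python
import numpy
import numpy as np

# Both names point to the very same module object
print(np is numpy)            # True

# So both spellings call the same function
print(numpy.sqrt(16.0))       # 4.0
print(np.sqrt(16.0))          # 4.0
```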

In [ ]:

Creating your first array .... well vector, to be specific¶

To create a NumPy array from an existing Python list of numbers, we use numpy.array(), like this:

my_list = [3, 4, 7, -2, 11]
np.array(my_list)

# or more compactly, without the intermediate variable:
np.array([3, 4, 7, -2, 11])

Try it yourself:

Create an array of 11 numbers below, some negative, some positive, some integers, some floating point

# Create a vector of 11 numbers
import numpy as np
eleven = np.array([ ... ])
print(eleven)
print(len(eleven))  # verify the length

In [ ]:

Python allows you to create lists of mixed types, for example, strings, floating point, integers, etc. What happens if you try creating a NumPy array from a mixed list of object types?

*What happens?*

In this list there are 3 objects, of 3 different types. Try running the code below to verify:

my_list = ['abc', 123, 456.7] 
np.array(my_list)
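
Once you have tried it, the sketch below may help you interpret the result. It inspects the .dtype attribute, which reports the single type that NumPy chose for all the entries. The second list, numbers, is an extra illustration that is not part of the exercise above.

```python
import numpy as np

mixed = np.array(['abc', 123, 456.7])
print(mixed)              # every entry has been converted to a string
print(mixed.dtype)        # a Unicode string type; the reported width may vary

numbers = np.array([3, 4.5, -2])
print(numbers)            # the integers have been converted to floats
print(numbers.dtype)      # float64
```

In both cases NumPy picks one type that can represent every entry, because an array stores objects of the same type.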

In [ ]:

Creating vectors, matrices and arrays with NumPy¶

NumPy offers many ways to create arrays. Also read this overview.

Creating your first vector with NumPy¶

Scroll through the first link above to see just how many ways there are.

One of the simplest vectors we can create is a vector of just ones (1's). Try the numpy.ones() command below. We must tell NumPy how many array elements we would like.

# To try: change the '5' to some other integer number
import numpy as np
np.ones(shape=5)    # Using the explicit function call
np.ones(5)          # often we use this shortcut instead

There is also a command to create a vector of zeros:

np.zeros(shape=3)
np.zeros(3)

Here you see that Python functions can be called by specifying the function input name: in this example the single input shape is specified in np.zeros(shape=...).

In [ ]:

Creating your first matrix (a two-dimensional array)¶

For this we use the .ones() or .zeros() command, but we specify the shape argument differently. Instead of an integer, we provide a tuple.

twoD = np.ones(shape=(5,7))
print(twoD)

# Verify that the shape is what you expect:
print(twoD.shape)
print('------------')

naughts = np.zeros((5,7))
print(naughts)
print(type(naughts))      # you have now created an object with type `numpy.ndarray`

Every NumPy array can be queried using the .shape attribute. That means, add .shape to the array, and you will ask Python to return the attribute of that array called shape.

In [ ]:

Creating your first multi-dimensional array¶

Why stop at two-dimensions? Create a 3-dimensional array with 2 rows, 3 columns and 4 layers: in other words a $2 \times 3 \times 4$ array.

Just adjust the tuple provided to the shape argument:

threeD = np.zeros(shape=(2,3,4))
print(threeD)
print(threeD.shape)

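Is this what you expected to see? Note that NumPy prints the array as 2 blocks, each with 3 rows and 4 columns: the first number in the shape tuple counts the blocks. You might have to imagine the 3rd dimension going in and out of the screen.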
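
A quick way to relate the printed output to the shape tuple is to look at one block at a time. This is only a sketch; it uses the square-bracket indexing you know from lists, which also works on the first dimension of a NumPy array.

```python
import numpy as np

threeD = np.zeros(shape=(2, 3, 4))

# The first number in the shape counts the printed blocks
print(len(threeD))          # 2

# Each block is itself a matrix with 3 rows and 4 columns
print(threeD[0].shape)      # (3, 4)
```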

Try to create a matrix with 4 rows and 5 columns, where every value in the matrix is the number 8. Do this by making a matrix of only .ones() and multiplying that matrix by the value of 8.

Now do the same thing, using the np.full command. If you need help, please see the NumPy documentation for the np.full command.

# Step 1:
eights = np.ones( ___ ) * ___
print(eights)

# Step 2:
eight_again = np.full(shape=___, fill_value=___)
print(eight_again)

In [ ]:

Summary so far¶

You have created vectors, matrices and arrays. These have a specific .shape attribute that you can check.

There are several attributes of interest, but one that you will find useful is .ndim (the number of dimensions). Try it on one of your prior arrays.

These objects are of the type numpy.ndarray: an n-dimensional array.
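
The sketch below gathers these checks in one place. The variable names are illustrative; any of the arrays you created earlier would work just as well.

```python
import numpy as np

vector = np.ones(5)
matrix = np.ones((5, 7))
array_3d = np.zeros((2, 3, 4))

for item in (vector, matrix, array_3d):
    # .ndim is the number of dimensions; it equals len(item.shape)
    print(type(item).__name__, item.ndim, item.shape)
```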

In [ ]:

Further ways of creating NumPy arrays¶

In this section we will look at creating arrays, particularly matrices, in an efficient manner.

Identity matrices: what if you need an identity matrix (a matrix with 1's on the diagonal)?
Random matrices: arrays filled with random numbers
Vector sequences: say you need a vector where the entries are [0, 1, 2, 3, 4, ..., 9]
Matrix from a vector: take a vector (of say 12 entries) and reshape it into an array (of 3 rows and 4 columns)

In the next section we will look at each one of these.

Identity matrices¶

A square matrix with 1's on the diagonal and zeros everywhere else is known as an identity matrix. For example a $4\times 4$ identity matrix is: $$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

import numpy as np

# Read the help text for the `identity` function:
help(np.identity)
id5 = np.identity(n=5)
print(id5)

In [ ]:

A similar function to np.identity(...) is np.eye(...). It is a play on words, where eye refers to the uppercase letter $I$. The above $4\times 4$ matrix is often written as $I_4$ in mathematical notation.

Try the following, to see what they produce:

also_id5 = np.eye(5)
print(also_id5)
print('-----')

yet_again = np.eye(5, 5)
print(yet_again)
print('-----')

another_id5 = np.eye(5, 5, 0)  # start the 1's in the 0th position (i.e. row 1 and column 1)
print(another_id5)
print('-----')

# What if we want diagonal ones, but not on the main diagonal,
# but starting in the first row and third column rather?
print(np.eye(5, 5, 2))

After the above, can you explain the difference between np.identity() and np.eye()?
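
If you would like to check your answer, the following sketch compares the two functions directly. The sizes are arbitrary choices; k is the name NumPy gives to the third argument of np.eye, the offset of the diagonal.

```python
import numpy as np

# For a square matrix with ones on the main diagonal, both agree
print(np.array_equal(np.identity(4), np.eye(4)))   # True

# np.eye can also build a non-square matrix ...
wide = np.eye(3, 5)
print(wide.shape)                                  # (3, 5)

# ... and can shift the diagonal with its third argument, k
shifted = np.eye(4, 4, k=1)
print(shifted)
```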

In [ ]:

Arrays of random numbers¶

For simulations it is often helpful to create and use arrays of random values. Each value might be a starting position or state. Or sometimes you just want to test a piece of code, not only with 1's and 0's, but any random values.

For this it is helpful to create arrays of any shape, filled with random values:

import numpy as np

# Random floats between 0 (included) and 1 (not included)
rnd_matrix = np.random.random(size=(4,3))   
print(rnd_matrix)

# Or try a multi-dimensional array
rnd_array = np.random.random(size=(4, 2, 3))
print(rnd_array)

In [ ]:

Sometimes we want random integers instead, between a lower (low) and an upper (high) bound. The low value can appear in the result, but the high value cannot: the largest possible integer is high - 1.

# Run this code a few times to verify that you get -3, but never a +7
rnd_int = np.random.randint(low=-3, high=7, size=(4, 5))
print(rnd_int)
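
Instead of re-running the cell many times, you could draw a large sample once and inspect its extremes. This is a sketch only; the sample size of 10000 is an arbitrary choice, and .min() and .max() are array methods that return the smallest and largest entry.

```python
import numpy as np

# Draw many values so that every possible integer is very likely to appear
samples = np.random.randint(low=-3, high=7, size=10000)

print(samples.min())    # never below -3
print(samples.max())    # never above 6, because high=7 is excluded
```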

In [ ]:

Sequences¶

Vectors containing a sequence, such as [0, 1, 2, ... 9] or [2, 4, 6, 8, ... 12] are often used as a starting point for calculations. To create these we use the numpy.arange() and numpy.linspace() commands.

Syntax:

numpy.arange(start, stop, step)

start by default is zero
stop is not inclusive (in other words, NumPy will stop just before this value), and
the step has a default value of 1.

As mentioned above, Python functions can be called by specifying the input arguments (start and stop and step are the argument names).

Try it out below:

import numpy as np
np.arange(4)

# We could also have written the call below. It is
# unnecessary, because the defaults are already good
# enough, but it is explicit:
np.arange(start=0, stop=4, step=1)

np.arange(start=2, stop=6, step=1)

# Leave `step` unspecified if it is just "1"
np.arange(start=2, stop=6)  

# Most common usage: leave out the argument names
np.arange(2, 6)             

# Jump in steps of 2
np.arange(start=2, stop=9, step=2)  
np.arange(2, 9, 2)

We saw the built-in Python range function in an earlier module. So what is the difference between the NumPy library's np.arange function and the built-in range function?

Try replacing np.arange(...) with range and see what differences you notice.

Try using np.arange(...), but step in increments of 0.5, or 0.33333 instead. Note that you cannot do this with the range(...) function.
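
The sketch below summarises the main differences side by side. The start, stop and step values are arbitrary and were chosen so that they do not give away the exercises that follow.

```python
import numpy as np

# The built-in range gives a range object of integers
print(range(2, 6))            # range(2, 6)
print(list(range(2, 6)))      # [2, 3, 4, 5]

# np.arange gives a NumPy array straight away
print(np.arange(2, 6))        # [2 3 4 5]

# np.arange accepts a fractional step ...
halves = np.arange(0, 2, 0.5)
print(halves)                 # the values 0, 0.5, 1 and 1.5

# ... but range does not
try:
    range(0, 2, 0.5)
except TypeError as error:
    print('TypeError:', error)
```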

Create a sequence of values starting at $-4$ and ending just below $+4$, in steps of $1$

Create a sequence of values starting at $-2$ and ending just below $+2$, in steps of $0.5$. How many elements are in the sequence? Remember the len function? What about the .shape attribute?

Start at $+2$ and step *down* in increments of $0.25$, until just before $-2$. How many elements are in the sequence?

In [ ]:

There is also the np.linspace() command, which is similar to np.arange(). The differences are:

you specify the length of your sequence, instead of a step size.
the stop value *is included* by default, but it can be excluded.

It returns an array with evenly spaced numbers over the specified interval.

Syntax:

np.linspace(start, stop, num)

where the default value of num=50. Type help(np.linspace) to see how you can either include or exclude the endpoint.
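
As a small illustration, the sketch below requests 5 values between 0 and 1, once with and once without the stop value. The numbers are arbitrary; endpoint is the keyword argument of np.linspace that controls whether the stop value is included.

```python
import numpy as np

# 5 evenly spaced values from 0 to 1, with the stop value included
with_end = np.linspace(0, 1, num=5)
print(with_end)       # the values 0, 0.25, 0.5, 0.75 and 1

# The same request, but with the stop value left out
without_end = np.linspace(0, 1, num=5, endpoint=False)
print(without_end)    # the values 0, 0.2, 0.4, 0.6 and 0.8
```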

To try:¶

Confirm that you indeed get a sequence of 50 values when you do not specify num. Also confirm that the stop value is the last value in the vector.

Try to get a vector with fewer elements, say 6, instead of 50.

Go backwards again: create a sequence where the numbers decrease in value.

In [ ]:

Reshaping¶

Once you have a sequence of numbers in a long vector, you might want to fold them up into a matrix, or a multi-dimensional array.

Use the reshape function of a NumPy array to do that.

vector = np.arange(12)
matrix = vector.reshape((3, 4))

Note the order! NumPy will first fill each row, so the first row will be [0, 1, 2, 3] and then the next row will be [4, 5, 6, 7], and so on.
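
The row-by-row filling can be checked directly, as in the sketch below. It also shows what happens when the requested shape does not use exactly 12 entries; the shape (5, 3) is an arbitrary example of a mismatch, and the try/except wrapper is only there so that the cell runs to completion.

```python
import numpy as np

vector = np.arange(12)
matrix = vector.reshape((3, 4))
print(matrix[0])     # first row:  [0 1 2 3]
print(matrix[1])     # second row: [4 5 6 7]

# The new shape must use exactly the same number of entries (3 * 4 = 12)
try:
    vector.reshape((5, 3))
except ValueError as error:
    print('ValueError:', error)
```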

Try it:

vector = np.arange(12)
print('This is a vector with a shape of: ' + str(vector.shape))
matrix = vector.reshape((4, 3))
matrix = vector.reshape((2, 6))
print('This is a matrix with a shape of: ' + str(matrix.shape))
matrix = vector.reshape((4, 4)) # intentional error

In [ ]:

Next steps¶

Above we have created vectors, matrices and arrays in all sorts of formats. With ones, zeros, diagonals, random numbers, and sequences of numbers.

Next it is time to put these to use, and perform calculations on them. This is in the next module, module 5.

*Feedback and comments about this worksheet?* Please provide any anonymous comments, feedback and tips.

In [1]:

# IGNORE this. Execute this cell to load the notebook's style sheet.
from IPython.display import HTML
css_file = './images/style.css'
HTML(open(css_file, "r").read())
