Python Basics
General information on using Python for Data Science and Machine Learning
Updated: 03 September 2023
Based on this Cognitive Class Course
Labs
Jupyter Notebooks with Examples on these can be found in the labs folder
The Labs are from this Cognitive Class Course and are under the MIT License
Types
Hello World
We can simply print out a string in Python as follows

    print('Hello, World!')

Python Version
We can check our version as follows

    import sys
    print(sys.version)

The sys module is a built-in module that has many system-specific parameters and functions
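As a small sketch of what else the sys module offers, the snippet below reads sys.version_info, which holds the same version information in a form that is easier to compare in code. The variable names are only illustrative.

```python
import sys

# sys.version_info is a named tuple, which is easier to compare in code
# than the free-form sys.version string.
major, minor = sys.version_info.major, sys.version_info.minor
version_label = f"{major}.{minor}"
print("Running Python", version_label)

# Tuples compare element by element, so this checks for Python 3 or newer.
is_python3 = sys.version_info >= (3, 0)
print("Python 3 or newer:", is_python3)
```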
Comments
Comments are written using the # character

    # Python comments

Docstrings
Python also supports docstrings, which appear as the first statement of a function, class, or module. They are written as follows

    def hello():
        '''
        This function will say hello
        It also takes no input arguments
        '''
        return 'Hello'

    hello()

Also note that Python uses ' and " to mean the same thing
Types of Objects
Python is object oriented and dynamically typed. We can get the type of a value in Python with the type function

    type(12)       # int
    type(2.14)     # float
    type("Hello")  # str
    type(True)     # bool
    type(False)    # bool

We can get information about a type using the attributes of the sys module, for example

    sys.float_info

Type Conversion
We can use the following to convert between types

    float(2)     # 2.0
    int(1.1)     # 1
    int('1')     # 1
    str(1)       # '1'
    str(1.1)     # '1.1'
    int(True)    # 1
    int(False)   # 0
    float(True)  # 1.0
    bool(1)      # True

Expressions
Expressions in Python can include integers, floats, and strings, depending on the operation
We can do the following

    1 + 2   # addition: 3
    1 - 2   # subtraction: -1
    1 / 2   # division: 0.5
    1 // 2  # integer (floor) division: 0

Integer division rounds the result down to the nearest integer (towards negative infinity), so -1 // 2 is -1
It is also helpful to note that Python will obey BODMAS
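The following is a minimal sketch of how floor division and operator precedence behave in practice. The variable names are only illustrative.

```python
# Floor division rounds down (towards negative infinity),
# which differs from rounding to the nearest integer.
print(7 // 2)    # 3
print(-7 // 2)   # -4
print(7 / 2)     # 3.5

# Multiplication binds more tightly than addition unless brackets are used.
without_brackets = 2 + 3 * 4
with_brackets = (2 + 3) * 4
print(without_brackets, with_brackets)  # 14 20
```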
Variables
Variables can simply be assigned without being declared first, and are dynamically typed

    x = 2
    y = x / 2

    x = 2 + 4
    x = 'Hello'

In a notebook we can simply evaluate the value of a variable or expression by placing it as the last line of a cell
Strings
Defining Strings
Strings can be defined with either ' or ", and can be a combination of any characters

    'Hello World'
    'H3110 Wor!$'
    "Hello World"

Indexing
Strings are simply an ordered sequence of characters, so we can index them like any other sequence with [] as follows

    name = 'John'
    name[0]  # J
    name[3]  # n

We can also index negatively as follows

    name = 'John'
    name[-1]  # n
    name[-4]  # J

Length
We can get the length of a string with len()

    len(name)  # 4

Slicing
We can slice strings as follows

    name = 'John Smith'
    name[0:4]  # John
    name[5:7]  # Sm

Or generally as

    string[start:end]

Stride
We can also input the stride, which will select every nth value within a certain range

    string[::stride]
    string[start:stop:stride]

For example

    name[::3]    # Jnmh
    name[0:4:2]  # Jh

Concatenation
We can concatenate strings as follows

    text = 'Hello'
    text + text  # HelloHello
    text * 3     # HelloHelloHello

Escape Characters
At times we may need to escape some characters in a Python string. The available escape sequences are as follows
| Character | Escape |
|---|---|
| Backslash followed by a newline (both are ignored) | \ then a new line |
| Backslash (\) | \\ |
| Single quote (') | \' |
| Double quote (") | \" |
| ASCII Bell | \a |
| ASCII Backspace | \b |
| ASCII FF | \f |
| ASCII LF | \n |
| ASCII CR | \r |
| ASCII Tab | \t |
| ASCII VT | \v |
| Octal Character | \ooo |
| Hex Character | \xhh |
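
A few of the escape sequences from the table can be tried out as follows. This is only a sketch, and the strings and variable names are made up for illustration.

```python
# Each escape sequence below stands for a single character in the string.
tabbed = 'Name\tAge'                # \t is a tab
two_lines = 'first line\nsecond line'  # \n is a line feed
quoted = 'It\'s a "quote"'          # \' lets us use ' inside a '...' string
backslash = 'C:\\temp'              # \\ is one literal backslash

print(tabbed)
print(two_lines)
print(quoted)
print(backslash)

# Octal and hex escapes name a character by its code: 65 is 'A'.
octal_a = '\101'
hex_a = '\x41'
print(octal_a, hex_a)  # A A
```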
We can also write multi-line strings with """ or '''
If we have a string that would otherwise need escaping, we can use a raw string literal (prefixed with r) as follows

    text = r'\%\n\n\t'
    print(text)  # \%\n\n\t

String Operations
We have a variety of string operations such as

    text = 'Hello'
    text.upper()               # HELLO
    text.lower()               # hello
    text.replace('Hel', 'Je')  # Jelo
    text.find('l')             # 2
    text.find('ell')           # 1
    text.find('asnfoan')       # -1

Tuples
Define
A tuple is a way for us to store data of different types. This can be done simply as follows

    my_tuple = ('Hello', 3, 0.14)
    type(my_tuple)  # tuple

A key thing about tuples is that they are immutable. We can reassign the entire tuple, but not change its values
Indexing
We can index a tuple the same way as a string or list using positive or negative indexing

    my_tuple[1]   # 3
    my_tuple[-2]  # 3

Concatenation
We can also concatenate tuples

    my_tuple += ('pies', 'are', 3.14)
    my_tuple  # ('Hello', 3, 0.14, 'pies', 'are', 3.14)

Slice and Stride
We can slice and stride as usual with

    my_tuple[start:end]
    my_tuple[::2]
    my_tuple[0:4:2]

Sorting
We can sort a tuple with the sorted function, provided its elements can be compared with one another (for example, all numbers)

    sorted(my_tuple)

The sorted function will return a new list and leave the tuple unchanged
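As a short sketch of sorting, the example below uses a made-up tuple of numbers called ratings, since the elements of a tuple must be comparable with one another to be sorted.

```python
ratings = (10, 9, 6, 5, 10, 8, 9, 6, 2)

# sorted returns a new list and leaves the original tuple untouched.
sorted_ratings = sorted(ratings)
print(sorted_ratings)                 # [2, 5, 6, 6, 8, 9, 9, 10, 10]
print(type(sorted_ratings).__name__)  # list

# Wrap the result in tuple() if a tuple is needed again.
descending = tuple(sorted(ratings, reverse=True))
print(descending)                     # (10, 10, 9, 9, 8, 6, 6, 5, 2)
```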
Nesting
Since tuples can hold anything, they can also hold tuples

    my_tuple = ('hello', 4)
    my_tuple2 = (my_tuple, 'bye')

We can access elements of tuples with double indexing as follows

    my_tuple2[0][1]  # 4

Lists
Defining
A list is an easy way for us to store data of any form, such as numbers, strings, tuples, and lists
Lists are mutable and have many operations that enable us to work with them more easily

    my_list = [1, 2, 3, 'Hello']

Indexing
Lists can also be indexed using the usual method both negatively and positively

    my_list[1]   # 2
    my_list[-1]  # Hello

Operations
Slice and Stride

    my_list[start:end]  # slicing
    my_list[::stride]
    my_list[start:end:stride]

Extend
Extend will add each object to the end of the list

    my_list = [1, 2]
    my_list.extend([item1, item2])
    my_list  # [1, 2, item1, item2]

Append
Append will add the input as a single object at the end of the list

    my_list = [1, 2]
    my_list.append([item1, item2])
    my_list  # [1, 2, [item1, item2]]

Modify an element
List elements can be modified by referencing the index

    my_list = [1, 2]
    my_list[1] = 3
    my_list  # [1, 3]

Delete an Element
We can delete elements by index as well

    my_list = [1, 2, 3]
    del my_list[1]
    my_list  # [1, 3]
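The difference between extend and append is easiest to see side by side. The sketch below uses made-up list names and values.

```python
extended = [1, 2]
extended.extend(['a', 'b'])  # adds each item separately
print(extended)              # [1, 2, 'a', 'b']

appended = [1, 2]
appended.append(['a', 'b'])  # adds the whole list as one item
print(appended)              # [1, 2, ['a', 'b']]

print(len(extended), len(appended))  # 4 3

# del removes the item at the given index, here the nested list.
del appended[-1]
print(appended)              # [1, 2]
```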
String Splitting
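We can split a string into a list as follows. With no argument, split separates on whitespace, and with an argument it separates on that exact string

    my_list = 'hello'.split()
    my_list  # ['hello'], as there is no whitespace to split on

    my_list = list('hello')
    my_list  # ['h', 'e', 'l', 'l', 'o']

    my_list = 'hello, world, !'.split(', ')
    my_list  # ['hello', 'world', '!']

Cloning
Lists are stored by reference in Python. If we want to clone a list, we can make a shallow copy as follows

    new_list = my_list[:]

Sets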
A set is a collection of unique objects in Python. Sets will automatically remove duplicate items
Defining a Set

    my_set = {1, 2, 3, 1, 2}
    my_set  # {1, 2, 3}

Set Operations
Set from a List
We can create a set from a list with the set function

    my_set = set(my_list)

Add Element
We can add elements to a set with

    my_set.add("New Element")

If the element already exists nothing will happen
Remove Element
We can remove an element from a set with

    my_set.remove("New Element")

Check if Element is in Set
We can check if an element is in a set by using in, which will return a bool

    "New Element" in my_set  # False

Set Logic
When using sets we can compare them with one another
Intersection
We can find the intersection between sets with & or with the intersection function

    set_1 & set_2
    set_1.intersection(set_2)

Difference
We can find the difference in a specific set relative to another set with

    set_1.difference(set_2)

Which will give us the elements that set_1 has that set_2 does not
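A small worked example of intersection and difference is sketched below. The contents of set_1 and set_2 are made up for illustration.

```python
set_1 = {'rock', 'pop', 'soul'}
set_2 = {'pop', 'jazz'}

# Elements that appear in both sets.
common = set_1 & set_2
print(common)             # {'pop'}

# Difference depends on direction: what one set has that the other lacks.
only_in_1 = set_1.difference(set_2)
only_in_2 = set_2.difference(set_1)
print(sorted(only_in_1))  # ['rock', 'soul']
print(sorted(only_in_2))  # ['jazz']
```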
Union
We can get the union of two sets with

    set_1.union(set_2)

Superset
We can check if one set is a superset of another with

    set_1.issuperset(set_2)

Subset
We can check if one set is a subset of another with

    set_1.issubset(set_2)

Dictionaries
Dictionaries are like lists, but store data by a key instead of an index
Keys can be strings, numbers, or any immutable object such as a tuple
Defining
We can define a dictionary as a set of key-value pairs

    my_dictionary = {"key1": 1, "key2": "2", "key3": [3, 3, 3], "key4": (4, 4, 4), ('key5'): 5, (0, 1): 6, 92: 'hello'}

Accessing a Value
We can access a value by using its key, such as

    my_dictionary['key1']  # 1
    my_dictionary[(0,1)]   # 6
    my_dictionary[92]      # 'hello'

Get All Keys
We can get all the keys in a dictionary as follows

    my_dictionary.keys()

Append a Key
Key-value pairs can be added to a dictionary as follows

    my_dictionary['New Key'] = new_value

Delete an Entry
We can delete an entry by key using

    del my_dictionary['New Key']

Verify that Key is in Dictionary
We can use the in operator to check if a key exists in a dictionary

    'My Key' in my_dictionary

Conditions and Branching
Comparison Operators
We have a few different comparison operators which will produce a boolean based on their condition
| Operation | Operator | Example (True when i = 1) |
|---|---|---|
| equal | == | i == 1 |
| not equal | != | i != 0 |
| greater than | > | i > 0 |
| less than | < | i < 2 |
| greater than or equal | >= | i >= 0 and i >= 1 |
| less than or equal | <= | i <= 2 and i <= 1 |
Logical Operators
Python has the following logical operators
| Operation | Operator | Example (True when i = 1) |
|---|---|---|
| and | and | i == 1 and i < 2 |
| or | or | i == 1 or i == 2 |
| not | not | not(i != 0) |
String Comparison
When checking for equality Python will check if the strings are the same

    'hello' != 'bye'  # True

Comparing strings is based on the character codes of their characters, for example 'B' > 'A' because the ASCII code for B is 66 and for A is 65
When comparing strings like this, the comparison is done character by character from left to right, and the first differing character decides the result
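The built-in ord function returns the code of a character, which makes it possible to check why one string compares as greater than another. The strings below are just examples.

```python
# ord gives the numeric code that string comparison is based on.
print(ord('A'), ord('B'))   # 65 66
print('B' > 'A')            # True

# The first two characters match, so the third decides: 'p' < 'r'.
print('apple' < 'apricot')  # True

# Uppercase letters have lower codes than lowercase ones.
print(ord('Z'), ord('a'))   # 90 97
print('Z' < 'a')            # True
```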
Branching
Branching allows us to run different statements depending on a condition
If
The if statement will only run the code that forms part of its block if the condition is true

    i = 0
    if i == 0:
        print('Hello')

If-Else
An if-else can be done as follows

    i = 0
    if i == 1:
        print('Hello')
    else:
        print('Bye')

Elif
If we want to have multiple if conditions, but only have the first one that is true be executed we can do

    i = 0
    if i == 1:
        print('Hello')
    elif i == 0:
        print('Hello World')
    elif i > 1:
        print('Hello World!!')
    else:
        print('Bye')

Loops
For Loops
A for loop in Python iterates through a list (or any other iterable) and executes its internal code block for each item

    loop_vals = [1, 6, 2, 9]
    for i in loop_vals:
        print(i)
    # 1 6 2 9

Range
If we want to iterate through values without using a predefined list, we can use the range function to generate a sequence of values for us to iterate through
The range function works as follows

    ran = range(start, stop, step)
    list(ran)  # [start, start + step, start + 2*step, ...], stopping before stop

The range function only requires the stop value, the other two are optional (start defaults to 0 and step to 1). The stop value is not inclusive

    list(range(5))         # [0, 1, 2, 3, 4]
    list(range(5, 10))     # [5, 6, 7, 8, 9]
    list(range(5, 10, 2))  # [5, 7, 9]

Using this we can iterate through the values of our list as follows

    loop_vals = [1, 6, 2, 9]
    for i in range(len(loop_vals)):
        print(loop_vals[i])

While Loops
While loops will continue for as long as their condition is true

    i = 0
    while i < 10:
        print(i)
        i += 1
    # 0 1 2 3 4 5 6 7 8 9

Functions
Defining
Functions in Python are defined and called as follows

    def hello():
        print('Hello')

    hello()  # Hello

We can have arguments in our function

    def my_print(arg1, arg2):
        print(arg1, arg2)

    my_print('Hello', 'World')  # Hello World

Functions can also return values

    def my_sum(val1, val2):
        answer = val1 + val2
        return answer

    my_sum(1, 2)  # 3

A function can also have a variable number of arguments such as

    def sum_all(*vals):
        return sum(vals)

    sum_all(1, 2, 3)  # 6

The vals object will be taken in as a tuple
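To see that the variable arguments really arrive as a tuple, the sketch below uses a hypothetical helper called describe_args that reports the type, count, and sum of what it receives.

```python
def describe_args(*vals):
    # vals collects every positional argument into a single tuple.
    return type(vals).__name__, len(vals), sum(vals)

print(describe_args(1, 2, 3))   # ('tuple', 3, 6)
print(describe_args())          # ('tuple', 0, 0)

# A list can be unpacked into separate arguments with *.
numbers = [4, 5]
print(describe_args(*numbers))  # ('tuple', 2, 9)
```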
Function input arguments can also have default values as follows

    def has_default(arg1=4):
        print(arg1)

    has_default()   # 4
    has_default(5)  # 5

Or with multiple arguments

    def has_defaults(arg1, arg2=4):
        print(arg1, arg2)

    has_defaults(5)     # 5 4
    has_defaults(5, 6)  # 5 6

Help
We can get help about a function by calling the help function

    help(print)

This will give us help about the print function
Scope
Functions have access to variables that are globally defined, as well as their own local scope. Locally defined variables are not accessible from outside the function unless we declare them as global as follows

    def make_global():
        global global_var
        global_var = 5

    make_global()
    global_var  # 5

Note that global_var will not be defined until our function has been called at least once
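The sketch below contrasts reading a global variable, shadowing it with a local one, and modifying it with the global keyword. It assumes the code runs at module level (for example in a notebook cell), and the names counter, read_only, shadow, and increment are made up for illustration.

```python
counter = 0

def read_only():
    # Reading a global variable needs no declaration.
    return counter + 1

def shadow():
    # Assigning without global creates a separate local variable.
    counter = 100
    return counter

def increment():
    # The global declaration makes the assignment affect the module-level name.
    global counter
    counter += 1
    return counter

read_result = read_only()
shadow_result = shadow()
after_shadow = counter     # still 0, the global was not touched
increment()
after_increment = counter  # now 1

print(read_result, shadow_result, after_shadow, after_increment)  # 1 100 0 1
```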
Objects and Classes
Defining a Class
We can define a class Circle which has a constructor, a radius and a colour, as well as methods to increase its radius and to plot the Circle
We make use of matplotlib to plot our circle here

    import matplotlib.pyplot as plt
    %matplotlib inline

    class Circle(object):

        def __init__(self, radius=3, color='blue'):
            self.radius = radius
            self.color = color

        def add_radius(self, r):
            self.radius += r
            return self.radius

        def draw_circle(self):
            plt.gca().add_patch(plt.Circle((0, 0), radius=self.radius, fc=self.color))
            plt.axis('scaled')
            plt.show()

Instantiating an Object
We can create a new Circle object by using the class's constructor

    red_circle = Circle(10, 'red')

Interacting with our Object
We can use the dir function to get a list of all the attributes and methods on an object, many of which are defined by Python already

    dir(red_circle)

We can get our object’s property values by simply referring to them

    red_circle.color   # red
    red_circle.radius  # 10

We can also manually change the object’s properties with

    red_circle.color = 'pink'

We can call our object’s functions the same way

    red_circle.add_radius(10)  # 20
    red_circle.radius          # 20

The red_circle can be plotted by calling the draw_circle function
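For trying the class out without matplotlib installed, a stripped-down variant can be sketched as follows. This version leaves out the drawing method, and the area method is an addition for illustration only.

```python
import math

class Circle:
    """A simplified Circle without the plotting method."""

    def __init__(self, radius=3, color='blue'):
        self.radius = radius
        self.color = color

    def add_radius(self, r):
        self.radius += r
        return self.radius

    def area(self):
        # Hypothetical extra method, not part of the class defined above.
        return math.pi * self.radius ** 2

default_circle = Circle()          # uses the default radius and colour
red_circle = Circle(10, 'red')
red_circle.add_radius(10)

print(default_circle.radius, default_circle.color)  # 3 blue
print(red_circle.radius)                            # 20
print('add_radius' in dir(red_circle))              # True
```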
Reading Files
Note that the preferred method for reading files is using with
Open
We can use the built-in open function to read a file, which will provide us with a File object

    example1 = '/data/test.txt'
    file1 = open(example1, 'r')

The 'r' sets open to read mode. For write mode we can use 'w', and 'a' for append mode
Properties
File objects have some properties such as

    file1.name
    file1.mode

Read
We can read the file contents to a string with the following

    file_content = file1.read()

Close
Lastly we need to close our File object with

    file1.close()

We can verify that the file is closed with

    file1.closed  # True

With
A better way to read files is by using the with statement, which will automatically close the file, even if we encounter an exception

    with open(example1) as file1:
        file_content = file1.read()

We can also read the file in pieces, either based on characters or on lines
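The automatic closing behaviour of the with statement can be checked with a small self-contained sketch. It writes a throwaway file into a temporary directory, so the path and file contents here are assumptions made for the example only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.txt')

    # Create a small file to read back.
    with open(path, 'w') as out_file:
        out_file.write('first line\nsecond line\n')

    with open(path, 'r') as in_file:
        content = in_file.read()
        closed_inside = in_file.closed  # False while inside the block
    closed_after = in_file.closed       # True once the block has ended

    # The file is also closed when the block exits through an exception.
    try:
        with open(path, 'r') as failing_file:
            raise ValueError('simulated error')
    except ValueError:
        pass
    closed_after_error = failing_file.closed

print(closed_inside, closed_after, closed_after_error)  # False True True
```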
Read File by Characters
We can read the first four characters with

    with open(example1, 'r') as file1:
        content = file1.read(4)

Note that the file object keeps its position between calls to read() and does not start over each time, so we can read the first seven characters like so

    with open(example1, 'r') as file1:
        content = file1.read(4)
        content += file1.read(3)

Read File by Lines
Our File object looks a lot like a list with each line a new element in the list
We can read a single line from our file as follows

    with open(example1, 'r') as file1:
        content = file1.readline()

We can read each line of our file into a list by iterating over the file object like so

    content = []
    with open(example1, 'r') as file1:
        for line in file1:
            content.append(line)

Or with the readlines function like so

    with open(example1, 'r') as file1:
        content = file1.readlines()

Writing Files
We can also make use of open to write content to a file as follows

    out_path = 'data/output.txt'
    with open(out_path, 'w') as out_file:
        out_file.write('content')

The write function continues from where the previous call ended, in the same way as read, and it does not add a newline for us. If we want to write multiple lines to our file we need to add the newline character ourselves as follows

    content = ['Line 1 content', 'Line 2 content', 'Line 3 content']
    with open(out_path, 'w') as out_file:
        for line in content:
            out_file.write(line + '\n')

Copy a File
We can copy data from one file to another by simultaneously reading and writing between the files

    with open('readfile.txt', 'r') as readfile:
        with open('newfile.txt', 'w') as writefile:
            for line in readfile:
                writefile.write(line)

Pandas
Pandas is a library that is useful for working with data as a DataFrame in Python
Importing Pandas
The Pandas library will need to be installed and then imported into our notebook as

    import pandas as pd

Creating a DataFrame
We can create a new DataFrame in Pandas as follows

    df = pd.DataFrame({'Name': ['John', 'Jack', 'Smith', 'Jenny', 'Maria'],
                       'Age': [23, 12, 34, 13, 42],
                       'Height': [1.2, 2.3, 1.1, 1.6, 0.5]})

Read CSV as DataFrame
We can read a csv as a DataFrame with Pandas by doing the following

    csv_path = 'data.csv'
    df = pd.read_csv(csv_path)

Read XLSX as DataFrame
We need to install an additional dependency to do this first, and then read the file with the pd.read_excel function. Note that recent versions of Pandas read .xlsx files with openpyxl, since xlrd 2.0 and later only supports the older .xls format

    !pip install openpyxl
    xlsx_path = 'data.xlsx'
    df = pd.read_excel(xlsx_path)

View DataFrame
We can view the first few lines of our DataFrame as follows

    df.head()

Assume our data looks like the following
| | Name | Age | Height |
|---|---|---|---|
| 0 | John | 23 | 1.2 |
| 1 | Jack | 12 | 2.3 |
| 2 | Smith | 34 | 1.1 |
| 3 | Jenny | 13 | 1.6 |
| 4 | Maria | 42 | 0.5 |
Working with DataFrame
Assigning Columns
We can read the data from a specific column as follows

    ages = df[['Age']]

| | Age |
|---|---|
| 0 | 23 |
| 1 | 12 |
| 2 | 34 |
| 3 | 13 |
| 4 | 42 |
We can also assign multiple columns

    age_vs_height = df[['Age', 'Height']]

| | Age | Height |
|---|---|---|
| 0 | 23 | 1.2 |
| 1 | 12 | 2.3 |
| 2 | 34 | 1.1 |
| 3 | 13 | 1.6 |
| 4 | 42 | 0.5 |
Reading Cells
We can read a specific cell in one of two ways. The iloc indexer allows us to access a cell with the row and column positions, and the loc indexer lets us do this with the row label and column name

    df.iloc[1, 2]        # 2.3
    df.loc[1, 'Height']  # 2.3

Slicing
We can also do slicing using loc and iloc as follows. Note that iloc excludes the end position, while loc includes the end label

    df.iloc[1:3, 0:2]

| | Name | Age |
|---|---|---|
| 1 | Jack | 12 |
| 2 | Smith | 34 |

    df.loc[0:2, 'Age':'Height']

| | Age | Height |
|---|---|---|
| 0 | 23 | 1.2 |
| 1 | 12 | 2.3 |
| 2 | 34 | 1.1 |
Saving Data to CSV
Using Pandas, we can save our DataFrame to a CSV with

    df.to_csv('my_dataframe.csv')

Arrays
The Numpy library allows us to work with arrays the same as we would mathematically. In order to use Numpy we need to import it as follows

    import numpy as np

Arrays are similar to lists but are fixed size, and each element is of the same type
1D Arrays
Defining an Array
We can simply define an array as follows

    a = np.array([1, 2, 3])  # casting a list to array

Types
An array can only store data of a single type. We can find the type of the data in an array with

    a.dtype

Manipulating Values
We can easily manipulate values in an array by changing them as we would in a list. The same can be done with slicing and striding operations

    a = np.array([1, 2, 3])  # array([1, 2, 3])
    a[1] = 5                 # a is now array([1, 5, 3])
    b = a[1:3]               # array([5, 3])

We can also use a list to select specific indexes and even assign values to those indexes

    a = np.array([1, 2, 3])  # array([1, 2, 3])
    select = [1, 2]
    b = a[select]            # array([2, 3])
    a[select] = 0            # a is now array([1, 0, 0])

Attributes
An array has various properties and functions such as

    a = np.array([1, 2, 3])
    a.size    # number of elements
    a.ndim    # number of dimensions
    a.shape   # shape
    a.mean()  # mean of values
    a.max()   # max value
    a.min()   # min value

Array Operations
We have a few different operations on arrays such as

    u = np.array([1, 0])
    v = np.array([0, 1])
    u + v           # vector addition
    u * v           # element-wise multiplication
    np.dot(u, v)    # dot product
    np.cross(u, v)  # cross product
    u.T             # transpose array

Linspace
The linspace function can be used to generate an array with values over a specific interval

    np.linspace(start, end, num=divisions)
    np.linspace(-2, 2, num=5)  # array([-2., -1., 0., 1., 2.])
    np.linspace(0, 2*np.pi, num=10)
    # array([0.        , 0.6981317 , 1.3962634 , 2.0943951 , 2.7925268 ,
    #        3.4906585 , 4.1887902 , 4.88692191, 5.58505361, 6.28318531])

Plotting Values
We can apply a function to these values by using array operations, such as those mentioned above as well as others like

    x = np.linspace(0, 2*np.pi, num=100)
    y = np.sin(x) + np.cos(x)

2D Arrays
Defining a 2D Array
Two dimensional arrays can be defined by a list that contains nested lists of the same size as follows

    a = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])

We can similarly make use of the previously defined array operations
Accessing Values
Values in a 2D array can be indexed in either one of two ways

    a[1, 2]  # 23
    a[1][2]  # 23

Slicing
We can perform slicing as follows

    a[0][0:2]  # array([11, 12])
    a[0:2, 2]  # array([13, 23])

Mathematical Operations
We can perform the usual mathematical operations with 2D arrays as with 1D
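A few of these operations on 2D arrays are sketched below. The example assumes Numpy is installed, and the matrices x and y are made up for illustration.

```python
import numpy as np

x = np.array([[1, 0], [0, 1]])  # identity matrix
y = np.array([[2, 1], [1, 2]])

total = x + y                  # element-wise addition
scaled = 2 * y                 # scalar multiplication
elementwise = x * y            # element-wise product, not a matrix product
matrix_product = np.dot(x, y)  # matrix product, equals y as x is the identity

print(total)
print(scaled)
print(elementwise)
print(matrix_product)
```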
Dancing Man
The following script will make a dancing man if run in Jupyter, because why not

    from IPython.display import display, clear_output
    import time

    val1 = '(•_•)\n<) )╯\n/ \\'
    val2 = '\\(•_•)\n( (>\n/ \\'
    val3 = '(•_•)\n<) )>\n/ \\'

    while True:
        for pos in [val1, val2, val3]:
            clear_output(wait=True)
            print(pos)
            time.sleep(0.6)