Python Fundamentals Course
Welcome to the definitive Python course designed for both beginners and intermediate programmers. This comprehensive curriculum will take you from the core syntax of Python to building real-world applications using powerful libraries and frameworks. You'll master essential concepts, including data structures, object-oriented programming, data analysis, web development, and more. Join us to build a solid foundation and unlock your potential in the world of programming with Python.
Module 1: Python Fundamentals
Python Syntax and Variables
Python is celebrated for its clean, simple syntax that prioritizes readability. Unlike many other programming languages, Python uses indentation to define code blocks, making it visually intuitive and easy to understand. This structural feature is not just a style choice; it's a fundamental part of the language's rules, and incorrect indentation raises an `IndentationError`, which is a kind of syntax error. Four spaces per level is the standard convention (PEP 8); tabs also work, but mixing tabs and spaces inconsistently is itself an error. The absence of semicolons at the end of statements and of curly braces around blocks (such as `if` statements or `for` loops) further simplifies the code, allowing developers to focus on the logic rather than the boilerplate. These design choices contribute to Python's reputation as an excellent language for beginners.
Understanding variables is the first step in learning any programming language. In Python, a variable is a name bound to an object that Python keeps in memory. When you assign a value, Python creates the object if needed and attaches the name to it. Python is a dynamically typed language, which means you don't need to declare the variable's type (like integer, string, etc.) explicitly. You simply assign a value, and the type comes from the value itself. For example, `age = 30` binds the name to an integer, while `name = "Alice"` binds the name to a string. This flexibility speeds up development but also requires the programmer to be mindful of the data they are working with.
Rules for Naming Variables
While Python gives you freedom in naming variables, there are a few rules and conventions to follow to write clean and maintainable code. A variable name must start with a letter (a-z, A-Z) or an underscore (_); it cannot start with a number. The rest of the name can contain letters, numbers, and underscores, and the name cannot be a reserved keyword such as `for` or `class`. Variable names are also case-sensitive, meaning `myVariable` and `myvariable` are treated as two different variables. It's considered a best practice in the Python community to use snake_case for variable and function names (e.g., `first_name` instead of `firstName`) for better readability. Using a descriptive variable name, such as `customer_age` instead of just `age`, is also highly recommended to make your code self-documenting and easier for others (and your future self) to understand.
Variable assignment is straightforward. You use the equals sign (`=`) to assign a value to a variable. You can also assign multiple variables at once, which is a convenient Python feature. For instance, `x, y, z = 1, 2, 3` assigns each variable a corresponding value. Additionally, you can assign the same value to multiple variables, such as `a = b = c = 10`. This is a quick way to initialize several variables to the same starting value. Lastly, understanding the difference between assigning a value and referencing an object is crucial. In Python, variables are simply names that point to objects in memory. When you assign one variable to another (e.g., `a = b`), you're not copying the value but rather making both names point to the same object. This can have important implications, especially when dealing with mutable data types like lists, which we'll explore in a later module.
# Basic variable assignment
name = "Charlie"
age = 25
is_student = True
pi = 3.14159
# Displaying variables
print(f"Name: {name}")
print(f"Age: {age}")
print(f"Is student? {is_student}")
# Multiple assignment
x, y, z = "one", "two", "three"
print(x, y, z)
# Assigning the same value
a = b = c = "Python"
print(a, b, c)
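The point about names referring to shared objects is easiest to see with a mutable value. The following is a minimal sketch; the variable names (`original`, `alias`, `independent`) are illustrative and not part of the course material.

```python
# Two names bound to the same list object
original = [1, 2, 3]
alias = original            # no copy is made; both names refer to one list
alias.append(4)             # mutating through one name is visible through the other

print(original)             # [1, 2, 3, 4]
print(alias is original)    # True

# An explicit copy creates an independent list
independent = original.copy()
independent.append(5)
print(original)             # [1, 2, 3, 4]
print(independent)          # [1, 2, 3, 4, 5]

# Rebinding a name never affects other names
count = 10
other = count
other = other + 1           # 'other' now refers to a new integer object
print(count, other)         # 10 11
```

Use `.copy()` (or slicing) whenever the two names are meant to change independently.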
Try It Yourself: Create and Manipulate Variables
In the code block below, declare three variables: `school_name` as a string, `number_of_students` as an integer, and `is_accredited` as a boolean. Then, use the `print()` function to display the value of each variable.
# Your code here
school_name = "ReadyHT Academy"
number_of_students = 1500
is_accredited = True
print(f"School Name: {school_name}")
print(f"Number of Students: {number_of_students}")
print(f"Is Accredited: {is_accredited}")
Data Types and Type Conversion
In Python, every value has a data type, a classification that determines what kind of value it is and what you can do with it. Understanding these core data types is fundamental to writing effective code. The most common types include integers (whole numbers like `10`, `-5`), floats (decimal numbers like `3.14`, `-0.5`), strings (text enclosed in quotes like `"hello"`, `'world'`), and booleans (`True` or `False`). Python also includes more complex built-in types such as lists, tuples, and dictionaries, which we'll cover in a later module. Knowing the data type is crucial because it determines what operations you can perform on a value. For example, arithmetic such as subtraction and division works on integers and floats but not on strings, where `+` means concatenation and `*` means repetition. The logical operators (`and`, `or`, `not`) are most often used with booleans, although they accept any value and judge it by its truthiness.
Python provides a built-in function, `type()`, which allows you to check the data type of any variable. This is particularly useful for debugging or when you're working with data from external sources and need to confirm its structure. For example, `type(42)` will return `<class 'int'>`, while `type("42")` will return `<class 'str'>`. The distinction is critical because, while they look similar, their behavior in code is completely different. Python's dynamic typing is a double-edged sword; it's convenient, but it also means you need to be careful about unintended type mismatches that can lead to errors.
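To make the difference between `42` and `"42"` concrete, here is a small sketch. The variable names are illustrative, and the `try`/`except` wrapper is only there so the script keeps running after the mismatch.

```python
number = 42
text = "42"

print(type(number))    # <class 'int'>
print(type(text))      # <class 'str'>

print(number + 1)      # 43   (arithmetic)
print(text + "1")      # 421  (string concatenation)
print(text * 2)        # 4242 (string repetition)

# Mixing the two types in one addition raises a TypeError
try:
    result = number + text
except TypeError as error:
    result = None
    print(f"TypeError: {error}")

# isinstance() is a common way to branch on a value's type
print(isinstance(number, int))   # True
print(isinstance(text, int))     # False
```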
Type Conversion (Casting)
Often in programming, you'll need to convert data from one type to another, a process known as type conversion or casting. Python offers several built-in functions for this purpose, including `int()`, `float()`, `str()`, and `bool()`. These functions take a value as an argument and attempt to convert it to the specified type. For example, if you have a string representing a number, like `"100"`, you can convert it to an integer using `int("100")`. This is a common scenario when processing user input, because `input()` always returns a string. Similarly, converting a float to an integer using `int()` will truncate the decimal part (rounding toward zero), while converting a number to a string with `str()` is essential for concatenation or display.
It's important to be aware of the limitations of type conversion. You can't convert a non-numeric string like `"hello"` directly to an integer or float; this will raise a `ValueError`. Similarly, while converting numbers to booleans is possible (0 becomes `False`, any other number becomes `True`), the reverse isn't always meaningful. Type conversion is a powerful tool for manipulating data, but it must be used thoughtfully to avoid runtime errors. For example, when you read data from a file, it's often a string. To perform calculations, you'll need to cast those string values to numerical types first. This process is a fundamental part of data cleaning and preparation, which is essential for any serious programming task.
# Declaring different data types
integer_var = 150
float_var = 15.5
string_var = "150"
boolean_var = True
# Checking data types
print(f"Type of integer_var: {type(integer_var)}")
print(f"Type of float_var: {type(float_var)}")
print(f"Type of string_var: {type(string_var)}")
# Type conversion examples
# String to integer
str_to_int = int(string_var)
print(f"String to int: {str_to_int}, type: {type(str_to_int)}")
# Float to integer (truncates)
float_to_int = int(float_var)
print(f"Float to int: {float_to_int}, type: {type(float_to_int)}")
# Integer to float
int_to_float = float(integer_var)
print(f"Int to float: {int_to_float}, type: {type(int_to_float)}")
# Integer to string
int_to_string = str(integer_var)
print(f"Int to string: {int_to_string}, type: {type(int_to_string)}")
# Non-zero integer to boolean
int_to_bool = bool(150)
print(f"150 to bool: {int_to_bool}")
# Zero integer to boolean
zero_to_bool = bool(0)
print(f"0 to bool: {zero_to_bool}")
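The conversions above all succeed. The sketch below shows what happens when they do not, using a hypothetical helper named `parse_int` (not a built-in) that falls back to a default instead of letting the `ValueError` propagate.

```python
def parse_int(text, default=None):
    """Return int(text), or default when text is not a valid integer literal.

    Hypothetical helper for illustration only.
    """
    try:
        return int(text)
    except ValueError:
        return default

print(parse_int("100"))          # 100
print(parse_int("hello"))        # None
print(parse_int("199.99", 0))    # 0, because int() rejects strings containing a decimal point

# Going through float() first handles decimal strings
print(int(float("199.99")))      # 199

# int() truncates toward zero rather than rounding
print(int(-3.7))                 # -3

# Any non-empty string is truthy, even the text "False"
print(bool(""), bool("False"))   # False True
```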
Try It Yourself: Convert Data Types
Given the variable `price_str = "199.99"`, write code to convert it to a floating-point number. Then, convert the result to an integer. Print the type and value of each new variable to confirm your conversions.
# Your code here
price_str = "199.99"
price_float = float(price_str)
price_int = int(price_float)
print(f"Original string: {price_str}, type: {type(price_str)}")
print(f"Float value: {price_float}, type: {type(price_float)}")
print(f"Integer value: {price_int}, type: {type(price_int)}")
Control Structures and Loops
Real programs rarely execute strictly from top to bottom. Control structures and loops are the building blocks that allow you to dictate the flow of execution, enabling your code to make decisions and perform repetitive tasks. The most fundamental control structure is the `if` statement. This allows you to execute a block of code only if a specific condition is true. The syntax is simple: `if condition:`, followed by an indented block of code. You can extend this logic with `elif` (else if) to check for multiple conditions and `else` to provide a fallback block of code that runs if none of the preceding conditions are met. These statements are the backbone of decision-making in any program, from checking if a user's password is correct to determining which path an AI character should take in a game.
Loops are essential for automating repetitive tasks. Python provides two primary types of loops: the `for` loop and the `while` loop. A `for` loop is used for iterating over a sequence (like a list, tuple, or string) or other iterable objects. The loop will execute once for each item in the sequence. A common use of the `for` loop is with the built-in `range()` function, which generates a sequence of numbers, perfect for executing a block of code a specific number of times. The `while` loop, on the other hand, repeatedly executes a block of code as long as a condition is true. It's useful when you don't know in advance how many times you need to loop. For example, a `while` loop could be used to continuously ask a user for input until they provide a valid response. It's crucial to ensure that the condition of a `while` loop will eventually become false; otherwise, you'll create an infinite loop, which can cause your program to freeze.
Loop Control Statements
To give you more control over the flow of your loops, Python provides three special statements: `break`, `continue`, and `pass`. The `break` statement is used to exit a loop immediately, regardless of whether the loop's condition is still true. It's often used in situations where you've found what you're looking for and don't need to continue iterating. The `continue` statement skips the rest of the current iteration of the loop and moves on to the next one. This is useful when you want to bypass certain items in a sequence without stopping the loop entirely. Lastly, the `pass` statement is a null operation; it does nothing. It's often used as a placeholder in a part of the code where a statement is syntactically required but you don't want to execute any code yet. This is common when you're building the structure of a program and will fill in the details later. Mastering these control flow mechanisms allows you to write efficient, elegant, and powerful code that can handle a wide variety of scenarios.
# If-Elif-Else Statement
score = 85
if score >= 90:
    print("Grade: A")
elif score >= 80:
    print("Grade: B")
elif score >= 70:
    print("Grade: C")
else:
    print("Grade: F")

# For Loop
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

# While Loop
count = 0
while count < 3:
    print(f"Count is {count}")
    count += 1

# Using 'break'
for i in range(10):
    if i == 5:
        break
    print(i)  # Prints 0, 1, 2, 3, 4

# Using 'continue'
for i in range(5):
    if i == 2:
        continue
    print(i)  # Prints 0, 1, 3, 4
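The "keep asking until the answer is valid" pattern and the `pass` placeholder described above can be sketched as follows. To keep the example runnable without a keyboard, a hypothetical list of canned replies stands in for repeated `input()` calls.

```python
# Hypothetical stand-in for successive input() results
responses = iter(["", "abc", "7"])

attempts = 0
while True:                      # loop until a valid reply arrives
    reply = next(responses)      # in a real program: reply = input("Enter a number: ")
    attempts += 1
    if reply.isdigit():
        break                    # valid reply: leave the loop immediately
    print(f"Invalid reply: {reply!r}")

chosen = int(reply)
print(chosen, attempts)          # 7 3


def not_implemented_yet():
    pass                         # placeholder body; the function returns None


print(not_implemented_yet())     # None
```

The `while True` plus `break` form avoids repeating the prompt before and inside the loop; the loop terminates because the `break` is reached as soon as a reply consists only of digits.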
Try It Yourself: FizzBuzz with Loops
Write a `for` loop that iterates from 1 to 100. For each number, print "Fizz" if it's divisible by 3, "Buzz" if it's divisible by 5, "FizzBuzz" if it's divisible by both, and the number itself otherwise. Use an `if`/`elif`/`else` structure within the loop.
# Your code here
for num in range(1, 101):
    if num % 3 == 0 and num % 5 == 0:
        print("FizzBuzz")
    elif num % 3 == 0:
        print("Fizz")
    elif num % 5 == 0:
        print("Buzz")
    else:
        print(num)
Functions and Lambda Expressions
Functions are a cornerstone of effective programming, allowing you to organize code into reusable, modular blocks. In Python, you define a function using the `def` keyword, followed by the function name, a set of parentheses for parameters, and a colon. The code inside the function is indented. Functions can take arguments, which are values passed into the function to be used in its operations. They can also return a value using the `return` keyword. A function that doesn't explicitly return a value will implicitly return `None`. By breaking down your program into smaller, logical functions, you can make your code easier to read, maintain, and debug. Functions also support the DRY (Don't Repeat Yourself) principle, because they let you avoid writing the same block of code multiple times. Functions are first-class citizens in Python as well, which means you can pass them as arguments to other functions, return them from functions, and assign them to variables, opening up advanced programming techniques.
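The three "first-class" behaviours mentioned above (assigning a function to a variable, passing it as an argument, and returning it from another function) can be sketched like this. All function names here are made up for illustration.

```python
def shout(text):
    return text.upper() + "!"


def apply_twice(func, value):
    """Call func on value, then call it again on the result."""
    return func(func(value))


def make_multiplier(factor):
    """Return a new function that multiplies its argument by factor."""
    def multiply(number):
        return number * factor
    return multiply


speak = shout                     # assign a function to another name
print(speak("hello"))             # HELLO!

print(apply_twice(shout, "hi"))   # HI!!

triple = make_multiplier(3)       # a function returned from a function
print(triple(5))                  # 15


def log_call():
    print("called")               # no return statement


print(log_call())                 # prints "called", then None
```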
Python offers another way to create small, anonymous functions on the fly: lambda expressions. A lambda function can take any number of arguments but its body is limited to a single expression. It is defined using the `lambda` keyword. The syntax is `lambda arguments: expression`. Lambda functions are primarily used when you need a simple function for a short period of time, typically as an argument to a higher-order function (a function that takes other functions as arguments), such as `map()`, `filter()`, or `sorted()`. For example, if you wanted to sort a list of tuples based on the second element, you could use a lambda expression as the key for the `sorted()` function. While they are powerful for simple, one-off tasks, their single-expression nature means they cannot contain statements such as `if`/`else` blocks, loops, or `return` (a conditional expression like `a if condition else b` is allowed because it is an expression), making them less suitable for complex logic. For any multi-line or more complex function, a standard `def` function is the way to go.
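As a rough illustration of lambdas passed to `sorted()`, `map()`, and `filter()`, including the tuple-sorting case mentioned above, consider the following sketch. The sample data is invented.

```python
# Sort a list of tuples by the second element
pairs = [("pear", 3), ("apple", 1), ("fig", 2)]
by_count = sorted(pairs, key=lambda pair: pair[1])
print(by_count)    # [('apple', 1), ('fig', 2), ('pear', 3)]

numbers = [1, 2, 3, 4, 5, 6]

# map() applies the lambda to every item; filter() keeps items where it returns True
doubled = list(map(lambda n: n * 2, numbers))
evens = list(filter(lambda n: n % 2 == 0, numbers))
print(doubled)     # [2, 4, 6, 8, 10, 12]
print(evens)       # [2, 4, 6]

# A conditional expression is still a single expression, so it fits in a lambda
parity = lambda n: "even" if n % 2 == 0 else "odd"
print(parity(3))   # odd
```

`map()` and `filter()` return lazy iterators, which is why the results are wrapped in `list()` before printing.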
Arguments and Parameters
A key part of working with functions is understanding how to pass data to them. Python supports several types of arguments. Positional arguments are the most common, where the order of arguments matters. The first argument passed will be assigned to the first parameter, the second to the second, and so on. Keyword arguments allow you to pass arguments by name, which is useful for improving readability, especially with functions that have many parameters. Default arguments allow you to specify a default value for a parameter; if a value isn't provided during the function call, the default is used. This makes functions more flexible and easier to use. Lastly, you can use `*args` and `**kwargs` to handle a variable number of positional and keyword arguments, respectively. These special syntaxes are invaluable when you're writing functions that need to be flexible enough to handle a varying number of inputs, such as a function that can calculate the sum of any number of values.
# A simple function
def greet(name):
    """This function greets the person passed in as a parameter."""
    return f"Hello, {name}!"

print(greet("Alice"))

# A function with a default argument
def power(base, exponent=2):
    return base ** exponent

print(power(3))     # Uses default exponent (2)
print(power(3, 3))  # Overrides default exponent

# A simple lambda expression
square = lambda x: x * x
print(f"Square of 5 is: {square(5)}")

# Using a lambda as the key for list.sort()
points = [{'x': 2, 'y': 3}, {'x': 4, 'y': 1}]
points.sort(key=lambda p: p['y'])
print(points)  # Sorted in place by 'y' value
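The argument styles described in this section (positional, keyword, default, `*args`, and `**kwargs`) might be combined as in the sketch below. The functions `total` and `build_profile` are hypothetical examples, not part of any library.

```python
def total(*values):
    """Sum any number of positional arguments (collected into a tuple)."""
    return sum(values)


def build_profile(name, role="student", **extra):
    """Combine a required argument, a default argument, and arbitrary keyword arguments."""
    profile = {"name": name, "role": role}
    profile.update(extra)      # extra is a dict of the remaining keyword arguments
    return profile


print(total())                 # 0
print(total(1, 2, 3))          # 6

print(build_profile("Alice"))
# {'name': 'Alice', 'role': 'student'}

print(build_profile("Bob", role="teacher", city="Paris"))
# {'name': 'Bob', 'role': 'teacher', 'city': 'Paris'}

# Keyword arguments can be given in any order
print(build_profile(role="admin", name="Eve"))
# {'name': 'Eve', 'role': 'admin'}
```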
Try It Yourself: Create a Function and a Lambda
1. Write a function called `calculate_area` that takes two arguments, `width` and `height`, and returns their product.
2. Create a list of numbers: `numbers = [1, 5, 2, 8, 3]`. Use a lambda function with Python's `sorted()` function to sort the list in descending order. Print the result.
# Your function
def calculate_area(width, height):
    return width * height

print(f"The area is: {calculate_area(10, 5)}")

# Your lambda expression
numbers = [1, 5, 2, 8, 3]
sorted_numbers = sorted(numbers, key=lambda x: -x)
print(f"Sorted descending: {sorted_numbers}")

# Alternative without a lambda: pass reverse=True
sorted_numbers_alt = sorted(numbers, reverse=True)
print(f"Sorted with reverse: {sorted_numbers_alt}")
Error Handling and Debugging
Writing perfect code on the first try is nearly impossible. Errors are an inevitable part of the programming process, and understanding how to handle them is a crucial skill. In Python, errors that occur during program execution are called exceptions. When an exception occurs, the program's normal flow is interrupted, and unless the exception is handled the program stops with a traceback. Common exceptions include `NameError` (when you try to use a variable that hasn't been defined), `TypeError` (when an operation is performed on an inappropriate data type), and `ValueError` (when a function receives an argument of the correct type but an inappropriate value). The key to writing robust code is to anticipate these potential errors and handle them gracefully, preventing your program from crashing and providing a better user experience. This is where Python's error handling mechanisms come into play.
Python uses a `try`-`except` block to handle exceptions. The code that might cause an error is placed inside the `try` block. If an exception occurs during the execution of the `try` block, the program's flow jumps to the matching `except` block. Here, you can write code to handle the error, such as printing a user-friendly message, logging the error for later analysis, or providing a default value. You can specify the exact type of exception you want to catch (e.g., `except ValueError:`) or use a generic `except Exception:` to catch nearly all errors (it does not catch `KeyboardInterrupt` or `SystemExit`). Catching specific exceptions keeps your error handling precise and avoids hiding unrelated bugs. Additionally, you can include an `else` block that runs only if the `try` block executes without any errors. The `finally` block, if included, will always run, whether an exception occurred or not. This is particularly useful for cleanup operations, like closing a file or a database connection, which must happen no matter what. The strategic use of `try`-`except` blocks is a hallmark of professional, production-ready code.
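To see which of the `except`, `else`, and `finally` blocks run in each situation, here is a small sketch. The helper `safe_divide` and its `blocks_run` list are illustrative inventions that simply record the path taken.

```python
def safe_divide(numerator, denominator):
    """Divide two values and record which blocks ran (illustrative helper)."""
    blocks_run = []
    result = None
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        blocks_run.append("except ZeroDivisionError")
    except TypeError:
        blocks_run.append("except TypeError")
    else:
        blocks_run.append("else")        # only when no exception was raised
    finally:
        blocks_run.append("finally")     # always runs
    return result, blocks_run


print(safe_divide(10, 2))    # (5.0, ['else', 'finally'])
print(safe_divide(10, 0))    # (None, ['except ZeroDivisionError', 'finally'])
print(safe_divide(10, "2"))  # (None, ['except TypeError', 'finally'])
```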
Introduction to Debugging
While error handling deals with expected runtime errors, debugging is the process of finding and fixing unexpected bugs in your code. The simplest form of debugging involves using `print()` statements to track the values of variables at different points in your program. This can help you identify where the program's state deviates from what you expect. However, for more complex issues, a dedicated debugger is a much more powerful tool. Python's built-in `pdb` module allows you to set breakpoints in your code, step through the execution line by line, inspect variable values, and much more. Most modern Integrated Development Environments (IDEs) like VS Code or PyCharm have excellent graphical debuggers that provide an even more intuitive interface for this process. Knowing how to use a debugger effectively is an indispensable skill that can save you countless hours of frustration. When a bug arises, a systematic approach (reproducing the bug, isolating the problem, and then using a debugger to trace the code's execution) is the most efficient way to find a solution. Embracing a debugging mindset will transform the way you approach programming challenges.
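A minimal sketch of the two debugging styles mentioned above follows: a temporary `print()` trace, and a commented-out `breakpoint()` call, which is the built-in entry point into `pdb` from Python 3.7 onward. The `average` function is a made-up example.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:
        total += value
        # Temporary trace: shows how the running total evolves
        print(f"DEBUG value={value!r} total={total!r}")
    # breakpoint()  # uncomment to pause here and inspect variables in pdb
    return total / len(values)


print(average([2, 4, 9]))   # 5.0
```

Once the debugger prompt appears, commands such as `n` (next line), `p total` (print a variable), and `c` (continue) cover most day-to-day use. Remove trace prints and breakpoints before committing the code.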
# Basic try-except block
try:
    num1 = int(input("Enter a number: "))
    num2 = 10 / num1
    print(f"Result: {num2}")
except ValueError:
    print("Invalid input. Please enter a number.")
except ZeroDivisionError:
    print("Cannot divide by zero.")

# Try-except-else-finally block
def process_file(filename):
    f = None  # Stays None if open() fails
    try:
        f = open(filename, 'r')
        content = f.read()
    except FileNotFoundError:
        print(f"Error: The file '{filename}' was not found.")
        return None
    else:
        print("File read successfully!")
        return content
    finally:
        # Runs on every path, including the returns above
        if f is not None:
            f.close()
            print("File closed.")

print(process_file("nonexistent_file.txt"))
Try It Yourself: Create an Error-Proof Calculator
Write a program that asks the user for two numbers and an operator (+, -, *, /). Use a `try`-`except` block to handle potential `ValueError` (if the user enters non-numeric input) and `ZeroDivisionError` (if the user tries to divide by zero). For any other error, use a generic `except` block. Your code should gracefully handle these errors without crashing.
# Your code here
try:
    num1 = float(input("Enter the first number: "))
    operator = input("Enter an operator (+, -, *, /): ")
    num2 = float(input("Enter the second number: "))
    if operator == '+':
        print(f"Result: {num1 + num2}")
    elif operator == '-':
        print(f"Result: {num1 - num2}")
    elif operator == '*':
        print(f"Result: {num1 * num2}")
    elif operator == '/':
        print(f"Result: {num1 / num2}")
    else:
        print("Invalid operator.")
except ValueError:
    print("Error: Please enter a valid number.")
except ZeroDivisionError:
    print("Error: Division by zero is not allowed.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
Module 2: Data Structures
Lists, Tuples, and Sets
Python provides several built-in data structures to efficiently store and organize data. The first and most versatile is the **list**. A list is an ordered, mutable collection of items, meaning you can change its contents after creation. You define a list using square brackets `[]`, with items separated by commas. Lists can hold elements of different data types and are indexed, allowing you to access any element by its position. Common list methods include `.append()` to add an item to the end, `.insert()` to add an item at a specific index, `.remove()` to delete an item, and `.sort()` to sort the list. Slicing, which uses a colon `:` within the square brackets, is a powerful feature that lets you extract a portion of a list, making it highly flexible for data manipulation. Lists are the go-to data structure for most collection-based tasks in Python, but their mutability can sometimes lead to unexpected side effects if not handled carefully, especially when passing them to functions.
Next are **tuples**, which are similar to lists but with one key difference: they are immutable. This means that once a tuple is created, its contents cannot be changed, added to, or removed. You define a tuple using parentheses `()`. Tuples are also ordered and indexed. Their immutability makes them a good choice for data that should not be altered, such as the coordinates of a point or the days of the week. Because they are immutable, tuples can be slightly more efficient in terms of memory and performance compared to lists. They are also often used when a function needs to return multiple values, as Python's syntax allows you to return a comma-separated sequence of values that are automatically packaged into a tuple. For example, `return "John", 30` returns a tuple. A tuple is also hashable as long as every element it contains is hashable, which means it can be used as a key in a dictionary or as an element in a set, unlike a list.
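The tuple behaviours described above (packing multiple return values, unpacking, use as dictionary keys, and immutability) can be sketched as follows. The function `min_max` and the sample data are hypothetical.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)     # the comma packs the two results into a tuple


result = min_max([4, 9, 1])
print(result)                           # (1, 9)

low, high = result                      # unpack into separate names
print(low, high)                        # 1 9

# A tuple of hashable items can serve as a dictionary key
distances = {(0, 0): "origin", (3, 4): "point A"}
print(distances[(3, 4)])                # point A

# A list cannot be a key, and a tuple cannot be modified in place
try:
    bad = {[0, 0]: "origin"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

try:
    result[0] = 100
    changed = True
except TypeError:
    changed = False

print(list_key_ok, changed)             # False False
```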
Sets: Unique and Unordered
Finally, we have **sets**, which are unordered collections of unique items. You define a set by placing items inside curly braces `{}` or by calling the `set()` constructor; an empty pair of braces creates a dictionary, so an empty set must be written `set()`. The defining characteristic of a set is that it cannot contain duplicate elements. If you try to add a duplicate, the set is simply left unchanged. This makes sets ideal for tasks like removing duplicates from a list or performing mathematical set operations like unions, intersections, and differences. Sets are also highly optimized for checking for the presence of an item, making them much faster than lists for membership testing, especially as the collection grows. However, because they are unordered, you cannot access items by index. Sets provide methods like `.add()` and `.remove()` for changing their contents. The choice between a list, a tuple, or a set depends entirely on your use case: use a list for a mutable, ordered collection; a tuple for an immutable, ordered collection; and a set for an unordered collection of unique items.
# Lists: Ordered and mutable
my_list = [1, 2, "hello", 4.5]
my_list.append(5) # Add an item
print(my_list) # Output: [1, 2, 'hello', 4.5, 5]
print(my_list[2]) # Access by index: 'hello'
my_list[0] = 10 # Modify an item
print(my_list) # Output: [10, 2, 'hello', 4.5, 5]
# Tuples: Ordered and immutable
my_tuple = (1, "world", 3)
print(my_tuple[1]) # Access by index: 'world'
# my_tuple[1] = "python" # This will cause an error
# Sets: Unordered and unique
my_set = {1, 2, 3, 2, 1}
print(my_set) # Output: {1, 2, 3} (duplicates are removed)
my_set.add(4) # Add an item
print(my_set) # Output: {1, 2, 3, 4}
my_set.add(1) # This has no effect
print(my_set) # Output: {1, 2, 3, 4}
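The examples above do not exercise the other list methods, slicing, or the set operations mentioned in this lesson. The following sketch fills that gap with invented sample data; set results are passed through `sorted()` only so the printed order is predictable.

```python
# List methods and slicing
letters = ["d", "a", "c"]
letters.insert(1, "b")       # ['d', 'b', 'a', 'c']
letters.remove("d")          # ['b', 'a', 'c']
letters.sort()               # ['a', 'b', 'c']
print(letters)               # ['a', 'b', 'c']
print(letters[0:2])          # ['a', 'b']   (start index included, stop index excluded)
print(letters[-1])           # c            (negative indexes count from the end)
print(letters[::-1])         # ['c', 'b', 'a']

# Set operations
a = {1, 2, 3}
b = {3, 4}
print(sorted(a | b))         # [1, 2, 3, 4]  union
print(sorted(a & b))         # [3]           intersection
print(sorted(a - b))         # [1, 2]        difference
print(3 in a)                # True          fast membership test

# An empty set needs set(); {} creates an empty dict
empty_set = set()
print(type(empty_set), type({}))   # <class 'set'> <class 'dict'>
```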
Try It Yourself: Manipulate Data Structures
1. Create a list called `colors` with at least one duplicate color. Print the original list.
2. Convert the list to a set to remove duplicates, and then convert the set back into a list. Print the final list.
3. Create a tuple containing three of your favorite movies. Try to change one of the movie titles to see what happens.
# Your code here
colors = ["red", "blue", "green", "red", "yellow", "blue"]
print(f"Original list: {colors}")
unique_colors_set = set(colors)
unique_colors_list = list(unique_colors_set)
print(f"List with duplicates removed: {unique_colors_list}")
favorite_movies = ("Inception", "The Matrix", "Interstellar")
print(f"Favorite movies tuple: {favorite_movies}")
# Uncommenting the line below will cause a TypeError
# favorite_movies[0] = "Dune"
print("Trying to change an item in a tuple will result in an error.")
Dictionaries and Data Manipulation
After mastering lists, tuples, and sets, the next crucial data structure to learn is the **dictionary**. A dictionary is a mutable collection that stores items as key-value pairs; since Python 3.7 it also preserves the order in which keys were inserted. You define a dictionary using curly braces `{}`, with each key separated from its value by a colon, like this: `{"key": "value"}`. Each key must be unique and hashable (in practice, an immutable object such as a string, number, or tuple), while the value can be any data type, including other dictionaries or lists. Dictionaries are incredibly powerful for representing structured data, such as a user profile, a product catalog, or configuration settings. Instead of accessing data by a numerical index, you access it by its unique key, which makes retrieving information fast and intuitive. This key-based retrieval is one of the most powerful features of Python dictionaries, as it provides a direct link to the data you need without having to iterate through the entire collection.
Manipulating dictionaries is a common task in Python programming. You can add a new key-value pair to a dictionary by simply assigning a value to a new key: `my_dict["new_key"] = "new_value"`. You can update an existing value in the same way. To remove a key-value pair, you can use the `del` keyword or the `.pop()` method. The `.pop()` method is often preferred because it returns the value of the removed key, which can be useful. Dictionaries also have several built-in methods for data retrieval and manipulation. The `.keys()` method returns a view of all the keys, `.values()` returns a view of all the values, and `.items()` returns a view of all the key-value pairs. Iterating over these views is a common pattern for processing dictionary data. Dictionaries are at the heart of many applications, from web development (handling JSON data) to data science (organizing experimental results), so a deep understanding of them is essential for any Python developer.
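As a brief sketch of `.pop()` and the three view methods described above, consider a made-up `inventory` dictionary. The `.get()` call is an extra that is not discussed in the text; it is included because it is the usual way to read a key that may be absent.

```python
inventory = {"apples": 3, "pears": 5}
inventory["plums"] = 7                   # add a new key-value pair

removed = inventory.pop("pears")         # removes the key and returns its value
print(removed)                           # 5

missing = inventory.pop("kiwis", 0)      # a default avoids KeyError for absent keys
print(missing)                           # 0

print(inventory.get("figs"))             # None (no KeyError for a missing key)

# Views reflect the dictionary's insertion order (guaranteed since Python 3.7)
print(list(inventory.keys()))            # ['apples', 'plums']
print(list(inventory.values()))          # [3, 7]

for name, quantity in inventory.items():
    print(f"{name}: {quantity}")
```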
Data Manipulation and Comprehensions
Python's data structures are most powerful when combined with its elegant syntax for manipulation. One of the most Pythonic and efficient ways to create new lists, dictionaries, or sets is through **comprehensions**. A list comprehension, for example, allows you to create a new list in a single, concise line of code. The general format is `[expression for item in iterable if condition]`. This is a compact and readable alternative to a traditional `for` loop. For example, instead of writing a loop to square all numbers in a list, you can simply write `[x**2 for x in my_list]`. Similarly, dictionary comprehensions allow you to create dictionaries dynamically. This can be used to swap keys and values or to filter items. Comprehensions not only make your code more concise but are also often faster than their loop-based counterparts. Mastering these one-line data manipulation techniques is a key skill that distinguishes a proficient Python programmer. Comprehensions make code cleaner and more expressive, which is a core tenet of Python's design philosophy.
# A dictionary for a user profile
user_profile = {
    "name": "Alex",
    "age": 28,
    "email": "alex@example.com",
    "is_active": True
}
# Accessing values
print(f"User's name: {user_profile['name']}")
# Adding a new key-value pair
user_profile['city'] = "New York"
print(user_profile)
# Updating a value
user_profile['age'] = 29
print(user_profile)
# Removing a key-value pair
del user_profile['is_active']
print(user_profile)
# List Comprehension
numbers = [1, 2, 3, 4, 5]
squared_numbers = [num ** 2 for num in numbers]
print(f"Squared numbers: {squared_numbers}")
# Dictionary Comprehension
words = ['apple', 'banana', 'cat']
word_lengths = {word: len(word) for word in words}
print(f"Word lengths: {word_lengths}")
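The comprehension examples above omit the optional `if` filter and the key-value swap mentioned in the text. A short sketch with invented data covers both, alongside the equivalent explicit loop for comparison.

```python
numbers = [1, 2, 3, 4, 5, 6]

# List comprehension with a condition: squares of the even numbers only
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)        # [4, 16, 36]

# The same result written as a traditional loop
loop_result = []
for n in numbers:
    if n % 2 == 0:
        loop_result.append(n ** 2)
print(loop_result == even_squares)   # True

# Dictionary comprehension that swaps keys and values
# (this assumes the values are unique and hashable)
codes = {"a": 1, "b": 2}
swapped = {value: key for key, value in codes.items()}
print(swapped)             # {1: 'a', 2: 'b'}

# Set comprehension: duplicate results collapse automatically
lengths = {len(word) for word in ["hi", "to", "cat"]}
print(sorted(lengths))     # [2, 3]
```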
Try It Yourself: Filter a Dictionary and Use a Comprehension
1. Create a dictionary called `students` where keys are student names and values are their grades (e.g., `{"Alice": 95, "Bob": 82, "Charlie": 78}`).
2. Using a dictionary comprehension, create a new dictionary called `passed_students` that only includes students with a grade of 80 or higher. Print the new dictionary.
# Your code here
students = {"Alice": 95, "Bob": 82, "Charlie": 78, "David": 90, "Eve": 75}
print(f"Original student grades: {students}")
passed_students = {name: grade for name, grade in students.items() if grade >= 80}
print(f"Students who passed (>= 80): {passed_students}")
String Processing and Regular Expressions
Strings are a fundamental data type in Python, used to represent text. While they may seem simple, Python provides an extensive set of tools for manipulating and processing strings, which is essential for tasks ranging from data parsing to generating user-facing messages. One of the most basic operations is string concatenation, which combines two or more strings. However, for more complex string construction, f-strings (formatted string literals) are the modern, recommended approach. They allow you to embed expressions directly inside string literals, making it incredibly easy to create dynamic output. String methods are also a vital part of string processing. Methods like `.upper()`, `.lower()`, `.strip()` (for removing leading and trailing whitespace), `.split()` (for breaking a string into a list), and `.join()` (for combining a list of strings) are indispensable for cleaning, formatting, and extracting information from text. Understanding these built-in tools is the first step toward handling any text-based data with confidence.
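The string methods listed above chain together naturally. The sketch below cleans a made-up comma-separated string and then formats output with f-strings.

```python
raw = "  Python,is,GREAT  "

cleaned = raw.strip()                    # 'Python,is,GREAT'
parts = cleaned.split(",")               # ['Python', 'is', 'GREAT']
lowered = [part.lower() for part in parts]
sentence = " ".join(lowered)             # 'python is great'

print(sentence)                          # python is great
print(sentence.upper())                  # PYTHON IS GREAT

# f-strings can embed expressions and format specifiers
name = "Ada"
summary = f"{name} has {len(name)} letters"
print(summary)                           # Ada has 3 letters

rounded = f"{3.14159:.2f}"               # two digits after the decimal point
print(rounded)                           # 3.14

# Strings are immutable: every method returns a new string
print(raw == "  Python,is,GREAT  ")      # True
```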
For more advanced pattern matching and text searching, Python's **regular expressions** (often abbreviated as "regex" or "RE") are the tool of choice. A regular expression is a sequence of characters that defines a search pattern. Regular expressions are used to find, replace, and extract complex patterns from strings. The `re` module is Python's built-in library for working with regular expressions. Key functions in this module include `re.search()` to find the first occurrence of a pattern, `re.findall()` to find all non-overlapping occurrences, and `re.sub()` to replace occurrences of a pattern with a new string. While the syntax of regular expressions can be dense and intimidating at first, it provides unparalleled power for tasks like validating email addresses, parsing log files, or extracting specific data from a long text. The power of regular expressions lies in their ability to define complex patterns, such as "any sequence of one or more digits followed by a comma," in a compact and flexible way.
Key Regular Expression Concepts
To use regular expressions effectively, you need to understand some basic components. **Metacharacters** like `.` (matches any character except a newline, by default), `*` (matches zero or more of the preceding element), `+` (matches one or more), and `?` (matches zero or one) are the building blocks of patterns. **Character classes**, defined by square brackets `[]`, allow you to match a set of characters (e.g., `[a-z]` matches any lowercase letter). **Anchors** like `^` (matches the start of a string) and `$` (matches the end of a string) help you specify the position of a match. Learning to combine these elements allows you to create highly specific and powerful search patterns. For example, the pattern `^(\w+)@([\w\.-]+)\.([a-z\.]{2,6})$` is a common (though not perfect) regex for validating email addresses. While it may look complicated, it's a concise way of saying "match one or more word characters at the beginning, followed by an '@' symbol, followed by a domain name made of word characters, dots, or hyphens, and ending with a dot and a domain extension of 2 to 6 lowercase letters or dots." The complexity of regex is matched by its utility, making it a valuable skill for anyone working with text data.
import re
# String formatting with f-strings
name = "Charlie"
age = 30
message = f"Hello, my name is {name} and I am {age} years old."
print(message)
# String methods
sentence = " Python is a great programming language. "
cleaned_sentence = sentence.strip().upper()
words = cleaned_sentence.split()
print(f"Cleaned and split: {words}")
joined_words = " | ".join(words)
print(f"Joined words: {joined_words}")
# Regular expressions
text = "The quick brown fox jumps over the lazy dog. My phone number is 123-456-7890."
phone_pattern = r"\d{3}-\d{3}-\d{4}"
# Search for the first match
match = re.search(phone_pattern, text)
if match:
    print(f"Found a phone number: {match.group()}")
# Find all matches
all_numbers = re.findall(r"\d+", text)
print(f"All numbers found: {all_numbers}")
# Replace a pattern
new_text = re.sub(r"fox", "cat", text)
print(f"New text: {new_text}")
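The metacharacters, character classes, and anchors described under Key Regular Expression Concepts can be sketched in a few lines. The sample strings and variable names below are made up for illustration; only `re.findall()` and `re.search()` from the standard library are used.

```python
import re

# "." matches any single character (except a newline, by default).
dot_matches = re.findall(r"c.t", "cat cot c-t ct")

# "*" = zero or more, "+" = one or more, "?" = zero or one of the preceding element.
star_matches = re.findall(r"ab*", "a ab abbb")
plus_matches = re.findall(r"ab+", "a ab abbb")
optional_matches = re.findall(r"colou?r", "color colour")

# A character class matches any one character from the set; "+" repeats it.
lowercase_runs = re.findall(r"[a-z]+", "Hello World 42")

# Anchors tie the pattern to the start (^) and end ($) of the string.
only_digits = bool(re.search(r"^\d+$", "12345"))
not_only_digits = bool(re.search(r"^\d+$", "123a5"))

print(dot_matches)       # ['cat', 'cot', 'c-t']
print(star_matches)      # ['a', 'ab', 'abbb']
print(plus_matches)      # ['ab', 'abbb']
print(optional_matches)  # ['color', 'colour']
print(lowercase_runs)    # ['ello', 'orld']
print(only_digits, not_only_digits)  # True False
```

Note how `ab*` matches a lone "a" while `ab+` does not: the only difference is whether zero repetitions of "b" are allowed.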
Try It Yourself: Validate a Simple Email Address
Using the `re` module, write a function that takes a string as input and checks if it's a valid email address using a regular expression. A simple pattern to use is `r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"`. The function should return `True` if it's a match and `False` otherwise. Test it with a few valid and invalid emails.
# Your code here
import re
def is_valid_email(email):
    pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    if re.match(pattern, email):
        return True
    return False
# Test cases
print(f"'test@example.com' is valid: {is_valid_email('test@example.com')}")
print(f"'invalid-email' is valid: {is_valid_email('invalid-email')}")
print(f"'user.name@domain.co.uk' is valid: {is_valid_email('user.name@domain.co.uk')}")
print(f"'user@domain' is valid: {is_valid_email('user@domain')}")
File I/O and Data Persistence
In many applications, data isn't just processed in memory; it needs to be stored in and retrieved from files. This is where **File Input/Output (I/O)** comes in. Python makes working with files straightforward with its built-in `open()` function. The `open()` function takes two main arguments: the file path and the mode in which to open the file. Common modes include `'r'` for reading, `'w'` for writing (which will overwrite the file if it exists), and `'a'` for appending (which adds new content to the end of the file). Once a file is opened, you can use methods like `.read()` to get the entire content of the file as a single string, `.readline()` to read a single line, or `.readlines()` to read all lines into a list. For writing, you use the `.write()` method. After you're done with a file, it's crucial to close it using the `.close()` method to release system resources. Forgetting to close a file can leak file handles and leave buffered writes unflushed, so data you wrote may never reach the disk.
While manually closing files works, the most Pythonic and safest way to handle files is using a **`with` statement**. The `with open(...) as file:` syntax uses the file object as a context manager, which automatically handles the closing of the file, even if an error occurs inside the block. This simplifies your code and prevents resource leaks. For example, to read a file, you would write `with open('my_file.txt', 'r') as file:` followed by an indented `content = file.read()`. The file is guaranteed to be closed as soon as the indented block is exited. This pattern is not only cleaner but also more robust, as it works seamlessly with Python's exception handling system. Understanding this simple but powerful idiom is a sign of a more advanced Python programmer.
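To make the difference concrete, here is a minimal sketch that contrasts manual closing with the `with` statement and also shows `.readline()` and `.readlines()`. The file name `notes.txt` and the temporary directory are assumptions made for this example so that it does not touch any real files.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory for this sketch.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Manual approach: close() must be called explicitly, so try/finally is
# needed to guarantee it runs even if write() raises an exception.
f = open(path, "w")
try:
    f.write("first line\nsecond line\n")
finally:
    f.close()

# Context manager: the file is closed automatically when the block exits.
with open(path, "a") as f:
    f.write("third line\n")
closed_after_with = f.closed  # True once the block has been left

with open(path, "r") as f:
    first = f.readline()       # one line, including its trailing "\n"
    remaining = f.readlines()  # the rest of the lines, as a list

print(closed_after_with)
print(repr(first))
print(remaining)
```

The `try`/`finally` version and the `with` version give the same guarantee; the `with` form simply makes it impossible to forget the cleanup step.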
Data Persistence with Python
Beyond simple text files, Python provides mechanisms for serializing and deserializing objects, which is the basis of **data persistence**. This means you can take a complex Python object (like a dictionary, list, or a custom class instance) and convert it into a stream of bytes that can be stored in a file, and then later reconstruct the object from that file. The built-in `pickle` module is the standard library for this purpose. The `pickle.dump()` function serializes an object and writes it to a file, while `pickle.load()` reads the file and reconstructs the object. The benefit of using `pickle` is that it preserves the object's structure and data types, so you don't have to manually parse and convert data when you load it back. However, be aware that `pickle` is not secure against maliciously crafted data (loading a pickle can execute arbitrary code, so only unpickle data you trust) and is not interoperable with other languages. For cross-language data exchange, other formats like JSON or CSV are more appropriate, which we'll discuss in the next lesson. Still, for simple persistence of Python-specific data structures, `pickle` is a convenient tool for saving and loading program state.
# Writing to a file using 'with' statement
content_to_write = "Hello, this is the first line.\n" \
"This is the second line."
with open("example.txt", "w") as f:
    f.write(content_to_write)
print("File 'example.txt' has been created and written to.")
# Reading from a file
with open("example.txt", "r") as f:
    file_content = f.read()
print("\n--- Content of example.txt ---")
print(file_content)
# Reading line by line
print("\n--- Reading line by line ---")
with open("example.txt", "r") as f:
    for line in f:
        print(line.strip()) # .strip() removes whitespace/newlines
import pickle
# Pickling (serializing) a Python object
data = {"name": "Charlie", "score": 95, "courses": ["Python", "Data Science"]}
with open("data.pickle", "wb") as f:
    pickle.dump(data, f)
print("\nPython object saved to 'data.pickle'.")
# Unpickling (deserializing) the object
with open("data.pickle", "rb") as f:
    loaded_data = pickle.load(f)
print("\nPython object loaded from 'data.pickle':")
print(loaded_data)
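The claim that `pickle` preserves Python data types can be checked with a small in-memory round trip. This sketch uses `pickle.dumps()` and `pickle.loads()` (the bytes-based counterparts of `dump()` and `load()`) and compares the result with a JSON round trip; the `record` dictionary is invented for illustration.

```python
import json
import pickle

record = {"point": (1, 2), "tags": {"a", "b"}}

# pickle round-trips Python-specific types such as tuples and sets unchanged.
restored = pickle.loads(pickle.dumps(record))

# JSON has no tuple type, so the tuple comes back as a list.
# (A set is not JSON-serializable at all, so only the tuple is sent here.)
via_json = json.loads(json.dumps({"point": record["point"]}))

print(type(restored["point"]).__name__)  # tuple
print(type(restored["tags"]).__name__)   # set
print(type(via_json["point"]).__name__)  # list
```

This is the trade-off in practice: `pickle` is faithful to Python objects but only safe with trusted data, while JSON is portable but limited to a small set of types.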
Try It Yourself: Append to a File and Read All Lines
1. Open the file "example.txt" in append mode (`'a'`). Write a new line of text to it.
2. Re-open the file in read mode and read all lines into a list. Print the list of lines.
# Your code here
with open("example.txt", "a") as f:
    f.write("\nThis is a new line appended to the file.")
print("New line appended to 'example.txt'.")
with open("example.txt", "r") as f:
    lines = f.readlines()
print("\nContent of file as a list of lines:")
print(lines)
JSON and CSV Handling
While the `pickle` module is great for Python-specific data, many real-world applications require data exchange with other programming languages or systems. For this, standardized, human-readable formats are necessary. Two of the most common are **JSON** (JavaScript Object Notation) and **CSV** (Comma-Separated Values). JSON is a lightweight, text-based data format that is easy for humans to read and write and easy for machines to parse and generate. It’s the de-facto standard for data exchange on the web and is often used by APIs. The structure of JSON maps directly to Python's data structures: JSON objects become dictionaries, and JSON arrays become lists. Python's built-in `json` module provides all the tools you need to work with this format. The `json.dump()` and `json.dumps()` functions convert Python objects to JSON, while `json.load()` and `json.loads()` convert JSON back into Python objects. Using these functions, you can easily save a dictionary to a file as JSON or parse a JSON string received from a web server.
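As a quick illustration of the string-based functions, the sketch below converts a dictionary to JSON text with `json.dumps()` and parses it back with `json.loads()`. The `profile` data is made up; note how Python's `True` and `None` become JSON's `true` and `null`.

```python
import json

profile = {"name": "Jane Doe", "age": 32, "is_active": True, "nickname": None}

# Python object -> JSON text
json_text = json.dumps(profile)

# JSON text -> Python object (objects become dicts, arrays become lists)
parsed = json.loads(json_text)
numbers = json.loads("[1, 2, 3]")

print(json_text)
print(parsed == profile)  # True
print(numbers)            # [1, 2, 3]
```

The file-based `json.dump()` and `json.load()` behave the same way but write to and read from an open file object instead of a string.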
For tabular data, such as spreadsheets or database exports, **CSV** is the most widely used format. A CSV file is a plain-text file where each line represents a data record, and the fields in each record are separated by commas. Python's built-in `csv` module simplifies the process of reading and writing CSV files, handling tricky details like commas within a quoted field. The module provides a `reader` object for iterating over lines in a CSV file and a `writer` object for writing to one. This abstraction means you don't have to manually split and join strings based on commas, which can be error-prone. The `csv` module ensures that your data is read and written correctly, preserving the integrity of each field. This is a crucial library for anyone involved in data analysis or data processing, as CSV files are a pervasive format for storing large datasets.
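The "commas within a quoted field" problem is easy to reproduce. This sketch writes a row whose field contains a comma to an in-memory buffer (`io.StringIO`, used here instead of a real file as an assumption for the example) and compares a naive `split(",")` with `csv.reader`.

```python
import csv
import io

# Write two rows to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["name", "city"])
writer.writerow(["Alice", "London, UK"])  # this field contains a comma
raw_text = buffer.getvalue()

# Naive splitting breaks the quoted field into two pieces.
naive_split = raw_text.splitlines()[1].split(",")

# csv.reader understands the quoting and keeps the field intact.
parsed_rows = list(csv.reader(io.StringIO(raw_text, newline="")))

print(naive_split)  # ['Alice', '"London', ' UK"']
print(parsed_rows)  # [['name', 'city'], ['Alice', 'London, UK']]
```

The writer added the quotes around "London, UK" on its own, which is why the reader can reconstruct the original two fields.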
Working with JSON and CSV in Practice
Let's look at how to put these modules into practice. To write a Python dictionary to a JSON file, you would use `json.dump()` within a `with` statement. The function takes the Python object and a file object as arguments. When reading, `json.load()` does the opposite, returning the Python object from the file. For CSV, the process is slightly different. When reading, you create a `csv.reader` object and then iterate through it, with each row returned as a list of strings. To write a CSV file, you create a `csv.writer` object and then use its `writerow()` method to write a list of strings as a new row. The `csv` module is especially useful because it can handle different delimiters (not just commas) and quoting styles, making it highly flexible. In many modern data workflows, you will find yourself constantly converting between Python data structures, JSON, and CSV, so having a solid grasp of these modules is indispensable.
import json
import csv
# Python dictionary to JSON
data = {
    "name": "Jane Doe",
    "age": 32,
    "is_active": True,
    "hobbies": ["reading", "hiking", "cooking"]
}
with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
print("Python object saved to 'data.json'.")
# Read JSON back into a Python object
with open("data.json", "r") as f:
    loaded_data = json.load(f)
print("JSON object loaded into Python:")
print(loaded_data)
# Write a list of lists to a CSV file
header = ["name", "age", "city"]
records = [
    ["Alice", 25, "London"],
    ["Bob", 30, "Paris"],
    ["Charlie", 35, "Tokyo"]
]
with open("users.csv", "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(records)
print("\nList of lists saved to 'users.csv'.")
# Read from a CSV file (newline='' lets the csv module handle line endings itself)
with open("users.csv", "r", newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
Try It Yourself: Convert Data from CSV to JSON
Write a script that reads the `users.csv` file created above and converts its data into a list of dictionaries, where each dictionary represents a user. Then, save this list of dictionaries to a new JSON file named `users_data.json`.
# Your code here
import csv
import json
users = []
with open("users.csv", "r", newline='') as f:
    reader = csv.DictReader(f) # Use DictReader to get dictionaries
    for row in reader:
        users.append(row)
with open("users_data.json", "w") as f:
    json.dump(users, f, indent=4)
print("Data converted from CSV to JSON and saved to 'users_data.json'.")
Module 3: Object-Oriented Programming
Classes and Objects
Object-Oriented Programming (**OOP**) is a powerful paradigm that structures a program by bundling data and the functions that operate on that data into a single unit called an **object**. This approach is designed to make code more organized, reusable, and easier to manage, especially for large, complex projects. The fundamental concepts of OOP are the **class** and the **object**. Think of a class as a blueprint or a template for creating objects. It defines a set of attributes (variables) that the object will possess and methods (functions) that the object can perform. For example, a `Car` class might have attributes like `color` and `speed` and methods like `accelerate()` and `brake()`. A class itself is a definition; it doesn't hold any specific data.
An **object** is a specific instance of a class. When you create an object from a class, you are creating a concrete entity with its own unique data for the attributes defined in the class. For example, from our `Car` blueprint, you could create a red car object and a blue car object. Both objects would have `color` and `speed` attributes and `accelerate()` and `brake()` methods, but the red car's `color` attribute would be "red," and the blue car's would be "blue." In Python, you define a class using the `class` keyword. The `__init__` method, also known as the constructor, is a special method that is automatically called when a new object is created. It is used to initialize the object's attributes. The `self` parameter is a reference to the instance of the class and is required as the first parameter of any instance method, including `__init__`.
Defining Classes and Methods
In a class definition, attributes are typically initialized in the `__init__` method. Methods are defined just like regular functions, but they must always take `self` as their first parameter. To access an attribute or call another method from within the class, you use the dot notation with the `self` reference (e.g., `self.color` or `self.accelerate()`). To create an object, you simply call the class name as if it were a function, passing any required arguments to the `__init__` method. You can then access the object's attributes and call its methods using the same dot notation. This structure provides a clear separation of concerns, encapsulating the data and logic related to a specific entity within a single, coherent unit. Mastering classes and objects is a prerequisite for building any moderately sized application, as it provides a structured way to model real-world concepts in your code.
# Define a simple class
class Dog:
    # Class attribute (shared by all instances)
    species = "Canis familiaris"
    # The constructor method
    def __init__(self, name, age):
        # Instance attributes (unique to each instance)
        self.name = name
        self.age = age
    # An instance method
    def bark(self):
        return f"{self.name} says woof!"
    # Another instance method
    def get_info(self):
        return f"{self.name} is a {self.species} and is {self.age} years old."
# Create objects (instances) of the Dog class
dog1 = Dog("Fido", 5)
dog2 = Dog("Buddy", 2)
# Accessing attributes
print(f"Dog 1's name: {dog1.name}")
print(f"Dog 2's age: {dog2.age}")
# Calling methods
print(dog1.bark())
print(dog2.get_info())
# Class attribute is the same for all objects
print(f"Dog 1's species: {dog1.species}")
print(f"Dog 2's species: {dog2.species}")
Try It Yourself: Create a 'Rectangle' Class
Create a class called `Rectangle`. The `__init__` method should take `width` and `height` as arguments and store them as attributes. Add a method called `area()` that returns the area of the rectangle and another method called `perimeter()` that returns its perimeter. Then, create an instance of the class and print its area and perimeter.
# Your code here
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
    def area(self):
        return self.width * self.height
    def perimeter(self):
        return 2 * (self.width + self.height)
# Create an object and test the methods
rect = Rectangle(10, 5)
print(f"Rectangle width: {rect.width}, height: {rect.height}")
print(f"Area: {rect.area()}")
print(f"Perimeter: {rect.perimeter()}")
Inheritance and Polymorphism
One of the most powerful features of Object-Oriented Programming (OOP) is **inheritance**. Inheritance allows a new class to take on the attributes and methods of an existing class. The existing class is called the **parent** or **base** class, and the new class is called the **child** or **derived** class. The child class inherits all the functionality of its parent, which promotes code reuse and helps you model real-world "is-a" relationships (e.g., a `Dog` is an `Animal`). To create a child class, you include the parent class name in parentheses after the child class name in the class definition: `class ChildClass(ParentClass):`. A child class can override methods from the parent class, providing its own specific implementation. It can also extend the parent class by adding new methods or attributes. The `super()` function is a special utility that allows a child class to call a method from its parent class, which is often used in the child's `__init__` method to properly initialize the inherited attributes before adding its own. Inheritance is a cornerstone of building flexible and scalable class hierarchies.
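A minimal sketch of `super()` in use is shown below. The `breed` attribute and `describe()` method are hypothetical names chosen for this example; the point is that the child calls the parent's `__init__` to set up inherited attributes and then extends, rather than replaces, the parent's behavior.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"


class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name)  # let the parent initialize `name`
        self.breed = breed      # then add the child's own attribute

    def describe(self):
        # Reuse the parent's implementation and add to it.
        base = super().describe()
        return f"{base} ({self.breed})"


rex = Dog("Rex", "Beagle")
print(rex.describe())            # Rex is an animal (Beagle)
print(isinstance(rex, Animal))   # True
```

Without the `super().__init__(name)` call, `rex.name` would never be set and `describe()` would fail with an `AttributeError`.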
Closely related to inheritance is the concept of **polymorphism**, which means "many forms." In an OOP context, polymorphism allows objects of different classes to be treated as objects of a common parent class. This means you can write code that works with a collection of objects of a parent class and have it behave correctly for each of the child classes without needing to know their specific types. The most common form of polymorphism in Python is **method overriding**, where a child class provides a new implementation for a method that is already defined in its parent class. When you call that method on an object, Python automatically determines which version of the method to execute based on the object's actual class. This allows you to write generic functions that can handle a variety of related objects, each with its own unique behavior. For example, you could have a list of `Animal` objects (some of which are `Dog`s, others are `Cat`s) and a `make_sound()` function. Calling `make_sound()` on each object would result in different outputs ("Woof!" vs. "Meow!") without needing any special checks. This flexibility is what makes polymorphic code so powerful and maintainable.
Abstract Base Classes
While Python doesn't require classes to declare the interfaces they implement, as some other languages do, you can formalize the structure using **Abstract Base Classes (ABCs)** from the `abc` module. An ABC is a class that cannot be instantiated on its own and is designed to be a blueprint for other classes. It can contain abstract methods, which are methods that are declared but typically not implemented. Any class that inherits from an ABC must provide an implementation for all of its abstract methods; otherwise, it cannot be instantiated. This provides a powerful way to enforce a common interface across a set of related classes, ensuring that they all define a certain set of methods (Python checks that the methods exist, not that their signatures match). Using ABCs helps to prevent errors by forcing developers to follow a predefined structure and is a great way to design robust and predictable code, especially when working on a team. The combination of inheritance and polymorphism, optionally guided by ABCs, is a key reason why OOP is so effective for building complex systems.
# Parent class
class Animal:
    def __init__(self, name):
        self.name = name
    def make_sound(self):
        raise NotImplementedError("Subclass must implement abstract method")
# Child classes inherit from Animal
class Dog(Animal):
    def make_sound(self):
        return f"{self.name} says Woof!"
class Cat(Animal):
    def make_sound(self):
        return f"{self.name} says Meow!"
# Polymorphism in action
animals = [Dog("Fido"), Cat("Whiskers")]
for animal in animals:
    print(animal.make_sound())
# Another example of polymorphism with method overriding
class Car:
    def drive(self):
        return "Vroom! The car is driving."
class ElectricCar(Car):
    def drive(self):
        return "Whirr! The electric car is driving silently."
car = Car()
electric_car = ElectricCar()
print(car.drive())
print(electric_car.drive())
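The `Animal` example above signals a missing override with `NotImplementedError`, which only fails when the method is called. As a complement, here is a sketch of the same hierarchy using the `abc` module, where the failure happens at instantiation time instead. The `Silent` class and the `can_instantiate()` helper are hypothetical names added for this illustration.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    def __init__(self, name):
        self.name = name

    @abstractmethod
    def make_sound(self):
        """Return the sound this animal makes."""


class Dog(Animal):
    def make_sound(self):
        return f"{self.name} says Woof!"


class Silent(Animal):
    pass  # does not implement make_sound, so it stays abstract


def can_instantiate(cls, *args):
    """Return True if cls(*args) succeeds, False if it raises TypeError."""
    try:
        cls(*args)
        return True
    except TypeError:
        return False


dog = Dog("Fido")
print(dog.make_sound())                    # Fido says Woof!
print(can_instantiate(Animal, "Generic"))  # False: abstract class
print(can_instantiate(Silent, "Ghost"))    # False: abstract method missing
```

Catching the mistake when the object is created, rather than later when a method is called, is the main practical benefit of an ABC.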
Try It Yourself: Create a Shape Hierarchy
Create a parent class called `Shape` with a method `area()` that returns 0. Then, create two child classes, `Square` and `Circle`, that inherit from `Shape`. The `Square` class should have a `side_length` attribute and a `Circle` class should have a `radius` attribute. Both child classes should override the `area()` method to return their correct area. (Hint: Use `math.pi` for the circle calculation). Then, create a list of shapes and iterate through it, printing the area of each shape to demonstrate polymorphism.
# Your code here
import math
class Shape:
    def area(self):
        return 0
class Square(Shape):
    def __init__(self, side_length):
        self.side_length = side_length
    def area(self):
        return self.side_length ** 2
class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return math.pi * self.radius ** 2
# Demonstrate polymorphism
shapes = [Square(4), Circle(5)]
for shape in shapes:
    print(f"The area of the shape is: {shape.area()}")
Encapsulation and Abstraction
Encapsulation and abstraction are two core pillars of Object-Oriented Programming (OOP) that work together to create clean, secure, and easy-to-use code. **Encapsulation** is the principle of bundling data (attributes) and the methods that operate on that data into a single unit, the class. The primary goal of encapsulation is to hide the internal state of an object from the outside world and only expose a public interface (methods) to interact with it. This prevents direct, unauthorized access to an object's internal data, which could lead to an inconsistent or corrupted state. In Python, encapsulation is achieved through convention. While there are no strict "private" keywords, a common practice is to prefix an attribute name with a single underscore (e.g., `_secret_data`) to indicate that it's intended for internal use and should not be accessed directly. Using a double underscore (e.g., `__very_secret_data`) triggers a name mangling process that makes it harder, but not impossible, to access the attribute from outside the class. The best practice is to provide public methods, often called "getters" and "setters," to control how the object's data is accessed and modified, ensuring that any changes are validated and handled correctly.
**Abstraction** is the principle of hiding complex implementation details and showing only the essential features of an object to the user. It allows you to focus on what an object does rather than how it does it. For example, when you use a remote control to change the channel on a TV, you don't need to know the intricate electronic signals and circuits involved; you only need to know that pressing the "channel up" button changes the channel. The TV abstracts away the complexity. In Python, abstraction is often implemented using classes and methods. A class provides a public interface of methods that the user can call, while the internal complexity and logic are hidden within the method implementations. A great example is Python's built-in data types. When you use a list and call `my_list.sort()`, you don't need to know which sorting algorithm is being used or how the memory is being reordered. The complexity is abstracted away, and you can simply focus on the result. This simplification makes code more manageable and less prone to errors.
Properties and Pythonic Encapsulation
Python provides a powerful feature called **properties** that offers a more elegant and Pythonic way to implement encapsulation and control attribute access without using explicit getter and setter methods. The `@property` decorator is used to define a method that can be accessed like an attribute. This allows you to add logic to your getters and setters while maintaining a clean, attribute-like syntax for the user. For instance, you can use a `@property` to return a calculated value or to validate data before it's set, all without changing how the attribute is accessed from outside the class. This provides a clean public interface while still ensuring the integrity of the object's internal state. The combination of encapsulation (hiding data), abstraction (simplifying the interface), and properties (enforcing rules on access) is key to writing robust, maintainable, and user-friendly classes.
# Encapsulation with public, protected, and private attributes
class User:
    def __init__(self, username, password):
        self.username = username # Public attribute
        self._password = password # Protected by convention
        self.__id_number = 12345 # Private (name mangled)
    def get_password(self):
        return self._password
    # A read-only property that hides the real value from attribute access
    @property
    def password(self):
        return "Access denied!" # Not a good idea to return the real password
# Using a property to encapsulate logic
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, value):
        if value < 0:
            raise ValueError("Age cannot be negative.")
        self._age = value
# Example usage
user1 = User("admin", "secret123")
print(f"Username: {user1.username}")
print(f"Password (using method): {user1.get_password()}")
# This is discouraged but possible: print(user1._password)
# print(user1.__id_number) # This will cause an error
# print(user1._User__id_number) # This is how to access the mangled attribute
person1 = Person("John", 30)
print(f"Initial age: {person1.age}")
person1.age = 35 # Uses the setter
print(f"New age: {person1.age}")
try:
    person1.age = -1
except ValueError as e:
    print(e)
Try It Yourself: Create a Bank Account Class with Encapsulation
Create a class called `BankAccount`. The `__init__` method should take an initial balance and store it in a "protected" attribute like `_balance`. Add a method `deposit(amount)` and a method `withdraw(amount)`. Implement logic to ensure that a withdrawal cannot result in a negative balance and that both deposit and withdrawal amounts are positive. Use a `@property` decorator to create a read-only `balance` attribute that returns the current balance without exposing the internal attribute directly.
# Your code here
class BankAccount:
    def __init__(self, balance):
        if balance < 0:
            raise ValueError("Initial balance cannot be negative.")
        self._balance = balance
    @property
    def balance(self):
        return self._balance
    def deposit(self, amount):
        if amount > 0:
            self._balance += amount
            print(f"Deposited {amount}. New balance is {self.balance}.")
        else:
            print("Deposit amount must be positive.")
    def withdraw(self, amount):
        if amount <= 0:
            print("Withdrawal amount must be positive.")
        elif amount > self.balance:
            print("Insufficient funds.")
        else:
            self._balance -= amount
            print(f"Withdrew {amount}. New balance is {self.balance}.")
# Example usage
account = BankAccount(100)
print(f"Current balance: {account.balance}")
account.deposit(50)
account.withdraw(120)
account.withdraw(40)
account.deposit(-10)
Magic Methods and Decorators
In Python, classes can implement special methods that allow them to integrate seamlessly with the language's core features. These methods, which start and end with a double underscore (e.g., `__init__`, `__str__`), are often called **magic methods** or **dunder methods** (for "double underscore"). They are not meant to be called directly but are invoked automatically by Python in response to specific syntax. For instance, the `__init__` method is called when an object is created, and the `__str__` method is called when you use `print()` or `str()` on an object, providing a human-readable string representation. Other common magic methods include `__repr__` for a developer-friendly representation, `__len__` for making an object's length computable with `len()`, and `__add__` for defining how the `+` operator works with your objects. By implementing these methods, you can make your custom classes behave just like built-in types, leading to more intuitive and Pythonic code.
A **decorator** is a special kind of function that modifies the behavior of another function or a class. A decorator is a powerful tool for extending functionality without permanently modifying the original code. It's an elegant way to implement cross-cutting concerns, such as logging, access control, or performance measurement. In Python, the decorator syntax is `@decorator_name` placed on the line immediately preceding the function or class definition. When Python encounters this syntax, it essentially passes the function or class to the decorator and replaces the original with the new, modified version. For example, a decorator could be used to ensure a user is logged in before they can access a certain function. The `@classmethod` and `@staticmethod` decorators are common examples that you'll use frequently in classes. A `@classmethod` takes the class itself as the first argument (`cls`), while a `@staticmethod` takes neither the class nor the instance as the first argument, behaving like a regular function but belonging to the class's namespace. Understanding decorators is key to writing advanced, modular, and reusable code.
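The difference between `@classmethod` and `@staticmethod` is easiest to see side by side. The `Temperature` class below is a hypothetical example: the class method acts as an alternative constructor, while the static method is a helper that needs neither the instance nor the class.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # `cls` is the class itself, so this works as an alternative constructor.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_valid(celsius):
        # No `self` or `cls`: a plain function kept in the class's namespace.
        return celsius >= -273.15


boiling = Temperature.from_fahrenheit(212)
print(boiling.celsius)             # 100.0
print(Temperature.is_valid(25))    # True
print(Temperature.is_valid(-300))  # False
```

Because `from_fahrenheit` builds the object through `cls` rather than naming `Temperature` directly, a subclass that calls it would get an instance of the subclass.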
Using Magic Methods for Operator Overloading
A particularly powerful application of magic methods is **operator overloading**, which allows you to define custom behavior for standard operators like `+`, `-`, `*`, or `==` when used with your objects. For example, by implementing the `__add__` magic method in a class that represents a vector, you can define what it means to "add" two vector objects together. This allows you to write natural-looking code like `vec1 + vec2` instead of a less intuitive `vec1.add(vec2)`. Similarly, implementing `__eq__` lets you define how two objects are compared with the `==` operator. This capability makes your custom objects feel like a native part of the language and can significantly improve the readability and expressiveness of your code. By thoughtfully implementing these magic methods, you can create a cohesive and powerful set of classes that are a pleasure to work with.
# Class with magic methods for string representation
class Book:
    def __init__(self, title, author, pages):
        self.title = title
        self.author = author
        self.pages = pages
    def __str__(self):
        return f"{self.title} by {self.author}"
    def __repr__(self):
        return f"Book(title='{self.title}', author='{self.author}', pages={self.pages})"
    def __len__(self):
        return self.pages
    def __add__(self, other):
        return Book(f"{self.title} & {other.title}", "Multiple Authors", self.pages + other.pages)
# Example of a simple decorator
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper
@my_decorator
def say_hello():
    print("Hello!")
book1 = Book("The Hitchhiker's Guide", "Douglas Adams", 200)
book2 = Book("The Lord of the Rings", "J.R.R. Tolkien", 500)
print(book1) # Uses __str__
print(repr(book1)) # Uses __repr__
print(len(book1)) # Uses __len__
combined_book = book1 + book2 # Uses __add__
print(combined_book)
say_hello() # The decorator is invoked
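The `Book` example covers `__add__`, but not the `__eq__` method mentioned in the operator overloading discussion. Here is a minimal sketch of value-based equality; the `Money` class is a hypothetical example. Returning `NotImplemented` for unrelated types lets Python fall back to its default comparison, and `__hash__` is defined as well because a class that defines `__eq__` without it becomes unhashable.

```python
class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented  # let Python try the other operand
        return (self.amount, self.currency) == (other.amount, other.currency)

    def __hash__(self):
        # Keep hashing consistent with equality so instances work in sets and dicts.
        return hash((self.amount, self.currency))


a = Money(5, "USD")
b = Money(5, "USD")
c = Money(5, "EUR")

print(a == b)       # True: same value, even though they are different objects
print(a is b)       # False
print(a == c)       # False
print(a == 5)       # False
print(len({a, b}))  # 1
```

Without `__eq__`, `a == b` would compare object identity and return `False`.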
Try It Yourself: Create a Vector Class with Operator Overloading
Create a class called `Vector` that takes two arguments, `x` and `y`, in its `__init__` method. Implement the `__add__` magic method so that you can add two `Vector` objects together. The result should be a new `Vector` object where the x and y components are the sums of the corresponding components of the original vectors. Create two `Vector` objects and print the result of their addition.
# Your code here
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return f"Vector({self.x}, {self.y})"

    def __add__(self, other):
        if isinstance(other, Vector):
            new_x = self.x + other.x
            new_y = self.y + other.y
            return Vector(new_x, new_y)
        else:
            raise TypeError("Can only add two Vector objects.")

# Create and add two vectors
v1 = Vector(2, 3)
v2 = Vector(5, 7)
v3 = v1 + v2
print(f"Result of v1 + v2: {v3}")
Module 3: Object-Oriented Programming
Design Patterns in Python
As you build more complex applications, you'll encounter recurring problems. **Design patterns** are reusable solutions to these common software design problems. They are not concrete code snippets but rather a set of best practices and principles that guide you in structuring your code. While there are dozens of design patterns, understanding a few key ones can dramatically improve the maintainability and scalability of your projects. One of the simplest and most widely used is the **Singleton pattern**. The goal of this pattern is to ensure that a class has only one instance and provides a single global point of access to it. This is useful for things like a database connection, a configuration manager, or a logger, where having multiple instances would be redundant or problematic. In Python, this can be implemented using a class-level attribute to store the single instance and a custom `__new__` method to check if the instance already exists.
The **Factory pattern** is another essential pattern, categorized as a creational pattern. It provides an interface for creating objects in a superclass, but allows subclasses to alter the type of objects that will be created. The core idea is to delegate the responsibility of creating an object to a specialized "factory" method, which hides the complex instantiation logic from the client code. This is particularly useful when you have a number of related classes and the client doesn't need to know which specific class to instantiate. For example, a `DocumentFactory` could have a `create_document()` method that returns either a `PDF` object, a `Word` object, or a `Text` object, depending on the input. This makes your code more flexible and easier to extend, as you can add new document types without modifying the client code. The Factory pattern separates the object creation logic from the object usage logic, which is a key principle of good design.
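The `DocumentFactory` idea described above could be sketched roughly as follows. The `render()` method, the lowercase type names, and the registry dictionary are assumptions made for this illustration; only the class names and `create_document()` come from the text.

```python
class PDF:
    def render(self):
        return "Rendering a PDF document"


class Word:
    def render(self):
        return "Rendering a Word document"


class Text:
    def render(self):
        return "Rendering a plain-text document"


class DocumentFactory:
    # Registry mapping a type name to the class that handles it.
    # Supporting a new document type only requires a new entry here.
    _document_types = {"pdf": PDF, "word": Word, "text": Text}

    @classmethod
    def create_document(cls, doc_type):
        try:
            document_class = cls._document_types[doc_type.lower()]
        except KeyError:
            raise ValueError(f"Unknown document type: {doc_type!r}") from None
        return document_class()


# Client code asks for a document by name and never names a concrete class.
doc = DocumentFactory.create_document("pdf")
print(doc.render())
```

Raising `ValueError` for an unknown type, instead of returning `None`, is one possible design choice; it makes a mistyped name fail at the point of creation.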
The Strategy Pattern
The **Strategy pattern** is a behavioral pattern that allows you to define a family of algorithms, encapsulate each one as an object, and make them interchangeable. The pattern lets the algorithm vary independently from the clients that use it. A common example is an e-commerce checkout system. The total price might be calculated differently based on the user's country or a special promotion. Instead of using a long `if/elif/else` chain, you can define different "strategy" classes for each pricing method (e.g., `StandardTax`, `VAT`, `PromoDiscount`). The main checkout logic then takes a pricing strategy object and uses it to calculate the price. This makes the code for the checkout process cleaner and easier to maintain. When you need to add a new pricing method, you simply create a new strategy class without touching the core checkout logic. Design patterns are a way of thinking about code structure, and while they can seem abstract at first, they provide a proven path to writing scalable, maintainable, and robust software.
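The checkout scenario above might look something like this sketch. The strategy names `StandardTax`, `VAT`, and `PromoDiscount` come from the text; the `PricingStrategy` base class, the `Checkout` class, and the rates are illustrative assumptions, not real tax figures.

```python
class PricingStrategy:
    def total(self, subtotal):
        raise NotImplementedError


class StandardTax(PricingStrategy):
    RATE = 0.08  # assumed rate, for illustration only

    def total(self, subtotal):
        return round(subtotal * (1 + self.RATE), 2)


class VAT(PricingStrategy):
    RATE = 0.20  # assumed rate, for illustration only

    def total(self, subtotal):
        return round(subtotal * (1 + self.RATE), 2)


class PromoDiscount(PricingStrategy):
    def __init__(self, percent_off):
        self.percent_off = percent_off

    def total(self, subtotal):
        return round(subtotal * (1 - self.percent_off / 100), 2)


class Checkout:
    """Hypothetical checkout that delegates pricing to a strategy object."""

    def __init__(self, pricing_strategy):
        self.pricing_strategy = pricing_strategy

    def total(self, subtotal):
        return self.pricing_strategy.total(subtotal)


# The checkout code stays the same; only the strategy object changes.
for strategy in (StandardTax(), VAT(), PromoDiscount(percent_off=10)):
    print(type(strategy).__name__, Checkout(strategy).total(100.0))
```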
# Singleton Pattern Example
class Logger:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.logs = []
        return cls._instance

    def log(self, message):
        self.logs.append(message)
        print(f"Log: {message}")

# Factory Pattern Example
class Dog:
    def speak(self): return "Woof!"

class Cat:
    def speak(self): return "Meow!"

def animal_factory(animal_type):
    if animal_type == "dog": return Dog()
    if animal_type == "cat": return Cat()
    return None

# Strategy Pattern Example
class PaymentStrategy:
    def pay(self, amount):
        raise NotImplementedError

class CreditCardPayment(PaymentStrategy):
    def pay(self, amount):
        print(f"Paying {amount} with Credit Card.")

class PayPalPayment(PaymentStrategy):
    def pay(self, amount):
        print(f"Paying {amount} with PayPal.")

class ShoppingCart:
    def __init__(self, payment_strategy):
        self.payment_strategy = payment_strategy

    def checkout(self, amount):
        self.payment_strategy.pay(amount)

# Example Usage
logger1 = Logger()
logger2 = Logger()
print(f"Are logger1 and logger2 the same object? {logger1 is logger2}")
logger1.log("This is the first log.")
logger2.log("This is the second log.")
print(f"All logs: {logger1.logs}")
dog = animal_factory("dog")
print(dog.speak())
cart = ShoppingCart(CreditCardPayment())
cart.checkout(100)
cart.payment_strategy = PayPalPayment()
cart.checkout(50)
Try It Yourself: Implement a Basic Factory Pattern
Create two simple classes, `EmailNotifier` and `SMSNotifier`, each with a method `send_notification(message)`. Then, create a factory function called `notifier_factory(type)` that returns an instance of `EmailNotifier` if `type` is "email" and `SMSNotifier` if `type` is "sms". Demonstrate how to use the factory function to get a notifier object and send a message without needing to know the specific class name.
# Your code here
class EmailNotifier:
    def send_notification(self, message):
        print(f"Sending email notification: {message}")

class SMSNotifier:
    def send_notification(self, message):
        print(f"Sending SMS notification: {message}")

def notifier_factory(type):
    if type == "email":
        return EmailNotifier()
    elif type == "sms":
        return SMSNotifier()
    else:
        raise ValueError("Invalid notifier type.")

# Use the factory to create objects
email_notifier = notifier_factory("email")
sms_notifier = notifier_factory("sms")
email_notifier.send_notification("Your order has shipped!")
sms_notifier.send_notification("Your order has shipped!")
Module 4: Libraries and Frameworks
NumPy for Numerical Computing
Python's strength lies not only in its core language but also in its rich ecosystem of libraries. For numerical computing and scientific tasks, one library stands head and shoulders above the rest: **NumPy**. NumPy is the fundamental package for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. The core data structure of NumPy is the **`ndarray`** (N-dimensional array), which is a grid of values, all of the same type, indexed by a tuple of non-negative integers. This is a game-changer because Python's built-in lists are not optimized for numerical operations. NumPy arrays are much more efficient for mathematical operations on large datasets, as the operations are performed in optimized, pre-compiled C code under the hood. This means that a NumPy array can be orders of magnitude faster than a traditional Python list when performing element-wise operations, which is the cornerstone of scientific and data analysis workflows.
Working with NumPy is all about leveraging its arrays and the vectorized operations they enable. Instead of writing explicit loops to perform a calculation on each element of a list, you can simply apply the operation directly to the entire array. For example, if you want to square every element in an array, you can simply write `my_array ** 2`. NumPy handles the iteration and calculation efficiently, a concept known as **vectorization**. This not only makes your code much faster but also more concise and readable. NumPy provides a vast collection of mathematical functions to operate on these arrays, including linear algebra, Fourier transforms, and random number generation. These functions are often much more optimized and robust than what you could write yourself. The library is so foundational that many other data science and machine learning libraries, such as Pandas and Scikit-learn, are built directly on top of it.
Key Features and Functions
To get started with NumPy, you first need to import the library, typically as `np` (e.g., `import numpy as np`). You can create an array from a Python list or tuple using `np.array()`. NumPy arrays are not limited to one dimension; you can create 2D arrays (matrices), 3D arrays, and more. A few key functions to know are `np.zeros()`, `np.ones()`, and `np.arange()` for creating arrays with specific initial values. Indexing and slicing of NumPy arrays are similar to Python lists but with powerful extensions for multidimensional arrays. For instance, you can select an entire row or column with a single slice. You can also perform boolean indexing, where you select elements from an array based on a condition (e.g., `my_array[my_array > 0]`). This allows for very expressive and fast data filtering. Mastering NumPy is a prerequisite for any serious work in data science, machine learning, or scientific research with Python, and it forms the bedrock for more advanced data manipulation with libraries like Pandas.
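The array-creation helpers named above can be tried in a few lines. This is a small sketch; the variable names are arbitrary, and `reshape` is used here only as a convenient way to build a 3x3 grid.

```python
import numpy as np

zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
ones = np.ones(4)             # 1D array of four 1.0 values
evens = np.arange(0, 10, 2)   # start at 0, stop before 10, step 2

print(zeros)
print(ones)
print(evens)

# Build a 3x3 grid holding 1..9, then slice a whole row and a whole column.
grid = np.arange(1, 10).reshape(3, 3)
second_row = grid[1, :]
last_column = grid[:, -1]
print(second_row)
print(last_column)
```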
import numpy as np
# Create a NumPy array from a Python list
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)
print(f"NumPy array: {my_array}")
print(f"Type: {type(my_array)}")
# Perform vectorized operations
squared_array = my_array ** 2
print(f"Squared array: {squared_array}")
# Create a 2D array (matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D Array:")
print(matrix)
# Basic array properties
print(f"Shape of matrix: {matrix.shape}")
print(f"Number of dimensions: {matrix.ndim}")
print(f"Data type of elements: {matrix.dtype}")
# Slicing and indexing
print(f"First row: {matrix[0, :]}")
print(f"Second column: {matrix[:, 1]}")
print(f"Element at (0, 1): {matrix[0, 1]}")
# Boolean indexing
large_numbers = my_array[my_array > 3]
print(f"Numbers greater than 3: {large_numbers}")
# Common functions
mean_value = np.mean(my_array)
print(f"Mean of array: {mean_value}")
Try It Yourself: Matrix Operations
1. Create a 3x3 NumPy array of random integers between 1 and 10.
2. Multiply every element in the array by 5.
3. Calculate the sum of all elements in the modified array using a NumPy function. Print the original array, the modified array, and the sum.
# Your code here
import numpy as np
# 1. Create a random 3x3 array
random_matrix = np.random.randint(1, 11, size=(3, 3))
print(f"Original 3x3 matrix:\n{random_matrix}")
# 2. Multiply every element by 5
modified_matrix = random_matrix * 5
print(f"\nModified matrix (multiplied by 5):\n{modified_matrix}")
# 3. Calculate the sum of all elements
matrix_sum = np.sum(modified_matrix)
print(f"\nSum of all elements: {matrix_sum}")
Pandas for Data Analysis
For data manipulation and analysis, Python's premier library is **Pandas**. Pandas is built on top of NumPy and provides easy-to-use data structures and data analysis tools. The two most important data structures in Pandas are the **`Series`** and the **`DataFrame`**. A `Series` is a one-dimensional array-like object that can hold any data type. It's essentially a column in a spreadsheet, with a data type and an index. A **`DataFrame`** is a two-dimensional, size-mutable, tabular data structure with labeled axes (rows and columns). Think of a DataFrame as a spreadsheet or a SQL table. It's the primary tool for data scientists and analysts, as it provides a powerful and flexible way to represent and manipulate data from various sources, including CSV files, databases, and web APIs. The key advantage of Pandas is its labeled axes, which allow for intuitive and robust data access, slicing, and manipulation, making it far superior to using raw lists of lists or NumPy arrays for most real-world data tasks.
One of the most common tasks in data analysis is loading data from a file, and Pandas excels at this. You can easily load data from a CSV file into a DataFrame with a single line of code: `pd.read_csv('filename.csv')`. The library automatically handles things like column headers, data types, and missing values. Once your data is in a DataFrame, you have an immense toolkit for working with it. You can inspect the data with methods like `df.head()` (to see the first few rows), `df.info()` (to get a summary of data types and non-null values), and `df.describe()` (to get summary statistics). DataFrames also support powerful filtering, such as selecting all rows where a specific column's value meets a condition (e.g., `df[df['age'] > 30]`). This allows you to quickly query and segment your data without writing complex loops. Furthermore, you can add new columns, perform column-wise operations, handle missing data, and group data for aggregation with methods like `groupby()`.
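To try the loading and inspection steps above without a file on disk, the following sketch reads CSV text from an in-memory buffer. The column names and values are made up for the example; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a file such as 'filename.csv'.
csv_text = """name,age,city
Alice,25,New York
Bob,30,Paris
Charlie,35,London
David,40,Tokyo
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))     # first two rows
df.info()             # column types and non-null counts
print(df.describe())  # summary statistics for numeric columns

# Boolean filtering: keep only rows where age is greater than 30.
over_30 = df[df["age"] > 30]
print(over_30)
```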
Pandas Operations and Data Cleaning
Pandas is particularly strong at **data cleaning** and preparation, which often consumes a significant portion of a data scientist's time. It provides straightforward methods for handling missing values, such as `df.dropna()` to remove rows with missing data or `df.fillna()` to fill them with a specific value. You can also easily rename columns, change data types, and perform complex transformations on your data using methods like `apply()`. For example, you could apply a custom function to every row or column to transform the data in a specific way. Pandas also provides excellent support for time-series data, making it a go-to tool for financial analysis and other time-based datasets. The ability to perform these operations in a highly expressive and efficient manner has made Pandas a cornerstone of the Python data science stack. Any project that involves working with structured data, from small scripts to large-scale data pipelines, will benefit from using this powerful library.
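A few of the cleaning steps described above, sketched on a tiny made-up table. The column names, the pass mark of 80, and the choice to fill missing scores with the column mean are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "score": [90.0, np.nan, 78.0],
    "age": ["25", "30", "35"],  # numbers stored as strings
})

# Rename columns, then convert the age strings to integers.
clean = raw.rename(columns={"score": "Score", "age": "Age"})
clean["Age"] = clean["Age"].astype(int)

# Fill the missing score with the mean of the known scores.
clean["Score"] = clean["Score"].fillna(clean["Score"].mean())

# apply() runs a custom function on every value in the column.
clean["Grade"] = clean["Score"].apply(lambda s: "pass" if s >= 80 else "fail")
print(clean)

# Alternative: drop any row that contains a missing value.
dropped = raw.dropna()
print(dropped)
```

Whether to fill or drop missing values depends on the dataset; filling keeps every row but introduces estimated values.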
import pandas as pd
import numpy as np
# Create a dictionary to hold data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 28],
    'City': ['New York', 'Paris', 'London', 'Tokyo', 'New York'],
    'Score': [90, 85, 92, 78, np.nan]
}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Basic data inspection
print("\nDataFrame Info:")
df.info()
# Filtering data
young_people = df[df['Age'] < 35]
print("\nPeople under 35:")
print(young_people)
# Handling missing data
df_filled = df.fillna(0)
print("\nDataFrame with missing values filled:")
print(df_filled)
# Adding a new column
df['Passed'] = df['Score'] > 80
print("\nDataFrame with 'Passed' column:")
print(df)
Try It Yourself: Analyze and Group Data
Using the DataFrame from the example above, perform the following tasks:
1. Calculate the average score of all students.
2. Group the data by `City` and calculate the average `Age` for each city. Print the result.
# Your code here
import pandas as pd
import numpy as np
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 28],
    'City': ['New York', 'Paris', 'London', 'Tokyo', 'New York'],
    'Score': [90, 85, 92, 78, np.nan]
}
df = pd.DataFrame(data)
# 1. Calculate the average score
average_score = df['Score'].mean()
print(f"Average score of all students: {average_score:.2f}")
# 2. Group by city and get the average age
average_age_by_city = df.groupby('City')['Age'].mean()
print("\nAverage age by city:")
print(average_age_by_city)
Matplotlib and Data Visualization
Once you've cleaned and analyzed your data with libraries like Pandas, the next step is often to visualize it. **Data visualization** is the process of representing data graphically to make it easier to understand and discover insights. In Python, the most popular and foundational library for this is **Matplotlib**. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations. It is highly customizable and can be used to create a wide variety of plots, including line plots, bar charts, scatter plots, and histograms. While Matplotlib's API can seem verbose at first, its flexibility means you have fine-grained control over every element of your plot, from the title and axis labels to the color and style of each line. Many other visualization libraries, such as Seaborn, are built on top of Matplotlib, which means that understanding the basics of Matplotlib is a prerequisite for using many of the more specialized tools.
The most common way to create a plot in Matplotlib is using its `pyplot` module, which provides a MATLAB-like interface. You typically import it as `plt` (`import matplotlib.pyplot as plt`). A basic plotting workflow involves a few simple steps: first, you create a figure and an axes object; then, you use the axes object to plot your data (e.g., `ax.plot(x, y)`); you add labels and a title to make the plot informative; and finally, you use `plt.show()` to display the plot. This object-oriented approach gives you a lot of control and is considered the best practice for creating complex visualizations. However, for quick and simple plots, you can also use a state-based approach directly with `plt.plot()` and `plt.show()`, which is a bit less verbose. Knowing the difference between these two approaches will help you write both quick, exploratory plots and more robust, publication-quality figures.
Common Plot Types and Customization
Matplotlib supports a vast array of plot types, each suited for a different kind of data or question. **Line plots** are great for showing trends over time or continuous data. **Bar charts** are perfect for comparing discrete categories. **Scatter plots** are used to visualize the relationship between two variables, often with a large number of data points. **Histograms** are ideal for showing the distribution of a single numerical variable. Beyond these, Matplotlib can also create pie charts, box plots, and more. Customization is where Matplotlib truly shines. You can change the colors of your plots, add legends to distinguish different data series, and annotate specific data points with text. You can also create multiple plots within a single figure using subplots, which is essential for comparing different views of your data. For example, you could have a line plot and a bar chart side-by-side in the same figure. Matplotlib's powerful capabilities, especially when combined with Pandas DataFrames, make it an indispensable tool for anyone who needs to visually communicate data insights.
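The side-by-side layout mentioned above can be sketched with `plt.subplots`. The data, colors, and annotation text are arbitrary choices for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)
categories = ["A", "B", "C"]
values = [4, 7, 2]

# One figure with two axes arranged in a single row.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: a line plot with a legend and an annotated point.
ax_line.plot(x, np.sin(x), color="purple", label="sin(x)")
ax_line.annotate("first peak", xy=(np.pi / 2, 1.0), xytext=(3, 0.5),
                 arrowprops={"arrowstyle": "->"})
ax_line.set_title("Line plot")
ax_line.legend()

# Right panel: a bar chart of discrete categories.
ax_bar.bar(categories, values, color="teal")
ax_bar.set_title("Bar chart")

fig.tight_layout()  # avoid overlapping titles and labels
plt.show()
```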
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Data for plotting
x = np.linspace(0, 10, 100)
y_sin = np.sin(x)
y_cos = np.cos(x)
# Create a figure and axes
fig, ax = plt.subplots(figsize=(10, 6))
# Plotting the data
ax.plot(x, y_sin, label='sin(x)', color='blue', linestyle='--')
ax.plot(x, y_cos, label='cos(x)', color='red', marker='o', markersize=3)
# Adding titles and labels
ax.set_title("Sine and Cosine Functions")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
# Adding a legend and grid
ax.legend()
ax.grid(True)
# Show the plot
plt.show()
# Another example: A bar chart from a Pandas DataFrame
data = {'Category': ['A', 'B', 'C', 'D'], 'Value': [10, 25, 15, 30]}
df = pd.DataFrame(data)
df.plot(kind='bar', x='Category', y='Value', legend=False, title='Bar Chart Example')
plt.ylabel("Values")
plt.show()
Try It Yourself: Create a Scatter Plot with Custom Labels
1. Create two lists of 20 random numbers each, representing `x_values` and `y_values`.
2. Use Matplotlib to create a scatter plot of these values.
3. Add a title, and labels for the x and y axes.
4. Customize the color of the markers to be 'green' and their size to 50. Display the plot.
# Your code here
import matplotlib.pyplot as plt
import random
# 1. Create random data
x_values = [random.randint(1, 100) for _ in range(20)]
y_values = [random.randint(1, 100) for _ in range(20)]
# 2. Create the scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x_values, y_values, c='green', s=50, alpha=0.7)
# 3. Add labels and title
plt.title("Random Scatter Plot")
plt.xlabel("X-axis Values")
plt.ylabel("Y-axis Values")
# 4. Display the plot
plt.grid(True)
plt.show()
Requests for Web APIs
One of the most common tasks in modern programming is interacting with web services. An **API** (Application Programming Interface) is a set of rules that allows different applications to communicate with each other. A **Web API** provides a way for your code to send requests to a web server and receive data in return, typically in JSON format. Python’s standard library provides modules for this, but the third-party library **Requests** is the de-facto standard for making HTTP requests. It is known for its elegant and simple API, which makes sending requests and handling responses feel incredibly intuitive and "human-friendly." With Requests, you can easily perform all the standard HTTP operations, such as GET (to retrieve data), POST (to send new data), PUT (to update data), and DELETE (to remove data).
The most basic operation is a **GET** request, which is used to retrieve information from a server. You simply call `requests.get('url')`. The function returns a `Response` object, which contains all the information from the server's reply. The `Response` object has many useful attributes and methods. You can check the status code with `response.status_code` to see if the request was successful (200 is OK, 404 is Not Found, 500 is Server Error). You can access the content of the response as a string with `response.text` or, if the response is in JSON format, you can use `response.json()` to automatically parse it into a Python dictionary or list. This automatic parsing is one of the features that makes the Requests library so convenient. You don't have to manually deal with serialization and deserialization; Requests handles it for you, allowing you to work with the data directly.
Sending Data and Headers
For operations like creating new resources or submitting forms, you'll need to use a **POST** request. The Requests library makes this simple by allowing you to pass a dictionary of data to the `requests.post()` function. The library automatically encodes this data into the correct format for the server. You can also easily send custom headers with your requests by passing a dictionary to the `headers` parameter. Headers are often used for authentication, specifying the content type, or sending other metadata. For example, many APIs require an API key to be sent in the header of every request. Requests also handles advanced features like sessions, which allow you to persist certain parameters across multiple requests, and file uploads. Because of its simplicity and powerful features, the Requests library is an essential tool for any Python developer who needs to work with web services, which is a key part of building modern, connected applications.
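Sessions and custom headers can be explored without contacting a server by preparing a request and inspecting it. In this sketch the URL, the `Authorization` header value, and the query parameters are placeholders, not a real API.

```python
import requests

# A session keeps headers (and cookies) across every request made through it.
session = requests.Session()
session.headers.update({"Authorization": "Bearer YOUR_API_KEY"})  # placeholder key

# Build a request without sending it, so we can see what would go out.
request = requests.Request(
    "GET",
    "https://api.example.com/data",  # placeholder URL
    params={"page": 2, "limit": 10},
)
prepared = session.prepare_request(request)

print(prepared.method)
print(prepared.url)  # query parameters are encoded into the URL
print(prepared.headers["Authorization"])

# To actually send it (requires a real endpoint):
# response = session.send(prepared, timeout=10)
```

In everyday code you would usually call `session.get(url, params=...)` directly; preparing the request by hand is shown here only because it makes the merged headers and encoded URL visible.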
import requests
# Example of a simple GET request
url_get = "https://jsonplaceholder.typicode.com/posts/1"
response = requests.get(url_get)
print(f"Status Code: {response.status_code}")
if response.status_code == 200:
    post_data = response.json()
    print("GET request successful. Post data:")
    print(post_data)
else:
    print("GET request failed.")

# Example of a POST request
url_post = "https://jsonplaceholder.typicode.com/posts"
new_post = {
    "title": "My First Post",
    "body": "This is the content of my post.",
    "userId": 1
}
headers = {'Content-Type': 'application/json'}  # Some APIs require this
post_response = requests.post(url_post, json=new_post, headers=headers)
print("\nPOST request status code:")
print(post_response.status_code)
if post_response.status_code == 201:  # 201 is "Created"
    created_post = post_response.json()
    print("POST request successful. Created post:")
    print(created_post)
else:
    print("POST request failed.")
Try It Yourself: Make a GET Request with Query Parameters
Many APIs use query parameters to filter or search for data. The URL for this often looks like `https://api.example.com/data?param1=value1&param2=value2`. The Requests library makes this easy with a `params` dictionary. Make a GET request to the following API endpoint `https://jsonplaceholder.typicode.com/posts`, but use query parameters to only retrieve posts from `userId` number 1. Print the number of posts returned and the title of the first post.
# Your code here
import requests
url = "https://jsonplaceholder.typicode.com/posts"
params = {"userId": 1}
response = requests.get(url, params=params)
if response.status_code == 200:
    posts = response.json()
    print(f"Number of posts for userId 1: {len(posts)}")
    if posts:
        print(f"Title of the first post: {posts[0]['title']}")
else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")
Flask/Django for Web Development
Python has become a powerhouse in web development, thanks to its powerful and flexible frameworks. A **web framework** is a collection of libraries and tools that provide a standardized way to build web applications, handling common tasks like routing URLs, handling requests and responses, and managing templates. This saves developers from having to reinvent the wheel for every project. The two most popular Python web frameworks are **Flask** and **Django**. Flask is a micro-framework, meaning it's lightweight and minimalist. It provides only the essential tools to get a web application up and running, giving developers the freedom to choose their own libraries for other features like databases, authentication, or form validation. This makes Flask an excellent choice for smaller projects, APIs, or for developers who want complete control over their stack. Its simplicity also makes it very easy to learn for beginners.
**Django**, on the other hand, is a "batteries-included" framework. It's a full-stack framework that comes with many built-in components, including an Object-Relational Mapper (ORM) to interact with databases, an admin panel for easy content management, and a robust templating system. Django's philosophy is to provide everything you need to build a complex web application from scratch, which makes it perfect for large-scale, enterprise-level projects. Django follows a Model-View-Controller (MVC) architectural pattern, though it refers to it as Model-View-Template (MVT). The **Model** defines the data structure, the **View** handles the business logic and serves the data, and the **Template** handles the presentation (the HTML). While Django has a steeper learning curve than Flask, its comprehensive nature and vast ecosystem mean you can build complex, data-driven applications very quickly and with a high degree of security and scalability.
Getting Started with Flask
For this lesson, we will focus on a basic Flask application, as its simplicity is ideal for a quick introduction to web development. A minimal Flask application can be as simple as a few lines of code. You create a Flask application instance, define a route using the `@app.route()` decorator to map a URL to a Python function, and then write the function to return the content to be displayed in the browser. You can run the application with a single command. Flask automatically handles the server, the HTTP requests, and the responses, allowing you to focus on the application logic. This simple pattern of defining routes and views is the foundation of almost all web applications. To extend Flask, you can use a wide variety of community-driven extensions for things like database integration (e.g., Flask-SQLAlchemy) or user authentication (e.g., Flask-Login). While this lesson only provides a taste of what's possible, understanding the basic request-response cycle and how to build a simple web server is a foundational skill for any Python developer.
# Note: This code requires Flask to be installed: `pip install Flask`
from flask import Flask
from markupsafe import escape

# Create a Flask app instance
app = Flask(__name__)

# Define a route for the home page
@app.route("/")
def home():
    return "<h1>Hello, World!</h1><p>Welcome to my first Flask web page.</p>"

# Define a route with a variable; <name> captures that part of the URL
@app.route("/user/<name>")
def greet_user(name):
    # escape() keeps user-supplied text from being interpreted as HTML
    return f"<h1>Hello, {escape(name)}!</h1>"

# Run the application
if __name__ == "__main__":
    app.run(debug=True)

# To run this code:
# 1. Save it as app.py
# 2. Open a terminal in the same directory
# 3. Run the command `python app.py`
# 4. Open a web browser and go to http://127.0.0.1:5000/
# 5. Try going to http://127.0.0.1:5000/user/Alice
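If you would like to see the request-response cycle without starting a server or opening a browser, Flask's built-in test client can call routes directly. The sketch below re-declares two small routes so that it is self-contained; the plain-text return values are simplifications for this example.

```python
from flask import Flask
from markupsafe import escape

app = Flask(__name__)


@app.route("/")
def home():
    return "Hello, World!"


@app.route("/user/<name>")
def greet_user(name):
    return f"Hello, {escape(name)}!"


# The test client sends requests to the app in-process, with no network involved.
with app.test_client() as client:
    home_response = client.get("/")
    home_status = home_response.status_code
    home_text = home_response.get_data(as_text=True)

    user_text = client.get("/user/Alice").get_data(as_text=True)
    missing_status = client.get("/does-not-exist").status_code

print(home_status, home_text)
print(user_text)
print(missing_status)  # unknown URLs produce a 404 response
```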
Try It Yourself: Create a Basic API Endpoint with Flask
Extend the Flask application to include a new route, `/api/data`. This route should return a JSON response containing a dictionary with some sample data (e.g., `{"status": "success", "message": "API data retrieved"}`). Remember to import the `jsonify` function from Flask to make returning JSON easy and correct. Test your API by visiting the URL in your browser.
# Your code here
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def home():
    return "<h1>Hello, World!</h1>"

@app.route("/api/data")
def get_api_data():
    data = {
        "status": "success",
        "message": "API data retrieved",
        "data": [1, 2, 3, 4, 5]
    }
    return jsonify(data)

if __name__ == "__main__":
    app.run(debug=True)
Module 5: Advanced Applications
Web Scraping with BeautifulSoup
**Web scraping** is the process of extracting data from websites. It's a powerful technique for gathering information that isn't available through a public API. Whether you need to collect product prices from an e-commerce site, news headlines from a news portal, or job listings from a career website, web scraping can automate this task. A web scraping project typically involves two main steps: first, making an HTTP request to download the HTML content of a page, and second, parsing that HTML to extract the specific data you're looking for. While Python's standard library can handle the request part, the parsing step is where a dedicated library becomes invaluable. The library **BeautifulSoup** is the industry standard for this task. It's a Python library for pulling data out of HTML and XML files and creating a parse tree that you can navigate and search.
The first step in any scraping task is to get the HTML content of the page. You'll typically use the **Requests** library for this, which we covered in a previous lesson. You simply make a GET request to the target URL and get the response text. Once you have the HTML content as a string, you pass it to the BeautifulSoup constructor, which creates a `BeautifulSoup` object. This object represents the entire document as a tree of objects, allowing you to access elements and their content using a clear and intuitive syntax. For example, to find the first `<h1>` tag in the document, you can simply write `soup.h1`. BeautifulSoup also provides powerful methods like `find()` and `find_all()` to search the document for tags with specific names, attributes (like an `id` or `class`), or text content. The `find_all()` method returns a list of all matching elements, which you can then iterate over to extract the data you need.
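The lookups described above can be tried on a small inline HTML string, with no network request. The markup, ids, and class names here are invented for the example.

```python
from bs4 import BeautifulSoup

# Invented HTML standing in for a downloaded page.
html = """
<html><body>
<h1 id="main-title">Daily News</h1>
<p class="headline">Local library extends opening hours</p>
<p class="headline">City marathon route announced</p>
<p class="note">Updated hourly</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Attribute-style access returns the first matching tag.
print(soup.h1.get_text())

# find() returns the first match for a tag name plus attribute filters.
title_tag = soup.find("h1", id="main-title")
print(title_tag["id"])

# find_all() returns every match; class_ avoids clashing with the keyword.
headlines = soup.find_all("p", class_="headline")
for headline in headlines:
    print(headline.get_text())
```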
Navigating the Parse Tree and Extracting Data
Web scraping requires a solid understanding of a page's HTML structure. You use BeautifulSoup to navigate this structure, moving from a parent tag to its children or siblings. For instance, if you have a list of products on a page, you might first find the `div` that contains the product list, then loop through each individual product `div` inside it, and finally extract the product name, price, and image URL from the tags within each product `div`. To extract the text content from a tag, you use the `.get_text()` method. To get the value of an attribute (e.g., the `href` attribute of an `<a>` tag), you access it like a dictionary key: `tag['href']`. A key part of web scraping is being respectful of the websites you scrape. You should always check a site's `robots.txt` file to see if scraping is allowed, and you should avoid overwhelming a site with too many requests. Additionally, many modern websites use JavaScript to load content, which may require more advanced tools like Selenium, but BeautifulSoup remains an essential tool for static content scraping.
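The product-list walk described above might look like this on a made-up snippet of HTML. The ids, class names, and product data are assumptions for the example and do not come from any real site.

```python
from bs4 import BeautifulSoup

# Invented product listing.
html = """
<div id="product-list">
  <div class="product">
    <a href="/items/1"><span class="name">Notebook</span></a>
    <span class="price">$4.50</span>
  </div>
  <div class="product">
    <a href="/items/2"><span class="name">Pen</span></a>
    <span class="price">$1.20</span>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Step 1: find the container. Step 2: loop over each product inside it.
container = soup.find("div", id="product-list")
products = []
for product in container.find_all("div", class_="product"):
    products.append({
        "name": product.find("span", class_="name").get_text(strip=True),
        "price": product.find("span", class_="price").get_text(strip=True),
        "link": product.a["href"],  # attribute access works like a dict
    })
print(products)

# Moving around the tree: up to the parent, across to the next sibling.
first = container.find("div", class_="product")
print(first.parent["id"])
second = first.find_next_sibling("div")
print(second.a["href"])
```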
# Note: This code requires requests and beautifulsoup4 to be installed
# `pip install requests beautifulsoup4`
import requests
from bs4 import BeautifulSoup

# The URL of the page to scrape
url = "http://books.toscrape.com/"

try:
    # A timeout stops the script from hanging forever if the server does not respond
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Raise an HTTPError for bad responses (4xx or 5xx)

    # Create a BeautifulSoup object to parse the HTML
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find the main container for the book list
    books_container = soup.find('section')

    # Find all the book articles within the container
    books = books_container.find_all('article', class_='product_pod')
    print(f"Found {len(books)} books on the page.")
    print("---------------------------------")

    # Iterate over the books and extract data
    for book in books[:5]:  # Extract data for the first 5 books
        title = book.h3.a['title']
        price = book.find('p', class_='price_color').get_text(strip=True)
        rating_class = book.p['class'][1]  # Second class name of the first <p>, e.g. "Three"
        print(f"Title: {title}")
        print(f"Price: {price}")
        print(f"Rating: {rating_class}")
        print("---------------------------------")
except requests.exceptions.RequestException as e:
    print(f"Error fetching the URL: {e}")
Try It Yourself: Scrape a Single Quote
Using the same website as the example, `http://books.toscrape.com/`, find and print the text of the quote at the top of the page. The quote is located within a `<p>` tag with the class `lead`. Try to find the author of the quote as well, which is in a `<footer>` tag.
# Your code here
import requests
from bs4 import BeautifulSoup

url = "http://books.toscrape.com/"
response = requests.get(url, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')

# Find the quote and author
quote_tag = soup.find('p', class_='lead')
# find() returns None when nothing matches, so check the footer before reading its <a> tag
footer_tag = soup.find('footer')
author_tag = footer_tag.a if footer_tag else None

if quote_tag and author_tag:
    quote_text = quote_tag.get_text(strip=True)
    author_name = author_tag.get_text(strip=True)
    print(f"Quote: \"{quote_text}\"")
    print(f"Author: {author_name}")
else:
    print("Could not find the quote or author.")
Module 5: Advanced Applications
Database Integration (SQLite, PostgreSQL)
For most applications, data persistence goes beyond simple files; it requires a **database**. A database provides a structured way to store, manage, and retrieve large amounts of data efficiently. Python has excellent support for interacting with various database systems. For a beginner, **SQLite** is the perfect starting point. It's a self-contained, file-based database engine that requires no separate server process. This means you can create a complete database in a single file on your disk without any complex setup. Python's standard library includes the `sqlite3` module, which provides a straightforward API for connecting to an SQLite database, creating tables, and performing CRUD (Create, Read, Update, Delete) operations using SQL queries. The `sqlite3` module is the simplest way to get started with relational databases in Python, making it ideal for learning and for developing small-scale applications or prototypes. It's also often used for caching and data storage in mobile apps.
For larger, more complex applications that require more power and scalability, you'll likely use a client-server database like **PostgreSQL** or MySQL. These databases run as separate services and can handle multiple users, complex queries, and large datasets with high performance. To connect to these databases from Python, you'll need a specific database driver library, such as `psycopg2` for PostgreSQL or `mysql-connector-python` for MySQL. These libraries provide a similar API to the `sqlite3` module, following the Python Database API Specification (DB-API). The core workflow involves establishing a connection to the database, creating a cursor object (which is used to execute SQL commands), executing your queries, and committing the changes. For reading data, you can use methods like `cursor.fetchone()` to retrieve a single row or `cursor.fetchall()` to get all the results as a list of tuples. When you're done, it's essential to close both the cursor and the database connection to free up resources.
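The connect, cursor, execute, commit, fetch, close sequence described above can be sketched with `sqlite3`, since it follows the same DB-API pattern. This example uses an in-memory database and a made-up `tasks` table; `contextlib.closing` is one way to make sure the cursor and connection are closed even if a query raises an error.

```python
import sqlite3
from contextlib import closing

# ":memory:" creates a throwaway database that lives only for this connection
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as cursor:
        cursor.execute("CREATE TABLE tasks (title TEXT, done INTEGER)")
        cursor.executemany(
            "INSERT INTO tasks (title, done) VALUES (?, ?)",
            [("write tests", 0), ("fix bug", 1)],
        )
        conn.commit()  # make the inserts permanent

        cursor.execute("SELECT title, done FROM tasks ORDER BY title")
        first_row = cursor.fetchone()   # a single tuple, or None when no rows remain
        remaining = cursor.fetchall()   # a list of the rows not yet fetched

print(first_row)
print(remaining)
```

Because `fetchone()` consumes the first row, the later `fetchall()` call returns only the rows that are left.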
Using an ORM for Simplified Database Interaction
While writing raw SQL queries is powerful, it can also be tedious and prone to errors. A more Pythonic approach is to use an **ORM** (Object-Relational Mapper). An ORM allows you to interact with your database using Python objects and classes instead of raw SQL. It provides a layer of abstraction that maps Python classes to database tables and object attributes to table columns. **SQLAlchemy** is one of the most widely used ORMs for Python. It offers a flexible toolkit that can be used with a wide range of databases. With an ORM, you can perform database operations like creating a new record (`new_user = User(name='Alice')`), querying for records (`session.query(User).filter_by(name='Alice')`), and updating records (`user.name = 'Bob'`). This approach makes your code cleaner, more secure (the ORM builds parameterized queries for you, which helps prevent SQL injection attacks), and more portable, as you can often switch databases with little or no change to your application code. For larger data-driven applications, a library like SQLAlchemy is well worth considering to streamline database interactions and improve code quality.
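To make the class-to-table mapping idea concrete without installing anything, here is a small hand-rolled sketch built on `sqlite3` and a dataclass. It is not SQLAlchemy and does not use its API; the `UserStore` helper and its method names are hypothetical and exist only to show how an ORM-style layer hides SQL behind objects.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """One row of the users table; each field corresponds to a column."""
    name: str
    email: str
    id: Optional[int] = None


class UserStore:
    """Hypothetical helper that hides the SQL behind object-based methods."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL UNIQUE)"
        )

    def add(self, user: User) -> User:
        # Parameter placeholders (?) keep user input out of the SQL text
        cur = self.conn.execute(
            "INSERT INTO users (name, email) VALUES (?, ?)", (user.name, user.email)
        )
        self.conn.commit()
        user.id = cur.lastrowid  # copy the generated primary key back onto the object
        return user

    def find_by_name(self, name: str) -> Optional[User]:
        row = self.conn.execute(
            "SELECT name, email, id FROM users WHERE name = ?", (name,)
        ).fetchone()
        return User(*row) if row else None


conn = sqlite3.connect(":memory:")
store = UserStore(conn)
store.add(User(name="Alice", email="alice@example.com"))
found = store.find_by_name("Alice")
missing = store.find_by_name("Bob")
print(found)
print(missing)
conn.close()
```

A real ORM generalizes this pattern to any class, generates the SQL for you, and tracks changes to objects so that updates can be written back automatically.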
# Note: No external libraries needed for SQLite
import sqlite3
# Connect to a database (or create it if it doesn't exist)
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
# Create a table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT NOT NULL UNIQUE
    )
''')
conn.commit()
# Insert data
cursor.execute("INSERT OR IGNORE INTO users (name, email) VALUES (?, ?)", ('Alice', 'alice@example.com'))
cursor.execute("INSERT OR IGNORE INTO users (name, email) VALUES (?, ?)", ('Bob', 'bob@example.com'))
conn.commit()
print("Inserted two users into the database.")
# Read data
cursor.execute("SELECT * FROM users")
all_users = cursor.fetchall()
print("\nAll users:")
for user in all_users:
    print(user)
# Update data
cursor.execute("UPDATE users SET name = ? WHERE name = ?", ('Charlie', 'Alice'))
conn.commit()
print("\nUpdated Alice's name to Charlie.")
# Read the updated data
cursor.execute("SELECT * FROM users WHERE name = ?", ('Charlie',))
updated_user = cursor.fetchone()
print(f"Updated user: {updated_user}")
# Close the connection
conn.close()
Try It Yourself: Add a New Table and Query It
Modify the SQLite example to add a new table called `products` with columns for `name` and `price`. Insert at least three products into the table. Then, write a query to select and print all products with a price greater than 100.
# Your code here
import sqlite3
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
# Create the products table
# (name is UNIQUE so that INSERT OR IGNORE skips duplicates when the script is re-run)
cursor.execute('''
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        price REAL NOT NULL
    )
''')
conn.commit()
# Insert products
products_list = [('Laptop', 1200.00), ('Mouse', 25.50), ('Keyboard', 150.00)]
cursor.executemany("INSERT OR IGNORE INTO products (name, price) VALUES (?, ?)", products_list)
conn.commit()
# Select products with a price > 100
cursor.execute("SELECT * FROM products WHERE price > ?", (100,))
expensive_products = cursor.fetchall()
print("\nProducts with a price greater than 100:")
for product in expensive_products:
    print(product)
conn.close()
Testing with pytest
Writing code is only half the battle; ensuring that it works correctly is just as important. **Software testing** is the process of verifying that your code behaves as expected and does not contain bugs. This is a critical practice for building reliable and maintainable applications. While you can manually test your code, this is time-consuming and prone to human error. **Automated testing** solves this problem by allowing you to write code that tests your code. These tests can be run automatically and frequently, providing rapid feedback and catching bugs early. In Python, the standard library offers a module called `unittest`, but a third-party library called **pytest** has become a very widely used alternative thanks to its simplicity, flexibility, and powerful features.
The philosophy of **pytest** is to make writing tests as easy as possible. You write your test functions in files that start with `test_` or end with `_test.py`. A test function itself must start with the prefix `test_`. Inside the test function, you use simple `assert` statements to check if a condition is true. If an `assert` statement fails, pytest catches the failure and provides a detailed report of what went wrong. For example, a test for a simple function that adds two numbers might look like `def test_add_numbers(): assert add(2, 3) == 5`. This simple, declarative style is easy to read and write. To run your tests, you simply navigate to your project directory in a terminal and run the command `pytest`. The library automatically discovers and runs all your tests and provides a clean, easy-to-read summary of the results.
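As a minimal sketch of what such a test file might contain, the example below uses a hypothetical file name (`test_math_utils.py`) and defines `add` inline to stay self-contained; in a real project you would import it from the module under test. Plain test functions like these need no pytest import at all.

```python
# Hypothetical contents of test_math_utils.py
def add(x, y):
    """Stand-in for the function under test."""
    return x + y


def test_add_positive_numbers():
    assert add(2, 3) == 5


def test_add_negative_numbers():
    assert add(-2, -3) == -5


if __name__ == "__main__":
    # pytest discovers and calls the test_ functions automatically;
    # calling them by hand here just shows that they are ordinary functions.
    test_add_positive_numbers()
    test_add_negative_numbers()
    print("all checks passed")
```

When an `assert` fails, the function raises `AssertionError`; pytest catches it and reports the values involved in the failed comparison.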
Fixtures and Parameterization
Pytest's power comes from its advanced features, such as **fixtures** and **parameterization**. A **fixture** is a function that sets up the necessary environment for your tests to run. For example, a fixture could create a temporary database connection, a sample file, or a test user. Instead of writing this setup code in every single test, you can simply define a fixture and have pytest automatically pass it into your test functions. This helps to reduce code duplication and keeps your tests clean and focused on their specific purpose. **Parameterization** allows you to run a single test function with multiple different inputs, which is invaluable for testing a function with a variety of edge cases. You use the `@pytest.mark.parametrize` decorator to define the test cases, and pytest runs the test function once for each set of parameters. This prevents you from having to write a separate test function for every single test case. By using pytest, you can build a robust test suite that gives you confidence in your code's correctness, which is essential for any serious software development project.
# Note: This code is meant to be saved in a file named test_calculator.py
# and run with the `pytest` command in a terminal.
# You need to have pytest installed: `pip install pytest`

# Imagine this is the code you want to test, saved in a file named calculator.py
# in the same directory:
# def add(x, y):
#     return x + y
# def subtract(x, y):
#     return x - y

import pytest
from calculator import add  # import the function under test from calculator.py

# A simple test function
def test_add_two_numbers():
    assert add(2, 3) == 5

# A fixture: pytest passes its return value to any test that names it as a parameter
@pytest.fixture
def sample_list():
    return [1, 2, 3]

def test_list_length(sample_list):
    assert len(sample_list) == 3

# A parameterized test: runs once for each (a, b, expected) tuple
@pytest.mark.parametrize("a, b, expected", [
    (1, 2, 3),
    (5, 5, 10),
    (10, -5, 5),
])
def test_add_with_multiple_values(a, b, expected):
    assert add(a, b) == expected

# You would run this from your terminal: `pytest`
Try It Yourself: Write a Test for a String Function
Assume you have a function `is_palindrome(s)` that returns `True` if a string is a palindrome (reads the same backward as forward) and `False` otherwise. Write a test file using `pytest` that includes three tests for this function: one for a simple palindrome (`"madam"`), one for a non-palindrome (`"hello"`), and a third parameterized test that checks a few more cases, including case insensitivity (e.g., `"Racecar"`).
# Your code here
# Save this in a file named `test_palindrome.py`
import pytest

def is_palindrome(s):
    # This is the function we want to test.
    # Keep only letters and digits, lowercased, so case, spaces, and punctuation are ignored.
    processed_s = "".join(ch.lower() for ch in s if ch.isalnum())
    return processed_s == processed_s[::-1]

def test_simple_palindrome():
    assert is_palindrome("madam")

def test_non_palindrome():
    assert not is_palindrome("hello")

@pytest.mark.parametrize("input_string, expected", [
    ("Racecar", True),
    ("A man a plan a canal Panama", True),
    ("No lemon, no melon", True),
    ("Python", False),
])
def test_complex_palindromes(input_string, expected):
    assert is_palindrome(input_string) == expected

# Tests are written. Now run `pytest` in your terminal.
Automation and Scripting
One of Python's most popular use cases is **automation and scripting**. A script is a program that automates a task that you would otherwise perform manually. This could be anything from organizing files on your computer to sending automated emails or interacting with web services. Python's simple syntax and rich standard library make it an ideal language for writing scripts that can save you a tremendous amount of time and effort. The power of scripting lies in its ability to handle repetitive, tedious, and error-prone tasks with precision and speed. The basic building blocks of scripting are the same as what we've already learned: functions, control structures, and loops. However, the key is knowing which libraries to use to interact with your system's environment.
The **`os`** and **`shutil`** modules are two of the most fundamental libraries for scripting. The `os` module provides a way to interact with the operating system, allowing you to perform tasks like creating or deleting directories, listing files, and changing the current working directory. The `os.path` submodule is particularly useful for manipulating file paths in a cross-platform way, ensuring your scripts work on Windows, macOS, and Linux. The `shutil` module is even more powerful for file operations, providing high-level functions for copying, moving, and deleting files and entire directory trees. For example, `shutil.copy()` can copy a file, and `shutil.move()` can move it. The ability to programmatically manage files and directories is the foundation of many automation tasks, such as nightly backups, log file cleanup, or data organization pipelines.
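The path helpers and copy/move functions mentioned above can be sketched as follows. The file and folder names are made up, and everything happens inside a temporary directory (via `tempfile`) so that nothing on your real file system is touched.

```python
import os
import shutil
import tempfile

# Work inside a temporary directory that is deleted automatically afterwards
with tempfile.TemporaryDirectory() as workspace:
    source = os.path.join(workspace, "notes.txt")
    with open(source, "w") as f:
        f.write("draft")

    # Split a path into its parts in a platform-independent way
    file_name = os.path.basename(source)            # "notes.txt"
    stem, extension = os.path.splitext(file_name)   # ("notes", ".txt")

    # Copy the file, then move the copy into a new "archive" subdirectory
    backup = shutil.copy(source, os.path.join(workspace, stem + "_backup" + extension))
    archive_dir = os.path.join(workspace, "archive")
    os.makedirs(archive_dir)
    shutil.move(backup, archive_dir)

    remaining = sorted(os.listdir(workspace))
    archived = os.listdir(archive_dir)

print(remaining)
print(archived)
```

Building paths with `os.path.join()` rather than string concatenation is what keeps the script portable between Windows and Unix-like systems.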
Automating System and Web Tasks
Beyond file management, Python can also automate tasks on a system level. The **`subprocess`** module is a robust way to run other programs or shell commands from your Python script. This allows you to integrate your Python code with other tools and command-line utilities. For example, you could write a script that runs a system command, captures its output, and then processes that output with Python. The **`datetime`** and **`time`** modules are also essential for automation, as they allow you to work with dates and times, which is crucial for scheduling tasks or working with time-stamped data. When it comes to web-based automation, **Selenium** can drive a real browser to fill out forms, click buttons, and navigate pages, while **PyAutoGUI** controls the mouse and keyboard to automate desktop applications in general. While this is more advanced, it demonstrates how wide a range of tasks Python can automate, from the simple to the complex. The key takeaway is that if you can do a task manually on your computer, there's a very good chance you can write a Python script to do it for you.
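Here is a small sketch of running a command, capturing its output, and time-stamping the result. To keep it portable, the command being run is simply the current Python interpreter (`sys.executable`) printing a line; the 24-hour interval is an arbitrary value chosen for illustration.

```python
import subprocess
import sys
from datetime import datetime, timedelta

# Run a child process and capture what it prints
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str rather than bytes
    check=True,           # raise CalledProcessError if the command fails
)
output = result.stdout.strip()

# Time-stamp the result and work out when a hypothetical next run would be due
finished_at = datetime.now()
next_run = finished_at + timedelta(hours=24)

print(f"[{finished_at:%Y-%m-%d %H:%M:%S}] {output}")
print(f"Next run due: {next_run:%Y-%m-%d %H:%M:%S}")
```

Passing the command as a list of arguments, rather than a single shell string, avoids quoting problems and is generally the safer default.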
import os
import shutil
from datetime import datetime
# Get the current working directory
current_dir = os.getcwd()
print(f"Current directory: {current_dir}")
# Create a new directory
new_folder = "my_new_folder"
if not os.path.exists(new_folder):
    os.makedirs(new_folder)
    print(f"Created directory: {new_folder}")
# Create a sample file
file_path = os.path.join(new_folder, "sample.txt")
with open(file_path, "w") as f:
    f.write(f"This file was created on {datetime.now()}.")
print(f"Created file: {file_path}")
# List files in the new folder
files_in_folder = os.listdir(new_folder)
print(f"Files in '{new_folder}': {files_in_folder}")
# Move the folder
new_location = "temp_folder"
shutil.move(new_folder, new_location)
print(f"Moved '{new_folder}' to '{new_location}'.")
# Clean up
shutil.rmtree(new_location)
print(f"Removed '{new_location}'.")
Try It Yourself: Rename Multiple Files
Write a script that renames all files in a specific directory by adding a prefix to their filenames. For this exercise, assume you have a directory named `report_files` with a few dummy files. Your script should loop through these files and prepend the prefix to each name, so that, for example, `file1.txt` becomes `report_file1.txt`.
# Your code here
import os

folder_path = "report_files"
prefix = "report_"

# Create a dummy folder and files for the exercise
if not os.path.exists(folder_path):
    os.makedirs(folder_path)
    with open(os.path.join(folder_path, "file1.txt"), "w") as f:
        pass
    with open(os.path.join(folder_path, "file2.txt"), "w") as f:
        pass
    print("Created dummy files in 'report_files'.")

# List files to be renamed
print(f"Files before renaming: {os.listdir(folder_path)}")

# Loop through and rename files, skipping any that already carry the prefix
for filename in os.listdir(folder_path):
    if filename.endswith(".txt") and not filename.startswith(prefix):
        old_path = os.path.join(folder_path, filename)
        new_path = os.path.join(folder_path, prefix + filename)
        os.rename(old_path, new_path)

print(f"Files after renaming: {os.listdir(folder_path)}")

# Clean up (uncomment both lines to delete the folder afterwards)
# import shutil
# shutil.rmtree(folder_path)
Machine Learning Basics
Python has become the dominant language for **machine learning** and artificial intelligence due to its simplicity and powerful ecosystem of libraries. Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed. The core idea is to train an algorithm on a dataset to find patterns, and then use that model to make predictions or decisions on new, unseen data. The process typically involves several steps: collecting and preparing data, choosing a model, training the model, and then evaluating its performance. While machine learning is a vast and complex field, a basic understanding of the core concepts and the libraries used is a great first step for any developer.
**Scikit-learn** is among the most popular and user-friendly libraries for classic machine learning tasks. It provides a consistent API for a wide range of algorithms, from classification and regression to clustering and dimensionality reduction. Scikit-learn is built on top of NumPy and SciPy and works with Pandas DataFrames, making it a good fit for a complete data science workflow. To use it, you first prepare your data (often in a Pandas DataFrame), split it into training and testing sets, and then create a model instance (e.g., `model = RandomForestClassifier()`). The `model.fit()` method is then used to train the model on your training data. Once trained, you can use `model.predict()` to make predictions on your test data. This simple `fit-predict` workflow is shared by Scikit-learn's classification and regression models, which makes it very easy to try out different algorithms to see which one works best for your problem.
A Simple Classification Example
Let's consider a simple classification problem. Imagine you have a dataset of people and you want to predict if they are a student based on their age and income. A **classifier** is a machine learning model that predicts a categorical label (e.g., "student" or "not a student"). We would first need to load our data and define our features (age, income) and our target label (is_student). We would then split this data into a training set (what the model learns from) and a test set (what we use to evaluate the model). A common and simple model is the **K-Nearest Neighbors** classifier. This model makes predictions by looking at the "k" closest data points in the training set and assigning the most common label among them. While this is a very basic example, it demonstrates the fundamental workflow of a supervised machine learning task. After training the model and making predictions, you can evaluate its performance using metrics like accuracy, which Scikit-learn provides out of the box. This provides an objective measure of how well your model is performing and is a crucial part of the machine learning process.
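To show what "looking at the k closest data points" means in practice, here is a from-scratch sketch of the idea in plain Python. It is not Scikit-learn's implementation; the `knn_predict` function and the data are made up for illustration, and income is expressed in thousands so that it does not completely dominate the distance calculation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    # Pair each training point's distance to the query with that point's label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the labels of the k closest points
    nearest_labels = [label for _, label in sorted(distances)[:k]]
    # Majority vote among those labels
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up training data: (age, income in thousands); 1 = student, 0 = not a student
points = [(20, 15), (22, 18), (21, 20), (30, 45), (35, 60), (40, 80)]
labels = [1, 1, 1, 0, 0, 0]

young_prediction = knn_predict(points, labels, (23, 22))
older_prediction = knn_predict(points, labels, (38, 70))
print(young_prediction)
print(older_prediction)
```

Because the prediction depends on raw distances, features measured on very different scales (such as age in years and income in dollars) are usually rescaled before using a distance-based model like this one.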
# Note: This code requires numpy and scikit-learn to be installed
# `pip install numpy scikit-learn`
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
# Sample data: features (age, income) and target (is_student)
# 1 = student, 0 = not a student
features = np.array([
    [20, 15000], [22, 18000], [30, 45000], [35, 60000],
    [21, 20000], [25, 25000], [40, 80000], [45, 95000]
])
target = np.array([1, 1, 0, 0, 1, 1, 0, 0])
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.25, random_state=42)
# Create a K-Nearest Neighbors classifier model
model = KNeighborsClassifier(n_neighbors=3)
# Train the model
model.fit(X_train, y_train)
print("Model training complete.")
# Make predictions on the test data
y_pred = model.predict(X_test)
print(f"Predictions on test data: {y_pred}")
print(f"Actual labels for test data: {y_test}")
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"\nModel Accuracy: {accuracy * 100:.2f}%")
# Make a new prediction
new_person = np.array([[23, 22000]])
prediction = model.predict(new_person)
print(f"Prediction for a new person (age 23, income 22000): {prediction[0]}")
Try It Yourself: Make a Regression Prediction
A regression model predicts a continuous numerical value (e.g., house price). Using Scikit-learn's `LinearRegression` model, train a model to predict a person's income based on their age. Use the same `features` and `target` data as above, but this time, the `target` should be the `income` column of your original features array. Split the data, train the model, and then make a prediction for a new person of age 33. Print the predicted income.
# Your code here
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
# Data: features (age) and target (income)
age = np.array([20, 22, 30, 35, 21, 25, 40, 45]).reshape(-1, 1)
income = np.array([15000, 18000, 45000, 60000, 20000, 25000, 80000, 95000])
# Split the data
X_train, X_test, y_train, y_test = train_test_split(age, income, test_size=0.25, random_state=42)
# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)
# Make a prediction for a new person of age 33
new_person_age = np.array([[33]])
predicted_income = model.predict(new_person_age)
print(f"The predicted income for a person of age 33 is: ${predicted_income[0]:,.2f}")