
Functions

Create reusable code blocks with Python functions

Functions package reusable logic behind a name, a signature, and a return value. They appear in every CS50P, CS106A, and DATA 100 problem set from week 3 onward because graders test code by calling the function under multiple inputs. Two mistakes are especially costly: mutable default arguments and the late-binding closure trap.

1. def: signature, body, and the single-return rule

A function definition binds a name to a callable object. The `def` keyword names the function, the parentheses list parameters, and the indented block defines the body. Python passes arguments by **object reference**, the same model variables use: the function receives a name pointing to the caller's object. Reassigning the name inside the function does not affect the caller. Mutating the object does.
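The difference between rebinding and mutating can be sketched as follows. The names `rebind` and `mutate` are hypothetical, chosen only for this illustration.

```python
# Illustrative sketch: rebinding a parameter vs. mutating the object it names.
# Function and variable names are hypothetical.

def rebind(items):
    items = [99]        # rebinds the local name only; the caller's list is untouched
    return items


def mutate(items):
    items.append(99)    # changes the shared object; the caller sees the change


original = [1, 2, 3]

rebind(original)
print(original)  # [1, 2, 3]

mutate(original)
print(original)  # [1, 2, 3, 99]
```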

Every function returns something. Without an explicit `return`, the function returns `None`. This is the cause of the "function prints but returns nothing" confusion: `print(x)` inside a function shows `x` to the user, but the calling code receives `None` from the function. Graders test the **return value**, not the printed output, so a function that only `print`s fails the autograder even when the screen looks right.

The single-return-point rule from C is **not** Python style. Multiple `return` statements are idiomatic and often clearer. Early returns for guard conditions flatten nested `if` blocks: `if not valid: return None` near the top removes the need to indent the rest of the body.

Example

# Run with: python defbasics.py
# Function signature and the print-vs-return trap

def double(n):
    """Return n times 2. Pure function: no side effects."""
    return n * 2  # explicit return - graders test THIS value


def double_broken(n):
    print(n * 2)  # prints to stdout, returns None implicitly


result_good = double(5)
result_bad = double_broken(5)  # screen shows 10, but...
print(f"good: {result_good}, bad: {result_bad}")  # good: 10, bad: None


# Early-return guard pattern: cleaner than nested if/else
def letter_grade(score):
    if not isinstance(score, (int, float)):
        return None  # guard: bail early on bad input
    if score < 0 or score > 100:
        return None
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


print(letter_grade(87))   # B
print(letter_grade(-5))   # None
print(letter_grade("ab")) # None

2. Parameters, arguments, and the 4 argument styles

Python supports **four argument-passing styles** in a single signature: positional, keyword, `*args` (variable positional), and `**kwargs` (variable keyword). The order is fixed: positional first, `*args` next, then keyword-only, then `**kwargs` last. Mixing them well is the difference between a flexible API and a brittle one.

Default parameter values make a parameter optional. `def greet(name, greeting="Hello"):` lets callers write `greet("Alice")` or `greet("Alice", "Hi")`. Defaults are evaluated **once**, when the function is defined, not on each call. This produces the infamous mutable default bug: `def add(item, bucket=[]):` shares the same `[]` across every call that omits `bucket`, so the list grows between calls. The fix is `bucket=None` plus an `if bucket is None: bucket = []` line at the top of the body.

`*args` collects extra positional arguments into a tuple. `**kwargs` collects extra keyword arguments into a dict. The `*` and `**` are syntax; the names `args` and `kwargs` are only conventions: `*x` and `**y` would work, but most readers expect the conventional names. Use `*args` to write functions like `def total(*numbers)` that accept any number of arguments. Use `**kwargs` to forward configuration: `def wrapper(**opts): logger(**opts)`.
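Collecting and forwarding can be sketched in a few lines. The functions `total`, `area`, and `log_call` are hypothetical helpers, not part of any library.

```python
# Illustrative sketch: collecting with *args / **kwargs and forwarding them on.
# All function names here are hypothetical.

def total(*numbers):
    # numbers arrives as a tuple of every positional argument
    return sum(numbers)


def area(width, height, unit="cm"):
    return f"{width * height} {unit}^2"


def log_call(func, *args, **kwargs):
    # args is a tuple, kwargs is a dict; unpack both again when forwarding
    print(f"calling {func.__name__} with args={args} kwargs={kwargs}")
    return func(*args, **kwargs)


print(total(1, 2, 3))                    # 6
result = log_call(area, 3, 4, unit="m")
print(result)                            # 12 m^2
```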

Example

# Run with: python params.py
# All four parameter styles in one signature

def build_url(host, path, *segments, scheme="https", **query):
    """
    host, path: required positional
    *segments:  any number of extra path parts
    scheme:     keyword-only (after *segments)
    **query:    any keyword arg becomes a URL query param
    """
    full_path = "/".join([path, *segments])
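    # Values are joined as-is; real input would need percent-encoding
    # (for example with urllib.parse.urlencode).
    query_string = "&".join(f"{k}={v}" for k, v in query.items())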
    url = f"{scheme}://{host}/{full_path}"
    if query_string:
        url += f"?{query_string}"
    return url


# Various call styles all work:
print(build_url("api.dmph.com", "v1", "users", "42"))
# https://api.dmph.com/v1/users/42

print(build_url("api.dmph.com", "v1", "search", scheme="http", q="pandas", limit=10))
# http://api.dmph.com/v1/search?q=pandas&limit=10


# The mutable-default BUG and its fix
def bad_append(item, bucket=[]):     # BUG: shared list across calls
    bucket.append(item)
    return bucket

print(bad_append(1))  # [1]
print(bad_append(2))  # [1, 2]  <- surprise: previous call leaked in


def good_append(item, bucket=None):  # FIX: sentinel pattern
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(good_append(1))  # [1]
print(good_append(2))  # [2]  <- fresh list every call

3. Return values, tuple unpacking, and multiple returns

Python functions return **one object**. To return multiple values, package them in a tuple and let the caller unpack: `def stats(data): return min(data), max(data), sum(data)/len(data)`. Then `lo, hi, avg = stats(scores)` binds three names in one line. This is the standard pattern for any function that produces several related outputs.

A `return` with no value returns `None`. A function with no `return` statement also returns `None`. Both are valid, but an explicit `return None` reads better when other branches of the same function return a value, because it shows the "no result" case is deliberate. Functions used for side effects (printing, writing files, updating state) conventionally return `None`; functions that compute values always return them.
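The built-ins show the same convention: `list.sort()` works by side effect and returns `None`, while `sorted()` computes and returns a new list. A short sketch, with hypothetical variable names:

```python
# Side-effect function vs. value-returning function, using built-ins.
scores = [72, 91, 85]

in_place = scores.sort()      # side effect: scores itself is now sorted
print(in_place)               # None
print(scores)                 # [72, 85, 91]

fresh = sorted([3, 1, 2])     # computes and returns a new list
print(fresh)                  # [1, 2, 3]
```

Writing `scores = scores.sort()` is a common slip: it replaces the list with `None`.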

For optional outputs, two idioms compete. Return `None` when the operation fails (`def find_user(name) -> User | None`). Or raise an exception (`raise ValueError("not found")`). Use `None` when "not found" is expected and frequent. Use exceptions when "not found" indicates a bug or a precondition violation. The Python stdlib uses both: `dict.get(key)` returns `None` for a missing key, `dict[key]` raises `KeyError`.
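The two stdlib behaviours mentioned above can be compared side by side. The dictionary contents are made up for illustration.

```python
# dict.get returns None (or a supplied default); dict[key] raises KeyError.
ages = {"Alice": 30}

print(ages.get("Zoe"))        # None
print(ages.get("Zoe", 0))     # 0

try:
    ages["Zoe"]
except KeyError as exc:
    missing = exc.args[0]     # the key that was not found
    print(f"KeyError for {missing!r}")  # KeyError for 'Zoe'
```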

Example

# Run with: python returns.py
# Multiple returns via tuple unpacking + None vs exception

def summary(scores):
    """Return (min, max, mean) as a 3-tuple."""
    if not scores:
        return None  # explicit None for empty input
    return min(scores), max(scores), sum(scores) / len(scores)


result = summary([72, 85, 91, 68, 79])
if result is None:
    print("no data")
else:
    lo, hi, avg = result  # unpack into 3 names
    print(f"low={lo}, high={hi}, avg={avg:.1f}")


# Pattern A: return None for "not found" (cheap, expected)
def find_score(roster, name):
    for record in roster:
        if record["name"] == name:
            return record["score"]
    return None  # explicit, not implicit fall-through


# Pattern B: raise for "should never happen" (precondition)
def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    return a / b


roster = [{"name": "Alice", "score": 87}, {"name": "Bob", "score": 72}]
print(find_score(roster, "Alice"))   # 87
print(find_score(roster, "Zoe"))     # None - expected miss
print(divide(10, 2))                  # 5.0

4. Lambda expressions: anonymous functions for one-liners

A `lambda` expression creates a function **without binding it to a name**. The syntax is `lambda args: expression`. The body must be a single expression, not a block; no statements, no assignments, no multiple lines. Lambdas exist for **inline callbacks**: arguments to `sorted()`, `map()`, `filter()`, and pandas `.apply()` calls.
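As a small sketch of the callback use with `map()` and `filter()` (the data is invented for illustration):

```python
# Lambdas as inline callbacks for map() and filter().
scores = [55, 72, 91]

# map applies the lambda to every element
curved = list(map(lambda s: s + 5, scores))
print(curved)    # [60, 77, 96]

# filter keeps elements for which the lambda returns a truthy value
passing = list(filter(lambda s: s >= 60, scores))
print(passing)   # [72, 91]
```

Both `map` and `filter` return lazy iterators in Python 3, which is why the results are wrapped in `list()`.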

The canonical lambda is a sort key. `sorted(students, key=lambda s: s["score"])` sorts a list of dicts by score without writing a named helper. Pandas users see lambdas constantly: `df["price"].apply(lambda x: x * 1.08)` applies the transform to each element of the column. For anything more complex than one expression, use `def` and a named function. Three reasons: stack traces show the function name, the docstring documents intent, and the function is reusable.

Lambdas inherit the **late-binding closure** trap: a lambda inside a loop captures the **variable**, not its current value, because the name is looked up when the lambda is called rather than when it is defined. `funcs = [lambda: i for i in range(3)]` produces three lambdas that all return `2` (the final value of `i`), not `0, 1, 2`. The fix is a default argument binding: `funcs = [lambda i=i: i for i in range(3)]`. This is the same trap that catches many students who try to build a list of button-click handlers in Tkinter.

Example

# Run with: python lambdas.py
# Lambda as inline callback, plus the closure trap

students = [
    {"name": "Alice", "score": 85, "year": 3},
    {"name": "Bob",   "score": 72, "year": 1},
    {"name": "Carol", "score": 91, "year": 2},
]

# Sort by score, descending
by_score = sorted(students, key=lambda s: s["score"], reverse=True)
for s in by_score:
    print(s["name"], s["score"])

# Multi-key sort: year ascending, then score descending
by_year_then_score = sorted(students, key=lambda s: (s["year"], -s["score"]))
for s in by_year_then_score:
    print(s)


# THE CLOSURE TRAP: every lambda captures the variable, not the value
broken = [lambda: i for i in range(3)]
print([f() for f in broken])  # [2, 2, 2] - all see final i

# Fix: bind via default argument (evaluated at lambda definition)
fixed = [lambda i=i: i for i in range(3)]
print([f() for f in fixed])   # [0, 1, 2] - each captures its own i

5. Scope and namespaces: LEGB and the global/nonlocal keywords

Python resolves names using the **LEGB rule**: Local, Enclosing, Global, Built-in. When you reference `x` inside a function, Python checks the function's local namespace first, then the enclosing function (if nested), then the module-level globals, then Python's built-ins. The first hit wins. This is why `len = 5` inside a function shadows the built-in `len` for that function only.
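The shadowing case can be checked directly. The function name `shadowed` is hypothetical.

```python
# A local name hides the built-in only inside the function that assigns it.

def shadowed():
    len = 5            # local "len" wins over the built-in (L before B in LEGB)
    return len


print(shadowed())      # 5
print(len("abc"))      # 3 - the built-in is untouched outside the function
```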

Assignment to a name **creates a local variable** unless declared otherwise. Writing `count = count + 1` inside a function that reads a module-level `count` raises `UnboundLocalError`: Python sees the assignment and marks `count` as local, then crashes when reading it before the first assignment. Two fixes exist: `global count` declares the name refers to the module global; `nonlocal count` declares it refers to the enclosing function's local.
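A minimal sketch of the error and the `global` fix, assuming a module-level `count` and two hypothetical functions:

```python
# UnboundLocalError and the `global` fix. Function names are hypothetical.
count = 0  # module-level name


def broken_increment():
    count = count + 1   # the assignment makes `count` local for the whole body
    return count


def fixed_increment():
    global count        # rebind the module-level name instead
    count = count + 1
    return count


try:
    broken_increment()
except UnboundLocalError as exc:
    print(f"UnboundLocalError: {exc}")

first = fixed_increment()
print(first)   # 1
print(count)   # 1
```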

Use `global` rarely. Use `nonlocal` for **closures**: inner functions that mutate state defined in the outer function. Closures are the foundation of decorators, function factories, and the counter/cache patterns used in CS50P's "Outdated" and DATA 100's memoization assignments. If you find yourself reaching for `global` to share state, refactor to a class instead.
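One way the class refactor might look, as a sketch only: the `Counter` class below is a hypothetical example, not a prescribed design.

```python
# Sketch: shared state held on an instance instead of in a module global.
# The class name and attributes are hypothetical.

class Counter:
    """Keeps the count as instance state."""

    def __init__(self, start=0):
        self.count = start

    def increment(self):
        self.count += 1
        return self.count


clicks = Counter()
clicks.increment()
clicks.increment()
print(clicks.count)           # 2

downloads = Counter(100)      # separate instance, separate state
print(downloads.increment())  # 101
```

Each instance carries its own state, so no `global` declaration is needed and two counters cannot interfere with each other.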

Example

# Run with: python scope.py
# LEGB + nonlocal closure for a counter factory

x = "module"  # GLOBAL scope

def outer():
    x = "outer"  # ENCLOSING scope (relative to inner)

    def inner():
        x = "local"  # LOCAL scope
        print(f"inner sees: {x}")

    inner()
    print(f"outer sees: {x}")


outer()
print(f"module sees: {x}")
# inner sees: local
# outer sees: outer
# module sees: module


# Closure with nonlocal: a counter factory
def make_counter(start=0):
    """Return a function that, when called, returns the next integer."""
    count = start
    def increment():
        nonlocal count  # rebind enclosing "count", not create a new local
        count += 1
        return count
    return increment


tick = make_counter()
print(tick(), tick(), tick())  # 1 2 3 - independent state per closure

other = make_counter(100)
print(other(), other())        # 101 102 - isolated from tick
print(tick())                   # 4 - tick continues independently

Common pitfalls

Mutable default argument shares state across calls: `def add(x, bucket=[])` accumulates entries from every previous call that omitted `bucket`.

Use `bucket=None` and assign inside: `if bucket is None: bucket = []`. The body then creates a fresh list on every call that omits `bucket`, so nothing is shared between calls.

Late-binding closure: every lambda in `[lambda: i for i in range(3)]` returns `2` because they all reference the same `i`.

Bind via default argument: `[lambda i=i: i for i in range(3)]`. The default captures the current value at lambda creation time.

Function prints but returns nothing. The autograder calls the function, gets `None`, fails the test, and the student sees correct output on screen.

Replace `print(result)` with `return result`. The caller decides whether to print. If both are needed, return AND let the caller print.

`UnboundLocalError` when a function assigns to a name it also reads from an outer scope: `def f(): count += 1` crashes even if `count` is defined outside, because the assignment makes `count` local to `f`.

Add `global count` (for module-level) or `nonlocal count` (for enclosing function) before the name is first used, conventionally at the top of the function body.

Passing a list to a function and finding it mutated afterward: `clean(data)` empties the caller's list because lists are mutable and shared by reference.

Either document the mutation explicitly, or work on a copy: `def clean(data): data = data[:]; ...`. For dicts use `data = dict(data)`. Both make shallow copies, so nested objects are still shared with the caller.
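The two options might look like this in practice. Both function names are hypothetical.

```python
# Sketch: documented in-place mutation vs. working on a copy.
# Function names are hypothetical.

def drop_blanks_in_place(rows):
    """Mutates the caller's list: removes empty strings."""
    rows[:] = [r for r in rows if r]   # slice assignment changes the same object


def drop_blanks(rows):
    """Returns a cleaned copy; the caller's list is unchanged."""
    rows = rows[:]                     # shallow copy, local name now points at it
    while "" in rows:
        rows.remove("")
    return rows


raw = ["a", "", "b"]

cleaned = drop_blanks(raw)
print(raw)       # ['a', '', 'b']
print(cleaned)   # ['a', 'b']

drop_blanks_in_place(raw)
print(raw)       # ['a', 'b']
```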

Lambda with a multi-statement body produces `SyntaxError`. Students try to write `lambda x: y = x + 1; return y`, which is illegal.

Lambdas hold one expression. Switch to a `def` for anything that needs intermediate variables or multiple statements.

When to use functions

Define a function whenever a code block appears twice, accepts inputs, or returns a value the rest of the program uses. Use a lambda only for single-expression callbacks to `sorted`, `map`, `filter`, or pandas `.apply()`.
