Python Data Structures: Lists, Tuples, Sets, and Dictionaries

In the previous posts, we covered syntax, basics, and control flow. Now we’re ready to work with the core built-in data structures: lists, tuples, sets, and dictionaries. This post is aimed at junior and mid-level developers who want to move from “I can make it work” to “I can make it work efficiently and elegantly.” We’ll go deep with clear explanations, full runnable examples, and practical advice you can apply today.

Let’s build that mastery—around 2500 words of actionable knowledge ahead!

Why Data Structures Matter

Data structures determine how you store and access data. The wrong choice leads to:

  • Slow code (O(n) lookups instead of O(1))
  • Bugs from unexpected mutations
  • Memory waste
  • Hard-to-maintain spaghetti code

Python gives you powerful, flexible built-ins that cover most everyday needs. Master these four, and you’ll rarely need custom classes for basic storage.

Quick overview:

  • List: Ordered, mutable, allows duplicates
  • Tuple: Ordered, immutable, allows duplicates
  • Set: Unordered, mutable, no duplicates
  • Dictionary: Insertion-ordered (guaranteed since Python 3.7), mutable, maps unique keys to values

Let’s dive in.

Lists: Use Cases, Methods, and Performance

Lists are the workhorse of Python—dynamic arrays under the hood.

Python

# Creation
fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True]
nested = [[1, 2], [3, 4]]  # List of lists

Key methods:

Python

fruits.append("orange")          # Add to end
fruits.insert(0, "mango")        # Insert at index 0
fruits.extend(["grape", "kiwi"]) # Add multiple items
removed = fruits.pop()           # Remove and return last ("kiwi")
removed_at_2 = fruits.pop(2)     # Remove and return item at index 2 ("banana")
fruits.remove("cherry")          # Remove first occurrence; ValueError if absent
fruits.sort()                    # In-place sort
sorted_fruits = sorted(fruits)   # Return new sorted list

Slicing is powerful:

Python

# fruits is now ['apple', 'grape', 'mango', 'orange']
print(fruits[1:4])    # ['grape', 'mango', 'orange']
print(fruits[::-1])   # Reverse copy
print(fruits[:])      # Shallow copy
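
To make "shallow" concrete: a slice copies the outer list but not the objects inside it. The sketch below uses a hypothetical nested list named grid (not part of the examples above) and copy.deepcopy for comparison.

```python
import copy

grid = [[1, 2], [3, 4]]          # hypothetical nested list for illustration

shallow = grid[:]                # new outer list, same inner lists
deep = copy.deepcopy(grid)       # new outer list and new inner lists

shallow.append([5, 6])           # only the copy grows
shallow[0].append(99)            # inner list is shared, so grid sees this too

print(grid)     # [[1, 2, 99], [3, 4]]
print(shallow)  # [[1, 2, 99], [3, 4], [5, 6]]
print(deep)     # [[1, 2], [3, 4]]
```

If the items are immutable (strings, numbers, tuples of those), a shallow copy is usually all you need.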

Performance notes:

  • Append/pop from end: O(1) amortized
  • Insert/pop from beginning: O(n)
  • Lookup by index: O(1)
  • Search (in): O(n)
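
One way to see the difference between the first two rows is to time them. This is a rough sketch using the standard timeit module; the size N and the helper names are assumptions picked for illustration, and the absolute numbers will vary by machine.

```python
import timeit

N = 10_000  # hypothetical size, chosen only to keep the run short

def grow_at_end():
    items = []
    for i in range(N):
        items.append(i)        # O(1) amortized per call
    return items

def grow_at_front():
    items = []
    for i in range(N):
        items.insert(0, i)     # O(n) per call: every element shifts right
    return items

end_time = timeit.timeit(grow_at_end, number=5)
front_time = timeit.timeit(grow_at_front, number=5)

print(f"append:    {end_time:.4f}s")
print(f"insert(0): {front_time:.4f}s")
```

On most machines the insert(0) version should be noticeably slower, and the gap widens as N grows.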

Full example: Shopping cart

Python

cart = []

while True:
    action = input("Add/Remove/View/Quit: ").lower()
    if action == "add":
        item = input("Item: ")
        qty = int(input("Quantity: "))
        cart.append({"item": item, "qty": qty})
    elif action == "remove":
        idx = int(input("Item number to remove: ")) - 1
        if 0 <= idx < len(cart):
            cart.pop(idx)
    elif action == "view":
        print("\nShopping Cart:")
        for i, item in enumerate(cart, 1):
            print(f"{i}. {item['item']} (x{item['qty']})")
    elif action == "quit":
        break

print("Final cart:", cart)

Mid-level devs: If you often add or remove items at the front of a large list, consider collections.deque, which handles both ends in O(1).
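
A minimal sketch of the deque suggestion, with hypothetical task names standing in for real data:

```python
from collections import deque

queue = deque(["first", "second"])   # hypothetical task names
queue.append("third")                # add on the right, O(1)
queue.appendleft("urgent")           # add on the left, O(1)

first_out = queue.popleft()          # remove from the left, O(1)
second_out = queue.popleft()

print(first_out)     # urgent
print(second_out)    # first
print(list(queue))   # ['second', 'third']
```

The trade-off is that indexing into the middle of a deque is O(n), so it suits queue-like access rather than random access.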

Tuples: When Immutability Is a Feature

Tuples are immutable sequences. Once created, you can’t add, remove, or reassign their items.

Python

point = (10, 20)
colors = ("red", "green", "blue")
single = (42,)  # Note the comma!
empty = ()

Why use tuples?

  • Slightly cheaper than lists (smaller memory footprint, faster to create)
  • Hashable when all their items are hashable → can be dict keys or set elements
  • Data integrity (protect against accidental mutation)
  • Convention for heterogeneous data (e.g., records)
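
The immutability and hashability points can be checked directly. In this sketch the distances dict and the record tuple are hypothetical, added only to illustrate the behavior.

```python
point = (10, 20)

# Item assignment is rejected
try:
    point[0] = 99
except TypeError as exc:
    print("TypeError:", exc)

# A tuple of hashable items can be a dict key
distances = {point: 22.4}            # hypothetical value for illustration
print(distances[(10, 20)])           # 22.4

# Immutability is shallow: a list inside a tuple can still change,
# and such a tuple is not hashable
record = ("Alice", ["admin"])
record[1].append("editor")
print(record)                        # ('Alice', ['admin', 'editor'])

try:
    hash(record)
except TypeError as exc:
    print("TypeError:", exc)
```

So a tuple protects its own slots, not the objects those slots point to.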

Python

# Unpacking – Pythonic magic
x, y = point
name, age, city = ("Alice", 30, "NYC")

# Swapping
a, b = 1, 2
a, b = b, a  # No temp variable needed

Full example: Processing rows as tuples

Python

records = [
    ("Alice", 30, "Engineer"),
    ("Bob", 25, "Designer"),
    ("Charlie", 35, "Manager")
]

for name, age, role in records:  # Tuple unpacking
    print(f"{name} ({age}) works as {role}")

# Tuples as dict keys
locations = {}
locations[("NYC", "Central Park")] = "Picnic spot"
locations[(40.7128, -74.0060)] = "New York"  # Coordinates
print(locations)

Juniors: Remember the comma for single-element tuples. Mid-level devs: Use collections.namedtuple or typing.NamedTuple for readable records.

Python

from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    age: int
    role: str

emp = Employee("Alice", 30, "Engineer")
print(emp.name)  # Dot access!

Sets: Uniqueness and Set Operations

Sets are unordered collections of unique, hashable items.

Python

unique_numbers = {1, 2, 3, 3, 4}  # {1, 2, 3, 4}
empty_set = set()  # Not {} — that's dict!
frozen = frozenset([1, 2, 3])  # Immutable set
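
Because a frozenset is immutable, it is hashable and can serve as a dict key or as an element of another set. A small sketch, with a hypothetical bundle_prices mapping:

```python
# Hypothetical example: a price for a pair of items, regardless of order
bundle_prices = {
    frozenset({"apple", "banana"}): 1.50,
    frozenset({"apple", "cherry"}): 2.25,
}

# Element order does not matter when building the key
print(bundle_prices[frozenset(["banana", "apple"])])   # 1.5

# A regular set is mutable, so it cannot be used as a key
try:
    bundle_prices[{"apple", "banana"}] = 9.99
except TypeError as exc:
    print("TypeError:", exc)
```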

Operations (mathematical set theory):

Python

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

print(a | b)        # Union: {1,2,3,4,5,6}
print(a & b)        # Intersection: {3,4}
print(a - b)        # Difference: {1,2}
print(a ^ b)        # Symmetric difference: {1,2,5,6}
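
Each operator also has a method form. One practical difference, sketched below: the methods accept any iterable, while the operators need a set on both sides. The subset and disjointness checks are included because they come up often alongside these operations.

```python
a = {1, 2, 3, 4}

# Methods accept any iterable
print(a.union([4, 5]))            # {1, 2, 3, 4, 5}
print(a.intersection(range(3)))   # {1, 2}

# Operators require sets on both sides
try:
    a | [4, 5]
except TypeError as exc:
    print("TypeError:", exc)

# Subset, superset, and disjointness checks
print({1, 2} <= a)                # True
print(a.issuperset([1, 2]))       # True
print(a.isdisjoint({9, 10}))      # True
```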

Methods:

Python

a.add(5)
a.remove(1)         # Raises KeyError if missing
a.discard(999)      # Safe remove
a.pop()             # Remove and return arbitrary

Performance: O(1) average lookup, insert, delete.

Real-world use: Removing duplicates

Python

# Fast deduplication
log_lines = ["error", "info", "error", "warning", "error"]
unique_logs = list(set(log_lines))
print(unique_logs)  # Order not preserved!

# Preserve order (Python 3.7+)
unique_ordered = list(dict.fromkeys(log_lines))
print(unique_ordered)  # ['error', 'info', 'warning']

Full example: Finding common friends

Python

friends_alice = {"Bob", "Charlie", "Diana"}
friends_bob = {"Alice", "Charlie", "Eve"}

common = friends_alice & friends_bob
print("Common friends:", common)

mutual_intros = (friends_alice | friends_bob) - common  # Parentheses needed: - binds tighter than |
print("Potential intros:", mutual_intros)

Mid-level devs: Use sets for membership testing on large collections; convert once, then reuse the set for every lookup.
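
A minimal sketch of that pattern, assuming hypothetical lists of blocked and incoming user IDs:

```python
# Hypothetical data: IDs that are blocked and a batch of incoming IDs
blocked_ids = [17, 42, 99, 1001]
incoming = [5, 42, 7, 1001, 8]

blocked = set(blocked_ids)          # one O(n) conversion up front

# Each `in` check is now O(1) on average instead of scanning the list
allowed = [uid for uid in incoming if uid not in blocked]
print(allowed)   # [5, 7, 8]
```

The conversion itself costs O(n), so it pays off when you do many lookups against the same collection, not for a single check.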

Dictionaries: The Most Powerful Python Data Structure

Dicts map keys → values. Keys must be hashable; values can be anything.

Python

user = {
    "name": "Alice",
    "age": 30,
    "roles": ["admin", "editor"]
}

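Modern features:

  • Insertion order preserved (guaranteed since Python 3.7)
  • Merge operator | (Python 3.9+)

Python

defaults = {"theme": "dark", "lang": "en"}
user_prefs = {"lang": "fr", "notifications": True}
combined = defaults | user_prefs  # Python 3.9+; right-hand values win on conflicts
print(combined)  # {'theme': 'dark', 'lang': 'fr', 'notifications': True}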

Methods:

Python

user.get("age", 0)           # Safe access
user.setdefault("city", "Unknown")  # Insert if missing
user.update({"age": 31})    # Merge
keys = list(user.keys())
values = list(user.values())
items = list(user.items())  # List of tuples
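
The difference between get and setdefault is easy to miss: both return a fallback, but only setdefault stores it. A short sketch on a trimmed-down user dict (redefined here so the block runs on its own):

```python
user = {"name": "Alice", "age": 30}

# get() returns a fallback but leaves the dict unchanged
print(user.get("city", "Unknown"))         # Unknown
print("city" in user)                      # False

# setdefault() returns the value and stores it if the key was missing
print(user.setdefault("city", "Unknown"))  # Unknown
print("city" in user)                      # True

# Plain indexing raises KeyError for a missing key
try:
    user["email"]
except KeyError as exc:
    print("KeyError:", exc)

# items() yields (key, value) pairs in insertion order
for key, value in user.items():
    print(f"{key} = {value}")
```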

Full example: Word frequency counter

Python

text = "the quick brown fox jumps over the lazy dog the fox"
words = text.split()

freq = {}
for word in words:
    freq[word] = freq.get(word, 0) + 1

# More Pythonic
from collections import Counter
freq = Counter(words)
print(freq)
print(freq.most_common(3))

Mid-level devs: defaultdict creates missing entries on first access, so the .get() fallback is no longer needed.

Python

from collections import defaultdict

freq = defaultdict(int)
for word in words:
    freq[word] += 1

Another powerful pattern: Grouping

Python

from collections import defaultdict

employees = [
    {"name": "Alice", "dept": "Engineering"},
    {"name": "Bob", "dept": "Sales"},
    {"name": "Charlie", "dept": "Engineering"}
]

by_dept = defaultdict(list)
for emp in employees:
    by_dept[emp["dept"]].append(emp["name"])

print(dict(by_dept))  # {'Engineering': ['Alice', 'Charlie'], 'Sales': ['Bob']}

Choosing the Right Data Structure

Quick decision guide:

Need                               | Choose  | Why
Ordered sequence, frequent append  | List    | Fast append/pop from end
Fixed data, use as dict key        | Tuple   | Immutable, hashable
Fast membership testing            | Set     | O(1) lookup
Unique items, math operations      | Set     | Built-in set operations
Key-value mapping                  | Dict    | O(1) lookup, flexible
Count occurrences                  | Counter | Specialized dict
Queue (FIFO)                       | deque   | Fast append/pop left

Example: Processing API response

Python

from collections import defaultdict

response = [
    {"id": 1, "tag": "python"},
    {"id": 2, "tag": "java"},
    {"id": 1, "tag": "javascript"}
]

# Group tags by id
tags_by_id = defaultdict(list)
for item in response:
    tags_by_id[item["id"]].append(item["tag"])

# Get unique tags overall
all_tags = {tag for tags in tags_by_id.values() for tag in tags}
print("Unique tags:", all_tags)

Common Mistakes and Anti-Patterns

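  1. Using a list for membership testing

Python

large_list = list(range(100_000))
large_set = set(large_list)

# Bad: O(n), scans the list
if 99_999 in large_list:
    print("found in list")

# Good: O(1) on average
if 99_999 in large_set:
    print("found in set")

  2. Mutating during iteration

Python

my_dict = {"a": 1, "b": 0}

# Bad: raises RuntimeError (dictionary changed size during iteration)
# for key in my_dict:
#     if my_dict[key] == 0:
#         del my_dict[key]

# Fix: iterate over a snapshot of the keys
for key in list(my_dict):
    if my_dict[key] == 0:
        del my_dict[key]

  3. Forgetting that dicts guarantee insertion order only from Python 3.7+

Python

# Needed only when supporting interpreters older than 3.7
from collections import OrderedDict

  4. Using mutable default arguments

Python

# Dangerous: the default list is created once and shared across calls
def add_item(item, lst=[]):
    lst.append(item)
    return lst

# Fix: use None as the sentinel
def add_item(item, lst=None):
    if lst is None:
        lst = []
    lst.append(item)
    return lst

  5. Overusing nested lists/dicts

Deep nesting makes code hard to read. Consider classes or pandas DataFrames for complex data.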

Full pitfall demo:

Python

# Bad: Mutable default
def buggy_append(value, container=[]):
    container.append(value)
    return container

print(buggy_append(1))  # [1]
print(buggy_append(2))  # [1, 2] — surprise!

# Good
def safe_append(value, container=None):
    if container is None:
        container = []
    container.append(value)
    return container
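
The mutation-during-iteration pitfall can be demonstrated the same way. This sketch uses a hypothetical inventory dict and a make_stock helper (both invented for the demo) and shows two common fixes.

```python
def make_stock():
    # Hypothetical inventory used only for this demo
    return {"apple": 3, "banana": 0, "cherry": 0}

# Bad: deleting while iterating over the live dict
stock = make_stock()
try:
    for name in stock:
        if stock[name] == 0:
            del stock[name]         # changes the dict's size mid-iteration
except RuntimeError as exc:
    print("RuntimeError:", exc)

# Fix 1: iterate over a snapshot of the keys
stock = make_stock()
for name in list(stock):
    if stock[name] == 0:
        del stock[name]
print(stock)                        # {'apple': 3}

# Fix 2: build a new dict and leave the original untouched
stock = make_stock()
in_stock = {name: qty for name, qty in stock.items() if qty > 0}
print(in_stock)                     # {'apple': 3}
```

The comprehension is often the cleaner choice when you don't need to modify the original dict in place.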

In conclusion, mastering lists, tuples, sets, and dictionaries gives you the tools to model almost any data relationship efficiently. Practice by building projects: a contact book (dicts), a unique visitor tracker (sets), configuration storage (nested dicts + tuples).
