In the previous posts, we covered syntax, basics, and control flow. Now we’re ready to work with the core built-in data structures: lists, tuples, sets, and dictionaries. This post is aimed at junior and middle developers who want to move from “I can make it work” to “I can make it work efficiently and elegantly.” We’ll go deep with clear explanations, full runnable examples, and practical advice you can apply today.
Let’s build that mastery—around 2500 words of actionable knowledge ahead!
Why Data Structures Matter
Data structures determine how you store and access data. The wrong choice leads to:
- Slow code (O(n) lookups instead of O(1))
- Bugs from unexpected mutations
- Memory waste
- Hard-to-maintain spaghetti code
Python gives you powerful, flexible built-ins that cover the large majority of everyday needs. Master these four, and you’ll rarely need custom classes for basic storage.
Quick overview:
- List: Ordered, mutable, allows duplicates
- Tuple: Ordered, immutable, allows duplicates
- Set: Unordered, mutable, no duplicates
- Dictionary: Insertion-ordered (guaranteed since Python 3.7), mutable key→value mapping, keys unique
Let’s dive in.
Lists: Use Cases, Methods, and Performance
Lists are the workhorse of Python—dynamic arrays under the hood.
Python
# Creation
fruits = ["apple", "banana", "cherry"]
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True]
nested = [[1, 2], [3, 4]] # List of lists
Key methods:
Python
fruits.append("orange") # Add to end
fruits.insert(0, "mango") # Insert at index
fruits.extend(["grape", "kiwi"]) # Add multiple
removed = fruits.pop() # Remove and return last
removed_idx = fruits.pop(2) # Remove by index (here "banana")
fruits.remove("cherry") # Remove first occurrence; raises ValueError if absent
fruits.sort() # In-place sort
sorted_fruits = sorted(fruits) # Return new sorted list
Slicing is powerful:
Python
print(fruits[1:4]) # ['grape', 'mango', 'orange']
print(fruits[::-1]) # Reverse copy
print(fruits[:]) # Shallow copy
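A word on that "shallow copy": slicing copies the outer list only, so any nested lists are still shared with the original. Here is a minimal sketch of the difference (the `grid` data is made up for illustration):

```python
import copy

grid = [[1, 2], [3, 4]]

shallow = grid[:]             # new outer list, same inner lists
deep = copy.deepcopy(grid)    # fully independent copy

grid[0].append(99)            # mutate an inner list in place
grid.append([5, 6])           # mutate the outer list

print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The outer append does not show up in `shallow`, but the inner append does, because both lists point at the same inner objects.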
Performance notes:
- Append/pop from end: O(1) amortized
- Insert/pop from beginning: O(n)
- Lookup by index: O(1)
- Search (in): O(n)
Full example: Shopping cart
Python
cart = []
while True:
    action = input("Add/Remove/View/Quit: ").lower()
    if action == "add":
        item = input("Item: ")
        qty = int(input("Quantity: "))
        cart.append({"item": item, "qty": qty})
    elif action == "remove":
        idx = int(input("Item number to remove: ")) - 1
        if 0 <= idx < len(cart):
            cart.pop(idx)
    elif action == "view":
        print("\nShopping Cart:")
        for i, item in enumerate(cart, 1):
            print(f"{i}. {item['item']} (x{item['qty']})")
    elif action == "quit":
        break
print("Final cart:", cart)
Middle devs: For large lists, consider collections.deque for fast left-side operations.
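To make the deque suggestion concrete, here is a small sketch of the left-side operations that cost O(n) on a list but O(1) on a `collections.deque`. The task names are invented for the example:

```python
from collections import deque

tasks = deque(["parse", "validate"])

tasks.append("save")          # add to the right end, O(1)
tasks.appendleft("download")  # add to the left end, O(1)

first = tasks.popleft()       # remove from the left end, O(1)
last = tasks.pop()            # remove from the right end, O(1)

print(first, last)            # download save
print(list(tasks))            # ['parse', 'validate']

# Optional: a bounded deque silently drops the oldest items
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(list(recent))           # [2, 3, 4]
```

The trade-off is that indexing into the middle of a deque is slower than on a list, so it fits queue-like access best.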
Tuples: When Immutability Is a Feature
Tuples are immutable sequences: once created, you can’t add, remove, or reassign their elements.
Python
point = (10, 20)
colors = ("red", "green", "blue")
single = (42,) # Note the comma!
empty = ()
Why use tuples?
- Slightly smaller in memory and quicker to create than lists
- Hashable when every element is hashable → can be dict keys or set elements
- Data integrity (protect against accidental mutation)
- Convention for heterogeneous data (e.g., records)
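The hashability point is easiest to see side by side. The sketch below (illustrative values only) shows a tuple working as a set element and dict key, a list failing in the same role, and the caveat that a tuple holding a mutable item is neither hashable nor fully frozen:

```python
point = (10, 20)
visited = {point}                 # tuples of hashable items can go in a set
labels = {point: "start"}         # ...or serve as dict keys

try:
    bad = {[10, 20]: "start"}     # lists are mutable, so they are unhashable
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# A tuple is only hashable if everything inside it is hashable too
mixed = (1, [2, 3])
try:
    hash(mixed)
    mixed_hashable = True
except TypeError:
    mixed_hashable = False
print(mixed_hashable)             # False

# Immutability is shallow: the list inside the tuple can still change
mixed[1].append(4)
print(mixed)                      # (1, [2, 3, 4])
```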
Python
# Unpacking – Pythonic magic
x, y = point
name, age, city = ("Alice", 30, "NYC")
# Swapping
a, b = 1, 2
a, b = b, a # No temp variable needed
Full example: CSV row processing
Python
records = [
    ("Alice", 30, "Engineer"),
    ("Bob", 25, "Designer"),
    ("Charlie", 35, "Manager"),
]
for name, age, role in records: # Tuple unpacking
    print(f"{name} ({age}) works as {role}")
# Tuples as dict keys
locations = {}
locations[("NYC", "Central Park")] = "Picnic spot"
locations[(40.7128, -74.0060)] = "New York" # Coordinates
print(locations)
Juniors: Remember the comma for single-element tuples. Middle devs: Use collections.namedtuple or typing.NamedTuple for readable records.
Python
from typing import NamedTuple
class Employee(NamedTuple):
    name: str
    age: int
    role: str

emp = Employee("Alice", 30, "Engineer")
print(emp.name) # Dot access!
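For comparison, here is roughly what the same record looks like with `collections.namedtuple`, the other option mentioned above. This is a sketch only; the field names simply mirror the `Employee` example:

```python
from collections import namedtuple

# Same fields as the typing.NamedTuple version, without type hints
Employee = namedtuple("Employee", ["name", "age", "role"])

emp = Employee("Alice", 30, "Engineer")
name, age, role = emp                 # still unpacks like a plain tuple

older = emp._replace(age=31)          # returns a new tuple; emp is unchanged
as_dict = emp._asdict()               # field names mapped to values

print(emp.age, older.age)             # 30 31
print(as_dict["role"])                # Engineer
```

The `typing.NamedTuple` form is usually nicer when you want type hints or methods; the function form is handy for quick, one-line definitions.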
Sets: Uniqueness and Set Operations
Sets are unordered collections of unique, hashable items.
Python
unique_numbers = {1, 2, 3, 3, 4} # {1, 2, 3, 4}
empty_set = set() # Not {} — that's dict!
frozen = frozenset([1, 2, 3]) # Immutable set
Operations (mathematical set theory):
Python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a | b) # Union: {1,2,3,4,5,6}
print(a & b) # Intersection: {3,4}
print(a - b) # Difference: {1,2}
print(a ^ b) # Symmetric difference: {1,2,5,6}
Methods:
Python
a.add(5)
a.remove(1) # Raises KeyError if missing
a.discard(999) # Safe remove
a.pop() # Remove and return arbitrary
Performance: O(1) average lookup, insert, delete.
Real-world use: Removing duplicates
Python
# Fast deduplication
log_lines = ["error", "info", "error", "warning", "error"]
unique_logs = list(set(log_lines))
print(unique_logs) # Order not preserved!
# Preserve order (Python 3.7+)
unique_ordered = list(dict.fromkeys(log_lines))
print(unique_ordered) # ['error', 'info', 'warning']
Full example: Finding common friends
Python
friends_alice = {"Bob", "Charlie", "Diana"}
friends_bob = {"Alice", "Charlie", "Eve"}
common = friends_alice & friends_bob
print("Common friends:", common)
# Parentheses matter: "-" binds tighter than "|"
mutual_intros = (friends_alice | friends_bob) - common # Same as friends_alice ^ friends_bob
print("Potential intros:", mutual_intros)
Middle devs: Use sets for membership testing on large collections.
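A typical shape for this is a list you receive once and check against many times. The sketch below uses a made-up blocklist; the idea is to pay the one-time conversion cost and then get average O(1) checks:

```python
# Hypothetical data: a blocklist checked many times
blocked_ids = list(range(0, 100_000, 2))     # even numbers, as a list
blocked_lookup = set(blocked_ids)            # one-time O(n) conversion

incoming = [3, 10, 99_998, 100_001]

# Each `in` check against the set is O(1) on average;
# the same check against the list would scan up to 50,000 items
rejected = [uid for uid in incoming if uid in blocked_lookup]
print(rejected)   # [10, 99998]
```

If you only test membership once, the conversion costs more than it saves, so keep the list in that case.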
Dictionaries: The Most Powerful Python Data Structure
Dicts map keys → values. Keys must be hashable; values can be anything.
Python
user = {
"name": "Alice",
"age": 30,
"roles": ["admin", "editor"]
}
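Modern features:
- Insertion order preserved (guaranteed since Python 3.7)
- Merge operator `|` (Python 3.9+)
Python
defaults = {"theme": "dark", "lang": "en"}
user_prefs = {"lang": "fr", "notifications": True}
combined = defaults | user_prefs # Python 3.9+; on duplicate keys the right-hand value wins
print(combined) # {'theme': 'dark', 'lang': 'fr', 'notifications': True}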
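The `|` operator on dicts needs Python 3.9 or newer. If you are stuck on an older interpreter, the following sketch shows two equivalent ways to get the same merge without modifying the original dicts:

```python
defaults = {"theme": "dark", "lang": "en"}
user_prefs = {"lang": "fr", "notifications": True}

# Works on Python 3.5+: unpack both dicts into a new one
combined = {**defaults, **user_prefs}
print(combined)   # {'theme': 'dark', 'lang': 'fr', 'notifications': True}

# Alternative: copy first so `defaults` is not modified, then update
merged = defaults.copy()
merged.update(user_prefs)

print(merged == combined)   # True
print(defaults)             # {'theme': 'dark', 'lang': 'en'}
```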
Methods:
Python
user.get("age", 0) # Safe access
user.setdefault("city", "Unknown") # Insert if missing
user.update({"age": 31}) # Merge
keys = list(user.keys())
values = list(user.values())
items = list(user.items()) # List of tuples
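The difference between plain indexing, `get`, and `setdefault` trips people up, so here is a short sketch using the same `user` shape as above. The behavior shown is standard dict behavior; only the data is illustrative:

```python
user = {"name": "Alice", "age": 30, "roles": ["admin", "editor"]}

# Indexing a missing key raises KeyError...
try:
    user["city"]
except KeyError:
    print("no city yet")

# ...while get() returns a default and leaves the dict untouched
city = user.get("city", "Unknown")
print("city" in user)            # False

# setdefault() returns the value AND stores it if the key was missing
stored = user.setdefault("city", "Unknown")
print("city" in user)            # True

# Iterate over key/value pairs directly instead of indexing by key
for key, value in user.items():
    print(f"{key}: {value}")
```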
Full example: Word frequency counter
Python
text = "the quick brown fox jumps over the lazy dog the fox"
words = text.split()
freq = {}
for word in words:
    freq[word] = freq.get(word, 0) + 1
# More Pythonic
from collections import Counter
freq = Counter(words)
print(freq)
print(freq.most_common(3))
Middle devs: Use defaultdict to simplify.
Python
from collections import defaultdict
freq = defaultdict(int)
for word in words:
    freq[word] += 1
Another powerful pattern: Grouping
Python
employees = [
    {"name": "Alice", "dept": "Engineering"},
    {"name": "Bob", "dept": "Sales"},
    {"name": "Charlie", "dept": "Engineering"},
]
by_dept = defaultdict(list)
for emp in employees:
    by_dept[emp["dept"]].append(emp["name"])
print(by_dept)
Choosing the Right Data Structure
Quick decision guide:
| Need | Choose | Why |
|---|---|---|
| Ordered sequence, frequent append | List | Fast append/pop from end |
| Fixed data, use as dict key | Tuple | Immutable, hashable |
| Fast membership testing | Set | O(1) lookup |
| Unique items, math operations | Set | Built-in set operations |
| Key-value mapping | Dict | O(1) lookup, flexible |
| Count occurrences | Counter | Specialized dict |
| Queue (FIFO) | deque | Fast append/pop left |
Example: Processing API response
Python
from collections import defaultdict

response = [
    {"id": 1, "tag": "python"},
    {"id": 2, "tag": "java"},
    {"id": 1, "tag": "javascript"},
]
# Group tags by id
tags_by_id = defaultdict(list)
for item in response:
    tags_by_id[item["id"]].append(item["tag"])
# Get unique tags overall
all_tags = {tag for tags in tags_by_id.values() for tag in tags}
print("Unique tags:", all_tags)
Common Mistakes and Anti-Patterns
- Using list for membership testing
Python
# Bad: O(n) scan on every check
if item in large_list:
    ...
# Good: O(1) average lookup
if item in large_set:
    ...
- Mutating during iteration
Python
# Raises RuntimeError: dictionary changed size during iteration
for key in my_dict:
    if condition:
        del my_dict[key]
# Fix: iterate over a snapshot of the keys
for key in list(my_dict.keys()):
    ...
- Forgetting that dicts only guarantee insertion order from Python 3.7 onward
Python
# Only needed if your code must run on older interpreters
from collections import OrderedDict # Rarely needed now
- Using mutable default arguments
Python
# Dangerous!
def add_item(item, lst=[]):
    lst.append(item)
    return lst
# Fix: Use None
def add_item(item, lst=None):
    if lst is None:
        lst = []
    ...
- Overusing nested lists/dicts
Deep nesting → hard to read. Consider classes or pandas DataFrames for complex data.
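One possible reading of "consider classes" is a pair of small dataclasses. The sketch below is only an illustration; the `Address` and `User` shapes are hypothetical, not something from the earlier examples:

```python
from dataclasses import dataclass, field

# Hypothetical shapes for illustration
@dataclass
class Address:
    city: str
    country: str

@dataclass
class User:
    name: str
    address: Address
    roles: list = field(default_factory=list)   # safe mutable default

# Nested dict version: every access is a string lookup
raw = {"name": "Alice", "address": {"city": "NYC", "country": "US"}, "roles": ["admin"]}
print(raw["address"]["city"])      # NYC

# Dataclass version: attribute access, with the structure spelled out
user = User(
    name=raw["name"],
    address=Address(**raw["address"]),
    roles=list(raw["roles"]),
)
print(user.address.city)           # NYC
print(User("Bob", Address("Paris", "FR")).roles)   # []
```

A misspelled attribute fails loudly with `AttributeError`, and editors can autocomplete the fields, which is most of the readability win over deep dict nesting.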
Full pitfall demo:
Python
# Bad: Mutable default
def buggy_append(value, container=[]):
    container.append(value)
    return container

print(buggy_append(1)) # [1]
print(buggy_append(2)) # [1, 2] (surprise: the default list is shared between calls)

# Good
def safe_append(value, container=None):
    if container is None:
        container = []
    container.append(value)
    return container
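The mutating-during-iteration mistake from the list above deserves the same treatment. Here is a runnable sketch (the `fresh_stock` helper and its data are invented for the demo) showing the failure and two common fixes:

```python
def fresh_stock():
    """Hypothetical inventory; zero means out of stock."""
    return {"apple": 3, "pear": 0, "kiwi": 5, "plum": 0}

# Bad: deleting while iterating over the dict itself
stock = fresh_stock()
try:
    for name in stock:
        if stock[name] == 0:
            del stock[name]
    error = None
except RuntimeError as exc:
    error = str(exc)
print(error)      # dictionary changed size during iteration

# Fix 1: iterate over a snapshot of the keys
stock = fresh_stock()
for name in list(stock):
    if stock[name] == 0:
        del stock[name]
print(stock)      # {'apple': 3, 'kiwi': 5}

# Fix 2: build a new dict instead of mutating the old one
in_stock = {name: qty for name, qty in fresh_stock().items() if qty > 0}
print(in_stock)   # {'apple': 3, 'kiwi': 5}
```

The comprehension is often the cleaner choice, since it never leaves the dict in a half-modified state.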
In conclusion, mastering lists, tuples, sets, and dictionaries gives you the tools to model almost any data relationship efficiently. Practice by building projects: a contact book (dicts), a unique visitor tracker (sets), configuration storage (nested dicts + tuples).
