Strings Are Immutable: What That Means in Practice
In Python, strings are immutable — once created, you cannot change them in place. Any “modification” creates a new string.
Why care?
- Safety: No accidental side effects when passing strings around functions.
- Thread-safety: Easier in concurrent code.
- Hashability: Strings can be dict keys or set members because their value never changes.
- Performance trade-off: Repeated modifications (e.g., in a loop) are inefficient if you don’t understand this.
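To make the hashability point concrete, here is a minimal sketch using only built-ins. The variable names are illustrative and not part of any API.

```python
# Strings are hashable, so they work as dict keys and set members.
word_counts = {}
for word in "to be or not to be".split():
    word_counts[word] = word_counts.get(word, 0) + 1

# Equal strings hash the same within one interpreter run,
# even when they were built in different ways.
key_a = "be"
key_b = "".join(["b", "e"])
same_hash = hash(key_a) == hash(key_b)

# A mutable list is not hashable: hashing it raises TypeError.
try:
    hash(["b", "e"])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(word_counts)       # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
print(same_hash)         # True
print(list_is_hashable)  # False
```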
Bad example (slow, creates many temporary strings):
Python
s = ""
for i in range(10000):
    s += str(i) + " "  # Creates new string each time → O(n²) time!
Better: Use a list + join (only one final allocation):
Python
parts = []
for i in range(10000):
    parts.append(str(i))
s = " ".join(parts)  # Fast!
Or, more concisely, with a generator expression:
Python
s = " ".join(str(i) for i in range(10000))
Proof of immutability:
Python
text = "hello"
print(id(text)) # Some memory address, e.g. 140712345678912
text = text.upper() # Creates NEW string
print(id(text)) # Different address!
# Trying to modify in place fails
# text[0] = "H" # TypeError: 'str' object does not support item assignment
Methods like .replace(), .strip(), .upper() all return new strings — the original remains unchanged.
Python
original = " python is fun "
cleaned = original.strip().upper()
print(original) # Still " python is fun "
print(cleaned) # "PYTHON IS FUN"
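Key takeaway: Embrace immutability. Use join (fed by a list or generator expression) or io.StringIO for heavy concatenation. CPython can sometimes resize a string in place for s += x when s is the only reference to it, which is why naive += often looks fast in small tests. That is an implementation detail, not a size threshold, so don't rely on it in loops or for large data.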
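The io.StringIO option mentioned above is not shown elsewhere in this post, so here is a small sketch of it. It behaves like an in-memory text file; the loop and names are only for illustration.

```python
import io

# Write pieces into an in-memory buffer, then read the result once.
buffer = io.StringIO()
for i in range(5):
    buffer.write(str(i))
    buffer.write(" ")
result = buffer.getvalue().rstrip()  # drop the trailing space
print(result)  # 0 1 2 3 4

# The join version gives the same text and is usually the simpler choice.
joined = " ".join(str(i) for i in range(5))
print(result == joined)  # True
```

StringIO tends to be handy when the pieces come from code that already expects a file-like object (for example, something that calls .write()).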
Must-Know Operations: split, join, strip, replace
These four methods cover the bulk of everyday text wrangling.
- split(sep=None, maxsplit=-1): Breaks string into list. Default sep is whitespace.
Python
sentence = "Python is awesome and fun"
words = sentence.split() # ['Python', 'is', 'awesome', 'and', 'fun']
print(words)
csv_line = "name,age,city"
fields = csv_line.split(",") # ['name', 'age', 'city']
log = "2026-01-14 20:11:00 INFO Processing started"
date, time, level, message = log.split(" ", 3) # maxsplit=3 keeps the message in one piece
print(date, time, level) # 2026-01-14 20:11:00 INFO
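One detail worth seeing in code: split() with no argument and split(" ") behave differently on repeated spaces. The sketch below also shows rsplit and partition, two close relatives; the sample strings are made up for illustration.

```python
messy = "a  b   c"  # two spaces, then three spaces

# No argument: runs of whitespace count as one separator.
by_whitespace = messy.split()
# Explicit " ": every single space is a separator, so empty strings appear.
by_space = messy.split(" ")

# rsplit splits from the right, useful for file extensions.
stem, ext = "archive.tar.gz".rsplit(".", 1)

# partition splits once and always returns exactly three parts.
key, sep, value = "key=value=more".partition("=")

print(by_whitespace)  # ['a', 'b', 'c']
print(by_space)       # ['a', '', 'b', '', '', 'c']
print(stem, ext)      # archive.tar gz
print(key, value)     # key value=more
```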
- join(iterable): Opposite of split — glues strings with separator.
Python
path_parts = ["home", "duong", "projects", "blog"]
path = "/".join(path_parts) # home/duong/projects/blog
print(path)
tags = ["python", "strings", "tips"]
hashtags = " ".join(f"#{tag}" for tag in tags)
print(hashtags) # #python #strings #tips
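A common stumbling block with join is that every item must already be a string. A minimal sketch, with made-up data:

```python
numbers = [1, 2, 3]

# join refuses non-string items and raises TypeError.
try:
    ", ".join(numbers)
    join_failed = False
except TypeError:
    join_failed = True

# Convert each item first, for example with a generator expression.
csv_row = ", ".join(str(n) for n in numbers)

print(join_failed)  # True
print(csv_row)      # 1, 2, 3
```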
- strip([chars]), lstrip(), rstrip(): Removes leading/trailing whitespace (default) or specified chars.
Python
dirty = " hello world! \n"
clean = dirty.strip() # "hello world!"
print(repr(clean)) # 'hello world!'
user_input = "***welcome***"
print(user_input.strip("*")) # "welcome"
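Note that the chars argument is treated as a set of characters, not as a prefix or suffix. The sketch below shows the surprise and one way around it, assuming Python 3.9+ for removeprefix and removesuffix; the sample strings are illustrative.

```python
# rstrip(".txt") removes any trailing '.', 't', or 'x' characters,
# so it also eats the final 't' of "report".
stripped = "report.txt".rstrip(".txt")
# removesuffix (Python 3.9+) removes the exact suffix once.
exact = "report.txt".removesuffix(".txt")

# The same applies on the left side.
left_stripped = "www.wow.com".lstrip("w.")
left_exact = "www.wow.com".removeprefix("www.")

print(stripped)       # repor
print(exact)          # report
print(left_stripped)  # ow.com
print(left_exact)     # wow.com
```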
- replace(old, new, count=-1): Simple find-and-replace.
Python
text = "I love Python. Python is great!"
updated = text.replace("Python", "Rust", 1) # Replace only first occurrence
print(updated) # I love Rust. Python is great!
# Multi-line example
config = """
DEBUG=True
HOST=localhost
"""
fixed = config.replace("True", "False")
print(fixed)
Bonus combo: Clean CSV-like input
Python
raw = " john doe , 42 , Hanoi "
cleaned = [field.strip() for field in raw.split(",")]
print(cleaned) # ['john doe', '42', 'Hanoi']
Practice these — they’re building blocks for parsing, cleaning, and generating text.
f-strings: Clean Formatting the Modern Way
Introduced in Python 3.6, f-strings are now the preferred way to format strings: readable, fast, and powerful. Python 3.8 added the = debug specifier, and in Python 3.12+ (our 2026 reality) they got even better thanks to PEP 701: you can reuse the same quote type inside replacement fields, use backslashes and comments there, nest f-strings arbitrarily, and get more precise error messages.
Basic syntax: f"hello {variable}"
Python
name = "Duong"
age = 30
city = "Hanoi"
greeting = f"Hello {name}! You are {age} years old and live in {city}."
print(greeting)
Expressions inside {}:
Python
score = 85.567
print(f"Your score: {score:.2f}%") # Your score: 85.57%
now = 2026
print(f"Current year: {now}")
print(f"Next year: {now + 1}")
Advanced formatting specifiers (like old .format()):
- Alignment: :< left, :> right, :^ center, each followed by a width (e.g. :>20)
- Numbers: :, adds a thousands separator, :.Nf fixes N decimal places (e.g. :,.2f)
Python
price = 1234567.89
quantity = 5
print(f"Total: ${price * quantity:,.2f}") # Total: $6,172,839.45
header = "Item"
value = "Python Mastery"
print(f"{header:>20} | {value:^30}")
#                 Item |         Python Mastery
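A few more specifiers from the same format mini-language, as a quick sketch. The values and variable names are invented for illustration.

```python
ratio = 0.4567
order_id = 42
title = "menu"

percent = f"{ratio:.1%}"    # multiply by 100, one decimal, append %
padded = f"{order_id:05d}"  # zero-pad to width 5
banner = f"{title:*^12}"    # center in width 12, fill with *
hex_value = f"{255:#x}"     # hexadecimal with 0x prefix
grouped = f"{1234567:_}"    # underscore as thousands separator

print(percent)    # 45.7%
print(padded)     # 00042
print(banner)     # ****menu****
print(hex_value)  # 0xff
print(grouped)    # 1_234_567
```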
The = debug specifier, available since Python 3.8 (super useful for logging/debugging):
Python
x = 42
y = "test"
print(f"{x=}") # x=42
print(f"{x + 10 = }") # x + 10 = 52
print(f"{y.upper() = }") # y.upper() = 'TEST'
Output:
text
x=42
x + 10 = 52
y.upper() = 'TEST'
Nested quotes (new in 3.12):
Python
details = {"name": "Alex", "lang": "Python"}
print(f"User: {details['name']} loves {details["lang"]}") # Works on 3.12+; a SyntaxError on earlier versions
Multi-line f-strings:
Python
report = f"""
User Report
-----------
Name: {name}
Age : {age}
City: {city.upper()}
"""
print(report)
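f-strings beat the old % operator and .format() in speed and clarity. Use them wherever the template is written in your own source code. An f-string is always a literal, so it cannot be loaded from a file or built from user input. For templates that are only known at runtime, use .format() with templates you trust, and string.Template for user-supplied ones, because .format() on an untrusted template can expose object attributes.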
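For templates that only exist at runtime, the standard library's string.Template is one conservative option, since it only substitutes names and does not evaluate expressions. A minimal sketch; the template text here is a hypothetical stand-in for something read from a config file or user input.

```python
from string import Template

# Hypothetical template text, e.g. loaded from a config file.
template_text = "Hello $name, welcome to $city!"
template = Template(template_text)

# substitute fills every placeholder and raises KeyError if one is missing.
message = template.substitute(name="Duong", city="Hanoi")

# safe_substitute leaves unknown placeholders untouched instead of raising.
partial = template.safe_substitute(name="Duong")

print(message)  # Hello Duong, welcome to Hanoi!
print(partial)  # Hello Duong, welcome to $city!
```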
Unicode and Encoding Basics
Python 3 strings are Unicode by default (str type = sequence of Unicode code points). No more u"…" prefixes.
Common pain points: reading files, APIs, terminals.
- Encoding: How Unicode → bytes (UTF-8 is the default for str.encode() and bytes.decode(), and the recommended choice).
Python
# Unicode string
text = "Xin chào Hà Nội! 😊" # Vietnamese + emoji
# To bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes) # b'Xin ch\xc3\xa0o H\xc3\xa0 N\xe1\xbb\x99i! \xf0\x9f\x98\x8a'
# Back to string
decoded = utf8_bytes.decode("utf-8")
print(decoded == text) # True
File handling (always specify encoding, since the default for open() can depend on the platform's locale):
Python
# Write
with open("log.txt", "w", encoding="utf-8") as f:
    f.write("Ghi log tiếng Việt: Thành công\n")
# Read
with open("log.txt", "r", encoding="utf-8") as f:
    content = f.read()
print(content)
Common encodings:
- UTF-8: Universal, default.
- UTF-16/32: Sometimes from Windows/legacy.
- latin-1: For old Western European files (never guess — check source).
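To see why guessing an encoding is risky, here is a small sketch of what happens when bytes are decoded with the wrong codec. The sample text is illustrative.

```python
data = "Hà Nội".encode("utf-8")

# latin-1 maps every byte to some character, so decoding "succeeds"
# but silently produces garbled text (mojibake).
wrong = data.decode("latin-1")
print(wrong == "Hà Nội")  # False

# ASCII cannot represent these bytes, so strict decoding raises.
try:
    data.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)  # True

# errors="replace" swaps each undecodable byte for U+FFFD instead of raising.
lossy = data.decode("ascii", errors="replace")
print(lossy.count("\ufffd"))  # 5
```

The silent latin-1 case is usually the more dangerous one, because nothing fails until someone reads the output.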
Normalize for comparisons/search:
Python
import unicodedata
def normalize(text):
    # NFKD: decompose, then remove diacritics
    return "".join(
        c for c in unicodedata.normalize("NFKD", text)
        if unicodedata.category(c) != "Mn"
    ).lower()
print(normalize("Hà Nội")) # "ha noi"
Use unicodedata for advanced needs (e.g., category checks, East Asian width).
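A short sketch of those unicodedata helpers, and of why normalization matters for comparisons: the same visible character can be stored as one code point or as a base letter plus a combining mark.

```python
import unicodedata

composed = "\u00e0"     # 'à' as a single code point
decomposed = "a\u0300"  # 'a' followed by a combining grave accent

# They look identical on screen but compare as different strings.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# Normalizing both sides to the same form (here NFC) makes them equal.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)  # True

# Category and name lookups, plus East Asian width for a CJK character.
mark_category = unicodedata.category("\u0300")
letter_name = unicodedata.name(composed)
cjk_width = unicodedata.east_asian_width("漢")
print(mark_category)  # Mn
print(letter_name)    # LATIN SMALL LETTER A WITH GRAVE
print(cjk_width)      # W
```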
Mini Exercises: Parsing Logs, Validating Simple Input
Time to practice! Try these yourself.
Exercise 1: Parse Apache-like log line
Python
log = '192.168.1.1 - - [14/Jan/2026:20:11:00 +0700] "GET /blog HTTP/1.1" 200 1234'
# Extract: IP, timestamp, method, path, status
parts = log.split('"')
request_part = parts[1] # GET /blog HTTP/1.1
method, path, _ = request_part.split()
timestamp_start = log.find("[") + 1
timestamp_end = log.find("]")
timestamp = log[timestamp_start:timestamp_end]
status_start = log.rfind('"') + 2
status = log[status_start:].split()[0]
print(f"IP: {log.split()[0]}")
print(f"Timestamp: {timestamp}")
print(f"Request: {method} {path}")
print(f"Status: {status}")
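One possible way to package the same idea as a reusable function, using only str methods. This is a sketch that assumes well-formed lines in the format shown above; parse_log_line and the returned field names are hypothetical.

```python
def parse_log_line(line: str) -> dict:
    """Split one access-log line into fields using only str methods."""
    ip = line.split(" ", 1)[0]
    # The timestamp is the text between the first '[' and the next ']'.
    _, _, after_bracket = line.partition("[")
    timestamp, _, _ = after_bracket.partition("]")
    # The request is the text between the first pair of double quotes.
    _, request, after_request = line.split('"', 2)
    method, path, protocol = request.split()
    status, size = after_request.split()
    return {
        "ip": ip,
        "timestamp": timestamp,
        "method": method,
        "path": path,
        "status": int(status),
        "size": int(size),
    }

sample = '192.168.1.1 - - [14/Jan/2026:20:11:00 +0700] "GET /blog HTTP/1.1" 200 1234'
record = parse_log_line(sample)
print(record["ip"], record["method"], record["path"], record["status"])
# 192.168.1.1 GET /blog 200
```

Malformed lines will raise ValueError from the tuple unpacking, which a real parser would want to catch and report.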
Exercise 2: Validate username + email
Python
def validate_input(username: str, email: str) -> bool:
    username = username.strip()
    if not (3 <= len(username) <= 20):
        return False
    # Only letters, numbers, and underscores are allowed
    if not username.replace("_", "").isalnum():
        return False
    email = email.lower().strip()
    # Exactly one "@" is required
    if email.count("@") != 1:
        return False
    local, domain = email.split("@", 1)
    if not local or not domain or "." not in domain:
        return False
    return True
# Test
print(validate_input("duong_dev", "duong@example.com")) # True
print(validate_input(" duong ", "bad@@email")) # False
Exercise 3: Generate formatted report with f-strings
Python
items = [
    ("Laptop", 1200.50, 2),
    ("Mouse", 25.00, 5),
]
total = 0
print(f"{'Item':<15} {'Price':>8} {'Qty':>5} {'Subtotal':>10}")
print("-" * 41)
for name, price, qty in items:
    subtotal = price * qty
    total += subtotal
    print(f"{name:<15} {price:>8.2f} {qty:>5} {subtotal:>10.2f}")
print("-" * 41)
total_text = f"${total:.2f}" # Attach the $ before right-aligning
print(f"{'Total':<30} {total_text:>10}")
Output:
text
Item               Price   Qty   Subtotal
-----------------------------------------
Laptop           1200.50     2    2401.00
Mouse              25.00     5     125.00
-----------------------------------------
Total                            $2526.00
These exercises combine everything: immutability awareness, method chaining, f-strings, and careful parsing.
You’ve now got solid string superpowers! Practice daily — parse a real log file, clean CSV data, generate reports. Strings are foundational; master them and the rest gets easier.
