Kotlin Coroutines are no longer just a “nice-to-have” — they are the standard way to write asynchronous, concurrent code in modern Kotlin projects, whether you’re building Android apps, Ktor/Spring backends, or multiplatform libraries.
If you’re a mid-level Kotlin developer, you likely already know how to launch a coroutine and use suspend functions. This article skips the absolute beginner explanations (like “what is async?”) and focuses on the concepts you need to truly understand and use coroutines effectively and safely in production code.
We’ll go deeper into how coroutines work under the hood, why certain patterns exist, common pitfalls, and best practices that most tutorials gloss over.
Why Coroutines Were Created (The Real Problems They Solve)
Before coroutines, Kotlin/Java developers had these main options:
| Approach | Problem in Practice |
|---|---|
| Callbacks | Pyramid of doom, error handling scattered everywhere |
| Threads | Expensive, limited scalability (~few thousand max) |
| RxJava | Powerful but steep learning curve, heavy runtime |
| Future/Promise | Hard to compose, no built-in cancellation propagation |
Coroutines were designed to solve all of these with a single, coherent model:
- Write async code that looks synchronous
- Lightweight (100k+ coroutines easily)
- Structured lifecycle & cancellation
- First-class exception handling
- Seamless integration with existing libraries (Retrofit, Room, Ktor, etc.)
Threads vs Coroutines – The Real Difference
When developers first encounter Kotlin Coroutines, it’s tempting to think of them as “lightweight threads.” While this description is directionally correct, it hides the deeper distinction: threads are a low-level OS concept, whereas coroutines are a high-level concurrency abstraction controlled by the Kotlin runtime.
For a mid-level developer, understanding why this matters is essential to writing correct, scalable, and efficient asynchronous applications.
1. Operating System Threads vs. Language-Level Coroutines
A thread is a unit of execution managed by the operating system. Each thread has:
- Its own stack (typically 1 MB by default on the JVM, configurable with the -Xss flag).
- A fixed scheduling cost (context switches handled by the OS kernel).
- A hard limit on how many can practically run concurrently (hundreds to a few thousand depending on memory).
A coroutine, on the other hand:
- Is a computation that can be suspended and resumed by the Kotlin coroutine scheduler.
- Has no dedicated OS stack; it stores only a small state object (often just a few hundred bytes).
- Can scale to hundreds of thousands or millions of concurrent tasks on the same threads.
Coroutines reuse threads instead of owning them.
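Here is a minimal sketch that makes the sharing visible: 10,000 coroutines launched on Dispatchers.Default typically end up running on only a handful of pool threads.
import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap

fun main() = runBlocking {
    val threadsUsed = ConcurrentHashMap.newKeySet<String>()
    val jobs = List(10_000) {
        launch(Dispatchers.Default) {
            threadsUsed.add(Thread.currentThread().name) // record which pool thread ran this coroutine
            delay(10)
        }
    }
    jobs.forEach { it.join() }
    println("10,000 coroutines ran on only ${threadsUsed.size} threads")
}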
2. Why Coroutines Scale Better
The JVM simply cannot create one million threads — it would require ~1 TB of memory just for stacks.
But one million coroutines is entirely feasible because the memory footprint is minimal.
Thread example (expensive)
fun main() {
    repeat(100_000) {
        Thread {
            Thread.sleep(1000)
        }.start()
    }
}
This will almost certainly crash with OutOfMemoryError: unable to create new native thread on most systems.
Coroutine example (lightweight)
import kotlinx.coroutines.*
fun main() = runBlocking {
    repeat(100_000) {
        launch {
            delay(1000)
        }
    }
}
This executes smoothly on a typical laptop because coroutines don’t allocate OS stacks or require OS scheduling.
3. Cooperative vs Preemptive Scheduling
Threads use preemptive scheduling:
The OS can interrupt a thread at any time and switch to another. This is powerful but costly.
Coroutines use cooperative scheduling:
A coroutine yields control only at suspension points (e.g., delay, await, I/O).
The result:
- No forced interruptions between suspension points → fewer race conditions caused by unexpected interleaving.
- Far fewer context switches → better performance under concurrency.
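A practical consequence is that cancellation is also cooperative: a coroutine only notices it was cancelled at a suspension point or an explicit isActive check. A minimal sketch:
import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch(Dispatchers.Default) {
        var count = 0L
        // A plain `while (true)` with no suspension point could never be cancelled.
        while (isActive) {
            count++
        }
        println("observed cancellation after $count iterations")
    }
    delay(100)
    job.cancelAndJoin() // takes effect at the next isActive check, not instantly
    println("cancelled cooperatively")
}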
4. Blocking vs Suspending
A key conceptual difference:
| Concept | Threads | Coroutines |
|---|---|---|
| Blocking | Stops the thread completely | Suspends the coroutine without blocking the thread |
| Thread usage | One task → one thread | Many coroutines share a thread |
| Effect | Wastes resources during waiting (I/O, sleep) | Frees the thread for other coroutines |
Example:
// Blocking the thread
Thread.sleep(1000)
This prevents the thread from doing anything else.
// Suspending the coroutine
delay(1000)
This frees the thread instantly while the coroutine waits.
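A minimal sketch makes the difference observable. Both coroutines below share the single thread that runBlocking occupies; because delay suspends instead of blocking, the second coroutine runs while the first one waits:
import kotlinx.coroutines.*

fun main() = runBlocking {
    launch {
        delay(1000) // suspends: the shared thread is released meanwhile
        println("first coroutine resumed")
    }
    launch {
        println("second coroutine runs while the first is suspended")
    }
}
// Replacing delay(1000) with Thread.sleep(1000) would block the shared thread,
// so the second message would only appear after the full second.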
5. Cost Comparison
These are order-of-magnitude figures based on typical JVM defaults and commonly cited coroutine benchmarks:
| Metric | Thread | Coroutine |
|---|---|---|
| Memory per instance | ~1 MB stack (JVM default) | A few hundred bytes |
| Max practical count | A few thousand | Hundreds of thousands to millions |
| Context switch cost | OS-level, ~1–10 μs | Language-level, hundreds of ns |
| Creation time | Heavy | Very light |
6. Code Example: Running Tasks Concurrently
Using Threads
fun main() {
    val threads = List(10_000) {
        Thread {
            println("Running thread: $it")
        }.apply { start() }
    }
    threads.forEach { it.join() }
}
This will run, but push the count much higher and you will hit OS thread limits.
Using Coroutines
import kotlinx.coroutines.*
fun main() = runBlocking {
    val jobs = List(10_000) {
        launch {
            println("Running coroutine: $it")
        }
    }
    jobs.forEach { it.join() }
}
This scales predictably because each coroutine is only a small heap object, not an OS thread.
7. The Real Difference Summarized
- Threads are heavyweight, OS-managed, and expensive to create, switch, and block.
- Coroutines are lightweight, runtime-managed, and suspend rather than block, making them ideal for I/O-heavy or high-concurrency tasks.
Coroutines do not replace threads — instead, they give you a high-level, efficient way to use fewer threads more effectively.
How Coroutines Actually Work Under the Hood
Mid-level developers often use coroutines but don’t fully understand what the Kotlin runtime actually does behind the scenes.
This section breaks down the internal mechanics: suspension, continuations, dispatchers, and the coroutine scheduler.
Understanding this will help you avoid performance traps, debugging headaches, and incorrect mental models.
Coroutines Compile Into State Machines
A coroutine is not a thread, nor is it magic.
When you use suspend, the Kotlin compiler rewrites your function into a state machine.
Example:
suspend fun loadUser() {
    val id = fetchId()       // suspend point
    val user = fetchUser(id) // suspend point
    println(user)
}
The compiler transforms this into a class that:
- Stores local variables (id, user)
- Stores the current state (before fetchId, before fetchUser, after fetchUser…)
- Implements a resume() method that jumps to the next state
Each suspend call becomes a labeled point in the state machine.
Visualizing the result (simplified)
State 0 -> call fetchId()
State 1 -> call fetchUser()
State 2 -> print and complete
This is why:
- Suspending functions don’t block threads.
- Coroutines can pause and resume without losing local variables or execution order.
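A heavily simplified, hand-written sketch of that state machine (the real generated code uses internal compiler classes and differs in detail) looks roughly like this:
import kotlin.coroutines.Continuation

// Conceptual sketch only, not the actual generated code.
// Locals are hoisted into fields so they survive across suspensions,
// and `label` records which state to resume from.
class LoadUserStateMachine(private val completion: Continuation<Unit>) {
    private var label = 0
    private var id: String? = null
    private var user: String? = null

    fun resumeWith(result: Any?) {
        when (label) {
            0 -> {
                label = 1
                // call fetchId(this); return here if it suspended
            }
            1 -> {
                id = result as String
                label = 2
                // call fetchUser(id, this); return here if it suspended
            }
            2 -> {
                user = result as String
                println(user)
                // complete by resuming `completion`
            }
        }
    }
}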
Continuations: The Core Building Block
Under the hood, every suspend function receives a hidden parameter:
a Continuation<T> object.
A continuation stores:
- The coroutine’s current state
- Where execution should resume
- The result or exception
- The dispatcher (which determines the thread to resume on)
When a coroutine suspends, it returns control to its caller along with its continuation.
When the awaited operation completes, the coroutine runtime calls:
continuation.resume(value)
or in case of failure:
continuation.resumeWithException(exception)
This is the same continuation-passing mechanism that underlies async/await in C# and promises in JavaScript.
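You can see the continuation explicitly when bridging a callback-based API into a suspend function with suspendCancellableCoroutine. In the sketch below, LegacyClient and its Callback are hypothetical stand-ins for any callback API:
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical callback-based API standing in for Retrofit, Firebase, etc.
interface Callback<T> {
    fun onSuccess(value: T)
    fun onError(error: Throwable)
}

class LegacyClient {
    fun fetchUser(id: String, callback: Callback<String>) = callback.onSuccess("user-$id")
}

// The lambda receives the normally hidden continuation explicitly;
// resuming it is exactly what the runtime does for built-in suspend functions.
suspend fun LegacyClient.fetchUserSuspend(id: String): String =
    suspendCancellableCoroutine { continuation ->
        fetchUser(id, object : Callback<String> {
            override fun onSuccess(value: String) = continuation.resume(value)
            override fun onError(error: Throwable) = continuation.resumeWithException(error)
        })
    }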
Dispatchers: Who Decides What Thread You Run On?
Dispatchers are thread-management strategies that decide how work is scheduled.
Common ones:
| Dispatcher | Backed By | Typical Use |
|---|---|---|
| Dispatchers.Default | Shared pool sized to the CPU core count (minimum 2 threads) | CPU-bound tasks |
| Dispatchers.IO | Larger shared pool, limited to 64 threads by default (or the core count, if higher) | I/O-heavy tasks |
| Dispatchers.Main | UI thread (Android) | UI work |
| newSingleThreadContext | A dedicated thread | Rare cases, singleton logic |
The dispatcher determines where the coroutine resumes, not where it suspends.
This is important:
withContext(Dispatchers.IO) {
    delay(1000) // suspends - thread is released
} // resumes on IO dispatcher thread
During suspension, the thread is free for other work.
When the coroutine resumes, the dispatcher picks a thread to continue execution.
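A quick sketch that prints the thread name before and after a suspension makes this visible; the resumed half may run on a different worker thread than the first half:
import kotlinx.coroutines.*

fun main() = runBlocking {
    withContext(Dispatchers.Default) {
        println("before delay on ${Thread.currentThread().name}")
        delay(100) // the thread goes back to the pool during the suspension
        println("after delay on ${Thread.currentThread().name}") // possibly a different worker
    }
}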
Coroutine Scheduler (Work Stealing & Dispatch Queues)
Kotlin’s coroutine scheduler is inspired by modern task schedulers:
- Each worker thread has its own local queue
- There is also a global queue for extra tasks
- If a worker runs out of tasks, it steals work from other queues
This ensures:
- High throughput
- Fair distribution
- Minimal lock contention
- Efficient context switching, since switching between coroutines is in-memory, not OS-level
This is why coroutines scale better than classic thread-per-task architecture.
Suspension: What Really Happens at a Suspend Point?
Take this example:
suspend fun demo() {
    println("A")
    delay(1000)
    println("B")
}
Execution flow under the hood:
- Coroutine prints "A" on the current thread.
- delay(1000) schedules a timer-based continuation.
- Coroutine returns control immediately → thread becomes free.
- After 1000ms, the scheduler picks a thread (maybe the same one, maybe not).
- The coroutine is resumed from the exact point after delay.
- "B" is printed.
The coroutine looks like synchronous code but executes asynchronously without blocking.
Structured Concurrency Engine
Kotlin adds another layer on top of raw coroutines:
structured concurrency, which ensures:
- Every child coroutine belongs to a parent
- Cancelling the parent cancels all children
- Failures propagate in a consistent way
Example:
coroutineScope {
    launch { /* child #1 */ }
    launch { /* child #2 */ }
}
Behind the scenes, Kotlin maintains a job hierarchy:
Parent Job
├── Child Job #1
└── Child Job #2
This prevents “stray” coroutines running in the background (a common source of bugs with unawaited Node.js promises and Java ExecutorService tasks).
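A minimal sketch of the hierarchy in action: cancelling the parent job cancels both children before they ever complete.
import kotlinx.coroutines.*

fun main() = runBlocking {
    val parent = launch {
        launch { delay(5_000); println("child 1 done") } // never prints
        launch { delay(5_000); println("child 2 done") } // never prints
    }
    delay(100)
    parent.cancelAndJoin() // cancellation propagates down the job tree
    println("parent cancelled, no stray children remain")
}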
Bringing It All Together: Full Internal Flow
Let’s combine all components in a simplified flow:
launch { ... }
↓
Coroutine builder creates Continuation + Job
↓
Dispatch to appropriate thread pool (Dispatcher)
↓
Execute until first suspension point
↓
Return thread to pool, store continuation state
↓
External event completes (network, timer, etc.)
↓
Scheduler resumes continuation on a chosen thread
↓
Execute next portion of state machine
↓
Repeat until completion or cancellation
This entire workflow is why coroutines provide:
- Synchronous-looking code
- Non-blocking behavior
- Massive scalability
- Safe structured concurrency
Structured Concurrency: The Feature That Prevents Leaks
One of the most powerful — yet often overlooked — advantages of Kotlin Coroutines is structured concurrency.
Mid-level developers typically understand coroutine builders like launch or async, but many don’t fully grasp why coroutine hierarchies exist or how they prevent resource leaks, runaway tasks, or subtle lifecycle bugs.
If you’re coming from Java threads, Node.js promises, or Android AsyncTask, structured concurrency is one of the biggest mental shifts you need to adopt.
The Problem: Unstructured Concurrency Leads to “Dangling” Tasks
In traditional async APIs, it’s easy to start background work that:
- Outlives the caller
- Never gets cancelled
- Continues running even when the user navigates away
- Leaks memory, network connections, file handles, etc.
Example of unstructured concurrency in Java:
new Thread(() -> {
    doNetworkCall(); // Keeps running even if caller no longer cares
}).start();
Or in JavaScript:
fetch("/data").then(() => console.log("done"));
Nothing ties the background task to the scope of the caller.
If the caller disappears, the background work just keeps going.
This is how leaks happen.
Kotlin’s Solution: Coroutines Must Live Inside a Scope
Kotlin introduces a strict rule:
Every coroutine must belong to a well-defined parent scope.
This guarantees:
- A coroutine can’t become a “zombie” task.
- When the parent is cancelled, all child tasks are cancelled.
- When the parent finishes, all children must complete or be cancelled before continuing.
- You always know exactly which coroutines are running and why.
This behavior is enforced by the CoroutineScope + Job hierarchy.
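Outside Android you can create the same lifecycle boundary yourself by giving a component its own scope and cancelling it on shutdown. A sketch (SyncService is a hypothetical component, not a library API):
import kotlinx.coroutines.*

class SyncService {
    // SupervisorJob keeps one failed task from tearing down the whole scope.
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    fun start() {
        scope.launch {
            while (isActive) {
                // periodic sync work would go here
                delay(60_000)
            }
        }
    }

    fun shutdown() {
        scope.cancel() // cancels every coroutine launched in this scope
    }
}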
Coroutine Scope Creates a Lifecycle Boundary
coroutineScope {
    launch { task1() }
    launch { task2() }
}
coroutineScope {} ensures:
- It does not complete until all children complete.
- If any child fails → all siblings are cancelled.
- If the parent is cancelled → all children are cancelled immediately.
This is similar to structured blocks in synchronous code:
{
    statementA()
    statementB()
}
The key idea:
Structured concurrency makes async code behave like structured synchronous code.
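It also means failures behave like exceptions in a structured block: if one child throws, its siblings are cancelled and the failure surfaces where the scope was created. A small sketch:
import kotlinx.coroutines.*

fun main() = runBlocking {
    try {
        coroutineScope {
            launch { delay(1_000); println("never printed") } // cancelled when the sibling fails
            launch { error("child failed") }                  // failure cancels the whole scope
        }
    } catch (e: IllegalStateException) {
        println("scope failed with '${e.message}', all siblings were cancelled")
    }
}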
Why This Prevents Leaks
Leak Example WITHOUT structured concurrency (wrong)
fun loadData() {
    GlobalScope.launch { // ⚠️ unstructured, dangerous
        fetchData()
    }
}
Problems:
- GlobalScope lives forever → the coroutine outlives the screen, the ViewModel, or even the app process.
- If the user leaves the screen → the coroutine keeps running.
- If fetchData() ties up memory or I/O → leak risk.
The Correct Version WITH structured concurrency
class MyViewModel : ViewModel() {
    fun loadData() {
        viewModelScope.launch { // 👈 child of ViewModel lifecycle
            fetchData()
        }
    }
}
Now:
- When the ViewModel is cleared → all coroutines inside viewModelScope are cancelled.
- No background work leaks beyond the owner.
- Cancellation propagates automatically.
Cancellation Propagation — The Real Mechanism Behind Leak Prevention
Every coroutine has a Job.
Jobs form a tree:
Parent Job
├── Child Job A
└── Child Job B
If the parent job is cancelled:
- It sends a cancellation signal to all children.
- Children stop at the next suspension point (delay, await, I/O, etc.).
- Resources are released predictably.
This is automatic — you don’t need to manually track children.
Supervisor Jobs: What if One Child Fails?
By default:
Failure of one child cancels the entire scope.
But sometimes you want sibling coroutines to operate independently.
Example:
supervisorScope {
    launch { loadUser() } // fails → does NOT cancel others
    launch { loadSettings() }
}
This is still structured concurrency, but with controlled error isolation.
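A runnable sketch of that isolation: the failing child needs its own CoroutineExceptionHandler, because its failure is no longer propagated to the parent, while the sibling completes normally.
import kotlinx.coroutines.*

fun main() = runBlocking {
    val handler = CoroutineExceptionHandler { _, e -> println("loadUser failed: ${e.message}") }
    supervisorScope {
        launch(handler) { error("network error") }        // fails alone, handled by its own handler
        launch { delay(100); println("settings loaded") } // unaffected sibling
    }
    println("supervisorScope completed despite the failure")
}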
Real-World Example: Preventing Leaks in an HTTP Request
Imagine an Android screen that loads data when opened.
Problematic version:
fun onCreate() {
    GlobalScope.launch {
        api.load() // may continue even after user leaves screen!
    }
}
Correct version:
class ScreenViewModel : ViewModel() {
    fun load() {
        viewModelScope.launch {
            val data = api.load()
            _uiState.value = data
        }
    }
}
Now:
- If the user leaves → ViewModel is destroyed → coroutine is cancelled.
- No background I/O continues unnecessarily.
- No wasted CPU or memory.
- No “dangling network calls.”
Structured Concurrency Makes Async Code Predictable
Let’s summarize how it prevents leaks:
| Behavior | Without Structured Concurrency | With Structured Concurrency |
|---|---|---|
| Ownership of tasks | No owner → “fire and forget” | Every coroutine tied to a parent scope |
| Cancellation | Manual and error-prone | Automatic propagation |
| Memory leaks | Very likely | Almost impossible |
| Debugging | Hard to know what’s still running | Full coroutine tree visibility |
| Lifecycle management | Manual | Scope-driven |
| Failure handling | Inconsistent | Deterministic and hierarchical |
The Golden Rule to Avoid Coroutine Leaks
Never launch a coroutine without a structured scope unless you truly want a top-level background worker.
GlobalScope.launch should almost never appear in app/business code.
If you respect that rule, leaks basically disappear.
When to Use Which Coroutine Builder
Kotlin provides several coroutine builders—launch, async, runBlocking, withContext, produce, etc.—and although they may look similar at a glance, they serve fundamentally different purposes.
Mid-level developers often misuse these builders, leading to unnecessary blocking, improper error handling, or excessive coroutine creation.
This section clarifies when and why you should use each builder, with practical examples and decision-making rules.
launch — Fire-and-Forget, Returns a Job
Use launch when:
- You don’t need a return value
- You want to start a child coroutine that runs concurrently
- You want the coroutine to participate in structured concurrency (e.g., viewModelScope)
- Cancellation propagation is important
Common use cases
- Updating UI after background work in Android
- Running periodic or background tasks tied to a scope
- Parallel tasks that don’t return data
Example
viewModelScope.launch {
    val user = repository.loadUser()
    _uiState.value = user
}
When NOT to use launch
- When you need a result → Use async
- When calling from regular (non-suspend) code → Do NOT use GlobalScope.launch unless absolutely necessary
async — Concurrent Work That Produces a Value
async returns a Deferred<T>, similar to a future or promise.
Use async when:
- You want to compute a value asynchronously
- You want to run computations in parallel
- You’ll eventually call .await() to retrieve the result
- The result should propagate exceptions correctly
Example: parallel network calls
val userDeferred = async { api.loadUser() }
val settingsDeferred = async { api.loadSettings() }
val user = userDeferred.await()
val settings = settingsDeferred.await()
Important note
async starts the coroutine immediately. If you want lazy behavior:
val deferred = async(start = CoroutineStart.LAZY) { expensiveTask() }
deferred.await() // starts only when awaited
When NOT to use async
- If the result is not needed → Use launch
- If tasks are not inherently parallel
- As a replacement for structured concurrency (common anti-pattern)
runBlocking — Bridge Between Blocking and Suspending Worlds
runBlocking blocks the current thread until its body completes.
This is fundamentally different from all other coroutine builders.
Use runBlocking ONLY when:
- You need to call suspend code from test code
- You need to write a main() entry point for a console application
- You’re integrating with libraries that expect blocking APIs
Example: main function
fun main() = runBlocking {
    val data = fetchData()
    println(data)
}
When NOT to use runBlocking
- Never use it in Android UI code
- Never use it inside coroutines
- Never use it to “wait for something” in production logic
It is a controlled escape hatch, not a concurrency tool.
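In tests, runBlocking is the simplest bridge (kotlinx-coroutines-test’s runTest is usually a better fit once delays are involved, but the shape is the same). A minimal sketch using a hypothetical suspend function under test:
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlin.test.Test
import kotlin.test.assertEquals

class PriceServiceTest {
    // Hypothetical suspend function standing in for the real code under test.
    private suspend fun discountedPrice(base: Int): Int {
        delay(10)
        return base * 90 / 100
    }

    @Test
    fun appliesTenPercentDiscount() = runBlocking {
        assertEquals(90, discountedPrice(100))
    }
}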
withContext — Switch Contexts Predictably and Safely
Unlike launch and async, withContext does not create a new coroutine.
It simply suspends the current coroutine, switches its dispatcher, and resumes after completion.
Use withContext when:
- You need to switch threads (e.g., IO → main)
- You want sequential logic with a thread-change baked in
- You want a safe way to run blocking or CPU-heavy work
Example: switching to IO and then back to Main
val user = withContext(Dispatchers.IO) {
    api.loadUser() // heavy I/O
}
withContext(Dispatchers.Main) {
    render(user)
}
When NOT to use withContext
- When you need parallelism → Use async
- For fire-and-forget tasks → Use launch
coroutineScope — Builder That Enforces Structured Concurrency
coroutineScope is not a coroutine launcher. It creates a new scope where:
- All children must complete before returning
- Child failures cancel siblings
- No new thread is blocked
Use coroutineScope when:
- You want to create a structured block that launches multiple child coroutines
- You want lifecycle-like behavior inside a suspend function
Example
suspend fun loadAll() = coroutineScope {
    val user = async { api.loadUser() }
    val settings = async { api.loadSettings() }
    UserData(user.await(), settings.await())
}
When NOT to use coroutineScope
- When you need blocking behavior → Use runBlocking
- When you need context switching → Use withContext
Quick Comparison Table
| Builder | Returns | Creates new coroutine? | Blocks thread? | Common use |
|---|---|---|---|---|
| launch | Job | Yes | No | Fire-and-forget tasks |
| async | Deferred<T> | Yes | No | Parallel computations with result |
| runBlocking | T | Yes, and blocks the caller | Yes | Tests, main(), bridging |
| withContext | T | No (same coroutine) | No | Change dispatcher / thread |
| coroutineScope | T | Only for children | No | Structured concurrency inside suspend functions |
Choosing the Right Builder (Decision Flow)
Here is the mental model you should follow:
1️⃣ Do you need to block a thread?
→ Use runBlocking.
2️⃣ Do you need to return a value asynchronously?
→ Use async + await.
3️⃣ Do you need fire-and-forget logic?
→ Use launch.
4️⃣ Do you need to switch threads inside existing coroutine?
→ Use withContext.
5️⃣ Do you need to start multiple child coroutines and wait for all?
→ Use coroutineScope.
6️⃣ Do you want long-lived tasks tied to a lifecycle (ViewModel/Activity)?
→ Use scope-provided builders (viewModelScope.launch, lifecycleScope.launch).
Conclusion: Coroutines Are Now Table Stakes
As a mid-level Kotlin developer in 2025, you should:
- Never use GlobalScope
- Always use structured concurrency (viewModelScope, lifecycleScope, custom scopes)
- Prefer suspend functions over callbacks
- Use async/await for parallel work, not threads
- Use withTimeout, supervisorScope, and proper exception handling
- Understand that suspension is cheap — don’t be afraid to launch many coroutines
Mastering coroutines isn’t about memorizing APIs — it’s about understanding suspension, structured concurrency, and dispatchers. Once you internalize these concepts, writing clean, scalable, and safe asynchronous code becomes natural.
Happy suspending! 🚀
