Kotlin Coroutines are “easy” until they aren’t. Most production issues I’ve debugged weren’t caused by coroutines themselves, but by wrong assumptions about dispatchers, threads, and “blocking”.
This post is a deep dive aimed at middle–senior Android/Kotlin developers. We’ll go past the usual “IO for network, Default for CPU” rule and talk about what a dispatcher actually does, how scheduling works, what “blocking” really means, and why the wrong dispatcher can hurt throughput and tail latency.
What is a dispatcher?
A dispatcher is the component that decides where and how a coroutine runs.
More precisely, a CoroutineDispatcher controls:
- Where a coroutine continuation is executed (which thread / pool)
- When it’s executed (queueing + scheduling)
- How much parallelism is allowed (how many tasks can run at once)
- In some cases, how to handle blocking (whether to compensate by adding threads)
A coroutine is not a thread. A coroutine executes in segments between suspension points. Each time it suspends and later resumes, the dispatcher decides how the continuation is scheduled.
viewModelScope.launch(Dispatchers.Default) {
    // execution segments of this coroutine are scheduled by Dispatchers.Default
}
The key mental model:
Coroutines are logical concurrency. Dispatchers provide physical execution via threads and scheduling policy.
How dispatchers map to threads
“Blocking” means something specific
Before threads/pools, we need to clarify one word: blocking.
Many people say: “JSON parsing blocks the thread.” True in casual language, but not what schedulers mean.
In OS/scheduler terms:
- Runnable / Running: the thread is ready to run or running → it competes for CPU time.
- Blocked / Waiting: the thread is parked by the OS (e.g., waiting for I/O) → it does not compete for CPU.
CPU-bound work (e.g., JSON parsing, encryption, image decode) keeps the thread Runnable most of the time.
Blocking I/O (e.g., socket.read(), blocking file read) often puts the thread in Blocked state (parked by the kernel).
Why you should care: Dispatchers.IO is designed to compensate for threads that become Blocked (not runnable). It is not designed for CPU tasks that stay runnable.
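You can see this distinction directly on the JVM. A minimal sketch (the exact states and timings are illustrative):

import kotlin.concurrent.thread

fun main() {
    // A thread parked in sleep (a stand-in for blocking I/O) is not runnable.
    val parked = thread { Thread.sleep(1_000) }

    // A thread doing pure computation stays runnable the whole time.
    val busy = thread {
        var acc = 0L
        repeat(1_000_000_000) { acc += it }
        println(acc) // keep the result observable so the loop isn't optimized away
    }

    Thread.sleep(100) // let both threads reach a steady state
    println("parked: ${parked.state}") // TIMED_WAITING: not competing for CPU
    println("busy:   ${busy.state}")   // RUNNABLE: competing for CPU
    parked.join(); busy.join()
}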
Dispatchers.Default and Dispatchers.IO are backed by a scheduler
On Android/JVM, Dispatchers.Default and Dispatchers.IO are backed by the coroutine scheduler from kotlinx.coroutines. Conceptually, it behaves like a work-stealing thread pool optimized for many small coroutine continuations:
- Worker threads, often named like DefaultDispatcher-worker-N
- Each worker has a local queue (fast, low contention)
- There is also a global queue (for cross-thread submissions)
- When a worker is idle, it can steal work from others (work stealing)
Why this matters:
- Local queues reduce contention and improve cache locality
- Work stealing improves throughput under uneven load
- Scheduling is cheaper than a “single shared blocking queue” executor under high concurrency
The “mapping” is not “one coroutine → one thread”. It’s:
many coroutine continuations → scheduled onto a smaller set of worker threads
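A quick way to see this mapping for yourself (worker thread names and counts are implementation details, so treat the output as illustrative):

import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap

fun main() = runBlocking {
    val workers = ConcurrentHashMap.newKeySet<String>()
    val jobs = List(1_000) {
        launch(Dispatchers.Default) {
            workers += Thread.currentThread().name
            delay(10) // suspension: the worker thread is free to run other coroutines
        }
    }
    jobs.joinAll()
    // Typically prints a number close to the core count, not 1,000.
    println("1000 coroutines ran on ${workers.size} worker threads")
}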
Dispatchers.Default: CPU-oriented parallelism
Dispatchers.Default is meant for CPU-bound work.
Characteristics:
- Parallelism defaults to the number of CPU cores (with a floor of two)
- It aims to avoid oversubscription (too many runnable threads competing for the same cores)
- It expects work to be CPU-heavy but not blocking on I/O
When you run CPU tasks on Default, you typically get:
- High throughput
- Better cache locality
- Lower context-switch overhead
- More predictable tail latency
Dispatchers.IO: blocking-aware parallelism
Dispatchers.IO is meant for blocking I/O (operations that park threads in waiting states).
Key idea:
- Some threads will be blocked (not runnable)
- So the dispatcher can allow more concurrent threads than core count to keep progress going
This is sometimes described as “IO can use more threads (up to a configured cap)”. By default the cap is 64 threads (or the core count, if higher) and it is configurable, but the important property is that IO is allowed to oversubscribe to compensate for blocked threads.
This is a powerful tool — and the reason it’s dangerous when misused for CPU work.
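A sketch that makes the compensation visible (the peak you observe depends on the runtime’s cap; ~64 is common at the time of writing):

import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

fun main() = runBlocking {
    val active = AtomicInteger(0)
    val peak = AtomicInteger(0)
    val jobs = List(200) {
        launch(Dispatchers.IO) {
            val now = active.incrementAndGet()
            peak.updateAndGet { maxOf(it, now) }
            Thread.sleep(100) // simulated blocking I/O: the thread parks
            active.decrementAndGet()
        }
    }
    jobs.joinAll()
    // On Default this peak would sit near the core count; on IO it can be much higher.
    println("peak concurrently blocked tasks: ${peak.get()}")
}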
Dispatchers.Main (Android): single-threaded, Looper-based
Dispatchers.Main is not a pool. It targets the Android main thread using the main Looper.
- It’s single-threaded
- It’s where UI state must be updated
- It integrates naturally with lifecycle scopes (lifecycleScope, viewModelScope)
Also note: Dispatchers.Main.immediate can run the coroutine immediately if you’re already on the main thread, avoiding the dispatch overhead.
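A sketch of the difference, assuming we’re already on the main thread (say, in a click handler) with lifecycleScope available:

// Already on the main thread here.
lifecycleScope.launch(Dispatchers.Main) { println("posted") }           // always goes through the Looper queue
lifecycleScope.launch(Dispatchers.Main.immediate) { println("inline") } // executes right here, no re-dispatch
println("after launches")
// Typical output order: inline, after launches, posted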
Dispatchers.Unconfined: no fixed thread, no confinement guarantees
Dispatchers.Unconfined is fundamentally different:
- It starts executing in the caller’s thread
- After the first suspension, it can resume on whatever thread resumes it
- Thread affinity is not guaranteed
This can be useful for very low-level coroutine library code, but is a footgun for application code.
Dispatchers.IO vs Dispatchers.Default — real performance differences
The rule of thumb is correct:
- Default → CPU-bound
- IO → blocking I/O
What middle–senior devs need is: what breaks when you get it wrong?
Bad example: CPU-bound work on Dispatchers.IO
withContext(Dispatchers.IO) {
    heavyJsonParsing()
}
Why this is bad
(1) IO assumes blocking, so it may allow more threads than cores
On a 4-core device, Default tends to keep ~4 CPU workers busy.
IO is designed to handle many blocked threads, so it may allow more concurrent workers (up to its cap).
When you run CPU-bound work on IO, you can end up with:
- More runnable threads than CPU cores
Example mental model:
- 4 cores
- 16 CPU-bound tasks all running on IO
- Scheduler/OS sees 16 runnable threads competing for 4 cores
Only 4 can run at any given moment; the rest are constantly preempted and rescheduled.
(2) Oversubscription causes context switching + cache thrash
When runnable threads > cores:
- The OS time-slices CPU among them
- Threads get preempted and resumed frequently
- Context switching overhead rises
- CPU cache locality is destroyed (cache lines evicted, branch predictor disrupted)
The result is counterintuitive:
CPU usage looks high, many threads are “busy”, but total throughput (work completed per second) drops, and tail latency gets worse.
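If you want to see this on your own hardware, here is a rough sketch (the workload size and task count are illustrative; treat absolute numbers as noise and look at the relative shape):

import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

fun cpuTask(): Long {
    var acc = 0L
    repeat(50_000_000) { acc += it } // pure compute, no suspension
    return acc
}

suspend fun timeOn(dispatcher: CoroutineDispatcher, tasks: Int): Long =
    withContext(dispatcher) {
        measureTimeMillis {
            List(tasks) { async { cpuTask() } }.awaitAll()
        }
    }

fun main() = runBlocking {
    println("Default: ${timeOn(Dispatchers.Default, 64)} ms")
    // Expect equal-or-worse wall time here, plus more context switching:
    println("IO:      ${timeOn(Dispatchers.IO, 64)} ms")
}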
(3) You can starve real I/O
If IO threads are saturated with CPU parsing:
- Real I/O tasks submitted to IO can queue behind compute tasks
- If the dispatcher hits its parallelism cap, it cannot keep “compensating”
- Your network/disk operations can get delayed
This becomes visible as:
- slower networking under load
- delayed DB/file operations
- jank when I/O results arrive late and UI waits
✅ Correct:
withContext(Dispatchers.Default) {
    heavyJsonParsing()
}
Bad example: blocking I/O on Dispatchers.Default
withContext(Dispatchers.Default) {
    socket.read() // blocking call
}
Why this is bad
(1) Default’s workers are precious
Default parallelism is limited (roughly core count).
If a Default worker thread blocks on I/O, it becomes unavailable for CPU work.
Imagine:
- 4-core device
- Default has ~4 CPU workers
- 2 of them get stuck in blocking I/O
Now you effectively have only 2 CPU workers left for everything else using Default:
- JSON parsing
- crypto
- image decoding
- diffing lists
- Compose runtime work (some internal pieces)
- other business computations
This is a classic “why is everything slower” bug: one component blocks Default, and unrelated parts degrade.
(2) Starvation + cascading latency
Coroutines scheduled on Default share the same pool.
Blocking ties up threads, which leads to:
- queues building up
- tasks waiting longer
- p95/p99 latency spikes
On Android this shows up as:
- delayed state updates
- late emissions in Flows
- UI stutter when results arrive unevenly
(3) Blocking defeats coroutines’ main scaling advantage
Coroutines are great because they allow high concurrency with few threads when work suspends.
Blocking doesn’t suspend — it pins a thread.
✅ Correct:
withContext(Dispatchers.IO) {
    socket.read()
}
Important nuance: “suspend” does not automatically mean “non-blocking”
A function being suspend does not guarantee it won’t block. suspend only means it can suspend. If it doesn’t hit a suspension point, it runs like normal code.
suspend fun parseStillCpuBound() {
    heavyJsonParsing() // no suspension point here
}
So you still need to choose the dispatcher based on:
- CPU-bound vs blocking I/O
- and where library code does parsing/processing
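A common convention that follows from this: make the suspend function main-safe at its definition site, so callers don’t need to know where the work runs. (parseUserJson, heavyJsonParsing, and the User type are illustrative names.)

suspend fun parseUserJson(raw: String): User =
    withContext(Dispatchers.Default) {
        heavyJsonParsing(raw) // CPU-bound, but now guaranteed off the caller's thread
    }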
Production-grade tool: limit parallelism
In many apps, you don’t just pick IO/Default; you also cap concurrency to avoid overload:
private val dbDispatcher = Dispatchers.IO.limitedParallelism(4)

suspend fun loadFromDb(): List<Item> = withContext(dbDispatcher) {
    dao.queryItems() // prevents 50 parallel DB hits
}
This is often the difference between “works locally” and “stable under production load”.
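A related use of the same API (assuming kotlinx.coroutines 1.6+): limitedParallelism(1) gives a serial view of a dispatcher, which can confine mutation of shared state without a lock. (counter is an illustrative stand-in for your state.)

private val stateDispatcher = Dispatchers.Default.limitedParallelism(1)
private var counter = 0

suspend fun increment() = withContext(stateDispatcher) {
    counter++ // all mutations run one at a time on the serial view
}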
The Android Main dispatcher
There is exactly one main thread, and it is shared by:
- rendering (Choreographer)
- input events
- lifecycle callbacks
- Compose/Views invalidation
- animations
So your goal on Main is:
- orchestrate
- update UI state
- never do heavy work
Common anti-pattern
viewModelScope.launch(Dispatchers.Main) {
    val data = api.call() // could be slow, could do heavy parsing, could block in edge cases
    render(data)
}
Even with suspend APIs, you don’t want to take that risk on the main thread.
Better:
viewModelScope.launch {
    val data = withContext(Dispatchers.IO) { api.call() }
    render(data) // runs on Main by default in viewModelScope
}
If the response parsing is heavy:
viewModelScope.launch {
    val raw = withContext(Dispatchers.IO) { api.fetchRaw() }
    val parsed = withContext(Dispatchers.Default) { parse(raw) }
    render(parsed)
}
Why Unconfined is dangerous
Dispatchers.Unconfined is not “faster IO”. It’s “no confinement”.
launch(Dispatchers.Unconfined) {
    println("Start: ${Thread.currentThread().name}")
    delay(10)
    println("Resume: ${Thread.currentThread().name}")
}
Possible output:

Start: main
Resume: DefaultDispatcher-worker-2
Why it’s dangerous
- Execution thread is non-deterministic
- After suspension, you may resume on a different thread
- Any code that assumes thread confinement (UI updates, state mutation without synchronization) becomes unsafe
Typical failure modes:
- rare race conditions
- flaky tests
- “works on my phone” bugs
- subtle UI thread violations
When it’s acceptable
- low-level coroutine library code
- some unit tests
- very specific cases where you want “run immediately until first suspension”
For application code: treat it as a red flag.
Switching context with withContext
withContext is the primary tool for structured context switching.
Key properties:
- It suspends the caller (does not block the thread)
- It executes the block on the specified dispatcher
- It returns the result
- It propagates cancellation properly
- After it completes, execution resumes in the original context
suspend fun loadUser(): User =
    withContext(Dispatchers.IO) {
        api.fetchUser()
    }
A “clean separation” pattern (IO → Default → Main)
viewModelScope.launch {
    val raw = withContext(Dispatchers.IO) { api.fetchRawJson() }
    val user = withContext(Dispatchers.Default) { parseUser(raw) }
    render(user) // Main
}
This pattern is production-friendly because:
- blocking happens on IO
- CPU work happens on Default
- UI work happens on Main
- each step has clear responsibility
Practical patterns for Android + API services
Repository owns the dispatcher decision
The caller (ViewModel/UI) should not need to know which dispatcher is correct for networking.
class UserRepository(
    private val api: UserApi
) {
    suspend fun getUser(): User = withContext(Dispatchers.IO) {
        api.getUser()
    }
}
If parsing is heavy, do it in repository as well:
class UserRepository(
    private val api: UserApi
) {
    suspend fun getUser(): User {
        val raw = withContext(Dispatchers.IO) { api.getUserRaw() }
        return withContext(Dispatchers.Default) { parseUser(raw) }
    }
}
This keeps ViewModel lean and improves testability.
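A common next step for testability (the constructor parameters are illustrative): inject dispatchers instead of hard-coding Dispatchers.*, so tests can substitute a TestDispatcher.

class UserRepository(
    private val api: UserApi,
    private val io: CoroutineDispatcher = Dispatchers.IO,
    private val compute: CoroutineDispatcher = Dispatchers.Default
) {
    suspend fun getUser(): User {
        val raw = withContext(io) { api.getUserRaw() }
        return withContext(compute) { parseUser(raw) }
    }
}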
ViewModel: orchestrate, don’t compute
ViewModel should:
- launch on Main (default)
- call suspend functions
- update state
viewModelScope.launch {
    _uiState.value = UiState.Loading
    runCatching { repository.getUser() }
        .onSuccess { _uiState.value = UiState.Success(it) }
        .onFailure { _uiState.value = UiState.Error(it) }
}
Flow: use flowOn correctly
A common mistake is pushing upstream work onto Main.
❌ Bad:
fun userFlow(): Flow<User> = flow {
    emit(api.getUser()) // might do IO + parsing
}.flowOn(Dispatchers.Main)
✅ Better:
fun userFlow(): Flow<User> = flow {
    emit(api.getUser())
}.flowOn(Dispatchers.IO)
If heavy parsing exists, split:
fun userFlow(): Flow<User> =
    flow { emit(api.getUserRaw()) }
        .flowOn(Dispatchers.IO)
        .map { raw -> withContext(Dispatchers.Default) { parseUser(raw) } }
(For advanced readers: be mindful that map { withContext(...) } adds per-item context switching; in high-throughput streams you may want batching or a dedicated dispatcher.)
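One alternative sketch: since flowOn affects only the operators upstream of it (up to the previous flowOn), a second flowOn can move the parsing onto Default without per-item withContext:

fun userFlow(): Flow<User> =
    flow { emit(api.getUserRaw()) }
        .flowOn(Dispatchers.IO)          // upstream of this point: network on IO
        .map { raw -> parseUser(raw) }
        .flowOn(Dispatchers.Default)     // upstream of this point: parsing on Default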
Avoid “dispatcher ping-pong”
Context switching has overhead, so don’t sprinkle withContext everywhere “just in case” (a before/after sketch follows the heuristics below).
A good heuristic:
- Switch at layer boundaries (Repository / UseCase)
- Keep UI layer mostly dispatcher-agnostic
- Only switch when it changes the nature of work (blocking I/O vs compute vs UI)
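To make the heuristic concrete, a before/after sketch (processA/processB, Data, and Output are illustrative names for CPU-bound steps):

// ❌ Ping-pong: two dispatches to Default and two resumes back to the caller
suspend fun transform(input: Data): Output {
    val a = withContext(Dispatchers.Default) { processA(input) }
    return withContext(Dispatchers.Default) { processB(a) }
}

// ✅ One switch covering the whole compute step
suspend fun transformConsolidated(input: Data): Output =
    withContext(Dispatchers.Default) {
        processB(processA(input))
    }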
Closing thoughts
Dispatchers are not just “background threads”. They encode:
- scheduling strategy
- parallelism control
- fairness vs throughput trade-offs
- thread confinement guarantees (Main)
- and blocking compensation (IO)
A practical summary:
- Default: CPU-bound compute (keep runnable threads ≈ cores)
- IO: blocking I/O (can oversubscribe to compensate for blocked threads)
- Main: UI thread (rendering & state updates only)
- Unconfined: no guarantees (avoid in app code)
If you internalize the OS meaning of “blocking” (Blocked/Waiting vs Runnable), dispatcher selection becomes much more deterministic — and your app becomes faster and more stable under real load.
