If you have been using Kotlin Coroutines for a while, you have probably experienced at least one of these situations:
- A coroutine keeps running even after you think it should have been cancelled
- An exception crashes the entire scope unexpectedly
- try/catch does not catch what you expect
- Cancelling one task suddenly cancels everything else
- API calls continue even after the screen is destroyed
These problems do not come from “wrong syntax”.
They come from an incomplete mental model of cancellation and exception handling.
This article is written for mid-level and senior Kotlin developers who already use coroutines but want to fully understand their behavior, especially in real-world Android or backend systems.
We will not just describe what happens, but why it happens.
1. How Coroutine Cancellation Really Works
Let’s start with the most important idea:
Coroutine cancellation is not a forceful stop.
It is a cooperative signal.
This single sentence explains most coroutine-related bugs.
Cancellation is not thread interruption
In traditional Java concurrency, cancellation is often associated with:
- Thread.interrupt()
- Forcefully stopping execution
- Unpredictable states
Kotlin Coroutines deliberately avoid this model.
When you cancel a coroutine:
- No thread is killed
- No stack frame is forcibly unwound
- No code is stopped arbitrarily
Instead, cancellation works through structured concurrency and state propagation.
What actually happens when you call cancel()
Every coroutine has a Job in its CoroutineContext.
When you call:
job.cancel()
Internally, several things happen in a very controlled way:
- The Job moves to the cancelling state (isActive becomes false)
- The cancellation signal propagates downward to all child jobs
- Cancellable suspending functions check the job state when they suspend or resume
- If cancellation is detected, a CancellationException is thrown
Nothing stops immediately.
Cancellation only takes effect when the coroutine reaches a cancellation check.
This design guarantees:
- Safe cleanup
- Predictable execution
- No corrupted state
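The steps above can be observed in a small experiment. The following is a minimal sketch, assuming kotlinx-coroutines-core is on the classpath; the function name `cancelPropagationDemo` and the event strings are made up for illustration. It cancels a parent job and records what the child sees at its suspension point.

```kotlin
import kotlinx.coroutines.*

// Cancels a parent that has one child and records what each coroutine observed.
fun cancelPropagationDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>() // safe here: runBlocking runs everything on one thread
    val parent = launch {
        launch {
            try {
                delay(10_000) // cancellable suspension point
                events.add("child finished")
            } catch (e: CancellationException) {
                events.add("child cancelled")
                throw e // keep cancellation propagating
            } finally {
                events.add("child cleanup")
            }
        }
        delay(10_000)
        events.add("parent finished")
    }
    delay(100)      // let parent and child reach their delay() calls
    parent.cancel() // moves the Job to cancelling and signals the child
    parent.join()   // waits until parent and child have actually finished
    events.add("parent.isCancelled=${parent.isCancelled}")
    events
}

fun main() {
    // Prints: [child cancelled, child cleanup, parent.isCancelled=true]
    println(cancelPropagationDemo())
}
```

Neither "finished" event is ever recorded, and the child's finally block still runs, which is the safe-cleanup guarantee in practice.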
2. Cooperative Cancellation: Why Your Coroutine Keeps Running
Because cancellation is cooperative, your code must cooperate.
Suspension points are implicit cancellation checks
Most coroutine APIs already cooperate with cancellation.
For example:
launch {
    delay(5_000)
    println("Done")
}
If the coroutine is cancelled during delay, it never reaches println.
Why?
Because delay:
- Suspends the coroutine
- Checks the job state before resuming
- Throws CancellationException if cancelled
The same applies to:
- withContext
- yield
- await
- receive
- Many Flow operators
The real problem: CPU-bound code
Now consider this:
launch {
    while (true) {
        doHeavyComputation()
    }
}
This coroutine:
- Never suspends
- Never checks cancellation
- Will run forever
Even if the scope is cancelled, nothing happens.
This is the most common source of:
- UI freezes
- Battery drain
- “Why is my coroutine still running?”
Making computation cancellable
You must explicitly check cancellation.
The simplest approach:
launch {
    while (isActive) {
        doHeavyComputation()
    }
}
Or more explicitly:
launch {
    while (true) {
        ensureActive()
        doHeavyComputation()
    }
}
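ensureActive():
- Checks the job state
- Throws CancellationException if cancelled
- Exits the coroutine by unwinding the stack, so finally blocks still run
This makes CPU-bound code respond to cancellation the way a cancellable suspending function does. The two versions differ in one respect: the isActive loop ends normally, so any code after it still runs, while ensureActive() throws and skips it.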
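To compare the two checks side by side, here is a minimal sketch, assuming kotlinx-coroutines-core is available. `heavyStep` is a hypothetical stand-in for real CPU-bound work, and the function names are illustrative only. Each function reports whether the code placed after the loop was reached once the job was cancelled.

```kotlin
import kotlinx.coroutines.*

// Hypothetical stand-in for one unit of CPU-bound work.
fun heavyStep(seed: Long): Long {
    var x = seed
    repeat(10_000) { x = x * 31 + it }
    return x
}

// Variant 1: polling isActive. The loop ends normally, so the line after it runs.
fun isActiveLoopRunsCodeAfterLoop(): Boolean = runBlocking {
    var reachedAfterLoop = false
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        started.complete(Unit)
        var acc = 0L
        while (isActive) acc = heavyStep(acc)
        reachedAfterLoop = true
    }
    started.await()     // make sure the coroutine is running before cancelling
    job.cancelAndJoin()
    reachedAfterLoop
}

// Variant 2: ensureActive() throws CancellationException, so the line after the loop is skipped.
fun ensureActiveLoopSkipsCodeAfterLoop(): Boolean = runBlocking {
    var reachedAfterLoop = false
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        started.complete(Unit)
        var acc = 0L
        repeat(Int.MAX_VALUE) {
            ensureActive()
            acc = heavyStep(acc)
        }
        reachedAfterLoop = true
    }
    started.await()
    job.cancelAndJoin()
    reachedAfterLoop
}

fun main() {
    println(isActiveLoopRunsCodeAfterLoop())      // true
    println(ensureActiveLoopSkipsCodeAfterLoop()) // false
}
```

Which variant fits depends on whether the code after the loop is cleanup that should always run (better placed in a finally block) or further work that should be skipped on cancellation.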
3. CancellationException: Not an Error, but a Control Signal
One of the most misunderstood aspects of coroutines is CancellationException.
Cancellation is a normal termination
When a coroutine is cancelled, its next cancellation check throws CancellationException.
This is intentional.
Cancellation:
- Is expected
- Is part of normal lifecycle management
- Should not be treated as a failure
For example, in Android:
viewModelScope.launch {
    repository.loadData()
}
When the ViewModel is cleared (the screen is finished, not merely rotated):
- viewModelScope is cancelled
- All child coroutines are cancelled
- CancellationException is thrown
This is correct behavior, not an error.
Why cancellation exceptions are ignored by default
Coroutine builders treat CancellationException differently:
- They do not log it
- They do not crash the app
- They do not propagate it as a failure
This prevents:
- Log pollution
- False error reporting
- Incorrect retry logic
If cancellation were treated like a normal exception, coroutine-based systems would be unusable.
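A quick way to check this difference is to count how often a CoroutineExceptionHandler is invoked. The sketch below assumes kotlinx-coroutines-core; the function name `handlerCallCounts` and the scope setup are illustrative, not taken from any particular app.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Returns how many times the handler had been called after a cancellation,
// and then after a real failure.
fun handlerCallCounts(): Pair<Int, Int> = runBlocking {
    val reported = AtomicInteger(0)
    val handler = CoroutineExceptionHandler { _, _ -> reported.incrementAndGet() }
    // SupervisorJob so the failing coroutine below does not cancel the whole scope.
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    val cancelled = scope.launch { delay(10_000) }
    cancelled.cancelAndJoin()
    val afterCancellation = reported.get() // cancellation is not reported

    val failed = scope.launch { throw IllegalStateException("real failure") }
    failed.join()
    val afterFailure = reported.get()      // a real exception is reported once

    scope.cancel()
    afterCancellation to afterFailure
}

fun main() {
    println(handlerCallCounts()) // (0, 1)
}
```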
4. try/catch with Coroutines: What Actually Works
Many developers expect try/catch to behave the same way it does in synchronous code.
With coroutines, that assumption is often wrong.
try/catch does not cross coroutine boundaries
This code does not work:
try {
    launch {
        throw RuntimeException("Boom")
    }
} catch (e: Exception) {
    // Never called
}
Why?
Because:
- launch starts a new coroutine and returns immediately
- The exception is thrown later, inside that coroutine
- The outer try/catch has usually already exited by then
Correct usage
You must place try/catch inside the coroutine:
launch {
    try {
        riskyOperation()
    } catch (e: Exception) {
        handleError(e)
    }
}
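The boundary rule can be verified with a small experiment. This is a sketch under assumptions: kotlinx-coroutines-core is available, and the function names are invented for the example. The first function wraps launch in try/catch and shows that only the CoroutineExceptionHandler sees the failure. The second shows one case where catching at the call site does work: coroutineScope rethrows a child's failure to its caller.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// try/catch around launch: the catch block is never entered.
fun catchAroundLaunch(): List<String> = runBlocking {
    val seen = CopyOnWriteArrayList<String>()
    val handler = CoroutineExceptionHandler { _, e -> seen.add("handler: ${e.message}") }
    val scope = CoroutineScope(Job() + Dispatchers.Default + handler)
    try {
        // join() waits for completion but does not rethrow the coroutine's failure
        scope.launch { throw RuntimeException("Boom") }.join()
    } catch (e: Exception) {
        seen.add("outer catch: ${e.message}")
    }
    seen.toList()
}

// try/catch around coroutineScope: the scope rethrows a child's failure to its caller.
suspend fun catchAroundCoroutineScope(): String =
    try {
        coroutineScope {
            launch { throw RuntimeException("Boom") }
        }
        "not caught"
    } catch (e: RuntimeException) {
        "caught: ${e.message}"
    }

fun main() {
    println(catchAroundLaunch())                          // [handler: Boom]
    println(runBlocking { catchAroundCoroutineScope() })  // caught: Boom
}
```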
The dangerous mistake: catching cancellation
This is a subtle but serious bug:
launch {
    try {
        delay(10_000)
    } catch (e: Exception) {
        // This catches CancellationException!
    }
}
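Here, cancellation is swallowed: any code after the catch block, such as a loop or a retry, keeps running in a coroutine that should have stopped.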
The coroutine:
- Stops cancelling properly
- Breaks structured concurrency
- Can lead to leaked work
Correct pattern
launch {
    try {
        delay(10_000)
    } catch (e: CancellationException) {
        throw e
    } catch (e: Exception) {
        handleError(e)
    }
}
Golden rule:
Always rethrow CancellationException.
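To see what swallowing costs, the following sketch counts how many loop iterations still complete after the coroutine has been cancelled. It assumes kotlinx-coroutines-core; `iterationsAfterCancel` and its flag are hypothetical names, and delay stands in for real suspending work.

```kotlin
import kotlinx.coroutines.*

// Counts loop iterations that still complete after the coroutine was cancelled.
fun iterationsAfterCancel(rethrowCancellation: Boolean): Int = runBlocking {
    var completed = 0
    val job = launch {
        repeat(5) {
            try {
                delay(1_000) // stand-in for real suspending work
            } catch (e: Exception) {
                if (rethrowCancellation && e is CancellationException) throw e
                // otherwise every exception, cancellation included, is swallowed here
            }
            completed++
        }
    }
    yield()             // let the job start and suspend in its first delay()
    job.cancelAndJoin() // no iteration has completed before this point
    completed
}

fun main() {
    println(iterationsAfterCancel(rethrowCancellation = false)) // 5
    println(iterationsAfterCancel(rethrowCancellation = true))  // 0
}
```

In the swallowing case every later delay() throws again right away, because the job is already cancelled, so the loop runs through all of its remaining iterations without actually waiting.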
5. Job vs SupervisorJob: Failure Propagation Explained
Understanding the difference between Job and SupervisorJob requires understanding failure semantics.
Default behavior with Job
coroutineScope {
    launch { taskA() }
    launch { taskB() }
    launch { taskC() }
}
If taskB() throws an exception:
- The parent scope is cancelled
- taskA and taskC are cancelled
- The entire scope fails
This is fail-fast behavior.
It is ideal when:
- Tasks depend on each other
- Partial results are meaningless
- You want consistency
SupervisorJob changes one rule
supervisorScope {
    launch { taskA() }
    launch { taskB() }
    launch { taskC() }
}
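Now, if taskB() fails:
- Only taskB terminates
- The parent and siblings continue
- The exception goes to the CoroutineExceptionHandler in taskB's context, or to the thread's uncaught exception handler if there is none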
This isolates failures.
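The two propagation modes can be compared directly. The sketch below assumes kotlinx-coroutines-core and a single-threaded caller such as runBlocking; `runTasks`, the task bodies, and the log strings are all invented for illustration. A CoroutineExceptionHandler is attached to the failing child so that, under supervisorScope, its exception has somewhere to go.

```kotlin
import kotlinx.coroutines.*

// Runs three tasks under either supervisorScope or coroutineScope and records what completed.
suspend fun runTasks(isolateFailures: Boolean): List<String> {
    val log = mutableListOf<String>() // assumes a single-threaded caller such as runBlocking
    val handler = CoroutineExceptionHandler { _, e -> log.add("handler saw: ${e.message}") }
    val tasks: suspend CoroutineScope.() -> Unit = {
        launch { delay(100); log.add("A done") }
        launch(handler) { throw IllegalStateException("B failed") }
        launch { delay(100); log.add("C done") }
    }
    try {
        if (isolateFailures) supervisorScope(tasks) else coroutineScope(tasks)
    } catch (e: IllegalStateException) {
        // Only coroutineScope rethrows the child's failure to its caller.
        log.add("scope failed: ${e.message}")
    }
    return log.toList()
}

fun main() = runBlocking {
    println(runTasks(isolateFailures = true))  // [handler saw: B failed, A done, C done]
    println(runTasks(isolateFailures = false)) // [scope failed: B failed]
}
```

Under coroutineScope the handler is never consulted, because the child hands its failure to the parent scope instead.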
Why ViewModel uses SupervisorJob
In UI layers:
- One failing request should not kill the entire screen
- Independent UI components should continue working
That is why viewModelScope uses SupervisorJob.
6. Exception Propagation Rules: launch vs async
Exception propagation differs fundamentally between launch and async.
launch: fire-and-forget
launch {
    throw RuntimeException("Crash")
}
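If uncaught, the exception:
- Cancels the parent (unless the parent is a supervisor)
- Is passed to the CoroutineExceptionHandler of the root coroutine; a handler installed on a nested child is ignored
You cannot catch it with a try/catch around the launch call.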
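Where the handler has to be installed is easy to get wrong, so here is a small check. It is a sketch assuming kotlinx-coroutines-core, with hypothetical names throughout: one handler sits on the scope (and is inherited by the root coroutine), another on a nested launch, and the function reports which one actually runs.

```kotlin
import kotlinx.coroutines.*

// Reports which CoroutineExceptionHandler receives a failure thrown in a nested launch.
fun whichHandlerRuns(): String = runBlocking {
    val called = CompletableDeferred<String>()
    val rootHandler = CoroutineExceptionHandler { _, _ -> called.complete("root handler") }
    val childHandler = CoroutineExceptionHandler { _, _ -> called.complete("child handler") }

    val scope = CoroutineScope(Job() + Dispatchers.Default + rootHandler)
    val root = scope.launch {
        // The nested coroutine hands its failure to its parent, so childHandler is not used.
        launch(childHandler) { throw RuntimeException("Crash") }
        delay(10_000) // cancelled when the child fails
    }
    root.join()
    called.await()
}

fun main() {
    println(whichHandlerRuns()) // root handler
}
```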
async: deferred failure
val deferred = async {
    throw RuntimeException("Boom")
}
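At this point:
- No exception is thrown at the call site
- The coroutine has failed internally
The exception is rethrown to the caller when you call:
deferred.await()
For a root async, or one started under a SupervisorJob or supervisorScope, that is the only place it surfaces. As a child of a regular Job, a failed async also cancels its parent right away, whether or not await() is ever called.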
This allows:
- Explicit error handling
- Controlled propagation
Structured concurrency implication
coroutineScope {
    val a = async { loadA() }
    val b = async { loadB() }
    combine(a.await(), b.await())
}
If loadA() fails:
- loadB() is cancelled
- The scope fails immediately
This guarantees consistency over partial success.
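How a failed async surfaces depends on its parent, which the following sketch tries to make visible. It assumes kotlinx-coroutines-core; both function names are made up. In the first function the Deferred is never awaited, yet the enclosing coroutineScope still fails. In the second, supervisorScope keeps running and the exception appears only at await().

```kotlin
import kotlinx.coroutines.*

// A failed async under coroutineScope fails the scope even if await() is never called.
suspend fun asyncUnderCoroutineScope(): String =
    try {
        coroutineScope {
            async<Int> { throw IllegalStateException("Boom") } // never awaited
            delay(1_000) // cancelled as soon as the async child fails
            "body completed"
        }
    } catch (e: IllegalStateException) {
        "scope failed: ${e.message}"
    }

// Under supervisorScope the failure stays inside the Deferred until await() is called.
suspend fun asyncUnderSupervisorScope(): String =
    supervisorScope {
        val deferred = async<Int> { throw IllegalStateException("Boom") }
        delay(100) // the async has already failed, but nothing is thrown here
        try {
            deferred.await().toString()
        } catch (e: IllegalStateException) {
            "await threw: ${e.message}"
        }
    }

fun main() = runBlocking {
    println(asyncUnderCoroutineScope())  // scope failed: Boom
    println(asyncUnderSupervisorScope()) // await threw: Boom
}
```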
7. Practical Example: Cancelling API Calls and Heavy Computation Safely
Let’s put everything together.
Problem
- API request
- CPU-heavy processing
- User leaves screen
- Work must stop promptly
- No leaks
- No swallowed cancellation
ViewModel implementation
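import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Repository, Data, Result, heavyStep, render and showError are app-specific placeholders.
class MyViewModel(
    private val repository: Repository
) : ViewModel() {

    fun loadData() {
        viewModelScope.launch {
            try {
                val data = repository.fetch()
                // Move the CPU-heavy part off the main thread
                val result = withContext(Dispatchers.Default) {
                    process(data)
                }
                render(result)
            } catch (e: Exception) {
                if (e is CancellationException) throw e // never swallow cancellation
                showError(e)
            }
        }
    }

    // Must be a suspend function: ensureActive() needs the caller's coroutine context.
    private suspend fun process(data: Data): Result {
        repeat(1_000_000) {
            currentCoroutineContext().ensureActive()
            heavyStep(data)
        }
        return Result()
    }
}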
What happens on cancellation?
- Screen is destroyed
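- viewModelScope is cancelled
- The API call throws CancellationException (assuming repository.fetch() is a cancellable suspending function)
- The computation detects cancellation at its next ensureActive() call
- The coroutine exits at that point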
- No wasted CPU
- No memory leaks
- No incorrect error handling
This is correct coroutine design.
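The same flow can be tried outside Android. What follows is a rough, non-authoritative sketch: `FakeScreen` is a hypothetical class, a plain CoroutineScope with a SupervisorJob stands in for viewModelScope, delay stands in for repository.fetch(), and counters replace render and showError. Only kotlinx-coroutines-core is assumed.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a ViewModel; none of these names come from Android.
class FakeScreen {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val rendered = AtomicInteger(0)
    val errorsShown = AtomicInteger(0)
    val processingStarted = CompletableDeferred<Unit>()

    fun loadData(): Job = scope.launch {
        try {
            delay(10)  // stand-in for repository.fetch()
            process()  // CPU-heavy step with explicit cancellation checks
            rendered.incrementAndGet()
        } catch (e: CancellationException) {
            throw e    // cancellation is not an error
        } catch (e: Exception) {
            errorsShown.incrementAndGet()
        }
    }

    private suspend fun process(): Long {
        processingStarted.complete(Unit)
        var acc = 0L
        repeat(Int.MAX_VALUE) { i ->
            currentCoroutineContext().ensureActive() // cancellation check on every step
            acc += i
        }
        return acc
    }

    fun onDestroy() = scope.cancel() // what clearing a ViewModel does to viewModelScope
}

// Destroys the screen in the middle of processing and reports the outcome.
fun cancelledScreenOutcome(): Triple<Boolean, Int, Int> = runBlocking {
    val screen = FakeScreen()
    val job = screen.loadData()
    screen.processingStarted.await()
    screen.onDestroy()
    job.join()
    Triple(job.isCancelled, screen.rendered.get(), screen.errorsShown.get())
}

fun main() {
    println(cancelledScreenOutcome()) // (true, 0, 0)
}
```

The job ends up cancelled, nothing is rendered, and no error is shown, which matches the checklist above.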
Final Thoughts
Kotlin Coroutines are not just a concurrency API.
They are a lifecycle-aware execution model.
The key ideas are:
- Cancellation as cooperation
- Why cancellation is not an error
- How exceptions propagate structurally
- When to isolate failures
Once these are clear, you stop debugging coroutine behavior and start designing predictable systems.
Good coroutine code is not about syntax.
It is about respecting cancellation and structure.
