Exception Handling & Cancellation in Kotlin Coroutines

If you have been using Kotlin Coroutines for a while, you have probably experienced at least one of these situations:

  • A coroutine keeps running even after you think it should have been cancelled
  • An exception crashes the entire scope unexpectedly
  • try/catch does not catch what you expect
  • Cancelling one task suddenly cancels everything else
  • API calls continue even after the screen is destroyed

These problems do not come from “wrong syntax”.
They come from an incomplete mental model of cancellation and exception handling.

This article is written for mid-level to senior Kotlin developers who already use coroutines but want to fully understand their behavior, especially in real-world Android or backend systems.

We will not just describe what happens, but why it happens.


1. How Coroutine Cancellation Really Works

Let’s start with the most important idea:

Coroutine cancellation is not a forceful stop.
It is a cooperative signal.

This single sentence explains most coroutine-related bugs.

Cancellation is not thread interruption

In traditional Java concurrency, cancellation is often associated with:

  • Thread.interrupt()
  • Forcefully stopping execution (the deprecated Thread.stop())
  • Unpredictable states

Kotlin Coroutines deliberately avoid this model.

When you cancel a coroutine:

  • No thread is killed
  • No stack frame is forcibly unwound
  • No code is stopped arbitrarily

Instead, cancellation works through structured concurrency and state propagation.

What actually happens when you call cancel()

Every coroutine has a Job in its CoroutineContext.

When you call:

job.cancel()

Internally, several things happen in a very controlled way:

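  1. The Job moves to the cancelling state (isActive becomes false)
  2. The cancellation signal propagates downward to all child jobs
  3. Cancellable suspending functions (delay, await, yield, and the rest of kotlinx.coroutines) check the job state
  4. If cancellation is detected, a CancellationException is thrown from that suspension point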

Nothing stops immediately.
Cancellation only takes effect when the coroutine reaches a cancellation check.

This design guarantees:

  • Safe cleanup
  • Predictable execution
  • No corrupted state
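
The sequence above can be sketched as a small JVM program. This is a minimal illustration, assuming kotlinx-coroutines-core is on the classpath; cancelDemo and the event strings are names invented for this example.

```kotlin
import kotlinx.coroutines.*

// Records what a child coroutine observes while it is cancelled mid-delay.
fun cancelDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            events += "started"
            delay(10_000)             // cancellable suspension point
            events += "finished"      // never reached
        } catch (e: CancellationException) {
            events += "cancelled"
            throw e                   // keep cancellation propagating
        } finally {
            events += "cleanup"       // cleanup still runs
        }
    }
    delay(100)      // let the child start and suspend inside delay()
    job.cancel()    // only requests cancellation; nothing is killed here
    job.join()      // wait until the child has actually finished
    events += "isCancelled=${job.isCancelled}"
    events
}
```

Calling cancelDemo() returns [started, cancelled, cleanup, isCancelled=true]: the child never reaches "finished", and the finally block still runs.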

2. Cooperative Cancellation: Why Your Coroutine Keeps Running

Because cancellation is cooperative, your code must cooperate.

Suspension points are implicit cancellation checks

Most coroutine APIs already cooperate with cancellation.

For example:

launch {
    delay(5_000)
    println("Done")
}

If the coroutine is cancelled during delay, it never reaches println.

Why?
Because delay:

  • Suspends the coroutine
  • Checks the job state before resuming
  • Throws CancellationException if cancelled

The same applies to:

  • withContext
  • yield
  • await
  • receive
  • Many Flow operators

The real problem: CPU-bound code

Now consider this:

launch {
    while (true) {
        doHeavyComputation()
    }
}

This coroutine:

  • Never suspends
  • Never checks cancellation
  • Will run forever

Even if the scope is cancelled, nothing happens.

This is the most common source of:

  • UI freezes
  • Battery drain
  • “Why is my coroutine still running?”

Making computation cancellable

You must explicitly check cancellation.

The simplest approach:

launch {
    while (isActive) {
        doHeavyComputation()
    }
}

Or more explicitly:

launch {
    while (true) {
        ensureActive()
        doHeavyComputation()
    }
}

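ensureActive():

  • Checks the job state
  • Throws CancellationException if cancelled
  • Unwinds the coroutine like any other exception, so finally blocks still run

This gives CPU-bound code the same cancellation check that a cancellable suspending function performs, without actually suspending. If the loop should also give other coroutines on the same dispatcher a chance to run, call yield() instead, which checks for cancellation as well.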
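
A runnable sketch of a cancellable busy loop, assuming kotlinx-coroutines-core is available. The function name cancellableLoopDemo and the square-root arithmetic (a stand-in for doHeavyComputation()) are made up for illustration.

```kotlin
import kotlinx.coroutines.*
import kotlin.math.sqrt

// Starts a busy loop on a background thread, cancels it, and reports the outcome.
fun cancellableLoopDemo(): Boolean = runBlocking {
    val job = launch(Dispatchers.Default) {
        var x = 0.0
        while (true) {
            ensureActive()        // throws CancellationException once the job is cancelled
            x += sqrt(x + 1.0)    // stand-in for doHeavyComputation()
        }
    }
    delay(50)
    job.cancelAndJoin()           // returns only because the loop checks for cancellation
    job.isCancelled
}
```

If the ensureActive() call is removed, cancelAndJoin() never returns, because the loop has no point at which it can notice the cancellation.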


3. CancellationException: Not an Error, but a Control Signal

One of the most misunderstood aspects of coroutines is CancellationException.

Cancellation is a normal termination

When a coroutine is cancelled, it throws CancellationException.

This is intentional.

Cancellation:

  • Is expected
  • Is part of normal lifecycle management
  • Should not be treated as a failure

For example, in Android:

viewModelScope.launch {
    repository.loadData()
}

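When the ViewModel is cleared (the screen is closed for good, not just rotated):

  • viewModelScope is cancelled
  • All child coroutines are cancelled
  • CancellationException is thrown at each coroutine's current or next suspension point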

This is correct behavior, not an error.

Why cancellation exceptions are ignored by default

Coroutine builders treat CancellationException differently:

  • They do not log it
  • They do not crash the app
  • They do not propagate it as a failure

This prevents:

  • Log pollution
  • False error reporting
  • Incorrect retry logic

If cancellation were treated like a normal exception, coroutine-based systems would be unusable.
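
A small sketch of that special treatment, assuming kotlinx-coroutines-core. The names cancelledChildDemo and handlerCalled are invented for this example. It cancels one child of a regular Job and checks that the parent scope is still active and that the CoroutineExceptionHandler was never invoked.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicBoolean

// Returns (parent scope still active, exception handler invoked).
fun cancelledChildDemo(): Pair<Boolean, Boolean> = runBlocking {
    val handlerCalled = AtomicBoolean(false)
    val handler = CoroutineExceptionHandler { _, _ -> handlerCalled.set(true) }
    val scope = CoroutineScope(Job() + handler)

    val child = scope.launch { delay(10_000) }
    child.cancelAndJoin()     // the child ends with a CancellationException

    val result = scope.isActive to handlerCalled.get()
    scope.cancel()            // tidy up the demo scope
    result
}
```

The function returns (true, false): the cancelled child did not fail its parent, and nothing was reported as an error.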


4. try/catch with Coroutines: What Actually Works

Many developers expect try/catch to behave the same way it does in synchronous code.
With coroutines, that assumption is often wrong.

try/catch does not cross coroutine boundaries

This code does not work:

try {
    launch {
        throw RuntimeException("Boom")
    }
} catch (e: Exception) {
    // Never called
}

Why?

Because:

  • launch starts a new coroutine and returns immediately
  • The exception is thrown later, inside that coroutine
  • It is reported through the Job hierarchy, not thrown to the code that called launch

Correct usage

You must place try/catch inside the coroutine:

launch {
    try {
        riskyOperation()
    } catch (e: Exception) {
        handleError(e)
    }
}

The dangerous mistake: catching cancellation

This is a subtle but serious bug:

launch {
    try {
        delay(10_000)
    } catch (e: Exception) {
        // This catches CancellationException!
    }
}

Here, cancellation is swallowed.
The coroutine:

  • Keeps executing whatever follows the try/catch until the next cancellation check
  • Breaks structured concurrency
  • Can lead to leaked work
Correct pattern

launch {
    try {
        delay(10_000)
    } catch (e: CancellationException) {
        throw e
    } catch (e: Exception) {
        handleError(e)
    }
}

Golden rule:

Always rethrow CancellationException
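
One way to apply the rule consistently is a small wrapper. runCatchingCancellable below is a hypothetical helper, not part of kotlinx.coroutines. It behaves like the standard runCatching, which also catches CancellationException, except that it lets cancellation through. rethrowDemo exists only to exercise it.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: capture ordinary failures, never cancellation.
suspend fun <T> runCatchingCancellable(block: suspend () -> T): Result<T> =
    try {
        Result.success(block())
    } catch (e: CancellationException) {
        throw e                      // cancellation must keep propagating
    } catch (e: Exception) {
        Result.failure(e)
    }

fun rethrowDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()

    // An ordinary failure is captured as a Result.
    val failed = runCatchingCancellable<Int> { error("boom") }
    events += "failure=${failed.exceptionOrNull()?.message}"

    // Cancellation is not captured: the coroutine stops inside the helper.
    val job = launch {
        runCatchingCancellable { delay(10_000) }
        events += "continued after cancellation"   // never reached
    }
    delay(50)
    job.cancelAndJoin()
    events += "cancelled=${job.isCancelled}"
    events
}
```

rethrowDemo() returns [failure=boom, cancelled=true].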


5. Job vs SupervisorJob: Failure Propagation Explained

Understanding the difference between Job and SupervisorJob requires understanding failure semantics.

Default behavior with Job

coroutineScope {
    launch { taskA() }
    launch { taskB() }
    launch { taskC() }
}

If taskB() throws an exception:

  • The parent scope is cancelled
  • taskA and taskC are cancelled
  • The entire scope fails

This is fail-fast behavior.

It is ideal when:

  • Tasks depend on each other
  • Partial results are meaningless
  • You want consistency

SupervisorJob changes one rule

supervisorScope {
    launch { taskA() }
    launch { taskB() }
    launch { taskC() }
}

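Now, if taskB() fails:

  • Only taskB terminates
  • The parent and siblings continue
  • The exception is not rethrown by supervisorScope; it goes to the CoroutineExceptionHandler in taskB's context, or to the thread's uncaught exception handler if there is none

This isolates failures, but it does not handle them for you.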
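
A minimal sketch of that isolation, assuming kotlinx-coroutines-core. supervisorDemo and the task messages are invented for this example. The failing child is given a CoroutineExceptionHandler so that its exception is reported somewhere instead of reaching the thread's uncaught exception handler.

```kotlin
import kotlinx.coroutines.*

// One child fails inside supervisorScope; its sibling still completes.
fun supervisorDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val handler = CoroutineExceptionHandler { _, e -> events += "handled: ${e.message}" }

    supervisorScope {
        // Stand-in for taskB(): fails immediately.
        launch(handler) { throw IllegalStateException("taskB failed") }
        // Stand-in for taskA(): keeps working despite the sibling's failure.
        launch {
            delay(100)
            events += "taskA finished"
        }
    }
    events
}
```

supervisorDemo() returns [handled: taskB failed, taskA finished]. With coroutineScope in place of supervisorScope, the second child would be cancelled and the exception would be rethrown to the caller.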

Why ViewModel uses SupervisorJob

In UI layers:

  • One failing request should not kill the entire screen
  • Independent UI components should continue working

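That is why viewModelScope uses SupervisorJob. Isolation is not handling, though: an uncaught exception in viewModelScope.launch still reaches the thread's uncaught exception handler and crashes the app unless you catch it or install a CoroutineExceptionHandler.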


6. Exception Propagation Rules: launch vs async

Exception propagation differs fundamentally between launch and async.

launch: fire-and-forget

launch {
    throw RuntimeException("Crash")
}

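If uncaught, the exception:

  • Cancels the parent (unless the parent is a supervisor)
  • Propagates up to the root coroutine
  • Is passed to the CoroutineExceptionHandler in the root coroutine's context, or to the thread's uncaught exception handler if none is installed

You cannot catch it with try/catch around launch.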
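
A sketch of this behavior with a regular Job, assuming kotlinx-coroutines-core. launchHandlerDemo and the event strings are invented for illustration. The handler is installed on the scope, so the failing coroutine, a direct child of the scope's Job, reports to it.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// An uncaught exception in launch reaches the handler and cancels the scope.
fun launchHandlerDemo(): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()   // written from background threads
    val handler = CoroutineExceptionHandler { _, e -> events += "handler: ${e.message}" }
    val scope = CoroutineScope(Job() + handler)

    val sibling = scope.launch { delay(10_000) }
    val failing = scope.launch { throw RuntimeException("Crash") }

    joinAll(failing, sibling)
    events += "sibling cancelled: ${sibling.isCancelled}"
    events += "scope active: ${scope.isActive}"
    events.toList()
}
```

launchHandlerDemo() returns [handler: Crash, sibling cancelled: true, scope active: false]. Because the scope uses a plain Job(), one failure takes the sibling and the scope down with it.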

async: deferred failure

val deferred = async {
    throw RuntimeException("Boom")
}

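At this point:

  • No exception is thrown at the call site
  • The coroutine has failed internally
  • If async is a child of a regular Job (for example inside coroutineScope), the parent is already being cancelled, whether or not await() is ever called

The exception is rethrown to the caller when you call:

deferred.await()

Wrapping await() in try/catch gives you:

  • Explicit error handling
  • Controlled propagation

The catch only contains the failure when the parent is a supervisor (supervisorScope, or a scope built with SupervisorJob) or when async is a root coroutine. Otherwise the parent has been cancelled regardless of the catch.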
Structured concurrency implication

coroutineScope {
    val a = async { loadA() }
    val b = async { loadB() }
    combine(a.await(), b.await())
}

If loadA() fails:

  • loadB() is cancelled
  • coroutineScope rethrows loadA's exception as soon as loadB has finished cancelling

This guarantees consistency over partial success.
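
The same shape as a runnable sketch, assuming kotlinx-coroutines-core. LoadException, structuredDemo, and the inline stand-ins for loadA() and loadB() are invented for this example. Because coroutineScope is a suspending call, a try/catch around it does see the failure.

```kotlin
import kotlinx.coroutines.*

// Hypothetical domain exception for the demo.
class LoadException(message: String) : Exception(message)

fun structuredDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    try {
        coroutineScope {
            // Stand-in for loadA(): fails after a short delay.
            val a = async<String> {
                delay(50)
                throw LoadException("loadA failed")
            }
            // Stand-in for loadB(): slow, and cancelled when its sibling fails.
            val b = async {
                try {
                    delay(10_000)
                    "B"
                } catch (e: CancellationException) {
                    events += "loadB cancelled"
                    throw e
                }
            }
            a.await() + b.await()
        }
    } catch (e: LoadException) {
        // The scope rethrows the original failure once all children are done.
        events += "scope failed: ${e.message}"
    }
    events
}
```

structuredDemo() returns [loadB cancelled, scope failed: loadA failed] after roughly 50 ms, not 10 seconds.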


7. Practical Example: Cancelling API Calls and Heavy Computation Safely

Let’s put everything together.

Problem

  • API request
  • CPU-heavy processing
  • User leaves screen
  • Work must stop promptly
  • No leaks
  • No swallowed cancellation

ViewModel implementation

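import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Repository, Data, Result, heavyStep, render and showError are app-specific.
class MyViewModel(
    private val repository: Repository
) : ViewModel() {

    fun loadData() {
        viewModelScope.launch {
            try {
                val data = repository.fetch()
                // Move the CPU-heavy part off the main thread.
                val result = withContext(Dispatchers.Default) {
                    process(data)
                }
                render(result)
            } catch (e: Exception) {
                // Cancellation is not an error: let it propagate.
                if (e is CancellationException) throw e
                showError(e)
            }
        }
    }

    // Must be suspend: ensureActive() needs the coroutine's context,
    // which a plain member function of a ViewModel cannot reach.
    private suspend fun process(data: Data): Result {
        repeat(1_000_000) {
            currentCoroutineContext().ensureActive()
            heavyStep(data)
        }
        return Result()
    }
}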

What happens on cancellation?

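  1. The ViewModel is cleared
  2. viewModelScope is cancelled
  3. The API call throws CancellationException (assuming repository.fetch() is a cancellable suspending function)
  4. The computation stops at its next ensureActive() check
  5. The coroutine exits at that point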
  6. No wasted CPU
  7. No memory leaks
  8. No incorrect error handling

This is correct coroutine design.


Final Thoughts

Kotlin Coroutines are not just a concurrency API.
They are a lifecycle-aware execution model.

If you understand:

  • Cancellation as cooperation
  • Why cancellation is not an error
  • How exceptions propagate structurally
  • When to isolate failures

You stop debugging coroutine behavior and start designing predictable systems.

Good coroutine code is not about syntax.
It is about respecting cancellation and structure.
