Exception Handling & Cancellation in Kotlin Coroutines

Exception handling and cancellation are among the most misunderstood parts of Kotlin Coroutines. Developers often run into questions like:

  • Why does a coroutine suddenly stop when one child fails?
  • Why are some exceptions ignored?
  • Why doesn’t my cancellation stop my long-running task?
  • What’s the difference between a Job and a SupervisorJob?
  • How do I safely cancel network requests or CPU-heavy work?

To master coroutines, you must understand how cancellation propagates, what exceptions mean, how structured concurrency affects error handling, and how Kotlin enforces “cooperative cancellation.”

In this article, we’ll explore these topics deeply:

  1. How coroutine cancellation works
  2. Cooperative cancellation
  3. Cancellation exceptions
  4. try/catch with coroutines
  5. SupervisorJob vs Job
  6. Exception propagation rules
  7. A practical example: safely cancelling API calls and long computations

Let’s begin.


How Coroutine Cancellation Works

Cancellation in coroutines is asynchronous, cooperative, and propagated by the Job hierarchy.

The key idea:

Cancelling a coroutine does NOT immediately kill the thread.
Instead, it signals the coroutine to stop at the next suspension point.

Every coroutine has a Job attached to it. When you call:

job.cancel()

You don’t “kill” the coroutine. Instead:

  1. The coroutine receives a cancellation signal.
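  2. The coroutine checks its state at each cancellable suspension point (like delay, network calls, I/O).
  3. If it detects cancellation, the suspension point throws CancellationException.
  4. The coroutine stops executing once that exception unwinds its body.
  5. Cancellation flows down the Job hierarchy: cancelling a parent cancels all of its children, while cancelling a single child with cancel() leaves its parent and siblings running.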
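
The steps above can be observed in a small experiment. This is only a sketch, assuming kotlinx-coroutines-core is on the classpath; the function name cancelParentDemo and the event strings are made up for illustration.

```kotlin
import kotlinx.coroutines.*

// Cancels a parent that has one child and records what each coroutine observes.
fun cancelParentDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val parent = launch {
        launch {
            try {
                delay(10_000)              // cancellable suspension point
            } catch (e: CancellationException) {
                events += "child cancelled"
                throw e                    // keep cancellation intact
            }
        }
        try {
            delay(10_000)
        } catch (e: CancellationException) {
            events += "parent cancelled"
            throw e
        }
    }
    delay(100)       // give both coroutines time to suspend in delay()
    parent.cancel()  // only requests cancellation
    parent.join()    // waits until parent and child have actually finished
    events += "parent.isCancelled=${parent.isCancelled}"
    events
}

fun main() {
    cancelParentDemo().forEach(::println)
}
```

Only the parent is cancelled explicitly here; the child stops because cancellation flows down the hierarchy.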

Cancellation is not like Thread.stop()

Kotlin does NOT forcefully interrupt threads because that leads to:

  • corrupted states
  • inconsistent memory
  • unreleased resources
  • deadlocks
  • impossible-to-debug behavior

Instead, Kotlin uses structured concurrency to safely propagate cancellation.


Cooperative Cancellation

Kotlin coroutines rely on cooperation. A coroutine must “check in” to see if it’s cancelled.

This means cancellation only works automatically when a coroutine reaches a suspension point.

Example: cooperative cancellation works

val job = launch {
    repeat(10) { i ->
        delay(500)
        println("Work $i")
    }
}

delay(1200)
job.cancel()

Output:

Work 0
Work 1

The coroutine is cancelled safely because delay checks for cancellation.


Where cancellation does NOT work automatically

If the coroutine runs CPU-heavy loops:

launch {
    for (i in 1..100_000_000) {
        // no suspension points here!
    }
}

Calling cancel() marks the job as cancelled, but the loop never checks for it, so the coroutine keeps running until the loop finishes.

Fix: check for cancellation manually

launch {
    for (i in 1..100_000_000) {
        ensureActive()  // throws CancellationException if cancelled
    }
}

Or:

coroutineContext.ensureActive()

Another option:

yield()  // introduces a suspension checkpoint
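
A third option is to poll isActive, which lets the loop exit without throwing. The sketch below assumes kotlinx-coroutines-core is available; countUntilCancelled is a hypothetical name used only for this illustration.

```kotlin
import kotlinx.coroutines.*

// Runs a busy loop on a background thread until the job is cancelled,
// then returns how many iterations it managed to do.
fun countUntilCancelled(): Long = runBlocking {
    var iterations = 0L
    val job = launch(Dispatchers.Default) {
        while (isActive) {   // cooperative check, no exception is thrown
            iterations++
        }
        // Reaching this point means the loop saw the cancellation request.
    }
    delay(50)
    job.cancelAndJoin()      // request cancellation and wait for the loop to exit
    iterations
}

fun main() {
    println("Iterations before cancellation: ${countUntilCancelled()}")
}
```

Compared with ensureActive(), this style suits loops that need to run some final code after the last iteration instead of unwinding with an exception.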

Why does cancellation require cooperation?

Because coroutines run on shared thread pools.
Forceful interruption could break other tasks using the same thread.


Cancellation Exceptions

When a coroutine is cancelled, Kotlin internally throws:

CancellationException

This is a lightweight exception used for control flow, not an error.

Example:

try {
    delay(1000)
} catch (e: CancellationException) {
    println("Cancelled!")
}

Two important facts:

  1. CancellationException does not crash your coroutine hierarchy.
    It is a normal part of cancellation.
  2. It is not reported as an error: exception handlers ignore it, so you only see it if you catch it yourself.

Why is it thrown?

It’s Kotlin’s way of unwinding coroutine state cleanly and predictably.


What if you accidentally catch it?

Bad:

try {
    delay(1000)
} catch (e: Exception) {
    // catches CancellationException too!
}

This unintentionally “swallows” cancellation.

Fix:

If you catch Exception, rethrow CancellationException:

try {
    delay(1000)
} catch (e: Exception) {
    if (e is CancellationException) throw e  // let cancellation through
    // handle real exception
}
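
If this pattern shows up in many places, it can be wrapped in a small helper. The function below is not part of kotlinx.coroutines; runCatchingCancellable is a hypothetical name, and the sketch assumes Kotlin 1.5 or later, where functions may return Result.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: behaves like runCatching for ordinary exceptions,
// but always lets CancellationException propagate.
suspend fun <T> runCatchingCancellable(block: suspend () -> T): Result<T> =
    try {
        Result.success(block())
    } catch (e: CancellationException) {
        throw e                 // cancellation is not a failure to be wrapped
    } catch (e: Exception) {
        Result.failure(e)       // real errors become a Result
    }

fun main() = runBlocking {
    val failed = runCatchingCancellable<Int> { error("network down") }
    println("isFailure = ${failed.isFailure}")

    val job = launch {
        runCatchingCancellable { delay(10_000) }
        println("not reached when cancelled")
    }
    delay(50)
    job.cancelAndJoin()
    println("job.isCancelled = ${job.isCancelled}")
}
```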


try/catch with Coroutines

Suspending functions throw exceptions just like normal functions.

Example:

launch {
    try {
        val user = api.getUser() // suspend
        println(user)
    } catch (e: Exception) {
        println("Failed: $e")
    }
}

try/catch works across suspension points

Even if the exception occurred after a suspension (e.g., after delay()), Kotlin correctly resumes the coroutine and throws into the right try/catch block.


Important: CancellationException is not an “error”

Because it’s not a failure, try/catch should usually not catch cancellation unless the coroutine needs to clean up resources.

Example:

launch {
    try {
        val data = api.load()
    } catch (e: CancellationException) {
        // Optional cleanup
        throw e  // Always rethrow!
    }
}

Why rethrow?

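Swallowing cancellation breaks structured concurrency:

  • the coroutine keeps running code after it was asked to stop
  • the parent waits for all of its children, so it cannot complete until that code finishes
  • resources leak and the UI may wait on work nobody needs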
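
When the only goal is cleanup, a finally block avoids the problem entirely because nothing is caught. A minimal sketch, assuming kotlinx-coroutines-core; cleanupDemo and the log messages are illustrative names.

```kotlin
import kotlinx.coroutines.*

// Shows that finally runs when the coroutine is cancelled mid-delay.
fun cleanupDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val job = launch {
        try {
            log += "resource opened"
            delay(10_000)
            log += "work finished"     // skipped when the coroutine is cancelled
        } finally {
            log += "resource closed"   // runs on success, failure and cancellation
        }
    }
    delay(50)
    job.cancelAndJoin()
    log
}

fun main() {
    println(cleanupDemo())  // [resource opened, resource closed]
}
```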

SupervisorJob vs Normal Job

A normal Job enforces fail-fast behavior:

  • If one child fails → parent is cancelled → all children are cancelled.

This is great for:

  • request/response flows
  • loading screens
  • atomic operations
  • parallel tasks that must all succeed

Example: normal Job failure propagation

coroutineScope {
    launch { error("Boom!") }
    launch { delay(1000); println("Never runs") }
}

The second child is cancelled as soon as the first one fails, and coroutineScope rethrows the exception to its caller.


SupervisorJob: children fail independently

With a supervisor, failure in one child:

  • does NOT cancel the parent
  • does NOT cancel siblings

Useful when:

  • tasks are independent
  • partial success is acceptable
  • UI components run in parallel

Example:

val scope = CoroutineScope(SupervisorJob())

scope.launch { error("Fail 1") }
scope.launch { delay(1000); println("Still running") }

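Output (the "Fail 1" exception is still reported to the uncaught exception handler unless you install a CoroutineExceptionHandler):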

Still running

SupervisorJob is very common on Android: viewModelScope is backed by one.
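
To see where the failing child's exception goes, the same setup can be run with a CoroutineExceptionHandler installed. This is a sketch under the assumption that kotlinx-coroutines-core is on the classpath; supervisorDemo and the log strings are invented for the example.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// One child fails, the sibling keeps running, and the handler receives the failure.
fun supervisorDemo(): List<String> = runBlocking {
    val log = CopyOnWriteArrayList<String>()   // written from background threads
    val handler = CoroutineExceptionHandler { _, e -> log += "handled: ${e.message}" }
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    val failing = scope.launch { error("Fail 1") }
    val healthy = scope.launch { delay(100); log += "Still running" }

    joinAll(failing, healthy)                  // join() does not rethrow the failure
    log += "scope active: ${scope.isActive}"
    scope.cancel()                             // a supervisor scope must be cancelled manually
    log.toList()
}

fun main() {
    supervisorDemo().forEach(::println)
}
```

The handler sits in the scope's context, so it acts as the last resort for any child started with launch.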


supervisorScope for structured concurrency

supervisorScope is the suspending, structured counterpart of SupervisorJob:

supervisorScope {
    launch { error("Fail") }
    launch { println("Still OK") }
}

Used inside suspend functions for isolated child failures.


Exception Propagation Rules

Kotlin’s exception propagation is strict and predictable.

Here are the key rules:


Rule 1: Failure in a child cancels the parent (normal Job)

coroutineScope {
    launch { error("Bad!") }
    launch { ... } // cancelled automatically
}


Rule 2: CancellationException does NOT propagate failure

A coroutine that ends with CancellationException counts as cancelled, not failed, so its parent and siblings keep running.


Rule 3: SupervisorJob stops upward failure propagation

  • child failures do not cancel siblings
  • child failures do not cancel the supervisor
  • cancelling the supervisor itself still cancels all of its children

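Rule 4: launch reports failures to its parent; join() does not rethrow them

Example:

val job = launch {
    error("Oops")
}

job.join()  // waits for completion; does NOT rethrow "Oops"

The exception travels up the Job hierarchy. At the root it reaches a CoroutineExceptionHandler or the thread's uncaught exception handler. join() itself only throws CancellationException, and only when the coroutine calling it has been cancelled.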
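
What join() does with a failed launch is easy to check in isolation. The sketch below assumes kotlinx-coroutines-core and launches into a separate scope so that the failure cannot cancel the caller; joinDemo is an illustrative name.

```kotlin
import kotlinx.coroutines.*

// Launches a failing coroutine in its own scope and joins it from outside.
fun joinDemo(): String = runBlocking {
    var reported: Throwable? = null
    val handler = CoroutineExceptionHandler { _, e -> reported = e }
    val scope = CoroutineScope(Job() + handler)   // independent of runBlocking's Job

    val job = scope.launch { error("Oops") }
    job.join()   // returns normally even though the coroutine failed

    "join returned; isCancelled=${job.isCancelled}; handler saw: ${reported?.message}"
}

fun main() {
    println(joinDemo())  // join returned; isCancelled=true; handler saw: Oops
}
```

The failure ends up in the handler, while the caller only learns about it through the job's state.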


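Rule 5: async stores the exception and rethrows it on await

val deferred = async { error("Boom") }
deferred.await() // throws here

This is similar to how Futures/Promises work, with one difference: under a regular Job the failed async still cancels its parent right away, without waiting for await(). To handle the failure with try/catch around await(), start the async inside supervisorScope or in a scope with a SupervisorJob.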
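
A small sketch of catching an async failure at await(), assuming kotlinx-coroutines-core. It uses supervisorScope so that the failing async does not cancel the surrounding scope; asyncDemo is a made-up name.

```kotlin
import kotlinx.coroutines.*

// The async fails early, but the exception only surfaces at await().
fun asyncDemo(): String = runBlocking {
    supervisorScope {
        val deferred = async<Int> { error("Boom") }
        delay(50)   // the async has already failed; nothing is thrown here
        try {
            deferred.await()
            "no exception"
        } catch (e: IllegalStateException) {
            "caught on await: ${e.message}"
        }
    }
}

fun main() {
    println(asyncDemo())  // caught on await: Boom
}
```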


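Rule 6: runBlocking rethrows exceptions to its caller

runBlocking blocks the calling thread until its body and all children complete. If any of them fails, it rethrows the exception on that thread, so a regular try/catch around the runBlocking call catches it.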
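
The following sketch shows a child failure surfacing at the runBlocking call site. It assumes kotlinx-coroutines-core; runBlockingDemo is an illustrative name.

```kotlin
import kotlinx.coroutines.*

// A failing child cancels the runBlocking scope; the original exception
// is then rethrown to the thread that called runBlocking.
fun runBlockingDemo(): String =
    try {
        runBlocking {
            launch { error("child failed") }  // fails and cancels the parent
            delay(1_000)                       // cancelled by the failing child
            "finished"
        }
    } catch (e: IllegalStateException) {
        "caught: ${e.message}"
    }

fun main() {
    println(runBlockingDemo())  // caught: child failed
}
```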


Example: Safely Cancelling API Calls or Long Computations

Let’s build a real example that demonstrates:

  • cancellation
  • exception handling
  • supervisor scope
  • safe cleanup

Scenario

A ViewModel loads:

  • user info
  • user posts

The user may navigate away, cancelling the coroutine.
The API calls must stop immediately, not waste bandwidth.
A failure in posts should not cancel user loading.


Step 1: Make API suspendable and cancellable

Retrofit suspend functions are automatically cancellable.

For custom callbacks, use:

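import kotlinx.coroutines.suspendCancellableCoroutine
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

suspend fun Api.getUserCancellable(): User =
    suspendCancellableCoroutine { cont ->

        // Start the callback-based request and keep its handle
        val call = getUserAsync(
            onSuccess = { cont.resume(it) },
            onError = { cont.resumeWithException(it) }
        )

        // Runs when the coroutine is cancelled while waiting for the callback
        cont.invokeOnCancellation {
            call.cancel()  // stops HTTP call
        }
    }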

This ensures:

  • Cancelling the coroutine cancels the network call.
  • No wasted threads.
  • No callback firing after the screen is gone.
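
The bridge can be tried without a real network stack. In the sketch below, FakeApi and FakeCall are hypothetical stand-ins for a callback-based client, and the request deliberately never completes so that only cancellation can end it.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical request handle that only records whether it was cancelled.
class FakeCall {
    @Volatile var cancelled = false
    fun cancel() { cancelled = true }
}

// Hypothetical callback API whose request never calls back.
class FakeApi {
    val lastCall = FakeCall()
    fun getUserAsync(onSuccess: (String) -> Unit, onError: (Throwable) -> Unit): FakeCall =
        lastCall
}

suspend fun FakeApi.getUser(): String = suspendCancellableCoroutine { cont ->
    val call = getUserAsync(
        onSuccess = { cont.resume(it) },
        onError = { cont.resumeWithException(it) }
    )
    cont.invokeOnCancellation { call.cancel() }  // forward cancellation to the request
}

// Returns true if cancelling the coroutine also cancelled the underlying call.
fun bridgeDemo(): Boolean = runBlocking {
    val api = FakeApi()
    val job = launch { api.getUser() }
    delay(50)
    job.cancelAndJoin()
    api.lastCall.cancelled
}

fun main() {
    println("underlying call cancelled: ${bridgeDemo()}")  // true
}
```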

Step 2: Use supervisorScope to run tasks independently

viewModelScope.launch {
    supervisorScope {
        val userDeferred = async { api.getUserCancellable() }
        val postsDeferred = async { api.getPostsCancellable() }

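        val user = try { userDeferred.await() }
            catch (e: CancellationException) { throw e }  // never swallow cancellation
            catch (e: Exception) { null }
        val posts = try { postsDeferred.await() }
            catch (e: CancellationException) { throw e }
            catch (e: Exception) { null }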

        state.value = UiState(user, posts)
    }
}

Why this works:

  1. If user fails → posts still loads
  2. If posts fail → user still loads
  3. If ViewModel is cleared → both calls are cancelled
  4. Cancellation stops HTTP requests
  5. Each await() has its own try/catch, so a failed call becomes a null field instead of crashing the screen
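
The partial-success behaviour can be reproduced outside Android. This sketch replaces the ViewModel and the API with plain functions; awaitOrNull, loadScreen, and the fake data are assumptions made for the example, not part of any library.

```kotlin
import kotlinx.coroutines.*
import java.io.IOException

data class UiState(val user: String?, val posts: List<String>?)

// Hypothetical helper: a failed Deferred becomes null, cancellation is rethrown.
suspend fun <T> Deferred<T>.awaitOrNull(): T? =
    try {
        await()
    } catch (e: CancellationException) {
        throw e
    } catch (e: Exception) {
        null
    }

// Loads two things in parallel; one of the fake calls fails.
fun loadScreen(): UiState = runBlocking {
    supervisorScope {
        val user = async<String> { delay(20); "Ada" }
        val posts = async<List<String>> { delay(10); throw IOException("posts endpoint down") }
        UiState(user.awaitOrNull(), posts.awaitOrNull())
    }
}

fun main() {
    println(loadScreen())  // UiState(user=Ada, posts=null)
}
```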

Step 3: Handling long CPU computations

To make CPU loops cancellable:

suspend fun heavyComputation(): Long = withContext(Dispatchers.Default) {
    var sum = 0L  // Long: the total (about 5 * 10^17) would overflow an Int
    for (i in 1..1_000_000_000) {
        ensureActive()  // enables cancellation
        sum += i
    }
    sum
}

If the coroutine is cancelled, the computation stops at the next ensureActive() call.
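
Checking on every iteration adds some overhead to a tight loop. One possible variation, shown here as a sketch, checks only every millionth iteration; the interval and the name sumUpTo are arbitrary choices for illustration, and the right interval depends on how quickly the work must stop.

```kotlin
import kotlinx.coroutines.*

// Sums 1..n on a background thread, checking for cancellation periodically.
suspend fun sumUpTo(n: Long): Long = withContext(Dispatchers.Default) {
    var sum = 0L
    for (i in 1L..n) {
        if (i % 1_000_000L == 0L) ensureActive()  // periodic cancellation check
        sum += i
    }
    sum
}

fun main() = runBlocking {
    println(sumUpTo(10))                            // 55

    val job = launch { sumUpTo(Long.MAX_VALUE) }    // would run practically forever
    delay(50)
    job.cancelAndJoin()                             // stops at the next periodic check
    println("cancelled: ${job.isCancelled}")
}
```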


Step 4: Cancelling from UI

For example, a button that stops work:

private var job: Job? = null

fun startWork() {
    job = viewModelScope.launch {
        try {
            val result = heavyComputation()
            state.value = Result(result)
        } catch (e: CancellationException) {
            // optional cleanup
            throw e  // always rethrow
        }
    }
}

fun cancelWork() {
    job?.cancel()
}


Conclusion

Exception handling and cancellation in Kotlin Coroutines are powerful but require understanding how structured concurrency and cooperative cancellation work.

You learned:

  • Cancellation signals coroutines to stop—they must cooperate
  • Suspension points automatically check for cancellation
  • Cancellation uses CancellationException, which is normal
  • try/catch works across suspending functions
  • Normal Job = fail-fast
  • SupervisorJob = independent failures
  • async rethrows exceptions on await()
  • You can safely cancel network requests and computations

Key takeaways

  • Cancellation is not a “kill switch”—it’s a signal
  • Never swallow CancellationException
  • Use SupervisorJob when children should survive failures
  • Use coroutineScope when children should fail together
  • Always handle cancellation for long CPU work
  • Retrofit suspend functions are automatically cancellable
