Structured Concurrency in Depth

Concurrency is not difficult because we don’t know how to run code in parallel.
It is difficult because we don’t know when that code should stop, who owns it, and what happens when something goes wrong.

Kotlin’s coroutine library adopted structured concurrency to solve exactly those problems.
Not as syntactic sugar, but as a constraint built into how coroutines are launched.

This article explains structured concurrency from first principles, focusing on why it exists, how it actually works, and how to avoid subtle production bugs.


Why Structured Concurrency Exists

Before structured concurrency, coroutine-based code often looked orderly on the page but behaved unpredictably at runtime.

Developers could launch coroutines freely, but there was no enforced relationship between:

  • the code that launched the coroutine
  • the lifetime of the object that owned it
  • the work happening in the background

This caused several recurring problems.

Work Continued After the Owner Was Gone

In Android, this usually happened when background work outlived a screen or ViewModel.

fun loadData() {
    GlobalScope.launch {
        val data = api.fetch()
        view.show(data)
    }
}

At first glance this looks fine, but the coroutine has no connection to the UI lifecycle.
If the screen is destroyed, the coroutine keeps running and may update a dead view or leak memory.

The root problem is not threading — it is lack of ownership.


Child Tasks Were Hard to Reason About

Consider this code:

launch {
    launch { taskA() }
    launch { taskB() }
}

Without structured concurrency, it was unclear:

  • whether taskA and taskB belonged to the parent
  • whether the parent waited for them
  • what should happen if one of them failed

The code looked structured, but the runtime had no obligation to enforce that structure.


Cancellation Was Unreliable

Developers often cancelled only the top-level job, assuming everything would stop.

job.cancel()

In reality:

  • child coroutines could continue
  • nested work might ignore cancellation
  • cleanup logic was inconsistent

Cancellation became something you hoped would work, not something you could rely on.


Exceptions Were Difficult to Control

An exception in a coroutine could:

  • crash the app
  • silently disappear
  • cancel unrelated work

Without clear rules, error handling turned into defensive programming with excessive try/catch, often breaking cancellation semantics.


What Structured Concurrency Guarantees

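Structured concurrency fixes these problems with three guarantees.

First, every coroutine is launched in a CoroutineScope and, by default, becomes a child of that scope’s Job.
A coroutine is always part of a hierarchy unless you deliberately opt out.

Second, cancelling a scope cancels all coroutines inside it.
The cancellation request is guaranteed to reach every child; each coroutine still has to cooperate to actually stop, as the section on cancelling children explains.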

Third, a parent coroutine does not complete until all its children complete.
Asynchronous code now behaves like structured, synchronous code.

Together, these rules make concurrency predictable.


CoroutineScope Rules

A CoroutineScope is not just a convenience API.
It represents a lifetime boundary.

When you create a scope, you are declaring:

“Any coroutine launched here belongs to this lifecycle.”

A Scope Must Have a Job

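The Job is what binds coroutines together.

val scope = CoroutineScope(Job() + Dispatchers.IO)

If the context you pass contains no Job, the CoroutineScope() factory function adds one for you, so a scope created this way always has a hierarchy and structured cancellation.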
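
As a minimal sketch of that binding, assuming kotlinx-coroutines-core is on the classpath: the two helpers below (scopeAlwaysHasJob and launchedCoroutineIsChild are hypothetical names) check that the CoroutineScope() factory supplies a Job when the context has none, and that a launched coroutine is registered as a child of the scope’s Job.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: the context passed in has no Job, yet the scope ends up with one.
fun scopeAlwaysHasJob(): Boolean {
    val scope = CoroutineScope(Dispatchers.Default)
    return scope.coroutineContext[Job] != null
}

// Hypothetical helper: a coroutine launched in a scope is attached to the scope's Job.
fun launchedCoroutineIsChild(): Boolean {
    val scopeJob = Job()
    val scope = CoroutineScope(scopeJob + Dispatchers.Default)
    val child = scope.launch { awaitCancellation() }   // suspends until cancelled
    val attached = scopeJob.children.contains(child)
    scope.cancel()                                     // clean up the demo coroutine
    return attached
}

fun main() {
    println(scopeAlwaysHasJob())
    println(launchedCoroutineIsChild())
}
```

Both helpers should return true; the child is attached to its parent at launch time, before it has even started running.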


Cancelling a Scope Cancels Everything Inside It

If a scope is cancelled, all coroutines launched in that scope are cancelled, recursively.

scope.launch {
    launch { taskA() }
    launch { taskB() }
}

scope.cancel()

This behavior is guaranteed by the coroutine framework — not by convention.
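
The recursive part can be observed directly. This sketch assumes kotlinx-coroutines-core; cancelWholeTree is a hypothetical name, and awaitCancellation() stands in for taskA() and taskB() so that the nested coroutines never finish on their own.

```kotlin
import kotlinx.coroutines.*

// Returns isCancelled for [taskA, taskB, parent] after one scope.cancel() call.
fun cancelWholeTree(): List<Boolean> = runBlocking {
    val scope = CoroutineScope(Job() + Dispatchers.Default)
    val children = CompletableDeferred<List<Job>>()

    val parent = scope.launch {
        val a = launch { awaitCancellation() }   // stands in for taskA()
        val b = launch { awaitCancellation() }   // stands in for taskB()
        children.complete(listOf(a, b))
    }

    val nested = children.await()   // both nested coroutines exist now
    scope.cancel()                  // one call at the root of the tree
    parent.join()                   // the parent completes only after its children
    (nested + parent).map { it.isCancelled }
}

fun main() {
    println(cancelWholeTree())
}
```

No reference to the nested jobs is needed to stop them; cancelling the scope is enough, and all three jobs end up cancelled.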


GlobalScope Breaks the Rules

GlobalScope has no Job, no owner, and no lifecycle.
A coroutine launched there runs until it finishes or the process dies, unless you keep its Job and cancel it yourself.

If the work belongs to:

  • a screen
  • a ViewModel
  • a request
  • a user action

then GlobalScope is conceptually incorrect.
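
One possible alternative is to give the owner its own scope and cancel it when the owner goes away. The sketch below assumes kotlinx-coroutines-core; SearchSession, submit, and close are hypothetical names chosen for illustration, not part of any library.

```kotlin
import kotlinx.coroutines.*

// Hypothetical owner: anything with a clear end of life (a screen, a request, a session).
class SearchSession(private val search: suspend (String) -> List<String>) {
    // The session owns this scope; nothing launched in it can outlive close().
    private val scope = CoroutineScope(Job() + Dispatchers.Default)

    fun submit(query: String, onResult: (List<String>) -> Unit): Job =
        scope.launch { onResult(search(query)) }

    fun close() = scope.cancel()
}

// Demo: a search that never returns is still stopped when its owner is closed.
fun sessionDemo(): Boolean = runBlocking {
    val session = SearchSession { awaitCancellation() }
    val job = session.submit("kotlin") { }
    session.close()
    job.join()
    job.isCancelled
}

fun main() {
    println(sessionDemo())
}
```

On Android, viewModelScope and lifecycleScope already play this role, so a hand-made scope like this is mainly useful for owners the framework does not know about.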


Parent–Child Relationships in Coroutines

Coroutines form a tree, not a flat list.

A ViewModel scope might look like this:

viewModelScope
 ├── loadUser
 │    └── loadAvatar
 └── loadPosts

This structure is enforced at runtime.

Parents Wait for Children

runBlocking {
    launch {
        delay(100)
        println("Child finished")   // printed second
    }
    println("Parent waiting")       // printed first
}
// runBlocking returns only after the child has finished

The parent does not finish early.
It waits until all children complete.


Cancelling Children (And Why Cancellation Sometimes “Doesn’t Work”)

Cancellation in Kotlin coroutines is cooperative, not forced.

Calling cancel() does not immediately stop code.
It requests cancellation, which is only observed at specific points.


What Happens When a Child Calls cancel()

launch {
    cancel()
    println("Still running?")
}

The coroutine is marked as cancelled, but it only stops:

  • at a cancellable suspension point (delay, await, yield)
  • or when it explicitly checks isActive / ensureActive()

If neither happens, the code continues.
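
The sequence can be traced by recording events. This is a sketch assuming kotlinx-coroutines-core; selfCancelEvents is a hypothetical name.

```kotlin
import kotlinx.coroutines.*

fun selfCancelEvents(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        cancel()                      // marks this coroutine as cancelled
        events += "after cancel()"    // still runs: nothing has checked for cancellation yet
        try {
            delay(10)                 // cancellable suspension point: throws CancellationException
            events += "after delay"   // never reached
        } finally {
            events += "finally"       // cleanup still runs
        }
    }
    job.join()
    events += "isCancelled=${job.isCancelled}"
    events
}

fun main() {
    println(selfCancelEvents())
    // [after cancel(), finally, isCancelled=true]
}
```

The line right after cancel() executes, the first suspension point does not complete, and the finally block gives the coroutine its chance to clean up.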


Why Heavy Computation Ignores Cancellation

launch {
    cancel()
    heavyComputation() // no suspension points
}

Kotlin does not interrupt threads.
If your code never suspends and never checks cancellation, it will continue running.

Correct cancellation-aware code looks like this:

while (isActive) {
    computeStep()
}
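
A fuller version of that loop, as a sketch assuming kotlinx-coroutines-core: cancellableLoop is a hypothetical name, and the counter increment stands in for computeStep().

```kotlin
import java.util.concurrent.atomic.AtomicLong
import kotlinx.coroutines.*

// Returns whether the job ended as cancelled, and how many steps ran before that.
fun cancellableLoop(): Pair<Boolean, Long> = runBlocking {
    val steps = AtomicLong()
    val job = launch(Dispatchers.Default) {
        while (isActive) {             // re-checked before every step
            steps.incrementAndGet()    // stands in for computeStep()
        }
    }
    delay(50)
    job.cancelAndJoin()                // returns only because the loop observes cancellation
    job.isCancelled to steps.get()
}

fun main() {
    val (cancelled, steps) = cancellableLoop()
    println("cancelled=$cancelled after $steps steps")
}
```

If the loop were while (true), cancelAndJoin() would never return. Calling ensureActive() inside the loop is an alternative that throws CancellationException instead of exiting the loop quietly.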


Cancellation vs Failure

This distinction is critical.

If a child cancels itself normally, the parent continues.
If a child fails with an exception, the parent is cancelled (by default).

launch { cancel() }        // parent continues
launch { error("Boom") }   // parent is cancelled

This is why a coroutine that simply needs to stop should be cancelled rather than made to throw an ordinary exception.
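
Both cases can be compared side by side. The sketch assumes kotlinx-coroutines-core; parentSurvives is a hypothetical helper, and the empty CoroutineExceptionHandler is there only to keep the demo from printing a stack trace.

```kotlin
import kotlinx.coroutines.*

// Returns true if the parent was still active after its child ended.
fun parentSurvives(childBody: suspend CoroutineScope.() -> Unit): Boolean = runBlocking {
    val quiet = CoroutineExceptionHandler { _, _ -> }   // demo only: ignore the failure
    val scope = CoroutineScope(Job() + Dispatchers.Default + quiet)
    var stillActive = false
    val parent = scope.launch {
        val child = launch(block = childBody)
        child.join()             // throws CancellationException if the parent was cancelled
        stillActive = isActive
    }
    parent.join()
    stillActive
}

fun main() {
    println(parentSurvives { cancel() })        // true: cancellation stays with the child
    println(parentSurvives { error("Boom") })   // false: failure cancels the parent
}
```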


Swallowing Cancellation Is a Bug

try {
    delay(1000)
} catch (e: Exception) {
    // CancellationException swallowed
}

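The catch block treats the cancellation as an ordinary error, so the coroutine carries on with whatever follows instead of stopping.

If you catch a broad exception type, always rethrow CancellationException.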
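
One way to keep a broad catch without swallowing cancellation is sketched below, under the assumption that ordinary failures can be mapped to null. orNullOnFailure is a hypothetical helper, not a library function.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: failures become null, cancellation passes through untouched.
suspend fun <T> orNullOnFailure(block: suspend () -> T): T? =
    try {
        block()
    } catch (e: CancellationException) {
        throw e    // cancellation is not an error; let it propagate
    } catch (e: Exception) {
        null       // only genuine failures are handled here
    }

fun rethrowDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        orNullOnFailure { delay(10_000) }
        events += "kept running after cancellation"   // not reached
    }
    yield()                // let the child start and suspend in delay()
    job.cancelAndJoin()
    val failure = orNullOnFailure<String> { error("Boom") }
    events += "failure mapped to $failure"
    events
}

fun main() {
    println(rethrowDemo())
    // [failure mapped to null]
}
```

The order of the catch clauses matters: CancellationException is itself an Exception, so it has to be caught and rethrown first.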


Blocking Code Prevents Cancellation

Thread.sleep(5000)

Blocking APIs ignore coroutine cancellation entirely.
Use suspending APIs or explicit cancellation checks instead.
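
When a blocking call cannot be replaced, kotlinx.coroutines offers runInterruptible, which interrupts the thread when the coroutine is cancelled. The sketch below assumes kotlinx-coroutines-core on the JVM; cancelBlockingCall is a hypothetical name. It only helps with blocking calls that respond to thread interruption, as Thread.sleep does.

```kotlin
import kotlinx.coroutines.*

fun cancelBlockingCall(): Boolean = runBlocking {
    val job = launch(Dispatchers.IO) {
        // Cancellation of this coroutine is translated into a thread interrupt.
        runInterruptible {
            Thread.sleep(10_000)   // stands in for a blocking call that honours interruption
        }
    }
    delay(100)             // give the blocking call time to start
    job.cancelAndJoin()    // returns promptly instead of waiting ten seconds
    job.isCancelled
}

fun main() {
    println(cancelBlockingCall())
}
```

For a plain pause, delay() remains the simpler choice, since it is cancellable without any thread interruption.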


SupervisorJob and supervisorScope

By default, coroutines use a fail-fast model.
If one child fails, the parent and all siblings are cancelled.

This is correct for transactional work, but not for all scenarios.

Supervision exists to isolate failures without breaking structure.


SupervisorJob

SupervisorJob changes how failures propagate, not cancellation.

When a child fails:

  • the parent remains active
  • siblings continue running

When the parent is cancelled:

  • all children are cancelled

This makes SupervisorJob suitable for long-lived scopes where tasks should be independent.

This is why viewModelScope uses it.
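
Both directions can be checked in a small sketch, assuming kotlinx-coroutines-core. supervisorJobDemo is a hypothetical name, and the empty CoroutineExceptionHandler is a demo-only assumption that silently drops the failure.

```kotlin
import kotlinx.coroutines.*

// Returns (scope active after failure, sibling active after failure, sibling cancelled with scope).
fun supervisorJobDemo(): Triple<Boolean, Boolean, Boolean> = runBlocking {
    val quiet = CoroutineExceptionHandler { _, _ -> }   // demo only: ignore the failure
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default + quiet)

    val failing = scope.launch { error("Boom") }
    val sibling = scope.launch { awaitCancellation() }
    failing.join()

    // The failure stayed inside the failing child.
    val scopeActive = scope.isActive
    val siblingActive = sibling.isActive

    // Cancellation from the parent still reaches every child.
    scope.cancel()
    sibling.join()
    Triple(scopeActive, siblingActive, sibling.isCancelled)
}

fun main() {
    println(supervisorJobDemo())
    // (true, true, true)
}
```

With a plain Job() in place of SupervisorJob(), the first failure would cancel the scope and the sibling along with it.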


supervisorScope

supervisorScope provides local supervision inside a suspending function.

It allows sibling coroutines to fail independently while still:

  • waiting for all children
  • propagating exceptions when awaited
  • respecting parent cancellation

It is ideal for parallel work inside repositories or use cases.
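
A minimal sketch of that pattern, assuming kotlinx-coroutines-core: loadTitle, loadBadge, and loadHeader are hypothetical functions standing in for repository calls, with loadBadge failing on purpose.

```kotlin
import kotlinx.coroutines.*

// Hypothetical loaders standing in for repository calls.
suspend fun loadTitle(): String { delay(10); return "title" }
suspend fun loadBadge(): String { delay(10); error("badge service down") }

suspend fun loadHeader(): String = supervisorScope {
    val title = async { loadTitle() }
    val badge = async { loadBadge() }

    // The failure surfaces here, at await(), so it can be handled for this task alone.
    val badgeText = try {
        badge.await()
    } catch (e: CancellationException) {
        throw e            // keep cancellation intact
    } catch (e: Exception) {
        "no badge"         // fallback for the optional part
    }
    "${title.await()} ($badgeText)"
}

fun main() = runBlocking {
    println(loadHeader())
    // title (no badge)
}
```

Inside coroutineScope the same failure would cancel the title request as well, and the whole function would fail; supervisorScope is what lets the optional part fail on its own.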


Which One Should You Use?

Use SupervisorJob when defining a scope that lives for a long time.
Use supervisorScope when you already have a scope and want local failure isolation.

They solve the same problem at different levels.


Common Mistakes That Cause Coroutine Leaks

Most coroutine leaks come from:

  • using GlobalScope
  • creating scopes without cancelling them
  • swallowing CancellationException
  • launching work that outlives its owner
  • mixing blocking code with coroutines

These are design bugs, not coroutine bugs.


Real-World Android Example

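import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.MutableStateFlow

class ProfileViewModel(
    private val repo: Repo
) : ViewModel() {

    // Assumed state holder (not shown in the original snippet); null until loading succeeds.
    val uiState = MutableStateFlow<Profile?>(null)

    fun loadProfile() = viewModelScope.launch {
        supervisorScope {
            val user = async { repo.loadUser() }
            val posts = async { repo.loadPosts() }

            try {
                uiState.value = Profile(
                    user = user.await(),
                    posts = posts.await()
                )
            } catch (e: CancellationException) {
                throw e // never swallow cancellation
            } catch (e: Exception) {
                // A failed request surfaces at await(); handle it here (e.g. show an error state).
            }
        }
    }
}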

Here:

  • the ViewModel owns the work
  • cancellation is automatic
  • failures are controlled
  • no background work leaks

Final Takeaways

Structured concurrency is not about coroutines.
It is about ownership, lifetime, and correctness.

If you cannot answer:

  • who owns this coroutine?
  • when should it stop?
  • what happens if it fails?

then the code is already wrong.

Concurrency should follow the structure of your program.
Kotlin coroutines finally make that enforceable.
