Introduction to Kotlin Coroutines

Kotlin Coroutines are no longer just a “nice-to-have” — they are the standard way to write asynchronous, concurrent code in modern Kotlin projects, whether you’re building Android apps, Ktor/Spring backends, or multiplatform libraries.

If you’re a mid-level Kotlin developer, you likely already know how to launch a coroutine and use suspend functions. This article skips the absolute beginner explanations (like “what is async?”) and focuses on the concepts you need to truly understand and use coroutines effectively and safely in production code.

We’ll go deeper into how coroutines work under the hood, why certain patterns exist, common pitfalls, and best practices that most tutorials gloss over.


Why Coroutines Were Created (The Real Problems They Solve)

Before coroutines, Kotlin/Java developers had these main options:

| Approach | Problem in Practice |
| --- | --- |
| Callbacks | Pyramid of doom, error handling scattered everywhere |
| Threads | Expensive, limited scalability (~few thousand max) |
| RxJava | Powerful but steep learning curve, heavy runtime |
| Future/Promise | Hard to compose, no built-in cancellation propagation |

Coroutines were designed to solve all of these with a single, coherent model:

  • Write async code that looks synchronous
  • Lightweight (100k+ coroutines easily)
  • Structured lifecycle & cancellation
  • First-class exception handling
  • Seamless integration with existing libraries (Retrofit, Room, Ktor, etc.)

Threads vs Coroutines – The Real Difference

When developers first encounter Kotlin Coroutines, it’s tempting to think of them as “lightweight threads.” While this description is directionally correct, it hides the deeper distinction: threads are a low-level OS concept, whereas coroutines are a high-level concurrency abstraction controlled by the Kotlin runtime.
For a mid-level developer, understanding why this matters is essential to writing correct, scalable, and efficient asynchronous applications.


1. Operating System Threads vs. Language-Level Coroutines

A thread is a unit of execution managed by the operating system. Each thread has:

  • Its own stack (typically 1 MB by default on 64-bit JVMs, configurable with -Xss).
  • A fixed scheduling cost (context switches handled by the OS kernel).
  • A hard limit on how many can practically run concurrently (hundreds to a few thousand depending on memory).

A coroutine, on the other hand, is:

  • A computation that can be suspended and resumed by the coroutine machinery (compiler-generated code plus the kotlinx.coroutines library).
  • Free of a dedicated OS stack: while suspended, its state lives in a small heap object (often just a few hundred bytes).
  • Cheap enough that hundreds of thousands or even millions of concurrent tasks can share the same threads.

Coroutines reuse threads instead of owning them.


2. Why Coroutines Scale Better

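The JVM cannot realistically create one million platform threads: at roughly 1 MB of stack each, that would reserve about 1 TB of address space for stacks alone, and most operating systems cap the thread count well before that.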
But one million coroutines is entirely feasible because the memory footprint is minimal.

Thread example (expensive)

fun main() {
    repeat(100_000) {
        Thread {
            Thread.sleep(1000)
        }.start()
    }
}

On many systems this fails with OutOfMemoryError: unable to create new native thread, because the process hits an OS thread or memory limit.

Coroutine example (lightweight)

import kotlinx.coroutines.*

fun main() = runBlocking {
    repeat(100_000) {
        launch {
            delay(1000)
        }
    }
}

This executes smoothly on a typical laptop because coroutines don’t allocate OS stacks or require OS scheduling.


3. Cooperative vs Preemptive Scheduling

Threads use preemptive scheduling:
The OS can interrupt a thread at any time and switch to another. This is powerful but costly.

Coroutines use cooperative scheduling:
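A coroutine yields control only at suspension points (e.g., delay, await, or a suspending I/O call).
The result:

  • No forced interruptions → on a single thread, coroutines interleave only at suspension points. On a multi-threaded dispatcher they still run in parallel, so shared mutable state still needs synchronization.
  • Far fewer context switches → better performance under concurrency.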
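The interleaving rule can be seen in a minimal sketch, assuming kotlinx.coroutines is on the classpath. Both coroutines run on the single thread owned by runBlocking, so they can only swap places at yield(). The function name interleave is made up for this illustration.

```kotlin
import kotlinx.coroutines.*

// Runs two coroutines on the single runBlocking thread and records
// the order in which their steps execute.
fun interleave(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val a = launch {
        log += "A1"
        yield()          // suspension point: lets the other coroutine run
        log += "A2"
    }
    val b = launch {
        log += "B1"
        yield()          // hands the thread back to coroutine A
        log += "B2"
    }
    joinAll(a, b)
    log
}

fun main() {
    println(interleave())
}
```

The plain mutable list is safe here only because everything runs on one thread. With Dispatchers.Default the two coroutines could run in parallel, and the list would need synchronization.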

4. Blocking vs Suspending

A key conceptual difference:

| Concept | Threads | Coroutines |
| --- | --- | --- |
| Blocking | Stops the thread completely | Suspends the coroutine without blocking the thread |
| Thread usage | One task → one thread | Many coroutines share a thread |
| Effect | Wastes resources during waiting (I/O, sleep) | Frees the thread for other coroutines |

Example:

// Blocking the thread
Thread.sleep(1000) 

This prevents the thread from doing anything else.

// Suspending the coroutine
delay(1000)

This frees the thread instantly while the coroutine waits.
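A rough way to observe the difference, sketched under the assumption that kotlinx.coroutines is available: run two waiting tasks on the single runBlocking thread, once with Thread.sleep and once with delay. The helper names blockingMs and suspendingMs are invented here, and the exact timings will vary from machine to machine.

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Two tasks on the single runBlocking thread, each waiting 300 ms.
fun blockingMs(): Long = measureTimeMillis {
    runBlocking {
        // Each sleep holds the only thread, so the waits run back to back.
        repeat(2) { launch { Thread.sleep(300) } }
    }
}

fun suspendingMs(): Long = measureTimeMillis {
    runBlocking {
        // Each delay releases the thread, so the two waits overlap.
        repeat(2) { launch { delay(300) } }
    }
}

fun main() {
    println("Thread.sleep: ${blockingMs()} ms")
    println("delay:        ${suspendingMs()} ms")
}
```

The blocking version should take roughly twice as long as the suspending one, since the sleeping coroutine never gives the thread back.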


5. Cost Comparison

The figures below are rough orders of magnitude based on typical JVM defaults and commonly cited coroutine benchmarks. Exact values depend on the JVM, the OS, and the workload:

| Metric | Thread | Coroutine |
| --- | --- | --- |
| Memory per instance | ~1 MB stack (JVM default) | A few hundred bytes |
| Max practical count | A few thousand | Hundreds of thousands to millions |
| Context switch cost | OS-level, ~1–10 μs | Language-level, hundreds of ns |
| Creation time | Heavy | Very light |

Treat these numbers as estimates and measure on your own hardware before relying on them.


6. Code Example: Running Tasks Concurrently

Using Threads

fun main() {
    val threads = List(10_000) {
        Thread {
            println("Running thread: $it")
        }.apply { start() }
    }

    threads.forEach { it.join() }
}

This will run, but scale further and you’ll hit OS limits.

Using Coroutines

import kotlinx.coroutines.*

fun main() = runBlocking {
    val jobs = List(10_000) {
        launch {
            println("Running coroutine: $it")
        }
    }
    jobs.forEach { it.join() }
}

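The explicit join calls mirror the thread version; runBlocking would wait for its child coroutines even without them. Raising the count to hundreds of thousands changes little, because each coroutine is only a small heap object.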


7. The Real Difference Summarized

  • Threads are heavyweight, OS-managed, and expensive to create, switch, and block.
  • Coroutines are lightweight, runtime-managed, and suspend rather than block, making them ideal for I/O-heavy or high-concurrency tasks.

Coroutines do not replace threads — instead, they give you a high-level, efficient way to use fewer threads more effectively.


How Coroutines Actually Work Under the Hood

Mid-level developers often use coroutines without fully understanding what the compiler and the kotlinx.coroutines library do behind the scenes.
This section breaks down the internal mechanics: suspension, continuations, dispatchers, and the coroutine scheduler.
Understanding this will help you avoid performance traps, debugging headaches, and incorrect mental models.


Coroutines Compile Into State Machines

A coroutine is not a thread, nor is it magic.
When you use suspend, the Kotlin compiler rewrites your function into a state machine.

Example:

suspend fun loadUser() {
    val id = fetchId()        // suspend point
    val user = fetchUser(id)  // suspend point
    println(user)
}

The compiler transforms this into a class that:

  • Stores local variables (id, user)
  • Stores the current state (before fetchId, before fetchUser, after fetchUser…)
  • Implements resumeWith(), which re-enters the function body (the generated invokeSuspend method) and jumps to the saved state

Each suspend call becomes a labeled point in the state machine.

Visualizing the result (simplified)

State 0 -> call fetchId()
State 1 -> call fetchUser()
State 2 -> print and complete

This is why:

  • Suspending functions don’t block threads.
  • Coroutines can pause and resume without losing local variables or execution order.
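To make the state machine idea concrete, here is a hand-written imitation of what loadUser() might turn into. It is a sketch only: the code the compiler really generates looks different, and fetchIdAsync, fetchUserAsync, and LoadUserMachine are hypothetical names standing in for the real suspend calls.

```kotlin
// Hypothetical callback-style stand-ins for fetchId() and fetchUser().
fun fetchIdAsync(callback: (Int) -> Unit) = callback(42)
fun fetchUserAsync(id: Int, callback: (String) -> Unit) = callback("user-$id")

// Hand-written imitation of the state machine; the real generated
// code is different and more involved.
class LoadUserMachine(private val output: MutableList<String>) {
    private var label = 0   // which state to run next
    private var id = 0      // local variable kept across suspension

    fun resume(value: Any?) {
        when (label) {
            0 -> {
                label = 1
                fetchIdAsync { resume(it) }        // "suspend" until the id arrives
            }
            1 -> {
                id = value as Int
                label = 2
                fetchUserAsync(id) { resume(it) }  // "suspend" until the user arrives
            }
            2 -> {
                output += value as String          // stands in for println(user)
            }
        }
    }
}

fun main() {
    val output = mutableListOf<String>()
    LoadUserMachine(output).resume(null)
    println(output)
}
```

Each call to resume runs one slice of the original function and then returns, which is why no thread has to wait between the slices.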

Continuations: The Core Building Block

Under the hood, every suspend function receives a hidden parameter:
a Continuation<T> object.

A continuation stores:

  • The coroutine’s current state
  • Where execution should resume
  • The result or exception
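  • The coroutine context, including the dispatcher (which determines the thread to resume on)

When a suspend function suspends, it returns a special COROUTINE_SUSPENDED marker to its caller, and the continuation is handed to whatever will finish the work later (a timer, a network callback, another coroutine).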
When the awaited operation completes, the coroutine runtime calls:

continuation.resume(value)

or in case of failure:

continuation.resumeWithException(exception)

This is the same continuation-passing idea that underlies async/await in C# and promise chains in JavaScript.
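The resume calls are easiest to see when wrapping a callback API by hand. The sketch below uses only the Kotlin standard library (kotlin.coroutines). The names requestId, awaitId, and runAwaitId are hypothetical, and the fake API calls back synchronously to keep the example small.

```kotlin
import kotlin.coroutines.*

// Hypothetical callback-based API, standing in for a real library call.
fun requestId(onSuccess: (Int) -> Unit, onError: (Throwable) -> Unit) {
    onSuccess(42)
}

// Exposes the callback API as a suspend function by resuming the
// continuation from inside the callbacks.
suspend fun awaitId(): Int = suspendCoroutine { cont ->
    requestId(
        onSuccess = { id -> cont.resume(id) },
        onError = { error -> cont.resumeWithException(error) }
    )
}

// Starts a coroutine with the standard library alone and captures its result.
// This returns the value only because the fake API calls back synchronously.
fun runAwaitId(): Int {
    var value = -1
    val block: suspend () -> Int = { awaitId() }
    block.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        value = result.getOrThrow()
    })
    return value
}

fun main() {
    println(runAwaitId())
}
```

In code that already depends on kotlinx.coroutines, suspendCancellableCoroutine is usually the better choice for such wrappers, because it lets the wrapper react when the coroutine is cancelled.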


Dispatchers: Who Decides What Thread You Run On?

Dispatchers are thread-management strategies that decide how work is scheduled.

Common ones:

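| Dispatcher | Backed By | Typical Use |
| --- | --- | --- |
| Dispatchers.Default | Shared pool of worker threads (sized to the number of CPU cores, minimum 2) | CPU-bound tasks |
| Dispatchers.IO | Elastic pool that shares threads with Default (limit defaults to 64 threads or the number of cores, whichever is larger) | Blocking I/O |
| Dispatchers.Main | UI thread (Android) | UI work |
| newSingleThreadContext | A dedicated thread | Rare cases, confining state to one thread |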

The dispatcher determines where the coroutine resumes, not where it suspends.
This is important:

withContext(Dispatchers.IO) { 
    delay(1000)      // suspends - thread is released
}                    // resumes on IO dispatcher thread

During suspension, the thread is free for other work.
When the coroutine resumes, the dispatcher picks a thread to continue execution.
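One way to watch a dispatcher at work is to record the thread name before and inside a withContext block. This is a small sketch assuming kotlinx.coroutines. The function name threadNames is invented, and the exact thread names depend on the library version and on whether debug mode is enabled.

```kotlin
import kotlinx.coroutines.*

// Returns the thread name seen outside and inside withContext.
fun threadNames(): Pair<String, String> = runBlocking {
    val outer = Thread.currentThread().name
    val inner = withContext(Dispatchers.Default) {
        Thread.currentThread().name   // runs on a Default worker thread
    }
    outer to inner
}

fun main() {
    val (outer, inner) = threadNames()
    println("runBlocking body ran on: $outer")
    println("withContext body ran on: $inner")
}
```

After withContext returns, execution continues in the caller's original context, so no explicit switch back is needed.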


Coroutine Scheduler (Work Stealing & Dispatch Queues)

Kotlin’s coroutine scheduler is inspired by modern task schedulers:

  • Each worker thread has its own local queue
  • There is also a global queue for extra tasks
  • If a worker runs out of tasks, it steals work from other queues

This ensures:

  • High throughput
  • Fair distribution
  • Minimal lock contention
  • Efficient context switching, since switching between coroutines is in-memory, not OS-level

This is why coroutines scale better than classic thread-per-task architecture.


Suspension: What Really Happens at a Suspend Point?

Take this example:

suspend fun demo() {
    println("A")
    delay(1000)
    println("B")
}

Execution flow under the hood:

  1. Coroutine prints "A" on the current thread.
  2. delay(1000) schedules a timer-based continuation.
  3. Coroutine returns control immediately → thread becomes free.
  4. After 1000ms, the scheduler picks a thread (maybe the same one, maybe not).
  5. The coroutine is resumed from the exact point after delay.
  6. "B" is printed.

The coroutine looks like synchronous code but executes asynchronously without blocking.


Structured Concurrency Engine

Kotlin adds another layer on top of raw coroutines:
structured concurrency, which ensures:

  • Every child coroutine belongs to a parent
  • Cancelling the parent cancels all children
  • Failures propagate in a consistent way

Example:

coroutineScope {
    launch { /* child #1 */ }
    launch { /* child #2 */ }
}

Behind the scenes, Kotlin maintains a job hierarchy:

Parent Job
 ├── Child Job #1
 └── Child Job #2

This prevents “stray” coroutines running in the background (a common bug with Node.js promises and Java executors).


Bringing It All Together: Full Internal Flow

Let’s combine all components in a simplified flow:

launch { ... }  
       ↓
Coroutine builder creates Continuation + Job
       ↓
Dispatch to appropriate thread pool (Dispatcher)
       ↓
Execute until first suspension point
       ↓
Return thread to pool, store continuation state
       ↓
External event completes (network, timer, etc.)
       ↓
Scheduler resumes continuation on a chosen thread
       ↓
Execute next portion of state machine
       ↓
Repeat until completion or cancellation

This entire workflow is why coroutines provide:

  • Synchronous-looking code
  • Non-blocking behavior
  • Massive scalability
  • Safe structured concurrency

Structured Concurrency: The Feature That Prevents Leaks

One of the most powerful — yet often overlooked — advantages of Kotlin Coroutines is structured concurrency.
Mid-level developers typically understand coroutine builders like launch or async, but many don’t fully grasp why coroutine hierarchies exist or how they prevent resource leaks, runaway tasks, or subtle lifecycle bugs.

If you’re coming from Java threads, Node.js promises, or Android AsyncTask, structured concurrency is one of the biggest mental shifts you need to adopt.


The Problem: Unstructured Concurrency Leads to “Dangling” Tasks

In traditional async APIs, it’s easy to start background work that:

  • Outlives the caller
  • Never gets cancelled
  • Continues running even when the user navigates away
  • Leaks memory, network connections, file handles, etc.

Example of unstructured concurrency in Java:

new Thread(() -> {
    doNetworkCall(); // Keeps running even if caller no longer cares
}).start();

Or in JavaScript:

fetch("/data").then(() => console.log("done"));

Nothing ties the background task to the scope of the caller.
If the caller disappears, the background work just keeps going.
This is how leaks happen.


Kotlin’s Solution: Coroutines Must Live Inside a Scope

Kotlin introduces a strict rule:

Every coroutine must belong to a well-defined parent scope.

This guarantees:

  • A coroutine can’t become a “zombie” task.
  • When the parent is cancelled, all child tasks are cancelled.
  • A parent does not complete until all of its children have completed or been cancelled.
  • You always know exactly which coroutines are running and why.

This behavior is enforced by the CoroutineScope + Job hierarchy.


Coroutine Scope Creates a Lifecycle Boundary

coroutineScope {
    launch { task1() }
    launch { task2() }
}

coroutineScope {} ensures:

  • It does not complete until all children complete.
  • If any child fails → all siblings are cancelled.
  • If the parent is cancelled → all children are cancelled and stop at their next suspension point.

This is similar to structured blocks in synchronous code:

{
    statementA()
    statementB()
}

The key idea:

Structured concurrency makes async code behave like structured synchronous code.
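The "does not complete until all children complete" guarantee can be checked with a short sketch, assuming kotlinx.coroutines. The names runBoth and scopeDemo are made up, and the delays stand in for real work.

```kotlin
import kotlinx.coroutines.*

// Launches two children and returns only after both have finished.
suspend fun runBoth(log: MutableList<String>): Unit = coroutineScope {
    launch { delay(100); log += "task1 done" }
    launch { delay(50); log += "task2 done" }
    log += "children launched"   // the body ends here, but the scope keeps waiting
}

fun scopeDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    runBoth(log)
    log += "scope returned"      // reached only after both children are done
    log
}

fun main() {
    scopeDemo().forEach(::println)
}
```

The line after runBoth behaves like the statement after a synchronous block: by the time it runs, nothing started inside the block is still in flight.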


Why This Prevents Leaks

Leak Example WITHOUT structured concurrency (wrong)

fun loadData() {
    GlobalScope.launch {   // ⚠️ unstructured, dangerous
        fetchData()
    }
}

Problems:

  • GlobalScope lives as long as the application process → the coroutine can outlive the screen and the ViewModel.
  • If the user leaves the screen → coroutine keeps running.
  • If fetchData() ties up memory or I/O → leak risk.

The Correct Version WITH structured concurrency

class MyViewModel : ViewModel() {
    fun loadData() {
        viewModelScope.launch {  // 👈 child of ViewModel lifecycle
            fetchData()
        }
    }
}

Now:

  • When ViewModel is cleared → all coroutines inside viewModelScope are cancelled.
  • No background work leaks beyond the owner.
  • Cancellation propagates automatically.

Cancellation Propagation — The Real Mechanism Behind Leak Prevention

Every coroutine has a Job.
Jobs form a tree:

Parent Job
 ├── Child Job A
 └── Child Job B

If the parent job is cancelled:

  • It sends a cancellation signal to all children.
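  • Children stop at the next cancellable suspension point (delay, await, suspending I/O, etc.). CPU-bound loops must check isActive or call ensureActive() to notice cancellation.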
  • Resources are released predictably.

This is automatic — you don’t need to manually track children.
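A minimal sketch of that propagation, assuming kotlinx.coroutines: cancelling the parent job is enough to stop both children, and their finally blocks still run. The function name cancelDemo and the log messages are invented for this example.

```kotlin
import kotlinx.coroutines.*

// Cancels a parent job and records what its children do in response.
fun cancelDemo(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val parent = launch {
        launch {
            try { delay(10_000) } finally { log += "child A cleaned up" }
        }
        launch {
            try { delay(10_000) } finally { log += "child B cleaned up" }
        }
    }
    delay(50)                  // give the children time to start and suspend
    parent.cancelAndJoin()     // cancels the whole subtree and waits for it
    log += "parent cancelled: ${parent.isCancelled}"
    log
}

fun main() {
    cancelDemo().forEach(::println)
}
```

Neither child is referenced or cancelled individually. The job tree does the bookkeeping, and cancelAndJoin returns only after both cleanup blocks have run.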


Supervisor Jobs: What if One Child Fails?

By default:

Failure of one child cancels the entire scope.

But sometimes you want sibling coroutines to operate independently.

Example:

supervisorScope {
    launch { loadUser() }         // fails → does NOT cancel others
    launch { loadSettings() }
}

This is still structured concurrency, but with controlled error isolation.
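The contrast between the two scopes can be sketched as follows, assuming kotlinx.coroutines. The function names unsupervised and supervised are hypothetical, and the thrown exception stands in for a failing loadUser().

```kotlin
import kotlinx.coroutines.*

// coroutineScope: one failing child cancels its sibling and fails the scope.
fun unsupervised(): String = runBlocking {
    try {
        coroutineScope {
            launch { throw IllegalStateException("loadUser failed") }
            launch { delay(50) }   // cancelled when the sibling fails
        }
        "completed"
    } catch (e: IllegalStateException) {
        "scope failed: ${e.message}"
    }
}

// supervisorScope: the failure is reported to the handler, the sibling keeps running.
fun supervised(): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val handler = CoroutineExceptionHandler { _, e -> log += "handled: ${e.message}" }
    supervisorScope {
        launch(handler) { throw IllegalStateException("loadUser failed") }
        launch { delay(50); log += "settings loaded" }
    }
    log
}

fun main() {
    println(unsupervised())
    println(supervised())
}
```

Inside supervisorScope a failing launch no longer reports to its parent, so it needs its own CoroutineExceptionHandler or a try/catch. Without one, the exception goes to the thread's uncaught exception handler.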


Real-World Example: Preventing Leaks in an HTTP Request

Imagine an Android screen that loads data when opened.

Problematic version:

fun onCreate() {
    GlobalScope.launch {
        api.load()     // may continue even after user leaves screen!
    }
}

Correct version:

class ScreenViewModel : ViewModel() {
    fun load() {
        viewModelScope.launch {
            val data = api.load()
            _uiState.value = data
        }
    }
}

Now:

  • If the user leaves → ViewModel is destroyed → coroutine is cancelled.
  • No background I/O continues unnecessarily.
  • No wasted CPU or memory.
  • No “dangling network calls.”

Structured Concurrency Makes Async Code Predictable

Let’s summarize how it prevents leaks:

| Behavior | Without Structured Concurrency | With Structured Concurrency |
| --- | --- | --- |
| Ownership of tasks | No owner → “fire and forget” | Every coroutine tied to a parent scope |
| Cancellation | Manual and error-prone | Automatic propagation |
| Memory leaks | Very likely | Much less likely |
| Debugging | Hard to know what’s still running | Full coroutine tree visibility |
| Lifecycle management | Manual | Scope-driven |
| Failure handling | Inconsistent | Deterministic and hierarchical |

The Golden Rule to Avoid Coroutine Leaks

Never launch a coroutine without a structured scope unless you truly want a top-level background worker.

GlobalScope.launch should almost never appear in app/business code.

If you respect that rule, leaks basically disappear.


When to Use Which Coroutine Builder

Kotlin provides several coroutine builders—launch, async, runBlocking, withContext, produce, etc.—and although they may look similar at a glance, they serve fundamentally different purposes.
Mid-level developers often misuse these builders, leading to unnecessary blocking, improper error handling, or excessive coroutine creation.

This section clarifies when and why you should use each builder, with practical examples and decision-making rules.


launch — Fire-and-Forget, Returns a Job

Use launch when:

  • You don’t need a return value
  • You want to start a child coroutine that runs concurrently
  • You want the coroutine to participate in structured concurrency (e.g., viewModelScope)
  • Cancellation propagation is important

Common use cases

  • Updating UI after background work in Android
  • Running periodic or background tasks tied to a scope
  • Parallel tasks that don’t return data

Example

viewModelScope.launch {
    val user = repository.loadUser()
    _uiState.value = user
}

When NOT to use launch

  • When you need a result → Use async
  • When there is no scope at hand in regular (non-suspend) code → use a lifecycle-bound scope instead of reaching for GlobalScope.launch

async — Concurrent Work That Produces a Value

async returns a Deferred<T>, similar to a future or promise.

Use async when:

  • You want to compute a value asynchronously
  • You want to run computations in parallel
  • You’ll eventually call .await() to retrieve the result
  • A failure should reach the caller when it calls .await()

Example: parallel network calls

val userDeferred = async { api.loadUser() }
val settingsDeferred = async { api.loadSettings() }

val user = userDeferred.await()
val settings = settingsDeferred.await()

Important note

async starts the coroutine immediately. If you want lazy behavior:

val deferred = async(start = CoroutineStart.LAZY) { expensiveTask() }
deferred.await()  // starts only when awaited

When NOT to use async

  • If result is not needed → Use launch
  • If the tasks depend on each other and must run one after another → call the suspend functions directly
  • As a way to escape structured concurrency, e.g. GlobalScope.async (common anti-pattern)

runBlocking — Bridge Between Blocking and Suspending Worlds

runBlocking blocks the current thread until its body completes.
This is fundamentally different from all other coroutine builders.

Use runBlocking ONLY when:

  • You need to call suspend code from test code
  • You need to write a main() entry point for a console application
  • You’re integrating with libraries that expect blocking APIs

Example: main function

fun main() = runBlocking {
    val data = fetchData()
    println(data)
}

When NOT to use runBlocking

  • Never use it in Android UI code
  • Never use it inside coroutines
  • Never use it to “wait for something” in production logic

It is a controlled escape hatch, not a concurrency tool.


withContext — Switch Contexts Predictably and Safely

Unlike launch and async, withContext does not start a concurrent coroutine.
It suspends the current coroutine, runs the block in the given context (typically a different dispatcher), and resumes the caller with the block’s result.

Use withContext when:

  • You need to switch threads (e.g., IO → main)
  • You want sequential logic with a thread-change baked in
  • You want a safe way to run blocking or CPU-heavy work

Example: switching to IO and then back to Main

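// Runs the I/O on Dispatchers.IO; the caller stays suspended meanwhile
val user = withContext(Dispatchers.IO) {
    api.loadUser() // heavy I/O
}

// Needed only when the caller is not already on the main dispatcher
withContext(Dispatchers.Main) {
    render(user)
}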

When NOT to use withContext

  • When you need parallelism → Use async
  • For fire-and-forget tasks → Use launch

coroutineScope — Builder That Enforces Structured Concurrency

coroutineScope is not a coroutine launcher. It creates a new scope where:

  • All children must complete before returning
  • Child failures cancel siblings
  • The calling thread is not blocked

Use coroutineScope when:

  • You want to create a structured block that launches multiple child coroutines
  • You want lifecycle-like behavior inside a suspend function

Example

suspend fun loadAll() = coroutineScope {
    val user = async { api.loadUser() }
    val settings = async { api.loadSettings() }

    UserData(user.await(), settings.await())
}

When NOT to use coroutineScope

  • When you need blocking behavior → Use runBlocking
  • When you need context switching → Use withContext

Quick Comparison Table

| Builder | Returns | Creates new coroutine? | Blocks thread? | Common use |
| --- | --- | --- | --- | --- |
| launch | Job | Yes | No | Fire-and-forget tasks |
| async | Deferred<T> | Yes | No | Parallel computations with result |
| runBlocking | T | Yes + blocks caller | Yes | Tests, main(), bridging |
| withContext | T | No (same coroutine) | No | Change dispatcher / thread |
| coroutineScope | T | Only for children | No | Structured concurrency inside suspend functions |

All rows above reflect the actual behavior defined in Kotlin Coroutines documentation and the kotlinx.coroutines implementation.


Choosing the Right Builder (Decision Flow)

Here is the mental model you should follow:

1️⃣ Do you need to block a thread?
→ Use runBlocking.

2️⃣ Do you need to return a value asynchronously?
→ Use async + await.

3️⃣ Do you need fire-and-forget logic?
→ Use launch.

4️⃣ Do you need to switch threads inside an existing coroutine?
→ Use withContext.

5️⃣ Do you need to start multiple child coroutines and wait for all?
→ Use coroutineScope.

6️⃣ Do you want long-lived tasks tied to a lifecycle (ViewModel/Activity)?
→ Use scope-provided builders (viewModelScope.launch, lifecycleScope.launch).


Conclusion: Coroutines Are Now Table Stakes

As a mid-level Kotlin developer in 2025, you should:

  • Avoid GlobalScope in application code
  • Always use structured concurrency (viewModelScope, lifecycleScope, custom scopes)
  • Prefer suspend functions over callbacks
  • Use async/await for parallel work, not threads
  • Use withTimeout, supervisorScope, and proper exception handling
  • Understand that suspension is cheap — don’t be afraid to launch many coroutines

Mastering coroutines isn’t about memorizing APIs — it’s about understanding suspension, structured concurrency, and dispatchers. Once you internalize these concepts, writing clean, scalable, and safe asynchronous code becomes natural.

Happy suspending! 🚀
