Kotlin Coroutines are no longer just a “nice-to-have” — they are the standard way to write asynchronous, concurrent code in modern Kotlin projects, whether you’re building Android apps, Ktor/Spring backends, or multiplatform libraries.
If you’re a mid-level Kotlin developer, you likely already know how to launch a coroutine and use suspend functions. This article skips the absolute beginner explanations (like “what is async?”) and focuses on the concepts you need to truly understand and use coroutines effectively and safely in production code.
We’ll go deeper into how coroutines work under the hood, why certain patterns exist, common pitfalls, and best practices that most tutorials gloss over.
Why Coroutines Were Created (The Real Problems They Solve)
Before coroutines, Kotlin/Java developers had these main options:
| Approach | Problem in Practice |
|---|---|
| Callbacks | Pyramid of doom, error handling scattered everywhere |
| Threads | Expensive, limited scalability (~few thousand max) |
| RxJava | Powerful but steep learning curve, heavy runtime |
| Future/Promise | Hard to compose, no built-in cancellation propagation |
Coroutines were designed to solve all of these with a single, coherent model:
- Write async code that looks synchronous
- Lightweight (100k+ coroutines easily)
- Structured lifecycle & cancellation
- First-class exception handling
- Seamless integration with existing libraries (Retrofit, Room, Ktor, etc.)
Threads vs Coroutines – The Real Difference
When developers first encounter Kotlin Coroutines, it’s tempting to think of them as “lightweight threads.” While this description is directionally correct, it hides the deeper distinction: threads are a low-level OS concept, whereas coroutines are a high-level concurrency abstraction controlled by the Kotlin runtime.
For a mid-level developer, understanding why this matters is essential to writing correct, scalable, and efficient asynchronous applications.
1. Operating System Threads vs. Language-Level Coroutines
A thread is a unit of execution managed by the operating system. Each thread has:
- Its own stack (typically 1 MB by default on the JVM, configurable with the -Xss flag).
- A fixed scheduling cost (context switches handled by the OS kernel).
- A hard limit on how many can practically run concurrently (hundreds to a few thousand depending on memory).
A coroutine, on the other hand:
- Is a computation that can be suspended and resumed by the Kotlin coroutine scheduler.
- Has no dedicated OS stack; it stores only a small state object (often just a few hundred bytes).
- Can scale to hundreds of thousands or millions of concurrent tasks on the same threads.
Coroutines reuse threads instead of owning them.
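Here is a minimal sketch that makes the sharing visible: 10,000 coroutines launched on Dispatchers.Default typically end up running on only a handful of pool threads.
import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap

fun main() = runBlocking {
    val threadsUsed = ConcurrentHashMap.newKeySet<String>()
    val jobs = List(10_000) {
        launch(Dispatchers.Default) {
            threadsUsed.add(Thread.currentThread().name) // record which pool thread ran this coroutine
            delay(10)
        }
    }
    jobs.forEach { it.join() }
    println("10,000 coroutines ran on only ${threadsUsed.size} threads")
}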
2. Why Coroutines Scale Better
The JVM simply cannot create one million threads — it would require ~1 TB of memory just for stacks.
But one million coroutines is entirely feasible because the memory footprint is minimal.
Thread example (expensive)
fun main() {
    repeat(100_000) {
        Thread {
            Thread.sleep(1000)
        }.start()
    }
}
This will almost certainly crash with OutOfMemoryError: unable to create new native thread on most systems.
Coroutine example (lightweight)
import kotlinx.coroutines.*
fun main() = runBlocking {
    repeat(100_000) {
        launch {
            delay(1000)
        }
    }
}
This executes smoothly on a typical laptop because coroutines don’t allocate OS stacks or require OS scheduling.
3. Cooperative vs Preemptive Scheduling
Threads use preemptive scheduling:
The OS can interrupt a thread at any time and switch to another. This is powerful but costly.
Coroutines use cooperative scheduling:
A coroutine yields control only at suspension points (e.g., delay, await, I/O).
The result:
- No forced interruptions between suspension points → fewer race conditions caused by unexpected interleaving.
- Far fewer context switches → better performance under concurrency.
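A practical consequence is that cancellation is also cooperative: a coroutine only notices it was cancelled at a suspension point or an explicit isActive check. A minimal sketch:
import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch(Dispatchers.Default) {
        var count = 0L
        // A plain `while (true)` with no suspension point could never be cancelled.
        while (isActive) {
            count++
        }
        println("observed cancellation after $count iterations")
    }
    delay(100)
    job.cancelAndJoin() // takes effect at the next isActive check, not instantly
    println("cancelled cooperatively")
}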
4. Blocking vs Suspending
A key conceptual difference:
| Concept | Threads | Coroutines |
|---|---|---|
| Blocking | Stops the thread completely | Suspends the coroutine without blocking the thread |
| Thread usage | One task → one thread | Many coroutines share a thread |
| Effect | Wastes resources during waiting (I/O, sleep) | Frees the thread for other coroutines |
Example:
// Blocking the thread
Thread.sleep(1000)
This prevents the thread from doing anything else.
// Suspending the coroutine
delay(1000)
This frees the thread instantly while the coroutine waits.
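A minimal sketch makes the difference observable. Both coroutines below share the single thread that runBlocking occupies; because delay suspends instead of blocking, the second coroutine runs while the first one waits:
import kotlinx.coroutines.*

fun main() = runBlocking {
    launch {
        delay(1000) // suspends: the shared thread is released meanwhile
        println("first coroutine resumed")
    }
    launch {
        println("second coroutine runs while the first is suspended")
    }
}
// Replacing delay(1000) with Thread.sleep(1000) would block the shared thread,
// so the second message would only appear after the full second.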
5. Cost Comparison
These are order-of-magnitude figures based on typical JVM defaults and commonly cited coroutine benchmarks:
| Metric | Thread | Coroutine |
|---|---|---|
| Memory per instance | ~1 MB stack (JVM default) | A few hundred bytes |
| Max practical count | A few thousand | Hundreds of thousands to millions |
| Context switch cost | OS-level, ~1–10 μs | Language-level, hundreds of ns |
| Creation time | Heavy | Very light |
6. Code Example: Running Tasks Concurrently
Using Threads
fun main() {
    val threads = List(10_000) {
        Thread {
            println("Running thread: $it")
        }.apply { start() }
    }
    threads.forEach { it.join() }
}
This will run, but push the count much higher and you will hit OS thread limits.
Using Coroutines
import kotlinx.coroutines.*
fun main() = runBlocking {
    val jobs = List(10_000) {
        launch {
            println("Running coroutine: $it")
        }
    }
    jobs.forEach { it.join() }
}
This scales predictably because each coroutine is only a small heap object, not an OS thread.
7. The Real Difference Summarized
- Threads are heavyweight, OS-managed, and expensive to create, switch, and block.
- Coroutines are lightweight, runtime-managed, and suspend rather than block, making them ideal for I/O-heavy or high-concurrency tasks.
Coroutines do not replace threads — instead, they give you a high-level, efficient way to use fewer threads more effectively.
How Coroutines Actually Work Under the Hood
Mid-level developers often use coroutines but don’t fully understand what the Kotlin runtime actually does behind the scenes.
This section breaks down the internal mechanics: suspension, continuations, dispatchers, and the coroutine scheduler.
Understanding this will help you avoid performance traps, debugging headaches, and incorrect mental models.
Coroutines Compile Into State Machines
A coroutine is not a thread, nor is it magic.
When you use suspend, the Kotlin compiler rewrites your function into a state machine.
Example:
suspend fun loadUser() {
    val id = fetchId()       // suspend point
    val user = fetchUser(id) // suspend point
    println(user)
}
The compiler transforms this into a class that:
- Stores local variables (id, user)
- Stores the current state (before fetchId, before fetchUser, after fetchUser…)
- Implements a resume() method that jumps to the next state
Each suspend call becomes a labeled point in the state machine.
Visualizing the result (simplified)
State 0 -> call fetchId()
State 1 -> call fetchUser()
State 2 -> print and complete
This is why:
- Suspending functions don’t block threads.
- Coroutines can pause and resume without losing local variables or execution order.
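A heavily simplified, hand-written sketch of that state machine (the real generated code uses internal compiler classes and differs in detail) looks roughly like this:
import kotlin.coroutines.Continuation

// Conceptual sketch only, not the actual generated code.
// Locals are hoisted into fields so they survive across suspensions,
// and `label` records which state to resume from.
class LoadUserStateMachine(private val completion: Continuation<Unit>) {
    private var label = 0
    private var id: String? = null
    private var user: String? = null

    fun resumeWith(result: Any?) {
        when (label) {
            0 -> {
                label = 1
                // call fetchId(this); return here if it suspended
            }
            1 -> {
                id = result as String
                label = 2
                // call fetchUser(id, this); return here if it suspended
            }
            2 -> {
                user = result as String
                println(user)
                // complete by resuming `completion`
            }
        }
    }
}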
Continuations: The Core Building Block
Under the hood, every suspend function receives a hidden parameter:
a Continuation<T> object.
A continuation stores:
- The coroutine’s current state
- Where execution should resume
- The result or exception
- The dispatcher (which determines the thread to resume on)
When a coroutine suspends, it returns control to its caller along with its continuation.
When the awaited operation completes, the coroutine runtime calls:
continuation.resume(value)
or in case of failure:
continuation.resumeWithException(exception)
This is the same continuation-passing mechanism that underlies async/await in C# and promises in JavaScript.
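You can see the continuation explicitly when bridging a callback-based API into a suspend function with suspendCancellableCoroutine. In the sketch below, LegacyClient and its Callback are hypothetical stand-ins for any callback API:
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical callback-based API standing in for Retrofit, Firebase, etc.
interface Callback<T> {
    fun onSuccess(value: T)
    fun onError(error: Throwable)
}

class LegacyClient {
    fun fetchUser(id: String, callback: Callback<String>) = callback.onSuccess("user-$id")
}

// The lambda receives the normally hidden continuation explicitly;
// resuming it is exactly what the runtime does for built-in suspend functions.
suspend fun LegacyClient.fetchUserSuspend(id: String): String =
    suspendCancellableCoroutine { continuation ->
        fetchUser(id, object : Callback<String> {
            override fun onSuccess(value: String) = continuation.resume(value)
            override fun onError(error: Throwable) = continuation.resumeWithException(error)
        })
    }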
Dispatchers: Who Decides What Thread You Run On?
Dispatchers are thread-management strategies that decide how work is scheduled.
Common ones:
| Dispatcher | Backed By | Typical Use |
|---|---|---|
| Dispatchers.Default | Shared pool sized to the CPU core count (minimum 2 threads) | CPU-bound tasks |
| Dispatchers.IO | Larger shared pool, limited to 64 threads by default (or the core count, if higher) | I/O-heavy tasks |
| Dispatchers.Main | UI thread (Android) | UI work |
| newSingleThreadContext | A dedicated thread | Rare cases, singleton logic |
The dispatcher determines where the coroutine resumes, not where it suspends.
This is important:
withContext(Dispatchers.IO) {
    delay(1000) // suspends - thread is released
} // resumes on IO dispatcher thread
During suspension, the thread is free for other work.
When the coroutine resumes, the dispatcher picks a thread to continue execution.
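A quick sketch that prints the thread name before and after a suspension makes this visible; the resumed half may run on a different worker thread than the first half:
import kotlinx.coroutines.*

fun main() = runBlocking {
    withContext(Dispatchers.Default) {
        println("before delay on ${Thread.currentThread().name}")
        delay(100) // the thread goes back to the pool during the suspension
        println("after delay on ${Thread.currentThread().name}") // possibly a different worker
    }
}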
Coroutine Scheduler (Work Stealing & Dispatch Queues)
Kotlin’s coroutine scheduler is inspired by modern task schedulers:
- Each worker thread has its own local queue
- There is also a global queue for extra tasks
- If a worker runs out of tasks, it steals work from other queues
This ensures:
- High throughput
- Fair distribution
- Minimal lock contention
- Efficient context switching, since switching between coroutines is in-memory, not OS-level
This is why coroutines scale better than classic thread-per-task architecture.
Suspension: What Really Happens at a Suspend Point?
Take this example:
suspend fun demo() {
    println("A")
    delay(1000)
    println("B")
}
Execution flow under the hood:
- Coroutine prints "A" on the current thread.
- delay(1000) schedules a timer-based continuation.
- Coroutine returns control immediately → thread becomes free.
- After 1000ms, the scheduler picks a thread (maybe the same one, maybe not).
- The coroutine is resumed from the exact point after delay.
- "B" is printed.
The coroutine looks like synchronous code but executes asynchronously without blocking.
Structured Concurrency Engine
Kotlin adds another layer on top of raw coroutines:
structured concurrency, which ensures:
- Every child coroutine belongs to a parent
- Cancelling the parent cancels all children
- Failures propagate in a consistent way
Example:
coroutineScope {
    launch { /* child #1 */ }
    launch { /* child #2 */ }
}
Behind the scenes, Kotlin maintains a job hierarchy:
Parent Job
├── Child Job #1
└── Child Job #2
This prevents “stray” coroutines running in the background (a common source of bugs with unawaited Node.js promises and Java ExecutorService tasks).
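A minimal sketch of the hierarchy in action: cancelling the parent job cancels both children before they ever complete.
import kotlinx.coroutines.*

fun main() = runBlocking {
    val parent = launch {
        launch { delay(5_000); println("child 1 done") } // never prints
        launch { delay(5_000); println("child 2 done") } // never prints
    }
    delay(100)
    parent.cancelAndJoin() // cancellation propagates down the job tree
    println("parent cancelled, no stray children remain")
}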
Bringing It All Together: Full Internal Flow
Let’s combine all components in a simplified flow:
launch { ... }
↓
Coroutine builder creates Continuation + Job
↓
Dispatch to appropriate thread pool (Dispatcher)
↓
Execute until first suspension point
↓
Return thread to pool, store continuation state
↓
External event completes (network, timer, etc.)
↓
Scheduler resumes continuation on a chosen thread
↓
Execute next portion of state machine
↓
Repeat until completion or cancellation
This entire workflow is why coroutines provide:
- Synchronous-looking code
- Non-blocking behavior
- Massive scalability
- Safe structured concurrency
Structured Concurrency: The Feature That Prevents Leaks
One of the most powerful — yet often overlooked — advantages of Kotlin Coroutines is structured concurrency.
Mid-level developers typically understand coroutine builders like launch or async, but many don’t fully grasp why coroutine hierarchies exist or how they prevent resource leaks, runaway tasks, or subtle lifecycle bugs.
If you’re coming from Java threads, Node.js promises, or Android AsyncTask, structured concurrency is one of the biggest mental shifts you need to adopt.
The Problem: Unstructured Concurrency Leads to “Dangling” Tasks
In traditional async APIs, it’s easy to start background work that:
- Outlives the caller
- Never gets cancelled
- Continues running even when the user navigates away
- Leaks memory, network connections, file handles, etc.
Example of unstructured concurrency in Java:
new Thread(() -> {
    doNetworkCall(); // Keeps running even if caller no longer cares
}).start();
Or in JavaScript:
fetch("/data").then(() => console.log("done"));
Nothing ties the background task to the scope of the caller.
If the caller disappears, the background work just keeps going.
This is how leaks happen.
Kotlin’s Solution: Coroutines Must Live Inside a Scope
Kotlin introduces a strict rule:
Every coroutine must belong to a well-defined parent scope.
This guarantees:
- A coroutine can’t become a “zombie” task.
- When the parent is cancelled, all child tasks are cancelled.
- When the parent finishes, all children must complete or be cancelled before continuing.
- You always know exactly which coroutines are running and why.
This behavior is enforced by the CoroutineScope + Job hierarchy.
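Outside Android you can create the same lifecycle boundary yourself by giving a component its own scope and cancelling it on shutdown. A sketch (SyncService is a hypothetical component, not a library API):
import kotlinx.coroutines.*

class SyncService {
    // SupervisorJob keeps one failed task from tearing down the whole scope.
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    fun start() {
        scope.launch {
            while (isActive) {
                // periodic sync work would go here
                delay(60_000)
            }
        }
    }

    fun shutdown() {
        scope.cancel() // cancels every coroutine launched in this scope
    }
}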
Coroutine Scope Creates a Lifecycle Boundary
coroutineScope {
    launch { task1() }
    launch { task2() }
}
coroutineScope {} ensures:
- It does not complete until all children complete.
- If any child fails → all siblings are cancelled.
- If the parent is cancelled → all children are cancelled immediately.
This is similar to structured blocks in synchronous code:
{
    statementA()
    statementB()
}
The key idea:
Structured concurrency makes async code behave like structured synchronous code.
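It also means failures behave like exceptions in a structured block: if one child throws, its siblings are cancelled and the failure surfaces where the scope was created. A small sketch:
import kotlinx.coroutines.*

fun main() = runBlocking {
    try {
        coroutineScope {
            launch { delay(1_000); println("never printed") } // cancelled when the sibling fails
            launch { error("child failed") }                  // failure cancels the whole scope
        }
    } catch (e: IllegalStateException) {
        println("scope failed with '${e.message}', all siblings were cancelled")
    }
}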
Why This Prevents Leaks
Leak Example WITHOUT structured concurrency (wrong)
fun loadData() {
    GlobalScope.launch { // ⚠️ unstructured, dangerous
        fetchData()
    }
}
Problems:
- GlobalScope lives forever → the coroutine outlives the screen, the ViewModel, or even the app process.
- If the user leaves the screen → the coroutine keeps running.
- If fetchData() ties up memory or I/O → leak risk.
The Correct Version WITH structured concurrency
class MyViewModel : ViewModel() {
    fun loadData() {
        viewModelScope.launch { // 👈 child of ViewModel lifecycle
            fetchData()
        }
    }
}
Now:
- When the ViewModel is cleared → all coroutines inside viewModelScope are cancelled.
- No background work leaks beyond the owner.
- Cancellation propagates automatically.
Cancellation Propagation — The Real Mechanism Behind Leak Prevention
Every coroutine has a Job.
Jobs form a tree:
Parent Job
├── Child Job A
└── Child Job B
If the parent job is cancelled:
- It sends a cancellation signal to all children.
- Children stop at the next suspension point (delay, await, I/O, etc.).
- Resources are released predictably.
This is automatic — you don’t need to manually track children.
Supervisor Jobs: What if One Child Fails?
By default:
Failure of one child cancels the entire scope.
But sometimes you want sibling coroutines to operate independently.
Example:
supervisorScope {
    launch { loadUser() } // fails → does NOT cancel others
    launch { loadSettings() }
}
This is still structured concurrency, but with controlled error isolation.
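A runnable sketch of that isolation: the failing child needs its own CoroutineExceptionHandler, because its failure is no longer propagated to the parent, while the sibling completes normally.
import kotlinx.coroutines.*

fun main() = runBlocking {
    val handler = CoroutineExceptionHandler { _, e -> println("loadUser failed: ${e.message}") }
    supervisorScope {
        launch(handler) { error("network error") }        // fails alone, handled by its own handler
        launch { delay(100); println("settings loaded") } // unaffected sibling
    }
    println("supervisorScope completed despite the failure")
}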
Real-World Example: Preventing Leaks in an HTTP Request
Imagine an Android screen that loads data when opened.
Problematic version:
fun onCreate() {
    GlobalScope.launch {
        api.load() // may continue even after user leaves screen!
    }
}
Correct version:
class ScreenViewModel : ViewModel() {
    fun load() {
        viewModelScope.launch {
            val data = api.load()
            _uiState.value = data
        }
    }
}
Now:
- If the user leaves → ViewModel is destroyed → coroutine is cancelled.
- No background I/O continues unnecessarily.
- No wasted CPU or memory.
- No “dangling network calls.”
Structured Concurrency Makes Async Code Predictable
Let’s summarize how it prevents leaks:
| Behavior | Without Structured Concurrency | With Structured Concurrency |
|---|---|---|
| Ownership of tasks | No owner → “fire and forget” | Every coroutine tied to a parent scope |
| Cancellation | Manual and error-prone | Automatic propagation |
| Memory leaks | Very likely | Almost impossible |
| Debugging | Hard to know what’s still running | Full coroutine tree visibility |
| Lifecycle management | Manual | Scope-driven |
| Failure handling | Inconsistent | Deterministic and hierarchical |
The Golden Rule to Avoid Coroutine Leaks
Never launch a coroutine without a structured scope unless you truly want a top-level background worker.
GlobalScope.launch should almost never appear in app/business code.
If you respect that rule, leaks basically disappear.
When to Use Which Coroutine Builder
Kotlin provides several coroutine builders—launch, async, runBlocking, withContext, produce, etc.—and although they may look similar at a glance, they serve fundamentally different purposes.
Mid-level developers often misuse these builders, leading to unnecessary blocking, improper error handling, or excessive coroutine creation.
This section clarifies when and why you should use each builder, with practical examples and decision-making rules.
launch — Fire-and-Forget, Returns a Job
Use launch when:
- You don’t need a return value
- You want to start a child coroutine that runs concurrently
- You want the coroutine to participate in structured concurrency (e.g., viewModelScope)
- Cancellation propagation is important
Common use cases
- Updating UI after background work in Android
- Running periodic or background tasks tied to a scope
- Parallel tasks that don’t return data
Example
viewModelScope.launch {
    val user = repository.loadUser()
    _uiState.value = user
}
When NOT to use launch
- When you need a result → Use async
- When calling from regular (non-suspend) code → Do NOT use GlobalScope.launch unless absolutely necessary
async — Concurrent Work That Produces a Value
async returns a Deferred<T>, similar to a future or promise.
Use async when:
- You want to compute a value asynchronously
- You want to run computations in parallel
- You’ll eventually call .await() to retrieve the result
- The result should propagate exceptions correctly
Example: parallel network calls
val userDeferred = async { api.loadUser() }
val settingsDeferred = async { api.loadSettings() }
val user = userDeferred.await()
val settings = settingsDeferred.await()
Important note
async starts the coroutine immediately. If you want lazy behavior:
val deferred = async(start = CoroutineStart.LAZY) { expensiveTask() }
deferred.await() // starts only when awaited
When NOT to use async
- If the result is not needed → Use launch
- If tasks are not inherently parallel
- As a replacement for structured concurrency (common anti-pattern)
runBlocking — Bridge Between Blocking and Suspending Worlds
runBlocking blocks the current thread until its body completes.
This is fundamentally different from all other coroutine builders.
Use runBlocking ONLY when:
- You need to call suspend code from test code
- You need to write a main() entry point for a console application
- You’re integrating with libraries that expect blocking APIs
Example: main function
fun main() = runBlocking {
    val data = fetchData()
    println(data)
}
When NOT to use runBlocking
- Never use it in Android UI code
- Never use it inside coroutines
- Never use it to “wait for something” in production logic
It is a controlled escape hatch, not a concurrency tool.
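In tests, runBlocking is the simplest bridge (kotlinx-coroutines-test’s runTest is usually a better fit once delays are involved, but the shape is the same). A minimal sketch using a hypothetical suspend function under test:
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlin.test.Test
import kotlin.test.assertEquals

class PriceServiceTest {
    // Hypothetical suspend function standing in for the real code under test.
    private suspend fun discountedPrice(base: Int): Int {
        delay(10)
        return base * 90 / 100
    }

    @Test
    fun appliesTenPercentDiscount() = runBlocking {
        assertEquals(90, discountedPrice(100))
    }
}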
withContext — Switch Contexts Predictably and Safely
Unlike launch and async, withContext does not create a new coroutine.
It simply suspends the current coroutine, switches its dispatcher, and resumes after completion.
Use withContext when:
- You need to switch threads (e.g., IO → main)
- You want sequential logic with a thread-change baked in
- You want a safe way to run blocking or CPU-heavy work
Example: switching to IO and then back to Main
val user = withContext(Dispatchers.IO) {
    api.loadUser() // heavy I/O
}
withContext(Dispatchers.Main) {
    render(user)
}
When NOT to use withContext
- When you need parallelism → Use async
- For fire-and-forget tasks → Use launch
coroutineScope — Builder That Enforces Structured Concurrency
coroutineScope is not a coroutine launcher. It creates a new scope where:
- All children must complete before returning
- Child failures cancel siblings
- No new thread is blocked
Use coroutineScope when:
- You want to create a structured block that launches multiple child coroutines
- You want lifecycle-like behavior inside a suspend function
Example
suspend fun loadAll() = coroutineScope {
    val user = async { api.loadUser() }
    val settings = async { api.loadSettings() }
    UserData(user.await(), settings.await())
}
When NOT to use coroutineScope
- When you need blocking behavior → Use runBlocking
- When you need context switching → Use withContext
Quick Comparison Table
| Builder | Returns | Creates new coroutine? | Blocks thread? | Common use |
|---|---|---|---|---|
| launch | Job | Yes | No | Fire-and-forget tasks |
| async | Deferred<T> | Yes | No | Parallel computations with result |
| runBlocking | T | Yes, and blocks the caller | Yes | Tests, main(), bridging |
| withContext | T | No (same coroutine) | No | Change dispatcher / thread |
| coroutineScope | T | Only for children | No | Structured concurrency inside suspend functions |
Choosing the Right Builder (Decision Flow)
Here is the mental model you should follow:
1️⃣ Do you need to block a thread?
→ Use runBlocking.
2️⃣ Do you need to return a value asynchronously?
→ Use async + await.
3️⃣ Do you need fire-and-forget logic?
→ Use launch.
4️⃣ Do you need to switch threads inside existing coroutine?
→ Use withContext.
5️⃣ Do you need to start multiple child coroutines and wait for all?
→ Use coroutineScope.
6️⃣ Do you want long-lived tasks tied to a lifecycle (ViewModel/Activity)?
→ Use scope-provided builders (viewModelScope.launch, lifecycleScope.launch).
Conclusion: Coroutines Are Now Table Stakes
As a mid-level Kotlin developer in 2025, you should:
- Never use GlobalScope
- Always use structured concurrency (viewModelScope, lifecycleScope, custom scopes)
- Prefer suspend functions over callbacks
- Use async/await for parallel work, not threads
- Use withTimeout, supervisorScope, and proper exception handling
- Understand that suspension is cheap — don’t be afraid to launch many coroutines
Mastering coroutines isn’t about memorizing APIs — it’s about understanding suspension, structured concurrency, and dispatchers. Once you internalize these concepts, writing clean, scalable, and safe asynchronous code becomes natural.
Happy suspending! 🚀
