In mobile system design interviews, "Offline Support" is often a dedicated section or a major non-functional requirement. Unlike web apps, mobile apps must assume the network is unreliable.
Your Goal: Demonstrate how to architect an app that works seamlessly without internet, syncs efficiently when connectivity returns, and handles data conflicts gracefully.
The most critical architectural decision is defining the "Single Source of Truth" (SSOT).
Many candidates design apps that fetch data from the network and display it directly in the UI.
- Problem: If the network fails, the screen is empty. If the user navigates away and back, they wait for a loader again.
- Result: Poor UX and high data usage.
The UI only observes the Local Database.
- Read: UI subscribes to the Local DB (e.g., Room `Flow`, Core Data `NSFetchedResultsController`).
- Write: User actions update the Local DB immediately.
- Sync: A background "Sync Engine" synchronizes the Local DB with the Remote API.
The Signal: This decouples the UI from the Network. The app feels instant because it's reading from local disk, regardless of network latency.
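The read/write/sync split above can be sketched in a few lines. This is a minimal, platform-neutral illustration rather than a real Room or Core Data setup: `LocalStore`, `NoteRepository`, and their methods are hypothetical names, and an in-memory dictionary stands in for the on-disk database.

```python
from typing import Callable, Dict, List

Listener = Callable[[Dict[str, str]], None]


class LocalStore:
    """In-memory stand-in for the local database (hypothetical)."""

    def __init__(self) -> None:
        self._notes: Dict[str, str] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        # Emit the current state immediately, then again on every change.
        self._listeners.append(listener)
        listener(dict(self._notes))

    def upsert(self, note_id: str, title: str) -> None:
        self._notes[note_id] = title
        for listener in self._listeners:
            listener(dict(self._notes))


class NoteRepository:
    """The only entry point for the UI and for the sync engine."""

    def __init__(self, store: LocalStore) -> None:
        self._store = store

    def observe(self, listener: Listener) -> None:
        self._store.subscribe(listener)

    def rename(self, note_id: str, title: str) -> None:
        # User writes go to the local store first; the UI updates from there.
        self._store.upsert(note_id, title)

    def apply_remote(self, remote_notes: Dict[str, str]) -> None:
        # Called by the sync engine; the UI never reads the network directly.
        for note_id, title in remote_notes.items():
            self._store.upsert(note_id, title)


rendered: List[Dict[str, str]] = []        # what the "UI" has drawn so far
repo = NoteRepository(LocalStore())
repo.observe(rendered.append)
repo.rename("n1", "Groceries")             # local write, no network involved
repo.apply_remote({"n2": "Travel plans"})  # result of a background sync
```

The UI callback fires for local writes and for synced data alike, so there is a single rendering path whether or not the device is online.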
How do you keep the Local DB and Remote Server in sync?
- Full Sync: Download the entire dataset every time.
- Pros: Simple to implement. Guaranteed consistency.
- Cons: High bandwidth, slow, battery drain. Only acceptable for tiny datasets (e.g., User Settings).
- Delta Sync (Incremental Sync): Download only what changed since the last sync.
- Mechanism: The client sends a sync marker to the server. The server returns only records modified after that point.
- Option A: `last_synced_timestamp`
- Flow:
- Client fetches all data. Server returns current server time (e.g., `2025-12-19T10:00:00Z`).
- Client saves this timestamp locally.
- Next sync, client requests: `GET /sync?since=2025-12-19T10:00:00Z`.
- Server queries DB for `updated_at > since`.
- Risk: Vulnerable to Clock Skew. If server instances have different times, or an update happens in the exact same millisecond as the sync, data might be missed.
- Option B: `sync_token` (The "Opaque Cursor") - Recommended
- Definition: A string or number generated by the server (e.g., `v2_seq_98765`) that acts as a bookmark. The client stores it blindly without interpreting it.
- Flow:
- Response: Server returns data + token: `{"data": [...], "sync_token": "v2_seq_98765"}`.
- Request: Next sync, client sends it back: `GET /sync?token=v2_seq_98765`.
- Server Logic: Server decodes the token (e.g., maps it to a Global Sequence ID) and returns newer items.
- Benefit: Stateless & Robust. Avoids clock skew entirely by using monotonic sequence IDs. Allows the backend to change versioning logic without breaking the mobile app.
- Pros: Efficient, fast, saves battery.
- Cons: Complex backend logic (requires "Soft Deletes" to sync deletions).
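As a rough sketch of the opaque-cursor variant, the server below stamps every write with a global sequence number and hands the client a token that encodes the last sequence it has seen. The `SyncServer` class and the base64-encoded JSON token layout are assumptions made for illustration; a real backend would persist rows in a database and might also sign the token.

```python
import base64
import json
from typing import Dict, Optional


class SyncServer:
    """Toy server: every write gets the next global sequence number."""

    def __init__(self) -> None:
        self._seq = 0
        self._rows: Dict[str, dict] = {}  # id -> {"id", "title", "seq"}

    def write(self, row_id: str, title: str) -> None:
        self._seq += 1
        self._rows[row_id] = {"id": row_id, "title": title, "seq": self._seq}

    @staticmethod
    def _encode(seq: int) -> str:
        payload = json.dumps({"v": 1, "seq": seq}).encode()
        return base64.urlsafe_b64encode(payload).decode()

    @staticmethod
    def _decode(token: Optional[str]) -> int:
        if token is None:
            return 0  # first sync: return everything
        return json.loads(base64.urlsafe_b64decode(token.encode()))["seq"]

    def sync(self, token: Optional[str] = None) -> dict:
        since = self._decode(token)
        upto = self._seq  # snapshot before reading rows
        changed = sorted(
            (r for r in self._rows.values() if since < r["seq"] <= upto),
            key=lambda r: r["seq"],
        )
        return {"data": changed, "sync_token": self._encode(upto)}


server = SyncServer()
server.write("a", "Alpha")
server.write("b", "Beta")
first = server.sync()                        # initial full download
server.write("a", "Alpha v2")
second = server.sync(first["sync_token"])    # only the changed row
third = server.sync(second["sync_token"])    # nothing new
```

The client only stores and echoes `sync_token`, so the server is free to change what the token encodes without a client release.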
- Pull (Down-sync): Fetching updates from the server to the device.
- Push (Up-sync): Sending local changes (pending writes) to the server.
- The Concept: Instead of syncing the current state (e.g., "Note Title is 'Groceries'"), you sync the list of changes (e.g., "User A changed title to 'Groceries'").
- How it works:
- The server keeps an append-only log of all mutations.
- The client says "Give me all operations starting from Offset 100."
- The client "replays" these actions locally.
- Pros: Preserves intent (e.g., distinguishing between "User set value to 0" and "User decremented value"). Easier to resolve conflicts.
- Cons: If the client is very old (Offset 0), replaying the entire history is slow. Requires "Snapshots" to fix.
- Read More: Martin Fowler on Event Sourcing
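A minimal sketch of the offset-based replay described above. `OperationLog`, the `(kind, key, amount)` operation shape, and the `stock` counter are hypothetical names chosen to show why "set" and "increment" carry different intent.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, int]  # (kind, key, amount); kind is "set" or "increment"


class OperationLog:
    """Append-only server-side log of mutations."""

    def __init__(self) -> None:
        self._ops: List[Op] = []

    def append(self, op: Op) -> None:
        self._ops.append(op)

    def read_from(self, offset: int) -> Tuple[List[Op], int]:
        # Returns the operations at or after `offset` and the next offset.
        return self._ops[offset:], len(self._ops)


def replay(state: Dict[str, int], ops: List[Op]) -> None:
    """Apply operations in order to the client's local state."""
    for kind, key, amount in ops:
        if kind == "set":
            state[key] = amount
        elif kind == "increment":
            state[key] = state.get(key, 0) + amount
        else:
            raise ValueError(f"unknown operation kind: {kind}")


log = OperationLog()
log.append(("set", "stock", 10))
log.append(("increment", "stock", -3))

client_state: Dict[str, int] = {}
ops, offset = log.read_from(0)       # brand-new client starts at offset 0
replay(client_state, ops)

log.append(("increment", "stock", -2))
ops, offset = log.read_from(offset)  # later sync fetches only the tail
replay(client_state, ops)
```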
- The Concept: Instead of tracking when data changed, you track the signature of the data.
- How it works:
- Both the Client and Server organize their data records into a tree structure.
- Each leaf node is a hash of a data record. Each parent node is a hash of its children.
- The Sync Protocol: The client sends the "Root Hash" of its tree to the server.
- Comparison: If `Client.RootHash == Server.RootHash`, they are perfectly in sync (0 bytes transferred). If different, the server compares children hashes to traverse down and pinpoint specifically which record is out of sync.
- Use Case: Blockchain, Git, Cassandra, and complex file syncing (like Dropbox).
- Pros: Extremely bandwidth-efficient for verifying consistency of large datasets.
- Cons: High computational cost (hashing) and complexity to maintain the tree.
- Read More: Merkle Tree (Wikipedia)
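The comparison step can be sketched as follows. This is a simplified illustration that assumes both sides hold the same set of record IDs (so the two trees have the same shape); handling inserted or removed records needs a more careful layout. The function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def _h(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()


def build_tree(records: Dict[str, str]) -> List[List[str]]:
    """Return hash levels: levels[0] holds the leaves, levels[-1] the root."""
    level = [_h(f"{key}={records[key]}") for key in sorted(records)]
    levels = [level]
    while len(level) > 1:
        # Hash nodes in pairs; an unpaired last node is hashed on its own.
        level = [_h("".join(level[i:i + 2])) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff_leaves(a: List[List[str]], b: List[List[str]]) -> List[int]:
    """Leaf indices that differ, descending only into mismatched subtrees."""
    if a[-1] == b[-1]:
        return []  # root hashes match: nothing to transfer
    mismatched = [0]  # node indices at the current level, starting at the root
    for depth in range(len(a) - 2, -1, -1):
        next_level = []
        for parent in mismatched:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_level.append(child)
        mismatched = next_level
    return mismatched


client = {"n1": "a", "n2": "b", "n3": "c", "n4": "d", "n5": "e"}
server = dict(client, n4="D")  # one record changed on the server

in_sync = diff_leaves(build_tree(client), build_tree(client))
stale = diff_leaves(build_tree(client), build_tree(server))
stale_ids = [sorted(client)[i] for i in stale]
```

Only the hashes along the path to the changed record are compared, which is where the bandwidth saving comes from.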
- The Concept: Instead of a single "Server Time", we track logical counters for every actor (device) that modifies data.
- How it works:
- State is tracked as `[DeviceA: 5, DeviceB: 3, Server: 10]`.
- This allows the system to distinguish between "Device A hasn't seen Device B's update" vs "Device A overwrote Device B's update."
- Use Case: Peer-to-Peer systems or truly distributed offline-first apps where devices might sync directly with each other (rare in typical mobile interviews, but good for "Signal").
- Read More: Vector Clocks (Wikipedia)
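A small sketch of how two vector clocks are compared. The `compare` and `merge` helpers are hypothetical names; the rule itself is the standard one: a clock is "before" another if every counter is less than or equal and at least one is strictly less, and two clocks are concurrent when each is ahead on some actor.

```python
from typing import Dict

VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for a relative to b."""
    actors = set(a) | set(b)
    a_ahead = any(a.get(x, 0) > b.get(x, 0) for x in actors)
    b_ahead = any(b.get(x, 0) > a.get(x, 0) for x in actors)
    if a_ahead and b_ahead:
        return "concurrent"  # neither has seen the other's update: a real conflict
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock after both histories are known: per-actor maximum."""
    return {x: max(a.get(x, 0), b.get(x, 0)) for x in set(a) | set(b)}


behind = compare({"DeviceA": 5, "DeviceB": 3}, {"DeviceA": 5, "DeviceB": 4})
conflict = compare({"DeviceA": 6, "DeviceB": 3}, {"DeviceA": 5, "DeviceB": 4})
merged = merge({"DeviceA": 6, "DeviceB": 3}, {"DeviceA": 5, "DeviceB": 4})
```

Here `behind` is the "hasn't seen the update yet" case, which can be fast-forwarded, while `conflict` is the case that needs a resolution strategy.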
- The Concept: A simplified "Merkle Tree" for a quick sanity check.
- How it works:
- Client calculates a single hash of its entire dataset (e.g., `md5(all_ids + timestamps)`).
- Client sends this hash to the server.
- Server compares it with its own calculation. If match -> Done. If mismatch -> Trigger full sync or standard delta sync.
- Pros: Very easy to implement. Great for verifying consistency after a series of complex delta syncs.
- Cons: Requires hashing the entire dataset, which can be slow for large databases.
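One way to compute such a checksum is sketched below. It uses SHA-256 instead of the MD5 mentioned above (either works for a consistency check), and `dataset_checksum` is a hypothetical helper. Sorting the rows first matters: both sides must hash the records in the same order.

```python
import hashlib
from typing import Iterable, Tuple


def dataset_checksum(rows: Iterable[Tuple[str, str]]) -> str:
    """Hash (id, updated_at) pairs in a canonical order."""
    digest = hashlib.sha256()
    for row_id, updated_at in sorted(rows):
        digest.update(f"{row_id}|{updated_at}\n".encode())
    return digest.hexdigest()


client_rows = [("n2", "2025-12-19T10:05:00Z"), ("n1", "2025-12-19T10:00:00Z")]
server_rows = [("n1", "2025-12-19T10:00:00Z"), ("n2", "2025-12-19T10:05:00Z")]
drifted_rows = [("n1", "2025-12-19T10:00:00Z"), ("n2", "2025-12-19T11:00:00Z")]

matches = dataset_checksum(client_rows) == dataset_checksum(server_rows)
needs_resync = dataset_checksum(client_rows) != dataset_checksum(drifted_rows)
```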
You cannot physically delete a row on the server in a Delta Sync system, because the client won't know it's gone.
- Solution: Use an `is_deleted` (tombstone) column.
- Flow:
- Server marks item as `is_deleted = true`.
- Client requests changes since `T`.
- Server sends the "deleted" item.
- Client sees the flag and removes it from the Local DB (or keeps it hidden if undo is allowed).
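On the client, applying a delta that contains tombstones might look like the sketch below. The row shape (`id`, `title`, `is_deleted`) and the `apply_changes` helper are assumptions for illustration; a dictionary stands in for the local table.

```python
from typing import Dict, List


def apply_changes(local: Dict[str, dict], changes: List[dict]) -> None:
    """Apply one delta-sync response to the local table in place."""
    for row in changes:
        if row.get("is_deleted"):
            # Tombstone: drop the local copy (a no-op if it was never synced).
            local.pop(row["id"], None)
        else:
            local[row["id"]] = row


local_db = {
    "n1": {"id": "n1", "title": "Groceries"},
    "n2": {"id": "n2", "title": "Travel plans"},
}
delta = [
    {"id": "n1", "title": "Groceries (updated)"},
    {"id": "n2", "is_deleted": True},
    {"id": "n9", "is_deleted": True},  # deleted before this device ever saw it
]
apply_changes(local_db, delta)
```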
When a user performs an action (e.g., "Like Tweet") while offline:
- Optimistic Update: Immediately update the UI to show the "Like" state (red heart).
- Persist Action: Store the action in a Persistent Queue (not just memory).
- Why Persistent? If the app is killed before the network returns, the action must not be lost.
- Background Sync: When the network returns (via `WorkManager` on Android or `BackgroundTasks` on iOS), process the queue.
- Success: Remove item from queue.
- Failure (Transient): Retry with Exponential Backoff.
- Failure (Permanent): Remove from queue and notify user (e.g., "Could not like tweet"). Revert the Optimistic Update.
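The queue handling above can be sketched with SQLite as the persistent store. `PendingActionQueue`, `TransientError`, and `PermanentError` are hypothetical names; the example uses an in-memory database so it runs anywhere, whereas a real app would pass a file path so that pending actions survive process death. Scheduling the retry itself is left to the platform job scheduler.

```python
import sqlite3
from typing import Callable, List


class TransientError(Exception):
    """Timeouts, 5xx responses: worth retrying later."""


class PermanentError(Exception):
    """4xx-style rejections: retrying will not help."""


class PendingActionQueue:
    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS pending_actions ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "action TEXT NOT NULL, "
            "attempts INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.commit()

    def enqueue(self, action: str) -> None:
        self._db.execute("INSERT INTO pending_actions (action) VALUES (?)", (action,))
        self._db.commit()

    def pending(self) -> List[str]:
        rows = self._db.execute("SELECT action FROM pending_actions ORDER BY id")
        return [row[0] for row in rows]

    def flush(self, send: Callable[[str], None],
              on_rejected: Callable[[str], None]) -> None:
        rows = self._db.execute(
            "SELECT id, action FROM pending_actions ORDER BY id").fetchall()
        for row_id, action in rows:
            try:
                send(action)
            except TransientError:
                self._db.execute(
                    "UPDATE pending_actions SET attempts = attempts + 1 WHERE id = ?",
                    (row_id,))
                self._db.commit()
                break  # stop here to preserve ordering; retry on the next run
            except PermanentError:
                on_rejected(action)  # caller reverts the optimistic update
            # Reached on success and on permanent failure: the item is done.
            self._db.execute("DELETE FROM pending_actions WHERE id = ?", (row_id,))
            self._db.commit()


queue = PendingActionQueue()
for tweet_id in ("1", "2", "3"):
    queue.enqueue(f"like:{tweet_id}")

sent: List[str] = []
rejected: List[str] = []


def fake_send(action: str) -> None:
    if action == "like:2":
        raise PermanentError("tweet was deleted")
    sent.append(action)


queue.flush(fake_send, rejected.append)
```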
This is the hardest part of offline architecture. What happens if the user edits a note offline, but someone else edits the same note on the server?
- Logic: The system looks at the timestamp. The most recent update overwrites the other.
- Pros: Easy to implement.
- Cons: Data loss. (If I edit offline at 10:00, and you edit online at 10:05, my changes are wiped out when I eventually sync).
- Logic: The server's version is the truth. If the client tries to upload a stale version, the server rejects it (HTTP 409 Conflict).
- Client Handling: The client must download the new server version and ask the user what to do.
- Pros: Safe, prevents silent data loss.
- Cons: Annoying UX ("Conflict detected, please resolve").
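A minimal sketch of the stale-version check, assuming each note carries an integer version that the client sends back with its edit. `NoteServer` and `ConflictError` are hypothetical; the exception stands in for an HTTP 409 response that includes the current server copy.

```python
from typing import Dict


class ConflictError(Exception):
    """Stands in for an HTTP 409 Conflict carrying the server's current copy."""

    def __init__(self, current: dict) -> None:
        super().__init__("stale version")
        self.current = current


class NoteServer:
    def __init__(self) -> None:
        self._notes: Dict[str, dict] = {}  # id -> {"version", "body"}

    def put(self, note_id: str, body: str, base_version: int) -> int:
        current = self._notes.get(note_id, {"version": 0, "body": ""})
        if base_version != current["version"]:
            # The client edited an older copy than the one the server holds.
            raise ConflictError(current)
        new_version = current["version"] + 1
        self._notes[note_id] = {"version": new_version, "body": body}
        return new_version


server = NoteServer()
v1 = server.put("n1", "draft", base_version=0)
v2 = server.put("n1", "online edit", base_version=v1)  # someone else edits

try:
    # The offline client still believes v1 is the latest version.
    server.put("n1", "offline edit", base_version=v1)
    outcome = "accepted"
except ConflictError as conflict:
    outcome = f"409: server has v{conflict.current['version']}"
    server_copy = conflict.current["body"]
```

The client can then show `server_copy` next to the local edit and let the user choose.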
- Logic: Merge non-conflicting fields automatically.
- Example: User A updates `Title`. User B updates `Description`. Both changes are kept.
- Pros: Reduces conflicts significantly.
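Field-level merging needs a common ancestor to tell which side changed what. The sketch below assumes the client kept the `base` copy it originally downloaded and that all three copies have the same fields; `merge_fields` is a hypothetical helper, and fields changed differently on both sides are reported as conflicts with the server value kept.

```python
from typing import Dict, List, Tuple


def merge_fields(base: Dict[str, str], local: Dict[str, str],
                 remote: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Three-way merge; returns (merged record, conflicting field names)."""
    merged = dict(remote)
    conflicts: List[str] = []
    for field in base:
        local_changed = local[field] != base[field]
        remote_changed = remote[field] != base[field]
        if local_changed and not remote_changed:
            merged[field] = local[field]
        elif local_changed and remote_changed and local[field] != remote[field]:
            conflicts.append(field)  # both edited the same field differently
    return merged, sorted(conflicts)


base = {"Title": "Shopping", "Description": "Milk"}
user_a = {"Title": "Groceries", "Description": "Milk"}       # edited offline
user_b = {"Title": "Shopping", "Description": "Milk, eggs"}  # already on server

merged, conflicts = merge_fields(base, user_a, user_b)
clash, clash_fields = merge_fields(base, user_a, {"Title": "Food", "Description": "Milk"})
```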
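- Logic: Specialized data structures whose merge operation is mathematically guaranteed to converge: replicas that have received the same updates end up in the same state, regardless of the order in which the updates arrived.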
- Use Case: Collaborative text editors (Google Docs), counters.
- The Signal: Mentioning CRDTs shows deep theoretical knowledge, but acknowledge they are complex to implement from scratch.
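To make the idea concrete, here is a grow-only counter (G-Counter), one of the simplest CRDTs. This is a teaching sketch, not a production library: each replica increments only its own slot, and merging takes the per-replica maximum, which makes the merge commutative, associative, and idempotent.

```python
from typing import Dict, Optional


class GCounter:
    """Grow-only counter CRDT: one slot per replica."""

    def __init__(self, counts: Optional[Dict[str, int]] = None) -> None:
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, replica_id: str, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})

    @property
    def value(self) -> int:
        return sum(self.counts.values())


phone, tablet = GCounter(), GCounter()
phone.increment("phone", 2)    # two likes recorded offline on the phone
tablet.increment("tablet", 3)  # three likes recorded offline on the tablet

merged_ab = phone.merge(tablet)
merged_ba = tablet.merge(phone)
merged_twice = merged_ab.merge(tablet)  # re-delivering an update is harmless
```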
- Database:
- Android: Room (SQLite wrapper). Strongly typed, observable.
- iOS: Core Data (Object Graph) or SwiftData.
- Cross-Platform: Realm (NoSQL, easy sync), SQLite (Raw).
- Job Schedulers:
- Android: `WorkManager`. The gold standard. Handles constraints (e.g., "Run only when on WiFi and Charging").
- iOS: `BGAppRefreshTask` / `BGProcessingTask`. Stricter limitations on execution time.
- Trello: (Engineering Blog)
- Strategy: A complex "Command Queue" system that replays user actions.
- Signal: Demonstrates how to handle "optimistic UI" with potentially thousands of offline edits.
- Linear: (Engineering Blog)
- Strategy: "Sync Engine" that treats the local database as a cache of the entire dataset.
- Signal: Shows how high-performance apps prioritize local-first reads for speed.
- CouchDB / PouchDB: (Website)
- Strategy: Replication protocols built into the database layer.
- Signal: Understanding "Replication" vs. "Custom Sync" trade-offs.
- Define SSOT: "I will use the Repository Pattern with a local database as the Single Source of Truth."
- Define Sync Strategy: "I will implement Delta Sync using a `last_updated` cursor to minimize bandwidth."
- Handle Offline Writes: "I will use a persistent operation queue and `WorkManager` to flush changes when connectivity returns."
- Address Conflicts: "For this use case, [Last Write Wins / User Prompt] is appropriate because..."
- Mention UX: "I will use Optimistic Updates to make the app feel responsive."
- "I'll use a boolean `isOffline` flag." -> Bad code smell. Avoid building separate logic paths. Always write to DB, let the Sync Engine handle the rest.
- In-Memory Queues: -> Data loss if the app crashes. Always persist pending actions.
- Infinite Retries: -> Battery drain. Always use Exponential Backoff and jitter.
- Blocking the UI: -> Database and Network operations must happen on background threads.
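For the retry point above, one common way to combine Exponential Backoff with jitter is the "full jitter" variant sketched here: the delay is drawn uniformly between zero and an exponentially growing ceiling. The `backoff_delay` name, the 1-second base, and the 5-minute cap are illustrative assumptions, not values from any particular platform.

```python
import random
from typing import Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    ceiling = min(cap, base * (2 ** attempt))
    # Randomizing spreads out clients that all reconnect at the same moment.
    return rng() * ceiling


# With the random source pinned to 1.0 the ceilings become visible.
ceilings = [backoff_delay(n, rng=lambda: 1.0) for n in (0, 1, 2, 3, 20)]
sample = backoff_delay(5)  # somewhere between 0 and 32 seconds
```

In practice the platform scheduler often provides a backoff policy of its own, so a hand-rolled delay like this is mainly needed for in-process retries.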
- Trello: Sync Architecture - Excellent breakdown of a complex sync engine.
- Linear: Sync Engine - How a high-performance app handles real-time sync.
- Google: Offline-First Guide - Official Android architectural guidance.