Freecloud Insights
When the Cloud Stumbles: SMBs Need a Plan B
20 October 2025
Why this matters
Shared platforms fail. When a hyperscaler or a critical SaaS provider has a bad day, the blast radius hits many companies at once. Even if your core app stays up, the third-party services you rely on (payments, auth, email, storage, analytics) might not. You do not need two of everything. You do need a clear Plan B.
Step 1 - Map real dependencies
- List critical user journeys. Sign up, log in, pay, place order, support, fulfilment.
- Attach dependencies. Note which cloud region, database, queue, SaaS API or DNS provider each journey needs.
- Mark shared risk items. Anything run by your cloud provider or a single SaaS vendor is a shared point of failure.
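The mapping in this step can live in a spreadsheet, but a small script makes the shared risk easy to see. The sketch below is a minimal illustration, not a prescribed tool: the journey names, dependency names and provider names are all hypothetical placeholders. It groups journeys by the provider behind each dependency, so you can see which journeys would be affected if that provider had an outage.

```python
from collections import defaultdict

# Hypothetical journeys, dependencies and providers, for illustration only.
JOURNEY_DEPENDENCIES = {
    "login": ["auth-api", "primary-db", "dns"],
    "pay": ["payments-api", "primary-db", "dns"],
    "place_order": ["primary-db", "order-queue", "dns"],
    "support": ["email-api"],
}

DEPENDENCY_PROVIDER = {
    "auth-api": "auth-vendor",
    "payments-api": "payments-vendor",
    "email-api": "email-vendor",
    "primary-db": "cloud-a",
    "order-queue": "cloud-a",
    "dns": "dns-vendor",
}


def journeys_at_risk(journey_deps, dep_provider):
    """Map each provider to the sorted list of journeys that depend on it."""
    impact = defaultdict(set)
    for journey, deps in journey_deps.items():
        for dep in deps:
            impact[dep_provider[dep]].add(journey)
    return {provider: sorted(journeys) for provider, journeys in impact.items()}


if __name__ == "__main__":
    impact = journeys_at_risk(JOURNEY_DEPENDENCIES, DEPENDENCY_PROVIDER)
    # Print providers with the widest blast radius first.
    for provider, journeys in sorted(impact.items(), key=lambda kv: -len(kv[1])):
        print(f"{provider}: {', '.join(journeys)}")
```

Providers that appear against several journeys are the shared points of failure worth marking first.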
Step 2 - Decide what must stay live
- Tier 0. Safety and money. Payments, patient data, legal obligations. Must stay live or enter a safe mode.
- Tier 1. Core customer actions. Read-only mode is acceptable for a short window.
- Tier 2. Nice to have. Can pause gracefully.
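One way to make the tiers actionable is to record them next to the journeys they apply to. The sketch below assumes a hypothetical set of journeys and expresses the three tiers as a lookup table, so a runbook or an incident tool can print the agreed outage behaviour in priority order. The journey assignments are examples only; yours will differ.

```python
from enum import IntEnum


class Tier(IntEnum):
    TIER_0 = 0  # safety and money
    TIER_1 = 1  # core customer actions
    TIER_2 = 2  # nice to have


# Outage behaviour per tier, taken from the tier definitions above.
OUTAGE_POLICY = {
    Tier.TIER_0: "stay live or enter safe mode",
    Tier.TIER_1: "read-only for a short window",
    Tier.TIER_2: "pause gracefully",
}

# Hypothetical journey-to-tier assignments, for illustration only.
JOURNEY_TIER = {
    "pay": Tier.TIER_0,
    "login": Tier.TIER_1,
    "place_order": Tier.TIER_1,
    "analytics_dashboard": Tier.TIER_2,
}


def outage_plan(journey_tiers):
    """Return (journey, tier name, policy) rows, most critical tier first."""
    ordered = sorted(journey_tiers.items(), key=lambda kv: (kv[1], kv[0]))
    return [(journey, tier.name, OUTAGE_POLICY[tier]) for journey, tier in ordered]


if __name__ == "__main__":
    for journey, tier_name, policy in outage_plan(JOURNEY_TIER):
        print(f"{tier_name}  {journey}: {policy}")
```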
Step 3 - Pick simple contingency patterns
- Same cloud, multi-region. Primary and backup in different regions with DNS or app-level failover.
- Graceful degradation. Read-only mode, queue and catch up, disable non-essential features.
- Alternate channels. Backup email or SMS provider, static status page on a separate host, fallback auth (magic links or backup IdP).
- Hot standby for one thing that matters. A minimal replica of the payment path or order capture, not the entire platform.
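To make the graceful degradation pattern concrete, here is a toy sketch of read-only mode combined with queue and catch up. `OrderService` and its methods are hypothetical names invented for this example. The in-memory list and queue stand in for a real datastore and a durable queue; in practice the queue would need to live somewhere that does not depend on the component that failed.

```python
from collections import deque


class OrderService:
    """Toy order service with a read-only switch and a catch-up queue."""

    def __init__(self):
        self.read_only = False
        self.orders = []        # stands in for the primary datastore
        self.pending = deque()  # writes captured while in read-only mode

    def place_order(self, order):
        """Write straight through when healthy; queue the write when degraded."""
        if self.read_only:
            self.pending.append(order)
            return "queued"
        self.orders.append(order)
        return "accepted"

    def list_orders(self):
        """Reads keep working in read-only mode."""
        return list(self.orders)

    def recover(self):
        """Leave read-only mode and replay queued writes in arrival order."""
        self.read_only = False
        replayed = 0
        while self.pending:
            self.orders.append(self.pending.popleft())
            replayed += 1
        return replayed


if __name__ == "__main__":
    svc = OrderService()
    print(svc.place_order({"id": 1}))
    svc.read_only = True  # the decision owner declares read-only mode
    print(svc.place_order({"id": 2}))
    print(len(svc.list_orders()), "order(s) visible while degraded")
    print(svc.recover(), "queued order(s) replayed")
```

The point of the design is that customers can still browse and submit during the incident, and nothing submitted is lost once the primary path returns.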
Step 4 - Keep your data portable
- Regular exports. Automated daily exports of key datasets to an independent storage location.
- Format discipline. Use open formats and versioned schemas so you can spin up a basic service elsewhere.
- DNS and identity. Ensure DNS, domains and IdP recovery steps are documented and accessible outside your main cloud account.
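A daily export does not need heavy tooling. The sketch below is one possible shape, assuming records are already available as Python dictionaries: it writes them as JSON Lines (an open, line-oriented format) with a header line carrying a schema version. The function name, file naming scheme and `SCHEMA_VERSION` constant are assumptions made for this example, and the destination directory stands in for whatever independent storage location you choose.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

SCHEMA_VERSION = 1  # hypothetical; bump whenever the record layout changes


def export_records(records, dataset, dest_dir, day=None):
    """Write records to <dest_dir>/<dataset>-<YYYY-MM-DD>.jsonl and return the path."""
    day = day or date.today()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    path = dest / f"{dataset}-{day.isoformat()}.jsonl"
    header = {
        "dataset": dataset,
        "schema_version": SCHEMA_VERSION,
        "exported_on": day.isoformat(),
    }
    with path.open("w", encoding="utf-8") as fh:
        # First line describes the export; every following line is one record.
        fh.write(json.dumps(header) + "\n")
        for record in records:
            fh.write(json.dumps(record, sort_keys=True) + "\n")
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the independent storage location.
    with tempfile.TemporaryDirectory() as tmp:
        out = export_records([{"id": 1, "total": 42}], "orders", tmp)
        print(out.name)
        print(out.read_text(encoding="utf-8"))
```

Run from a scheduler once a day, a script like this gives you files that any basic service elsewhere can read without your main cloud account.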
Step 5 - Make comms part of the plan
- Status page that survives. Host it outside your main stack. Pre-write outage notices for common scenarios.
- Internal comms kit. Slack or Teams fallback channel, on-call list, customer support macros.
- Decision authority. Name who can declare read-only mode or trigger failover. No debates at 2 a.m.
What not to do
- Do not jump straight to multi-cloud for everything. It is expensive and operationally heavy for most SMBs.
- Do not assume SLAs mean uptime for you. They are credits, not continuity.
- Do not leave third-party risks unowned. Assign an owner to each critical dependency.
30-60-90 day playbook
- 30 days. Map dependencies, classify tiers, create a one-page runbook, stand up a separate status page.
- 60 days. Implement a read-only mode, enable cross-region backups, run a DNS failover drill.
- 90 days. Set up a hot standby for the most critical journey, contract a backup comms provider, rehearse the on-call scenario.
Checklist
- ✅ One-page runbook that names an incident lead and decision points.
- ✅ Status page hosted outside your primary cloud account.
- ✅ Daily exports of essential data to an independent location.
- ✅ Read-only mode defined and tested.
- ✅ DNS and IdP recovery steps documented and accessible.
Bottom line
You cannot remove shared risk, but you can stop it becoming a full outage for your customers. Keep it simple: know your dependencies, pick one fallback pattern, test it, and write it down.