TEHUTI EKEMA

Two-Site Stretched Azure Local Cluster

Designed and built a two-site stretched Azure Local (Azure Stack HCI) reference environment that combines Storage Replica site-to-site replication with Azure Arc management, demonstrating edge infrastructure that survives the loss of a full site.

Role: Cloud Engineer
Org: Dell Technologies
Period: 2021–2022
Azure Local · Azure Arc · Storage Spaces Direct · Storage Replica · Stretched Cluster
Azure Local hands-on labs (public)

Context

Enterprises running operational-technology workloads at the edge often can’t go cloud-only: latency, data gravity, and continuity-of-operations requirements mean the workloads have to run close to the site. This reference environment was built to prove out a pattern for exactly that case: infrastructure that can lose an entire physical location and keep running.

This is a reference design, not a production customer system. All addressing, host naming, and identifiers are illustrative; the architecture is described in generic terms.

Approach

The design is a stretched Azure Local cluster spanning two sites (a primary site and a DR site that receives asynchronous replication), fronted by redundant top-of-rack switching and managed through Azure Arc.

  • Two active sites, each running a Storage Spaces Direct pool with two-way mirror resiliency, so each location is independently fault-tolerant.
  • Storage Replica site-to-site replication between the pools, giving a recoverable copy of the data at the second site that does not depend on the primary.
  • Segmented networking, with separate management, storage, and replication paths over redundant NICs and ToR switches, so storage and replication traffic never compete with management traffic.
  • Azure Arc projection to bring the on-prem cluster under cloud governance: Azure Monitor for telemetry, Recovery Services vaults for backup, and a single control plane for security posture.
  • A cloud witness for quorum, removing the need for a third physical site to arbitrate failover.
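
The role of the cloud witness can be illustrated with simple vote arithmetic. The sketch below is a simplified majority-vote model, not the cluster's actual quorum logic: it assumes one vote per node and one for the witness, ignores dynamic quorum adjustments, and uses a hypothetical node count of two per site, since the hardware footprint is not specified here.

```python
def site_loss_survives(nodes_per_site: int, witness: bool) -> bool:
    """Return True if the surviving site still holds a strict majority of votes.

    Simplified model (an assumption for illustration): every node has one
    vote, the witness has one vote, and the witness stays reachable from
    the surviving site.
    """
    witness_votes = 1 if witness else 0
    total_votes = 2 * nodes_per_site + witness_votes
    surviving_votes = nodes_per_site + witness_votes
    # Strict majority: more than half of all configured votes.
    return surviving_votes * 2 > total_votes


if __name__ == "__main__":
    # Hypothetical footprint: two nodes at each site.
    for witness in (True, False):
        outcome = "keeps quorum" if site_loss_survives(2, witness) else "loses quorum"
        print(f"witness={witness}: surviving site {outcome}")
```

In this model, two evenly sized sites without a witness split the votes exactly in half, so neither side has a majority after a site failure. The witness supplies the tie-breaking vote that a third physical site would otherwise have to provide.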

Outcome

A working, repeatable blueprint for operations-grade hybrid infrastructure that tolerates the loss of a full site while staying centrally observable and governed from Azure. The same patterns — cluster bring-up, Network ATC intents, lifecycle management — are captured as public hands-on labs so other engineers can stand up equivalent environments.
