Multi-region Infrastructure at Wego

Over the past few years, Wego has been the leading travel metasearch engine in the Middle East. To serve our Middle Eastern users better, we set out to achieve multi-region resiliency and lower latency for them. This is the first in a series of blog posts covering the design and implementation of Wego’s multi-region infrastructure, along with some of the interesting challenges and lessons we encountered along the way.

Background

Wego launched in 2005 in Singapore with a focus on Southeast Asia. Nearly 14 years later, given the success we are seeing in the Middle East, we decided to expand our tech infrastructure to serve users there better.

Currently, all of our services run in the AWS Singapore region. To cement Wego’s shift to the Middle East, we decided to launch our core services in the AWS Mumbai region, which is the AWS region closest to the Middle East. Our goals for a multi-region infrastructure are:

Reduce latency for end-users
Increase the reliability of our core services

Laying the groundwork

For multi-region infrastructure to work, we need to satisfy the following requirements:

Each regional service should access its own regional data stores (S3, ElastiCache, etc.). This means that applications that currently upload data to a single data store now have to upload it to multiple regional data stores.
The deployment pipeline should support multi-region deployments. Our current deployment pipeline assumes that we only deploy to a single region and is tightly coupled to that region. Hence, we need to revamp the pipeline to make it region-agnostic.
The entry point for multi-region traffic will be at the DNS level. We need a way to route traffic to different regions based on latency, as well as smooth rollout and rollback processes when launching the regional services.
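
To make the first and third requirements more concrete, here is a minimal Python sketch. It is an illustration only, not Wego's production code: RegionalStore is a hypothetical in-memory stand-in for a regional data store such as an S3 bucket, and upload_to_all_regions and pick_region are names invented for this example. In practice, latency-based routing would normally be handled by the DNS provider rather than by application code. The region codes are the AWS identifiers for Singapore (ap-southeast-1) and Mumbai (ap-south-1); the object key is made up.

```python
from dataclasses import dataclass, field


@dataclass
class RegionalStore:
    """Hypothetical in-memory stand-in for one region's data store."""
    region: str
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


def upload_to_all_regions(stores, key, data):
    """Write the same object to every regional store.

    Returns the list of regions whose write failed, so the caller
    can retry only those regions.
    """
    failed = []
    for store in stores:
        try:
            store.put(key, data)
        except Exception:
            failed.append(store.region)
    return failed


def pick_region(latency_ms, enabled):
    """Return the enabled region with the lowest measured latency.

    Removing a region from `enabled` acts as a simple rollback switch:
    traffic falls back to the remaining regions.
    """
    candidates = {r: ms for r, ms in latency_ms.items() if r in enabled}
    if not candidates:
        raise ValueError("no enabled region available")
    return min(candidates, key=candidates.get)


# Fan out one upload to both regional stores.
stores = [RegionalStore("ap-southeast-1"), RegionalStore("ap-south-1")]
failed = upload_to_all_regions(stores, "rates/latest.json", b"{}")
```

Calling pick_region with a map of measured latencies per region and the set of regions currently enabled returns the region a given user should be sent to. Taking a region out of the enabled set models a rollback without touching the data stores.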

Over the past six months, we have been laying the groundwork for a multi-region infrastructure and have launched a core service in the Mumbai region, which has been running smoothly ever since. In this series, we hope to share several detailed write-ups on the systems that we have built to make multi-region infrastructure possible.

In part 2, we talk about how we achieve cross-region replication of Redis clusters by implementing an in-house system that is maintainable and cost-efficient.

In part 3, we talk about revamping the deployment pipeline by adopting Spinnaker, the benefits that it brings to the table and our experience using it.

In part 4, we talk about multi-region traffic management as well as automatic cross-region failover.