AWS Aurora Global Clusters Explained: What you wish they told you before you built it

AWS Aurora Global is, on the face of it, a decent product. Aurora is a MySQL fork with a tonne of purported performance benefits over vanilla MySQL. I was building a system in AWS which relied on a MySQL database, so I thought I’d give Aurora Global Clusters a try: I wanted high availability both within a single region and across regions (i.e. if one region failed, another took over). Here’s what I learned:

Foreword about Aurora Serverless

Everybody wants to be “serverless” these days, right? Aurora Serverless is AWS’s attempt at jumping on that bandwagon. Unlike truly serverless products like Lambda and DynamoDB, Aurora Serverless is not really serverless: when you provision a “serverless” cluster, a set of MySQL servers still gets built for you. Having used Aurora Serverless in a former job, I quickly found it isn’t a mature product, and after many, many hours of cluster downtime caused by badly behaved scaling activities we binned it off. I don’t think it’s improved much since and, personally, I’d not touch it with a 9.5 ft pole.

Basic Design

Aurora Global is, after all, just MySQL. As such, the design is broadly what you’d expect for a single-master MySQL cluster. The Global Cluster has a primary region, which is where the master/writer node runs. You then have one or more read-only nodes in your primary and secondary region(s). Aurora Global is, in effect, a way to get read-only copies of your MySQL database replicated to other regions so your applications can do low-latency reads of that data.
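
As a practical aside (this is standard MySQL/Aurora behaviour rather than anything Global-specific), it’s handy to know how to check which kind of node a connection has landed on; it becomes relevant later when talking about failover and ProxySQL health checks. The second query is Aurora-specific, so double-check it against the docs for your engine version:

-- 1 on a read-only replica, 0 on the writer. ProxySQL's innodb_read_only
-- check (used later in this post) looks at the same variable.
SELECT @@innodb_read_only;

-- Aurora also exposes the regional cluster topology; the row whose
-- session_id is 'MASTER_SESSION_ID' is the current writer.
SELECT server_id, session_id FROM information_schema.replica_host_status;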

Aurora Global Cluster Limitations

There are a few documented limitations, albeit hidden deep inside larger documentation pages. Here are a few of the key ones:

  • You cannot have multi-master global clusters. Global clusters have a single writer and multiple readers
  • You can have up to 6 regional clusters in a global cluster. If you want more, you’re stuffed
  • You can’t use RDS Proxy with Global clusters
  • You won’t get automatic minor version upgrades
  • There’s no auto-scaling
  • You often can’t do major version upgrades – for example you cannot upgrade from
    5.7.mysql_aurora.2.10.1 to 8.0.mysql_aurora.3.01.0 (a quick way to check which
    version you’re currently on is shown just below this list)
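
Before planning any upgrade, it’s worth confirming exactly which version a cluster is actually running. The standard version() function plus Aurora’s built-in aurora_version() function should give you both halves of the version string:

-- Returns the MySQL version and the Aurora release separately
SELECT version() AS mysql_version, aurora_version() AS aurora_release;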

Resiliency and Failover

The obvious advantage that you might assume exists when running a Global Cluster is that, if the primary region (the one with the writer node) were to fail, then another region would pick up the traffic. Well… you’re wrong. Aurora Global has no automatic failover: if the primary region fails, a writer node will never automatically appear in another region, so you cannot write to your database until the primary region recovers.

It is possible for you to manually initiate a “failover”… but this involves removing a region from the Global Cluster and then pointing all of your applications, globally, to this single regional cluster. Once you do this, that region will have a writer node but all other regions will be entirely useless as they will not be replicating from the newly split region. To restore other regions, you must destroy and rebuild their database clusters, making them part of a new global cluster with the region that you just promoted as the primary.

In addition to this, when the primary writer node fails/reboots for some reason (e.g. underlying instance hardware failure), all other nodes in your cluster will also reboot. This means you will have a global read outage until the writer node comes back.

Update 2023-01-05: The unavailability of reader nodes when the primary writer node fails has become particularly important during an outage today. A bug caused the primary region writer node, in one of our clusters, to crash. This caused all of the reader nodes to go offline, meaning that writing and reading, across all regions of the global cluster, was down for about 8 hours. AWS support have no ability to resolve issues like this and must escalate to internal teams. Manually initiating a “failover” to another region (by deleting the primary region’s database) would have been tempting here but the bug meant the secondary region would have crashed too and the whole cluster would need rebuilding, due to the “failover” activity. This is a terrifying prospect, for a production database…

Upgrades

As mentioned above, you often can’t do major version upgrades without building a whole new cluster and migrating your applications. Minor version upgrades are sometimes possible by simply editing the Aurora version of your cluster, but there’s a bug with CloudFormation meaning that you must manually upgrade your clusters with the AWS console/CLI before changing the Aurora version in CloudFormation, otherwise your stack will error out.

I’ve had mixed success with upgrades. Some have been smooth, with no visible downtime; others have been a bit more brutal, with a minute or two of downtime during the upgrade. Upgrading from 3.02.x to 3.03.x, for example, isn’t possible at all without destroying and rebuilding your entire global cluster, which will likely come with a very long period of downtime.

Aurora now has a Blue/Green deployment model, but that also comes with a minute or so of downtime as traffic fails over.

In essence, expect downtime during upgrades.

Parallel Query

Parallel Query is an Aurora feature which is not enabled by default (as at Feb 2023). It purportedly improves performance by automagically splitting a SELECT query into multiple smaller queries and executing them against the database in parallel. For example, it might split something like SELECT * FROM table WHERE foo IN ('bar', 'baz') into two separate queries: one searching for foo='bar' and the other for foo='baz'.

Feb 2023: Alas, Parallel Query has a propensity to crash your database servers when it is enabled and AWS support recommends turning it off:

During our investigation, we identified that the restarts were caused by an issue in the Aurora MySQL software related to parallel query. Please look for upcoming release for the fix. In the meantime, consider leaving the parallel query option disabled by setting aurora_parallel_query = OFF.

Apr 2023: Aurora 3.02.3 and 3.03.1 apparently fix this issue.
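
If you want to check where a cluster stands, the switch is a plain session variable and EXPLAIN will tell you whether a given query would use the feature. The variable name below is the one AWS support quoted; older Aurora releases used a different name, so check the docs for your version (my_table and foo are just placeholders):

-- Is parallel query currently enabled for this session?
SELECT @@aurora_parallel_query;

-- Disable it for the session while on an affected release
SET SESSION aurora_parallel_query = OFF;

-- When enabled, the Extra column of the EXPLAIN output notes whether the
-- query is eligible for parallel query
EXPLAIN SELECT * FROM my_table WHERE foo IN ('bar', 'baz');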

Write Forwarding

Write forwarding is a pretty exciting feature, on paper. It allows you to run write (INSERT, UPDATE, DELETE, etc.) statements on reader nodes and have them forwarded to, and executed on, the writer node. It does, however, have a lot of very painful limitations:

  • Contrary to the documentation, write forwarding is not available on Aurora versions based on MySQL 8. At the time of writing, the latest version that supports it is
    5.7.mysql_aurora.2.10.1
  • You do not have access to “writer” nodes in non-primary regions. The DNS record doesn’t resolve because the endpoint is “inactive”. This means that your applications always need to know whether they are running in a primary or secondary region. Update 2022-09-09: Aurora changed behaviour, without notice, to make secondary region writer endpoints active. This apparently broke quite a few customers, so they rolled back the change; however, there does seem to be a desire from them to re-implement this in the future
  • When you connect to a reader node and try to write to it, you will be told the node is read-only until you run this query: set aurora_replica_read_consistency = 'eventual'; (a full example session is shown after this list)
  • You cannot set aurora_replica_read_consistency globally. It must be set, manually, each time you connect to the database. As such, it’s unlikely you’ll be able to use write forwarding with applications whose codebase you do not have full control over
  • You cannot do DDL (create table, alter table, truncate table, etc.) on write forwarded nodes
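
To make that concrete, here’s roughly what a write-forwarded session looks like, assuming you’re on a 2.x version that supports it and you’re connected to a reader endpoint in a secondary region (the orders table is just a made-up example):

-- Required on every new connection; there is no way to set this globally.
-- The documented consistency levels are EVENTUAL, SESSION and GLOBAL; the
-- stronger levels make reads wait for forwarded writes to replicate back.
SET aurora_replica_read_consistency = 'eventual';

-- DML like this now gets forwarded to the writer in the primary region
INSERT INTO orders (id, status) VALUES (42, 'pending');
UPDATE orders SET status = 'shipped' WHERE id = 42;

-- DDL is NOT forwarded; a statement like the below will still fail here
-- CREATE TABLE audit_log (id BIGINT PRIMARY KEY, msg TEXT);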

I want solutions, not problems!

All is not entirely lost with Aurora Global. It serves a purpose if you need to run your applications geographically close to your users and, thus, want a local read-only database next to the application.

For anything more than this, an open-source product called ProxySQL solves a lot of the problems. The basic design is as follows:

  • Create a multi-region Aurora Global cluster as you would normally. Don’t worry about enabling write forwarding, because it’s a bit useless
  • Peer together the VPCs in all of your regions, and add security group rules, such that each region can connect to the Aurora instances in every other region
  • Run ProxySQL locally in each of your regions
  • Add your local region’s writer and reader endpoints to ProxySQL with a weight of 10,000,000
  • Add your writer and reader endpoints, in every other region, to ProxySQL with a weight of 1 (at time of writing, weight 0 meant the server was never used so 1 means it’s going to be used for 1 query in every 10 million… which is the best you’ll get)
  • Create an entry in mysql_replication_hostgroups with a check_type of innodb_read_only to ensure that writer endpoints which erroneously become active are never attempted to be written to
  • Create ProxySQL mysql_query_rules to send SELECT (but not SELECT FOR UPDATE) statements to the reader hostgroup whilst sending all other queries to the writer hostgroup
  • Connect your applications to the ProxySQL instance running locally in their own region

What will happen here is:

  • Applications will do write queries to ProxySQL and it will forward them to the single writer node which is currently active
  • In the event that you are forced to break a region out of the global cluster to “promote” it, ProxySQL will realise this and start sending write queries to that region
  • Read queries will always go to the reader node(s) in the local region
  • In the event that your local reader nodes fail (either because of a fault or because you manually destroyed them during “failover”) then read queries will be done from another region. This means that your applications must tolerate some kind of elevated query latency, but this is good practice anyway

Here’s the config for that:

MySQL [(none)]> select * from mysql_users;
+----------+----------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
| username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections | attributes | comment |
+----------+----------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
| admin    | passw0rd | 1      | 0       | 1                 |                | 0             | 1                      | 0            | 1       | 1        | 10000           |            |         |
+----------+----------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+------------+---------+
1 row in set (0.000 sec)

MySQL [(none)]> select * from mysql_servers;
+--------------+--------------------------------------------------------------+------+-----------+--------+----------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname                                                     | port | gtid_port | status | weight   | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+--------------------------------------------------------------+------+-----------+--------+----------+-------------+-----------------+---------------------+---------+----------------+---------+
| 1            | thing.cluster-cdqdzbngckab.eu-west-1.rds.amazonaws.com       | 3306 | 0         | ONLINE | 10000000 | 0           | 1000            | 0                   | 0       | 0              |         |
| 1            | thing.cluster-cvbn52q4dmcd.eu-central-1.rds.amazonaws.com    | 3306 | 0         | ONLINE | 1        | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | thing.cluster-ro-cvbn52q4dmef.eu-central-1.rds.amazonaws.com | 3306 | 0         | ONLINE | 1        | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | thing.cluster-ro-cdqdzbngckgh.eu-west-1.rds.amazonaws.com    | 3306 | 0         | ONLINE | 10000000 | 0           | 1000            | 0                   | 0       | 0              |         |
+--------------+--------------------------------------------------------------+------+-----------+--------+----------+-------------+-----------------+---------------------+---------+----------------+---------+
4 rows in set (0.000 sec)

MySQL [(none)]> select * from mysql_replication_hostgroups;
+------------------+------------------+------------------+-----------------+
| writer_hostgroup | reader_hostgroup | check_type       | comment         |
+------------------+------------------+------------------+-----------------+
| 1                | 2                | innodb_read_only | Read only check |
+------------------+------------------+------------------+-----------------+
1 row in set (0.000 sec)


MySQL [(none)]> select * from mysql_query_rules;
+---------+--------+----------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+--------------------+---------------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+---------------------+-----+-------+------------+---------+
| rule_id | active | username | schemaname | flagIN | client_addr | proxy_addr | proxy_port | digest | match_digest | match_pattern | negate_match_pattern | re_modifiers | flagOUT | replace_pattern | destination_hostgroup | cache_ttl | cache_empty_result | cache_timeout | reconnect | timeout | retries | delay | next_query_flagIN | mirror_flagOUT | mirror_hostgroup | error_msg | OK_msg | sticky_conn | multiplex | gtid_from_hostgroup | log | apply | attributes | comment |
+---------+--------+----------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+--------------------+---------------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+---------------------+-----+-------+------------+---------+
| 1 | 1 | NULL | NULL | 0 | NULL | NULL | NULL | NULL | ^SELECT.*FOR UPDATE | NULL | 0 | CASELESS | NULL | NULL | 1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 | NULL | NULL | 1 | | NULL |
| 2 | 1 | NULL | NULL | 0 | NULL | NULL | NULL | NULL | ^SELECT | NULL | 0 | CASELESS | NULL | NULL | 2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 1 | NULL | NULL | 1 | | NULL |
+---------+--------+----------+------------+--------+-------------+------------+------------+--------+---------------------+---------------+----------------------+--------------+---------+-----------------+-----------------------+-----------+--------------------+---------------+-----------+---------+---------+-------+-------------------+----------------+------------------+-----------+--------+-------------+-----------+---------------------+-----+-------+------------+---------+
2 rows in set (0.000 sec)
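
For reference, a config like the one dumped above is normally put in place through ProxySQL’s admin interface with plain INSERT statements, then loaded to runtime and persisted to disk. Here’s a rough sketch (the hostnames and credentials are the placeholders from the dump above; swap in your own):

-- Local region endpoints get the huge weight; remote regions get weight 1
INSERT INTO mysql_servers (hostgroup_id, hostname, port, weight) VALUES
  (1, 'thing.cluster-cdqdzbngckab.eu-west-1.rds.amazonaws.com',       3306, 10000000),
  (1, 'thing.cluster-cvbn52q4dmcd.eu-central-1.rds.amazonaws.com',    3306, 1),
  (2, 'thing.cluster-ro-cdqdzbngckgh.eu-west-1.rds.amazonaws.com',    3306, 10000000),
  (2, 'thing.cluster-ro-cvbn52q4dmef.eu-central-1.rds.amazonaws.com', 3306, 1);

-- Hostgroup 1 = writers, hostgroup 2 = readers; the innodb_read_only check
-- moves any endpoint that is actually read-only into the reader hostgroup
INSERT INTO mysql_replication_hostgroups (writer_hostgroup, reader_hostgroup, check_type, comment)
VALUES (1, 2, 'innodb_read_only', 'Read only check');

-- Plain SELECTs go to the readers; SELECT ... FOR UPDATE (rule 1) and
-- anything unmatched falls through to the writer hostgroup
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT.*FOR UPDATE', 1, 1),
       (2, 1, '^SELECT', 2, 1);

INSERT INTO mysql_users (username, password, default_hostgroup)
VALUES ('admin', 'passw0rd', 1);

-- Apply to the running config and persist; mysql_replication_hostgroups is
-- part of the servers module, so LOAD MYSQL SERVERS covers it too
LOAD MYSQL SERVERS TO RUNTIME;     SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
LOAD MYSQL USERS TO RUNTIME;       SAVE MYSQL USERS TO DISK;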

ProxySQL does have a feature (configured in mysql_aws_aurora_hostgroups) whereby it can discover all of your Aurora nodes, using the AWS API, and automagically populate them into mysql_servers. This is fine but because the AWS API does not return the full hostname of the endpoint, you must configure this table with a domain suffix (e.g. us-east-1.rds.amazonaws.com). When you are doing multi-region Aurora Global, this clearly doesn’t work as each region has its own suffix.
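
One way to sanity-check that the routing described above is doing what you expect is ProxySQL’s stats schema: after pushing some traffic through, the per-backend query counters show whether writes are landing on the local writer and reads on the local readers:

-- Per-backend counters, one row per server per hostgroup
SELECT hostgroup, srv_host, status, ConnUsed, ConnERR, Queries
FROM stats_mysql_connection_pool
ORDER BY hostgroup, Queries DESC;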

Summary

It’s possible, with a lot of effort, to get Aurora Global working fairly well in a multi-region HA setup. In reality, MySQL isn’t a fantastic database for any sort of highly available setup where you must share data between the regions. If you’re building a new application, maybe consider some better alternatives such as MongoDB, DynamoDB, ElasticSearch, CouchBase or Redis.

Yours sincerely, a worn down Engineer whose Cheerios have been pissed in by AWS one too many times.
