Add partitions periodically in AWS RDS Aurora Cluster (MySQL compatible) with the SQL Procedures
Introduction
RDS Aurora is a managed service provided by AWS. Aurora is a OSS compatible relational database built on cloud.
In this article, I explain table partitioning with date in RDS Aurora MySQL compatible.
Goals
- Create table partitioned by date on RDS Aurora (MySQL compatible)
- Add a partition for a day periodically (once a day)
- Drop records after a certain period of time
- Handle errors in these operation running
As an example, I handle the following hoge
table.
column | types | note |
---|---|---|
id | varchar(255) | User ID |
info | varchar(255) | Some information about the user for id |
create_at | timestamp | Date and time when the record was created |
Add a partition once a day
Create hoge
table by the following steps.
- Create a certain number of partitions at first
- Regist a SQL procedure to add a partition
- Run the procedure once a day
Create partition table
First, execute the following query and create hoge
table.
1-- create hoge table
2DROP TABLE IF EXISTS hoge;
3CREATE TABLE hoge (
4 id varchar(255) NOT NULL,
5 info varchar(255) NOT NULL,
6 create_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
7 PRIMARY KEY (id, create_at),
8 INDEX index_id (id)
9) ENGINE=InnoDB DEFAULT CHARSET=utf8;
10
11ALTER TABLE hoge PARTITION BY RANGE (UNIX_TIMESTAMP(create_at)) (
12 PARTITION p20180219 VALUES LESS THAN (UNIX_TIMESTAMP('2018-02-19 00:00:00'))
13);
In this partitioning scheme, naming rules of partition is p
+ yyyyMMdd
.
create_at
column is a partition key. id
column and create_at
column are Composite Primary Keys.
The partition key must be included in the Primary keys.
The partition p20180219
is created by PARTITION BY RANGE
clause, but p20180219
can store only the data for the create_at column before 2018-02-19 00:00:00
.
Regist a SQL procedure to add a partition
Second, Create a SQL procedure to add a partition. One partition can store in range of 24 hours of data. To make the code easier to see, validation is omitted.
1--
2-- A procedure to add partition to hoge table.
3-- This procedure creates partitions in the range of days 'from_date' to 'to_date - 1'.
4--
5-- Arguments
6-- from_date: The start of date time to create partitions
7-- to_date: The end of date time to create partitions
8-- Execution sample
9-- CALL add_hoge_partition(str_to_date('2018-01-01', '%Y-%m-%d'), str_to_date('2019-01-01', '%Y-%m-%d'));
10
11DROP PROCEDURE IF EXISTS add_hoge_partition;
12DELIMITER $$
13CREATE PROCEDURE add_hoge_partition(IN from_date DATE, IN to_date DATE)
14 proc_label:BEGIN
15 DECLARE target_date DATE;
16 DECLARE partition_range DATE;
17 DECLARE p_count INT;
18
19 SET target_date = from_date;
20 WHILE DATEDIFF(to_date, target_date) > 0 DO
21
22 SET partition_range = DATE_ADD(target_date, INTERVAL 1 DAY);
23 SELECT CONCAT(
24 'ALTER TABLE hoge ADD PARTITION ( PARTITION ',
25 DATE_FORMAT(target_date, 'p%Y%m%d'),
26 ' VALUES LESS THAN (UNIX_TIMESTAMP(', QUOTE(DATE_FORMAT(partition_range, '%Y-%m-%d 00:00:00')), ')))'
27 ) INTO @ddl;
28
29 PREPARE ddl_stmt FROM @ddl;
30 EXECUTE ddl_stmt;
31 DEALLOCATE PREPARE ddl_stmt;
32
33 SET target_date = DATE_ADD(target_date, INTERVAL 1 DAY);
34 END WHILE;
35
36 END$$
37DELIMITER ;
Now add initial partitions using add_hoge_partition
.
Add the partition for 365 days with the current date as the base date.
1CALL add_hoge_partition(CURDATE(), DATE_ADD(CURDATE(), INTERVAL 365 DAY));
Create Event task for procedure call
Use MySQL Events that are tasks that run according to a schedule. To use that in RDS, need to set event_scheduler
to ON
in parameter group of RDS. A reboot of RDS instances is needless.
Call MySQL Events from SQL with the following query.
1CREATE EVENT add_hoge_partition
2ON SCHEDULE EVERY 1 DAY STARTS '2018-02-19 00:00:00'
3DO CALL
4 add_hoge_partition(
5 (
6 select
7 from_unixtime(max(PARTITION_DESCRIPTION))
8 from
9 INFORMATION_SCHEMA.PARTITIONS
10 where
11 TABLE_NAME = 'hoge',
12 DATE_ADD(
13 (
14 select
15 from_unixtime(max(PARTITION_DESCRIPTION))
16 from
17 INFORMATION_SCHEMA.PARTITIONS
18 where
19 TABLE_NAME = 'hoge'
20 ),
21 INTERVAL 1 DAY
22 )
23 );
This query get the latest partition information from INFORMATION_SCHEMA
, and calculate target partition date that should be created.
Now the add_hoge_partition
procedure is called every day at 0 o’clock.
On RDS Aurora Cluster, only writer instance executes MySQL Events
Make sure that MySQL Events runs only on Writer instance.
- Execute the below query on Writer instance
1select * from INFORMATION_SCHEMA.PROCESSLIST where USER = 'event_scheduler' limit 10;
2> 1 event_scheduler localhost Daemon 40803 Waiting for next activation
- Execute the below query on Reader instance
1select * from INFORMATION_SCHEMA.PROCESSLIST where USER = 'event_scheduler' limit 10;
2> Empty set (0.01 sec)
If the role of database change with failover, only new switched writer instance executes the events.
Error handling on Aurora
If there is no way to detect an error, nobody notices database problems. This sentence describes error handling of Aurora instance in procedure call.
Notification with AWS Lambda
RDS Aurora(MySQL Compatible) has mysql.lambda_async
procedure as a default.
The mysql.lambda_async
procedure can execute AWS Lambda function directly with messages.
Developers can be notified some database troubles using mysql.lambda_async
.
The steps will be explained below.
Add IAM Role to Aurora cluster
When RDS Aurora executes AWS Lambda function, it should have a execution permission.
Now, create rdsToLambdaRole
to do that. The IAM role has a access policy as below
1"Action": [
2 "lambda:InvokeFunction"
3]
Set Amazon Resource Name(ARN) of rdsToLambdaRole
to parameter named aws_default_lambda_role in RDS cluster parameter group. Since this parameter is dynamic
, rebooting of RDS cluster is needless.
Set network routing for subnets
RDS instances need to be permitted outbound communication. Generally, database instances are placed in private subnets, so developers should allow the subnets to communicate to outbound.
Add error handling to procedure
For error handling, add HANDLER
declaration that executes AWS Lambda function to procedure. Call Lambda function named rds_monitor
using mysql.lambda_async
procedure. The mysql.lambda_async
procedure requires ARN of AWS Lambda as argument.
Code snippet is as below.
1 -- Notify when errors occured
2 DECLARE EXIT HANDLER FOR SQLEXCEPTION
3 BEGIN
4 GET DIAGNOSTICS CONDITION 1 @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT;
5 CALL mysql.lambda_async(
6 'arn:aws:lambda:${your_region}:${your_account_id}:function:rds_monitor',
7 CONCAT('{"message":"', @p2, '",',
8 '"state":"', @p1, '"}')
9 );
10 END;
Add the above snippet to the previous procedure code.
1DROP PROCEDURE IF EXISTS add_hoge_partition;
2DELIMITER $$
3CREATE PROCEDURE add_hoge_partition(IN from_date DATE, IN to_date DATE)
4 proc_label:BEGIN
5 DECLARE target_date DATE;
6 DECLARE partition_range DATE;
7 DECLARE p_count INT;
8 DECLARE EXIT HANDLER FOR SQLEXCEPTION
9 BEGIN
10 GET DIAGNOSTICS CONDITION 1 @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT;
11 CALL mysql.lambda_async(
12 'arn:aws:lambda:${your_region}:${your_account_id}:function:rds_monitor',
13 CONCAT('{"message":"', @p2, '",',
14 '"state":"', @p1, '"}')
15 );
16 END;
17
18 SET target_date = from_date;
19 WHILE DATEDIFF(to_date, target_date) > 0 DO
20
21 SET partition_range = DATE_ADD(target_date, INTERVAL 1 DAY);
22 SELECT CONCAT(
23 'ALTER TABLE hoge ADD PARTITION ( PARTITION ',
24 DATE_FORMAT(target_date, 'p%Y%m%d'),
25 ' VALUES LESS THAN (UNIX_TIMESTAMP(', QUOTE(DATE_FORMAT(partition_range, '%Y-%m-%d 00:00:00')), ')))'
26 ) INTO @ddl;
27
28 PREPARE ddl_stmt FROM @ddl;
29 EXECUTE ddl_stmt;
30 DEALLOCATE PREPARE ddl_stmt;
31
32 SET target_date = DATE_ADD(target_date, INTERVAL 1 DAY);
33 END WHILE;
34
35 END$$
36DELIMITER ;
Implement AWS Lambda function
Implement AWS Lambda function to handle the message from Aurora. In case of me, I implement to notify errors to datadog that is a monitoring service.
[Addition] Error log on RDS instances
To check error log for RDS instances, I opened error/mysql-error-running.log
on AWS Management Console.
However, despite the fact that log file size is 38.2 kB, the log file shows only ‘END OF LOG’ as below.
Actually, this is the specification of RDS. The file includes the log used by AWS, and the log is not displayed to AWS users. For this reason, the file size is not 0.
Conclusion
It is available to
- Call procedure and add partition table periodically.
- Handle errors when error occured in procedure call.