Published on

How to automatically delete your DynamoDB records with time_to_live

Authors
  • Katherine Moreno

Introduction

In this post you'll learn about time_to_live, the DynamoDB feature that lets you automatically remove your records.

Why do I want to delete the records?

Imagine you work at a company that handles personal information about users. If you're storing user information in DynamoDB that you will never use, you're paying for storage you don't need. Plus, keeping unnecessary data can create privacy risks. By deleting these records, you save money and protect user privacy.

How to use the Time To Live attribute

When you define a DynamoDB table, you can configure the time_to_live attribute. This setting should point to the item attribute that holds each record's expiration date.

In the following AWS CDK example I'm using 'ttl' as the attribute that holds the time_to_live value:

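        # Inside a CDK Stack; assumes `from aws_cdk import aws_dynamodb as dynamodb`
        my_table = dynamodb.Table(self, "MyTable",
            partition_key=dynamodb.Attribute(
                name="PK",
                type=dynamodb.AttributeType.STRING,
            ),
            table_name="my_table",
            # Name of the item attribute that stores the expiration timestamp
            time_to_live_attribute="ttl",
        )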

Items in the table will look like this:

PK               ttl (time_to_live_attribute)
myFirstRecord    1721267443
mySecondRecord   1721267443
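If the table already exists and is not managed through CDK, TTL can also be switched on with the DynamoDB UpdateTimeToLive API. The following is a minimal sketch rather than a definitive setup: the helper name `enable_ttl` is made up for illustration, and it takes the client as a parameter so the snippet does not need AWS credentials just to be imported.

```python
def enable_ttl(client, table_name, attribute_name="ttl"):
    """Turn on TTL for an existing table using the given DynamoDB client.

    `client` is expected to be a boto3 DynamoDB client (or anything that
    exposes the same `update_time_to_live` method).
    """
    return client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={
            "Enabled": True,
            "AttributeName": attribute_name,
        },
    )


# Usage (requires the boto3 package and AWS credentials):
#   import boto3
#   enable_ttl(boto3.client("dynamodb"), "my_table")
```

Passing the client in keeps the helper easy to exercise with a stub before pointing it at a real table.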

How does it work?

The value to store is the Unix timestamp, in seconds, of the moment at which the item should expire, saved as a Number attribute. Once the current time passes the item's time_to_live value, the item is considered expired, and DynamoDB typically deletes it within a few days.
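To make the arithmetic concrete, here is a small sketch of how an expiration value such as 1721267443 can be produced and checked. The helper names `expiry_timestamp` and `is_expired` are illustrative only, and the creation date is picked so that the result matches the sample value used in this post.

```python
from datetime import datetime, timedelta, timezone


def expiry_timestamp(now, lifetime):
    """Return the Unix time, in whole seconds, at which an item should expire."""
    return int((now + lifetime).timestamp())


def is_expired(ttl_value, now):
    """True once the current time has passed the stored TTL value."""
    return int(now.timestamp()) > ttl_value


# An item created on 2024-07-17 01:50:43 UTC that should live for one day
created = datetime(2024, 7, 17, 1, 50, 43, tzinfo=timezone.utc)
ttl = expiry_timestamp(created, timedelta(days=1))
print(ttl)  # 1721267443
```

A value in seconds has 10 digits for current dates; a 13-digit number usually means a timestamp in milliseconds was stored by mistake.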

How can I use it in my functions?

How to check if a record has expired?

When you read a record, you should check not only that it exists but also whether it is pending deletion (its time_to_live value has passed and AWS will delete it at some point soon). Expired items that have not been deleted yet still show up in reads.

With Query

DynamoDB Query accepts a FilterExpression parameter to filter the results; we are going to use it to filter out the expired records.

import boto3
import boto3.dynamodb.conditions as conditions
from datetime import datetime

import os


dynamodb = boto3.resource("dynamodb", region_name=os.environ["REGION"])
table_name = os.environ["TABLE_NAME"]


def handler(event, context):
    random_key = event["randomKey"]
    my_table = dynamodb.Table(table_name)

    # Current time as a Unix timestamp in seconds
    current_time = int(datetime.now().timestamp())
    # Query by partition key and keep only items whose ttl is still in the future
    response = my_table.query(
        KeyConditionExpression=conditions.Key("PK").eq(random_key),
        FilterExpression=conditions.Attr("ttl").gt(current_time),
    )
    if response["Count"] > 0:
        return response["Items"][0]  # The item exists and has not expired yet
    return "Item not found"  # The item does not exist or has already expired

With GetItem

DynamoDB GetItem does not have a FilterExpression parameter, so we have to check manually that the item has not expired.

import boto3
from datetime import datetime

import os


dynamodb = boto3.resource("dynamodb", region_name=os.environ["REGION"])
table_name = os.environ["TABLE_NAME"]


def handler(event, context):
    random_key = event["randomKey"]
    my_table = dynamodb.Table(table_name)

    # Current time as a Unix timestamp in seconds
    current_time = int(datetime.now().timestamp())

    response = my_table.get_item(Key={"PK": random_key})
    if "Item" in response:
        item = response["Item"]
        if item["ttl"] > current_time:  # The item has not expired yet
            return item
        else:
            return "Item not found"  # The item has expired and is pending deletion
    else:
        return "Item not found"  # The item does not exist
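The manual check can be pulled into a small reusable helper. This is a sketch under a few assumptions: the name `is_live` is made up for illustration, the TTL value is numeric (the boto3 resource API returns numbers as `Decimal`), and an item without the TTL attribute is treated as never expiring, since DynamoDB does not delete items that lack it.

```python
from decimal import Decimal


def is_live(item, now_seconds, ttl_attribute="ttl"):
    """Return True if the item should still be treated as valid."""
    ttl_value = item.get(ttl_attribute)
    if ttl_value is None:
        # No TTL attribute: DynamoDB never expires this item
        return True
    # Same strict comparison as the handler above
    return int(ttl_value) > now_seconds


item = {"PK": "myFirstRecord", "ttl": Decimal("1721267443")}
print(is_live(item, 1721267442))  # True: one second before expiration
print(is_live(item, 1721267443))  # False: the TTL value has been reached
```

Using `item.get` instead of `item["ttl"]` avoids a `KeyError` when a record was written without an expiration date.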

How to store a new record?

The time_to_live attribute must store a Unix timestamp in seconds:

import boto3
from datetime import datetime, timedelta
import uuid
import os


dynamodb = boto3.resource("dynamodb", region_name=os.environ["REGION"])
table_name = os.environ["TABLE_NAME"]


def handler(event, context):
    my_table = dynamodb.Table(table_name)

    randomUUID = uuid.uuid4().hex

    my_table.put_item(
        Item={
            "PK": randomUUID,  # Must match the table's partition key name
            "data": "This is a test",
            "ttl": int(  # Set record to be expired after 1 day
                (datetime.now() + timedelta(days=1)).timestamp()
            ),
        }
    )