Deploying Receipt Extraction API with Amazon ECS Express Mode and Terraform

Amazon Elastic Container Service (ECS) is a fully managed container orchestration service that runs and manages containers without the overhead of operating the orchestration infrastructure yourself. This tutorial builds a production-ready receipt extraction API on ECS from a pre-built container image stored in Amazon ECR.

The deployment has several prerequisites: an AWS account, a SageMaker AI environment, Terraform for Infrastructure as Code (IaC), and, optionally, Streamlit for the user interface. The implementation is built around ECS Express Mode, which streamlines container management.

ECS Express Mode deploys images from Amazon ECR and requires two IAM roles: the Execution Role and the Infrastructure Role for Express Gateway Services. It works with the default VPC, but a custom VPC gives finer control over networking. A custom IAM Task Role is also required if the container invokes SageMaker endpoints for model inference.
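As an illustration of the Task Role requirement, the following is a minimal Terraform sketch of a role that ECS tasks can assume, with permission to call a single SageMaker endpoint. The resource names, the role name, and the sagemaker_endpoint_arn variable are hypothetical and not taken from the original iam.tf; the Execution Role and Infrastructure Role are not shown here.

```hcl
# Hypothetical input: ARN of the SageMaker endpoint that serves the extraction model.
variable "sagemaker_endpoint_arn" {
  type        = string
  description = "ARN of the SageMaker endpoint the container is allowed to invoke"
}

# Task Role: assumed by the running container through the ECS tasks service principal.
resource "aws_iam_role" "receipt_task" {
  name = "receipt-extraction-task-role" # illustrative name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Inline policy limited to invoking the one endpoint passed in above.
resource "aws_iam_role_policy" "invoke_sagemaker" {
  name = "invoke-sagemaker-endpoint" # illustrative name
  role = aws_iam_role.receipt_task.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "sagemaker:InvokeEndpoint"
      Resource = var.sagemaker_endpoint_arn
    }]
  })
}
```

Scoping the policy to one endpoint ARN rather than a wildcard keeps the container's permissions limited to the inference call it actually needs.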

With Terraform, the infrastructure is defined across several files, including iam.tf and ecs.tf. The aws_ecs_express_gateway_service resource links the ECS cluster to the Gemma-based receipt extraction image. Key parameters include the health check path, the container port (8000), and the automatic scaling policy. Setting scaling_target to an average CPU utilization of 70% lets the service scale between 1 and 3 tasks as load on the extraction API changes.
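The parameters described above might be combined roughly as follows. This is a sketch, not the original ecs.tf: the variable names, the service name, and the /health path are hypothetical, and the argument names inside the resource reflect one reading of the AWS provider documentation, so they should be checked against the provider version in use.

```hcl
# Hypothetical inputs; the original ecs.tf may wire these values differently.
variable "ecs_cluster_name" {
  type = string
}

variable "receipt_image_uri" {
  type        = string
  description = "ECR URI of the pre-built receipt extraction image"
}

variable "execution_role_arn" {
  type = string
}

variable "infrastructure_role_arn" {
  type = string
}

variable "task_role_arn" {
  type        = string
  description = "Task Role used when the container invokes SageMaker endpoints"
}

resource "aws_ecs_express_gateway_service" "receipt_api" {
  service_name            = "receipt-extraction-api" # illustrative name
  cluster                 = var.ecs_cluster_name
  execution_role_arn      = var.execution_role_arn
  infrastructure_role_arn = var.infrastructure_role_arn
  task_role_arn           = var.task_role_arn

  # Assumed path; use whatever health route the API actually exposes.
  health_check_path = "/health"

  primary_container {
    image          = var.receipt_image_uri
    container_port = 8000 # port the API container listens on
  }

  # Track average CPU at 70% and keep the task count between 1 and 3.
  scaling_target {
    auto_scaling_metric       = "AVERAGE_CPU"
    auto_scaling_target_value = 70
    min_task_count            = 1
    max_task_count            = 3
  }
}
```

Passing the role ARNs in as variables keeps the service definition independent of how iam.tf names its roles.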
