
Amazon SageMaker Feature Store Adds Lake Formation Support and Iceberg Management


Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. It now supports Apache Iceberg table format, streaming ingestion, scalable batch ingestion, and fine-grained access control through AWS Lake Formation.

As organizations scale ML platforms from experimentation to production, two operational challenges consistently surface. The first is securing access to sensitive feature data without manual overhead for every new feature group: infrastructure teams need Lake Formation-enforced access control that takes effect automatically when a feature group is created. The second is keeping storage costs predictable when high-frequency streaming workloads generate large volumes of Apache Iceberg metadata. For instance, one retail analytics team discovered that their Iceberg-based offline store had accumulated over 50 TB of metadata files in under a year, driving substantial Amazon S3 charges.

We are announcing three new capabilities available in SageMaker Python SDK v3.8.0 to address these challenges:

1. **Native AWS Lake Formation integration**: Register your offline store with Lake Formation during or after feature group creation to enforce column-level, row-level, and cell-level access control without manual setup.

2. **Additional Apache Iceberg table properties**: Control metadata retention and snapshot lifecycle policies to prevent metadata accumulation and reduce storage costs at the feature group level.

3. **Feature Store support in SageMaker Python SDK v3**: The modernized SDK v3.8.0 brings the full set of Feature Store capabilities into a modular, faster, and lighter-weight package.
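To make the second capability more concrete, the sketch below assembles a set of standard Apache Iceberg table properties that bound metadata-file and snapshot retention. The property keys are taken from the Apache Iceberg table configuration. The values are illustrative only, and the helper name `build_retention_properties` is hypothetical. This announcement does not show how SDK v3.8.0 accepts such properties for a feature group, so that step is deliberately left out.

```python
from datetime import timedelta


def build_retention_properties(
    max_snapshot_age: timedelta,
    min_snapshots_to_keep: int = 1,
    previous_metadata_versions: int = 10,
) -> dict[str, str]:
    """Return Iceberg table properties that limit metadata and snapshot growth.

    Hypothetical helper for illustration; Iceberg stores property values as strings.
    """
    if max_snapshot_age <= timedelta(0):
        raise ValueError("max_snapshot_age must be positive")
    if min_snapshots_to_keep < 1 or previous_metadata_versions < 1:
        raise ValueError("retention counts must be at least 1")

    age_ms = int(max_snapshot_age.total_seconds() * 1000)
    return {
        # Delete the oldest tracked metadata files after each commit.
        "write.metadata.delete-after-commit.enabled": "true",
        # Number of previous metadata files to keep before deletion.
        "write.metadata.previous-versions-max": str(previous_metadata_versions),
        # Snapshots older than this become eligible for expiration.
        "history.expire.max-snapshot-age-ms": str(age_ms),
        # Always retain at least this many snapshots, regardless of age.
        "history.expire.min-snapshots-to-keep": str(min_snapshots_to_keep),
    }


# Illustrative values: expire snapshots after 3 days, but always keep 5.
props = build_retention_properties(timedelta(days=3), min_snapshots_to_keep=5)
```

Note that the `history.expire.*` properties only set the defaults used when snapshots are expired. They do not remove anything on their own, so a high-frequency streaming workload still depends on expiration actually being run against the table.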

To get started, you need an AWS account with SageMaker resource permissions, an execution role with Amazon S3, AWS Glue, and Lake Formation access, and SageMaker Python SDK v3.8.0 or later (`pip install --upgrade "sagemaker>=3.8.0"`). For Lake Formation integration, at least one Data Lake Administrator must be configured in your account.
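A minimal preflight sketch for two of these prerequisites follows. It checks the installed `sagemaker` package version using only the standard library, and inspects a Lake Formation `get_data_lake_settings` response for at least one Data Lake Administrator. All function names here are hypothetical, and the version parser is a simplification that reads only the leading numeric release segments, so a pre-release such as `3.8.0rc1` is treated as `3.8.0`. The `boto3` call is imported lazily and assumes AWS credentials and a default region are already configured.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Parse leading numeric release segments, e.g. '3.8.0' -> (3, 8, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings after padding them to equal length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def sagemaker_sdk_ready(minimum: str = "3.8.0") -> bool:
    """Return True if the sagemaker package is installed at or above `minimum`."""
    try:
        installed = metadata.version("sagemaker")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)


def has_data_lake_admin(settings_response: dict) -> bool:
    """Check a get_data_lake_settings response for at least one administrator."""
    settings = settings_response.get("DataLakeSettings", {})
    return len(settings.get("DataLakeAdmins", [])) > 0


def fetch_data_lake_settings() -> dict:
    """Call Lake Formation; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the pure checks above work without it

    return boto3.client("lakeformation").get_data_lake_settings()
```

With credentials in place, `sagemaker_sdk_ready() and has_data_lake_admin(fetch_data_lake_settings())` gives a single yes/no answer before any feature group is created.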
