sagemaker.mlops.feature_store.feature_group_manager


FeatureGroup with Lake Formation support.

Classes

FeatureGroupManager(*, feature_group_name[, ...])

Class representing resource FeatureGroup

IcebergProperties(*[, properties])

Configuration for Iceberg table properties in a Feature Group offline store.

LakeFormationConfig(*[, enabled, ...])

Configuration for Lake Formation governance on Feature Group offline stores.

class sagemaker.mlops.feature_store.feature_group_manager.FeatureGroupManager(*, feature_group_name: str | PipelineVariable, feature_group_arn: str | PipelineVariable | None = Unassigned(), record_identifier_feature_name: str | PipelineVariable | None = Unassigned(), event_time_feature_name: str | PipelineVariable | None = Unassigned(), feature_definitions: List[FeatureDefinition] | None = Unassigned(), creation_time: datetime | None = Unassigned(), last_modified_time: datetime | None = Unassigned(), online_store_config: OnlineStoreConfig | None = Unassigned(), offline_store_config: OfflineStoreConfig | None = Unassigned(), throughput_config: ThroughputConfigDescription | None = Unassigned(), role_arn: str | PipelineVariable | None = Unassigned(), feature_group_status: str | PipelineVariable | None = Unassigned(), offline_store_status: OfflineStoreStatus | None = Unassigned(), last_update_status: LastUpdateStatus | None = Unassigned(), failure_reason: str | PipelineVariable | None = Unassigned(), description: str | PipelineVariable | None = Unassigned(), next_token: str | PipelineVariable | None = Unassigned(), online_store_total_size_bytes: int | None = Unassigned(), iceberg_properties: IcebergProperties | None = None)[source]#

Bases: FeatureGroup

Class representing resource FeatureGroup

feature_group_arn#

The Amazon Resource Name (ARN) of the FeatureGroup.

Type:

Optional[StrPipeVar]

feature_group_name#

The name of the FeatureGroup.

Type:

StrPipeVar

record_identifier_feature_name#

The name of the Feature used for RecordIdentifier, whose value uniquely identifies a record stored in the feature store.

Type:

Optional[StrPipeVar]

event_time_feature_name#

The name of the feature that stores the EventTime of a Record in a FeatureGroup. An EventTime is a point in time when a new event occurs that corresponds to the creation or update of a Record in a FeatureGroup. All Records in the FeatureGroup have a corresponding EventTime.

Type:

Optional[StrPipeVar]

feature_definitions#

A list of the Features in the FeatureGroup. Each feature is defined by a FeatureName and FeatureType.

Type:

Optional[List[FeatureDefinition]]

creation_time#

A timestamp indicating when SageMaker created the FeatureGroup.

Type:

Optional[datetime.datetime]

next_token#

A token to resume pagination of the list of Features (FeatureDefinitions).

Type:

Optional[StrPipeVar]

last_modified_time#

A timestamp indicating when the feature group was last updated.

Type:

Optional[datetime.datetime]

online_store_config#

The configuration for the OnlineStore.

Type:

Optional[OnlineStoreConfig]

offline_store_config#

The configuration of the offline store. It includes the following configurations:

  • Amazon S3 location of the offline store.

  • Configuration of the Glue data catalog.

  • Table format of the offline store.

  • Option to disable the automatic creation of a Glue table for the offline store.

  • Encryption configuration.

Type:

Optional[OfflineStoreConfig]

throughput_config#

Type:

Optional[ThroughputConfigDescription]

role_arn#

The Amazon Resource Name (ARN) of the IAM execution role used to persist data into the OfflineStore if an OfflineStoreConfig is provided.

Type:

Optional[StrPipeVar]

feature_group_status#

The status of the feature group.

Type:

Optional[StrPipeVar]

offline_store_status#

The status of the OfflineStore. Notifies you if replicating data into the OfflineStore has failed. Returns either Active or Blocked.

Type:

Optional[OfflineStoreStatus]

last_update_status#

A value indicating whether the update made to the feature group was successful.

Type:

Optional[LastUpdateStatus]

failure_reason#

The reason that the FeatureGroup failed to be replicated in the OfflineStore. This failure can occur because:

  • The FeatureGroup could not be created in the OfflineStore.

  • The FeatureGroup could not be deleted from the OfflineStore.

Type:

Optional[StrPipeVar]

description#

A free form description of the feature group.

Type:

Optional[StrPipeVar]

online_store_total_size_bytes#

The size of the OnlineStore in bytes.

Type:

Optional[int]

classmethod create(*args, lake_formation_config: LakeFormationConfig | None = None, iceberg_properties: IcebergProperties | None = None, **kwargs) FeatureGroupManager | None[source]#

Create a FeatureGroupManager resource with optional Lake Formation governance and Iceberg properties.

Accepts all parameters from FeatureGroup.create(), plus:

Parameters:
  • lake_formation_config

    Optional LakeFormationConfig to configure Lake Formation governance. When enabled=True, requires offline_store_config and role_arn. The config fields control the following behavior:

    • enabled (bool, default False): When True, automatically enables Lake Formation governance after the Feature Group is created. This registers the offline store S3 location with Lake Formation and grants the execution role permissions on the Glue table.

    • use_service_linked_role (bool, default True): When True, uses the Lake Formation service-linked role for S3 registration. When False, registration_role_arn must be provided.

    • registration_role_arn (str, optional): Custom IAM role ARN for S3 registration with Lake Formation. Required when use_service_linked_role is False.

    • hybrid_access_mode_enabled (bool, required): When True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). When False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.

  • iceberg_properties – Optional IcebergProperties to configure Iceberg table properties for the offline store. Requires offline_store_config with table_format='Iceberg'.

Returns:

The FeatureGroupManager resource.

Raises:
  • botocore.exceptions.ClientError – This exception is raised for AWS service related errors.

  • ResourceInUse – Resource being accessed is in use.

  • ResourceLimitExceeded – You have exceeded a SageMaker resource limit.
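The requirements listed for lake_formation_config and iceberg_properties can be checked before calling create(). The following is a minimal sketch of such a check, not the library's own validation: preflight_create_args is a hypothetical helper, and it works on plain values so that it runs without the SDK installed.

```python
from typing import List, Optional


def preflight_create_args(
    *,
    lake_formation_enabled: bool,
    has_offline_store_config: bool,
    role_arn: Optional[str],
    use_service_linked_role: bool = True,
    registration_role_arn: Optional[str] = None,
    table_format: Optional[str] = None,
    has_iceberg_properties: bool = False,
) -> List[str]:
    """Return the documented requirements that are not met (empty list = OK).

    Hypothetical helper: it mirrors the rules stated in the parameter
    descriptions and does not call any SageMaker API.
    """
    problems: List[str] = []
    if lake_formation_enabled:
        # enabled=True requires offline_store_config and role_arn.
        if not has_offline_store_config:
            problems.append("enabled=True requires offline_store_config")
        if not role_arn:
            problems.append("enabled=True requires role_arn")
    # registration_role_arn is required when the service-linked role is not used.
    if not use_service_linked_role and not registration_role_arn:
        problems.append(
            "registration_role_arn is required when use_service_linked_role is False"
        )
    # iceberg_properties requires an Iceberg-format offline store.
    if has_iceberg_properties and table_format != "Iceberg":
        problems.append("iceberg_properties requires table_format='Iceberg'")
    return problems


if __name__ == "__main__":
    for problem in preflight_create_args(
        lake_formation_enabled=True,
        has_offline_store_config=True,
        role_arn=None,
    ):
        print(problem)
```

Returning a list rather than raising on the first problem lets a caller report every unmet requirement at once before any resource is created.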

creation_time: datetime.datetime | None#
description: StrPipeVar | None#
enable_lake_formation(hybrid_access_mode_enabled: bool, acknowledge_risk: bool, session: boto3.Session | None = None, region: str | None = None, use_service_linked_role: bool = True, registration_role_arn: str | None = None, wait_for_active: bool = False) dict[source]#

Enable Lake Formation governance for this Feature Group’s offline store.

This method:

  1. Optionally waits for the Feature Group to reach ‘Created’ status

  2. Validates that the Feature Group status is ‘Created’

  3. Registers the offline store S3 location as a data lake location

  4. Grants the execution role permissions on the Glue table

  5. Optionally revokes IAMAllowedPrincipal permissions from the Glue table

  6. Prints a recommended S3 deny bucket policy

Parameters:
  • hybrid_access_mode_enabled – If True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). If False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.

  • acknowledge_risk – If True, acknowledges the risks of the Lake Formation operation and proceeds. When hybrid_access_mode_enabled is False, this acknowledges that revoking IAMAllowedPrincipal permissions may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. When hybrid_access_mode_enabled is True, this acknowledges that IAM-based access remains alongside Lake Formation permissions. If False, raises RuntimeError without proceeding.

  • session – Boto3 session.

  • region – Region name.

  • use_service_linked_role – Whether to use the Lake Formation service-linked role for S3 registration. If True, Lake Formation uses its service-linked role. If False, registration_role_arn must be provided. Default is True.

  • registration_role_arn – IAM role ARN to use for S3 registration with Lake Formation. Required when use_service_linked_role is False. This can be different from the Feature Group’s execution role (role_arn).

  • wait_for_active – If True, waits for the Feature Group to reach ‘Created’ status before enabling Lake Formation. Default is False.

Returns:

Dict with the status of each Lake Formation operation:

  • s3_location_registered: bool

  • lf_permissions_granted: bool

  • hybrid_access_mode_enabled: bool

Return type:

dict

Raises:
  • ValueError – If the Feature Group has no offline store configured, if role_arn is not set on the Feature Group, if use_service_linked_role is False but registration_role_arn is not provided, or if the Feature Group is not in ‘Created’ status.

  • ClientError – If Lake Formation operations fail.

  • RuntimeError – If a phase fails and subsequent phases cannot proceed, or if the user declines to proceed without disabling hybrid access mode.
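As an illustration of the call and its return value, the sketch below wraps enable_lake_formation() in two small helpers. Both helper names are hypothetical. The feature_group argument is assumed to be a FeatureGroupManager instance, and only the keyword arguments and result keys documented above are used.

```python
from typing import Any, Dict

# Keys documented for the dict returned by enable_lake_formation().
RESULT_KEYS = (
    "s3_location_registered",
    "lf_permissions_granted",
    "hybrid_access_mode_enabled",
)


def enable_with_hybrid_access(feature_group: Any) -> Dict[str, bool]:
    """Enable governance while keeping IAM-based access (hybrid access mode).

    Assumption: `feature_group` exposes the documented
    enable_lake_formation() method.
    """
    return feature_group.enable_lake_formation(
        hybrid_access_mode_enabled=True,  # IAM-based access stays in place
        acknowledge_risk=True,  # documented: False raises RuntimeError
        wait_for_active=True,  # wait for 'Created' status first
    )


def summarize_result(result: Dict[str, bool]) -> str:
    """Format the documented result keys, marking any that are absent."""
    lines = []
    for key in RESULT_KEYS:
        value = result.get(key)
        lines.append(f"{key}: {'missing' if value is None else value}")
    return "\n".join(lines)
```

Passing hybrid_access_mode_enabled=False instead would revoke IAMAllowedPrincipal permissions, so this sketch deliberately takes the less disruptive path.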

event_time_feature_name: StrPipeVar | None#
failure_reason: StrPipeVar | None#
feature_definitions: List[FeatureDefinition] | None#
feature_group_arn: StrPipeVar | None#
feature_group_name: StrPipeVar#
feature_group_status: StrPipeVar | None#
classmethod get(*args, include_iceberg_properties: bool = False, **kwargs) FeatureGroup | None[source]#

Get a FeatureGroup resource with optional Iceberg property retrieval.

Accepts all parameters from FeatureGroup.get(), plus:

Parameters:

include_iceberg_properties – If True, fetches Iceberg table properties from Glue and stores them in the iceberg_properties attribute. Only applies to Feature Groups with table_format='Iceberg'.

Returns:

The FeatureGroup resource.
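Because iceberg_properties is optional and is only populated when include_iceberg_properties=True, code that reads it may want to tolerate a missing value. The following is a small sketch under that assumption. iceberg_properties_of is a hypothetical helper, and the SimpleNamespace objects stand in for a returned resource so that the example runs without the SDK.

```python
from types import SimpleNamespace
from typing import Any, Dict


def iceberg_properties_of(feature_group: Any) -> Dict[str, str]:
    """Return a copy of the Iceberg properties, or {} when none were fetched."""
    container = getattr(feature_group, "iceberg_properties", None)
    properties = getattr(container, "properties", None)
    return dict(properties) if properties else {}


# Stand-ins shaped like the documented attributes, for demonstration only.
fetched = SimpleNamespace(
    iceberg_properties=SimpleNamespace(
        properties={"write.target-file-size-bytes": "536870912"}
    )
)
not_fetched = SimpleNamespace(iceberg_properties=None)

if __name__ == "__main__":
    print(iceberg_properties_of(fetched))
    print(iceberg_properties_of(not_fetched))
```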

iceberg_properties: IcebergProperties | None#
last_modified_time: datetime.datetime | None#
last_update_status: LastUpdateStatus | None#
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

next_token: StrPipeVar | None#
offline_store_config: OfflineStoreConfig | None#
offline_store_status: OfflineStoreStatus | None#
online_store_config: OnlineStoreConfig | None#
online_store_total_size_bytes: int | None#
record_identifier_feature_name: StrPipeVar | None#
role_arn: StrPipeVar | None#
throughput_config: ThroughputConfigDescription | None#
update(*args, iceberg_properties: IcebergProperties | None = None, session: boto3.Session | None = None, region: str | PipelineVariable | None = None, **kwargs) FeatureGroup | None[source]#

Update a FeatureGroup resource with optional Iceberg property updates.

Accepts all parameters from FeatureGroup.update(), plus:

Parameters:
  • iceberg_properties – Optional IcebergProperties to update Iceberg table properties for the offline store. Requires offline_store_config with table_format='Iceberg'.

  • session – Boto3 session for Iceberg property updates.

  • region – Region name for Iceberg property updates.

Returns:

The FeatureGroup resource.

class sagemaker.mlops.feature_store.feature_group_manager.IcebergProperties(*, properties: Dict[str, str] | None = None)[source]#

Bases: Base

Configuration for Iceberg table properties in a Feature Group offline store.

Use this to customize Iceberg table behavior such as compaction settings, snapshot retention, and other Iceberg-specific configurations.

properties#

A dictionary mapping Iceberg property names to their values. Common properties include:

  • ‘write.target-file-size-bytes’: Target size for data files

  • ‘commit.manifest.min-count-to-merge’: Min manifests before merging

  • ‘history.expire.max-snapshot-age-ms’: Max age for snapshot expiration

Type:

Dict[str, str] | None
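Since properties is typed Dict[str, str], numeric settings have to be passed as strings. The sketch below shows one way to build such a dictionary. to_iceberg_properties is a hypothetical helper and the values are illustrative, not recommendations. The result would be passed as IcebergProperties(properties=...). The set of keys accepted by validate_property_keys() is not documented here, so this sketch does not attempt to reproduce it.

```python
from typing import Dict, Mapping, Union


def to_iceberg_properties(
    settings: Mapping[str, Union[str, int, bool]]
) -> Dict[str, str]:
    """Convert setting values to the strings that Dict[str, str] requires."""
    converted: Dict[str, str] = {}
    for name, value in settings.items():
        if isinstance(value, bool):
            # Render booleans in lowercase rather than Python's "True"/"False".
            converted[name] = str(value).lower()
        else:
            converted[name] = str(value)
    return converted


properties = to_iceberg_properties(
    {
        "write.target-file-size-bytes": 512 * 1024 * 1024,  # 512 MiB
        "commit.manifest.min-count-to-merge": 100,
        "history.expire.max-snapshot-age-ms": 7 * 24 * 60 * 60 * 1000,  # 7 days
    }
)

if __name__ == "__main__":
    for name, value in properties.items():
        print(f"{name} = {value!r}")
```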

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

properties: Dict[str, str] | None#
validate_property_keys()[source]#
class sagemaker.mlops.feature_store.feature_group_manager.LakeFormationConfig(*, enabled: bool = False, use_service_linked_role: bool = True, registration_role_arn: str | None = None, hybrid_access_mode_enabled: bool, acknowledge_risk: bool)[source]#

Bases: Base

Configuration for Lake Formation governance on Feature Group offline stores.

enabled#

If True, enables Lake Formation governance for the offline store. Requires offline_store_config and role_arn to be set on the Feature Group.

Type:

bool

use_service_linked_role#

Whether to use the Lake Formation service-linked role for S3 registration. If True, Lake Formation uses its service-linked role. If False, registration_role_arn must be provided. Default is True.

Type:

bool

registration_role_arn#

IAM role ARN to use for S3 registration with Lake Formation. Required when use_service_linked_role is False. This can be different from the Feature Group’s execution role.

Type:

str | None

hybrid_access_mode_enabled#

If True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). If False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.

Type:

bool

acknowledge_risk#

If True, acknowledges the risks of the Lake Formation operation and proceeds. When hybrid_access_mode_enabled is False, this acknowledges that revoking IAMAllowedPrincipal permissions may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. When hybrid_access_mode_enabled is True, this acknowledges that IAM-based access remains alongside Lake Formation permissions. If False, raises RuntimeError without proceeding.

Type:

bool

acknowledge_risk: bool#
enabled: bool#
hybrid_access_mode_enabled: bool#
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

registration_role_arn: str | None#
use_service_linked_role: bool#
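The interaction between enabled, hybrid_access_mode_enabled, and acknowledge_risk can be summarized in a few lines. The following is a sketch of the documented semantics only. LakeFormationSettings is a hypothetical stand-in dataclass with the same field names as LakeFormationConfig, and governance_effect is a hypothetical helper. The behavior when enabled is False is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LakeFormationSettings:
    """Hypothetical stand-in with the documented LakeFormationConfig fields."""

    hybrid_access_mode_enabled: bool  # required, no default
    acknowledge_risk: bool  # required, no default
    enabled: bool = False
    use_service_linked_role: bool = True
    registration_role_arn: Optional[str] = None


def governance_effect(settings: LakeFormationSettings) -> str:
    """Describe what the documented semantics imply for these settings."""
    if not settings.enabled:
        # Assumption: nothing happens when governance is not enabled.
        return "Lake Formation governance is not enabled."
    if not settings.acknowledge_risk:
        # Documented: acknowledge_risk=False raises RuntimeError.
        raise RuntimeError("acknowledge_risk must be True to proceed")
    if settings.hybrid_access_mode_enabled:
        return "Hybrid access: IAM-based access remains alongside Lake Formation."
    return (
        "Lake Formation only: IAMAllowedPrincipal permissions are revoked; "
        "all principals need Lake Formation grants."
    )


if __name__ == "__main__":
    print(
        governance_effect(
            LakeFormationSettings(
                enabled=True,
                hybrid_access_mode_enabled=True,
                acknowledge_risk=True,
            )
        )
    )
```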