sagemaker.mlops.feature_store.feature_group_manager#
FeatureGroup with Lake Formation support.
Classes
| FeatureGroupManager | Class representing resource FeatureGroup |
| IcebergProperties | Configuration for Iceberg table properties in a Feature Group offline store. |
| LakeFormationConfig | Configuration for Lake Formation governance on Feature Group offline stores. |
- class sagemaker.mlops.feature_store.feature_group_manager.FeatureGroupManager(*, feature_group_name: str | PipelineVariable, feature_group_arn: str | PipelineVariable | None = Unassigned(), record_identifier_feature_name: str | PipelineVariable | None = Unassigned(), event_time_feature_name: str | PipelineVariable | None = Unassigned(), feature_definitions: List[FeatureDefinition] | None = Unassigned(), creation_time: datetime | None = Unassigned(), last_modified_time: datetime | None = Unassigned(), online_store_config: OnlineStoreConfig | None = Unassigned(), offline_store_config: OfflineStoreConfig | None = Unassigned(), throughput_config: ThroughputConfigDescription | None = Unassigned(), role_arn: str | PipelineVariable | None = Unassigned(), feature_group_status: str | PipelineVariable | None = Unassigned(), offline_store_status: OfflineStoreStatus | None = Unassigned(), last_update_status: LastUpdateStatus | None = Unassigned(), failure_reason: str | PipelineVariable | None = Unassigned(), description: str | PipelineVariable | None = Unassigned(), next_token: str | PipelineVariable | None = Unassigned(), online_store_total_size_bytes: int | None = Unassigned(), iceberg_properties: IcebergProperties | None = None)[source]#
Bases:
FeatureGroup
Class representing resource FeatureGroup
- feature_group_arn#
The Amazon Resource Name (ARN) of the FeatureGroup.
- Type:
Optional[StrPipeVar]
- feature_group_name#
The name of the FeatureGroup.
- Type:
StrPipeVar
- record_identifier_feature_name#
The name of the Feature used for RecordIdentifier, whose value uniquely identifies a record stored in the feature store.
- Type:
Optional[StrPipeVar]
- event_time_feature_name#
The name of the feature that stores the EventTime of a Record in a FeatureGroup. An EventTime is a point in time when a new event occurs that corresponds to the creation or update of a Record in a FeatureGroup. All Records in the FeatureGroup have a corresponding EventTime.
- Type:
Optional[StrPipeVar]
- feature_definitions#
A list of the Features in the FeatureGroup. Each feature is defined by a FeatureName and FeatureType.
- Type:
Optional[List[FeatureDefinition]]
- creation_time#
A timestamp indicating when SageMaker created the FeatureGroup.
- Type:
Optional[datetime.datetime]
- next_token#
A token to resume pagination of the list of Features (FeatureDefinitions).
- Type:
Optional[StrPipeVar]
- last_modified_time#
A timestamp indicating when the feature group was last updated.
- Type:
Optional[datetime.datetime]
- online_store_config#
The configuration for the OnlineStore.
- Type:
Optional[OnlineStoreConfig]
- offline_store_config#
The configuration of the offline store. It includes the following configurations: Amazon S3 location of the offline store. Configuration of the Glue data catalog. Table format of the offline store. Option to disable the automatic creation of a Glue table for the offline store. Encryption configuration.
- Type:
Optional[OfflineStoreConfig]
- throughput_config#
- Type:
Optional[ThroughputConfigDescription]
- role_arn#
The Amazon Resource Name (ARN) of the IAM execution role used to persist data into the OfflineStore if an OfflineStoreConfig is provided.
- Type:
Optional[StrPipeVar]
- feature_group_status#
The status of the feature group.
- Type:
Optional[StrPipeVar]
- offline_store_status#
The status of the OfflineStore. Notifies you if replicating data into the OfflineStore has failed. Returns either Active or Blocked.
- Type:
Optional[OfflineStoreStatus]
- last_update_status#
A value indicating whether the update made to the feature group was successful.
- Type:
Optional[LastUpdateStatus]
- failure_reason#
The reason that the FeatureGroup failed to be replicated in the OfflineStore. This failure can occur because: The FeatureGroup could not be created in the OfflineStore. The FeatureGroup could not be deleted from the OfflineStore.
- Type:
Optional[StrPipeVar]
- description#
A free form description of the feature group.
- Type:
Optional[StrPipeVar]
- online_store_total_size_bytes#
The size of the OnlineStore in bytes.
- Type:
Optional[int]
- classmethod create(*args, lake_formation_config: LakeFormationConfig | None = None, iceberg_properties: IcebergProperties | None = None, **kwargs) FeatureGroupManager | None[source]#
Create a FeatureGroupManager resource with optional Lake Formation governance and Iceberg properties.
Accepts all parameters from FeatureGroup.create(), plus:
- Parameters:
lake_formation_config –
Optional LakeFormationConfig to configure Lake Formation governance. When enabled=True, requires offline_store_config and role_arn. The config fields control the following behavior:
enabled (bool, default False): When True, automatically enables Lake Formation governance after the Feature Group is created. This registers the offline store S3 location with Lake Formation and grants the execution role permissions on the Glue table.
use_service_linked_role (bool, default True): When True, uses the Lake Formation service-linked role for S3 registration. When False, registration_role_arn must be provided.
registration_role_arn (str, optional): Custom IAM role ARN for S3 registration with Lake Formation. Required when use_service_linked_role is False.
hybrid_access_mode_enabled (bool, required): When True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). When False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.
iceberg_properties – Optional IcebergProperties to configure Iceberg table properties for the offline store. Requires offline_store_config with table_format='Iceberg'.
- Returns:
The FeatureGroupManager resource.
- Raises:
botocore.exceptions.ClientError – This exception is raised for AWS service related errors.
ResourceInUse – Resource being accessed is in use.
ResourceLimitExceeded – You have exceeded a SageMaker resource limit.
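The preconditions described for create() can be checked before the call is made. The following is a minimal sketch, not the library's own validation: create_governed_feature_group is a hypothetical helper, and the manager class is passed in as an argument so the sketch does not depend on a particular import path. It assumes only the keyword names documented above (lake_formation_config, iceberg_properties, offline_store_config, role_arn).

```python
from typing import Any, Optional


def create_governed_feature_group(
    manager_cls: Any,
    lake_formation_config: Optional[Any] = None,
    iceberg_properties: Optional[Any] = None,
    **feature_group_kwargs: Any,
) -> Any:
    """Hypothetical wrapper: check documented preconditions, then call create()."""
    lf_enabled = bool(getattr(lake_formation_config, "enabled", False))
    if lf_enabled:
        # Documented: enabled=True requires offline_store_config and role_arn.
        missing = [
            name
            for name in ("offline_store_config", "role_arn")
            if feature_group_kwargs.get(name) is None
        ]
        if missing:
            raise ValueError(f"Lake Formation requires: {', '.join(missing)}")
    if (
        iceberg_properties is not None
        and feature_group_kwargs.get("offline_store_config") is None
    ):
        # Documented: iceberg_properties requires an offline_store_config.
        raise ValueError("iceberg_properties requires offline_store_config")
    # Remaining keyword arguments are forwarded unchanged to create().
    return manager_cls.create(
        lake_formation_config=lake_formation_config,
        iceberg_properties=iceberg_properties,
        **feature_group_kwargs,
    )
```

In use, manager_cls would be FeatureGroupManager and lake_formation_config a LakeFormationConfig instance; failing early here avoids sending a request that the documented requirements say cannot succeed.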
- creation_time: datetime.datetime | None#
- description: StrPipeVar | None#
- enable_lake_formation(hybrid_access_mode_enabled: bool, acknowledge_risk: bool, session: boto3.Session | None = None, region: str | None = None, use_service_linked_role: bool = True, registration_role_arn: str | None = None, wait_for_active: bool = False) dict[source]#
Enable Lake Formation governance for this Feature Group’s offline store.
This method:
1. Optionally waits for the Feature Group to reach ‘Created’ status
2. Validates that the Feature Group status is ‘Created’
3. Registers the offline store S3 location as a data lake location
4. Grants the execution role permissions on the Glue table
5. Optionally revokes IAMAllowedPrincipal permissions from the Glue table
6. Prints a recommended S3 deny bucket policy
- Parameters:
hybrid_access_mode_enabled – If True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). If False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.
acknowledge_risk – If True, acknowledges the risks of the Lake Formation operation and proceeds. When hybrid_access_mode_enabled is False, this acknowledges that revoking IAMAllowedPrincipal permissions may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. When hybrid_access_mode_enabled is True, this acknowledges that IAM-based access remains alongside Lake Formation permissions. If False, raises RuntimeError without proceeding.
session – Boto3 session.
region – Region name.
use_service_linked_role – Whether to use the Lake Formation service-linked role for S3 registration. If True, Lake Formation uses its service-linked role. If False, registration_role_arn must be provided. Default is True.
registration_role_arn – IAM role ARN to use for S3 registration with Lake Formation. Required when use_service_linked_role is False. This can be different from the Feature Group’s execution role (role_arn).
wait_for_active – If True, waits for the Feature Group to reach ‘Created’ status before enabling Lake Formation. Default is False.
- Returns:
Dict with the status of each Lake Formation operation:
s3_location_registered: bool
lf_permissions_granted: bool
hybrid_access_mode_enabled: bool
- Return type:
dict
- Raises:
ValueError – If the Feature Group has no offline store configured, if role_arn is not set on the Feature Group, if use_service_linked_role is False but registration_role_arn is not provided, or if the Feature Group is not in ‘Created’ status.
ClientError – If Lake Formation operations fail.
RuntimeError – If a phase fails and subsequent phases cannot proceed, or if acknowledge_risk is False (the risk acknowledgment is declined).
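The returned status dict can be inspected after the call. The sketch below is illustrative only: failed_phases and enable_and_check are hypothetical helpers, and treating s3_location_registered and lf_permissions_granted as success flags is an assumption based on their names. The hybrid_access_mode_enabled key is left uninterpreted because the source does not say whether it reports success or the resulting mode.

```python
from typing import Any, Dict, List

# Keys assumed (from their names) to report per-phase success.
PHASE_KEYS = ("s3_location_registered", "lf_permissions_granted")


def failed_phases(result: Dict[str, Any]) -> List[str]:
    """Return the phase keys that are missing from the result or not True."""
    return [key for key in PHASE_KEYS if result.get(key) is not True]


def enable_and_check(
    feature_group: Any,
    hybrid_access_mode_enabled: bool,
    acknowledge_risk: bool,
) -> Dict[str, Any]:
    """Hypothetical wrapper around enable_lake_formation() that checks the result."""
    result = feature_group.enable_lake_formation(
        hybrid_access_mode_enabled=hybrid_access_mode_enabled,
        acknowledge_risk=acknowledge_risk,
        # Wait for 'Created' status instead of failing the status validation.
        wait_for_active=True,
    )
    failures = failed_phases(result)
    if failures:
        raise RuntimeError(f"Lake Formation phases did not complete: {failures}")
    return result
```

Passing wait_for_active=True is a choice made in this sketch; the documented default is False, in which case a Feature Group that is not yet in ‘Created’ status raises ValueError.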
- event_time_feature_name: StrPipeVar | None#
- failure_reason: StrPipeVar | None#
- feature_definitions: List[FeatureDefinition] | None#
- feature_group_arn: StrPipeVar | None#
- feature_group_name: StrPipeVar#
- feature_group_status: StrPipeVar | None#
- classmethod get(*args, include_iceberg_properties: bool = False, **kwargs) FeatureGroup | None[source]#
Get a FeatureGroup resource with optional Iceberg property retrieval.
Accepts all parameters from FeatureGroup.get(), plus:
- Parameters:
include_iceberg_properties – If True, fetches Iceberg table properties from Glue and stores them in the iceberg_properties attribute. Only applies to Feature Groups with table_format='Iceberg'.
- Returns:
The FeatureGroup resource.
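As a rough illustration of include_iceberg_properties, the sketch below reads the fetched properties back as a plain dict. read_iceberg_properties is a hypothetical helper, and the feature_group_name keyword is assumed to be among the parameters inherited from FeatureGroup.get(). The manager class is passed in so the example stays independent of the import path.

```python
from typing import Any, Dict


def read_iceberg_properties(manager_cls: Any, feature_group_name: str) -> Dict[str, str]:
    """Hypothetical helper: fetch a Feature Group and return its Iceberg properties."""
    feature_group = manager_cls.get(
        feature_group_name=feature_group_name,  # assumed inherited parameter name
        include_iceberg_properties=True,
    )
    # get() may return None, and iceberg_properties may be unset, for example
    # when the offline store does not use the Iceberg table format.
    props = getattr(feature_group, "iceberg_properties", None)
    if props is None or props.properties is None:
        return {}
    # Copy so callers can modify the result without touching the resource.
    return dict(props.properties)
```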
- iceberg_properties: IcebergProperties | None#
- last_modified_time: datetime.datetime | None#
- last_update_status: LastUpdateStatus | None#
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- next_token: StrPipeVar | None#
- offline_store_config: OfflineStoreConfig | None#
- offline_store_status: OfflineStoreStatus | None#
- online_store_config: OnlineStoreConfig | None#
- online_store_total_size_bytes: int | None#
- record_identifier_feature_name: StrPipeVar | None#
- role_arn: StrPipeVar | None#
- throughput_config: ThroughputConfigDescription | None#
- update(*args, iceberg_properties: IcebergProperties | None = None, session: boto3.Session | None = None, region: str | PipelineVariable | None = None, **kwargs) FeatureGroup | None[source]#
Update a FeatureGroup resource with optional Iceberg property updates.
Accepts all parameters from FeatureGroup.update(), plus:
- Parameters:
iceberg_properties – Optional IcebergProperties to update Iceberg table properties for the offline store. Requires offline_store_config with table_format='Iceberg'.
session – Boto3 session for Iceberg property updates.
region – Region name for Iceberg property updates.
- Returns:
The FeatureGroup resource.
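Because update() with iceberg_properties requires an Iceberg offline store, a caller may want to check the table format first. This is a sketch under assumptions: update_iceberg_properties is a hypothetical helper, and it assumes that offline_store_config exposes table_format as a plain string comparable to 'Iceberg'. If the attribute is an enum in your SDK version, the comparison would need adjusting.

```python
from typing import Any


def update_iceberg_properties(feature_group: Any, iceberg_properties: Any) -> Any:
    """Hypothetical helper: guard the documented Iceberg requirement, then update()."""
    config = getattr(feature_group, "offline_store_config", None)
    # Assumption: table_format is a plain string such as "Iceberg" or "Glue".
    table_format = getattr(config, "table_format", None)
    if table_format != "Iceberg":
        raise ValueError(
            "iceberg_properties requires offline_store_config with "
            f"table_format='Iceberg' (found {table_format!r})"
        )
    return feature_group.update(iceberg_properties=iceberg_properties)
```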
- class sagemaker.mlops.feature_store.feature_group_manager.IcebergProperties(*, properties: Dict[str, str] | None = None)[source]#
Bases:
Base
Configuration for Iceberg table properties in a Feature Group offline store.
Use this to customize Iceberg table behavior such as compaction settings, snapshot retention, and other Iceberg-specific configurations.
- properties#
A dictionary mapping Iceberg property names to their values. Common properties include:
‘write.target-file-size-bytes’: Target size for data files
‘commit.manifest.min-count-to-merge’: Minimum number of manifests before merging
‘history.expire.max-snapshot-age-ms’: Maximum age for snapshot expiration
- Type:
Dict[str, str] | None
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- properties: Dict[str, str] | None#
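Since properties is typed as Dict[str, str], numeric settings have to be converted to strings before they are passed in. The sketch below shows one way to do that; to_property_strings is a hypothetical helper and the chosen values (512 MiB files, 7-day snapshot age) are illustrative, not recommendations from the source.

```python
from typing import Any, Dict, Mapping


def to_property_strings(settings: Mapping[str, Any]) -> Dict[str, str]:
    """Convert property values to the string form that Dict[str, str] requires."""
    converted: Dict[str, str] = {}
    for name, value in settings.items():
        if isinstance(value, bool):
            # Render booleans in lowercase rather than Python's "True"/"False".
            converted[name] = "true" if value else "false"
        else:
            converted[name] = str(value)
    return converted


properties = to_property_strings(
    {
        "write.target-file-size-bytes": 512 * 1024 * 1024,  # 512 MiB
        "history.expire.max-snapshot-age-ms": 7 * 24 * 60 * 60 * 1000,  # 7 days
    }
)
```

The resulting dict can then be supplied as IcebergProperties(properties=properties).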
- class sagemaker.mlops.feature_store.feature_group_manager.LakeFormationConfig(*, enabled: bool = False, use_service_linked_role: bool = True, registration_role_arn: str | None = None, hybrid_access_mode_enabled: bool, acknowledge_risk: bool)[source]#
Bases:
Base
Configuration for Lake Formation governance on Feature Group offline stores.
- enabled#
If True, enables Lake Formation governance for the offline store. Requires offline_store_config and role_arn to be set on the Feature Group.
- Type:
bool
- use_service_linked_role#
Whether to use the Lake Formation service-linked role for S3 registration. If True, Lake Formation uses its service-linked role. If False, registration_role_arn must be provided. Default is True.
- Type:
bool
- registration_role_arn#
IAM role ARN to use for S3 registration with Lake Formation. Required when use_service_linked_role is False. This can be different from the Feature Group’s execution role.
- Type:
str | None
- hybrid_access_mode_enabled#
If True, IAM-based access remains alongside Lake Formation permissions (hybrid access mode). If False, revokes IAMAllowedPrincipal permissions from the Glue table, enforcing Lake Formation-only governance. Warning: setting this to False may break existing jobs that access the table via IAM-based permissions. After this change, all principals must be granted access through Lake Formation.
- Type:
bool
- acknowledge_risk#
If True, acknowledges the risks of the Lake Formation operation and proceeds. When hybrid_access_mode_enabled is False, this acknowledges that revoking IAMAllowedPrincipal permissions may break existing jobs (e.g., training, processing, ETL) that access the table via IAM-based permissions. When hybrid_access_mode_enabled is True, this acknowledges that IAM-based access remains alongside Lake Formation permissions. If False, raises RuntimeError without proceeding.
- Type:
bool
- acknowledge_risk: bool#
- enabled: bool#
- hybrid_access_mode_enabled: bool#
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- registration_role_arn: str | None#
- use_service_linked_role: bool#
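The cross-field rules described for LakeFormationConfig can be expressed as a small pre-check. This is a sketch, not the class's own validation: lake_formation_fields is a hypothetical helper that returns a plain dict of the documented fields. Rejecting enabled=True with acknowledge_risk=False at this stage is a design choice of the sketch; the source only states that a False acknowledgment raises RuntimeError without proceeding.

```python
from typing import Any, Dict, Optional


def lake_formation_fields(
    hybrid_access_mode_enabled: bool,
    acknowledge_risk: bool,
    enabled: bool = False,
    use_service_linked_role: bool = True,
    registration_role_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical pre-check that returns keyword arguments for LakeFormationConfig."""
    if not use_service_linked_role and not registration_role_arn:
        # Documented: registration_role_arn is required in this case.
        raise ValueError(
            "registration_role_arn is required when use_service_linked_role is False"
        )
    if enabled and not acknowledge_risk:
        # Sketch choice: fail before any request is made.
        raise RuntimeError("acknowledge_risk must be True to enable Lake Formation")
    return {
        "enabled": enabled,
        "use_service_linked_role": use_service_linked_role,
        "registration_role_arn": registration_role_arn,
        "hybrid_access_mode_enabled": hybrid_access_mode_enabled,
        "acknowledge_risk": acknowledge_risk,
    }
```

The returned dict is meant to be unpacked as LakeFormationConfig(**fields); starting with hybrid_access_mode_enabled=True keeps IAM-based access in place, which avoids the job breakage the warning above describes.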