SparkML Serving
SparkML Model
class sagemaker.sparkml.model.SparkMLModel(model_data, role=None, spark_version=2.2, sagemaker_session=None, **kwargs)
Bases: sagemaker.model.Model
Model data and S3 location holder for an MLeap-serialized SparkML model. Calling deploy() creates an Endpoint and returns a Predictor that performs predictions against the MLeap-serialized SparkML model.
Initialize a SparkMLModel.
Parameters:
- model_data (str) – The S3 location of a SageMaker model data .tar.gz file. For SparkML, this is the output produced by the Spark job after serializing the Model via MLeap.
- role (str) – An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role if it needs to access an AWS resource.
- spark_version (str) – The Spark version to use for executing the inference (default: ‘2.2’).
- sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain. For local mode, do not pass this argument.
- **kwargs – Additional parameters passed to the Model constructor.
Tip
You can find additional parameters for initializing this class at Model.
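As a minimal sketch of how the parameters above fit together, the following wraps model creation and deployment in one function. The helper names (is_model_archive_uri, deploy_sparkml_model), the instance type, and the bucket and role names in the usage comments are illustrative assumptions, not part of the SDK. The SDK import is deferred so that the input check can run without the SDK installed.

```python
def is_model_archive_uri(uri):
    """Return True if uri looks like an S3 location of a .tar.gz archive.

    Hypothetical helper: the SDK does not require this check.
    """
    return uri.startswith("s3://") and uri.endswith(".tar.gz")


def deploy_sparkml_model(model_data, role, spark_version="2.2",
                         instance_type="ml.c4.xlarge", instance_count=1):
    """Create a SparkMLModel and deploy it to a real-time endpoint.

    The instance type and count are assumed example values.
    """
    if not is_model_archive_uri(model_data):
        raise ValueError("model_data must be an s3:// URI ending in .tar.gz")

    # Imported lazily so the validation above works without the SDK.
    from sagemaker.sparkml.model import SparkMLModel

    model = SparkMLModel(model_data=model_data, role=role,
                         spark_version=spark_version)
    # deploy() creates the Endpoint and returns a Predictor bound to it.
    return model.deploy(initial_instance_count=instance_count,
                        instance_type=instance_type)


# Usage (requires AWS credentials; the names below are placeholders):
# predictor = deploy_sparkml_model(
#     "s3://my-bucket/sparkml/model.tar.gz", "MySageMakerRole")
```

Deploying creates billable resources, so the endpoint should be deleted once it is no longer needed.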
SparkML Predictor
class sagemaker.sparkml.model.SparkMLPredictor(endpoint, sagemaker_session=None)
Bases: sagemaker.predictor.RealTimePredictor
Performs predictions against an MLeap-serialized SparkML model.
The implementation of predict() in this RealTimePredictor requires JSON input. The input should follow the JSON format as documented. predict() returns CSV output, comma-separated if the output is a list.
Initializes a SparkMLPredictor, which should be used with SparkMLModel to perform predictions against SparkML models serialized via MLeap. The response is returned in text/csv format, which is the default response format for the SparkML Serving container.
Parameters:
- endpoint (str) – The name of the endpoint to perform inference on.
- sagemaker_session (sagemaker.session.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
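The JSON request format is not spelled out on this page. The sketch below assumes the "schema" plus "data" layout described in the SparkML Serving container's own documentation, and it may need adjusting for your container version. The helpers build_payload and parse_csv_response, the column names, and the endpoint name are hypothetical. The first part runs locally; the predictor calls are left as comments because they need a live endpoint.

```python
import json


def build_payload(input_columns, output_column, row):
    """Build a request dict with a schema and one data row.

    input_columns: list of (name, type) pairs, one per value in row.
    output_column: (name, type) pair for the column to return.
    The schema/data layout is an assumption, not defined by this page.
    """
    if len(input_columns) != len(row):
        raise ValueError("row must have one value per input column")
    return {
        "schema": {
            "input": [{"name": n, "type": t} for n, t in input_columns],
            "output": {"name": output_column[0], "type": output_column[1]},
        },
        "data": list(row),
    }


def parse_csv_response(response):
    """Split a text/csv response body into a list of string fields."""
    text = response.decode("utf-8") if isinstance(response, bytes) else response
    return [field.strip() for field in text.strip().split(",")]


payload = build_payload(
    [("age", "int"), ("city", "string")],  # hypothetical input columns
    ("prediction", "double"),              # hypothetical output column
    [38, "NYC"],
)
body = json.dumps(payload)  # JSON text form of the request

# Usage against a live endpoint (placeholder endpoint name):
# from sagemaker.sparkml.model import SparkMLPredictor
# predictor = SparkMLPredictor("my-sparkml-endpoint")
# fields = parse_csv_response(predictor.predict(payload))
```

Whether predict() should receive the dict or the already-serialized JSON string depends on how the predictor in your SDK version serializes its input, so check that before relying on either form.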