LinearLearner¶
The Amazon SageMaker LinearLearner algorithm.

class sagemaker.LinearLearner(role, train_instance_count, train_instance_type, predictor_type='binary_classifier', binary_classifier_model_selection_criteria=None, target_recall=None, target_precision=None, positive_example_weight_mult=None, epochs=None, use_bias=None, num_models=None, num_calibration_samples=None, init_method=None, init_scale=None, init_sigma=None, init_bias=None, optimizer=None, loss=None, wd=None, l1=None, momentum=None, learning_rate=None, beta_1=None, beta_2=None, bias_lr_mult=None, bias_wd_mult=None, use_lr_scheduler=None, lr_scheduler_step=None, lr_scheduler_factor=None, lr_scheduler_minimum_lr=None, normalize_data=None, normalize_label=None, unbias_data=None, unbias_label=None, num_point_for_scalar=None, **kwargs)¶
Bases: sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase
An Estimator for binary classification and regression.
Amazon SageMaker Linear Learner solves both classification and regression problems. It explores different training objectives simultaneously and chooses the best solution based on a validation set. The user can explore a large number of models and select the best one, optimizing either continuous objectives such as mean square error, cross entropy loss, or absolute error, or discrete objectives suited for classification such as F1 measure, precision@recall, or accuracy. The implementation is significantly faster than naive hyperparameter optimization techniques and more convenient than solutions that support only continuous objectives.
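To make the discrete objective "precision@recall" concrete, the following is a minimal sketch of the metric itself, not of how SageMaker computes it internally. The function name `precision_at_target_recall` and the threshold sweep over observed scores are assumptions made for illustration.

```python
def precision_at_target_recall(scores, labels, target_recall):
    """Return (threshold, precision) with the best precision among
    thresholds whose recall is at least target_recall, or None.

    scores: predicted scores, higher means "more likely positive".
    labels: ground-truth labels, each 0 or 1.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")

    best = None
    # Try every observed score as a decision threshold (illustrative choice).
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        recall = tp / total_pos
        precision = tp / (tp + fp)  # tp + fp >= 1 for any observed threshold
        if recall >= target_recall and (best is None or precision > best[1]):
            best = (threshold, precision)
    return best
```

Choosing the model by a metric like this differs from minimizing a continuous loss: two models with similar cross entropy can have quite different precision once a recall floor is imposed.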
This Estimator may be fit via calls to fit_ndarray() or fit(). The former allows a LinearLearner model to be fit on a 2-dimensional numpy array. The latter requires Amazon Record protobuf serialized data to be stored in S3.
To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdftraining.html
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking deploy(). As well as deploying an Endpoint, deploy returns a LinearLearnerPredictor object that can be used to make class or regression predictions, using the trained model.
LinearLearner Estimators can be configured by setting hyperparameters. The available hyperparameters for LinearLearner are documented below. For further information on the AWS LinearLearner algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/linearlearner.html
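The fit-then-deploy flow described above can be sketched as a small helper. This is an illustration under assumptions: `fit_and_deploy` is a hypothetical function, not part of the SDK, and it uses only the `record_set()`, `fit()` and `deploy()` methods documented on this page. The commented usage needs the sagemaker package, AWS credentials and a real IAM role; the role name shown is a placeholder.

```python
def fit_and_deploy(estimator, train, labels,
                   instance_count=1, instance_type="ml.c4.xlarge"):
    """Run record_set() -> fit() -> deploy() and return the predictor."""
    # Serialize the arrays to Record objects in S3; returns a RecordSet.
    records = estimator.record_set(train, labels=labels)
    # Train on the serialized records.
    estimator.fit(records)
    # Create an endpoint; deploy() returns a predictor object.
    return estimator.deploy(initial_instance_count=instance_count,
                            instance_type=instance_type)


# Hypothetical usage (not executed here):
#
#   from sagemaker import LinearLearner
#   estimator = LinearLearner(role="MySageMakerRole",   # placeholder role
#                             train_instance_count=1,
#                             train_instance_type="ml.c4.xlarge",
#                             predictor_type="binary_classifier")
#   predictor = fit_and_deploy(estimator, train_array, label_array)
```

Passing the estimator in, rather than constructing it inside the helper, keeps the hyperparameter choices at the call site.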
Parameters:  role (str) – An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role if it accesses AWS resources.
 train_instance_count (int) – Number of Amazon EC2 instances to use for training.
 train_instance_type (str) – Type of EC2 instance to use for training, for example, ‘ml.c4.xlarge’.
 predictor_type (str) – The type of predictor to learn. Either “binary_classifier” or “regressor”.
 binary_classifier_model_selection_criteria (str) – One of ‘accuracy’, ‘f1’, ‘precision_at_target_recall’, ‘recall_at_target_precision’, ‘cross_entropy_loss’.
 target_recall (float) – Target recall. Only applicable if binary_classifier_model_selection_criteria is precision_at_target_recall.
 target_precision (float) – Target precision. Only applicable if binary_classifier_model_selection_criteria is recall_at_target_precision.
 positive_example_weight_mult (float) – The importance weight of positive examples is multiplied by this constant. Useful for skewed datasets. Only applies for classification tasks.
 epochs (int) – The maximum number of passes to make over the training data.
 use_bias (bool) – Whether to include a bias field.
 num_models (int) – Number of models to train in parallel. If not set, the number of parallel models to train will be decided by the algorithm itself. One model will be trained according to the given training parameters (regularization, optimizer, loss).
 num_calibration_samples (int) – Number of observations to use from the validation dataset for model calibration (finding the best threshold).
 init_method (str) – Function to use to set the initial model weights. One of “uniform” or “normal”
 init_scale (float) – For “uniform” init, the range of values.
 init_sigma (float) – For “normal” init, the standard deviation.
 init_bias (float) – Initial weight for bias term
 optimizer (str) – One of ‘sgd’, ‘adam’ or ‘auto’
 loss (str) – One of ‘logistic’, ‘squared_loss’, ‘absolute_loss’ or ‘auto’
 wd (float) – L2 regularization parameter i.e. the weight decay parameter. Use 0 for no L2 regularization.
 l1 (float) – L1 regularization parameter. Use 0 for no L1 regularization.
 momentum (float) – Momentum parameter of sgd optimizer.
 learning_rate (float) – The SGD learning rate
 beta_1 (float) – Exponential decay rate for first moment estimates. Only applies for adam optimizer.
 beta_2 (float) – Exponential decay rate for second moment estimates. Only applies for adam optimizer.
 bias_lr_mult (float) – Allows a different learning rate for the bias term. The actual learning rate for the bias is the learning rate times bias_lr_mult.
 bias_wd_mult (float) – Allows different regularization for the bias term. The actual L2 regularization weight for the bias is wd times bias_wd_mult. By default there is no regularization on the bias term.
 use_lr_scheduler (bool) – If true, we use a scheduler for the learning rate.
 lr_scheduler_step (int) – The number of steps between decreases of the learning rate. Only applies to learning rate scheduler.
 lr_scheduler_factor (float) – Every lr_scheduler_step the learning rate will decrease by this quantity. Only applies for learning rate scheduler.
 lr_scheduler_minimum_lr (float) – The learning rate will never decrease to a value lower than this. Only applies for learning rate scheduler.
 normalize_data (bool) – Normalizes the features before training to have standard deviation of 1.0.
 normalize_label (bool) – Normalizes the regression label to have a standard deviation of 1.0. If set for classification, it will be ignored.
 unbias_data (bool) – If true, features are modified to have mean 0.0.
 unbias_label (bool) – If true, labels are modified to have mean 0.0.
 num_point_for_scalar (int) – The number of data points to use for calculating the normalizing and unbiasing terms.
 **kwargs – base class keyword argument values.
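How the learning rate scheduler parameters and bias_lr_mult interact can be sketched as follows. This is an illustration only. The parameter descriptions say the learning rate "will decrease by this quantity" every lr_scheduler_step; the sketch assumes the decrease is multiplicative, which the text above does not state. The function names are hypothetical.

```python
def scheduled_learning_rate(step, learning_rate, lr_scheduler_step,
                            lr_scheduler_factor, lr_scheduler_minimum_lr):
    """Learning rate after `step` updates under a step-decay schedule.

    Assumption: every lr_scheduler_step updates the rate is multiplied
    by lr_scheduler_factor, and it is never allowed to fall below
    lr_scheduler_minimum_lr.
    """
    decays = step // lr_scheduler_step
    decayed = learning_rate * (lr_scheduler_factor ** decays)
    return max(decayed, lr_scheduler_minimum_lr)


def bias_learning_rate(learning_rate, bias_lr_mult):
    """Learning rate applied to the bias term: learning rate times bias_lr_mult."""
    return learning_rate * bias_lr_mult


# Example: start at 0.1, halve every 100 steps, floor at 0.01.
for step in (0, 100, 250, 1000):
    print(step, scheduled_learning_rate(step, 0.1, 100, 0.5, 0.01))
```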

repo = 'linearlearner:1'¶

unbias_label¶
An algorithm hyperparameter with optional validation. Implemented as a Python descriptor object.

num_point_for_scalar¶
An algorithm hyperparameter with optional validation. Implemented as a Python descriptor object.
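For readers unfamiliar with descriptors, the following is a minimal sketch of how a hyperparameter with optional validation can be implemented as a Python descriptor. It shows the general pattern, not the SDK's actual class. `Hyperparameter` and `ToyEstimator` are hypothetical names, and the validation rules are invented for the example.

```python
class Hyperparameter:
    """Descriptor that stores a value per instance and validates it on assignment."""

    def __init__(self, name, validate=lambda _: True, message=""):
        self.name = name
        self.validate = validate
        self.message = message

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not an instance
        return obj.__dict__.get("_hp", {}).get(self.name)

    def __set__(self, obj, value):
        # None means "not set"; anything else must pass validation.
        if value is not None and not self.validate(value):
            raise ValueError(
                f"Invalid value for {self.name}: {value!r}. {self.message}")
        obj.__dict__.setdefault("_hp", {})[self.name] = value


class ToyEstimator:
    # Illustrative rules only; the real constraints are defined by the SDK.
    epochs = Hyperparameter(
        "epochs", lambda v: isinstance(v, int) and v > 0, "Expected an integer > 0.")
    unbias_label = Hyperparameter(
        "unbias_label", lambda v: isinstance(v, bool), "Expected a bool.")

    def __init__(self, epochs=None, unbias_label=None):
        self.epochs = epochs              # goes through Hyperparameter.__set__
        self.unbias_label = unbias_label
```

With this pattern an invalid value is rejected at assignment time, for example `ToyEstimator(epochs=-1)` raises ValueError before any training job is started.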

data_location¶

delete_endpoint()¶
Delete an Amazon SageMaker Endpoint.
Raises: ValueError – If the endpoint does not exist.

deploy(initial_instance_count, instance_type, endpoint_name=None, **kwargs)¶
Deploy the trained model to an Amazon SageMaker endpoint and return a sagemaker.RealTimePredictor object.
More information: http://docs.aws.amazon.com/sagemaker/latest/dg/howitworkstraining.html
Parameters:  initial_instance_count (int) – Minimum number of EC2 instances to deploy to an endpoint for prediction.
 instance_type (str) – Type of EC2 instance to deploy to an endpoint for prediction, for example, ‘ml.c4.xlarge’.
 endpoint_name (str) – Name to use for creating an Amazon SageMaker endpoint. If not specified, the name of the training job is used.
 **kwargs – Passed to invocation of create_model(). Implementations may customize create_model() to accept **kwargs to customize model creation during deploy. For more, see the implementation docs.
Returns:  A predictor that provides a predict() method, which can be used to send requests to the Amazon SageMaker endpoint and obtain inferences.
Return type:

fit(records, mini_batch_size=None, **kwargs)¶
Fit this Estimator on serialized Record objects, stored in S3.
records should be an instance of RecordSet. This defines a collection of S3 data files to train this Estimator on.
Training data is expected to be encoded as dense or sparse vectors in the “values” feature on each Record. If the data is labeled, the label is expected to be encoded as a list of scalars in the “values” feature of the Record label.
More information on the Amazon Record format is available at: https://docs.aws.amazon.com/sagemaker/latest/dg/cdftraining.html
See record_set() to construct a RecordSet object from ndarray arrays.
Parameters:

hyperparameters()¶

model_data¶
str – The model location in S3. Only set if the Estimator has been fit().

record_set(train, labels=None, channel='train')¶
Build a RecordSet from a numpy ndarray matrix and label vector.
For the 2D ndarray train, each row is converted to a Record object. The vector is stored in the “values” entry of the features property of each Record. If labels is not None, each corresponding label is assigned to the “values” entry of the labels property of each Record.
The collection of Record objects is protobuf serialized and uploaded to new S3 locations. A manifest file is generated containing the list of objects created and also stored in S3.
The number of S3 objects created is controlled by the train_instance_count property on this Estimator. One S3 object is created per training instance.
Parameters:  train (numpy.ndarray) – A 2D numpy array of training data.
 labels (numpy.ndarray) – A 1D numpy array of labels. Its length must be equal to the number of rows in train.
 channel (str) – The SageMaker TrainingJob channel this RecordSet should be assigned to.
Returns: A RecordSet referencing the encoded, uploaded training and label data.
Return type: RecordSet
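The "one S3 object per training instance" rule implies that the rows of train are divided into train_instance_count parts before upload. The sketch below shows one way such a split could look. The text above does not say how rows are divided; contiguous, near-equal shards are an assumption, and `split_into_shards` is a hypothetical helper that works on plain Python sequences rather than on ndarrays or S3.

```python
def split_into_shards(rows, num_shards):
    """Split `rows` into `num_shards` contiguous, near-equal parts.

    Assumption: earlier shards absorb the remainder when the row count
    is not divisible by num_shards.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    base, extra = divmod(len(rows), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)
        shards.append(rows[start:start + size])
        start += size
    return shards


# Five rows over two training instances gives two shards.
print(split_into_shards([[0.1], [0.2], [0.3], [0.4], [0.5]], 2))
```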

train_image()¶

normalize_data¶
An algorithm hyperparameter with optional validation. Implemented as a Python descriptor object.

normalize_label¶
An algorithm hyperparameter with optional validation. Implemented as a Python descriptor object.

unbias_data¶
An algorithm hyperparameter with optional validation. Implemented as a Python descriptor object.
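The effect of the normalize and unbias hyperparameters on a single feature column (or on the regression label) can be sketched with the standard library. This is an illustration of the arithmetic, not the algorithm's implementation. The use of the population standard deviation is an assumption, and the sketch computes statistics from all values rather than from a sample of num_point_for_scalar points. `standardize_column` is a hypothetical name.

```python
from statistics import mean, pstdev


def standardize_column(values, unbias=True, normalize=True):
    """Return a copy of `values` shifted to mean 0.0 and/or scaled to std 1.0.

    unbias    -> subtract the mean (as with unbias_data / unbias_label)
    normalize -> divide by the standard deviation (as with normalize_data /
                 normalize_label); population std is assumed here.
    """
    mu = mean(values) if unbias else 0.0
    out = [v - mu for v in values]
    if normalize:
        sigma = pstdev(values)
        if sigma > 0:  # leave constant columns unscaled
            out = [v / sigma for v in out]
    return out


print(standardize_column([1.0, 3.0], normalize=False))  # [-1.0, 1.0]
```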

create_model()¶
Return a LinearLearnerModel referencing the latest S3 model data produced by this Estimator.

class sagemaker.LinearLearnerModel(model_data, role, sagemaker_session=None)¶
Bases: sagemaker.model.Model
Reference LinearLearner S3 model data. Calling deploy() creates an Endpoint and returns a LinearLearnerPredictor.

class sagemaker.LinearLearnerPredictor(endpoint, sagemaker_session=None)¶
Bases: sagemaker.predictor.RealTimePredictor
Performs binary-classification or regression prediction from input vectors.
The implementation of predict() in this RealTimePredictor requires a numpy ndarray as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on.
predict() returns a list of Record objects, one for each row in the input ndarray. The prediction is stored in the "predicted_label" key of the Record.label field.
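Reading predictions out of the returned Record objects might look like the sketch below. It rests on an assumption not stated above: that each entry of Record.label holds its numbers in a `float32_tensor.values` sequence. The `fake_record` helper builds stand-in objects with that shape so the example runs without an endpoint; it is not part of the SDK.

```python
from types import SimpleNamespace


def extract_predicted_labels(records):
    """Return the first "predicted_label" value of each Record-like object.

    Assumes record.label["predicted_label"].float32_tensor.values exists.
    """
    return [r.label["predicted_label"].float32_tensor.values[0] for r in records]


def fake_record(value):
    """Stand-in for a Record, for illustration only."""
    tensor = SimpleNamespace(values=[value])
    return SimpleNamespace(
        label={"predicted_label": SimpleNamespace(float32_tensor=tensor)})


# With a real endpoint this list would come from predictor.predict(array).
results = [fake_record(1.0), fake_record(0.0)]
print(extract_predicted_labels(results))  # [1.0, 0.0]
```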