amazonka-glue
Copyright: (c) 2013-2021 Brendan Hay
License: Mozilla Public License, v. 2.0.
Maintainer: Brendan Hay <brendan.g.hay+amazonka@gmail.com>
Stability: auto-generated
Portability: non-portable (GHC extensions)
Safe Haskell: None

Amazonka.Glue.Types.MLTransform

Documentation

data MLTransform

A structure for a machine learning transform.

See: newMLTransform smart constructor.

Constructors

MLTransform' 

Fields

  • status :: Maybe TransformStatusType

    The current status of the machine learning transform.

  • numberOfWorkers :: Maybe Int

    The number of workers of a defined workerType that are allocated when a task of the transform runs.

    If WorkerType is set, then NumberOfWorkers is required (and vice versa).

  • lastModifiedOn :: Maybe POSIX

    A timestamp. The last point in time when this machine learning transform was modified.

  • labelCount :: Maybe Int

    A count identifier for the labeling files generated by Glue for this transform. As you create a better transform, you can iteratively download, label, and upload the labeling file.

  • workerType :: Maybe WorkerType

    The type of predefined worker that is allocated when a task of this transform runs. Accepts a value of Standard, G.1X, or G.2X.

    • For the Standard worker type, each worker provides 4 vCPUs, 16 GB of memory, a 50 GB disk, and 2 executors.
    • For the G.1X worker type, each worker provides 4 vCPUs, 16 GB of memory, a 64 GB disk, and 1 executor.
    • For the G.2X worker type, each worker provides 8 vCPUs, 32 GB of memory, a 128 GB disk, and 1 executor.

    MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

    • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
    • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
    • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
    • MaxCapacity and NumberOfWorkers must both be at least 1.

  • inputRecordTables :: Maybe [GlueTable]

    A list of Glue table definitions used by the transform.

  • glueVersion :: Maybe Text

    This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

  • evaluationMetrics :: Maybe EvaluationMetrics

    An EvaluationMetrics object. Evaluation metrics provide an estimate of the quality of your machine learning transform.

  • schema :: Maybe [SchemaColumn]

    A map of key-value pairs representing the columns and data types that this transform can run against. Has an upper bound of 100 columns.

  • role' :: Maybe Text

    The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

    • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.
    • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

  • name :: Maybe Text

    A user-defined name for the machine learning transform. Names are not guaranteed to be unique and can be changed at any time.

  • parameters :: Maybe TransformParameters

    A TransformParameters object. You can use parameters to tune (customize) the behavior of the machine learning transform by specifying what data it learns from and your preference on various tradeoffs (such as precision vs. recall, or accuracy vs. cost).

  • maxRetries :: Maybe Int

    The maximum number of times to retry after an MLTaskRun of the machine learning transform fails.

  • maxCapacity :: Maybe Double

    The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

    MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

    • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
    • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
    • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
    • MaxCapacity and NumberOfWorkers must both be at least 1.

    When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

  • timeout :: Maybe Natural

    The timeout in minutes of the machine learning transform.

  • transformEncryption :: Maybe TransformEncryption

    The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

  • description :: Maybe Text

    A user-defined, long-form description text for the machine learning transform. Descriptions are not guaranteed to be unique and can be changed at any time.

  • createdOn :: Maybe POSIX

    A timestamp. The time and date that this machine learning transform was created.

  • transformId :: Maybe Text

    The unique transform ID that is generated for the machine learning transform. The ID is guaranteed to be unique and does not change.

Instances

Eq MLTransform

Defined in Amazonka.Glue.Types.MLTransform

Read MLTransform

Defined in Amazonka.Glue.Types.MLTransform

Show MLTransform

Defined in Amazonka.Glue.Types.MLTransform

Generic MLTransform

Defined in Amazonka.Glue.Types.MLTransform

Associated Types

type Rep MLTransform :: Type -> Type

NFData MLTransform

Defined in Amazonka.Glue.Types.MLTransform

Methods

rnf :: MLTransform -> ()

Hashable MLTransform

Defined in Amazonka.Glue.Types.MLTransform

FromJSON MLTransform

Defined in Amazonka.Glue.Types.MLTransform

type Rep MLTransform

Defined in Amazonka.Glue.Types.MLTransform

type Rep MLTransform = D1 ('MetaData "MLTransform" "Amazonka.Glue.Types.MLTransform" "libZSservicesZSamazonka-glueZSamazonka-glue" 'False) (C1 ('MetaCons "MLTransform'" 'PrefixI 'True) ((((S1 ('MetaSel ('Just "status") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe TransformStatusType)) :*: S1 ('MetaSel ('Just "numberOfWorkers") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Int))) :*: (S1 ('MetaSel ('Just "lastModifiedOn") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe POSIX)) :*: S1 ('MetaSel ('Just "labelCount") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Int)))) :*: ((S1 ('MetaSel ('Just "workerType") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe WorkerType)) :*: S1 ('MetaSel ('Just "inputRecordTables") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [GlueTable]))) :*: (S1 ('MetaSel ('Just "glueVersion") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Text)) :*: (S1 ('MetaSel ('Just "evaluationMetrics") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe EvaluationMetrics)) :*: S1 ('MetaSel ('Just "schema") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [SchemaColumn])))))) :*: (((S1 ('MetaSel ('Just "role'") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Text)) :*: S1 ('MetaSel ('Just "name") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Text))) :*: (S1 ('MetaSel ('Just "parameters") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe TransformParameters)) :*: (S1 ('MetaSel ('Just "maxRetries") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Int)) :*: S1 ('MetaSel ('Just "maxCapacity") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Double))))) :*: ((S1 ('MetaSel ('Just "timeout") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) 
(Rec0 (Maybe Natural)) :*: S1 ('MetaSel ('Just "transformEncryption") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe TransformEncryption))) :*: (S1 ('MetaSel ('Just "description") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Text)) :*: (S1 ('MetaSel ('Just "createdOn") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe POSIX)) :*: S1 ('MetaSel ('Just "transformId") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Text))))))))

newMLTransform :: MLTransform

Create a value of MLTransform with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.
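
The smart-constructor-plus-lens pattern described here can be sketched with base alone. The sketch below is an illustration, not the library's code: Transform, newTransform, transformName, transformMaxRetries, set and view are hypothetical stand-ins, and String replaces Text so that no extra packages are needed. With the lens package, the equivalent against the real type would presumably look like `newMLTransform & mLTransform_name .~ Just "dedupe"`.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-in for MLTransform with only two of its optional fields.
data Transform = Transform
  { _name :: Maybe String
  , _maxRetries :: Maybe Int
  }
  deriving (Eq, Show)

-- Mirrors the smart-constructor convention: every optional field starts as Nothing.
newTransform :: Transform
newTransform = Transform {_name = Nothing, _maxRetries = Nothing}

-- van Laarhoven lens, the representation used by the lens package.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

transformName :: Lens' Transform (Maybe String)
transformName f t = (\n -> t {_name = n}) <$> f (_name t)

transformMaxRetries :: Lens' Transform (Maybe Int)
transformMaxRetries f t = (\r -> t {_maxRetries = r}) <$> f (_maxRetries t)

-- Minimal setter and getter so the sketch needs nothing beyond base.
set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

view :: Lens' s a -> s -> a
view l = getConst . l Const

-- Start from the empty value and fill in two optional fields.
example :: Transform
example =
  set transformMaxRetries (Just 3) (set transformName (Just "dedupe") newTransform)
```

Here `view transformName example` evaluates to `Just "dedupe"`, while `view transformName newTransform` stays `Nothing`.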

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:status:MLTransform', mLTransform_status - The current status of the machine learning transform.

$sel:numberOfWorkers:MLTransform', mLTransform_numberOfWorkers - The number of workers of a defined workerType that are allocated when a task of the transform runs.

If WorkerType is set, then NumberOfWorkers is required (and vice versa).

$sel:lastModifiedOn:MLTransform', mLTransform_lastModifiedOn - A timestamp. The last point in time when this machine learning transform was modified.

$sel:labelCount:MLTransform', mLTransform_labelCount - A count identifier for the labeling files generated by Glue for this transform. As you create a better transform, you can iteratively download, label, and upload the labeling file.

$sel:workerType:MLTransform', mLTransform_workerType - The type of predefined worker that is allocated when a task of this transform runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPUs, 16 GB of memory, a 50 GB disk, and 2 executors.
  • For the G.1X worker type, each worker provides 4 vCPUs, 16 GB of memory, a 64 GB disk, and 1 executor.
  • For the G.2X worker type, each worker provides 8 vCPUs, 32 GB of memory, a 128 GB disk, and 1 executor.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

$sel:inputRecordTables:MLTransform', mLTransform_inputRecordTables - A list of Glue table definitions used by the transform.

$sel:glueVersion:MLTransform', mLTransform_glueVersion - This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

$sel:evaluationMetrics:MLTransform', mLTransform_evaluationMetrics - An EvaluationMetrics object. Evaluation metrics provide an estimate of the quality of your machine learning transform.

$sel:schema:MLTransform', mLTransform_schema - A map of key-value pairs representing the columns and data types that this transform can run against. Has an upper bound of 100 columns.

$sel:role':MLTransform', mLTransform_role - The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

  • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.
  • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

$sel:name:MLTransform', mLTransform_name - A user-defined name for the machine learning transform. Names are not guaranteed to be unique and can be changed at any time.

$sel:parameters:MLTransform', mLTransform_parameters - A TransformParameters object. You can use parameters to tune (customize) the behavior of the machine learning transform by specifying what data it learns from and your preference on various tradeoffs (such as precision vs. recall, or accuracy vs. cost).

$sel:maxRetries:MLTransform', mLTransform_maxRetries - The maximum number of times to retry after an MLTaskRun of the machine learning transform fails.

$sel:maxCapacity:MLTransform', mLTransform_maxCapacity - The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

$sel:timeout:MLTransform', mLTransform_timeout - The timeout in minutes of the machine learning transform.

$sel:transformEncryption:MLTransform', mLTransform_transformEncryption - The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

$sel:description:MLTransform', mLTransform_description - A user-defined, long-form description text for the machine learning transform. Descriptions are not guaranteed to be unique and can be changed at any time.

$sel:createdOn:MLTransform', mLTransform_createdOn - A timestamp. The time and date that this machine learning transform was created.

$sel:transformId:MLTransform', mLTransform_transformId - The unique transform ID that is generated for the machine learning transform. The ID is guaranteed to be unique and does not change.

mLTransform_status :: Lens' MLTransform (Maybe TransformStatusType)

The current status of the machine learning transform.

mLTransform_numberOfWorkers :: Lens' MLTransform (Maybe Int)

The number of workers of a defined workerType that are allocated when a task of the transform runs.

If WorkerType is set, then NumberOfWorkers is required (and vice versa).

mLTransform_lastModifiedOn :: Lens' MLTransform (Maybe UTCTime)

A timestamp. The last point in time when this machine learning transform was modified.

mLTransform_labelCount :: Lens' MLTransform (Maybe Int)

A count identifier for the labeling files generated by Glue for this transform. As you create a better transform, you can iteratively download, label, and upload the labeling file.

mLTransform_workerType :: Lens' MLTransform (Maybe WorkerType)

The type of predefined worker that is allocated when a task of this transform runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPUs, 16 GB of memory, a 50 GB disk, and 2 executors.
  • For the G.1X worker type, each worker provides 4 vCPUs, 16 GB of memory, a 64 GB disk, and 1 executor.
  • For the G.2X worker type, each worker provides 8 vCPUs, 32 GB of memory, a 128 GB disk, and 1 executor.
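
The per-worker figures listed above can be written down as a small lookup table. This is only a sketch: Worker, WorkerSpec, workerSpec and totalFor are hypothetical names rather than part of amazonka-glue, and the totals assume that resources scale linearly with the number of workers.

```haskell
-- Hypothetical stand-in for the library's WorkerType enumeration.
data Worker = Standard | G1X | G2X
  deriving (Eq, Show)

-- Per-worker resources as listed in the documentation above.
data WorkerSpec = WorkerSpec
  { vCpus :: Int
  , memoryGb :: Int
  , diskGb :: Int
  , executors :: Int
  }
  deriving (Eq, Show)

workerSpec :: Worker -> WorkerSpec
workerSpec Standard = WorkerSpec 4 16 50 2
workerSpec G1X = WorkerSpec 4 16 64 1
workerSpec G2X = WorkerSpec 8 32 128 1

-- Total resources for n identical workers (plain multiplication, an assumption).
totalFor :: Int -> Worker -> WorkerSpec
totalFor n w =
  let s = workerSpec w
   in WorkerSpec (n * vCpus s) (n * memoryGb s) (n * diskGb s) (n * executors s)
```

For example, `totalFor 10 G2X` gives 80 vCPUs, 320 GB of memory, 1280 GB of disk and 10 executors.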

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.
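
The four rules above can be checked on the client side before a request is sent. The following is a minimal sketch under stated assumptions: Capacity and capacityErrors are hypothetical names, the worker type is modelled as a plain String, and the checks cover only values a caller sets (not values the service fills in on its own).

```haskell
import Data.Maybe (isJust)

-- Hypothetical, simplified view of the three capacity-related fields.
data Capacity = Capacity
  { maxCapacity :: Maybe Double
  , numberOfWorkers :: Maybe Int
  , workerType :: Maybe String
  }
  deriving (Eq, Show)

-- Returns the violated rules; an empty list means the combination is consistent.
capacityErrors :: Capacity -> [String]
capacityErrors c =
  concat
    [ ["MaxCapacity cannot be combined with NumberOfWorkers or WorkerType" | hasMax, hasWorkers || hasType]
    , ["WorkerType and NumberOfWorkers must be set together" | hasWorkers /= hasType]
    , ["MaxCapacity must be at least 1" | Just m <- [maxCapacity c], m < 1]
    , ["NumberOfWorkers must be at least 1" | Just n <- [numberOfWorkers c], n < 1]
    ]
  where
    hasMax = isJust (maxCapacity c)
    hasWorkers = isJust (numberOfWorkers c)
    hasType = isJust (workerType c)
```

Setting only MaxCapacity, or setting NumberOfWorkers together with WorkerType, yields an empty list. Setting all three reports the mutual-exclusion rule.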

mLTransform_inputRecordTables :: Lens' MLTransform (Maybe [GlueTable])

A list of Glue table definitions used by the transform.

mLTransform_glueVersion :: Lens' MLTransform (Maybe Text)

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.
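
Because the field is optional, code that reads it may want to make the documented fallback explicit. A minimal sketch, assuming the version is handled as a String (the real field is Text) and using the hypothetical names defaultGlueVersion and effectiveGlueVersion:

```haskell
import Data.Maybe (fromMaybe)

-- Compatibility version that applies when glueVersion is Nothing, per the text above.
defaultGlueVersion :: String
defaultGlueVersion = "0.9"

-- Resolve the optional field to the version that is actually in effect.
effectiveGlueVersion :: Maybe String -> String
effectiveGlueVersion = fromMaybe defaultGlueVersion
```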

mLTransform_evaluationMetrics :: Lens' MLTransform (Maybe EvaluationMetrics)

An EvaluationMetrics object. Evaluation metrics provide an estimate of the quality of your machine learning transform.

mLTransform_schema :: Lens' MLTransform (Maybe [SchemaColumn])

A map of key-value pairs representing the columns and data types that this transform can run against. Has an upper bound of 100 columns.
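
The 100-column bound can be enforced locally before a schema is submitted. In this sketch Column is a hypothetical stand-in for SchemaColumn (a column name paired with its data type), and checkSchema and columnType are illustrative helpers, not library functions.

```haskell
-- Hypothetical stand-in for SchemaColumn: (column name, data type).
type Column = (String, String)

-- Upper bound stated in the documentation above.
maxSchemaColumns :: Int
maxSchemaColumns = 100

-- Left with a message when the schema exceeds the bound, Right otherwise.
checkSchema :: [Column] -> Either String [Column]
checkSchema cols
  | n > maxSchemaColumns =
      Left ("schema has " ++ show n ++ " columns; the limit is " ++ show maxSchemaColumns)
  | otherwise = Right cols
  where
    n = length cols

-- Look up a column's data type by name, treating the list as key-value pairs.
columnType :: String -> [Column] -> Maybe String
columnType = lookup
```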

mLTransform_role :: Lens' MLTransform (Maybe Text)

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

  • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.
  • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

mLTransform_name :: Lens' MLTransform (Maybe Text)

A user-defined name for the machine learning transform. Names are not guaranteed to be unique and can be changed at any time.

mLTransform_parameters :: Lens' MLTransform (Maybe TransformParameters)

A TransformParameters object. You can use parameters to tune (customize) the behavior of the machine learning transform by specifying what data it learns from and your preference on various tradeoffs (such as precision vs. recall, or accuracy vs. cost).

mLTransform_maxRetries :: Lens' MLTransform (Maybe Int)

The maximum number of times to retry after an MLTaskRun of the machine learning transform fails.

mLTransform_maxCapacity :: Lens' MLTransform (Maybe Double)

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
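
The DPU arithmetic above (4 vCPUs and 16 GB of memory per DPU, default of 10, allocatable range of 2 to 100) can be worked through as follows. This is a sketch with hypothetical names, and it assumes the default of 10 applies whenever the field is Nothing, which may not hold when WorkerType is set and the service fills in MaxCapacity itself.

```haskell
import Data.Maybe (fromMaybe)

-- Resources per DPU as stated above: 4 vCPUs and 16 GB of memory.
vCpusPerDpu, memoryGbPerDpu :: Double
vCpusPerDpu = 4
memoryGbPerDpu = 16

-- Documented default when MaxCapacity is not specified.
defaultDpus :: Double
defaultDpus = 10

-- (total vCPUs, total memory in GB) for an optional MaxCapacity value.
dpuResources :: Maybe Double -> (Double, Double)
dpuResources mc = (d * vCpusPerDpu, d * memoryGbPerDpu)
  where
    d = fromMaybe defaultDpus mc

-- True when the value lies in the documented 2 to 100 DPU range.
inDocumentedRange :: Double -> Bool
inDocumentedRange d = d >= 2 && d <= 100
```

With the field unset, `dpuResources Nothing` gives 40 vCPUs and 160 GB of memory.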

mLTransform_timeout :: Lens' MLTransform (Maybe Natural)

The timeout in minutes of the machine learning transform.

mLTransform_transformEncryption :: Lens' MLTransform (Maybe TransformEncryption)

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

mLTransform_description :: Lens' MLTransform (Maybe Text)

A user-defined, long-form description text for the machine learning transform. Descriptions are not guaranteed to be unique and can be changed at any time.

mLTransform_createdOn :: Lens' MLTransform (Maybe UTCTime)

A timestamp. The time and date that this machine learning transform was created.

mLTransform_transformId :: Lens' MLTransform (Maybe Text)

The unique transform ID that is generated for the machine learning transform. The ID is guaranteed to be unique and does not change.