create-job

Description

Create a training job.

Synopsis

$ abeja training create-job [--help]
Usage: abeja training create-job [OPTIONS]

  Create training job

Options:
  -j, --job_definition_name, --job-definition-name TEXT
                                  Training job definition name
  -v, --version TEXT              Job definition version. By default, latest
                                  version is used
  -e, --environment ENVIRONMENTSTRING
                                  Environment variables, ex. BATCH_SIZE:32
  -p, --params USERPARAMSTRING    [DEPRECATED] User parameters, ex.
                                  BATCH_SIZE:32. If environment is specified,
                                  this will be ignored. Please use
                                  `--environment` option.
  --instance-type TEXT            Instance Type of the machine where training
                                  job is executed. By default, cpu-1 and gpu-1
                                  is used for all-cpu and all-gpu images
                                  respectively.
  -d, --description TEXT          Description for the training job, which must
                                  be less than or equal to 256 characters.
  --dataset, --datasets DATASETPARAMSTRING
                                  Datasets name
  --export-log                    Include the log in the model file. This
                                  feature is only available with 19.04 or
                                  later images.
  --help                          Show this message and exit.

Argument

Parameters are read from the training configuration file (training.yaml).

Options

-j, --job_definition_name, --job-definition-name

Training job definition name

When training.yaml is present, the value defined as name in training.yaml is used by default.
This option overrides name in training.yaml.

-v, --version

Specify the job definition version. By default, the latest version is used.

-e, --environment

Specify an environment variable in KEY:VALUE format, e.g. IMAGE_WIDTH:100. Registered environment variables can be referenced from the training code.
For more information on user-specifiable environment variables, see here.

(Version 0.14.0 or later) When training.yaml is present, the values defined as environment (params) in training.yaml are used by default.
This option overrides environment (params) in training.yaml.
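As a minimal sketch of passing several environment variables, the snippet below assembles the command line and prints it for review instead of running it. It assumes that --environment can be repeated once per variable, in the same way this page documents for --params; the variable names and values are illustrative only.

```shell
# Start from the base command; --version 1 mirrors the example on this page.
cmd="abeja training create-job --version 1"

# Append one --environment option per KEY:VALUE pair.
# Assumption: --environment may be repeated, like the documented --params.
for pair in BATCH_SIZE:32 IMAGE_WIDTH:100; do
  cmd="$cmd --environment $pair"
done

# Print the assembled command so it can be checked before it is run.
echo "$cmd"
```

Values given this way take precedence over the environment (params) entries in training.yaml, as described above.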

-p, --params

[DEPRECATED] User parameter in KEY:VALUE format. Multiple parameters can be specified, e.g. --params key1:val1 --params key2:val2
If --environment is also specified, parameters given by --params are ignored.
This option is deprecated; use --environment instead.

--instance-type

Instance type of the machine where the training job is executed. By default, cpu-1 and gpu-1 are used for all-cpu and all-gpu images respectively.

-d, --description

Description for the training job, which must be 256 characters or fewer.

--dataset, --datasets

Specify the dataset to use in the following format:
{dataset_name}:{dataset_id}
The registered dataset can be referenced from the context passed as an argument to the training code.

(Version 0.14.0 or later) When training.yaml is present, the values defined as datasets in training.yaml are used by default.
This option overrides datasets in training.yaml.

Note: in version 1.1.0 or later, the -d option is the short form of --description.
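The {dataset_name}:{dataset_id} format can be illustrated with a short sketch that composes the argument and prints the resulting command without running it. The name mnist and the ID 1111111111111 are the placeholder values used in the training.yaml example on this page, not a real dataset.

```shell
# Alias under which the dataset is referenced from the training code.
dataset_name="mnist"
# Placeholder dataset ID taken from the example on this page.
dataset_id="1111111111111"

# Join the two parts with a colon, as the option expects.
dataset_arg="${dataset_name}:${dataset_id}"

# Print the command for review; the long form --datasets is used here
# because -d is the short form of --description in version 1.1.0 or later.
cmd="abeja training create-job --version 1 --datasets ${dataset_arg}"
echo "$cmd"
```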

--export-log

Include the log in the model as a file named .abeja_train.log.

This feature is only available with 19.04 or later images.

Example

To create a training job

This example creates a training job from version 1 of the job definition.

Training configuration file (training.yaml):

name: training1
handler: train:handler
image: abeja-inc/all-gpu:19.04
datasets:
  "mnist": "1111111111111"

Command:

$ abeja training create-job --version 1

Output:

{
  "created_at": "2018-02-13T10:14:10.956198Z",
  "job_definition_version": 1,
  "modified_at": "2018-02-13T10:13:10.956198Z",
  "status": {
    "active": null,
    "completion_time": null,
    "conditions": null,
    "failed": null,
    "start_time": null,
    "succeeded": null
  },
  "training_job_id": "job-e45bc2647ab74427",
  "environment": {}
}
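When the command is used from a script, the job ID is usually the field of interest. The sketch below pulls training_job_id out of the JSON with sed. It assumes the command prints JSON in the shape shown in the output above; a trimmed copy of that output stands in for the captured result here, and a JSON-aware tool such as jq would be more robust where it is available.

```shell
# Trimmed copy of the example output; in a real script this would be the
# captured standard output of `abeja training create-job`.
output='{
  "job_definition_version": 1,
  "training_job_id": "job-e45bc2647ab74427"
}'

# Print only the value of training_job_id from the line that contains it.
job_id=$(printf '%s\n' "$output" | sed -n 's/.*"training_job_id": *"\([^"]*\)".*/\1/p')

echo "$job_id"
```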