train-local

Added in version 0.11

Before version 0.11, the train-local command was named debug-local. It has since been renamed.

Description

Runs the training job definition version specified by --version in the local environment.

Notes

To use the train-local command, Docker must be installed.
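As a quick pre-flight step, the following sketch checks whether the docker client is on PATH. It only detects the binary; it does not confirm that the Docker daemon is running, and the variable name docker_status is purely illustrative.

```shell
# Check that the docker CLI is available before invoking train-local.
if command -v docker >/dev/null 2>&1; then
    docker_status="found"
else
    docker_status="missing"
fi

echo "docker: $docker_status"
```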

Synopsis

$ abeja training train-local [--help]
Usage: abeja training train-local [OPTIONS]

  Local train commands

Options:
  -o, --organization_id, --organization-id TEXT
                                  Organization ID, organization_id of current
                                  credential organization is used by default
                                  [required]
  --name TEXT                     Training Job Definition Name  [required]
  --version TEXT                  Training Job Definition Version  [required]
  -d, --description TEXT          Training Job description
  --datasets DATASETPARAMSTRING   Datasets name
  -e, --environment ENVIRONMENTSTRING
                                  Environment variables
  -v, --volume VOLUMEPARAMSTRING  Volume driver options, ex) /path/source/on/h
                                  ost:/path/destination/on/container
  --v1                            Specify if you use old custom runtime image
  --runtime TEXT                  Runtime, equivalent to docker run
                                  `--runtime` option
  --config PATH                   Read Configuration from PATH. By default
                                  read from `training.yaml`
  --help                          Show this message and exit.

Options

-o, --organization_id, --organization-id

Specify the organization ID. The organization ID is set in the environment variable ABEJA_ORGANIZATION_ID and can be referenced from the training code being executed.

--name

Specify the training job definition name.

--version

Specify the training job definition version.

-d, --description

Description of the training job.

--datasets

Specify the dataset to use in the format {dataset_name}:{dataset_id}. The registered dataset can be referenced from the context passed as an argument to the training code.
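To illustrate the {dataset_name}:{dataset_id} format, the sketch below builds a --datasets value and prints the resulting command instead of running it. The dataset name train and the ID 1111111111111 are hypothetical placeholders; the other values are the samples used in the Example section.

```shell
dataset_name="train"          # hypothetical dataset name
dataset_id="1111111111111"    # hypothetical dataset ID
datasets="${dataset_name}:${dataset_id}"

# Dry run: print the command for review; remove `echo` to execute it.
echo abeja training train-local \
    --organization_id 1234567890123 \
    --name cats_dogs \
    --version 1 \
    --datasets "$datasets"
```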

(Version 0.14.0 or later) When --environment is specified, parameters given by --params are ignored.
The --params option is deprecated; use --environment instead.

-e, --environment

Specify an environment variable, e.g. IMAGE_WIDTH:100. Registered environment variables can be referenced from the code.
For more information on user-specifiable environment variables, see here.

(Version 0.14.0 or later) When training.yaml is used, the values defined as environment (params) in training.yaml are set by default.
This option can override the environment (params) values in training.yaml.
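A minimal sketch of passing one environment variable on the command line, using the IMAGE_WIDTH:100 sample from the description. The command is printed rather than executed so it can be reviewed first; env_spec is an illustrative variable name.

```shell
env_spec="IMAGE_WIDTH:100"    # sample value from the description above

# Dry run: print the command for review; remove `echo` to execute it.
echo abeja training train-local \
    --organization_id 1234567890123 \
    --name cats_dogs \
    --version 1 \
    --environment "$env_spec"
```

If training.yaml also defines IMAGE_WIDTH, the value given here would take precedence, as described above.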

-v, --volume

This corresponds to the --volume option of the docker run command. Specify the host-side path to mount and the container-side mount path in the format --volume /path/source/on/host:/path/destination/on/container. More than one volume can be specified. The --read-only option of the docker run command is not supported.
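The host:container format can be assembled as in the sketch below, which mounts a directory under the current working directory. Both the host directory (data) and the container path (/mnt/data) are hypothetical; the command is printed rather than executed.

```shell
host_dir="$PWD/data"          # hypothetical directory on the host
container_dir="/mnt/data"     # hypothetical mount point inside the container
volume="${host_dir}:${container_dir}"

# Dry run: print the command for review; remove `echo` to execute it.
echo abeja training train-local \
    --organization_id 1234567890123 \
    --name cats_dogs \
    --version 1 \
    --volume "$volume"
```

Building the host side from $PWD keeps it an absolute path, matching the /path/source/on/host form shown in the format above.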

--runtime

This corresponds to the --runtime option of the docker run command. The train-local command runs the training job as a container, and the --runtime option specifies the runtime used to start that container.

For example, if nvidia-docker2 is installed, you can run training on a GPU by specifying --runtime nvidia.

--v1

Specify this option when using 18.10 custom images.

--config

Specify a configuration file. By default, it references training.yaml in the current directory.

The -d option is available as an abbreviation for the --description option in version 1.1.0 or later.

Example

Run training locally

Premise: A training job definition named “cats_dogs” and its training job definition version 1 are registered in the organization 1234567890123.

Command:

$ abeja training train-local \
    --organization_id 1234567890123 \
    --name cats_dogs \
    --version 1

Output:

[info] preparing ...
[info] start training job
{"log_id": "2e56cf01-8dee-444b-84f0-719bc5543c80", "log_level": "INFO", "timestamp": "2019-07-09T04:17:55.796421+00:00", "source": "model:run.download_training_source_code.80", "requester_id": "-", "message": "downloading training source code", "exc_info": null}
{"log_id": "7eefd9b3-56d9-4925-b5c2-aab77d77ff06", "log_level": "INFO", "timestamp": "2019-07-09T04:17:57.518125+00:00", "source": "model:run.download_training_source_code.89", "requester_id": "-", "message": "successfully downloaded training source code", "exc_info": null}
INFO: start installing packages from requirements.txt
INFO: requirements.txt not found, skipping
...

Handler environment variables

For the environment variables available to the handler function executed by train-local, refer to Training Handler Function.