download

Description

Download files from a datalake channel and save them to the specified local directory.

Synopsis

$ abeja datalake download [--help]
Usage: abeja datalake download [OPTIONS]

  Download files

Options:
  -c, --channel_id TEXT   Channel identifier  [required]
  -o, --output_path TEXT  Output directory path  [required]
  -f, --file_id TEXT      File identifier NOTE: This argument is mutually
                          exclusive with  arguments: [start, end].
  -s, --start DATESTRING  Start date NOTE: This argument is mutually exclusive
                          with  arguments: [file_id].
  -e, --end DATESTRING    End date NOTE: This argument is mutually exclusive
                          with  arguments: [file_id].
  --dry-run, --dry_run    Dry run, only shows upload candidate files
  --file-name [id|name]   Defines the output file's name type; [id|name].
  --skip-duplicate-files  Don't download file if the file whose name is same
                          already exists in output directory path.
  --help                  Show this message and exit.

Options

-c, --channel_id

Specify the channel_id of the datalake channel to download from.

-o, --output_path

Specify the directory, as a relative or absolute path, in which to store the downloaded files.

-f, --file_id

Specify the file_id of the file to download. This option can be given multiple times to download several files.

-s, --start

Download only the files uploaded after the specified date. The date format is "YYYYMMDD" (example: "20170329"). This option must be used together with --end and cannot be combined with --file_id.

-e, --end

Download only the files uploaded before the specified date. The date format is "YYYYMMDD" (example: "20170329"). This option must be used together with --start and cannot be combined with --file_id.
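The date options can be guarded with a small shell wrapper before the command runs. The following is a minimal sketch, assuming a POSIX shell; the helper names is_yyyymmdd and download_range are hypothetical and not part of the abeja CLI. The check covers only the eight-digit shape of the value, not whether the date exists on the calendar.

```shell
# Hypothetical helper: succeeds if $1 consists of exactly eight digits (YYYYMMDD shape).
is_yyyymmdd() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: download_range CHANNEL_ID START END OUTPUT_DIR
# Refuses to call the CLI unless both dates have the expected shape.
download_range() {
  if ! is_yyyymmdd "$2" || ! is_yyyymmdd "$3"; then
    echo "start and end must be in YYYYMMDD format" >&2
    return 2
  fi
  abeja datalake download --channel_id "$1" --start "$2" --end "$3" --output_path "$4"
}

# Usage: download_range 1234567890123 20171203 20171204 ./download_dir
```

Both dates are passed together because --start and --end must be used as a pair.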

--dry-run, --dry_run

Shows the list of FILE_IDs that would be downloaded. No files are actually downloaded when this option is used.
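A dry run is a convenient way to check a date range before transferring anything. The following is a minimal sketch, assuming a POSIX shell; the function name preview_download is hypothetical, and the channel ID and dates in the usage line are placeholder values.

```shell
# Hypothetical wrapper: preview_download CHANNEL_ID START END OUTPUT_DIR
# Passes --dry-run so that the candidate files are listed and nothing is downloaded.
preview_download() {
  abeja datalake download --channel_id "$1" --start "$2" --end "$3" \
                          --output_path "$4" --dry-run
}

# Usage: preview_download 1234567890123 20171203 20171204 ./download_dir
```

The output path is still passed because the help output marks --output_path as required.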

--file-name [name|id]

Specifies how downloaded files are named: by file name (--file-name=name) or by file ID (--file-name=id). If this option is not specified, files are saved under their file name.

--skip-duplicate-files

Skips a file if a file with the same name already exists in the output directory.
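The naming and duplicate options can be combined when the same range is downloaded more than once, for example after an interrupted run. The following is a minimal sketch, assuming a POSIX shell; the function name resume_download is hypothetical, and choosing --file-name=id here is an illustrative choice rather than a requirement of the CLI.

```shell
# Hypothetical wrapper: resume_download CHANNEL_ID START END OUTPUT_DIR
# Saves files under their file ID and skips any file whose name already
# exists in the output directory.
resume_download() {
  abeja datalake download --channel_id "$1" --start "$2" --end "$3" \
                          --output_path "$4" \
                          --file-name=id --skip-duplicate-files
}

# Usage: resume_download 1234567890123 20171203 20171204 ./download_dir
```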

Example

Download by specifying FILE_ID

Command:

$ abeja datalake download --channel_id 1234567890123 \
                          --output_path ./download_dir \
                          --file_id 20171204T025103-86c30036-78c9-4c97-ba34-05e42cacf382 \
                          --file_id 20171204T025103-3b71676e-8f66-4228-b390-9edd20b4149e

Download by date range

Command:

$ abeja datalake download --channel_id 1234567890123 --start 20171203 --end 20171204 --output_path ./download_dir