Using the Raster Loader CLI

Most functions of the Raster Loader are accessible through the carto command-line interface (CLI). To start the CLI, use the carto command in a terminal.

Currently, Raster Loader allows you to upload a local raster file to a BigQuery or Snowflake table. You can also download and inspect a raster file from a BigQuery or Snowflake table.

Using the Raster Loader with BigQuery

Before you can upload a raster file, you need to have set up the following in BigQuery:

  1. A GCP project

  2. A BigQuery dataset

To use the BigQuery utilities, use the carto bigquery command. This command has several subcommands, which are described below.

Note

Accessing BigQuery with Raster Loader requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set to the path of a JSON file containing your BigQuery credentials. See the GCP documentation for more information.
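
One way to set this variable for the current terminal session is sketched below. The key file path is a placeholder rather than a real location, and the existence check is an optional convenience added for this example.

```shell
# Placeholder path to a service-account key file; replace it with your own.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/my/credentials.json"

# Warn early if the file is missing, instead of failing later during an upload.
if [ ! -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
  echo "warning: credentials file not found: $GOOGLE_APPLICATION_CREDENTIALS" >&2
fi
```

The variable only lasts for the current shell session unless you add the export line to your shell profile.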

Using the Raster Loader with Snowflake

Before you can upload a raster file, you need to have set up the following in Snowflake:

  1. A Snowflake account

  2. A Snowflake database

  3. A Snowflake schema

To use the Snowflake utilities, use the carto snowflake command. This command has several subcommands, which are described below.

Uploading a raster layer

To upload a raster file, use the carto [bigquery|snowflake] upload command.

The input raster must be a GoogleMapsCompatible raster. You can make your raster compatible by converting it with the following GDAL command:

gdalwarp -of COG -co TILING_SCHEME=GoogleMapsCompatible -co COMPRESS=DEFLATE -co OVERVIEWS=NONE -co ADD_ALPHA=NO -co RESAMPLING=NEAREST -co BLOCKSIZE=512 <input_raster>.tif <output_raster>.tif
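
If you convert rasters often, the conversion can be wrapped in a small shell function. The following is a minimal sketch: to_google_maps_compatible and the DRY_RUN variable are names invented for this example, while the gdalwarp options are exactly the ones shown above.

```shell
# Hypothetical helper: convert one raster using the gdalwarp options above.
# Usage: to_google_maps_compatible <input_raster>.tif <output_raster>.tif
# Set DRY_RUN=1 to print the command instead of running it.
to_google_maps_compatible() {
  # Build the full command line as the positional parameters.
  set -- gdalwarp -of COG \
    -co TILING_SCHEME=GoogleMapsCompatible \
    -co COMPRESS=DEFLATE \
    -co OVERVIEWS=NONE \
    -co ADD_ALPHA=NO \
    -co RESAMPLING=NEAREST \
    -co BLOCKSIZE=512 \
    "$1" "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Print the command that would be run, without requiring GDAL to be installed.
DRY_RUN=1 to_google_maps_compatible input.tif output.tif
```

Dropping DRY_RUN=1 runs the conversion itself, which requires gdalwarp to be available on your PATH.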

You can optionally create a table in your provider beforehand and upload your data to it. If you do not specify a table name, Raster Loader generates one and creates the table for you.

At a minimum, the carto [bigquery|snowflake] upload command requires a file_path to a local raster file that can be read by GDAL and processed with rasterio. It also requires the project (the GCP project name) and dataset (the BigQuery dataset name) parameters in the case of BigQuery, or the database and schema parameters in the case of Snowflake.

There are also additional parameters, such as table (table name) and overwrite (to overwrite existing data). For example:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --overwrite

This command uploads the TIFF file from /path/to/my/raster/file.tif to a BigQuery project named my-gcp-project, a dataset named my-bigquery-dataset, and a table named my-bigquery-table. If the table already contains data, this data will be overwritten because the --overwrite flag is set.

The same operation, performed with Snowflake, would be:

carto snowflake upload \
  --file_path /path/to/my/raster/file.tif \
  --database my-snowflake-database \
  --schema my-snowflake-schema \
  --table my-snowflake-table \
  --account my-snowflake-account \
  --username my-snowflake-user \
  --password my-snowflake-password \
  --overwrite

Authentication parameters are explicitly required in this case for Snowflake, since they are not set up in the environment.
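
To avoid typing credentials directly into the command, you could keep them in shell variables and let the shell expand them into the documented flags. This is only a sketch: snowflake_upload, SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, and SNOWFLAKE_PASSWORD are names chosen for this example, and nothing here assumes that the CLI reads those variables itself.

```shell
# Hypothetical wrapper around the documented upload command.
# Usage: snowflake_upload /path/to/my/raster/file.tif
# The SNOWFLAKE_* variables are ordinary shell variables invented for this
# sketch; the shell aborts with a message if any of them is unset or empty.
snowflake_upload() {
  carto snowflake upload \
    --file_path "$1" \
    --database my-snowflake-database \
    --schema my-snowflake-schema \
    --table my-snowflake-table \
    --account "${SNOWFLAKE_ACCOUNT:?set SNOWFLAKE_ACCOUNT first}" \
    --username "${SNOWFLAKE_USER:?set SNOWFLAKE_USER first}" \
    --password "${SNOWFLAKE_PASSWORD:?set SNOWFLAKE_PASSWORD first}"
}
```

With the three variables set in your session, calling snowflake_upload /path/to/my/raster/file.tif runs the same upload as above without the password appearing in the command you type.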

If no band is specified, the first band of the raster will be uploaded. If the --band flag is set, the specified band will be uploaded. For example, the following command uploads the second band of the raster:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --band 2

Band names can be specified with the --band_name flag, which sets the name of the column used to store the band. For example, the following command uploads the second band of the raster and stores it under the name red:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --band 2 \
  --band_name red

If the raster contains multiple bands, you can upload multiple bands at once by specifying a list of bands. For example, the following command uploads the first and second bands of the raster:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --band 1 \
  --band 2

Or, with band names:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --band 1 \
  --band 2 \
  --band_name red \
  --band_name green
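
Because band names are matched to bands by position, it can help to generate both sets of flags from a single list of pairs. The sketch below does that with a helper named band_args, which is invented for this example. It relies on shell word splitting, so it assumes band names contain no spaces.

```shell
# Hypothetical helper: turn "band:name" pairs into matching
# --band and --band_name flags, keeping both lists in the same order.
band_args() {
  bands=""
  names=""
  for pair in "$@"; do
    bands="$bands --band ${pair%%:*}"      # text before the first colon
    names="$names --band_name ${pair#*:}"  # text after the first colon
  done
  echo "${bands# }$names"
}

# Print the resulting upload command instead of running it.
echo carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  $(band_args 1:red 2:green)
```

Removing the leading echo runs the upload with the generated flags.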

See also

See the CLI details for a full list of options.

For large raster files, you can use the --chunk_size flag to specify the number of rows to upload at once. This helps avoid BigQuery exceptions like the following, which are caused by too many operations on the destination table:

Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors

The default chunk size is 1000 rows.

For example, the following command uploads the raster in chunks of 2000 rows:

carto bigquery upload \
  --file_path /path/to/my/raster/file.tif \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table \
  --chunk_size 2000

Inspecting a raster file

You can also use Raster Loader to retrieve information about a raster file stored in a BigQuery or Snowflake table. This can be useful to make sure a raster file was transferred correctly or to get information about a raster file’s metadata, for example.

To access a raster file in a BigQuery table, use the carto bigquery describe command.

At a minimum, this command requires a GCP project name, a BigQuery dataset name, and a BigQuery table name. For example:

carto bigquery describe \
  --project my-gcp-project \
  --dataset my-bigquery-dataset \
  --table my-bigquery-table
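
To check a transfer right after it finishes, the upload and describe commands can be chained. The following is a minimal sketch: upload_and_check is a name invented for this example, the limit of 5 rows is arbitrary, and the chaining assumes the CLI exits with a non-zero status when an upload fails.

```shell
# Hypothetical helper: upload a raster, then describe the resulting table.
# Usage: upload_and_check <file_path> <project> <dataset> <table>
# The describe step only runs if the upload command succeeds.
upload_and_check() {
  carto bigquery upload \
    --file_path "$1" \
    --project "$2" \
    --dataset "$3" \
    --table "$4" &&
  carto bigquery describe \
    --project "$2" \
    --dataset "$3" \
    --table "$4" \
    --limit 5
}
```

For example, upload_and_check /path/to/my/raster/file.tif my-gcp-project my-bigquery-dataset my-bigquery-table uploads the file and then describes the table.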

See also

See the CLI details for a full list of options.

CLI details

The following is a detailed overview of all of the CLI’s subcommands and options:

carto

The carto command line interface.

carto [OPTIONS] COMMAND [ARGS]...

bigquery

Manage Google BigQuery resources.

carto bigquery [OPTIONS] COMMAND [ARGS]...

describe

Load and describe a table from BigQuery.

carto bigquery describe [OPTIONS]

Options

--project <project>

Required The name of the Google Cloud project.

--dataset <dataset>

Required The name of the dataset.

--table <table>

Required The name of the table.

--limit <limit>

Limit the number of rows returned.

--token <token>

An access token to authenticate with.

upload

Upload a raster file to Google BigQuery.

carto bigquery upload [OPTIONS]

Options

--file_path <file_path>

The path to the raster file.

--file_url <file_url>

The URL of the raster file.

--project <project>

Required The name of the Google Cloud project.

--token <token>

An access token to authenticate with.

--dataset <dataset>

Required The name of the dataset.

--table <table>

The name of the table.

--band <band>

Band(s) within the raster to upload. Repeat --band to specify multiple bands.

--band_name <band_name>

Column name(s) used to store the band(s) (default: band_<band_num>). Repeat --band_name to specify multiple column names. The list of column names must match the --band list, in the same order.

--chunk_size <chunk_size>

The number of blocks to upload in each chunk.

--overwrite

Overwrite existing data in the table if it already exists.

--append

Append records into a table if it already exists.

--cleanup-on-failure

Clean up resources if the upload fails. Useful for non-interactive scripts.

info

Display system information.

carto info [OPTIONS]

snowflake

Manage Snowflake resources.

carto snowflake [OPTIONS] COMMAND [ARGS]...

describe

Load and describe a table from Snowflake.

carto snowflake describe [OPTIONS]

Options

--account <account>

Required The Snowflake account.

--username <username>

The username.

--password <password>

The password.

--token <token>

An access token to authenticate with.

--role <role>

The role to use for the file upload.

--database <database>

Required The name of the database.

--schema <schema>

Required The name of the schema.

--table <table>

Required The name of the table.

--limit <limit>

Limit the number of rows returned.

upload

Upload a raster file to Snowflake.

carto snowflake upload [OPTIONS]

Options

--account <account>

Required The Snowflake account.

--username <username>

The username.

--password <password>

The password.

--token <token>

An access token to authenticate with.

--role <role>

The role to use for the file upload.

--file_path <file_path>

The path to the raster file.

--file_url <file_url>

The URL of the raster file.

--database <database>

Required The name of the database.

--schema <schema>

Required The name of the schema.

--table <table>

The name of the table.

--band <band>

Band(s) within the raster to upload. Repeat --band to specify multiple bands.

--band_name <band_name>

Column name(s) used to store the band(s) (default: band_<band_num>). Repeat --band_name to specify multiple column names. The list of column names must match the --band list, in the same order.

--chunk_size <chunk_size>

The number of blocks to upload in each chunk.

--overwrite

Overwrite existing data in the table if it already exists.

--append

Append records into a table if it already exists.

--cleanup-on-failure

Clean up resources if the upload fails. Useful for non-interactive scripts.