Connecting with rclone

This page is for users.

Prerequisites

Before you start, find the following values under Bucket > External Connection Info (i) and Key Management:

  • Bucket Name: the unique name of the provisioned object storage
  • Access Key: the issued user Access Key
  • Secret Key: the issued user Secret Key

Installing rclone

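# Debian/Ubuntu: install rclone from the distribution package
sudo apt install rclone
# Start the interactive configuration (walked through in the next section)
rclone config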

Configure an S3 remote in rclone

The example below creates an rclone remote for connecting to a Data Hub bucket (S3-compatible).

# Start configuration
$ rclone config

# Create a new remote
n) New remote
name> datahub

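# Menu numbers vary by rclone version; typing the value (s3, Ceph) works in place of the number.
Storage> 4 # S3 (Amazon S3 Compliant Storage), or type: s3
provider> 4 # Ceph, or type: Ceph
env_auth> # Press Enter (false)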

access_key_id> YOUR_ACCESS_KEY
secret_access_key> YOUR_SECRET_KEY

region> # Press Enter (leave blank)

# Endpoint (required)
endpoint> https://datahub-central-01.elice.io

location_constraint> # Press Enter
acl> # Press Enter
server_side_encryption> # Press Enter
sse_kms_key_id> # Press Enter

Edit advanced config? (y/n) n
Keep this "datahub" remote? (y/n) y
q) Quit config
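
For scripted or repeatable setups, the same remote can be written as a config-file section instead of answering the prompts. The sketch below prints a section equivalent to the answers above. `datahub_conf` is a hypothetical helper name, not part of rclone, and the commented usage assumes rclone's default config location, which `rclone config file` can confirm on your system.

```shell
# Hypothetical helper: print an rclone config section for the "datahub" remote.
# $1 = Access Key, $2 = Secret Key
datahub_conf() {
  cat <<EOF
[datahub]
type = s3
provider = Ceph
access_key_id = $1
secret_access_key = $2
endpoint = https://datahub-central-01.elice.io
EOF
}

# Usage (assumes the default config path; check it with: rclone config file)
# mkdir -p ~/.config/rclone
# datahub_conf "YOUR_ACCESS_KEY" "YOUR_SECRET_KEY" >> ~/.config/rclone/rclone.conf
```

The keys are stored in plain text, so restrict the file's permissions (for example `chmod 600`) on shared machines.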

Verify the Connection

If you named the remote datahub, you can verify the connection as follows. In container-based instance environments, listing top-level entries (rclone lsd datahub:) may be restricted. In that case, specify the bucket name explicitly:

# List folders inside a bucket (recommended)
rclone lsd datahub:BUCKET_NAME

# List files inside a bucket
rclone ls datahub:BUCKET_NAME
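
In scripts, it can help to check the connection before starting a transfer. The following is a minimal sketch, assuming the remote is named datahub. `check_bucket` and the `RCLONE` override variable are hypothetical names introduced here, not rclone features. It relies only on rclone exiting with a non-zero status when the listing fails.

```shell
# Path to the rclone binary; overridable for testing (hypothetical variable).
RCLONE="${RCLONE:-rclone}"

# Hypothetical helper: succeed only if the bucket can be listed.
# $1 = bucket name
check_bucket() {
  if "$RCLONE" lsd "datahub:$1" >/dev/null 2>&1; then
    echo "ok: datahub:$1 is reachable"
  else
    echo "failed: could not list datahub:$1" >&2
    return 1
  fi
}

# Usage:
# check_bucket BUCKET_NAME && rclone copy /source/path datahub:BUCKET_NAME --progress
```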

Upload / Download

Upload (local/instance → bucket)

# Upload a file or directory
rclone copy /source/path datahub:BUCKET_NAME --progress

Parallel transfers for large or many-file uploads

rclone copy /source/path datahub:BUCKET_NAME \
--transfers=8 --s3-no-check-bucket --inplace --progress

Download (bucket → local/instance)

rclone copy datahub:BUCKET_NAME /dest/path --progress

Sync

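sync makes the destination identical to the source: new and changed files are copied, and files that don't exist in the source are deleted from the destination. Because it deletes data, use it with care and double-check which path is the source and which is the destination.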

# Sync the local 'project-data' directory to the bucket's 'backup' directory
rclone sync ~/project-data datahub:BUCKET_NAME/backup --s3-no-check-bucket --progress

⚠️ Warning: Before running sync, simulate the changes with --dry-run and review them.

rclone sync ... --dry-run
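
One way to make the dry run hard to skip is to wrap both steps in a small function. This is a sketch under stated assumptions: `safe_sync`, the `CONFIRM` variable, and the `RCLONE` override are hypothetical names chosen for illustration, and the flags are the ones used in the sync example above.

```shell
# Path to the rclone binary; overridable for testing (hypothetical variable).
RCLONE="${RCLONE:-rclone}"

# Hypothetical helper: always run a dry run first, and only apply the
# sync when CONFIRM=yes is set.
# $1 = source, $2 = destination
safe_sync() {
  # Show what would be copied and deleted; stop if the dry run itself fails.
  "$RCLONE" sync "$1" "$2" --s3-no-check-bucket --dry-run || return 1
  if [ "${CONFIRM:-no}" = "yes" ]; then
    "$RCLONE" sync "$1" "$2" --s3-no-check-bucket --progress
  else
    echo "Dry run only. Re-run with CONFIRM=yes to apply."
  fi
}

# Usage:
# safe_sync ~/project-data datahub:BUCKET_NAME/backup               # preview
# CONFIRM=yes safe_sync ~/project-data datahub:BUCKET_NAME/backup   # apply
```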

Common Commands

# Upload a single file
rclone copyto ./file.bin datahub:BUCKET_NAME/file.bin --progress

# Upload a specific folder
rclone copy ./dir datahub:BUCKET_NAME/dir --progress

# Exclude specific files (e.g., temp files)
rclone copy /source/path datahub:BUCKET_NAME --exclude "*.tmp" --progress

# Upload only specific files (e.g., by extension)
rclone copy /source/path datahub:BUCKET_NAME --include "*.parquet" --progress

# Check bucket size
rclone size datahub:BUCKET_NAME

Mount

You can mount a bucket as a local directory and access it like a filesystem. First create an empty directory to use as the mount point, then run rclone mount in the background (--daemon).

Mount

mkdir -p /path/to/mount/point
rclone mount datahub:BUCKET_NAME /path/to/mount/point --daemon
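
Because the mount point should be an empty directory, a script can check that before mounting. The sketch below is illustrative only: `mount_bucket` and the `RCLONE` override variable are hypothetical names, and the mount command itself is the one shown above.

```shell
# Path to the rclone binary; overridable for testing (hypothetical variable).
RCLONE="${RCLONE:-rclone}"

# Hypothetical helper: create the mount point, refuse to mount over a
# non-empty directory, then mount the bucket in the background.
# $1 = bucket name, $2 = mount point
mount_bucket() {
  mkdir -p "$2" || return 1
  if [ -n "$(ls -A "$2")" ]; then
    echo "mount point is not empty: $2" >&2
    return 1
  fi
  "$RCLONE" mount "datahub:$1" "$2" --daemon
}

# Usage:
# mount_bucket BUCKET_NAME /path/to/mount/point
```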

Unmount

fusermount -u /path/to/mount/point

Key Considerations

  • Performance: mounted filesystems can be slower than local disks due to network latency.
  • Capacity reporting: df -h does not show the real quota and returns a synthetic large value (e.g., 1.0P). Use rclone size or the ECI portal to check actual usage.
  • Large files (> 48 GiB): to handle single files larger than 48 GiB, raise the --s3-chunk-size option above its default (5 MiB in current rclone releases).
    # Example: set chunk size to 128M to support larger files
    rclone mount ... --s3-chunk-size=128M

    Trade-off: A larger chunk size increases memory usage and the cost of retransmission on failure, so tune it to your system's resources.
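
The 48 GiB figure follows from rclone's defaults: a 5 MiB chunk size multiplied by a maximum of 10,000 parts per multipart upload. The sketch below works through that arithmetic so a chunk size can be picked for a given file size. The function names are hypothetical, and the 10,000-part limit is rclone's default for S3; it is an assumption here that the Data Hub endpoint uses the same limit.

```shell
# Largest single file (in whole GiB, rounded down) for a given chunk size.
# $1 = chunk size in MiB, $2 = max parts (assumed default: 10000)
max_stream_gib() {
  echo $(( $1 * ${2:-10000} / 1024 ))
}

# Smallest chunk size (in MiB, rounded up) that fits a file of a given size.
# $1 = file size in GiB, $2 = max parts (assumed default: 10000)
min_chunk_mib() {
  parts=${2:-10000}
  echo $(( ($1 * 1024 + parts - 1) / parts ))
}

max_stream_gib 5     # prints 48   (the default limit)
max_stream_gib 128   # prints 1250 (the 128M example above)
min_chunk_mib 100    # prints 11   (a 100 GiB file needs at least 11 MiB chunks)
```

Picking the smallest chunk size that covers your largest file keeps the memory and retransmission cost mentioned above as low as possible.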