Managing files with Rclone

What is rclone?

Rclone ("rsync for cloud storage") is a command-line program to sync files and directories to and from different cloud storage providers. According to its official website, there are over 40 cloud storage providers support rclone. Integrating Estuary into rclone will give developers and users the ability to easily manage their data in either centralized or decentralized manner among various storage providers.

In this tutorial, we will walk you through the process of using rclone to manage your files among different storages, including local storage, Estuary and Google Drive. It will cover the following:

  • Run a Estuary local node
  • Install and setup rclone
  • Mange files between local computer and Estuary
  • Mange files between Google Drive and Estuary

Prerequirements:

  • go (1.15 or higher)
  • hwloc
  • opencl
  • git

Step1: run Estuary local node

For this tutorial, we will run an Estuary local node to store files to IPFS and Filecoin network. You can also try to use Estuary public endpoint https://api.estuary.tech instead, then you will need to request an Estuary invite and get a API token to access Estuary service.

More detailed instruction about how to run Estuary node locally, please refer to Getting Started Developing With Estuary.

  • Download Estuary and build it locally
git clone https://github.com/application-research/estuary.git
cd estuary
make clean all

If the build success, you can start setting up Estuary node and run it.

  • Start Estuary node

It is very important to save your auth token after it's generated by setting up Estuary node, you will always need it to access your files via your Estuary local node.

./estuary setup
<auth Token> #Record this token since you will need it later

./estuary --datadir=/path/to/storage --logging
Wallet address is:  <your_estuary_address_printed_here>
2022-02-25T15:37:36.633+0800    INFO    dt-impl impl/impl.go:145    start data-transfer module
/ip4/192.168.1.5/tcp/6744/p2p/12D3KooWG5nrNicWGCHcrBYrLuhds6MyzP7EngwyjuhHK9PTKu7f
/ip4/127.0.0.1/tcp/6744/p2p/12D3KooWG5nrNicWGCHcrBYrLuhds6MyzP7EngwyjuhHK9PTKu7f
2022-02-25T15:37:36.635+0800    INFO    estuary estuary/replication.go:718  queueing all content for checking: 0

   ____    __
  / __/___/ /  ___
 / _// __/ _ \/ _ \
/___/\__/_//_/\___/ v4.6.1
High performance, minimalist Go web framework
https://echo.labstack.com
____________________________________O/_______
                                    O\
⇨ http server started on [::]:3004

You can use the following command to check if Estuary node started successfully:

curl -X GET http://localhost:3004/public/stats

#If estuary node is running, then you will see the following data.
{"totalStorage":0,"totalFiles":0,"dealsOnChain":0}

Step2: build and config rclone

NOTE: In order to use the right version of rclone which supports Estuary, you will need to build and install rclone from the source code.

  • Install rclone from source.
#Download source from ARG Github and switch to estuary branch.
git clone https://github.com/application-research/rclone.git
cd rclone && git checkout estuary
go build

#Check if rclone is install success
./rclone version
rclone v1.58.0-DEV
- os/version: darwin 12.2.1 (64 bit)
...

In this tutorial, we will use some of the rclone commands. Let's get familiar with them first:

  • rclone mkdir remote:path - make the path if it doesn't already exist
  • rclone lsd remote:path - list all the directories in remote:path
  • rclone ls remote:path - list all the objects in the path/directory
  • rclone copy source:path dest:path - copy files from source to dest
  • rclone sync source:path dest:path - sync files from source to dest to ensure source and dest identical

To learn more about rclone commands, you can check Rclone Commands Instruction or also check from rclone help menu.

./rclone help
  • Config Estuary

In order to let rclone on your local machine to access storage providers like Estuary and Google Drive, you will need to first config rclone to be able to access them from command-line.

In the terminal, run the following command to start configuring Estuary for rclone.

./rclone config

#This will guild you through an interactie setup process
e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> n
name> estuaryLocal #Name of remote access
Storage> estuary #Type estuary or use the number from the list.
Estuary API token:
token> <Auth Token generated in the first step>
Enter a string value. Press Enter for the default (https://api.estuary.tech).
url> http://localhost:3004

After you finish configure Estuary, you will be able to verify it in the rclone configuration.

./rclone config
Current remotes:
Name                 Type
====                 ====
estuaryLocal         estuary

Step3: sync files between Estuary and local storage

Once you finish configuring rclone to access the Estuary, let's try to manage your files between Estuary and your local storage. There are various ways to move your files from local storage to Estuary.

  • Copy or move files one by one
  • Sync the whole folder from local storage to Estuary
  • Bi-sync folder between local storage and Estuary

We will take couple methods as examples here to demonstrate how to copy files from your local storage to Estuary. Since collections are used by Estuary to organize files and folders, keep in mind that you can only list files belongs to a collection using rclone.

  • Create a new folder which will be treated as collection in Estuary
./rclone mkdir estuaryLocal:testLocal -v

#List the directories under esturayLocal
./rclone lsd estuaryLocal:
 -1 2022-02-25 17:04:43        -1 testLocal
  • Copy one file to Estuary
./rclone copy /path-to-your-file/estuary-file2.txt estuaryLocal:testLocal

#Check the file is copied to Estuary
./rclone ls estuaryLocal:testLocal
     1682 estuary-file2.txt
  • Sync the whole local folder to Estuary
./rclone sync /path-to-your-folder/ estuary:/longfei

#List all the files in your remote Estuary directory
./rclone ls estuaryLocal:testLocal
     1682 estuary-file2.txt
     1674 test1.txt
     1674 test2.txt
     1674 test3.txt

Step4: sync files between Estuary and Google Drive

In this step, we will take Google Drive as an example to demonstrate how to sync files among different storage providers. But rclone supports around 40+ storage providers other than Google drives, you can also try other examples according to your preference.

  • Config Google Drive

Moving files between Estuary and Google Drive is as simple as moving files from local to Estuary. But first, you will need to configure rclone to access your Google Drive. You can config Google Drive following this tutorial.

Once you finish configuring Google Drive, you will be able to see it in the list of remotes.

./rclone config
Current remotes:
Name                 Type
====                 ====
estuaryLocal         estuary
ggDrive              drive

Now you will be able to check folders and files in your Google Drive directly using rclone.

#This command will list all the folders
./rclone lsd ggDrive:
   -1 2022-02-25 11:30:51        -1 Filecoin-Slides
   -1 2022-02-25 11:31:01        -1 Filecoin-Workshops
   -1 2022-02-25 11:36:13        -1 estuaryTest

#This command will list all the files in a folder
./rclone ls ggDrive:estuaryTest
   158798 Estuary-addContent.png
   138349 social-profile.jpeg
      175 rclone-lf.txt
  • Create an Estuary collection for your Google Drive files

Run the following command to create a directory in your Estuary node, which will be created as a collection in Estuary.

./rclone mkdir estuaryLocal:ggDrive

#List all the folders(collections) in esturayLocal
./rclone lsd estuaryLocal:
  -1 2022-02-28 10:44:51        -1 ggDrive
  -1 2022-02-28 09:41:08        -1 testLocal
  • Sync the whole folder in Google Drive to Estuary

Let's try to sync the whole folder from Google Drive to Estuary which will copy all the files to your Estuary collection. I will take folder ggDrive:/estuaryTest as an example here.

./rclone sync ggDrive:estuaryTest estuaryLocal:ggDrive -v
Transferred:      290.354 KiB / 290.354 KiB, 100%, 92.451 KiB/s, ETA 0s
Transferred:            2 / 2, 100%
Elapsed time:         5.1s

#Verify that there are 2 files moved to Esturay
./rclone ls estuaryLocal:ggDrive
   158798 Estuary-addContent.png
      175 rclone-lf.txt
  • Move single file

You can also just copy a single files from Google Drive to Estuary using the following command.

./rclone copy ggDrive:Filecoin-Workshops/Build-on-Filecoin.pptx estuaryLocal:ggDrive

#Build-on-Filecoin.pptx is moved success
./rclone ls estuaryLocal:ggDrive
 13575031 Build-on-Filecoin.pptx
   158798 Estuary-addContent.png
      175 rclone-lf.txt

Step5: Verify the files on Estuary

Except checking the list of files in your Estuary node using rclone commands, you can also verify whether or not those files are actually moved to Estuary by directly queueing via Esturay API. First, let's check if the ggDrive collection is created in Estuary with the following command.

curl -X GET http://localhost:3004/collections/list -H "Authorization: Bearer REPLACE_ME_WITH_API_KEY" -H "Accept: application/json" -s | jq .

#Two collections were added to Estuary from previous steps
[
  {"createdAt": "2022-02-28T09:41:08.015981+08:00",
    "uuid": "uuid-for-testLocal",
    "name": "testLocal",
    "description": "",
    "userId": 1  },
  {"createdAt": "2022-02-28T10:44:51.111857+08:00",
    "uuid": "uuid-for-ggDrive",
    "name": "ggDrive",
    "description": "",
    "userId": 1 }
]

The ggDrive collection was created successfully and being listed with its details. The uuid will be used to query files under this collection and you can use the following command to get all the files in ggDrive collection.

curl -X GET http://localhost:3004/collections/content/uuid-for-ggDrive -H "Authorization: Bearer REPLACE_ME_WITH_API_KEY" -H "Accept: application/json" -s | jq .

[
  {
    "id": 21,
    "updatedAt": "2022-02-28T10:51:19.29752+08:00",
    "cid": "bafkreiazxt6iegs6nwkfa73jdywbdvb3dxb62j4ix3pstmetqqhztekd7y",
    "name": "rclone-lf.txt",
    ...},
  {
    "id": 22,
    "updatedAt": "2022-02-28T10:51:20.004562+08:00",
    "cid": "bafkreifkhy32wd3qowprao4632iqa55nifxmc44mgj4hs7xqxqutirjh4q",
    "name": "Estuary-addContent.png",
    ...},
  {
    "id": 24,
    "updatedAt": "2022-02-28T11:39:26.400316+08:00",
    "cid": "bafybeihxhjivebdivakj336dul5dvijkvzivbd7ornw52bes46ygcbrt7i",
    "name": "Build-on-Filecoin.pptx",
    ...}
]

Once Estuary stores your files on IPFS and Filecoin, corresponding CIDs will be generated for each file. You will be able to access those files on IPFS via IPFS-compatible browsers as well as IPFS gateway. Use the following command to get url and gateway for your files.

#Get the list of CID
./rclone backend lscid estuaryLocal:ggDrive
 13575031 Feb 28 11:39:26  bafkreifkhy32wd3qowprao4632iqa55nifxmc44mgj4hs7xqxqutirjh4q Estuary-addContent.png

#Get the list of URL for IPFS-compatible browser
./rclone backend lscid estuaryLocal:ggDrive -o format=url
 13575031 Feb 28 11:39:26  ipfs://bafkreifkhy32wd3qowprao4632iqa55nifxmc44mgj4hs7xqxqutirjh4q Estuary-addContent.png
 
#Get the list of IPFS gateway URL
 ./rclone backend lscid estuaryLocal:ggDrive -o format=gateway
  158798 Feb 28 10:51:20  https://dweb.link/ipfs/bafkreifkhy32wd3qowprao4632iqa55nifxmc44mgj4hs7xqxqutirjh4q Estuary-addContent.png

What is Next?

Rclone and Esutary are both very useful and powerful tools providing various features and potential to achieve convenient and fast cloud storage file management. Now you get a sense of how to use rclone to manage files between Estuary and other storage providers, feel free to explore more features such as storage disks mount, server backup and offline download, etc.