Storj Storage Node Docker-Compose file

Storj Storage Node Docker-Compose file

Get Social!

Storj V3 is now in BETA and recruiting Storage Node operators. Since V3 of Storj, Docker is used exclusively to wrap up creating a new Storage Node into a simple, manageable container.

You’ll need to see the official docs for creating your identity certificates, but when it comes to creating your docker environment it couldn’t be more simple than using docker-compose. If you haven’t got docker-compose installed then check out this blog post.

Create a new folder and a docker-compose.yml with the below content.

mkdir storj
vi storj/docker-compose.yml
version: '3'
services:
  storagenode:
    image: storjlabs/storagenode:beta
    restart: unless-stopped
    ports:
        - 28967:28967
    volumes:
        - ./config/identity:/app/identity
        - ./data:/app/config
    environment:
        - WALLET=0x123456789
        - EMAIL=EMAIL
        - ADDRESS=external.url:28967
        - BANDWIDTH=10TB
        - STORAGE=1TB
        - STORJ_LOG_LEVEL=info
  watchtower:
    image: containrrr/watchtower
    volumes:
        - /var/run/docker.sock:/var/run/docker.sock
    environment:
        - WATCHTOWER_CLEANUP=true

You’ll need to fill out the environment details to match your requirements, especially the WALLET and ADDRESS. You may want to redirect the volume elements to match your environment – the /app/config path should point to the disk that you’d like to use for storage (I know, the name is confusing) and the /app/identity path should point to your Storj identity certificates.

Run docker-compose up -d to fetch the images from the docker hub and create your Storage Node instance.


rclone Systemd startup mount script

Get Social!
rclone

Rclone is a command line utility used for reading and writing to almost any type of cloud or remote storage. From Google Drive to Ceph, rclone supports almost any cloud-based remote storage platform you can think of. You can perform upload, download or synchronisation operations between local storage and remote cloud storage, or between remote storage directly.

In addition to this, rclone has an experimental mount feature that lets a user mount a remote cloud storage provider, such as s3 or Google Drive, as a local filesystem. You can then use the mounted filesystem as if it were a local device, albeit with some performance considerations.

Before we get going, make sure you have rclone installed on your system and configured with a remote. 

curl https://rclone.org/install.sh | sudo bash
rclone config 

Once you have a remote defined, it’s time to create the mountpoint and systemd script. I’ll be using Google Drive for this example, but the mount command works for any supported remote.

Create the mount point directory to use for the remote storage:

mkdir /mnt/google-drive

Next, create the below systemd script and edit it as required:

vi /etc/systemd/system/rclone.service
# /etc/systemd/system/rclone.service
[Unit]
Description=Google Drive (rclone)
AssertPathIsDirectory=/mnt/google-drive
After=plexdrive.service

[Service]
Type=simple
ExecStart=/usr/bin/rclone mount \
        --config=/root/.config/rclone/rclone.conf \
        --allow-other \
        --cache-tmp-upload-path=/tmp/rclone/upload \
        --cache-chunk-path=/tmp/rclone/chunks \
        --cache-workers=8 \
        --cache-writes \
        --cache-dir=/tmp/rclone/vfs \
        --cache-db-path=/tmp/rclone/db \
        --no-modtime \
        --drive-use-trash \
        --stats=0 \
        --checkers=16 \
        --bwlimit=40M \
        --dir-cache-time=60m \
        --cache-info-age=60m gdrive:/ /mnt/google-drive
ExecStop=/bin/fusermount -u /mnt/google-drive
Restart=always
RestartSec=10

[Install]
WantedBy=default.target

The important parts are detailed below, however, there are various other options are detailed on the rclone mount documentation page.

  • –config – the path to the config file created by rclone config. This is usually located in the users home directory.
  • gdrive:/ /mnt/google-drive – details two things; firstly the config name created in rclone config, and secondly the mount point on the local filesystem to use.

Once all this is in place you’ll need to start the service and enable the service at system startup (if required)

systemctl start rclone
systemctl enable rclone

Where to Run Ceph Processes For High Availability

Tags :

Category : Knowledge

Get Social!

ceph-logoIn this blog post we’re going to take a detailed look at the Ceph server processes within a Ceph cluster of multiple server nodes.

Ceph is a highly available network storage layer that uses multiple disks, over multiple nodes, to provide a single storage platform for use over a network. Ceph is designed to be fault tolerant to ensure access to data is always available.

A Ceph cluster consists of 4 components:

  1. Monitor – is the daemon that holds the cluster map listing all the available nodes and Ceph daemons that are available for use by the cluster.
  2. OSD – is the daemon that handles the reading and writing of data to a physical disk. One OSD process will run per storage medium (usually a single disk device such as /dev/sdb) attached to the cluster. For example, a single server with 3 hard disks attached for use by the Ceph cluster will run 3 OSD processes.
  3. Metadata – is only required is Ceph will be used as a CephFS, that is a network filesystem. The Metadata, or MDS, daemon contains metadata on files and directories such as size and directory hierarchy so that Ceph can be used as a POSIX compliant filesystem. There can only be one active MDS daemon at any one time, but at least one secondary passive MDS daemon is advised to provide fault tolerance.
  4. Client – is used to read and write to the Ceph cluster. It allows the Linux kernel to mount a Ceph cluster as a filesystem, use Ceph as an object store or use Ceph as a block device.

Ceph Process Architecture

The Ceph storage daemons can run on dedicated hardware to provide a highly available file storage layer to remote clients over a network. This configuration is similar to a typical file server where the hardware is dedicated to only serving data. Extremely large or high performance storage requirements would usually be configured in this way to ensure there is sufficient resource (such as CPU, RAM and disk) to meet the requirements of the cluster. Ceph can also be used, in smaller and less demanding environments, to run alongside existing applications to maximise the use of existing hardware and reduce costs. This is one of the recommended setups for Proxmox VE, for example. The below example shows a 4 server cluster of Ceph daemons, suitable for either of the above two scenarios.

Ceph Cluster Process Overview

Server1 has 4 OSD processes running, which would indicate there are 4 physical disks used by Ceph on the server. The monitor daemon ensures that available nodes and daemons are tracked so that requests for I/ O can be served by active nodes. Finally, the optional MDS daemon is available and running for metadata requests from a Ceph filesystem. There can only be one active MDS daemon for a cluster, therefore server one hosts the primary MDS daemon.

Server2 is the same as Server1 however it’s MDS process is in standby mode. As only one MDS daemon can be answering requests at any given time, the MDS daemon runs in Active/ Passive mode. If the Active daemon fails, it takes around 30 seconds for the Passive daemon to become Active. This can present a small period of downtime for metadata requests. Future versions of Ceph will change this behaviour to create a more redundant process.

Server3 does not have any MDS component because there are already two nodes hosting an active and standby MDS process.

Server4 only hosts OSD processes. All other processes are already highly available and with a node of this size there is plenty of redundancy.

Scaling Ceph

 

Ceph is very modular by design, with each process having a specific task and talking to other processes over the network. Adding new servers is something that can be done easily, and without downtime to the existing storage pool. Further storage servers would look like either Server3 or Server4.

Generally additional servers are added so that more disks can be added to the cluster, and therefore it’s easy to understand why additional OSD processes would be added. In addition to adding additional servers (horizontal scaling), if existing servers have sufficient resource, further OSD processes could be added to the existing servers (vertical scaling). Adding additional OSD processes could be to either increase the overall storage pool size, increase the replication factor or to increase available bandwidth.

Mon processes, however, are not required on all servers and are only needed in multiples to prevent a failure causing downtime. Ceph requires a majority of monitors running to establish a quorum therefore monitors should always exist in odd numbers. To explain a little further; more monitor processes must be available then failed, for a cluster. So in a cluster with 10 monitor processes only 4 could fail before causing a problem, whereas with 11 monitor processes 5 could fail. Generally 3 mon processes will be sufficient for common Ceph cluster requirements.

Conclusion

Ceph should be provisioned to provide at least the minimum availability you need from your storage pool and the maximum risk you can afford to data loss. Each component must be balanced; it’s no good having a fully fault tolerant suite of monitor processes if you’ve not enabled data replication.

My final point is to plan for cluster maintenance as well as failures. Throughout a cluster’s lifetime, it’s reasonable to expect that individual servers will be taken out of service for short periods of time. For example, applying Ceph updates requires that the components are restarted, or applying kernel updates often results in a complete restart of the host taking several minutes to complete. During this time your cluster will be in a degraded state as services required by the cluster are offline. Ceph is designed to work in this scenario (depending on configuration) however if, for example, you only have 3 mon processes and you take the server down running one of them to apply a kernel update then the cluster cannot function if another mon process fails due to other reasons.


Small Scale Ceph Replicated Storage

Category : How-to

Get Social!

I’ve written a few posts about Ceph, how it works and how it’s set up and it mostly revolves around large scale storage for storing things like virtual machines. This post will focus on using Ceph  provide fault tolerant storage for a small amount of data in a low resource environment. Because of this, the main focus has been moved away from performance and switched to:

  • availability – the storage should always be available and recoverable in the event of disaster
  • portability – the storage isn’t tied to a machine and can be moved with relative ease.
  • scalability – more machines can use the storage as required.

This tutorial will focus on a small scale Ceph setup, fit for something like a Raspberry Pi or low resource VPS. We’ll use 3 machines but you could easily add more machines if your scenario requires it.

If you are looking for a larger setup, then see this blog post on installing Ceph.

ceph-local

The above diagram shows the topology of the layout. Each machine will have a file /ceph-file that will be mounted as a block device on /dev/loop0 and that’s the space that will be assigned to Ceph. Ceph will replicate any data stored to the file and ensure the data is available to all Ceph clients. The Ceph storage will be accessed from a mountpoint at /mnt/ha-pool.

Ceph block device

The first step in creating a Ceph storage pool is to set aside some storage that can be used by Ceph. Ceph stores everything twice, by default, so whatever storage you provision will be halved. For this example we’re going to use a file created with dd as the Ceph storage device, however you could use a drive mounted in /dev/ if you have one. A whole drive is by far the preferred solution, however as I’ve stated, the main goal of this post isn’t just performance.

If you’re going to use a file for storage, follow my post on creating a block device from a file and mount it on loop0. Otherwise you can continue to the next step.

OpenVZ: if you’re using Ceph inside of an OpenVZ container, make sure you pass the loop device through to the container.

Installing Ceph

At this point it’s worth noting that Ceph, in addition to the application requirements, will use approximately 1MB of RAM for each GB of storage provisioned. This means that 1TB of provisioned storage (which in today’s world is rather small) would take 1GB of RAM plus the requirements of running the Ceph daemons. For our low memory footprint, only provision the storage that you’ll need.

Before starting the install, you’ll need a couple of things in place:

  • SSH Keys are set up between all nodes in your cluster – see this post for information on how to set up SSH Keys. For security it’s good practice to set up a new user on all machines you’re going to install Ceph onto and use it to run Ceph. The key should also be copied to all machines using the ssh-copy-id command.
  • NTP is set up on all nodes in your cluster to keep the time in sync. You can install it with: apt-get install ntp

The following commands are for installing Ceph on Debian (wheezy) and should be executed on all machines that need to run Ceph. In our example, these commands will be executed on Server 1Server 2 and Server 3.

First let’s add the release key and repositories to the apt package manager. Run the following as root:

wget --no-check-certificate -q -O- 'https://git.ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
echo deb http://download.ceph.com/debian-firefly/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list

Next let’s update our apt cache and install Ceph and a few other bits.

apt-get update && apt-get install ceph-deploy ceph ceph-common

Setup and configuring for minimal resource requirements

The next step should be done on just one of your Ceph machines. This will create the monitor service and make each machine aware of the other machines running Ceph.

The command references each machine you’re going to be running Ceph on by hostname or DNS entry. Before running the command, make sure that all of your machines resolve via DNS or hosts file. Because I’m only running this in a lab, I’ve used the hosts file route and added an entry to each machine in the hosts file of all Ceph machines.

vi /etc/hosts

Add your Ceph machine IP and hostnames.

10.10.10.1 ceph1
10.10.10.2 ceph2
10.10.10.3 ceph3

You can test that each machine can see the others by using the ping command. If it works then you should be in business!

ping ceph2
ping ceph3

Once you’re happy that all machines can reference the other machines then run the ceph-deploy command:

ceph-deploy new ceph1 ceph2 ceph3

If you haven’t used your ssh keys since setting them up you may be presented with the following warning. Just type yes to continue.

The authenticity of host 'ceph1 (10.10.10.1)' can't be established.
ECDSA key fingerprint is 66:44:a8:90:e2:8e:12:0e:05:4a:c4:93:a1:43:d1:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph1' (ECDSA) to the list of known hosts.

We now need to configure Ceph with our low resource settings. These settings are not performance driven, but instead set to minimise system resources.

See ceph.conf for the script and add the content to the ceph.conf file

vi ~/ceph.conf

Create the initial mds daemons, monitor daemons and set the proper permissions on the keyring file.

ceph-deploy mon create-initial
ceph-deploy admin ceph1 ceph2 ceph3
ceph-deploy mds create ceph1 ceph2 ceph3

ssh ceph1 "chmod 644 /etc/ceph/ceph.client.admin.keyring"
ssh ceph2 "chmod 644 /etc/ceph/ceph.client.admin.keyring"
ssh ceph3 "chmod 644 /etc/ceph/ceph.client.admin.keyring"

Test Ceph is deployed and monitors are running

At this point it’s good to take a step back and check everything is up and running. We’ve still not assigned any storage to our Ceph cluster so we can’t run it yet, but we should have the monitor daemons running and the cluster configuration be deployed on all servers.

Run the below command and take a look at the output.

ceph -s

The output should show

cluster 51e1ddff-ff28-4f58-af7e-e94448e5324b
   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e1: 3 mons at {ceph1=10.10.10.1:6789/0,ceph2=10.10.10.2:6789/0,ceph3=10.10.10.3:6789/0}, election epoch 6, quorum 0,1,2 ceph1,ceph2,ceph3
   osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e8: 1/1/1 up {0=web1=up:active}, 2 up:standby

As you can see, three Ceph servers are referenced on port 6789 which is the monitor daemon port number.

Add storage to the Ceph cluster

We’ve got our Ceph cluster, and we’ve got our storage device that we created as the first step, it’s time to put the two together. Run the below commands on the same machine that you ran the above steps on. You’ll need to replace /dev/sda with the block device on each ceph machines that you’d  like to use. Note that the block device (sda) does not need to be the same on all machines.

ceph-deploy osd create --fs-type ext4 ceph1:/dev/sda
ceph-deploy osd create --fs-type ext4 ceph2:/dev/sda
ceph-deploy osd create --fs-type ext4 ceph3:/dev/sda

Or…

You can use a directory as storage for Ceph, rather than a block device.

If you’re following this tutorial and creating a loop device to use with Ceph then you’ll need to ensure there is a filesystem on the loop0 device and that it’s mounted. You can skip these next step if you are just using an existing directory.

Run the below commands (if you’re using a loop device) on each of the machines that has a loop device you’d like to use. We’re assuming that you’re loop device is loop0. For this example we’ll run it on each of the three machines; ceph1, ceph2 and ceph3.

mkfs.ext4 /dev/loop0
mkdir /mnt/ceph-backing0
echo "/dev/loop0 /mnt/ceph-backing0 ext4 defaults 1 1" >> /etc/fstab
mount /mnt/ceph-backing0

You can use a directory path on the Ceph machine as the OSD device. This may be an option if you’re in an OpenVZ or Docker container that doesn’t allow you to pass through block devices.

ceph-deploy osd prepare ceph1:/mnt/ceph-backing0
ceph-deploy osd prepare ceph2:/mnt/ceph-backing0
ceph-deploy osd prepare ceph3:/mnt/ceph-backing0

And then activate the storage:

ceph-deploy osd activate ceph1:/mnt/ceph-backing0
ceph-deploy osd activate ceph2:/mnt/ceph-backing0
ceph-deploy osd activate ceph3:/mnt/ceph-backing0

Mount a Ceph device as a folder

That’s the server side done! The last step to using our Ceph storage cluster is to mount the cluster to a mountpoint on the local filesystem. Here we’re going to use /mnt/ha-pool as the mount point but you can change that to whatever you’d like. Run these commands on any machines that you’d like to mount the Ceph volume on.

First create the mount point where the Ceph storage will be accessible from.

mkdir /mnt/ha-pool

Then we need to export the key so that the ceph-client can authenticate with the Ceph daemon. You could turn authentication off, or even create a non-admin user secret but for this tutorial we’ll just use the admin user.

ceph-authtool --name client.admin /etc/ceph/ceph.client.admin.keyring --print-key >> /etc/ceph/admin.secret

Then run the below command to add an entry to your fstab file so that the Ceph volume will be automatically mounted on machine start. This will mount the Ceph volume at /mnt/ha-pool.

echo "ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2" >> /etc/fstab

Finally run the mount command

mount /mnt/ha-pool

One last check to make sure you’re up and running:

df -h | grep ha-pool
10.10.10.1,10.10.10.2,10.10.10.3:/                    6G   3G   3G  54% /mnt/ha-pool

And that’s it! You have a working Ceph cluster up and running!


Use A File As A Linux Block Device

Get Social!

Just like when creating a SWAP file, you can create a file on a disk and present it as a block device. The block device would have a maximum file size of the backing file, and (as long as it’s not in use) be moved around like a normal file. For example, I could create a 1GB file on the filesystem and make Linux treat the file as a disk mounted in /dev/. And guess what – that’s what we’re going to do.

Create a file and filesystem to use as a block device

First off, use dd to create a 1GB file on an existing disk that we’ll use for our storage device:

dd if=/dev/zero of=/root/diskimage bs=1M count=1024

Then ‘format’ the file to give it the structure of a filesystem. For this example we’re going to use ext4 but you could choose any filesystem that meets your needs.

mkfs.ext4 /root/diskimage

You’ll be promoted with Proceed anyway?. Type y and press return to proceed with the process.

mke2fs 1.42.5 (29-Jul-2012)
/root/diskimage is not a block special device.

Proceed anyway? (y,n) y

Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

Mounting a loop device

Before mounting the file we need to check that there is a free /dev/loopX loopback device that we can use to represent our new block device.

Run the below command, and if there is any output then check if it’s one of your loop devices, which will more than likely reference /dev/loop as the mounted device. If you do have a reference to our loop device then see the below section on Unmounting a loop device, or choose a number higher than the highest listed loop device, for example: usually there are several loop devices, starting with loop0 and going up in value to loop1loop2, and so on.

cat /proc/mounts | grep /dev/loop

Once you have the file that you’d like to mount and a free loop device then you can go ahead and mount the file as a block device. You have two options:

  1. Mount the file as a block device only
  2. Mount the file as a block device and mount the filesystem of it on a local mount point (eg. /mnt/mymountpoint).

For option 1; to only mount the file as a device in /dev/, run the below command and change /root/diskimage to the path of the file you’d like to mount. loop0 can also be incremented as explained above.

losetup /dev/loop0 /root/diskimage

If you’d like this to be remounted after a machine reboot then add the above line to the rc.local file.

vi /etc/rc.local

And add:

losetup /dev/loop0 /root/diskimage

 

For option 2; to mount the file and the filesystem on it, use the mount command. You must have already created the mount point locally before running the command, as you would when mounting a disk or NFS share.

mkdir /mnt/mymountpoint

Then run the mount command and specify the loop device, the path of the file and the path to mount the filesystem on:

mount -o loop=/dev/loop0 /root/diskimage /mnt/mymountpoint

To check the file has been mounted you can use the df command:

df -h | grep mymountpoint
/dev/loop0  976M  1.3M  924M  1% /mnt/mymountpoint

Unmounting a loop device

If you’ve mounted the filesystem on the block device using the mount command then make sure it’s unmounted before proceeding.

umount /mnt/mymountpoint

To then free the loop0 device (or which ever loop device you’ve used) you’ll need the losetup command with the d switch.

losetup -d /dev/loop0

 


Proxmox 4.x bind mount – mount storage in an LXC container

Get Social!

An LXC containers storage is simple to set and maintain and is usually done through either a Web based GUI or a command line utility. It’s simple to set the size of disk allocated to an LXC container, and you can increase it easily, even while the container is still running.

Whilst simple to set up and administer, the standard storage options of LXC containers are limited. For example, you can’t mount an NFS share in an LXC container, or can you have multiple disks mounted as /dev block devices.

That’s where a bind mount comes in. You can add one or more mount points to your LXC container config that specifies a source path and a target path which is activated when the container starts. The source path would be a location on the host machine (the physical host running the LXC container – the Proxmox host in this example). The target is a location inside of the LXC container such as /mnt/myshare. This means that you can mount an NFS share, a GlusterFS share, several physical disks or anything else that can be mounted on your host and pass it through to your container.

Before you start, you’ll need to make sure both the host location and the target container location exist, otherwise the container will fail to start. You’ll then need to edit your LXC container config file. On Proxmox 4.x this can be found in /etc/pve/lxc/ and then the ID of your container. In this example the container we’re working on has an ID of 101.

vi /etc/pve/lxc/101.conf

Add the following row and substitute SOURCE with the path that you’d like to pass through to your container and TARGET to the path inside the container.

mp0: SOURCE, mp=TARGET

The below example will make /mnt/pve/nfs-share available in the container at /mnt/nfs.

mp0: /mnt/pve/nfs-share, mp=/mnt/nfs

Then restart your CT for the changes to take effect.

 

If you have multiple paths to mount then you can increment mp0 to mp1mp2, etc.

mp0: /mnt/pve/nfs-share, mp=/mnt/nfs
mp1: /mnt/pve/gluster-share, mp=/mnt/gluster
...

If you’re using version 3.x of Proxmox, or stand alone OpenVZ then see Proxmox bind mount for OpenVZ.


Visit our advertisers

Quick Poll

How many Proxmox servers do you work with?

Visit our advertisers