07 August 2017

Optimize rsync performance to remote server

rsync is a fast and easy way to synchronize or back up files from one computer to another. This post briefly introduces rsync and the command line parameters I use most, then outlines some modifications that sped up my data transfers from about 20-25 MB/s to over 90 MB/s in a local, non-metered network!

Short description of rsync

The rsync protocol synchronizes files between computers efficiently. It transfers only changed files and, on top of that, tries to send only the differences between the local and remote copy of a file to further reduce the amount of data transferred.
The underlying data transfer is secured with the Secure Shell (SSH), which adds computational overhead: the data is encrypted, transferred to the remote server and decrypted again. Because I am mostly on the road with a metered internet connection, I use the maximum possible compression directly in my main SSH configuration, and this becomes the bottleneck when rsync runs in a fast local network. In that case, disabling SSH compression and selecting a faster SSH cipher increases rsync performance a lot!

Description of used command line parameters

My default command line parameters with descriptions from the full rsync man page:

  • -a: archive mode (equals -rlptgoD): recursive, copy symlinks as symlinks, preserve permissions, preserve modification times, preserve group, preserve owner, preserve device files and special files
  • -v: increase verbosity during the transfer
  • -u: skip files that are newer on the receiver
  • -r: recurse into directories (already implied by -a)
  • --progress: show progress during transfer
  • --delete: delete extraneous files on the destination, that is, files that no longer exist in the source
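
Because --delete removes files on the destination, it can be worth previewing a run before doing it for real. The following is a minimal sketch, assuming rsync's --dry-run option; the helper name preview_sync is hypothetical, and -r is left out because -a already implies it:

```shell
# Hypothetical helper: show what rsync would transfer or delete,
# without changing anything on the destination.
# $1 = source folder, $2 = destination (for example desthost:bar)
preview_sync() {
    rsync -avu --progress --delete --dry-run "$1" "$2"
}

# Example call (placeholder names from this post):
#   preview_sync foo desthost:bar
```

If the listed transfers and deletions look right, the same command without --dry-run performs the actual synchronization.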

Remove SSH bottleneck to optimize performance of rsync

The maximum SSH compression (which is the default in my config) is helpful on a metered connection with low bandwidth, where the reduced amount of data saves time. In a fast local network, it turns out to be the bottleneck and shows up as 100% CPU usage of the ssh process. Here it is much faster to transfer the files as they are, because compressing and decompressing takes longer than just transferring the plain file.

The following ssh options speed up the data transfer in the local network:

  • -T: disable pseudo-tty allocation on the destination
  • -c aes128-ctr: select a weaker but faster SSH cipher. Others suggest arcfour, which would require manually modifying sshd_config on the destination host. That is not always possible, and aes128-ctr worked just fine for me.
  • -x: disable X11 forwarding
  • -o Compression=no: disable SSH compression, the bottleneck described above
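
Whether a given cipher can be used depends on the installed SSH version. As a rough sketch, assuming an OpenSSH client that supports the -Q query option, the local cipher list can be checked like this; the function name cipher_available is hypothetical:

```shell
# Hypothetical helper: succeed if the local ssh client supports the cipher.
# "ssh -Q cipher" prints one supported cipher name per line.
cipher_available() {
    ssh -Q cipher | grep -qxF -- "$1"
}

# Example call:
#   cipher_available aes128-ctr && echo "aes128-ctr is supported locally"
```

This only covers the client side. The destination host has to accept the cipher as well, otherwise the connection fails during negotiation.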

The full command to back up the folder foo to the remote folder bar on the destination host desthost is then:

export RSYNC_RSH="ssh -T -c aes128-ctr -o Compression=no -x"
rsync -avur --progress --delete foo desthost:bar

With this command, it was possible to increase the transfer rates from about 20-25 MB/s to more than 90 MB/s!
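
The exported RSYNC_RSH variable affects every later rsync call in the same shell. As an alternative sketch, the same ssh options can be passed with rsync's -e option, so that they apply to a single call only; the wrapper name fast_rsync is hypothetical:

```shell
# Hypothetical wrapper: run rsync over ssh without a pseudo-tty, without
# X11 forwarding, without compression, and with the aes128-ctr cipher.
# All arguments are passed through to rsync unchanged.
fast_rsync() {
    rsync -e "ssh -T -c aes128-ctr -o Compression=no -x" "$@"
}

# Example call (placeholder names from this post):
#   fast_rsync -avu --progress --delete foo desthost:bar
```

This keeps the heavily compressed default SSH configuration untouched for connections over the metered link and uses the faster settings only for transfers in the local network.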
