[rsync] Remote synchronization, fast incremental backups

rsync remote synchronization

  • 1. Remote synchronization
    • 1.1 rsync overview
    • 1.2 Downlink synchronization
    • 1.3 Remote File Synchronization Summary
  • 2. Build rsync remote file synchronization
    • 2.1 Build rsync remote downstream synchronization
      • 2.1.1 Configuring the rsync server side (synchronization source)
      • 2.1.2 Configuring the rsync client (initiator)
    • 2.2 Interaction-free configuration
    • 2.3 rsync authentication method
    • 2.4 Build rsync remote uplink synchronization
    • 2.5 rsync synchronization knowledge summary
  • 3. Configure rsync+inotify on the initiator side
    • 3.1 Installing the inotify-tools tool
    • 3.2 Writing triggered backup scripts
  • 4. Quickly delete a large number of files
    • 4.1 High Frequency Problem: Using rsync to achieve fast deletion of large numbers of files
    • 4.2 Summary of deletion of a large number of documents

1. Remote synchronization

1.1 rsync overview

rsync (Remote Sync) is an open-source, fast backup tool. It can mirror an entire directory tree between different hosts, supports incremental backups, and preserves links and permissions. It uses an optimized synchronization algorithm and can compress data before transmission, which makes it well suited to off-site backups, mirror servers, and similar applications.

1.2 Downlink synchronization

In a remote synchronization task, the client that initiates the rsync operation is called the initiator, and the server that responds to it is called the synchronization source. In downlink synchronization, the synchronization source provides the original location of the files, and the initiator needs read access to that location.

1.3 Remote File Synchronization Summary

# Common tools for remote file synchronization
scp  rsync  svn  git (GitHub, GitLab, Gitee)

2. Build rsync remote file synchronization

2.1 Build rsync remote downstream synchronization

2.1.1 Configuring the rsync server side (synchronization source)

### Stop the firewall, keep it from starting at boot, and disable SELinux
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config

(1) Check if the rsync package is already installed

rpm -q rsync							# Most systems have rsync installed by default
rpm -qc rsync							# List the package's configuration files
grep rsync /etc/services				# Show the registered rsync port

(2) Create the configuration file under /etc/ and add the shared module

vim /etc/				# Add the following configuration items
# Note: the trailing annotations below are explanations only. rsyncd.conf treats
# only lines that begin with # as comments, so leave them out of the real file.
uid = root
gid = root
use chroot = yes										# Confine the daemon to the source directory
address = 192.168.80.20									# Listening address
port = 873												# Listening port (TCP/UDP 873); check with: grep rsync /etc/services
log file = /var/log/							# Log file location
pid file = /var/run/							# File that stores the process ID
hosts allow = 192.168.80.0/24							# Client addresses allowed to connect
dont compress = *.gz *.bz2 *.tgz *.zip *.rar *.z		# File types that are not compressed again during transfer

[wwwroot]												# Shared module name
path = /var/www/html									# Actual path of the source directory
comment = Document Root of 
read only = yes											# Whether the module is read-only
auth users = backuper									# Authorized accounts, multiple accounts separated by spaces
secrets file = /etc/rsyncd_users.db						# Data file that stores the account information

# For anonymous access, remove the "auth users" and "secrets file" configuration items.
# The account data file itself is created in step (3).
----------------------------------------------------------------------------------------------------------
# The configuration actually used in the rest of this walkthrough:
vim /etc/
### Add or modify the following
uid = root
gid = root
use chroot = yes
address = 192.168.80.20
port = 873
max connections = 4
pid file = /var/run/
dont compress   = *.gz *.tgz *.zip *.z *.Z *.rpm *.deb *.bz2
hosts allow = 192.168.80.0/24

[gzy520]
path = /data
comment = gay rsync test dir
read only = yes
auth users = cli
secrets file = /opt/rsyncd_userlist
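The annotated listing above places explanations after each setting for readability. As far as the rsyncd.conf manual describes it, only lines that begin with # are comments, so a trailing annotation would be read as part of the value. The following is a minimal sketch that flags such lines before the daemon is started; the helper name find_inline_comments and the throwaway sample file are assumptions for illustration, not part of the original setup.

```shell
# Hypothetical helper: print "line-number:text" for every setting line that
# carries a trailing "# ..." annotation. Full-line comments are ignored.
find_inline_comments() {
    grep -nE '^[[:space:]]*[^#[:space:]].*#' "$1"
}

# Throwaway sample file standing in for the real configuration file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# full-line comment, fine
uid = root
use chroot = yes   # trailing annotation, would become part of the value
[gzy520]
path = /data
EOF

flagged=$(find_inline_comments "$sample")
echo "$flagged"
rm -f "$sample"
```

Only the third line of the sample is reported; pointing the helper at the real configuration file should print nothing once the annotations have been removed.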

(3) Create the data file that stores the account information and restrict its permissions

vim /opt/rsyncd_userlist
### Format is user:password; the user name must match "auth users" in the configuration file under /etc/
### These are rsync accounts only, so no system user with the same name is needed
cli:123

chmod 600 /opt/rsyncd_userlist
# Ensure that all users have read access to the source directory /data
mkdir /data
ls -ld /data
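The chmod 600 step matters because rsync rejects account and password files that other users can read. A small sketch of a pre-flight check, assuming GNU coreutils stat (the -c option) and a hypothetical helper named is_private_file; the temporary file only stands in for /opt/rsyncd_userlist:

```shell
# Hypothetical helper: succeed only if the permission bits are exactly 600.
# Assumes GNU coreutils "stat -c", as found on typical Linux systems.
is_private_file() {
    [ "$(stat -c '%a' "$1")" = "600" ]
}

# Throwaway file standing in for /opt/rsyncd_userlist.
secrets=$(mktemp)
printf '%s\n' 'cli:123' > "$secrets"

chmod 644 "$secrets"
if is_private_file "$secrets"; then loose=ok; else loose=rejected; fi

chmod 600 "$secrets"
if is_private_file "$secrets"; then tight=ok; else tight=rejected; fi

echo "mode 644: $loose, mode 600: $tight"
rm -f "$secrets"
```

The same check can be applied to the client-side password file used later with --password-file.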

(4) Start the rsync service

# Start the rsync service program
rsync --daemon				# Start the rsync service to run as a standalone listening service (daemon)
netstat -lntp | grep rsync

(5) Add test files to the shared directory /data/

cd /data/
cp -r /etc/passwd /etc/shadow /etc/hosts /etc/fstab /etc/ ./
echo a > a
echo b > b
echo c > c

2.1.2 Configuring the rsync client (initiator)

Basic format:

rsync [options] source destination

Common options and their functions:

Option        Function
-r            Recursive mode: include all files in directories and subdirectories
-l            Copy symbolic links as symbolic links
-v            Show detailed (verbose) information about the synchronization process
-z            Compress file data during the transfer
-a            Archive mode: preserves permissions, attributes, etc.; equivalent to the combined options "-rlptgoD"
-p            Preserve file permissions
-t            Preserve file modification times
-g            Preserve the file's group (super user only)
-o            Preserve the file's owner (super user only)
-H            Preserve hard links
-A            Preserve ACL attributes
-D            Preserve device files and other special files
--delete      Delete files that exist in the target location but not in the original location (for a downlink sync: files that exist locally on the client but not in the shared directory on the remote server)
--checksum    Decide whether to skip a file based on its checksum rather than on file size and modification time
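To see what -a and --delete do without touching a real server, here is a minimal local sketch. The two temporary directories are stand-ins for the original and target locations (they are not part of the setup in this article), and the rsync call is skipped when rsync is not installed.

```shell
# Local stand-ins for the original and target locations.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "kept" > "$work/src/kept.txt"
echo "stale" > "$work/dst/stale.txt"

if command -v rsync >/dev/null 2>&1; then
    # -a preserves attributes; --delete removes target files missing from the source.
    rsync -a --delete "$work/src/" "$work/dst/"
    result=$(ls "$work/dst")
else
    result="rsync not installed"
fi

echo "$result"
rm -rf "$work"
```

After the run, the target holds only kept.txt: the stale file is gone because it has no counterpart in the source.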

### Stop the firewall, keep it from starting at boot, and disable SELinux (on the client as well)
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config

Download the specified resource to the local /opt/test directory for backup

mkdir /opt/test
cd /opt/test

### Format I
rsync -avz [email protected]::gzy520 /opt/test

### Format II
rsync -avz --delete rsync://[email protected]/gzy520 /opt/test/

2.2 Interaction-free configuration

To synchronize without a password prompt, save the password in a file in advance.

echo 123 > /opt/rsync_pass
chmod 600 /opt/rsync_pass

### Pull the remote server's data to the local directory without a password prompt
rsync -avz --delete --password-file=/opt/rsync_pass rsync://[email protected]/gzy520 /opt/test/

### Schedule the remote backup with cron
# The password file (here /opt/rsync_pass) holds the cli user's password; passing it with
# "--password-file=/opt/rsync_pass" lets rsync run without asking for a password.
crontab -e 
30 22 * * * /usr/bin/rsync -az --delete --password-file=/opt/rsync_pass rsync://[email protected]/gzy520 /opt/test/

systemctl restart crond
systemctl enable crond
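The cron entry above starts a new transfer every day at 22:30 whether or not the previous one has finished. One way to avoid overlapping runs is to use mkdir as an atomic lock. The sketch below assumes a hypothetical helper run_exclusive and a lock directory under /tmp; neither is part of the original setup.

```shell
# Hypothetical helper: run a command only if the lock directory can be created.
# mkdir either creates the directory or fails, so two runs cannot both succeed.
run_exclusive() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "another run holds $lockdir, skipping" >&2
    return 1
}

lock="${TMPDIR:-/tmp}/rsync_pull.lock.$$"

# Free lock: the command runs ("true" stands in for the rsync command line).
if run_exclusive "$lock" true; then first=ran; else first=skipped; fi

# Lock already held: the command is skipped.
mkdir "$lock"
if run_exclusive "$lock" true 2>/dev/null; then second=ran; else second=skipped; fi
rmdir "$lock"

echo "first: $first, second: $second"
```

In a real cron job the helper would wrap the rsync command line and use a fixed lock path, so that successive runs see the same lock.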

2.3 rsync authentication method

rsync has two common authentication methods: the rsync daemon method and the ssh method. The rsync daemon method can be inflexible in some situations, in which case ssh is the preferred choice.
-e 'ssh -p 22': tells rsync to connect over ssh on the given port; the port does not need to be specified when it is the default, 22.

rsync -avz -e 'ssh -p 22' [email protected]:///etc// /opt/test/

2.4 Build rsync remote uplink synchronization

Upload local data files to the remote server (ssh method):

rsync -avz -e 'ssh -p 22' /opt/test/ [email protected]:/data/  # With a trailing /, the contents of /opt/test/ are pushed; without it, the directory itself is pushed
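The effect of the trailing slash can be checked locally. This is a small sketch that uses temporary directories as stand-ins for /opt/test and the remote /data/ (both stand-ins are assumptions for illustration); the rsync calls are skipped when rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/test" "$work/with_slash" "$work/without_slash"
echo hello > "$work/test/a"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash on the source: only the contents are copied.
    rsync -a "$work/test/" "$work/with_slash/"
    # No trailing slash: the directory itself is recreated inside the target.
    rsync -a "$work/test" "$work/without_slash/"
    with=$(ls "$work/with_slash")
    without=$(ls "$work/without_slash")
else
    with="rsync not installed"
    without="rsync not installed"
fi

echo "with slash: $with"
echo "without slash: $without"
rm -rf "$work"
```

With the slash the target contains the file a directly; without it the target contains a test directory that in turn holds a.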

Use the rsync daemon method to upload (push) data

vim /etc/ 
### Set read only to no
read only = no

### Make the shared directory writable
chmod 777 /data/

### Stop the rsync service, confirm that it is no longer listening, then start it again
kill $(cat /var/run/)
netstat -lntp | grep rsync
rsync --daemon
netstat -lntp | grep rsync

rsync -avz /opt/test [email protected]::gzy520

2.5 rsync synchronization knowledge summary

Synchronization source (server side)

yum install -y rsync
vim /etc/
...
[XXX]   # Module name of the synchronization source directory
path = ...
read only = yes/no
auth users =
secrets file =
rsync --daemon  # Listens on port 873

Initiator (client side)

Pull (downlink synchronization)

rsync -avz   source (synchronization source)   destination (local directory)
rsync -avz   user@address::module_name   local_directory
rsync -avz   rsync://user@address/module_name   local_directory
                # optional extras: --password-file=<file>   --delete
rsync -avz -e 'ssh -p ssh_port'   user@address:/source_directory   local_directory

Push (uplink synchronization)

rsync -avz  source (local directory)   destination (synchronization source)
rsync -avz  local_directory   user@address::module_name
rsync -avz  local_directory   rsync://user@address/module_name
rsync -avz -e 'ssh -p ssh_port'   local_directory   user@address:/target_directory
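The three remote endpoint syntaxes summarized above differ only in punctuation, which is easy to get wrong. As an illustration, the sketch below builds each form from its parts. The helper name rsync_endpoint is hypothetical; cli, 192.168.80.20 and gzy520 are taken from the server configuration earlier in this article, and root as the ssh user is an assumption.

```shell
# Hypothetical helper: print a remote endpoint in one of the three syntaxes.
#   daemon : user@host::module
#   url    : rsync://user@host/module
#   ssh    : user@host:/path
rsync_endpoint() {
    style=$1
    user=$2
    host=$3
    target=$4
    case $style in
        daemon) printf '%s@%s::%s\n' "$user" "$host" "$target" ;;
        url)    printf 'rsync://%s@%s/%s\n' "$user" "$host" "$target" ;;
        ssh)    printf '%s@%s:%s\n' "$user" "$host" "$target" ;;
        *)      echo "unknown style: $style" >&2; return 1 ;;
    esac
}

rsync_endpoint daemon cli 192.168.80.20 gzy520
rsync_endpoint url cli 192.168.80.20 gzy520
rsync_endpoint ssh root 192.168.80.20 /data/
```

The double colon and the rsync:// prefix both select the daemon on port 873, while a single colon followed by a path goes through the remote shell (ssh).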

3. Configure rsync+inotify on the initiator side

The inotify notification interface can be used to monitor the file system for various changes, such as file access, deletion, movement, and modification. With this mechanism it is easy to implement file change alerts, incremental backups, and timely responses to changes in directories or files.
Combining the inotify mechanism with rsync enables triggered backups (real-time synchronization): an incremental backup starts as soon as files in the original location change, and otherwise the process waits silently. This avoids the delay and the overly frequent cycles that come with backups on a fixed schedule.
Because the inotify notification mechanism is provided by the Linux kernel, it mainly monitors the local machine, so triggered backups are better suited to uplink synchronization.

(1) Modify the rsync source server configuration file (server-side configuration file)

vim /etc/
......
read only = no			# Turn off read-only, uplink synchronization needs to be writable

kill $(cat /var/run/)
rm -rf /var/run/
rsync --daemon	
netstat -lntp | grep rsync

chmod 777 /data/

(2) Adjusting inotify kernel parameters

The Linux kernel exposes three tunable inotify parameters: max_queued_events (size of the monitoring event queue, default 16384), max_user_instances (maximum number of monitoring instances, default 128), and max_user_watches (maximum number of watches per user, default 8192). When the number of directories and files to be monitored is large or they change frequently, it is recommended to increase these three values.

cat /proc/sys/fs/inotify/max_queued_events
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches

vim /etc/
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 1048576

sysctl -p
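Before editing anything it can help to compare the current limits with the values suggested above. A minimal sketch, assuming a hypothetical helper named needs_raise; the target values are the ones from this section, and the /proc files are only read, never modified:

```shell
# Hypothetical helper: succeed when the current value is below the wanted one.
needs_raise() {
    [ "$1" -lt "$2" ]
}

# Wanted values taken from the settings in this section.
for pair in max_queued_events:16384 max_user_instances:1024 max_user_watches:1048576; do
    name=${pair%%:*}
    wanted=${pair##*:}
    file=/proc/sys/fs/inotify/$name
    if [ -r "$file" ]; then
        current=$(cat "$file")
        if needs_raise "$current" "$wanted"; then
            echo "$name: $current is below $wanted, consider raising it"
        else
            echo "$name: $current is sufficient"
        fi
    else
        echo "$name: not readable on this system"
    fi
done
```

The loop prints one line per parameter and makes no changes, so it is safe to run on a production host.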

3.1 Installing the inotify-tools tool

Using the inotify mechanism also requires installing inotify-tools, which provides the inotifywait and inotifywatch utilities for monitoring and summarizing changes.
inotifywait: monitors events such as modify, create, move, delete, and attribute change (attrib), and prints a result as soon as a change occurs.
inotifywatch: collects file system changes and prints a summary at the end of the run.

cd /opt
rz -E						# Upload the source archive to the server
#inotify-tools-3.
tar xf inotify-tools-3. 

cd /opt/inotify-tools-3.14
./configure
make -j2 && make install

You can execute the "inotifywait" command first, and then open a new terminal to add and move files to the /opt/test/ directory, and follow the screen output in the original terminal.

inotifywait -mrq -e modify,create,move,delete /opt/test/
# Option "-e": specifies which events to monitor
# Option "-m": monitors continuously
# Option "-r": recurses through the entire directory
# Option "-q": reduces the output to the essentials
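Each line that inotifywait prints in this mode has three fields: the watched directory, the event name(s), and the file name. The sketch below shows how while read splits such lines. The sample lines are illustrative stand-ins written by hand, not captured output, and sample_events is a hypothetical helper.

```shell
# Sample lines in inotifywait's default "directory events filename" layout.
# They are illustrative stand-ins, not captured output.
sample_events() {
    printf '%s\n' \
        '/opt/test/ CREATE a.txt' \
        '/opt/test/ MODIFY a.txt' \
        '/opt/test/sub/ DELETE old report.txt'
}

# read splits on whitespace; the last variable receives the rest of the line,
# so a file name containing spaces stays intact.
summary=$(sample_events | while read -r directory events file; do
    printf '%s|%s|%s\n' "$events" "$directory" "$file"
done)

echo "$summary"
```

Because the file name is the last field, names with spaces survive the split; a watched directory whose path contains spaces would not, which is worth keeping in mind when choosing the directory to monitor.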

3.2 Writing triggered backup scripts

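Write the triggered synchronization script in another terminal. The script name must not contain the string "rsync": the script uses pgrep rsync to check for a running transfer, so a matching name can make the check count the script itself, and the synchronization may never start.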

vim /opt/
#!/bin/bash

INOTIFY_CMD="inotifywait -mrq -e modify,create,attrib,move,delete /opt/test/"
RSYNC_CMD="rsync -az --delete --password-file=/opt/rsync_pass /opt/test/ [email protected]::gzy520/"
# Read the monitoring output line by line with while/read; each record can be inspected further before deciding how to react.
$INOTIFY_CMD | while read -r DIRECTORY EVENT FILE
do
    # Count the rsync processes that are currently running
    COUNT=$(pgrep rsync | wc -l)
    if [ "$COUNT" -le 0 ] ; then
        # If rsync is not executing, start it immediately
        $RSYNC_CMD
    fi
done

bash /opt/
chmod +x /etc//
echo '/opt/' >> /etc//				# Run the script automatically at boot

The script above watches the local /opt/test/ directory. Whenever a change occurs, it triggers an rsync operation that uploads a backup to the /data/ shared directory on server 192.168.80.20.
The triggered uplink synchronization can be verified as follows:

(1) Run the script under /opt/ locally.
(2) Switch to the local /opt/test/ directory and perform operations such as adding, deleting, and modifying files.
(3) View the changes in the /data/ directory in the remote server.

Additional information: while read processes its input one line at a time.

4. Quickly delete a large number of files

4.1 High Frequency Problem: Using rsync to achieve fast deletion of large numbers of files

If you need to delete a very large number of files on Linux (millions or more), such as the nginx cache under /usr/local/nginx/proxy_temp, rm -rf * may not work well because it takes a long time. In this case rsync offers a neat workaround based on replacement: it makes the target directory match an empty directory.

Create an empty directory /etc/kong, then create a large number of files a1 to a1000 under /opt and use rsync's replacement principle to delete them:

mkdir /etc/kong
cd /opt
touch a{1..1000}

Delete the files with rsync --delete. Note that this empties the whole of /opt/, not only a1 to a1000:

rsync -a --delete /etc/kong/ /opt/

Option Description:

# Empty the target directory with rsync
rsync --delete-before -a -H -v --progress --stats /home/blank/ /usr/local/nginx/proxy_temp/
--delete-before   The receiver deletes files before the transfer starts
-a                Archive mode: transfer recursively and preserve file attributes
-H                Preserve hard links
-v                Verbose output
--progress        Show progress during the transfer
--stats           Print file-transfer statistics
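The replacement principle can be tried safely in a scratch area before pointing it at a real cache directory. A minimal sketch using temporary directories (stand-ins, not the paths from this article); the rsync call is skipped when rsync is not installed:

```shell
work=$(mktemp -d)
mkdir -p "$work/empty" "$work/target"

# Create 1000 small files a1..a1000 in the target directory.
i=1
while [ "$i" -le 1000 ]; do
    : > "$work/target/a$i"
    i=$((i + 1))
done
before=$(ls "$work/target" | wc -l | tr -d ' ')

if command -v rsync >/dev/null 2>&1; then
    # Mirroring an empty directory removes everything in the target.
    rsync -a --delete "$work/empty/" "$work/target/"
    after=$(ls "$work/target" | wc -l | tr -d ' ')
else
    after="rsync not installed"
fi

echo "before: $before, after: $after"
rm -rf "$work"
```

The target directory itself remains in place; only its contents are removed, which is usually what is wanted for a cache directory.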

Method                 Result
rm                     Not usable with this many files
find with -            500,000 files in 43 minutes
find with -delete      9 minutes
Perl                   16 seconds
Python                 9 minutes
rsync with --delete    16 seconds

Conclusion: for deleting a large number of small files, rsync is the fastest and most convenient option.

4.2 Summary of deletion of a large number of documents

# Quickly empty a local directory
mkdir <empty_directory>
rsync -a --delete <empty_directory>/ <directory_to_empty>/