【Manager Handbook for Distributed AntDB-T】Data Recovery

News-2023-09-04

Asiainfo Anhui Technologies

Restore to a particular backup id

Command:

barman recover server_name backup_id destination_directory

For the meaning of each parameter, refer to barman recover --help.

For example:

[antdb@localhost1 ~]$ barman recover dm0 20190129T152916 /home/antdb/pgdata/datanode/0 --remote-ssh-command='ssh antdb@10.1.226.201'
Starting remote restore for server dm0 using backup 20190129T152916
Destination directory: /home/antdb/pgdata_xc/datanode/0
Copying the base backup.
Copying required WAL segments.
Generating archive status files
Identify dangerous settings in destination directory.

IMPORTANT
These settings have been modified to prevent data losses

postgresql.conf line 747: archive_command = false

Your PostgreSQL server has been successfully prepared for recovery!

Note the last message in the output: the node's archive_command is disabled once recovery completes. After all nodes have been recovered, the data has been confirmed to be correct, and the cluster is formally back in service, restore the archive_command setting on every node.
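Before the cluster goes back into service, it can help to list which restored configuration files still carry the disabled setting. The following is a minimal sketch, not part of barman or AntDB: the function name list_disabled_archive is hypothetical, and it assumes the postgresql.conf files passed in are readable from the host where the script runs.

```shell
#!/bin/sh
# Hypothetical helper: print every given postgresql.conf whose active
# (uncommented) archive_command is still set to false.
list_disabled_archive() {
    for conf in "$@"; do
        # Match an uncommented "archive_command = false" line, allowing
        # optional whitespace, optional quotes, and a trailing comment.
        if grep -Eq "^[[:space:]]*archive_command[[:space:]]*=[[:space:]]*'?false'?[[:space:]]*(#.*)?$" "$conf"; then
            echo "$conf"
        fi
    done
}
```

For example, list_disabled_archive /home/antdb/pgdata/datanode/0/postgresql.conf prints the path if that node still needs its archive_command restored, and prints nothing otherwise.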

Point-in-time based recovery

Point-in-time recovery can easily produce inconsistent recovery points if the clocks of the hosts where the datanodes run are not synchronized. If you can be sure that every node can be recovered to the same point in time, specify the --target-time parameter in the command.

For example:

[antdb@localhost1 ~]$ barman recover db1 20170928T163937 /home/antdb/adb/data/db1_barman --target-time '2017-09-28 16:40:00'
Starting local restore for server db1 using backup 20170928T163937
Destination directory: /home/antdb/adb/data/db1_barman
Doing PITR. Recovery target time: '2017-09-28 16:40:00'
Copying the base backup.
Copying required WAL segments.
Generating recovery.conf
Identify dangerous settings in destination directory.
IMPORTANT
These settings have been modified to prevent data losses
postgresql.conf line 687: archive_command = false
Your PostgreSQL server has been successfully prepared for recovery!
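Because point-in-time recovery depends on the datanode hosts agreeing on the time, a rough clock comparison before choosing --target-time may be useful. The sketch below is an illustration only: max_skew is a hypothetical helper, and the collection step in the comments assumes passwordless ssh to each host. Readings gathered one after another include ssh latency, so treat the result as approximate.

```shell
#!/bin/sh
# Hypothetical helper: given epoch-second readings (one per host),
# print the difference between the largest and smallest reading.
max_skew() {
    if [ "$#" -eq 0 ]; then
        echo "usage: max_skew EPOCH..." >&2
        return 1
    fi
    min=$1
    max=$1
    for t in "$@"; do
        if [ "$t" -lt "$min" ]; then min=$t; fi
        if [ "$t" -gt "$max" ]; then max=$t; fi
    done
    echo $((max - min))
}

# Example collection step (assumption: passwordless ssh as antdb):
#   readings=""
#   for h in 10.1.226.201 10.1.226.203; do
#       readings="$readings $(ssh antdb@$h date +%s)"
#   done
#   max_skew $readings
```

If the reported skew is larger than the precision you need for the recovery target, barrier-based recovery is the safer choice.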

Recovery based on barrier points

In the cluster version, if you want every node to recover to a consistent state, it is recommended to use the barrier feature provided by AntDB.

Each node is restored to the same barrier point to truly achieve a globally consistent point-in-time recovery.

1. To use this feature, create the barrier before the backup by executing the following command on the coordinator:

create barrier 'bar1';

2. Add the --target-barrier=barrier_name parameter when restoring:

barman recover dm0 20181227T110223 /home/antdb/pgdata_xc/datanode/0 --remote-ssh-command='ssh antdb@10.1.226.201' --target-barrier='bar1'

The following is an example of the actual operation of backup and recovery of a cluster:

• Before the backup, create a barrier, force a log switch on each node, and then back up each node:

create barrier 'bar1';

barman switch-xlog --force --archive dm0
barman switch-xlog --force --archive dm1
barman switch-xlog --force --archive coord0
barman switch-xlog --force --archive gtmcoord

barman backup dm0
barman backup dm1
barman backup coord0
barman backup gtmcoord

• When you need to restore the data, query the backup list and restore the data.

barman list-backup dm0
barman list-backup dm1
barman list-backup coord0
barman list-backup gtmcoord

barman recover dm0 backup_id /home/antdb/pgdata/datanode/0 --remote-ssh-command='ssh antdb@10.1.226.201' --target-barrier='bar1'

barman recover dm1 backup_id /home/antdb/pgdata/datanode/1 --remote-ssh-command='ssh antdb@10.1.226.201' --target-barrier='bar1'

barman recover coord0 backup_id /home/antdb/pgdata/coord/0 --remote-ssh-command='ssh antdb@10.1.226.201' --target-barrier='bar1'

barman recover gtmcoord backup_id /home/antdb/pgdata/gtmcoord/1 --remote-ssh-command='ssh antdb@10.1.226.203' --target-barrier='bar1'
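Since every node must be restored to the same barrier, generating the recover commands from one node table reduces the chance of a typo in a single command. The following is a sketch under stated assumptions: recover_cmd and print_all_recover_cmds are hypothetical helper names, and the input format (server, backup id, destination, ssh target per line) is invented for this illustration. The functions only print commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print one barman recover command.
# $1 server, $2 backup id, $3 destination, $4 ssh target, $5 barrier
recover_cmd() {
    printf "barman recover %s %s %s --remote-ssh-command='ssh %s' --target-barrier='%s'\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: read "server backup_id destination ssh_target"
# lines from stdin and print a recover command for each, all using the
# same barrier name given as $1.
print_all_recover_cmds() {
    barrier=$1
    while read -r server backup_id dest target; do
        # Skip blank lines in the node table.
        [ -n "$server" ] || continue
        recover_cmd "$server" "$backup_id" "$dest" "$target" "$barrier"
    done
}

# Usage (backup_id values come from barman list-backup):
#   print_all_recover_cmds bar1 <<EOF
#   dm0 backup_id /home/antdb/pgdata/datanode/0 antdb@10.1.226.201
#   gtmcoord backup_id /home/antdb/pgdata/gtmcoord/1 antdb@10.1.226.203
#   EOF
```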

After recovery completes, start the cluster. To protect the recovered data, each node remains in the recovery state after startup and cannot accept writes. To check, log in to a node and execute select pg_is_in_recovery(); a return value of t means the node is in the recovery state. At this point, execute select pg_wal_replay_resume(); on each node so that it can accept read and write requests. The barman_operate.sh script also provides an operation that runs this command on all master nodes: sh barman_operate.sh replay_resume.
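The check-then-resume step can be sketched per node as follows. This is an illustration, not the contents of barman_operate.sh: resume_if_in_recovery is a hypothetical name, the database name postgres is an assumption, and the host and port arguments are placeholders for your own node addresses. The PSQL variable exists only so the client binary can be overridden.

```shell
#!/bin/sh
# Client to use; override PSQL to point at a different psql binary.
PSQL=${PSQL:-psql}

# Hypothetical helper: resume WAL replay on one node if it reports
# that it is still in recovery. $1 host, $2 port.
resume_if_in_recovery() {
    host=$1
    port=$2
    # -A (unaligned) and -t (tuples only) reduce the output to "t" or "f".
    state=$($PSQL -h "$host" -p "$port" -d postgres -Atc "select pg_is_in_recovery();") || return 1
    if [ "$state" = "t" ]; then
        $PSQL -h "$host" -p "$port" -d postgres -Atc "select pg_wal_replay_resume();" >/dev/null || return 1
        echo "resumed $host:$port"
    else
        echo "not in recovery $host:$port"
    fi
}
```

Calling the function once per master node gives a short log of which nodes were resumed and which were already out of recovery.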

Caution:

• After recovery, the data on the slave nodes is no longer consistent with the data on the master nodes, so the slave nodes must be re-added.

• If a master node was switched while the cluster was running, regenerate the barman configuration files so that they match the current structure of the cluster.
