【AntDB Installation and Deployment Documentation】Centralized High Availability Environment Construction
Centralized high availability environment construction
Build a one-master-two-standby standalone cluster
The standalone cluster built here is one master and two spares, one synchronous and one asynchronous. A total of three hosts are used, distributed as follows:
Mainframe | Node |
10.20.16.227 | Host Node |
10.20.16.228 | Synchronous Backup Node |
10.20.16.214 | Asynchronous Backup Node |
Build the master library
Step 1: Follow the steps in the previous "Standalone Installation" section to build the standalone master library.
Step 2: In order to build the standby library later, you need to modify the configuration of the master library.
Here the data directory of the master is/data/shan2/data/, please modify it according to the actual.
• Add replication access authentication in hba.
cat >> /data/shan2/data/pg_hba.conf << EOF
host all all 0.0.0.0/0 trust
host replication all 0.0.0.0/0 trust
EOF
• Modify the main library postgres.conf file.
cat >> /data/shan2/data/postgresql.conf << EOF
listen_addresses = '*'
wal_level = replica
hot_standby = on
EOF
Step 3: Restart the main repository to make the above changes take effect. If the master library is not started, start it.
pg_ctl restart -D /data/shan2/data
After the start, try to log in the database with psql (AntDB's command line client program) to see if the start is successful.
psql -p 55551 -d postgres
Build the backup database
Step 1: Install the RPM package, refer to the installation steps of standalone version, the same as the installation steps of main library.
Step 2: Configure environment variables, refer to the steps of standalone version, same as the steps of main library.
Step 3: Run on the host where the backup library is located.
pg_basebackup -D the actual data directory -Fp -R -Xs -v -P -h the primary database ip -p the port of the primary database -U username
**Note: **When building a standby library, you can also use the -C and -S parameters to create physical replication slots at the same time to ensure the synchronization of the master and standby. For more information on the use of replication slots, please refer to the later section: Using Replication Slots.
pg_basebackup -D the actual data directory -Fp -R -Xs -v -P -h the primary database ip -p the port of the primary database -U username -C -S the name of copy slot
For example:
The standby is on 10.20.16.228, then log on to this host and execute the following command.
pg_basebackup -D /data/shan2/data -Fp -R -Xs -v -P -h 10.10.16.227 -p 55551 -U adbpuq
• -p: port number of the master node
• -h: IP address of the master node
• -D: data directory of the backup node
(If an error is reported: error: could not connect to server: No route to host. Then check 1. the configuration file, 2. whether the firewall is cleared of rules and has been closed)
Step 4: Start the backup machine, start the same way as the master node
pg_ctl start -D /data/shan2/data
Step 5: Check whether the primary and backup stream replication is established successfully.
psql to the master library, execute a query, you can see the backup machine just built 'state' for streaming, if it is down or query is empty, it means that the build failed, please check the build steps.
select * from pg_stat_replication:
Through the above steps, you can build multiple backup libraries.
Modify the same asynchronous state
The default primary and backup stream replication is asynchronous. If you want to change the stream replication to synchronous, you need to modify some parameters.
Step 1: Configure the backup library application_name.
Modify the backup library postgresql.auto.conf file and add application_name to primary_conninfo.
For example, specify application_name=ds1.
primary_conninfo = 'application_name=ds1 user=shan2 passfile=''/data/shan2/.pgpass'' host=10.20.16.227 port=40571 sslmode=prefer sslcompression=0 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
Step 2: Modify the name of the primary synchronous_standby_names for the application_name configured on the standby.
synchronous_standby_names = '1 (ds1)'
Finally, restart the primary and backup libraries, and then log on to the primary library to check the stream replication status.
The following is the case where 2 backup libraries are built, one primary and one backup.
postgres=# select application_name , state, sync_state from pg_stat_replication;
application_name | state | sync_state
------------------+-----------+------------
ds2 | streaming | async
ds1 | streaming | sync
(2 rows)
Use replication slots (optional)
The advantage of using physical replication slots is that the primary instance keeps the pre-written log (WAL) file until all the slots required by the standby are confirmed to have received the specific segment. Only after this operation is completed does the primary library instance remove the corresponding WAL file.
Replication slots (physical replication slots) provide a convenient way to prevent stream replication connections from disconnecting when the standby can't keep up with the master. Here is the error reported in the logs when the asynchronous node fails to keep up with the master, and the master has removed the WAL file before the backup has had a chance to receive it.
requested WAL segment 000000010000001700000001 has already been removed
However, there is a drawback to using replication slots, as the master will keep the WAL file when the backup fails. If no intervention is made, the primary library may run out of disk space and crash completely.
Therefore, users have to decide whether to use replication slots according to the actual situation.
The following describes the methods of using physical replication slots for asynchronous nodes, and users can choose one of them.
• Use an existing replication slot
1. First create a replication slot on the master node.
SELECT * FROM pg_create_physical_replication_slot('ds1');
2、Modify the asynchronous backup postgresql.auto.conf file and add the following configuration.
primary_slot_name = 'ds1'
3、Restart the backup machine.
• Automatically create a replication slot when pg_basebackup.
-C means create automatically, -S specifies the name of the replication slot.
Note that the name of the replication slot cannot be the same as the name of the replication slot that already exists on the host, otherwise an error is reported.
pg_basebackup -D the actual data directory -Fp -R -Xs -v -P -h the primary database ip -p the port of the primary database -U username -C -S the name of copy slot
For example:
pg_basebackup -D /data/shan2/data -Fp -R -Xs -v -P -h 10.10.16.227 -p 55551 -U adbpuq -C -S ds1
At this point, check the status of the replication slot on the master node as follows. ds1 is the parameter specified by -S above, or created by the pg_create_physical_replication_slot function.
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog
_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------
------+-------------+---------------------
ds1 | | physical | | | f | t | 13735 | |
| 2/60 |
(1 row)
Check the processes in the database and find that the process with active_pid of 13735 is the asynchronous stream replication walsender process
12847 ? Ss 0:00 /data/shan/pgsql_adb/bin/postgres -D /data/shan/pgdata_adb
12852 ? Ss 0:00 postgres: logger
12858 ? Ss 0:00 postgres: checkpointer
12859 ? Ss 0:00 postgres: background writer
12860 ? Ss 0:00 postgres: walwriter
12861 ? Ss 0:00 postgres: autovacuum launcher
12862 ? Ss 0:00 postgres: stats collector
12863 ? Ss 0:00 postgres: logical replication launcher
13102 ? Ss 0:00 postgres: walsender shan 10.21.10.220(50670) streaming 1/E6C00000
13735 ? Ss 0:00 postgres: walsender shan 10.21.10.221(55126) streaming 2/60
Create replicator user
Log in to the primary repository and create the replicator user.
psql –hthe primary database ip -p the port of the primary database –c " create role replicator with login replication password 'antdb'"
For example:
psql -h 10.19.32.121 -p 55551 -d postgres -c"create role replicator with login replication password 'antdb'"
Shut down SELINUX
Modify the /etc/selinux/config file to set the status to disabled.
For example, replace enforcing with disabled using the following command.
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
Or
sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/selinux/config
Check if SELINUX is disabled
/usr/sbin/sestatus –v # The expected result is: SELinux status: disabled