Postgres Installation and Configuration Steps for a Single Node:
1. If you are installing Postgres on a RHEL 8+ OS, first disable the Red Hat
subscription-manager yum plugin:
vi /etc/yum/pluginconf.d/subscription-manager.conf
Set:
enabled=0
Save the file and quit (:wq!).
2. Add the PostgreSQL repository to download and install Postgres:
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
3. Disable the built-in postgresql AppStream module:
sudo dnf -qy module disable postgresql
4. Install the PostgreSQL server (pick one version; the examples below use 16):
sudo dnf install -y postgresql16-server postgresql16
(for version 17: sudo dnf install -y postgresql17-server postgresql17)
5. Initialize the PostgreSQL database:
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
(for version 17: sudo /usr/pgsql-17/bin/postgresql-17-setup initdb)
6. Start and enable PostgreSQL:
sudo systemctl enable --now postgresql-16
(for version 17: sudo systemctl enable --now postgresql-17)
7. Set the postgres admin user's password:
sudo su - postgres
psql -c "alter user postgres with password 'StrongDBPassword'"
exit
Example:
psql -c "alter user postgres with password 'User@123'"
8. Configure the Postgres server to listen on all interfaces:
vi /var/lib/pgsql/16/data/postgresql.conf
(for version 17: vi /var/lib/pgsql/17/data/postgresql.conf)
In vi, press Esc, type 59, then press Shift+G to jump to line 59
(the listen_addresses line), and set the following values:
listen_addresses = '*'
max_connections = 5000
9. Configure the server to accept remote connections:
vi /var/lib/pgsql/16/data/pg_hba.conf
Go to the last line of the file and add the following entries:
host all all 0.0.0.0/0 md5
host all all ::/0 md5
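The pg_hba.conf edit above can also be scripted. This is a minimal sketch, not the literal runbook step: the PGDATA variable is an assumption (on this setup it would be /var/lib/pgsql/16/data), and a scratch directory is used so the sketch is safe to run as-is.

```shell
# Sketch only: append the remote-access rules to a pg_hba.conf copy.
# PGDATA defaults to a scratch directory here; point it at the real
# data directory (e.g. /var/lib/pgsql/16/data) to apply for real.
PGDATA="${PGDATA:-$(mktemp -d)}"
HBA="$PGDATA/pg_hba.conf"
touch "$HBA"
# Append the two entries from the step above (md5 auth from any address).
cat >> "$HBA" <<'EOF'
host    all    all    0.0.0.0/0    md5
host    all    all    ::/0         md5
EOF
grep '^host' "$HBA"
# After editing the real file, reload: systemctl reload postgresql-16
```

Appending is idempotent enough for a one-time setup; for repeated runs you would first grep for the entry to avoid duplicates.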
10. Restart the postgres server
systemctl restart postgresql-16
For a custom database (skip this if you are using the default postgres user):
11. Create the database user:
su - postgres
psql
(enter the postgres user's password if prompted)
postgres=# create database <database_name>;
postgres=# create user <database_user>;
postgres=# alter user <database_user> with encrypted password '<database_password>';
postgres=# grant all privileges on database <database_name> to <database_user>;
postgres=# GRANT ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject TO <database_user>;
postgres=# GRANT ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject_metadata TO <database_user>;
Example:
[root@localhost /]# su - postgres
[postgres@localhost ~]$ psql
postgres=# create database pgdse;
CREATE DATABASE
postgres=# create user dsedbusr;
CREATE ROLE
postgres=# alter user dsedbusr with encrypted password 'User@123';
ALTER ROLE
postgres=# grant all privileges on database pgdse to dsedbusr;
GRANT
postgres=# GRANT ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject TO dsedbusr;
GRANT
postgres=# GRANT ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject_metadata TO dsedbusr;
GRANT
After the database and user are created, you can use these details in the
dse.ini configuration.
12. If you need to drop the user and database for any reason, use the commands
below:
su - postgres
psql
SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = '<database_name>';
revoke all privileges on database <database_name> from <database_user>;
revoke ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject from <database_user>;
revoke ALL PRIVILEGES ON TABLE pg_catalog.pg_largeobject_metadata from <database_user>;
drop database <database_name>;
drop role <database_user>;
13. Important info:
Command to check the current connection's database, user, and port number:
su - postgres
psql
postgres=# \conninfo
Output will be similar to the following:
You are connected to database "postgres" as user "postgres" via socket in "/var/run/postgresql" at port "5432".
Note: the database name is the schema name in IX.
14. Command to connect to another database:
postgres=# \c databasename
ex: postgres=# \c pgdse
15. Command to check the schema name:
Connect to the desired database.
ex: postgres=# \c pgdse
Now run the \dn command.
ex: pgdse=# \dn
16. Do not forget to permanently add the Postgres port number to the firewall:
firewall-cmd --add-port=5432/tcp --permanent
firewall-cmd --reload
17. Command to look up a port number in the services database:
less /etc/services
/portnumber
ex: /5432
18. Command to check port status:
netstat -na | grep "portnumber"
ex: netstat -na | grep "5432"
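The port checks above can be wrapped into one small script. A sketch, with the caveat that it uses ss (the modern replacement for netstat on RHEL) and that the default port value is a placeholder:

```shell
# Sketch: report whether a given TCP port (default 5432) has a listener.
port="${1:-5432}"
if ss -ltn 2>/dev/null | grep -q ":${port}\b"; then
  echo "port ${port}: LISTENING"
else
  echo "port ${port}: not listening"
fi
```

Run it with no argument to check 5432, or pass another port number as the first argument.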
For PostgreSQL 16, setting up a 2-node cluster with streaming replication involves
configuring one node as the primary and the other as the standby. Below are the
detailed steps:
Step 1: Install PostgreSQL 16 on Both Nodes
Ensure PostgreSQL 16 is installed on both the primary and standby servers. Follow
the steps below to install PostgreSQL 16 on each node.
1. Add PostgreSQL 16 repository:
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
2. Disable the default PostgreSQL module:
sudo dnf -qy module disable postgresql
3. Install PostgreSQL 16:
sudo dnf install -y postgresql16-server postgresql16
4. Initialize the PostgreSQL database:
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
5. Start and enable PostgreSQL:
sudo systemctl enable --now postgresql-16
Step 2: Configure the Primary Node
1. Edit postgresql.conf:
Open the PostgreSQL configuration file:
sudo vi /var/lib/pgsql/16/data/postgresql.conf
Modify the following parameters:
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
hot_standby = on
wal_keep_size = 64MB
max_connections = 5000
Explanation:
listen_addresses = '*'
Meaning: This setting determines which IP addresses PostgreSQL will listen on
for incoming connections. The wildcard * means that PostgreSQL will listen on all
available IP addresses.
Use: This is often used in configurations where you want PostgreSQL to accept
connections from any network interface, which is useful for replication setups or
when PostgreSQL is accessed remotely.
wal_level = replica
Meaning: This setting controls the amount of information written to the Write-
Ahead Log (WAL). Setting it to replica ensures that enough information is logged to
support replication.
Use: For replication purposes, you need the WAL to contain enough information
so that standby servers (replicas) can stay in sync with the primary server.
replica is a common setting for this, as it provides sufficient data for streaming
replication.
max_wal_senders = 10
Meaning: This setting specifies the maximum number of concurrent connections
from WAL sender processes. These processes are responsible for sending WAL data to
replica servers.
Use: This setting is crucial in replication environments. If you have multiple
replicas, you'll need to set max_wal_senders to a value that accommodates all the
connections required by your replicas. Setting it to 10 means up to 10 replicas can
connect simultaneously for streaming replication.
hot_standby = on
Meaning: This parameter enables or disables the ability for standby servers to
accept read-only queries while they're in recovery mode.
Use: In a replication setup, this allows standby servers to be used for read
operations (e.g., reporting or analytics) while still applying changes from the
primary server. This is useful for balancing load and improving performance in
read-heavy environments.
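The settings above can also be applied non-interactively instead of editing line by line in vi. A sketch, writing to a scratch copy (the real file is /var/lib/pgsql/16/data/postgresql.conf); it relies on the fact that when a parameter appears more than once, PostgreSQL uses the last occurrence, so appending is a common way to set values:

```shell
# Sketch: append the replication settings to a scratch postgresql.conf copy.
# Point CONF at the real file to apply for real, then restart the service.
CONF="$(mktemp)"
cat >> "$CONF" <<'EOF'
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
hot_standby = on
wal_keep_size = 64MB
max_connections = 5000
EOF
# Verify the appended values are present.
grep -E '^(wal_level|max_wal_senders|hot_standby)' "$CONF"
```

After applying to the real file, restart PostgreSQL as in step 4 below for the changes to take effect.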
2. Edit pg_hba.conf:
Add the IP address of the standby server to allow replication. Open the file:
sudo vi /var/lib/pgsql/16/data/pg_hba.conf
Add the following line:
host replication all standby_ip/32 md5
host all all 0.0.0.0/0 md5
host all all ::/0 md5
Example:
host replication all 10.215.7.82/32 md5
host all all 0.0.0.0/0 md5
host all all ::/0 md5
Replace standby_ip with the actual IP address of the standby server.
3. Create a replication user:
Create a user for replication with sufficient privileges:
sudo -u postgres psql
CREATE ROLE replicator WITH REPLICATION PASSWORD 'your_password' LOGIN;
Example:
CREATE ROLE replicator WITH REPLICATION PASSWORD 'User@123' LOGIN;
exit
Also set the postgres admin user's password:
sudo su - postgres
psql -c "alter user postgres with password 'StrongDBPassword'"
Example:
psql -c "alter user postgres with password 'User@123'"
exit
sudo systemctl reload postgresql-16
4. Restart PostgreSQL:
Restart the PostgreSQL service to apply the changes:
sudo systemctl restart postgresql-16
Step 3: Set Up the Standby Node
1. Stop PostgreSQL:
Stop the PostgreSQL service on the standby server:
sudo systemctl stop postgresql-16
2. Copy Data from Primary to Standby:
Use pg_basebackup to copy the data from the primary server to the standby
server (the target data directory must be empty, so move aside or remove the
contents of /var/lib/pgsql/16/data on the standby first):
sudo -u postgres pg_basebackup -h primary_ip -D /var/lib/pgsql/16/data -U replicator -P -v --write-recovery-conf
Replace primary_ip with the actual IP address of the primary server.
3. Edit recovery.conf (not needed on PostgreSQL 12+; skip this step):
PostgreSQL 16 no longer uses recovery.conf: the --write-recovery-conf option of
pg_basebackup creates standby.signal and writes the connection settings to
postgresql.auto.conf automatically. On older versions that still use
recovery.conf, you would create it manually in the data directory:
sudo vi /var/lib/pgsql/16/data/recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=primary_ip port=5432 user=replicator password=your_password'
trigger_file = '/tmp/postgresql.trigger'
If Postgres fails to restart after this configuration (PostgreSQL 12+ refuses
to start when a recovery.conf file is present), move the file aside as a
backup:
mv /var/lib/pgsql/16/data/recovery.conf /var/lib/pgsql/16/data/recovery.conf.bak
4. Start PostgreSQL:
Start the PostgreSQL service on the standby server:
sudo systemctl start postgresql-16
Step 4: Verify the Setup
1. Check replication status on the primary:
Log in to the primary server and check the replication status:
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
You should see the standby server listed here.
Example (output shown in expanded form for readability):
[root@sjs-rhel-93-76-pg1 ~]# sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
pid              | 67103
usesysid         | 21390
usename          | replicator
application_name | walreceiver
client_addr      | 10.215.7.77
client_port      | 44740
backend_start    | 2024-09-05 08:22:57.131145-05
state            | streaming
sent_lsn         | 0/1801C2C0
write_lsn        | 0/1801C2C0
flush_lsn        | 0/1801C2C0
replay_lsn       | 0/1801C2C0
sync_priority    | 0
sync_state       | async
reply_time       | 2024-09-05 08:32:30.255202-05
(1 row)
On the standby server:
[root@sjs-rhel-93-77-pg2 ~]# sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
It should return true:
pg_is_in_recovery
-------------------
t
(1 row)
2. Failover Test (Optional):
To test failover, stop the primary server:
sudo systemctl stop postgresql-16
Then, on the standby server, promote it to the primary (pg_ctl needs the data
directory; as the postgres user PGDATA is already set, otherwise pass -D):
sudo -u postgres /usr/pgsql-16/bin/pg_ctl promote -D /var/lib/pgsql/16/data
(if pg_ctl is not executable: chmod +x /usr/pgsql-16/bin/pg_ctl, though this is
normally not required)
Example:
[root@poc-database2 bin]# su - postgres
Last login: Tue Sep 10 16:38:32 +07 2024 on pts/0
[postgres@poc-database2 ~]$ /usr/pgsql-16/bin/pg_ctl promote
waiting for server to promote.... done
server promoted
[postgres@poc-database2 ~]$
PG log check:
Log file location:
/var/lib/pgsql/16/data/log
Check the most recent log:
ls -ltr
cat postgresql-<>.log
or
tail -f postgresql-<>.log
If PostgreSQL is failing to start, the following command shows detailed log
entries that indicate what might be causing the problem, such as configuration
errors, permission issues, or other service-related problems:
journalctl -xeu postgresql-16.service
journalctl: This is a utility to query and display messages from the journal, which
is the central logging system in systemd-based Linux distributions.
-x: This option provides extra explanation and details for log messages, which can
help in understanding the context of the logs.
-e: This option jumps to the end of the journal log output, showing the most recent
log entries. This is useful for quickly viewing the latest messages.
-u postgresql-16.service: This specifies the unit (service) for which you want to
view the logs. In this case, it filters the logs to show only those related to the
PostgreSQL 16 service.
Some important notes:
/var/lib/pgsql/<version>/data/postgresql.conf
max_connections
The max_connections setting in PostgreSQL defines the maximum number of concurrent
connections allowed to the database. Each connection consumes a certain amount of
memory, and the total memory usage will depend on various factors like shared
memory, work memory, and maintenance work memory.
Memory Calculation for max_connections:
Shared Memory (shared_buffers):
PostgreSQL allocates shared memory for shared buffers, which is used for caching
data.
Example: If shared_buffers is set to 128MB, this memory is used regardless of the
number of connections.
Connection Overhead:
Each connection in PostgreSQL has an overhead, which generally includes backend
memory and process overhead. This can vary but is typically around 10MB per
connection.
This includes things like stack size, local variables, memory allocated for
queries, etc.
Work Memory (work_mem):
work_mem is the memory allocated for each operation like sorting and hashing within
a query. If a query uses multiple operations, it can use more memory.
Default work_mem is often set to 4MB.
Maintenance Work Memory (maintenance_work_mem):
This is used for maintenance operations like VACUUM, CREATE INDEX, etc. Default
size is 64MB.
Example Calculation for max_connections = 100:
Connection Overhead: 10MB per connection
Work Memory: 4MB per connection
So, the memory required per connection would be:
Connection Overhead: 10MB
Work Memory: 4MB
Total per connection: 10MB + 4MB = 14MB
For 100 connections:
Total Memory: 100 connections * 14MB = 1400MB (1.4GB)
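The arithmetic above can be sketched as a quick shell calculation; the 10MB and 4MB figures are the rough per-connection estimates from the text, not exact values:

```shell
# Rough per-connection memory estimate, using the approximate figures above.
conn_overhead_mb=10   # typical backend/process overhead per connection
work_mem_mb=4         # default work_mem for sort/hash operations
max_conn=100
per_conn_mb=$((conn_overhead_mb + work_mem_mb))
total_mb=$((per_conn_mb * max_conn))
# Prints: Per connection: 14MB; total for 100 connections: 1400MB
echo "Per connection: ${per_conn_mb}MB; total for ${max_conn} connections: ${total_mb}MB"
```

Change max_conn to re-estimate for other settings (e.g. 5000 connections would give roughly 70GB by the same rough math).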
To see the total number of active connections
SELECT count(*) FROM pg_stat_activity;
To see more details about each connection:
SELECT pid, usename, datname, state, client_addr, query FROM pg_stat_activity;
To see only active connections
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';
To see idle connections
SELECT count(*) FROM pg_stat_activity WHERE state = 'idle';
To check the number of active connections per database
SELECT datname, count(*) AS connections FROM pg_stat_activity GROUP BY datname;
To see the details of only the active connections:
SELECT pid, usename, datname, state, client_addr, query FROM pg_stat_activity WHERE state = 'active';
Postgres complete Uninstallation Process:
To uninstall PostgreSQL from RHEL 9.3, follow these steps:
1. Stop PostgreSQL Service
Before uninstalling PostgreSQL, ensure that the PostgreSQL service is stopped:
sudo systemctl stop postgresql-16
2. Remove PostgreSQL Packages
You can remove the PostgreSQL packages using dnf, the package manager for RHEL
(match the version you installed):
sudo dnf remove postgresql16-server postgresql16
This command will remove PostgreSQL and any related packages installed on your
system.
3. Remove PostgreSQL Data Directory
PostgreSQL's data directory is usually located in /var/lib/pgsql/. You should
remove this directory to delete all your database data:
sudo rm -rf /var/lib/pgsql/
4. Remove PostgreSQL Configuration Files
On RHEL, PostgreSQL's configuration files (postgresql.conf, pg_hba.conf) live
inside the data directory removed in the previous step. If a separate
/etc/postgresql/ directory exists (uncommon on RHEL), remove it as well:
sudo rm -rf /etc/postgresql/
5. Remove PostgreSQL Logs
On RHEL, server logs are kept under the data directory
(/var/lib/pgsql/<version>/data/log), which was removed above. If a separate
/var/log/postgresql/ directory exists, remove it:
sudo rm -rf /var/log/postgresql/
6. Remove PostgreSQL User and Group
If you want to remove the PostgreSQL system user and group, execute the following
commands:
sudo userdel postgres
sudo groupdel postgres
7. Verify Removal
You can verify that PostgreSQL has been completely removed by checking if the
PostgreSQL binaries or related files still exist:
which psql
If the psql command is not found, PostgreSQL has been successfully removed.
8. Restart the System (Optional)
Restarting the system can help ensure that all changes take effect:
sudo reboot
These steps should completely remove PostgreSQL from your RHEL 9.3 system.
A few more settings related to pooling:
Handling 10,000 user connections in PostgreSQL effectively requires a strategy that
leverages connection pooling. Even though max_connections is set to 10,000, it's
not practical or efficient to have all those connections active simultaneously
without pooling.
Explanation of the Configuration:
1. max_connections = 10,000:
- This allows up to 10,000 connections to PostgreSQL, but handling this many
active connections directly without pooling will strain the system.
2. shared_buffers = 8GB:
- PostgreSQL uses this memory area to cache data. 8 GB is a solid allocation,
but increasing it could be necessary if your workload is intensive.
3. work_mem = 4MB:
- This memory is allocated per connection for operations like sorting and
hashing. With 10,000 connections, it could lead to significant memory consumption
(4 MB * 10,000 = 40 GB), so consider scaling it based on your workload.
4. maintenance_work_mem = 1GB:
- This is used for maintenance tasks like VACUUM and CREATE INDEX. It's
reasonable but should be monitored during heavy maintenance operations.
5. effective_cache_size = 24GB:
- This hints to PostgreSQL how much OS-level cache is available. It's an
advisory setting that helps PostgreSQL's planner make decisions about query
execution.
6. max_worker_processes = 8:
- Controls parallelism in queries. This setting allows up to 8 worker processes
for parallel execution.
7. wal_buffers = 16MB:
- This is used for write-ahead logging, which ensures data durability. 16 MB is
a common setting that balances performance with resource usage.
8. max_prepared_transactions = 10,000:
- This allows up to 10,000 prepared transactions, which is useful if you use
two-phase commits.
Using Connection Pooling to Handle 10,000 Connections:
Even with a max_connections setting of 10,000, you can handle connections more
efficiently by using a connection pooler like PgBouncer or Pgpool-II. These tools
manage the connection load by maintaining a smaller number of active connections to
the database while serving a larger number of clients.
Here's how you can do it:
1. Set Up Connection Pooling:
- Install and configure PgBouncer or a similar tool on your application
server(s).
- Pool Mode: Transaction (recommended):
- This mode reuses connections after each transaction, making it suitable for
applications with a high connection churn rate.
2. Configure PgBouncer Settings:
- max_client_conn = 10,000:
- This allows PgBouncer to accept 10,000 client connections, which it will
then pool and share among fewer active connections.
- default_pool_size = 500-1000:
- This defines the number of active connections PgBouncer maintains with
PostgreSQL. For 10,000 connections, 500 to 1,000 should balance load effectively.
- min_pool_size = 50-100:
- This keeps a minimum number of connections ready to reduce connection setup
times.
- reserve_pool_size = 50-100:
- Provides additional capacity for connection spikes.
3. Reduce max_connections in PostgreSQL:
- Adjust max_connections to 500-1000:
- Since PgBouncer handles connection pooling, you don't need to maintain
10,000 direct connections to PostgreSQL. Lowering max_connections to around 500-
1,000 will conserve resources.
4. System and Resource Considerations:
- CPU: Ensure your server has sufficient cores (e.g., 16-32 cores) to handle
parallel queries and connections.
- RAM: With a connection pool, you'll need sufficient memory to handle both
active connections and background operations. Ensure you have at least 64-128 GB of
RAM, depending on your workload.
- Disk I/O: Fast SSDs or NVMe drives are crucial for handling high transaction
volumes.
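The PgBouncer values discussed above would translate into a pgbouncer.ini along these lines. A sketch only: the database entry, listen address, and userlist path are placeholders, not part of this runbook, and the real file typically lives at /etc/pgbouncer/pgbouncer.ini.

```shell
# Sketch: generate a pgbouncer.ini matching the pool sizes discussed above.
# Written to a scratch file here so the sketch is safe to run as-is.
INI="$(mktemp)"
cat > "$INI" <<'EOF'
[databases]
; placeholder database entry
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 1000
min_pool_size = 100
reserve_pool_size = 100
EOF
# Show the pooling-related settings for review.
grep -E '^(pool_mode|max_client_conn|default_pool_size|min_pool_size|reserve_pool_size)' "$INI"
```

With this in place, applications connect to port 6432 instead of 5432, and PostgreSQL's own max_connections can be lowered to 500-1000 as described in point 3 above.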
How Pooling Helps:
- Efficiency: Connection pooling reduces the overhead of setting up and tearing
down connections. Instead of 10,000 connections, your database might only need to
manage 500-1,000 active connections at any time.
- Idle Connection Management: PgBouncer can manage idle connections and avoid
overloading the database with unnecessary connections.
- Scalability: This setup allows your database to scale efficiently by maintaining
a small number of active connections while serving a large number of clients.
Example Scenario:
- Without pooling: If you tried to directly manage 10,000 connections with
PostgreSQL, you'd quickly run into resource limitations (CPU, RAM, context
switching) leading to performance degradation.
- With pooling: By using PgBouncer, 10,000 user connections can be managed with
just 500-1,000 active connections in PostgreSQL, significantly reducing resource
strain while maintaining high concurrency.
In summary, with a max_connections of 10,000 and using pooling, your PostgreSQL
server will be able to handle a large number of user connections without
overwhelming the system, provided you optimize settings and hardware accordingly.