SlideShare a Scribd company logo
Postgresql Database Architecture
Shared Memory Process Memory
Shared_Buffers
Wal_Buffers
Clog_Buffers
Memory_Locks
Maintenance_Work_Memory
Temp_Buffers
Work_Mem
Vacuum_Buffers
Logger
Postgres DB writer Check pointer
Stats collector
Wal Writer Archive Auto Vacuum
Data files WAL Files Temp Files Clog files Log files
WAL archives Other files
Physical Data Files
Process
Shared Buffers
t2
------
8k
i=20
t1
t3
t4
t5 t6 t7 t8
Pinned buffers
Dirty buffers
LRU
MRU
t2
By default size of the shared_buffers is 128 MB, We can assign 25% of
the physical memory to the Shared_Buffers.
Shared buffers allocating during the startup process using Shared_buffers parameter
values.
Dirty Buffers: These buffers contains modified data, those modified buffers called as dirty
shared buffers.
Pinned Buffers: The buffers which are access by the transactions are called as pinned
buffers
Algorithms: Entire Buffer management is working internally using LRU & MRU
algorithms
If you want to increase the Shared_buffers size, we need to restart the Database to effect
the new values.
Dirty Read: A transaction reads data written by a concurrent uncommitted transaction.
Non-repeatable read: A transaction re reads the data previously read and finds the data
changed by another transaction which has committed since the initial read.
Phantom read: A transaction re executes a query returning a set of rows satisfying a
search condition and finds that the set of rows satisfying the condition has changed
because another recently committed transaction.
WAL Buffers
To ensure that no data has been lost by server failures, PostgreSQL supports
the WAL mechanism. WAL data (also referred to as XLOG records) are
transaction log in PostgreSQL; and WAL buffer is a buffering area of the
WAL data before writing to a persistent storage.
Default value of the WAL_Buffers is 4 MB
WAL_BUFFERS allocate during the startup of the cluster used by the
values of WAL_Buffers in configuration file.
Changes of this parameter needs restart the cluster to effect the new values.
CLOG Buffers
• CLOG stands for "commit log", and the CLOG
buffers is an area in operating system RAM
dedicated to hold commit log pages. The commit log
pages contain log of transaction metadata and differ
from the WAL data. The commit logs have commit
status of all transactions and indicate whether or
not a transaction has been completed (committed).
• There is no specific parameter to control this area of
memory. This is automatically managed by the
database engine in tiny amounts. This is a shared
memory component, which is accessible to all the
background server and user processes of a
PostgreSQL database.
Memory_Locks
• This memory component is to store all
heavyweight locks used by the PostgreSQL
instance. These locks are shared across all the
background server and user processes connecting
to the database.
• A non-default larger setting of two database
parameters
namely max_locks_per_transaction and max_pr
ed_locks_per_transaction in a way influences the
size of this memory component.
• max_locks_per_transaction (integer) The shared lock
table tracks locks on max_locks_per_transaction *
(max_connections + max_prepared_transactions)
objects (e.g., tables); hence, no more than this many
distinct objects can be locked at any one time. This
parameter controls the average number of object locks
allocated for each transaction; individual transactions
can lock more objects as long as the locks of all
transactions fit in the lock table. This is not the number
of rows that can be locked; that value is unlimited. The
default, 64, has historically proven sufficient, but you
might need to raise this value if you have clients that
touch many different tables in a single transaction. This
parameter can only be set at server start.
• PREPARE TRANSACTION prepares the current
transaction for two-phase commit. After this command, the
transaction is no longer associated with the current
session; instead, its state is fully stored on disk, and there
is a very high probability that it can be committed
successfully, even if a database crash occurs before the
commit is requested.
• Once prepared, a transaction can later be committed or
rolled back with COMMIT PREPARED or ROLLBACK
PREPARED, respectively. Those commands can be issued
from any session, not only the one that executed the
original transaction.
• From the point of view of the issuing session, PREPARE
TRANSACTION is not unlike a ROLLBACK command:
after executing it, there is no active current transaction,
and the effects of the prepared transaction are no longer
visible.
Maintenance_work_mem
• This memory to be used by maintenance operations, such
as VACUUM, CREATE INDEX, and ALTER TABLE for ADDING FOREIGN
KEYS.
• This memory segment is allocated for user and is used for the
maintenance operations
• Parameter to maintain this memory is maintenance_work_mem,
default value is 64MB
• Ex:
postgres=# SET maintenance_work_mem TO '384MB ';
SET
postgres=# CREATE INDEX ON personen_vorname_nachname_idx
persons ( "last_name”);
CREATE INDEX
WORK_MEM
Work_mem is mainly used for expensive operations
like sort or hash operations like Joining, and
filtering of certain Data used
• ORDER BY, DISTINCT, and merge joins require
memory for sort operations.
hash joins, hash and hash-based processing
aggregations of IN operations require memory for
hash tables.
Bitmap index scans require memory for the
internal bitmap.
• Parameter is work_mem, defaul value is 4MB
postgres=# SET work_mem TO '8 GB ';
SET
postgres=# SET trace_sort TO on;
SET
postgres=#SET client_min_messages TO DEBUG;
SET
postgres=# SELECT relname, relkind, relpages FROM pg_class c WHERE
relkind = 'r'
postgres=# relpages ORDER BY DESC, ASC relname
postgres=#LIMIT 1;
LOG: begin tuple sort: NKEYS = 2, Workmen = 8192, random access = f
LOG: starting performsort: CPU 0.00s/0.00u sec elapsed 0.00 sec
LOG: performsort done: CPU 0.00s/0.00u sec elapsed 0.00 sec
LOG: internal sort ended, 31 KB used: CPU 0.00s/0.00u sec elapsed
0.00 sec
VACUUM_BUFFER
• Starts the daemon which cleans up tables and indexes, preventing
bloat and poor response times.
• This is the maximum amount of memory used by each of the
autovacuum worker processes, and it is controlled by
the autovacuum_work_mem database parameter. The memory is
allocated from the operating system RAM and is also influenced by
the autovacuum_max_workers database parameter. The setting
of autovacuum_work_mem should be configured carefully
as autovacuum_max_workers times this memory will be allocated
from the RAM. All these parameter settings only come into play
when the auto vacuum daemon is enabled, otherwise, these
settings have no effect on the behavior of VACUUM when run in
other contexts. This memory component is not shared by any other
background server or user process.
temp_buffers
• A database may have one or more temporary
tables, and the data blocks (pages) of such
temporary tables need a separate allocation of
memory to be processed in. The temp buffers
serve this purpose by utilising a portion of RAM,
defined by the temp_buffers parameter. The
temp buffers are only used for access
to temporary tables in a user session. There is no
relation between temp buffers in memory and
the temporary files that are created under
the pgsql_tmp directory during
large sort and hash table operations.
Bg Writer Process: Default
• The background writer continues to flush dirty
pages/buffers to datafiles. This background
writer scans the shared buffer for dirty pages to
write down to the disk level.
• The default run times of BG writer process is
200 mill. Seconds and for every 100 LRU (least
recently used ) pages/buffers reached in shared
buffers.
• Parameters to handle the BG Writer:
-bgwriter_delay= 200 (milli seconds)
-bgwriter_lru_maxpages=100 (buffers)
Check Pointer Process: Default
This process is takes care of cluster checkpoint.
When a check point starts all the dirty pages in
memory are written to the datafiles.
A checkpoint is a known safe starting point for
recovery, since at that time we write all currently
outstanding database changes to disk.
Checkpoint process has two aspects: the preparation
of database recovery, and the cleaning of dirty
pages on the shared buffer pool.
Checkpoint also updates PG_CONTROL file. which
holds the metadata of the current checkpoint.
Parameters:
checkpoint_timeout = 5 min
checkpoint_completion_target = 0.5
pg_control File:
• As the pg_control file contains the fundamental information of the checkpoint, it
is certainly essential for database recovery. If it is broken or unreadable, the
recovery process cannot start up in order to not obtained a starting point.
• Even though pg_control file stores over 40 items,
• State – The state of database server at the time of the latest check pointing
starts.
• There are seven states in total:
 'start up' is the state that system is starting up;
 'shut down' is the state that system is going down normally by the shutdown
command;
 'in production' is the state that system is running;
……………. and so on.
• Latest checkpoint location – LSN Location of the latest checkpoint record.
• Prior checkpoint location – LSN Location of the prior checkpoint record.
• --------------
• ---------------
WAL Writer Process: default
• WAL writer is a background process to
check the WAL buffer periodically and
write all unwritten XLOG records into
the WAL segments.
• Parameter:
• Wal_writer_delay = 200 ms
• Auto vacuum Launcher Process: non- default
• PostgreSQL does not immediately remove the deleted tuples from
the data files. These are marked as deleted. Similarly, when a
record is updated, it's roughly equivalent to one delete and one
insert. The previous version of the record continues to be in the
data file. Each update of a database row generates a new version
of the row. The reason is simple: there can be active transactions,
which want to see the data as it was before. As a result of this
activity, there will be a lot of unusable space in the data files.
After some time, these dead records become irrelevant as there
are no transactions still around to see the old data.
• To recover or reuse disk space occupied by updated or deleted
rows.
• To update data statistics used by the PostgreSQL query planner.
• To protect against loss of very old data due to transaction ID
wraparound.
statistics collector: Default
• statistics collector is a subsystem that supports collection and reporting of
information about server activity. Presently, the collector can count accesses to
tables and indexes in both disk-block and individual-row terms. It also tracks the
total number of rows in each table, and information about vacuum and analyze
actions for each table
• The statistics collector transmits the collected information to other PostgreSQL
processes through temporary files. These files are stored in the directory named
by the stats_temp_directory parameter, pg_stat_tmp by default.
• Echo $PG_STAT_TMP -- linux
• Echo %pg_stat_tmp% -- windows
• The parameter track_activities enables monitoring of the current command being
executed by any server process.
• The parameter track_counts controls whether statistics are collected about table
and index accesses.
• The parameter track_functions enables tracking of usage of user-defined
functions.
• The parameter track_io_timing enables monitoring of block read and write times.
postgres server process
• A postgres server process is a parent of
all processes related to a database
cluster management.
• This process is the database backend
process for each established connection.
• It handles the values of connected
database, username, host and
transaction information
• One to one :
PG_HBA.CONF
• TYPE :
Host- Type matches either an SSL or plain
connection.
Hostssl- only SSL connection
Hostnossl- only plain text
• Database & User :
Database and user columns are used to match
specific databases or users from the incoming
connection. Accept everything using * .
• Address : Is used only if the type uses the tcp/ip
and can be an ip address with network mask or
a hostname. Both ipv4 and ipv6 are supported.
• Method : Authentication, Postgresql supports
many methods from the password challenge to
sophisticated radius or Kerberos's.
– Trust: Allow the connection without any request. Is
quite useful if the password lost.
– Peer: Allow the connections if the OS user matches
the database user. Useful to authenticate to the
database on the local boxes. It is default by
installation.
• Password: Allow the connection matching the user and
password with shadow system table. PG_SHADOW
• Md5: Same as the password, this method offers a md5
encryption for the passwords. As the md5 is
deterministic a pseudo random subroutine is used
during the password challenge to avoid the same
string to be sent over the network.
• Crypt: The crypt method is similar to the password
method. When using crypt, the password is not sent in
clear text, but through a simple form of encryption.
The use of this method is not very secure, but is better
than using the clear text password method.
• krb4, krb5 : The krb4 and krb5 methods are used to
specify Version 4 or 5 of the Kerberos authentication
system. The installation and configuration of
Kerberos is beyond the scope of this book, but if you
wish to authenticate via Kerberos, these methods
are available.
• Ident: The ident method specifies that an ident map
should be used when a host is requesting
connections from a valid IP address listed in the
pg_hba.conf file. This method requires one option.
• - When the connection request matches the
pg_hba.conf and the authentication method is
cleared.
• Reject: Reject the connections with the
matching parameters. You should
typically use this to restrict access from
specific hosts or insecure connections.
• Vacuum_freeze_table_age
• Autovacuum_freeze_max_age
DATA
BASE
Pg_tblspc
Postgres.auto.conf
BASE
• base directory contains the database files. Each database
have a dedicated sub-directories-named after the internal
database's object id. A freshly initialized data directory
shows only three subdirectories in the base folder.
• Those corresponds to the two template databases,
template0 and template1, plus the postgres database.
• The numerical directories contains various files, also with
the numerical name which are actually the database's
relations, tables and indices.
• The relation's name is set initially from the relation's object
id. Any file altering operation like VACUUM FULL or
REINDEX, will generate a new file with a deferent name.
To find out the real relation's file name the relfilenode
inside the pg_class sub-system table must be queried.
Global:
The global directory contains all the
cluster wide relations. In addition
there's the very critical control file
Pg_xlog
• This is the most important and critical directory, for
the performances and for the reliability.
• The directory contains the transaction's logs, named
wal file. Each le is usually 16 Mb and contains all
the data blocks changed during the database
activity. The blocks are written first on this not
volatile area to ensure the cluster's recovery in case
of crash. The data blocks are then written later to
the corresponding data files.
• If the cluster's shutdown is not clean then the wal
files are replayed during the startup process from
the last known consistent location read from control
file.
• In order to ensure good performance this location
should stay on a dedicated device.
pg clog
• This directory contains the committed
transactions in small 8k les, except for the
serializable transactions.
• The the files are managed by the cluster and the
amount is related with the two GUC parameters
autovacuum_freeze_max_age and
vacuum_freeze_table_age.
• Increasing the values for the two parameters
the pg_clog must store the commit status to the
“event horizon" of the oldest frozen transaction
id.
Pg_serial:
Same as pg clog this directory stores the
information's about the committed
transactions in serializable transaction
isolation level.
Pg_multixact:
Stores the information's about the multi
transaction status, used generally for the
row share locks.
Pg_subtrans:
Stores the sub transactions status data.
Pg_notify:
Stores information's about the LISTEN/NOTIFY
operations.
Pg_stat:
This directory contains the permanent les for
the statistic subsystem.
Pg_twophase:
Stores the two phase commit data. The two
phase commit allows the transaction opening
independently from the session. This way even a
different session can commit or rollback the
transaction later.
Pg _tblspc:
• The directory contains the symbolic
links to the tablespace locations. A
tablespace is a logical name pointing a
physical location.
• As from PostgreSQL 9.2 the location is
read directly from the symbolic link.
• This make possible to change the
tablespace's position simply stopping the
cluster, moving the data file in the new
location, creating the new symbolic link
and starting the cluster.
Pg_snapshots:
• This directory is used to store the exported
snapshots. From the version 9.2
PostgreSQL offers the transaction's
snapshot export where one session can
open a transaction and export a consistent
snapshot.
• This way different session can access the
snapshot and read all together the same
consistent data snapshot.
• This feature is used, for example, by pg
dump for the parallel export.
Pg_ident.conf
• The content of the pg_ident.conf associates identifying usernames with PostgreSQL
usernames via definitions called ident maps . This is useful for users whose system
usernames do not match their PostgreSQL usernames. Some rules you should keep in
mind when defining and using an ident map are:
• Each ident map member is defined on a single line, which associates a map name with
an identifying username, and a translated PostgreSQL username.
• The pg_ident.conf file can contain multiple map names. Each group of single lines with
the same associative map name are considered a single map.
• A single line record to define an ident map consist of 3 tokens: the name of the map, the
identifying username, and the translated PostgreSQL username. This syntax is
entered as follows, where each token is separated by spaces, or tabs:
• mapname, identname, postgresqlname
• mapname: The map name used in the pg_hba.conf file to refer to the ident map.
• identname: The identifying username, which is generally the name of the system user
attempting to establish a connection to the database. This is the name provided by the
identd daemon, which must be running on the system attempting to connect.
• postgresqlname:
PG_CTL
• pg_ctl is a utility for initializing a
PostgreSQL database cluster, starting,
stopping, or restarting the PostgreSQL
database server (postgres), or displaying
the status of a running server.
PG_CTL
• pg_ctl can be used to stop a database cluster
• pg_ctl supports three modes of shutdown
− smart quit after all clients have disconnected
−Fast (default) quit directly, with proper
shutdown (default)
− immediate quit without complete shutdown; will
lead to recovery
Syntax:
• − pg_ctl stop [-W] [-t SECS] [-D DATADIR] [-s] [-
m SHUTDOWN-MODE]
• Restart: mode effectively executes a stop followed by
a start
• Reload: mode simply sends the postgres server
process a SIGHUP signal, causing it to reread its
configuration files (postgresql.conf, pg_hba.conf,
etc.). This allows changing configuration-file options
that do not require a full server restart to take
effect.
• Status: mode checks whether a server is running in
the specified data directory. If it is, the
server's PID and the command line options that
were used to invoke it are displayed. If the server is
not running, pg_ctl returns an exit status of 3. If an
accessible data directory is not
specified, pg_ctl returns an exit status of 4.
postgresql.conf
• There are many configuration parameters that
effect the behavior of the database system.
• All parameter names are case-insensitive.
• Every parameter takes a value of one of five types:
− boolean
− integer
− floating point
− string
− enum
• One way to set these parameters is to edit the file
postgresql.conf, which is normally kept in the data
directory.
• Holds parameters used by a cluster
• Parameters are case-insensitive
• Normally stored in data directory
• initdb installs default copy
• Some parameters only take effect on server
restart (pg_ctl restart)
• # used for comments
• One parameter per line
• Use include directive to read and process
another file
• Can also be set using command-line option
• Some parameters can be changed per
session using the SET command.
• Some parameters can be changed at the
user level using ALTER USER.
• Some parameters can be changed at the
database level using ALTER
DATABASE.
• The SHOW command can be used to see
settings.
• The pg_settings and pg_file_settings
catalog table lists settings information.
• ALTER SYSTEM command:
− Available starting with PostgreSQL 9.4
− Edits cluster settings without editing
postgresql.conf
− Writes the setting to a file called
postgresql.auto.conf
• Example:
− ALTER SYSTEM SET work_mem=20480;
• postgresql.auto.conf is always read last
during server reload/restarts
• Reset a parameter change -
− ALTER SYSTEM SET work_mem =
DEFAULT;
Postgres.conf : Connection settings
listen _addresses (default localhost) - Specifies the
addresses on which the server is to listen for
connections. Use * for all.
• port (default 5432) - The port the server listens on.
• max_connections (default 100) - Maximum number of
concurrent
connections the server can support.
• superuser_reserved_connections (default 3) - Number
of connection slots reserved for superusers.
• unix_socket_directory (default /tmp) - Directory to be
used for UNIX socket connections to the server.
• unix_socket_permissions (default 0777) - access
permissions of the Unix-domain socket.
Security and Authentication Settings
authentication_timeout (default: 1 minute): Maximum
time to complete client authentication, in seconds.
• ssl (default: off) - Enables SSL connections.
• ssl_ca_file - Specifies the name of the file containing the
SSL server certificate authority (CA).
• ssl_cert_file - Specifies the name of the file containing
the SSL server certificate.
• ssl_key_file - Specifies the name of the file containing
the SSL server private key.
• ssl_ciphers - List of SSL ciphers that may be used for
secure connections.
Memory Settings
• shared_buffers (default: 128MB) - Size of PostgreSQL shared
buffer pool for a cluster.
• temp_buffers (default: 8MB) - Amount of memory used by each
backend for caching temporary table data.
• work_mem (default: 4MB) - Amount of memory used for each sort
or hash operation before switching to temporary disk files.
• maintenance_work_mem (default: 64MB) - Amount of memory
used for each index build or VACUUM.
• temp_file_limit (default: 1) - amount of disk space that a session
can use for temporary files. A transaction attempting to exceed
this limit will be cancelled. Default is unlimited.
Query Planner Settings
random_page_cost (default 4.0) - Estimated cost of a random
page fetch, in abstract cost units. May need to be reduced to
account for caching effects.
• seq_page_cost (default 1.0) - Estimated cost of a sequential
page fetch, in abstract cost units. May need to be reduced to
account for caching effects.
Must always set random_page_cost >= seq_page_cost.
• effective_cache_size (default 4GB) - Used to estimate the cost of
an index scan. Rule of thumb is 75% of system memory.
There are plenty of enable_* parameters which influence the
planner in choosing an optimal plan.
Write Ahead Log Settings
• wal_level (default: minimal) - Determines how much information is
written to the WAL. Change this to enable replication. Other values are
archive, logical and hot_standby.
• fsync (default on) - Turn this off to make your database much faster – and
silently cause arbitrary corruption in case of a system crash.
• wal_buffers (default: -1, autotune) - The amount of memory used in
shared memory for WAL data. The default setting of -1 selects a size
equal to 1/32nd (about 3%) of shared_buffers.
• min_wal_size (default 80 MB) – The WAL size to start recycling the WAL
files.
• max_wal_size (default 1GB) – The WAL size to start checkpoint. Controls
the number of WAL Segments(16MB each) after which checkpoint is
forced.
• checkpoint_timeout (default 5 minutes) - Maximum time between
checkpoints.
• wal_compression (default off) – The WAL of Full Page write will be
compressed and written.
Where To Log
• log_destination - Valid values are combinations of stderr, csvlog, syslog,
and eventlog, depending on platform.
• logging_collector - Enables advanced logging features. csvlog requires
logging_collector.
‒ log_directory - Directory where log files are written. Requires
logging_collector.
‒ log_filename - Format of log file name (e.g. postgresql-%Y-%M-%d.log). Allows
regular log rotation. Requires logging_collector.
‒ log_file_mode (default 0600) - On Unix systems this parameter sets the
permissions
for log files when logging_collector is enabled.
‒ log_rotation_age - Automatically rotate logs after this much time. Requires
logging_collector.
‒ log_rotation_size - Automatically rotate logs when they get this big. Requires
logging_collector.
When To Log
• client_min_messages (default NOTICE) - Messages
of this severity level or above are sent to the client.
• log_min_messages (default WARNING) - Messages
of this severity level or above are sent to the server.
• log_min_error_statement (default ERROR) - When a
message of this severity or higher is written to the
server log, the statement that caused it is logged
along with it.
• log_min_duration_statement (default -1, disabled) -
When a statement runs for at least this long, it is
written to the server log, with its duration.
What To Log
log_connections (default off) - Log successful connections to the server log.
• log_disconnections (default off) - Log some information each time a
session disconnects, including the duration of the session.
• log_error_verbosity (default “default”) - Can also select “terse” or
“verbose”.
• log_duration (default off) - Log duration of each statement.
• log_line_prefix - Additional details to log with each line.
• log_statement (default none) - Legal values are none, ddl, mod (DDL and
all
other data-modifying statements), or all.
• log_temp_files (default -1) - Log temporary files of this size or larger, in
kilobytes.
• log_checkpoints (default off) - Causes checkpoints and restart points to be
logged in the server log.
Background Writer Settings
bgwriter_delay (default 200 ms) - Specifies time
between activity rounds for the background writer.
• bgwriter_lru_maxpages (default 100) - Maximum
number of pages that the background writer may
clean per activity round.
• bgwriter_lru_multiplier (default 2.0) - Multiplier on
buffers scanned per round. By default, if system
thinks 10 pages will be needed,
it cleans 10 * bgwriter_lru_multiplier of 2.0 = 20.
• Primary tuning technique is to lower
bgwriter_delay.
• search_path - This parameter specifies the
order in which schemas are searched. The
default value for this parameter is "$user",
public.
• default_tablespace - Name of the
tablespace in which to objects are created
by default.
• temp_tablespaces - Tablespaces name(s) in
which temporary objects are created.
• statement_timeout - Postgres will abort
any statement that takes over the specified
number of milliseconds A value of zero (the
default) turns this off.
Autovacuum
autovacuum (default on) - Controls whether the
autovacuum launcher runs, and starts worker
processes to vacuum and analyze tables.
• log_autovacuum_min_duration (default -1) -
Autovacuum tasks running longer than this
duration are logged. Can now be specified per table.
• autovacuum_max_workers (default 3) - Maximum
number of autovacuum worker processes which may
be running at one time.
• autovacuum_work_mem (default -1, to use
maintenance_work_mem) -
Maximum amount of memory used by each
autovacuum worker
postmaster.pid
• PostgreSQL puts a file named postmaster.pid in the data
directory to store the process id of the PostgreSQL server
process. If PostgreSQL crashes, this file can contain an
old pid that confuses PostgreSQL. ... make sure that there
are no processes named 'postgres' or 'postmaster’.
• FATAL: lock file "postmaster.pid" already exists.
postmaster.opts
• If this file exists in the data directory, pg_ctl (in restart
mode) will pass the contents of the file as options to
postgres, unless overridden by the -o option. The contents
of this file are also displayed in status mode.
Vacuum:
• Tuples that are deleted or obsolete by an update are not
physically removed from their table; they remain present until
a VACUUM is done.
• PostgreSQL databases require periodic maintenance known
as vacuuming.
• PostgreSQL's VACUUM command has to process each table on a
regular basis for several reasons:
 To recover or reuse disk space occupied by updated or deleted
rows.
 To update data statistics used by the PostgreSQL query planner.
 To update the visibility map, which speeds up index-only scans.
 To protect against loss of very old data due to transaction ID
wraparound or multixact ID wraparound.
Standard VACUUM and VACUUM FULL
VACUUM FULL: can reclaim more disk space,
reclaims storage occupied by dead tuples but
runs much more slowly.
Full vacuum applies exclusive lock on tables and it
doesn’t allow new connections on table until
completed. This method also requires extra disk
space, since it writes a new copy of the table and
doesn't release the old copy until the operation is
complete.
Therefore it's necessary to do VACUUM periodically,
especially on frequently-updated tables.
• vacuum_cost_delay (integer)The length of time, in milliseconds, that the
process will sleep when the cost limit has been exceeded. The default
value is zero, which disables the cost-based vacuum delay feature.
Positive values enable cost-based vacuuming. Note that on many
systems, the effective resolution of sleep delays is 10 milliseconds;
setting vacuum_cost_delay to a value that is not a multiple of 10 might
have the same results as setting it to the next higher multiple of 10.
• When using cost-based vacuuming, appropriate values
for vacuum_cost_delay are usually quite small, perhaps 10 or 20
milliseconds. Adjusting vacuum's resource consumption is best done by
changing the other vacuum cost parameters.
• vacuum_cost_page_hit (integer)The estimated cost for vacuuming a
buffer found in the shared buffer cache. It represents the cost to lock the
buffer pool, lookup the shared hash table and scan the content of the
page. The default value is one.
• vacuum_cost_page_miss (integer)The estimated cost for vacuuming a
buffer that has to be read from disk. This represents the effort to lock
the buffer pool, lookup the shared hash table, read the desired block in
from the disk and scan its content. The default value is 10.
• vacuum_cost_page_dirty (integer)The estimated cost charged when
vacuum modifies a block that was previously clean. It represents the
extra I/O required to flush the dirty block out to disk again. The default
value is 20.
• vacuum_cost_limit (integer)The accumulated cost that will cause the
vacuuming process to sleep. The default value is 200.
• Free Space Map (FSM)
To keep track of available space in the
relation.
It stores all free space related information
in an alongside primary relation and
that relation starts with the file node
number plus the suffix _fsm.
The VACUUM process also updates the
Free Space Map
• Visibility Map (VM)
To keep track of which pages contain only
tuples that are known to be visible to all
active transactions.
All tuples on the page are known to be
visible to all transactions, this means
that the page does not contain any
tuples that need to be vacuumed
515689311-Postgresql-DBA-Architecture.pptx

More Related Content

PDF
Deep dive into PostgreSQL statistics.
Alexey Lesovsky
 
PDF
PostgreSQL WAL for DBAs
PGConf APAC
 
PDF
PostgreSQL and RAM usage
Alexey Bashtanov
 
PDF
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
PPTX
PostGreSQL Performance Tuning
Maven Logix
 
PDF
PostgreSQL Deep Internal
EXEM
 
PDF
PostgreSQL Replication High Availability Methods
Mydbops
 
PDF
Mastering PostgreSQL Administration
EDB
 
Deep dive into PostgreSQL statistics.
Alexey Lesovsky
 
PostgreSQL WAL for DBAs
PGConf APAC
 
PostgreSQL and RAM usage
Alexey Bashtanov
 
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
PostGreSQL Performance Tuning
Maven Logix
 
PostgreSQL Deep Internal
EXEM
 
PostgreSQL Replication High Availability Methods
Mydbops
 
Mastering PostgreSQL Administration
EDB
 

What's hot (20)

DOCX
Oracle architecture
Soumya Das
 
PDF
Oracle_Multitenant_19c_-_All_About_Pluggable_D.pdf
SrirakshaSrinivasan2
 
PDF
MariaDB 마이그레이션 - 네오클로바
NeoClova
 
PDF
MariaDB 제품 소개
NeoClova
 
PDF
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
PPTX
Running MariaDB in multiple data centers
MariaDB plc
 
PDF
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
PostgreSQL-Consulting
 
PDF
The Full MySQL and MariaDB Parallel Replication Tutorial
Jean-François Gagné
 
PDF
Oracle Database Performance Tuning Concept
Chien Chung Shen
 
PDF
Planning for Disaster Recovery (DR) with Galera Cluster
Codership Oy - Creators of Galera Cluster
 
PDF
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Zohar Elkayam
 
PDF
[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정
PgDay.Seoul
 
PDF
Using all of the high availability options in MariaDB
MariaDB plc
 
PPTX
Maria DB Galera Cluster for High Availability
OSSCube
 
PDF
Understanding oracle rac internals part 2 - slides
Mohamed Farouk
 
PDF
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
PDF
MySQL-InnoDB
Mayank Prasad
 
PDF
MySQL Data Encryption at Rest
Mydbops
 
DOCX
Dba 3+ exp qus
krreddy21
 
PPTX
Achieving High Availability in PostgreSQL
Mydbops
 
Oracle architecture
Soumya Das
 
Oracle_Multitenant_19c_-_All_About_Pluggable_D.pdf
SrirakshaSrinivasan2
 
MariaDB 마이그레이션 - 네오클로바
NeoClova
 
MariaDB 제품 소개
NeoClova
 
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
Running MariaDB in multiple data centers
MariaDB plc
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
PostgreSQL-Consulting
 
The Full MySQL and MariaDB Parallel Replication Tutorial
Jean-François Gagné
 
Oracle Database Performance Tuning Concept
Chien Chung Shen
 
Planning for Disaster Recovery (DR) with Galera Cluster
Codership Oy - Creators of Galera Cluster
 
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Zohar Elkayam
 
[pgday.Seoul 2022] 서비스개편시 PostgreSQL 도입기 - 진소린 & 김태정
PgDay.Seoul
 
Using all of the high availability options in MariaDB
MariaDB plc
 
Maria DB Galera Cluster for High Availability
OSSCube
 
Understanding oracle rac internals part 2 - slides
Mohamed Farouk
 
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
MySQL-InnoDB
Mayank Prasad
 
MySQL Data Encryption at Rest
Mydbops
 
Dba 3+ exp qus
krreddy21
 
Achieving High Availability in PostgreSQL
Mydbops
 
Ad

Similar to 515689311-Postgresql-DBA-Architecture.pptx (20)

PPTX
Postgresql Database Administration Basic - Day1
PoguttuezhiniVP
 
PDF
The Essential postgresql.conf
Robert Treat
 
PDF
9.6_Course Material-Postgresql_002.pdf
sreedb2
 
PDF
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
pgdayrussia
 
PDF
PostgreSQL Prologue
Md. Golam Hossain
 
PDF
PostgreSQL High_Performance_Cheatsheet
Lucian Oprea
 
KEY
Grabbing the PostgreSQL Elephant by the Trunk
Harold Giménez
 
PPTX
Migrating To PostgreSQL
Grant Fritchey
 
PDF
Introduction to PostgreSQL for System Administrators
Jignesh Shah
 
PDF
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Severalnines
 
PDF
PostgreSQL 9.5 - Major Features
InMobi Technology
 
PPTX
Tales from production with postgreSQL at scale
Soumya Ranjan Subudhi
 
PDF
PostgreSQL 9.6 Performance-Scalability Improvements
PGConf APAC
 
PDF
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
Equnix Business Solutions
 
PDF
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Jignesh Shah
 
PDF
PostgreSQL Write-Ahead Log (Heikki Linnakangas)
Ontico
 
ODP
GUC Tutorial Package (9.0)
PostgreSQL Experts, Inc.
 
PPTX
How to be a Postgres DBA in a Pinch
ElizabethGarrettChri
 
PDF
a look at the postgresql engine
Federico Campoli
 
PPTX
PostgreSQL Terminology
Showmax Engineering
 
Postgresql Database Administration Basic - Day1
PoguttuezhiniVP
 
The Essential postgresql.conf
Robert Treat
 
9.6_Course Material-Postgresql_002.pdf
sreedb2
 
PG Day'14 Russia, PostgreSQL System Architecture, Heikki Linnakangas
pgdayrussia
 
PostgreSQL Prologue
Md. Golam Hossain
 
PostgreSQL High_Performance_Cheatsheet
Lucian Oprea
 
Grabbing the PostgreSQL Elephant by the Trunk
Harold Giménez
 
Migrating To PostgreSQL
Grant Fritchey
 
Introduction to PostgreSQL for System Administrators
Jignesh Shah
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Severalnines
 
PostgreSQL 9.5 - Major Features
InMobi Technology
 
Tales from production with postgreSQL at scale
Soumya Ranjan Subudhi
 
PostgreSQL 9.6 Performance-Scalability Improvements
PGConf APAC
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
Equnix Business Solutions
 
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Jignesh Shah
 
PostgreSQL Write-Ahead Log (Heikki Linnakangas)
Ontico
 
GUC Tutorial Package (9.0)
PostgreSQL Experts, Inc.
 
How to be a Postgres DBA in a Pinch
ElizabethGarrettChri
 
a look at the postgresql engine
Federico Campoli
 
PostgreSQL Terminology
Showmax Engineering
 
Ad

Recently uploaded (20)

DOCX
Unit 5: Speech-language and swallowing disorders
JELLA VISHNU DURGA PRASAD
 
PPTX
A Smarter Way to Think About Choosing a College
Cyndy McDonald
 
PDF
Review of Related Literature & Studies.pdf
Thelma Villaflores
 
PPTX
CONCEPT OF CHILD CARE. pptx
AneetaSharma15
 
PPTX
HEALTH CARE DELIVERY SYSTEM - UNIT 2 - GNM 3RD YEAR.pptx
Priyanshu Anand
 
PDF
Biological Classification Class 11th NCERT CBSE NEET.pdf
NehaRohtagi1
 
PPTX
Command Palatte in Odoo 18.1 Spreadsheet - Odoo Slides
Celine George
 
PPTX
Five Point Someone – Chetan Bhagat | Book Summary & Analysis by Bhupesh Kushwaha
Bhupesh Kushwaha
 
PDF
BÀI TẬP TEST BỔ TRỢ THEO TỪNG CHỦ ĐỀ CỦA TỪNG UNIT KÈM BÀI TẬP NGHE - TIẾNG A...
Nguyen Thanh Tu Collection
 
PPTX
How to Manage Leads in Odoo 18 CRM - Odoo Slides
Celine George
 
PPTX
20250924 Navigating the Future: How to tell the difference between an emergen...
McGuinness Institute
 
PPTX
Introduction to pediatric nursing in 5th Sem..pptx
AneetaSharma15
 
PPTX
Tips Management in Odoo 18 POS - Odoo Slides
Celine George
 
PPTX
Gupta Art & Architecture Temple and Sculptures.pptx
Virag Sontakke
 
PPTX
Measures_of_location_-_Averages_and__percentiles_by_DR SURYA K.pptx
Surya Ganesh
 
PPTX
Care of patients with elImination deviation.pptx
AneetaSharma15
 
DOCX
pgdei-UNIT -V Neurological Disorders & developmental disabilities
JELLA VISHNU DURGA PRASAD
 
PPTX
Artificial-Intelligence-in-Drug-Discovery by R D Jawarkar.pptx
Rahul Jawarkar
 
PDF
Module 2: Public Health History [Tutorial Slides]
JonathanHallett4
 
PPTX
CARE OF UNCONSCIOUS PATIENTS .pptx
AneetaSharma15
 
Unit 5: Speech-language and swallowing disorders
JELLA VISHNU DURGA PRASAD
 
A Smarter Way to Think About Choosing a College
Cyndy McDonald
 
Review of Related Literature & Studies.pdf
Thelma Villaflores
 
CONCEPT OF CHILD CARE. pptx
AneetaSharma15
 
HEALTH CARE DELIVERY SYSTEM - UNIT 2 - GNM 3RD YEAR.pptx
Priyanshu Anand
 
Biological Classification Class 11th NCERT CBSE NEET.pdf
NehaRohtagi1
 
Command Palatte in Odoo 18.1 Spreadsheet - Odoo Slides
Celine George
 
Five Point Someone – Chetan Bhagat | Book Summary & Analysis by Bhupesh Kushwaha
Bhupesh Kushwaha
 
BÀI TẬP TEST BỔ TRỢ THEO TỪNG CHỦ ĐỀ CỦA TỪNG UNIT KÈM BÀI TẬP NGHE - TIẾNG A...
Nguyen Thanh Tu Collection
 
How to Manage Leads in Odoo 18 CRM - Odoo Slides
Celine George
 
20250924 Navigating the Future: How to tell the difference between an emergen...
McGuinness Institute
 
Introduction to pediatric nursing in 5th Sem..pptx
AneetaSharma15
 
Tips Management in Odoo 18 POS - Odoo Slides
Celine George
 
Gupta Art & Architecture Temple and Sculptures.pptx
Virag Sontakke
 
Measures_of_location_-_Averages_and__percentiles_by_DR SURYA K.pptx
Surya Ganesh
 
Care of patients with elImination deviation.pptx
AneetaSharma15
 
pgdei-UNIT -V Neurological Disorders & developmental disabilities
JELLA VISHNU DURGA PRASAD
 
Artificial-Intelligence-in-Drug-Discovery by R D Jawarkar.pptx
Rahul Jawarkar
 
Module 2: Public Health History [Tutorial Slides]
JonathanHallett4
 
CARE OF UNCONSCIOUS PATIENTS .pptx
AneetaSharma15
 

515689311-Postgresql-DBA-Architecture.pptx

  • 1. Postgresql Database Architecture Shared Memory Process Memory Shared_Buffers Wal_Buffers Clog_Buffers Memory_Locks Maintenance_Work_Memory Temp_Buffers Work_Mem Vacuum_Buffers Logger Postgres DB writer Check pointer Stats collector Wal Writer Archive Auto Vacuum Data files WAL Files Temp Files Clog files Log files WAL archives Other files Physical Data Files Process
  • 2. Shared Buffers t2 ------ 8k i=20 t1 t3 t4 t5 t6 t7 t8 Pinned buffers Dirty buffers LRU MRU t2 By default size of the shared_buffers is 128 MB, We can assign 25% of the physical memory to the Shared_Buffers.
  • 3. Shared buffers allocating during the startup process using Shared_buffers parameter values. Dirty Buffers: These buffers contains modified data, those modified buffers called as dirty shared buffers. Pinned Buffers: The buffers which are access by the transactions are called as pinned buffers Algorithms: Entire Buffer management is working internally using LRU & MRU algorithms If you want to increase the Shared_buffers size, we need to restart the Database to effect the new values. Dirty Read: A transaction reads data written by a concurrent uncommitted transaction. Non-repeatable read: A transaction re reads the data previously read and finds the data changed by another transaction which has committed since the initial read. Phantom read: A transaction re executes a query returning a set of rows satisfying a search condition and finds that the set of rows satisfying the condition has changed because another recently committed transaction.
  • 4. WAL Buffers To ensure that no data has been lost by server failures, PostgreSQL supports the WAL mechanism. WAL data (also referred to as XLOG records) are transaction log in PostgreSQL; and WAL buffer is a buffering area of the WAL data before writing to a persistent storage. Default value of the WAL_Buffers is 4 MB WAL_BUFFERS allocate during the startup of the cluster used by the values of WAL_Buffers in configuration file. Changes of this parameter needs restart the cluster to effect the new values.
  • 5. CLOG Buffers • CLOG stands for "commit log", and the CLOG buffers is an area in operating system RAM dedicated to hold commit log pages. The commit log pages contain log of transaction metadata and differ from the WAL data. The commit logs have commit status of all transactions and indicate whether or not a transaction has been completed (committed). • There is no specific parameter to control this area of memory. This is automatically managed by the database engine in tiny amounts. This is a shared memory component, which is accessible to all the background server and user processes of a PostgreSQL database.
  • 6. Memory_Locks • This memory component is to store all heavyweight locks used by the PostgreSQL instance. These locks are shared across all the background server and user processes connecting to the database. • A non-default larger setting of two database parameters namely max_locks_per_transaction and max_pr ed_locks_per_transaction in a way influences the size of this memory component.
  • 7. • max_locks_per_transaction (integer) The shared lock table tracks locks on max_locks_per_transaction * (max_connections + max_prepared_transactions) objects (e.g., tables); hence, no more than this many distinct objects can be locked at any one time. This parameter controls the average number of object locks allocated for each transaction; individual transactions can lock more objects as long as the locks of all transactions fit in the lock table. This is not the number of rows that can be locked; that value is unlimited. The default, 64, has historically proven sufficient, but you might need to raise this value if you have clients that touch many different tables in a single transaction. This parameter can only be set at server start.
  • 8. • PREPARE TRANSACTION prepares the current transaction for two-phase commit. After this command, the transaction is no longer associated with the current session; instead, its state is fully stored on disk, and there is a very high probability that it can be committed successfully, even if a database crash occurs before the commit is requested. • Once prepared, a transaction can later be committed or rolled back with COMMIT PREPARED or ROLLBACK PREPARED, respectively. Those commands can be issued from any session, not only the one that executed the original transaction. • From the point of view of the issuing session, PREPARE TRANSACTION is not unlike a ROLLBACK command: after executing it, there is no active current transaction, and the effects of the prepared transaction are no longer visible.
  • 9. Maintenance_work_mem • This memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE for ADDING FOREIGN KEYS. • This memory segment is allocated for user and is used for the maintenance operations • Parameter to maintain this memory is maintenance_work_mem, default value is 64MB • Ex: postgres=# SET maintenance_work_mem TO '384MB '; SET postgres=# CREATE INDEX ON personen_vorname_nachname_idx persons ( "last_name”); CREATE INDEX
  • 10. WORK_MEM Work_mem is mainly used for expensive operations like sort or hash operations like Joining, and filtering of certain Data used • ORDER BY, DISTINCT, and merge joins require memory for sort operations. hash joins, hash and hash-based processing aggregations of IN operations require memory for hash tables. Bitmap index scans require memory for the internal bitmap. • Parameter is work_mem, defaul value is 4MB
  • 11. postgres=# SET work_mem TO '8 GB '; SET postgres=# SET trace_sort TO on; SET postgres=#SET client_min_messages TO DEBUG; SET postgres=# SELECT relname, relkind, relpages FROM pg_class c WHERE relkind = 'r' postgres=# relpages ORDER BY DESC, ASC relname postgres=#LIMIT 1; LOG: begin tuple sort: NKEYS = 2, Workmen = 8192, random access = f LOG: starting performsort: CPU 0.00s/0.00u sec elapsed 0.00 sec LOG: performsort done: CPU 0.00s/0.00u sec elapsed 0.00 sec LOG: internal sort ended, 31 KB used: CPU 0.00s/0.00u sec elapsed 0.00 sec
  • 12. VACUUM_BUFFER • Starts the daemon which cleans up tables and indexes, preventing bloat and poor response times. • This is the maximum amount of memory used by each of the autovacuum worker processes, and it is controlled by the autovacuum_work_mem database parameter. The memory is allocated from the operating system RAM and is also influenced by the autovacuum_max_workers database parameter. The setting of autovacuum_work_mem should be configured carefully as autovacuum_max_workers times this memory will be allocated from the RAM. All these parameter settings only come into play when the auto vacuum daemon is enabled, otherwise, these settings have no effect on the behavior of VACUUM when run in other contexts. This memory component is not shared by any other background server or user process.
  • 13. temp_buffers • A database may have one or more temporary tables, and the data blocks (pages) of such temporary tables need a separate allocation of memory to be processed in. The temp buffers serve this purpose by utilising a portion of RAM, defined by the temp_buffers parameter. The temp buffers are only used for access to temporary tables in a user session. There is no relation between temp buffers in memory and the temporary files that are created under the pgsql_tmp directory during large sort and hash table operations.
  • 14. Bg Writer Process: Default • The background writer continues to flush dirty pages/buffers to datafiles. This background writer scans the shared buffer for dirty pages to write down to the disk level. • The default run times of BG writer process is 200 mill. Seconds and for every 100 LRU (least recently used ) pages/buffers reached in shared buffers. • Parameters to handle the BG Writer: -bgwriter_delay= 200 (milli seconds) -bgwriter_lru_maxpages=100 (buffers)
  • 15. Check Pointer Process: Default This process is takes care of cluster checkpoint. When a check point starts all the dirty pages in memory are written to the datafiles. A checkpoint is a known safe starting point for recovery, since at that time we write all currently outstanding database changes to disk. Checkpoint process has two aspects: the preparation of database recovery, and the cleaning of dirty pages on the shared buffer pool. Checkpoint also updates PG_CONTROL file. which holds the metadata of the current checkpoint. Parameters: checkpoint_timeout = 5 min checkpoint_completion_target = 0.5
  • 16. pg_control File: • As the pg_control file contains the fundamental information of the checkpoint, it is certainly essential for database recovery. If it is broken or unreadable, the recovery process cannot start up in order to not obtained a starting point. • Even though pg_control file stores over 40 items, • State – The state of database server at the time of the latest check pointing starts. • There are seven states in total:  'start up' is the state that system is starting up;  'shut down' is the state that system is going down normally by the shutdown command;  'in production' is the state that system is running; ……………. and so on. • Latest checkpoint location – LSN Location of the latest checkpoint record. • Prior checkpoint location – LSN Location of the prior checkpoint record. • -------------- • ---------------
  • 17. WAL Writer Process: default • WAL writer is a background process to check the WAL buffer periodically and write all unwritten XLOG records into the WAL segments. • Parameter: • Wal_writer_delay = 200 ms
  • 18. • Auto vacuum Launcher Process: non- default • PostgreSQL does not immediately remove the deleted tuples from the data files. These are marked as deleted. Similarly, when a record is updated, it's roughly equivalent to one delete and one insert. The previous version of the record continues to be in the data file. Each update of a database row generates a new version of the row. The reason is simple: there can be active transactions, which want to see the data as it was before. As a result of this activity, there will be a lot of unusable space in the data files. After some time, these dead records become irrelevant as there are no transactions still around to see the old data. • To recover or reuse disk space occupied by updated or deleted rows. • To update data statistics used by the PostgreSQL query planner. • To protect against loss of very old data due to transaction ID wraparound.
  • 19. statistics collector: Default • statistics collector is a subsystem that supports collection and reporting of information about server activity. Presently, the collector can count accesses to tables and indexes in both disk-block and individual-row terms. It also tracks the total number of rows in each table, and information about vacuum and analyze actions for each table • The statistics collector transmits the collected information to other PostgreSQL processes through temporary files. These files are stored in the directory named by the stats_temp_directory parameter, pg_stat_tmp by default. • Echo $PG_STAT_TMP -- linux • Echo %pg_stat_tmp% -- windows • The parameter track_activities enables monitoring of the current command being executed by any server process. • The parameter track_counts controls whether statistics are collected about table and index accesses. • The parameter track_functions enables tracking of usage of user-defined functions. • The parameter track_io_timing enables monitoring of block read and write times.
  • 20. postgres server process • A postgres server process is a parent of all processes related to a database cluster management. • This process is the database backend process for each established connection. • It handles the values of connected database, username, host and transaction information • One to one :
  • 21. PG_HBA.CONF • TYPE : Host- Type matches either an SSL or plain connection. Hostssl- only SSL connection Hostnossl- only plain text • Database & User : Database and user columns are used to match specific databases or users from the incoming connection. Accept everything using * .
  • 22. • Address : Is used only if the type uses the tcp/ip and can be an ip address with network mask or a hostname. Both ipv4 and ipv6 are supported. • Method : Authentication, Postgresql supports many methods from the password challenge to sophisticated radius or Kerberos's. – Trust: Allow the connection without any request. Is quite useful if the password lost. – Peer: Allow the connections if the OS user matches the database user. Useful to authenticate to the database on the local boxes. It is default by installation.
  • 23. • Password: Allow the connection matching the user and password with shadow system table. PG_SHADOW • Md5: Same as the password, this method offers a md5 encryption for the passwords. As the md5 is deterministic a pseudo random subroutine is used during the password challenge to avoid the same string to be sent over the network. • Crypt: The crypt method is similar to the password method. When using crypt, the password is not sent in clear text, but through a simple form of encryption. The use of this method is not very secure, but is better than using the clear text password method.
  • 24. • krb4, krb5 : The krb4 and krb5 methods are used to specify Version 4 or 5 of the Kerberos authentication system. The installation and configuration of Kerberos is beyond the scope of this book, but if you wish to authenticate via Kerberos, these methods are available. • Ident: The ident method specifies that an ident map should be used when a host is requesting connections from a valid IP address listed in the pg_hba.conf file. This method requires one option. • - When the connection request matches the pg_hba.conf and the authentication method is cleared.
  • 25. • Reject: Reject the connections with the matching parameters. You should typically use this to restrict access from specific hosts or insecure connections. • Vacuum_freeze_table_age • Autovacuum_freeze_max_age
  • 27. BASE • base directory contains the database files. Each database have a dedicated sub-directories-named after the internal database's object id. A freshly initialized data directory shows only three subdirectories in the base folder. • Those corresponds to the two template databases, template0 and template1, plus the postgres database. • The numerical directories contains various files, also with the numerical name which are actually the database's relations, tables and indices. • The relation's name is set initially from the relation's object id. Any file altering operation like VACUUM FULL or REINDEX, will generate a new file with a deferent name. To find out the real relation's file name the relfilenode inside the pg_class sub-system table must be queried.
  • 28. Global: The global directory contains all the cluster wide relations. In addition there's the very critical control file
  • 29. Pg_xlog • This is the most important and critical directory, for the performances and for the reliability. • The directory contains the transaction's logs, named wal file. Each le is usually 16 Mb and contains all the data blocks changed during the database activity. The blocks are written first on this not volatile area to ensure the cluster's recovery in case of crash. The data blocks are then written later to the corresponding data files. • If the cluster's shutdown is not clean then the wal files are replayed during the startup process from the last known consistent location read from control file. • In order to ensure good performance this location should stay on a dedicated device.
  • 30. pg clog • This directory contains the committed transactions in small 8k les, except for the serializable transactions. • The the files are managed by the cluster and the amount is related with the two GUC parameters autovacuum_freeze_max_age and vacuum_freeze_table_age. • Increasing the values for the two parameters the pg_clog must store the commit status to the “event horizon" of the oldest frozen transaction id.
  • 31. Pg_serial: Same as pg clog this directory stores the information's about the committed transactions in serializable transaction isolation level. Pg_multixact: Stores the information's about the multi transaction status, used generally for the row share locks. Pg_subtrans: Stores the sub transactions status data.
  • 32. Pg_notify: Stores information's about the LISTEN/NOTIFY operations. Pg_stat: This directory contains the permanent les for the statistic subsystem. Pg_twophase: Stores the two phase commit data. The two phase commit allows the transaction opening independently from the session. This way even a different session can commit or rollback the transaction later.
  • 33. Pg _tblspc: • The directory contains the symbolic links to the tablespace locations. A tablespace is a logical name pointing a physical location. • As from PostgreSQL 9.2 the location is read directly from the symbolic link. • This make possible to change the tablespace's position simply stopping the cluster, moving the data file in the new location, creating the new symbolic link and starting the cluster.
  • 34. Pg_snapshots: • This directory is used to store the exported snapshots. From the version 9.2 PostgreSQL offers the transaction's snapshot export where one session can open a transaction and export a consistent snapshot. • This way different session can access the snapshot and read all together the same consistent data snapshot. • This feature is used, for example, by pg dump for the parallel export.
  • 35. Pg_ident.conf • The content of the pg_ident.conf associates identifying usernames with PostgreSQL usernames via definitions called ident maps . This is useful for users whose system usernames do not match their PostgreSQL usernames. Some rules you should keep in mind when defining and using an ident map are: • Each ident map member is defined on a single line, which associates a map name with an identifying username, and a translated PostgreSQL username. • The pg_ident.conf file can contain multiple map names. Each group of single lines with the same associative map name are considered a single map. • A single line record to define an ident map consist of 3 tokens: the name of the map, the identifying username, and the translated PostgreSQL username. This syntax is entered as follows, where each token is separated by spaces, or tabs: • mapname, identname, postgresqlname • mapname: The map name used in the pg_hba.conf file to refer to the ident map. • identname: The identifying username, which is generally the name of the system user attempting to establish a connection to the database. This is the name provided by the identd daemon, which must be running on the system attempting to connect. • postgresqlname:
  • 36. PG_CTL • pg_ctl is a utility for initializing a PostgreSQL database cluster, starting, stopping, or restarting the PostgreSQL database server (postgres), or displaying the status of a running server.
  • 37. PG_CTL • pg_ctl can be used to stop a database cluster • pg_ctl supports three modes of shutdown − smart quit after all clients have disconnected −Fast (default) quit directly, with proper shutdown (default) − immediate quit without complete shutdown; will lead to recovery Syntax: • − pg_ctl stop [-W] [-t SECS] [-D DATADIR] [-s] [- m SHUTDOWN-MODE]
  • 38. • Restart: mode effectively executes a stop followed by a start • Reload: mode simply sends the postgres server process a SIGHUP signal, causing it to reread its configuration files (postgresql.conf, pg_hba.conf, etc.). This allows changing configuration-file options that do not require a full server restart to take effect. • Status: mode checks whether a server is running in the specified data directory. If it is, the server's PID and the command line options that were used to invoke it are displayed. If the server is not running, pg_ctl returns an exit status of 3. If an accessible data directory is not specified, pg_ctl returns an exit status of 4.
  • 39. postgresql.conf • There are many configuration parameters that effect the behavior of the database system. • All parameter names are case-insensitive. • Every parameter takes a value of one of five types: − boolean − integer − floating point − string − enum • One way to set these parameters is to edit the file postgresql.conf, which is normally kept in the data directory.
  • 40. • Holds parameters used by a cluster • Parameters are case-insensitive • Normally stored in data directory • initdb installs default copy • Some parameters only take effect on server restart (pg_ctl restart) • # used for comments • One parameter per line • Use include directive to read and process another file • Can also be set using command-line option
  • 41. • Some parameters can be changed per session using the SET command. • Some parameters can be changed at the user level using ALTER USER. • Some parameters can be changed at the database level using ALTER DATABASE. • The SHOW command can be used to see settings. • The pg_settings and pg_file_settings catalog table lists settings information.
  • 42. • ALTER SYSTEM command: − Available starting with PostgreSQL 9.4 − Edits cluster settings without editing postgresql.conf − Writes the setting to a file called postgresql.auto.conf • Example: − ALTER SYSTEM SET work_mem=20480; • postgresql.auto.conf is always read last during server reload/restarts • Reset a parameter change - − ALTER SYSTEM SET work_mem = DEFAULT;
  • 43. Postgres.conf : Connection settings listen _addresses (default localhost) - Specifies the addresses on which the server is to listen for connections. Use * for all. • port (default 5432) - The port the server listens on. • max_connections (default 100) - Maximum number of concurrent connections the server can support. • superuser_reserved_connections (default 3) - Number of connection slots reserved for superusers. • unix_socket_directory (default /tmp) - Directory to be used for UNIX socket connections to the server. • unix_socket_permissions (default 0777) - access permissions of the Unix-domain socket.
  • 44. Security and Authentication Settings authentication_timeout (default: 1 minute): Maximum time to complete client authentication, in seconds. • ssl (default: off) - Enables SSL connections. • ssl_ca_file - Specifies the name of the file containing the SSL server certificate authority (CA). • ssl_cert_file - Specifies the name of the file containing the SSL server certificate. • ssl_key_file - Specifies the name of the file containing the SSL server private key. • ssl_ciphers - List of SSL ciphers that may be used for secure connections.
  • 45. Memory Settings • shared_buffers (default: 128MB) - Size of PostgreSQL shared buffer pool for a cluster. • temp_buffers (default: 8MB) - Amount of memory used by each backend for caching temporary table data. • work_mem (default: 4MB) - Amount of memory used for each sort or hash operation before switching to temporary disk files. • maintenance_work_mem (default: 64MB) - Amount of memory used for each index build or VACUUM. • temp_file_limit (default: 1) - amount of disk space that a session can use for temporary files. A transaction attempting to exceed this limit will be cancelled. Default is unlimited.
  • 46. Query Planner Settings random_page_cost (default 4.0) - Estimated cost of a random page fetch, in abstract cost units. May need to be reduced to account for caching effects. • seq_page_cost (default 1.0) - Estimated cost of a sequential page fetch, in abstract cost units. May need to be reduced to account for caching effects. Must always set random_page_cost >= seq_page_cost. • effective_cache_size (default 4GB) - Used to estimate the cost of an index scan. Rule of thumb is 75% of system memory. There are plenty of enable_* parameters which influence the planner in choosing an optimal plan.
  • 47. Write Ahead Log Settings • wal_level (default: minimal) - Determines how much information is written to the WAL. Change this to enable replication. Other values are archive, logical and hot_standby. • fsync (default on) - Turn this off to make your database much faster – and silently cause arbitrary corruption in case of a system crash. • wal_buffers (default: -1, autotune) - The amount of memory used in shared memory for WAL data. The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers. • min_wal_size (default 80 MB) – The WAL size to start recycling the WAL files. • max_wal_size (default 1GB) – The WAL size to start checkpoint. Controls the number of WAL Segments(16MB each) after which checkpoint is forced. • checkpoint_timeout (default 5 minutes) - Maximum time between checkpoints. • wal_compression (default off) – The WAL of Full Page write will be compressed and written.
  • 48. Where To Log • log_destination - Valid values are combinations of stderr, csvlog, syslog, and eventlog, depending on platform. • logging_collector - Enables advanced logging features. csvlog requires logging_collector. ‒ log_directory - Directory where log files are written. Requires logging_collector. ‒ log_filename - Format of log file name (e.g. postgresql-%Y-%M-%d.log). Allows regular log rotation. Requires logging_collector. ‒ log_file_mode (default 0600) - On Unix systems this parameter sets the permissions for log files when logging_collector is enabled. ‒ log_rotation_age - Automatically rotate logs after this much time. Requires logging_collector. ‒ log_rotation_size - Automatically rotate logs when they get this big. Requires logging_collector.
  • 49. When To Log • client_min_messages (default NOTICE) - Messages of this severity level or above are sent to the client. • log_min_messages (default WARNING) - Messages of this severity level or above are sent to the server. • log_min_error_statement (default ERROR) - When a message of this severity or higher is written to the server log, the statement that caused it is logged along with it. • log_min_duration_statement (default -1, disabled) - When a statement runs for at least this long, it is written to the server log, with its duration.
  • 50. What To Log log_connections (default off) - Log successful connections to the server log. • log_disconnections (default off) - Log some information each time a session disconnects, including the duration of the session. • log_error_verbosity (default “default”) - Can also select “terse” or “verbose”. • log_duration (default off) - Log duration of each statement. • log_line_prefix - Additional details to log with each line. • log_statement (default none) - Legal values are none, ddl, mod (DDL and all other data-modifying statements), or all. • log_temp_files (default -1) - Log temporary files of this size or larger, in kilobytes. • log_checkpoints (default off) - Causes checkpoints and restart points to be logged in the server log.
  • 51. Background Writer Settings bgwriter_delay (default 200 ms) - Specifies time between activity rounds for the background writer. • bgwriter_lru_maxpages (default 100) - Maximum number of pages that the background writer may clean per activity round. • bgwriter_lru_multiplier (default 2.0) - Multiplier on buffers scanned per round. By default, if system thinks 10 pages will be needed, it cleans 10 * bgwriter_lru_multiplier of 2.0 = 20. • Primary tuning technique is to lower bgwriter_delay.
  • 52. • search_path - This parameter specifies the order in which schemas are searched. The default value for this parameter is "$user", public. • default_tablespace - Name of the tablespace in which to objects are created by default. • temp_tablespaces - Tablespaces name(s) in which temporary objects are created. • statement_timeout - Postgres will abort any statement that takes over the specified number of milliseconds A value of zero (the default) turns this off.
  • 53. Autovacuum autovacuum (default on) - Controls whether the autovacuum launcher runs, and starts worker processes to vacuum and analyze tables. • log_autovacuum_min_duration (default -1) - Autovacuum tasks running longer than this duration are logged. Can now be specified per table. • autovacuum_max_workers (default 3) - Maximum number of autovacuum worker processes which may be running at one time. • autovacuum_work_mem (default -1, to use maintenance_work_mem) - Maximum amount of memory used by each autovacuum worker
  • 54. postmaster.pid • PostgreSQL puts a file named postmaster.pid in the data directory to store the process id of the PostgreSQL server process. If PostgreSQL crashes, this file can contain an old pid that confuses PostgreSQL. ... make sure that there are no processes named 'postgres' or 'postmaster’. • FATAL: lock file "postmaster.pid" already exists. postmaster.opts • If this file exists in the data directory, pg_ctl (in restart mode) will pass the contents of the file as options to postgres, unless overridden by the -o option. The contents of this file are also displayed in status mode.
  • 55. Vacuum: • Tuples that are deleted or obsolete by an update are not physically removed from their table; they remain present until a VACUUM is done. • PostgreSQL databases require periodic maintenance known as vacuuming. • PostgreSQL's VACUUM command has to process each table on a regular basis for several reasons:  To recover or reuse disk space occupied by updated or deleted rows.  To update data statistics used by the PostgreSQL query planner.  To update the visibility map, which speeds up index-only scans.  To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.
  • 56. Standard VACUUM and VACUUM FULL VACUUM FULL: can reclaim more disk space, reclaims storage occupied by dead tuples but runs much more slowly. Full vacuum applies exclusive lock on tables and it doesn’t allow new connections on table until completed. This method also requires extra disk space, since it writes a new copy of the table and doesn't release the old copy until the operation is complete. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables.
  • 57. • vacuum_cost_delay (integer)The length of time, in milliseconds, that the process will sleep when the cost limit has been exceeded. The default value is zero, which disables the cost-based vacuum delay feature. Positive values enable cost-based vacuuming. Note that on many systems, the effective resolution of sleep delays is 10 milliseconds; setting vacuum_cost_delay to a value that is not a multiple of 10 might have the same results as setting it to the next higher multiple of 10. • When using cost-based vacuuming, appropriate values for vacuum_cost_delay are usually quite small, perhaps 10 or 20 milliseconds. Adjusting vacuum's resource consumption is best done by changing the other vacuum cost parameters. • vacuum_cost_page_hit (integer)The estimated cost for vacuuming a buffer found in the shared buffer cache. It represents the cost to lock the buffer pool, lookup the shared hash table and scan the content of the page. The default value is one. • vacuum_cost_page_miss (integer)The estimated cost for vacuuming a buffer that has to be read from disk. This represents the effort to lock the buffer pool, lookup the shared hash table, read the desired block in from the disk and scan its content. The default value is 10. • vacuum_cost_page_dirty (integer)The estimated cost charged when vacuum modifies a block that was previously clean. It represents the extra I/O required to flush the dirty block out to disk again. The default value is 20. • vacuum_cost_limit (integer)The accumulated cost that will cause the vacuuming process to sleep. The default value is 200.
  • 58. • Free Space Map (FSM) To keep track of available space in the relation. It stores all free space related information in an alongside primary relation and that relation starts with the file node number plus the suffix _fsm. The VACUUM process also updates the Free Space Map
  • 59. • Visibility Map (VM) To keep track of which pages contain only tuples that are known to be visible to all active transactions. All tuples on the page are known to be visible to all transactions, this means that the page does not contain any tuples that need to be vacuumed