summaryrefslogtreecommitdiff
path: root/src/backend/postmaster/postmaster.c
AgeCommit message (Collapse)Author
2025-04-19Fix typos and grammar in the codeMichael Paquier
The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-04-07Use XLOG_CONTROL_FILE macro consistently for control file name.Fujii Masao
The XLOG_CONTROL_FILE macro (defined in access/xlog_internal.h) represents the control file name. While some parts of the codebase already use this macro, others previously hardcoded the file name as a string. This commit replaces those hardcoded strings with the macro, ensuring consistent usage throughout the code. This makes future maintenance easier and improves searchability, for example when grepping for control file usage. Author: Anton A. Melnikov <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Reviewed-by: Kyotaro Horiguchi <[email protected]> Reviewed-by: Masao Fujii <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-04-02Improve error message when standby does accept connections.Fujii Masao
Even after reaching the minimum recovery point, if there are long-lived write transactions with 64 subtransactions on the primary, the recovery snapshot may not yet be ready for hot standby, delaying read-only connections on the standby. Previously, when read-only connections were not accepted due to this condition, the following error message was logged: FATAL: the database system is not yet accepting connections DETAIL: Consistent recovery state has not been yet reached. This DETAIL message was misleading because the following message was already logged in this case: LOG: consistent recovery state reached This contradiction, i.e., indicating that the recovery state was consistent while also stating it wasn’t, caused confusion. This commit improves the error message to better reflect the actual state: FATAL: the database system is not yet accepting connections DETAIL: Recovery snapshot is not yet ready for hot standby. HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server. To implement this, the commit introduces a new postmaster signal, PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches a consistent recovery state, it sends this signal to the postmaster, allowing it to correctly recognize that state. Since this is not a clear bug, the change is applied only to the master branch and is not back-patched. Author: Atsushi Torikoshi <[email protected]> Co-authored-by: Fujii Masao <[email protected]> Reviewed-by: Yugo Nagata <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-29aio, bufmgr: Comment fixes/improvementsAndres Freund
Some of these comments have been wrong for a while (12f3867f5534), some I recently introduced (da7226993fd, 55b454d0e14). This includes an update to a comment in FlushBuffer(), which will be copied in a future commit. These changes seem big enough to be worth doing in separate commits. Suggested-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-18aio: Infrastructure for io_method=workerAndres Freund
This commit contains the basic, system-wide, infrastructure for io_method=worker. It does not yet actually execute IO, this commit just provides the infrastructure for running IO workers, kept separate for easier review. The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually we'd like to make the number of workers dynamically scale up/down based on the current "IO load". To allow the number of IO workers to be increased without a restart, we need to reserve PGPROC entries for the workers unconditionally. This has been judged to be worth the cost. If it turns out to be problematic, we can introduce a PGC_POSTMASTER GUC to control the maximum number. As io workers might be needed during shutdown, e.g. for AIO during the shutdown checkpoint, a new PMState phase is added. IO workers are shut down after the shutdown checkpoint has been performed and walsender/archiver have shut down, but before the checkpointer itself shuts down. See also 87a6690cc69. Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType. Reviewed-by: Noah Misch <[email protected]> Co-authored-by: Thomas Munro <[email protected]> Co-authored-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-13pg_noreturn to replace pg_attribute_noreturn()Peter Eisentraut
We want to support a "noreturn" decoration on more compilers besides just GCC-compatible ones, but for that we need to move the decoration in front of the function declaration instead of either behind it or wherever, which is the current style afforded by GCC-style attributes. Also rename the macro to "pg_noreturn" to be similar to the C11 standard "noreturn". pg_noreturn is now supported on all compilers that support C11 (using _Noreturn), as well as GCC-compatible ones (using __attribute__, as before), as well as MSVC (using __declspec). (When PostgreSQL requires C11, the latter two variants can be dropped.) Now, all supported compilers effectively support pg_noreturn, so the extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped. This also fixes a possible problem if third-party code includes stdnoreturn.h, because then the current definition of #define pg_attribute_noreturn() __attribute__((noreturn)) would cause an error. Note that the C standard does not support a noreturn attribute on function pointer types. So we have to drop these here. There are only two instances at this time, so it's not a big loss. In one case, we can make up for it by adding the pg_noreturn to a wrapper function and adding a pg_unreachable(), in the other case, the latter was already done before. Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]> Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-12Add connection establishment duration loggingMelanie Plageman
Add log_connections option 'setup_durations' which logs durations of several key parts of connection establishment and backend setup. For an incoming connection, starting from when the postmaster gets a socket from accept() and ending when the forked child backend is first ready for query, there are multiple steps that could each take longer than expected due to external factors. This logging provides visibility into authentication and fork duration as well as the end-to-end connection establishment and backend initialization time. To make this portable, the timings captured in the postmaster (socket creation time, fork initiation time) are passed through the BackendStartupData. Author: Melanie Plageman <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Fujii Masao <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Reviewed-by: Jacob Champion <[email protected]> Reviewed-by: Jelte Fennema-Nio <[email protected]> Reviewed-by: Guillaume Lelarge <[email protected]> Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12Modularize log_connections outputMelanie Plageman
Convert the boolean log_connections GUC into a list GUC comprised of the connection aspects to log. This gives users more control over the volume and kind of connection logging. The current log_connections options are 'receipt', 'authentication', and 'authorization'. The empty string disables all connection logging. 'all' enables all available connection logging. For backwards compatibility, the most common values for the log_connections boolean are still supported (on, off, 1, 0, true, false, yes, no). Note that previously supported substrings of on, off, true, false, yes, and no are no longer supported. Author: Melanie Plageman <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Fujii Masao <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-05Split WaitEventSet functions to separate source fileHeikki Linnakangas
latch.c now only contains the Latch related functions, which build on the WaitEventSet abstraction. Most of the platform-dependent stuff is now in waiteventset.c. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2025-02-21backend launchers void * arguments for binary dataPeter Eisentraut
Change backend launcher functions to take void * for binary data instead of char *. This removes the need for numerous casts. Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-01-25Change shutdown sequence to terminate checkpointer lastAndres Freund
The main motivation for this change is to have a process that can serialize stats after all other processes have terminated. Serializing stats already happens in checkpointer, even though walsenders can be active longer. The only reason the current shutdown sequence does not actively cause problems is that walsender currently does not generate any stats. However, there is an upcoming patch changing that. Another need for this change originates in the AIO patchset, where IO workers (which, in some edge cases, can emit stats of their own) need to run while the shutdown checkpoint is being written. This commit changes the shutdown sequence so checkpointer is signalled (via SIGINT) to trigger writing the shutdown checkpoint without also causing checkpointer to exit. Once checkpointer wrote the shutdown checkpoint it notifies postmaster via PMSIGNAL_XLOG_IS_SHUTDOWN and waits for the termination signal (SIGUSR2, as before). Checkpointer now is terminated after all children, other than dead-end children and logger, have been terminated, tracked using the new PM_WAIT_CHECKPOINTER PMState. Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24postmaster: Adjust which processes we expect to have exitedAndres Freund
Comments and code stated that we expect checkpointer to have been signalled in case of immediate shutdown / fatal errors, but didn't treat archiver and walsenders the same. That doesn't seem right. I had started digging through the history to see where this oddity was introduced, but it's not the fault of a single commit. Instead treat archiver, checkpointer, and walsenders the same. Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24postmaster: Commonalize FatalError pathsAndres Freund
This includes some behavioral changes: - Previously PM_WAIT_XLOG_ARCHIVAL wasn't handled in HandleFatalError(), that doesn't seem quite right. - Previously a fatal error in PM_WAIT_XLOG_SHUTDOWN lead to jumping back to PM_WAIT_BACKENDS, no we go to PM_WAIT_DEAD_END. Jumping backwards doesn't seem quite right and we didn't do so when checkpointer failed to fork during a shutdown. - Previously a checkpointer fork failure didn't call SetQuitSignalReason(), which would lead to quickdie() reporting "terminating connection because of unexpected SIGQUIT signal" which seems even worse than the PMQUIT_FOR_CRASH message. If I saw that in the log I'd suspect somebody outside of postgres sent SIGQUITs Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24postmaster: Move code to switch into FatalError state into functionAndres Freund
There are two places switching to FatalError mode, behaving somewhat differently. An upcoming commit will introduce a third. That doesn't seem seem like a good idea. This commit just moves the FatalError related code from HandleChildCrash() into its own function, a subsequent commit will evolve the state machine change to be suitable for other callers. Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24postmaster: Don't repeatedly transition to crashing stateAndres Freund
Previously HandleChildCrash() skipped logging and signalling child exits if already in an immediate shutdown or in FatalError state, but still transitioned server state in response to a crash. That's redundant. In the other place we transition to FatalError, we do take care to not do so when already in FatalError state. To make it easier to combine different paths for entering FatalError state, only do so once in HandleChildCrash(). Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24postmaster: Don't open-code TerminateChildren() in HandleChildCrash()Andres Freund
After removing the duplication no user of sigquit_child() remains, therefore remove it. Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-23Don't ask for bug reports about pthread_is_threaded_np() != 0.Tom Lane
We thought that this condition was unreachable in ExitPostmaster, but actually it's possible if you have both a misconfigured locale setting and some other mistake that causes PostmasterMain to bail out before reaching its own check of pthread_is_threaded_np(). Given the lack of other reports, let's not ask for bug reports if this occurs; instead just give the same hint as in PostmasterMain. Bug: #18783 Reported-by: [email protected] Author: Tom Lane <[email protected]> Reviewed-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Backpatch-through: 13
2025-01-10postmaster: Rename some shutdown related PMState phase namesAndres Freund
The previous names weren't particularly clear. Future patches will add more shutdown phases, making it even more important to have understandable shutdown phases. Suggested-by: Heikki Linnakangas <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-01-10postmaster: Make btmask_add() variadicAndres Freund
Suggested-by: Heikki Linnakangas <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-01-10postmaster: Introduce variadic btmask_all_except()Andres Freund
Upcoming patches would otherwise need btmask_all_except3(). Reviewed-by: Heikki Linnakangas <[email protected]> Discussion: https://postgr.es/m/w3z6w3g4aovivs735nk4pzjhmegntecesm3kktpebchegm5o53@aonnq2kn27xi
2025-01-10postmaster: Improve logging of signals sent by postmasterAndres Freund
Previously many, in some cases important, signals we never logged. In other cases the signal name was only included numerically. As part of this, change the debug log level the signal is logged at to DEBUG3, previously some where DEBUG2, some DEBUG4. Also move from direct use of kill() to signal the av launcher to signal_child(). There doesn't seem to be a reason for directly using kill(). Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-10postmaster: Update pmState via a wrapper functionAndres Freund
This makes logging of state changes easier - state transitions are now logged at DEBUG1. Without that logging it was surprisingly hard to understand the current state of the system while debugging. Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-01Fix an assortment of spelling mistakes and typosDavid Rowley
Author: Alexander Lakhin <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-12-17Set the stack_base_ptr in main(), not in random other places.Tom Lane
Previously we did this in PostmasterMain() and InitPostmasterChild(), which meant that stack depth checking was disabled in non-postmaster server processes, for instance in single-user mode. That seems like a fairly bad idea, since there's no a-priori restriction on the complexity of queries we will run in single-user mode. Moreover, this led to not having quite the same stack depth limit in all processes, which likely has no real-world effect but it offends my inner neatnik. Setting the depth in main() guarantees that check_stack_depth() is armed and has a consistent interpretation of stack depth in all forms of server processes. While at it, move the code associated with checking the stack depth out of tcop/postgres.c (which was never a great home for it) into a new file src/backend/utils/misc/stack_depth.c. Discussion: https://postgr.es/m/[email protected]
2024-12-14Fix warnings about declaration of environ on MinGW.Thomas Munro
POSIX says that the global variable environ shouldn't be declared in a header, and that you have to declare it yourself. MinGW declares it in <stdlib.h> with some macrology that messes up our declarations. Visual Studio doesn't warn (there are clues that it may also declare it, but if so, apparently compatibly). Suppress our declarations, on MinGW only. This clears the last warnings on CI's optional MinGW task, and hopefully on build farm animal fairywren too. Like 1319997d, no back-patch for now as it's not known to be breaking anything, and my humble goal is just to keep the MinGW build clean going forward. Reviewed-by: Tom Lane <[email protected]> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJLMh%2B6W5E4M_jSFb43gnrA_-Q6-%2BBf3HkBXyGfRFcBsQ%40mail.gmail.com
2024-12-10Fix elog(FATAL) before PostmasterMain() or just after fork().Noah Misch
Since commit 97550c0711972a9856b5db751539bbaf2f88884c, these failed with "PANIC: proc_exit() called in child process" due to uninitialized or stale MyProcPid. That was reachable if close() failed in ClosePostmasterPorts() or setlocale(category, "C") failed, both unlikely. Back-patch to v13 (all supported versions). Discussion: https://postgr.es/m/[email protected]
2024-12-04Provide a better error message for misplaced dispatch options.Nathan Bossart
Before this patch, misplacing a special must-be-first option for dispatching to a subprogram (e.g., postgres -D . --single) would fail with an error like FATAL: --single requires a value This patch adjusts this error to more accurately complain that the special option wasn't listed first. The aforementioned error message now looks like FATAL: --single must be first argument The dispatch option parsing code has been refactored for use wherever ParseLongOption() is called. Beyond the obvious advantage of avoiding code duplication, this should prevent similar problems when new dispatch options are added. Note that we assume that none of the dispatch option names match another valid command-line argument, such as the name of a configuration parameter. Ideally, we'd remove this must-be-first requirement for these options, but after some investigation, we decided that wasn't worth the added complexity and behavior changes. Author: Nathan Bossart, Greg Sabino Mullane Reviewed-by: Greg Sabino Mullane, Peter Eisentraut, Álvaro Herrera, Tom Lane Discussion: https://postgr.es/m/CAKAnmmJkZtZAiSryho%3DgYpbvC7H-HNjEDAh16F3SoC9LPu8rqQ%40mail.gmail.com
2024-11-27postmaster: Reduce verbosity of environment dump debug messageAndres Freund
Emitting each variable separately is unnecessarily verbose / hard to skim over. Emit the whole thing in one ereport() to address that. Also remove program name and function reference from the message. The former doesn't seem particularly helpful and the latter is provided by the elog.c infrastructure these days. Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/leouteo5ozcrux3fepuhtbp6c56tbfd4naxeokidbx7m75cabz@hhw6g4urlowt
2024-11-14Pass MyPMChildSlot as an explicit argument to child processHeikki Linnakangas
All the other global variables passed from postmaster to child have the same value in all the processes, while MyPMChildSlot is more like a parameter to each child process. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-11-14Assign a child slot to every postmaster child processHeikki Linnakangas
Previously, only backends, autovacuum workers, and background workers had an entry in the PMChildFlags array. With this commit, all postmaster child processes, including all the aux processes, have an entry. Dead-end backends still don't get an entry, though, and other processes that don't touch shared memory will never mark their PMChildFlags entry as active. We now maintain separate freelists for different kinds of child processes. That ensures that there are always slots available for autovacuum and background workers. Previously, pre-authentication backends could prevent autovacuum or background workers from starting up, by using up all the slots. The code to manage the slots in the postmaster process is in a new pmchild.c source file. Because postmaster.c is just so large. Assigning pmsignal slot numbers is now pmchild.c's responsibility. This replaces the PMChildInUse array in pmsignal.c. Some of the comments in postmaster.c still talked about the "stats process", but that was removed in commit 5891c7a8ed. Fix those while we're at it. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-11-14Kill dead-end children when there's nothing else leftHeikki Linnakangas
Previously, the postmaster would never try to kill dead-end child processes, even if there were no other processes left. A dead-end backend will eventually exit, when authentication_timeout expires, but if a dead-end backend is the only thing that's preventing the server from shutting down, it seems better to kill it immediately. It's particularly important, if there was a bug in the early startup code that prevented a dead-end child from timing out and exiting normally. Includes a test for that case where a dead-end backend previously prevented the server from shutting down. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-11-14Replace postmaster.c's own backend type codes with BackendTypeHeikki Linnakangas
Introduce a separate BackendType for dead-end children, so that we don't need a separate dead_end flag. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-10-31Remove duplicate words in commentsDaniel Gustafsson
A few comments contained duplicate "the" in sentences, fix by removing one occurrence. Author: Vignesh C <[email protected]> Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
2024-10-27Remove unused #include's from backend .c filesPeter Eisentraut
as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-09-21Increase the number of fast-path lock slotsTomas Vondra
Replace the fixed-size array of fast-path locks with arrays, sized on startup based on max_locks_per_transaction. This allows using fast-path locking for workloads that need more locks. The fast-path locking introduced in 9.2 allowed each backend to acquire a small number (16) of weak relation locks cheaply. If a backend needs to hold more locks, it has to insert them into the shared lock table. This is considerably more expensive, and may be subject to contention (especially on many-core systems). The limit of 16 fast-path locks was always rather low, because we have to lock all relations - not just tables, but also indexes, views, etc. For planning we need to lock all relations that might be used in the plan, not just those that actually get used in the final plan. So even with rather simple queries and schemas, we often need significantly more than 16 locks. As partitioning gets used more widely, and the number of partitions increases, this limit is trivial to hit. Complex queries may easily use hundreds or even thousands of locks. For workloads doing a lot of I/O this is not noticeable, but for workloads accessing only data in RAM, the access to the shared lock table may be a serious issue. This commit removes the hard-coded limit of the number of fast-path locks. Instead, the size of the fast-path arrays is calculated at startup, and can be set much higher than the original 16-lock limit. The overall fast-path locking protocol remains unchanged. The variable-sized fast-path arrays can no longer be part of PGPROC, but are allocated as a separate chunk of shared memory and then references from the PGPROC entries. The fast-path slots are organized as a 16-way set associative cache. You can imagine it as a hash table of 16-slot "groups". Each relation is mapped to exactly one group using hash(relid), and the group is then processed using linear search, just like the original fast-path cache. With only 16 entries this is cheap, with good locality. Treating this as a simple hash table with open addressing would not be efficient, especially once the hash table gets almost full. The usual remedy is to grow the table, but we can't do that here easily. The access would also be more random, with worse locality. The fast-path arrays are sized using the max_locks_per_transaction GUC. We try to have enough capacity for the number of locks specified in the GUC, using the traditional 2^n formula, with an upper limit of 1024 lock groups (i.e. 16k locks). The default value of max_locks_per_transaction is 64, which means those instances will have 64 fast-path slots. The main purpose of the max_locks_per_transaction GUC is to size the shared lock table. It is often set to the "average" number of locks needed by backends, with some backends using significantly more locks. This should not be a major issue, however. Some backens may have to insert locks into the shared lock table, but there can't be too many of them, limiting the contention. The only solution is to increase the GUC, even if the shared lock table already has sufficient capacity. That is not free, especially in terms of memory usage (the shared lock table entries are fairly large). It should only happen on machines with plenty of memory, though. In the future we may consider a separate GUC for the number of fast-path slots, but let's try without one first. Reviewed-by: Robert Haas, Jakub Wartak Discussion: https://postgr.es/m/[email protected]
2024-09-03Fix typos and grammar in code comments and docsMichael Paquier
Author: Alexander Lakhin Discussion: https://postgr.es/m/[email protected]
2024-08-19Fix garbled process name on backend crashHeikki Linnakangas
The log message on backend crash used wrong variable, which could be uninitialized. Introduced in commit 28a520c0b7. Reported-by: Alexander Lakhin Discussion: https://www.postgresql.org/message-id/[email protected]
2024-08-12Consolidate postmaster code to launch background processesHeikki Linnakangas
Much of the code in process_pm_child_exit() to launch replacement processes when one exits or when progressing to next postmaster state was unnecessary, because the ServerLoop will launch any missing background processes anyway. Remove the redundant code and let ServerLoop handle it. In ServerLoop, move the code to launch all the processes to a new subroutine, to group it all together. Reviewed-by: Thomas Munro <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-08-09Fix comment on processes being kept over a restartHeikki Linnakangas
All child processes except the syslogger are killed on a restart. The archiver might be already running though, if it was started during recovery. The split in the comments between "other special children" and the first group of "background tasks" seemed really arbitrary, so I just merged them all into one group. Reviewed-by: Thomas Munro <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-08-09Refactor code to handle death of a backend or bgworker in postmasterHeikki Linnakangas
Currently, when a child process exits, the postmaster first scans through BackgroundWorkerList, to see if it the child process was a background worker. If not found, then it scans through BackendList to see if it was a regular backend. That leads to some duplication between the bgworker and regular backend cleanup code, as both have an entry in the BackendList that needs to be cleaned up in the same way. Refactor that so that we scan just the BackendList to find the child process, and if it was a background worker, do the additional bgworker-specific cleanup in addition to the normal Backend cleanup. Change HandleChildCrash so that it doesn't try to handle the cleanup of the process that already exited, only the signaling of all the other processes. When called for any of the aux processes, the caller had already cleared the *PID global variable, so the code in HandleChildCrash() to do that was unused. On Windows, if a child process exits with ERROR_WAIT_NO_CHILDREN, it's now logged with that exit code, instead of 0. Also, if a bgworker exits with ERROR_WAIT_NO_CHILDREN, it's now treated as crashed and is restarted. Previously it was treated as a normal exit. If a child process is not found in the BackendList, the log message now calls it "untracked child process" rather than "server process". Arguably that should be a PANIC, because we do track all the child processes in the list, so failing to find a child process is highly unexpected. But if we want to change that, let's discuss and do that as a separate commit. Reviewed-by: Thomas Munro <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-08-09Make BackgroundWorkerList doubly-linkedHeikki Linnakangas
This allows ForgetBackgroundWorker() and ReportBackgroundWorkerExit() to take a RegisteredBgWorker pointer as argument, rather than a list iterator. That feels a little more natural. But more importantly, this paves the way for more refactoring in the next commit. Reviewed-by: Thomas Munro <[email protected]> Discussion: https://www.postgresql.org/message-id/[email protected]
2024-08-01Minor refactoring of assign_backendlist_entry()Heikki Linnakangas
Make assign_backendlist_entry() responsible just for allocating the Backend struct. Linking it to the RegisteredBgWorker is the caller's responsibility now. Seems more clear that way. Discussion: https://www.postgresql.org/message-id/[email protected]
2024-07-29Move cancel key generation to after forking the backendHeikki Linnakangas
Move responsibility of generating the cancel key to the backend process. The cancel key is now generated after forking, and the backend advertises it in the ProcSignal array. When a cancel request arrives, the backend handling it scans the ProcSignal array to find the target pid and cancel key. This is similar to how this previously worked in the EXEC_BACKEND case with the ShmemBackendArray, just reusing the ProcSignal array. One notable change is that we no longer generate cancellation keys for non-backend processes. We generated them before just to prevent a malicious user from canceling them; the keys for non-backend processes were never actually given to anyone. There is now an explicit flag indicating whether a process has a valid key or not. I wrote this originally in preparation for supporting longer cancel keys, but it's a nice cleanup on its own. Reviewed-by: Jelte Fennema-Nio Discussion: https://www.postgresql.org/message-id/[email protected]
2024-07-02Move bgworker specific logic to bgworker.cHeikki Linnakangas
For clarity, we've been slowly moving functions that are not called from the postmaster process out of postmaster.c. Author: Xing Guo <[email protected]> Discussion: https://www.postgresql.org/message-id/CACpMh%2BDBHVT4xPGimzvex%3DwMdMLQEu9PYhT%2BkwwD2x2nu9dU_Q%40mail.gmail.com
2024-07-02Improve some global variable declarationsPeter Eisentraut
We have in launch_backend.c: /* * The following need to be available to the save/restore_backend_variables * functions. They are marked NON_EXEC_STATIC in their home modules. */ extern slock_t *ShmemLock; extern slock_t *ProcStructLock; extern PGPROC *AuxiliaryProcs; extern PMSignalData *PMSignalState; extern pg_time_t first_syslogger_file_time; extern struct bkend *ShmemBackendArray; extern bool redirection_done; That comment is not completely true: ShmemLock, ShmemBackendArray, and redirection_done are not in fact NON_EXEC_STATIC. ShmemLock once was, but was then needed elsewhere. ShmemBackendArray was static inside postmaster.c before launch_backend.c was created. redirection_done was never static. This patch moves the declaration of ShmemLock and redirection_done to a header file. ShmemBackendArray gets a NON_EXEC_STATIC. This doesn't make a difference, since it only exists if EXEC_BACKEND anyway, but it makes it consistent. After that, the comment is now correct. Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2024-05-17Revise GUC names quoting in messages againPeter Eisentraut
After further review, we want to move in the direction of always quoting GUC names in error messages, rather than the previous (PG16) wildly mixed practice or the intermittent (mid-PG17) idea of doing this depending on how possibly confusing the GUC name is. This commit applies appropriate quotes to (almost?) all mentions of GUC names in error messages. It partially supersedes a243569bf65 and 8d9978a7176, which had moved things a bit in the opposite direction but which then were abandoned in a partial state. Author: Peter Smith <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
2024-03-18Move code for backend startup to separate fileHeikki Linnakangas
This is code that runs in the backend process after forking, rather than postmaster. Move it out of postmaster.c for clarity. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/[email protected]
2024-03-18Refactor postmaster child process launchingHeikki Linnakangas
Introduce new postmaster_child_launch() function that deals with the differences in EXEC_BACKEND mode. Refactor the mechanism of passing information from the parent to child process. Instead of using different command-line arguments when launching the child process in EXEC_BACKEND mode, pass a variable-length blob of startup data along with all the global variables. The contents of that blob depend on the kind of child process being launched. In !EXEC_BACKEND mode, we use the same blob, but it's simply inherited from the parent to child process. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/[email protected]
2024-03-18Move some functions from postmaster.c to a new source fileHeikki Linnakangas
This just moves the functions, with no other changes, to make the next commits smaller and easier to review. The moved functions are related to launching postmaster child processes in EXEC_BACKEND mode. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/[email protected]