Age | Commit message (Collapse) | Author |
|
This moves/renames some of the functions defined in pg_numa.c:
* pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and
moved to src/backend/storage/ipc/shmem.c. The new name better reflects
that the page size is not related to NUMA, and it's specifically about
the page size used for the main shared memory segment.
* move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into
the backend (which more appropriate for functions callable from SQL).
While at it, improve the comment to explain what page size it returns.
* remove unnecessary includes from src/port/pg_numa.c, adding
unnecessary dependencies (src/port should be suitable for frontent).
These were either leftovers or unnecessary thanks to the other changes
in this commit.
This eliminates unnecessary dependencies on backend symbols, which we
don't want in src/port.
Reported-by: Kirill Reshke <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
|
|
Backpatch-through: 13
|
|
The code comment said, 'the argument should be declared "PGShmemHeader
*seghdr", but we use void to avoid having to include ipc.h in
shmem.h.' We can achieve the original goal with a struct forward
declaration. (ipc.h was also not the correct header file.)
Discussion: https://www.postgresql.org/message-id/flat/cnthxg2eekacrejyeonuhiaezc7vd7o2uowlsbenxqfkjwgvwj@qgzu6eoqrglb
|
|
We have in launch_backend.c:
/*
* The following need to be available to the save/restore_backend_variables
* functions. They are marked NON_EXEC_STATIC in their home modules.
*/
extern slock_t *ShmemLock;
extern slock_t *ProcStructLock;
extern PGPROC *AuxiliaryProcs;
extern PMSignalData *PMSignalState;
extern pg_time_t first_syslogger_file_time;
extern struct bkend *ShmemBackendArray;
extern bool redirection_done;
That comment is not completely true: ShmemLock, ShmemBackendArray, and
redirection_done are not in fact NON_EXEC_STATIC. ShmemLock once was,
but was then needed elsewhere. ShmemBackendArray was static inside
postmaster.c before launch_backend.c was created. redirection_done
was never static.
This patch moves the declaration of ShmemLock and redirection_done to
a header file.
ShmemBackendArray gets a NON_EXEC_STATIC. This doesn't make a
difference, since it only exists if EXEC_BACKEND anyway, but it makes
it consistent.
After that, the comment is now correct.
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]
|
|
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 12
|
|
Prior patches got rid of all the uses of SHM_QUEUE. ilist.h style lists are
more widely used and have an easier to use interface. As there are no users
left, remove SHM_QUEUE.
Reviewed-by: Thomas Munro <[email protected]> (in an older version)
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
|
|
Backpatch-through: 11
|
|
Backpatch-through: 10
|
|
Backpatch-through: 9.5
|
|
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.
Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
|
|
This tells you about allocations that have been made from the main
shared memory segment. The original patch also tried to show information
about dynamic shared memory allocation as well, but I decided to
leave that problem for another time.
Andres Freund and Robert Haas, reviewed by Michael Paquier, Marti
Raudsepp, Tom Lane, Álvaro Herrera, and Kyotaro Horiguchi.
Discussion: http://postgr.es/m/[email protected]
|
|
Backpatch-through: update all files in master, backpatch legal files through 9.4
|
|
Switch to 2.1 version of pg_bsd_indent. This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.
Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
|
|
Backpatch-through: certain files through 9.4
|
|
Backpatch-through: certain files through 9.3
|
|
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
|
|
|
|
Previously, the "sem" field of PGPROC varied in size depending on which
kernel semaphore API we were using. That was okay as long as there was
only one likely choice per platform, but in the wake of commit ecb0d20a9,
that assumption seems rather shaky. It doesn't seem out of the question
anymore that an extension compiled against one API choice might be loaded
into a postmaster built with another choice. Moreover, this prevents any
possibility of selecting the semaphore API at postmaster startup, which
might be something we want to do in future.
Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix
semaphore APIs, and turn the pointed-to data into an opaque struct whose
contents are only known within the responsible modules.
For the SysV and unnamed-POSIX APIs, the pointed-to data has to be
allocated elsewhere in shared memory, which takes a little bit of
rejiggering of the InitShmemAllocation code sequence. (I invented a
ShmemAllocUnlocked() function to make that a little cleaner than it used
to be. That function is not meant for any uses other than the ones it
has now, but it beats having InitShmemAllocation() know explicitly about
allocation of space for semaphores and spinlocks.) This change means an
extra indirection to access the semaphore data, but since we only touch
that when blocking or awakening a process, there shouldn't be any
meaningful performance penalty. Moreover, at least for the unnamed-POSIX
case on Linux, the sem_t type is quite a bit wider than a pointer, so this
reduces sizeof(PGPROC) which seems like a good thing.
For the named-POSIX API, there's effectively no change: the PGPROC.sem
field was and still is a pointer to something returned by sem_open() in
the postmaster's memory space. Document and check the pre-existing
limitation that this case can't work in EXEC_BACKEND mode.
It did not seem worth unifying the Windows semaphore ABI with the Unix
cases, since there's no likelihood of needing ABI compatibility much less
runtime switching across those cases. However, we can simplify the Windows
code a bit if we define PGSemaphore as being directly a HANDLE, rather than
pointer to HANDLE, so let's do that while we're here. (This also ends up
being no change in what's physically stored in PGPROC.sem. We're just
moving the HANDLE fetch from callees to callers.)
It would take a bunch of additional code shuffling to get to the point of
actually choosing a semaphore API at postmaster start, but the effects
of that would now be localized in the port/XXX_sema.c files, so it seems
like fit material for a separate patch. The need for it is unproven as
yet, anyhow, whereas the ABI risk to extensions seems real enough.
Discussion: https://postgr.es/m/[email protected]
|
|
A majority of callers seem to have believed that this was the API spec
already, because they omitted any check for a NULL result, and hence
would crash on an out-of-shared-memory failure. The original proposal
was to just add such error checks everywhere, but that does nothing to
prevent similar omissions in future. Instead, let's make ShmemAlloc()
throw the error (so we can remove the caller-side checks that do exist),
and introduce a new function ShmemAllocNoError() that has the previous
behavior of returning NULL, for the small number of callers that need
that and are prepared to do the right thing. This also lets us remove
the rather wishy-washy behavior of printing a WARNING for out-of-shmem,
which never made much sense: either the caller has a strategy for
dealing with that, or it doesn't. It's not ShmemAlloc's business to
decide whether a warning is appropriate.
The v10 release notes will need to call this out as a significant
source-code change. It's likely that it will be a bug fix for
extension callers too, but if not, they'll need to change to using
ShmemAllocNoError().
This is nominally a bug fix, but the odds that it's fixing any live
bug are actually rather small, because in general the requests
being made by the unchecked callers were already accounted for in
determining the overall shmem size, so really they ought not fail.
Between that and the possible impact on extensions, no back-patch.
Discussion: <[email protected]>
|
|
Backpatch certain files through 9.1
|
|
Backpatch certain files through 9.0
|
|
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
|
|
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
|
|
|
ShmemIndex, so 64 leaves some headroom.
Kevin Grittner
|
|
If a standby is broadcasting reply messages and we have named
one or more standbys in synchronous_standby_names then allow
users who set synchronous_replication to wait for commit, which
then provides strict data integrity guarantees. Design avoids
sending and receiving transaction state information so minimises
bookkeeping overheads. We synchronize with the highest priority
standby that is connected and ready to synchronize. Other standbys
can be defined to takeover in case of standby failure.
This version has very strict behaviour; more relaxed options
may be added at a later date.
Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime
Casanova, Heikki Linnakangas and Robert Haas, plus the assistance
of many other design reviewers.
|
|
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.
To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.
A predicate lock doesn't conflict with any regular locks or with another
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
for other transactions.
Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.
We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.
Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.
Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
|
|
|
|
|
|
|
|
provided by Andrew.
|
|
|
|
allowed different processes to have different addresses for the shmem segment
in quite a long time, but there were still a few places left that used the
old coding convention. Clean them up to reduce confusion and improve the
compiler's ability to detect pointer type mismatches.
Kris Jurka
|
|
|
|
third party includes (like tcl) that define DLLIMPORT.
|
|
|
|
back-stamped for this.
|
|
modules; the first try was not usable in EXEC_BACKEND builds (e.g.,
Windows). Instead, just provide some entry points to increase the
allocation requests during postmaster start, and provide a dedicated
LWLock that can be used to synchronize allocation operations performed
by backends. Per discussion with Marc Munro.
|
|
lightweight locks.
Marc Munro
|
|
|
|
an LWLock instead of a spinlock. This hardly matters on Unix machines
but should improve startup performance on Windows (or any port using
EXEC_BACKEND). Per previous discussion.
|
|
to 'Size' (that is, size_t), and install overflow detection checks in it.
This allows us to remove the former arbitrary restrictions on NBuffers
etc. It won't make any difference in a 32-bit machine, but in a 64-bit
machine you could theoretically have terabytes of shared buffers.
(How efficiently we could manage 'em remains to be seen.) Similarly,
num_temp_buffers, work_mem, and maintenance_work_mem can be set above
2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional
work by moi.
|
|
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
|
|
tables in shared memory. This ensures that overflow of the lock table
creates no long-lasting problems. Per discussion with Merlin Moncure.
|
|
|
|
Claudio Natoli
|
|
|
|
|
|
This patch is an updated version of the lock listing patch. I've made
the following changes:
- write documentation
- wrap the SRF in a view called 'pg_locks': all user-level
access should be done through this view
- re-diff against latest CVS
One thing I chose not to do is adapt the SRF to use the anonymous
composite type code from Joe Conway. I'll probably do that eventually,
but I'm not really convinced it's a significantly cleaner way to
bootstrap SRF builtins than the method this patch uses (of course, it
has other uses...)
Neil Conway
|
|
|