summaryrefslogtreecommitdiff
path: root/src/backend/utils/hash
AgeCommit message (Collapse)Author
2025-04-23Avoid possibly-theoretical OOM crash hazard in hash_create().Tom Lane
One place in hash_create() used DynaHashAlloc() as a convenient shorthand for MemoryContextAlloc(). That was fine when it was written, but it stopped being fine when 9c911ec06 changed DynaHashAlloc() to use MCXT_ALLOC_NO_OOM (mea culpa). Change the code to call plain MemoryContextAlloc() as intended. I think that this bug may be unreachable in practice, since we now always create AllocSets with some space already allocated, so that an OOM failure here for a non-shared hash table should be impossible (with a hash table name of reasonable length anyway). And there aren't enough shared hash tables to make a crash for one of those probable. Nonetheless it's clearly not operating as designed, so back-patch to v16 where 9c911ec06 came in. Reported-by: Maksim Korotkov <[email protected]> Author: Tom Lane <[email protected]> Discussion: https://postgr.es/m/[email protected] Backpatch-through: 16
2025-04-04Revert "Improve accounting for memory used by shared hash tables"Tomas Vondra
This reverts commit f5930f9a98ea65d659d41600a138e608988ad122. This broke the expansion of private hash tables, which reallocates the directory. But that's impossible when it's allocated together with the other fields, and dir_realloc() failed with BogusFree. Clearly, this needs rethinking. Discussion: https://postgr.es/m/CAApHDvriCiNkm=v521AP6PKPfyWkJ++jqZ9eqX4cXnhxLv8w-A@mail.gmail.com
2025-04-02Improve accounting for memory used by shared hash tablesTomas Vondra
pg_shmem_allocations tracks the memory allocated by ShmemInitStruct(), but for shared hash tables that covered only the header and hash directory. The remaining parts (segments and buckets) were allocated later using ShmemAlloc(), which does not update the shmem accounting. Thus, these allocations were not shown in pg_shmem_allocations. This commit improves the situation by allocating all the hash table parts at once, using a single ShmemInitStruct() call. This way the ShmemIndex entries (and thus pg_shmem_allocations) better reflect the proper size of the hash table. This affects allocations for private (non-shared) hash tables too, as the hash_create() code is shared. For non-shared tables this however makes no practical difference. This changes the alignment a bit. ShmemAlloc() aligns the chunks using CACHELINEALIGN(), which means some parts (header, directory, segments) were aligned this way. Allocating all parts as a single chunk removes this (implicit) alignment. We've considered adding explicit alignment, but we've decided not to - it seems to be merely a coincidence due to using the ShmemAlloc() API, not due to necessity. Author: Rahila Syed <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Reviewed-by: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
2025-03-13pg_noreturn to replace pg_attribute_noreturn()Peter Eisentraut
We want to support a "noreturn" decoration on more compilers besides just GCC-compatible ones, but for that we need to move the decoration in front of the function declaration instead of either behind it or wherever, which is the current style afforded by GCC-style attributes. Also rename the macro to "pg_noreturn" to be similar to the C11 standard "noreturn". pg_noreturn is now supported on all compilers that support C11 (using _Noreturn), as well as GCC-compatible ones (using __attribute__, as before), as well as MSVC (using __declspec). (When PostgreSQL requires C11, the latter two variants can be dropped.) Now, all supported compilers effectively support pg_noreturn, so the extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped. This also fixes a possible problem if third-party code includes stdnoreturn.h, because then the current definition of #define pg_attribute_noreturn() __attribute__((noreturn)) would cause an error. Note that the C standard does not support a noreturn attribute on function pointer types. So we have to drop these here. There are only two instances at this time, so it's not a big loss. In one case, we can make up for it by adding the pg_noreturn to a wrapper function and adding a pg_unreachable(), in the other case, the latter was already done before. Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]> Reviewed-by: Andres Freund <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-11-28Remove useless casts to (void *)Peter Eisentraut
Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2024-10-27Remove unused #include's from backend .c filesPeter Eisentraut
as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-08-12Add user-callable CRC functions.Nathan Bossart
We've had code for CRC-32 and CRC-32C for some time (for WAL records, etc.), but there was no way for users to call it, despite apparent popular demand. The new crc32() and crc32c() functions accept bytea input and return bigint (to avoid returning negative values). Bumps catversion. Author: Aleksander Alekseev Reviewed-by: Peter Eisentraut, Tom Lane Discussion: https://postgr.es/m/CAJ7c6TNMTGnqnG%3DyXXUQh9E88JDckmR45H2Q%2B%3DucaCLMOW1QQw%40mail.gmail.com
2024-08-08Add a caveat to hash_seq_init_with_hash_value() header commentAlexander Korotkov
The typical use-case for hash_seq_init_with_hash_value() is syscache callback. Add a caveat that the default hash function doesn't match syscache hash function. So, one needs to define a custom hash function. Reported-by: Pavel Stehule Discussion: https://postgr.es/m/CAFj8pRAXmv6eyYx%3DE_BTfyK%3DO_%2ByOF8sXB%3D0bn9eOBt90EgWRA%40mail.gmail.com Reviewed-by: Pavel Stehule
2024-08-07Introduce hash_search_with_hash_value() functionAlexander Korotkov
This new function iterates hash entries with given hash values. This function is designed to avoid full sequential hash search in the syscache invalidation callbacks. Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru Author: Teodor Sigaev Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov Reviewed-by: Andrei Lepikhov
2024-03-17Mark hash_corrupted() as pg_attribute_noreturn.Tom Lane
Coverity started complaining about this after cc5ef90ed. The code's not really different from before, but might as well clarify its intent.
2024-03-14Refactor initial hash lookup in dynahash.cMichael Paquier
The same pattern is used three times in dynahash.c to retrieve a bucket number and a hash bucket from a hash value. This has popped up while discussing improvements for the type cache, where this piece of refactoring would become useful. Note that hash_search_with_hash_value() does not need the bucket number, just the hash bucket. Author: Teodor Sigaev Reviewed-by: Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/[email protected]
2024-01-04Update copyright for 2024Bruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/[email protected] Backpatch-through: 12
2023-01-02Update copyright for 2023Bruce Momjian
Backpatch-through: 11
2022-12-20Add copyright notices to meson filesAndrew Dunstan
Discussion: https://postgr.es/m/[email protected]
2022-10-14Make some minor improvements in memory-context infrastructure.Tom Lane
We lack a version of repalloc() that supports MCXT_ALLOC_NO_OOM semantics, so invent repalloc_extended() with the usual set of flags. repalloc_huge() becomes a legacy wrapper for that. Also, fix dynahash.c so that it can support HASH_ENTER_NULL requests when using the default palloc-based allocator. The only reason it didn't do that already was the lack of the MCXT_ALLOC_NO_OOM option when that code was written, ages ago. While here, simplify a few overcomplicated tests in mcxt.c. Discussion: https://postgr.es/m/[email protected]
2022-09-22meson: Add initial version of meson based build systemAndres Freund
Autoconf is showing its age, fewer and fewer contributors know how to wrangle it. Recursive make has a lot of hard to resolve dependency issues and slow incremental rebuilds. Our home-grown MSVC build system is hard to maintain for developers not using Windows and runs tests serially. While these and other issues could individually be addressed with incremental improvements, together they seem best addressed by moving to a more modern build system. After evaluating different build system choices, we chose to use meson, to a good degree based on the adoption by other open source projects. We decided that it's more realistic to commit a relatively early version of the new build system and mature it in tree. This commit adds an initial version of a meson based build system. It supports building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD, Solaris and Windows (however only gcc is supported on aix, solaris). For Windows/MSVC postgres can now be built with ninja (faster, particularly for incremental builds) and msbuild (supporting the visual studio GUI, but building slower). Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM bitcode generation, documentation adjustments) are done in subsequent commits requiring further review. Other aspects (e.g. not installing test-only extensions) are not yet addressed. When building on Windows with msbuild, builds are slower when using a visual studio version older than 2019, because those versions do not support MultiToolTask, required by meson for intra-target parallelism. The plan is to remove the MSVC specific build system in src/tools/msvc soon after reaching feature parity. However, we're not planning to remove the autoconf/make build system in the near future. Likely we're going to keep at least the parts required for PGXS to keep working around until all supported versions build with meson. Some initial help for postgres developers is at https://wiki.postgresql.org/wiki/Meson With contributions from Thomas Munro, John Naylor, Stone Tickle and others. Author: Andres Freund <[email protected]> Author: Nazir Bilal Yavuz <[email protected]> Author: Peter Eisentraut <[email protected]> Reviewed-By: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/[email protected]
2022-01-08Update copyright for 2022Bruce Momjian
Backpatch-through: 10
2021-01-02Update copyright for 2021Bruce Momjian
Backpatch-through: 9.5
2020-12-15Improve hash_create()'s API for some added robustness.Tom Lane
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/[email protected]
2020-09-19Code review for dynahash change.Thomas Munro
Commit be0a6666 left behind a comment about the order of some tests that didn't make sense without the expensive division, and in fact we might as well change the order to one that fails more cheaply most of the time as a micro-optimization. Also, remove the "+ 1" applied to max_bucket, to drop an instruction and match the original behavior. Per review from Tom Lane. Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-18Remove large fill factor support from dynahash.c.Thomas Munro
Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but: 1. It was an integer, whereas for in-memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range. 2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles. 3. We are not aware of anyone ever having used a non-default value. Therefore, remove support, effectively fixing it at 1. Author: Jakub Wartak <[email protected]> Reviewed-by: Alvaro Herrera <[email protected]> Reviewed-by: Tomas Vondra <[email protected]> Reviewed-by: Thomas Munro <[email protected]> Reviewed-by: David Rowley <[email protected]> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-08-01Improve programmer docs for simplehash and dynahash.Thomas Munro
When reading the code it's not obvious when one should prefer dynahash over simplehash and vice-versa, so, for programmer-friendliness, add comments to inform that decision. Show sample simplehash method signatures. Author: James Coleman <[email protected]> Discussion: https://postgr.es/m/CAAaqYe_dOF39gAJ8rL-a3YO3Qo96MHMRQ2whFjK5ZcU6YvMQSA%40mail.gmail.com
2020-07-14Fix -Wcast-function-type warningsPeter Eisentraut
Three groups of issues needed to be addressed: load_external_function() and related functions returned PGFunction, even though not necessarily all callers are looking for a function of type PGFunction. Since these functions are really just wrappers around dlsym(), change to return void * just like dlsym(). In dynahash.c, we are using strlcpy() where a function with a signature like memcpy() is expected. This should be safe, as the new comment there explains, but the cast needs to be augmented to avoid the warning. In PL/Python, methods all need to be cast to PyCFunction, per Python API, but this now runs afoul of these warnings. (This issue also exists in core CPython.) To fix the second and third case, we add a new type pg_funcptr_t that is defined specifically so that gcc accepts it as a special function pointer that can be cast to any other function pointer without the warning. Also add -Wcast-function-type to the standard warning flags, subject to configure check. Reviewed-by: Tom Lane <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/1e97628e-6447-b4fd-e230-d109cec2d584%402ndquadrant.com
2020-05-14Initial pgindent and pgperltidy run for v13.Tom Lane
Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
2020-05-13Dial back -Wimplicit-fallthrough to level 3Alvaro Herrera
The additional pain from level 4 is excessive for the gain. Also revert all the source annotation changes to their original wordings, to avoid back-patching pain. Discussion: https://postgr.es/m/[email protected]
2020-05-12Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGSAlvaro Herrera
Use it at level 4, a bit more restrictive than the default level, and tweak our commanding comments to FALLTHROUGH. (However, leave zic.c alone, since it's external code; to avoid the warnings that would appear there, change CFLAGS for that file in the Makefile.) Author: Julien Rouhaud <[email protected]> Author: Álvaro Herrera <[email protected]> Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol Discussion: https://postgr.es/m/flat/[email protected]
2020-04-08Modify various power 2 calculations to use new helper functionsDavid Rowley
First pass of modifying various places that obtain the next power of 2 of a number and make them use the new functions added in pg_bitutils.h instead. This also removes the _hash_log2() function. There are no longer any callers in core. Other users can swap their _hash_log2(n) call to make use of pg_ceil_log2_32(n). Author: David Fetter, with some minor adjustments by me Reviewed-by: John Naylor, Jesse Zhang Discussion: https://postgr.es/m/20200114173553.GE32763%40fetter.org
2020-02-27Move src/backend/utils/hash/hashfn.c to src/commonRobert Haas
This also involves renaming src/include/utils/hashutils.h, which becomes src/include/common/hashfn.h. Perhaps an argument can be made for keeping the hashutils.h name, but it seemed more consistent to make it match the name of the file, and also more descriptive of what is actually going on here. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not to break the Windows build from Davinder Singh and Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24Adapt hashfn.c and hashutils.h for frontend use.Robert Haas
hash_any() and its various variants are defined to return Datum, which is a backend-only concept, but the underlying functions actually want to return uint32 and uint64, and only return Datum because it's convenient for callers who are using them to implement a hash function for some SQL datatype. However, changing these functions to return uint32 and uint64 seems like it might lead to programming errors or back-patching difficulties, both because they are widely used and because failure to use UInt{32,64}GetDatum() might not provoke a compilation error. Instead, rename the existing functions as well as changing the return type, and add static inline wrappers for those callers that need the previous behavior. Although this commit adapts hashutils.h and hashfn.c so that they can be compiled as frontend code, it does not actually do anything that would cause them to be so compiled. That is left for another commit. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24Put all the prototypes for hashfn.c into the same header file.Robert Haas
Previously, some of the prototypes for functions in hashfn.c were in utils/hashutils.h and others were in utils/hsearch.h, but that is confusing and has no particular benefit. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24Move bitmap_hash and bitmap_match to bitmapset.c.Robert Haas
The closely-related function bms_hash_value is already defined in that file, and this change means that hashfn.c no longer needs to depend on nodes/bitmapset.h. That gets us closer to allowing use of the hash functions in hashfn.c in frontend code. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-01-01Update copyrights for 2020Bruce Momjian
Backpatch-through: update all files in master, backpatch legal files through 9.4
2019-11-05Split all OBJS style lines in makefiles into one-line-per-entry style.Andres Freund
When maintaining or merging patches, one of the most common sources for conflicts are the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve. By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur. Author: Andres Freund Discussion: https://postgr.es/m/[email protected]
2019-10-19Fix most -Wundef warningsPeter Eisentraut
In some cases #if was used instead of #ifdef in an inconsistent style. Cleaning this up also helps when analyzing cases like 38d8dce61fff09daae0edb6bcdd42b0c7f10ebcd where this makes a difference. There are no behavior changes here, but the change in pg_bswap.h would prevent possible accidental misuse by third-party code. Discussion: https://www.postgresql.org/message-id/flat/3b615ca5-c595-3f1d-fdf7-a429e564f614%402ndquadrant.com
2019-10-16Refresh some incorrect links in pg_crc.c/hMichael Paquier
Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm0LPk9vTGTBPBRv0=fX=94o4r6-DuBbHNeCN2AH5bufLw@mail.gmail.com
2019-08-18Fix incidental warnings from cpluspluscheck.Tom Lane
Remove use of "register" keyword in hashfn.c. It's obsolescent according to recent C++ compilers, and no modern C compiler pays much attention to it either. Also fix one cosmetic warning about signed vs unsigned comparison. Discussion: https://postgr.es/m/[email protected]
2019-05-22Initial pgindent run for v12.Tom Lane
This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/[email protected]
2019-04-19Fix collection of typos and grammar mistakes in docs and commentsMichael Paquier
Author: Justin Pryzby Discussion: https://postgr.es/m/[email protected]
2019-03-11Move hash_any prototype from access/hash.h to utils/hashutils.hAlvaro Herrera
... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c. access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft. Also move the HTEqual strategy number to access/stratnum.h from access/hash.h. To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.) Discussion: https://postgr.es/m/[email protected]
2019-01-02Update copyright for 2019Bruce Momjian
Backpatch-through: certain files through 9.4
2018-03-27Allow memory contexts to have both fixed and variable ident strings.Tom Lane
Originally, we treated memory context names as potentially variable in all cases, and therefore always copied them into the context header. Commit 9fa6f00b1 rethought this a little bit and invented a distinction between fixed and variable names, skipping the copy step for the former. But we can make things both simpler and more useful by instead allowing there to be two parts to a context's identification, a fixed "name" and an optional, variable "ident". The name supplied in the context create call is now required to be a compile-time-constant string in all cases, as it is never copied but just pointed to. The "ident" string, if wanted, is supplied later. This is needed because typically we want the ident to be stored inside the context so that it's cleaned up automatically on context deletion; that means it has to be copied into the context before we can set the pointer. The cost of this approach is basically just an additional pointer field in struct MemoryContextData, which isn't much overhead, and is bought back entirely in the AllocSet case by not needing a headerSize field anymore, since we no longer have to cope with variable header length. In addition, we can simplify the internal interfaces for memory context creation still further, saving a few cycles there. And it's no longer true that a custom identifier disqualifies a context from participating in aset.c's freelist scheme, so possibly there's some win on that end. All the places that were using non-compile-time-constant context names are adjusted to put the variable info into the "ident" instead. This allows more effective identification of those contexts in many cases; for example, subsidary contexts of relcache entries are now identified by both type (e.g. "index info") and relname, where before you got only one or the other. Contexts associated with PL function cache entries are now identified more fully and uniformly, too. I also arranged for plancache contexts to use the query source string as their identifier. This is basically free for CachedPlanSources, as they contained a copy of that string already. We pay an extra pstrdup to do it for CachedPlans. That could perhaps be avoided, but it would make things more fragile (since the CachedPlanSource is sometimes destroyed first). I suspect future improvements in error reporting will require CachedPlans to have a copy of that string anyway, so it's not clear that it's worth moving mountains to avoid it now. This also changes the APIs for context statistics routines so that the context-specific routines no longer assume that output goes straight to stderr, nor do they know all details of the output format. This is useful immediately to reduce code duplication, and it also allows for external code to do something with stats output that's different from printing to stderr. The reason for pushing this now rather than waiting for v12 is that it rethinks some of the API changes made by commit 9fa6f00b1. Seems better for extension authors to endure just one round of API changes not two. Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-01-03Update copyright for 2018Bruce Momjian
Backpatch-through: certain files through 9.3
2017-12-13Rethink MemoryContext creation to improve performance.Tom Lane
This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now *required* to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/[email protected]
2017-11-08Change TRUE/FALSE to true/falsePeter Eisentraut
The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <[email protected]>
2017-08-02Remove broken and useless entry-count printing in HASH_DEBUG code.Tom Lane
init_htab(), with #define HASH_DEBUG, prints a bunch of hashtable parameters. It used to also print nentries, but commit 44ca4022f changed that to "hash_get_num_entries(hctl)", which is wrong (the parameter should be "hashp"). Rather than correct the coding, though, let's just remove that field from the printout. The table must be empty, since we just finished building it, so expensively calculating the number of entries is rather pointless. Moreover hash_get_num_entries makes assumptions (about not needing locks) which we could do without in debugging code. Noted by Choi Doo-Won in bug #14764. Back-patch to 9.6 where the faulty code was introduced. Discussion: https://postgr.es/m/[email protected]
2017-07-22Improve comments about partitioned hash table freelists.Tom Lane
While I couldn't find any live bugs in commit 44ca4022f, the comments seemed pretty far from adequate; in particular it was not made plain that "borrowing" entries from other freelists is critical for correctness. Try to improve the commentary. A couple of very minor code style tweaks, as well. Discussion: https://postgr.es/m/[email protected]
2017-06-21Phase 3 of pgindent updates.Tom Lane
Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2017-06-21Phase 2 of pgindent updates.Tom Lane
Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2017-06-21Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane
The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]