Lists: | pgsql-hackers |
---|
From: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | missing warning in pg_import_system_collations |
Date: | 2021-09-09 06:45:05 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hello hackers,
In pg_import_system_collations() there is this fragment of code:
enc = pg_get_encoding_from_locale(localebuf, false);
if (enc < 0)
{
/* error message printed by pg_get_encoding_from_locale() */
continue;
}
However, false passed to pg_get_encoding_from_locale() means
write_message argument is false, so no error message is ever printed.
I propose an obvious patch (see attachment).
Introduced in aa17c06fb in January 2017 when debug was replaced by
false, so I guess back-patching through 10 would be appropriate.
--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
Attachment | Content-Type | Size |
---|---|---|
pg_import_system_collations-warning.patch | text/x-patch | 522 bytes |
From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-09 11:46:25 |
Message-ID: | CAEudQArVn++oaJbV=Drug4RxbfE7WZ0GzxS1poWbW5L8LxTT=Q@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Em qui., 9 de set. de 2021 às 03:45, Anton Voloshin <
a(dot)voloshin(at)postgrespro(dot)ru> escreveu:
> Hello hackers,
>
> In pg_import_system_collations() there is this fragment of code:
>
> enc = pg_get_encoding_from_locale(localebuf, false);
> if (enc < 0)
> {
> /* error message printed by pg_get_encoding_from_locale() */
> continue;
> }
>
> However, false passed to pg_get_encoding_from_locale() means
> write_message argument is false, so no error message is ever printed.
> I propose an obvious patch (see attachment).
>
Yeah, seems correct to me.
The comment clearly expresses the intention.
> Introduced in aa17c06fb in January 2017 when debug was replaced by
> false, so I guess back-patching through 10 would be appropriate.
>
This is an oversight.
+1 from me.
Ranier Vilela
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-09 14:51:10 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> writes:
> In pg_import_system_collations() there is this fragment of code:
> enc = pg_get_encoding_from_locale(localebuf, false);
> if (enc < 0)
> {
> /* error message printed by pg_get_encoding_from_locale() */
> continue;
> }
> However, false passed to pg_get_encoding_from_locale() means
> write_message argument is false, so no error message is ever printed.
> I propose an obvious patch (see attachment).
> Introduced in aa17c06fb in January 2017 when debug was replaced by
> false, so I guess back-patching through 10 would be appropriate.
I don't think this is obvious at all.
In the original coding (before aa17c06fb, when this code was in initdb),
we printed a warning if "debug" was true and otherwise printed nothing.
The other "if (debug)" cases in the code that got moved over were
translated to "elog(DEBUG1)", but there isn't any API to make
pg_get_encoding_from_locale() log at that level.
What you propose to do here would promote this case from
ought-to-be-DEBUG1 to WARNING, which seems to me to be way too much in the
user's face. Or, if there actually is a case for complaining, then all
those messages ought to be WARNING not DEBUG1. But I'm inclined to think
that having pg_import_system_collations silently ignore unusable locales
is the right thing most of the time.
Assuming we don't want to change pg_get_encoding_from_locale()'s API,
the simplest fix is to duplicate its error message, so more or less
if (enc < 0)
{
- /* error message printed by pg_get_encoding_from_locale() */
+ elog(DEBUG1, "could not determine encoding for locale \"%s\"",
+ localebuf)));
continue;
}
regards, tom lane
From: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-09 15:17:35 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 09/09/2021 21:51, Tom Lane wrote:
> What you propose to do here would promote this case from
> ought-to-be-DEBUG1 to WARNING, which seems to me to be way too much in the
> user's face. Or, if there actually is a case for complaining, then all
> those messages ought to be WARNING not DEBUG1. ...
>
> Assuming we don't want to change pg_get_encoding_from_locale()'s API,
> the simplest fix is to duplicate its error message, so more or less
>
> if (enc < 0)
> {
> - /* error message printed by pg_get_encoding_from_locale() */
> + elog(DEBUG1, "could not determine encoding for locale \"%s\"",
> + localebuf)));
> continue;
> }
Upon thinking a little more, I agree.
The warnings I happen to get from initdb on my current machine (with
many various locales installed, more than on a typical box) are:
performing post-bootstrap initialization ... 2021-09-09 22:04:01.678 +07
[482312] WARNING: could not determine encoding for locale
"hy_AM.armscii8": codeset is "ARMSCII-8"
2021-09-09 22:04:01.679 +07 [482312] WARNING: could not determine
encoding for locale "ka_GE": codeset is "GEORGIAN-PS"
2021-09-09 22:04:01.679 +07 [482312] WARNING: could not determine
encoding for locale "kk_KZ": codeset is "PT154"
2021-09-09 22:04:01.679 +07 [482312] WARNING: could not determine
encoding for locale "kk_KZ.rk1048": codeset is "RK1048"
2021-09-09 22:04:01.686 +07 [482312] WARNING: could not determine
encoding for locale "tg_TJ": codeset is "KOI8-T"
2021-09-09 22:04:01.686 +07 [482312] WARNING: could not determine
encoding for locale "th_TH": codeset is "TIS-620"
ok
While they are definitely interesting as DEBUG1, not so as a WARNING.
So, +1 from me for your proposed elog(DEBUG1, ...); patch. Thank you.
--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-09 18:37:52 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> writes:
> On 09/09/2021 21:51, Tom Lane wrote:
>> Assuming we don't want to change pg_get_encoding_from_locale()'s API,
>> the simplest fix is to duplicate its error message, so more or less
>>
>> if (enc < 0)
>> {
>> - /* error message printed by pg_get_encoding_from_locale() */
>> + elog(DEBUG1, "could not determine encoding for locale \"%s\"",
>> + localebuf)));
>> continue;
>> }
> Upon thinking a little more, I agree.
Another approach we could take is to deem the comment incorrect and
just remove it, codifying the current behavior of silently ignoring
unrecognized encodings. The reason that seems like it might be
appropriate is that the logic immediately below this bit silently
ignores encodings that are known but are frontend-only:
if (!PG_VALID_BE_ENCODING(enc))
continue; /* ignore locales for client-only encodings */
It's sure not very clear to me why one case deserves a message and the
other not. Perhaps they both do, which would lead to adding another
DEBUG1 message here.
regards, tom lane
From: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-10 06:47:48 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 10/09/2021 01:37, Tom Lane wrote:
> Another approach we could take is to deem the comment incorrect and
> just remove it, codifying the current behavior of silently ignoring
> unrecognized encodings. The reason that seems like it might be
> appropriate is that the logic immediately below this bit silently
> ignores encodings that are known but are frontend-only:
>
> if (!PG_VALID_BE_ENCODING(enc))
> continue; /* ignore locales for client-only encodings */
>
> It's sure not very clear to me why one case deserves a message and the
> other not. Perhaps they both do, which would lead to adding another
> DEBUG1 message here.
I'm not an expert in locales, but I think it makes some sense to be
silent about encodings we have consciously decided to ignore as we have
them in our tables, but marked them as frontend-only
(!PG_VALID_BE_ENCODING(enc)).
Just like it makes sense to do give a debug-level warning about
encodings seen in locale -a output but not recognized by us at all
(pg_get_encoding_from_locale(localebuf, false) < 0).
Therefore I think your patch with duplicated error message is better
than what we have currently. I don't see how adding debug-level messages
about skipping frontend-only encodings would be of any significant use here.
Unless someone more experienced in locales' subtleties would like to
chime in.
--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: missing warning in pg_import_system_collations |
Date: | 2021-09-11 16:19:49 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru> writes:
> On 10/09/2021 01:37, Tom Lane wrote:
>> It's sure not very clear to me why one case deserves a message and the
>> other not. Perhaps they both do, which would lead to adding another
>> DEBUG1 message here.
> I'm not an expert in locales, but I think it makes some sense to be
> silent about encodings we have consciously decided to ignore as we have
> them in our tables, but marked them as frontend-only
> (!PG_VALID_BE_ENCODING(enc)).
I'm not really buying that. It seems to me that the only reason anyone
would examine this debug output at all is that they want to know "why
didn't this locale (which I can see in 'locale -a' output) get imported?".
So the only cases I'm inclined to not log about are when we skip a locale
because there's already a matching pg_collation entry.
I experimented with the attached draft patch. The debug output on my
RHEL8 box (with a more-or-less-default set of locales) looks like
2021-09-11 12:13:09.908 EDT [41731] DEBUG: could not identify encoding for locale "hy_AM.armscii8"
2021-09-11 12:13:09.909 EDT [41731] DEBUG: could not identify encoding for locale "ka_GE"
2021-09-11 12:13:09.909 EDT [41731] DEBUG: could not identify encoding for locale "ka_GE.georgianps"
2021-09-11 12:13:09.909 EDT [41731] DEBUG: could not identify encoding for locale "kk_KZ"
2021-09-11 12:13:09.909 EDT [41731] DEBUG: could not identify encoding for locale "kk_KZ.pt154"
2021-09-11 12:13:09.926 EDT [41731] DEBUG: could not identify encoding for locale "tg_TJ"
2021-09-11 12:13:09.926 EDT [41731] DEBUG: could not identify encoding for locale "tg_TJ.koi8t"
2021-09-11 12:13:09.926 EDT [41731] DEBUG: could not identify encoding for locale "th_TH"
2021-09-11 12:13:09.926 EDT [41731] DEBUG: could not identify encoding for locale "th_TH.tis620"
2021-09-11 12:13:09.926 EDT [41731] DEBUG: could not identify encoding for locale "thai"
2021-09-11 12:13:09.929 EDT [41731] DEBUG: skipping client-only locale "zh_CN.gb18030"
2021-09-11 12:13:09.929 EDT [41731] DEBUG: skipping client-only locale "zh_CN.gbk"
2021-09-11 12:13:09.930 EDT [41731] DEBUG: skipping client-only locale "zh_HK"
2021-09-11 12:13:09.930 EDT [41731] DEBUG: skipping client-only locale "zh_HK.big5hkscs"
2021-09-11 12:13:09.930 EDT [41731] DEBUG: skipping client-only locale "zh_SG.gbk"
2021-09-11 12:13:09.930 EDT [41731] DEBUG: skipping client-only locale "zh_TW"
2021-09-11 12:13:09.930 EDT [41731] DEBUG: skipping client-only locale "zh_TW.big5"
I don't see a good reason to think that someone would be less confused
about why we reject zh_HK than why we reject th_TH. So I think if we're
going to worry about this then we should add both messages.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
log-debug-messages-for-rejected-locales.patch | text/x-diff | 814 bytes |