| Lists: | pgsql-hackers |
|---|
| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 05:31:04 |
| Message-ID: | CAFj8pRCNTi2yHBXcdYf-cYZ63R8Laf9L49Q_uxt+A5WXKPPhxg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
Hi
maybe I found a bug in xmlserialize
SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS
varchar INDENT);
(2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
'<foo><bar><val x="y">42</val></bar></foo>' AS varchar INDENT);
┌─────────────────────────┐
│ xmlserialize │
╞═════════════════════════╡
│ <foo> ↵│
│ <bar> ↵│
│ <val x="y">42</val>↵│
│ </bar> ↵│
│ </foo> ↵│
│ │
└─────────────────────────┘
(1 row)
Looks so there is an extra empty row.
Regards
Pavel
| From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
|---|---|
| To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 12:02:17 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
On 23.04.23 07:31, Pavel Stehule wrote:
> Hi
>
> maybe I found a bug in xmlserialize
>
> SELECT xmlserialize(DOCUMENT '<foo><bar><val
> x="y">42</val></bar></foo>' AS varchar INDENT);
>
> (2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
> '<foo><bar><val x="y">42</val></bar></foo>' AS varchar INDENT);
> ┌─────────────────────────┐
> │ xmlserialize │
> ╞═════════════════════════╡
> │ <foo> ↵│
> │ <bar> ↵│
> │ <val x="y">42</val>↵│
> │ </bar> ↵│
> │ </foo> ↵│
> │ │
> └─────────────────────────┘
> (1 row)
>
> Looks so there is an extra empty row.
>
> Regards
>
> Pavel
Hi Pavel,
Good catch! It looks like it comes directly from libxml2.
xmlDocPtr doc = xmlReadDoc(BAD_CAST "<foo><bar><val
x=\"y\">42</val></bar></foo>", NULL, NULL, 0 );
xmlBufferPtr buf = NULL;
xmlSaveCtxtPtr ctxt = NULL;
buf = xmlBufferCreate();
ctxt = xmlSaveToBuffer(buf, NULL, XML_SAVE_NO_DECL | XML_SAVE_FORMAT);
xmlSaveDoc(ctxt, doc);
xmlSaveClose(ctxt);
printf("'%s'",buf->content);
==>
'<foo>
<bar>
<val x="y">42</val>
</bar>
</foo>
'
I'll do some digging to see if there is a good way to get rid of this
newline or if we need to chose a different dump function.
Thanks!
Best, Jim
| From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
|---|---|
| To: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
| Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 12:25:35 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
> On Sun, Apr 23, 2023 at 02:02:17PM +0200, Jim Jones wrote:
> On 23.04.23 07:31, Pavel Stehule wrote:
> > Hi
> >
> > maybe I found a bug in xmlserialize
> >
> > SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>'
> > AS varchar INDENT);
> >
> > (2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
> > '<foo><bar><val x="y">42</val></bar></foo>' AS varchar INDENT);
> > ┌─────────────────────────┐
> > │ xmlserialize │
> > ╞═════════════════════════╡
> > │ <foo> ↵│
> > │ <bar> ↵│
> > │ <val x="y">42</val>↵│
> > │ </bar> ↵│
> > │ </foo> ↵│
> > │ │
> > └─────────────────────────┘
> > (1 row)
> >
> > Looks so there is an extra empty row.
> >
> > Regards
> >
> > Pavel
>
> Hi Pavel,
>
> Good catch! It looks like it comes directly from libxml2.
>
> xmlDocPtr doc = xmlReadDoc(BAD_CAST "<foo><bar><val
> x=\"y\">42</val></bar></foo>", NULL, NULL, 0 );
> xmlBufferPtr buf = NULL;
> xmlSaveCtxtPtr ctxt = NULL;
>
> buf = xmlBufferCreate();
> ctxt = xmlSaveToBuffer(buf, NULL, XML_SAVE_NO_DECL | XML_SAVE_FORMAT);
>
> xmlSaveDoc(ctxt, doc);
> xmlSaveClose(ctxt);
>
> printf("'%s'",buf->content);
>
> ==>
>
> '<foo>
> <bar>
> <val x="y">42</val>
> </bar>
> </foo>
> '
It looks like this happens only if xml type is DOCUMENT, and thus
xmlSaveDoc is used to save the doc directly. I might be wrong, but after
a quick look at the corresponding libxml functionality it seems that a
new line is getting added if the element type is not XML_XINCLUDE_{START|END},
which is unfortunate if correct.
| From: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
|---|---|
| To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 12:41:55 |
| Message-ID: | CAMsGm5eEhP5AnmmLANEPzwzm062WXOpjSCaSqCfFETdmz8bQUA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
On Sun, 23 Apr 2023 at 01:31, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Hi
>
> maybe I found a bug in xmlserialize
>
> SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>'
> AS varchar INDENT);
>
> (2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
> '<foo><bar><val x="y">42</val></bar></foo>' AS varchar INDENT);
> ┌─────────────────────────┐
> │ xmlserialize │
> ╞═════════════════════════╡
> │ <foo> ↵│
> │ <bar> ↵│
> │ <val x="y">42</val>↵│
> │ </bar> ↵│
> │ </foo> ↵│
> │ │
> └─────────────────────────┘
> (1 row)
>
> Looks so there is an extra empty row.
>
I wouldn't necessarily worry about this much. There is not, as such, an
extra blank line at the end; rather it is conventional that a text file
should end with a newline character. That is, conventionally every single
line in a text file ends with a newline character, meaning the only text
file that doesn't end with a newline is the empty file. You can see this in
tools like diff, which explicitly report "no newline at end of file" if the
file ends with a different character.
If you were to save the value to a file you would probably want it the way
it is.
That being said, this is a database column result and I agree it would look
more elegant if the blank line in the display were not there. I might go so
far as to change the psql display routines to not leave a blank line after
the content in the event it ends with a newline.
| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
| Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 14:48:09 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> writes:
> On 23.04.23 07:31, Pavel Stehule wrote:
>> Looks so there is an extra empty row.
> Good catch! It looks like it comes directly from libxml2.
Is it really a bug? If libxml2 itself is putting in that newline,
I'm not sure we should take it on ourselves to strip it.
> I'll do some digging to see if there is a good way to get rid of this
> newline or if we need to chose a different dump function.
If we do want to strip it, I'd just add a couple lines of code
to delete any trailing newlines at the end of the process.
Compare, eg, pchomp().
regards, tom lane
| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 14:48:37 |
| Message-ID: | CAFj8pRBD3+9XGniHP4cTAdUJ5hv+8YBWNqt-NybEveH87-18Qg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
ne 23. 4. 2023 v 14:42 odesílatel Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
napsal:
> On Sun, 23 Apr 2023 at 01:31, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>> Hi
>>
>> maybe I found a bug in xmlserialize
>>
>> SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>'
>> AS varchar INDENT);
>>
>> (2023-04-23 07:27:53) postgres=# SELECT xmlserialize(DOCUMENT
>> '<foo><bar><val x="y">42</val></bar></foo>' AS varchar INDENT);
>> ┌─────────────────────────┐
>> │ xmlserialize │
>> ╞═════════════════════════╡
>> │ <foo> ↵│
>> │ <bar> ↵│
>> │ <val x="y">42</val>↵│
>> │ </bar> ↵│
>> │ </foo> ↵│
>> │ │
>> └─────────────────────────┘
>> (1 row)
>>
>> Looks so there is an extra empty row.
>>
>
> I wouldn't necessarily worry about this much. There is not, as such, an
> extra blank line at the end; rather it is conventional that a text file
> should end with a newline character. That is, conventionally every single
> line in a text file ends with a newline character, meaning the only text
> file that doesn't end with a newline is the empty file. You can see this in
> tools like diff, which explicitly report "no newline at end of file" if the
> file ends with a different character.
>
> If you were to save the value to a file you would probably want it the way
> it is.
>
> That being said, this is a database column result and I agree it would
> look more elegant if the blank line in the display were not there. I might
> go so far as to change the psql display routines to not leave a blank line
> after the content in the event it ends with a newline.
>
psql shows to display content without changes. I don't think it should be
fixed on the client side.
But it can be easily fixed on the server side. I think there is some code
that tries to clean the end lines already.
regards
Pavel
| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 14:50:41 |
| Message-ID: | CAFj8pRC_7hH9McSbXnmHhZxdDjVDpJ6wF+5pxxESBEdA2RHXHA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
ne 23. 4. 2023 v 16:48 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
> Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> writes:
> > On 23.04.23 07:31, Pavel Stehule wrote:
> >> Looks so there is an extra empty row.
>
> > Good catch! It looks like it comes directly from libxml2.
>
> Is it really a bug? If libxml2 itself is putting in that newline,
> I'm not sure we should take it on ourselves to strip it.
>
Maybe It is not a bug, but it can be messy. "json_pretty" doesn't do it.
>
> > I'll do some digging to see if there is a good way to get rid of this
> > newline or if we need to chose a different dump function.
>
> If we do want to strip it, I'd just add a couple lines of code
> to delete any trailing newlines at the end of the process.
> Compare, eg, pchomp().
>
yes
Pavel
>
> regards, tom lane
>
| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
| Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 14:52:27 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
Isaac Morland <isaac(dot)morland(at)gmail(dot)com> writes:
> That being said, this is a database column result and I agree it would look
> more elegant if the blank line in the display were not there.
Yeah, that would basically be the argument for editorializing on libxml2's
result. It's a weak argument, but not entirely without merit.
> I might go so
> far as to change the psql display routines to not leave a blank line after
> the content in the event it ends with a newline.
psql has *no* business changing what it displays: if what came from the
server has a trailing newline, that had better be made visible. Even if
we thought it was a good idea, it's about 25 years too late to reconsider
that. However, xmlserialize()'s new indenting behavior isn't set in
stone yet, so we could contemplate chomping newlines within that.
regards, tom lane
| From: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 16:03:00 |
| Message-ID: | CAMsGm5f1z0A7G3edJTQB5iDzT2pHVx_NiGGz9GZju9pXx412xQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
On Sun, 23 Apr 2023 at 10:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Isaac Morland <isaac(dot)morland(at)gmail(dot)com> writes:
>
> > I might go so
> > far as to change the psql display routines to not leave a blank line
> after
> > the content in the event it ends with a newline.
>
> psql has *no* business changing what it displays: if what came from the
> server has a trailing newline, that had better be made visible. Even if
> we thought it was a good idea, it's about 25 years too late to reconsider
> that. However, xmlserialize()'s new indenting behavior isn't set in
> stone yet, so we could contemplate chomping newlines within that.
The trailing newline is made visible by the little bent arrow character
that appears at the right hand side. So you could still tell whether the
value ended with a trailing newline. I agree that simply dropping the
trailing newline before displaying the value would be a very bad idea.
| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 16:28:35 |
| Message-ID: | CAFj8pRBvfM=a4SYHyv7bTCYmXt=Fo5GcsNuOtctQO5CApRrYHA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
Dne ne 23. 4. 2023 18:03 uživatel Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
napsal:
> On Sun, 23 Apr 2023 at 10:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Isaac Morland <isaac(dot)morland(at)gmail(dot)com> writes:
>>
>
>
>> > I might go so
>> > far as to change the psql display routines to not leave a blank line
>> after
>> > the content in the event it ends with a newline.
>>
>> psql has *no* business changing what it displays: if what came from the
>> server has a trailing newline, that had better be made visible. Even if
>> we thought it was a good idea, it's about 25 years too late to reconsider
>> that. However, xmlserialize()'s new indenting behavior isn't set in
>> stone yet, so we could contemplate chomping newlines within that.
>
>
> The trailing newline is made visible by the little bent arrow character
> that appears at the right hand side. So you could still tell whether the
> value ended with a trailing newline. I agree that simply dropping the
> trailing newline before displaying the value would be a very bad idea.
>
What is benefit or usage of this trailing newline?
Personally, when I do some formatting I prefer adding newlines on client
side before necessity of removing.
>
| From: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
|---|---|
| To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 17:46:45 |
| Message-ID: | CAMsGm5e1BN4qYEYgdX++UcUKSCMSDLVbC11F3eSr1yHmneizhg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
On Sun, 23 Apr 2023 at 12:28, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>
> Dne ne 23. 4. 2023 18:03 uživatel Isaac Morland <isaac(dot)morland(at)gmail(dot)com>
> napsal:
>
>> On Sun, 23 Apr 2023 at 10:52, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Isaac Morland <isaac(dot)morland(at)gmail(dot)com> writes:
>>>
>>
>>
>>> > I might go so
>>> > far as to change the psql display routines to not leave a blank line
>>> after
>>> > the content in the event it ends with a newline.
>>>
>>> psql has *no* business changing what it displays: if what came from the
>>> server has a trailing newline, that had better be made visible. Even if
>>> we thought it was a good idea, it's about 25 years too late to reconsider
>>> that. However, xmlserialize()'s new indenting behavior isn't set in
>>> stone yet, so we could contemplate chomping newlines within that.
>>
>>
>> The trailing newline is made visible by the little bent arrow character
>> that appears at the right hand side. So you could still tell whether the
>> value ended with a trailing newline. I agree that simply dropping the
>> trailing newline before displaying the value would be a very bad idea.
>>
>
> What is benefit or usage of this trailing newline?
>
When creating a text file, it is conventional to end it with a newline.
Every single line of the file is ended with a newline, including the last
line of a file. Various tools deal with text files which are missing the
newline on the last line in various ways depending on context. If you "cat"
a file which is missing its trailing newline, the command prompt naturally
ends up on the same line of the display as the characters after the last
newline. Tools like “less” often adjust their display so the presence or
absence of the trailing newline makes no difference.
So it’s not so much about benefit or usage, it’s about what text files
normally contain. Newline characters are used as line terminators, not line
separators.
Of course, it's conventional for a database value not to end with a
newline. If I store a person's name in the database, it would be weird to
append a newline to the end. Here we have serialized XML which we tend to
think of storing in a text file — where one would expect it to end with a
newline — but we're displaying it in a table cell as part of the output of
a database query, where one typically does not expect values to end with a
newline (although they can, and psql displays values differently even if
they differ only in the presence or absence of a newline at the end).
If you were to load a typical text file into a column of a database row and
display it using psql, you would see the same phenomenon.
My inclination would be, if we're just calling to a long-standardized
library routine, to just accept its output as is. If a program is saving
the output to a text file, that would be the expected behaviour. If not,
then we need to document that the output of our function is the output of
the library function, minus the trailing newline.
Personally, when I do some formatting I prefer adding newlines on client
> side before necessity of removing.
>
| From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
|---|---|
| To: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-23 21:56:39 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
> My inclination would be, if we're just calling to a long-standardized
> library routine, to just accept its output as is. If a program is
> saving the output to a text file, that would be the expected
> behaviour. If not, then we need to document that the output of our
> function is the output of the library function, minus the trailing
> newline.
After some digging on the matter, I'd also tend to leave the output as
is, but I also do understand the other arguments - specially the
consistency with jsonb_pretty().
If we agree to remove it, the change wouldn't be substantial :) I guess
we could just pchomp it in the end of the function, as suggested by Tom.
Attached a draft patch.
Best, Jim
| Attachment | Content-Type | Size |
|---|---|---|
| xmlserialize-remove-trailing-newline.diff | text/x-patch | 4.5 KB |
| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
| Cc: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-24 01:18:03 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> writes:
> If we agree to remove it, the change wouldn't be substantial :) I guess
> we could just pchomp it in the end of the function, as suggested by Tom.
> Attached a draft patch.
I wouldn't actually *use* pchomp here, because that induces an unnecessary
copy of the result string. I had in mind more like copying pchomp's code
to count up the trailing newline(s) and then pass a corrected length
to cstring_to_text_with_len.
You could simplify matters by doing that in all cases, too. It should
never find anything to remove in the non-indented case, but the check
should be of negligible cost in context.
regards, tom lane
| From: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: xmlserialize bug - extra empty row at the end |
| Date: | 2023-04-24 08:29:27 |
| Message-ID: | [email protected] |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Lists: | pgsql-hackers |
On 24.04.23 03:18, Tom Lane wrote:
> I wouldn't actually *use* pchomp here, because that induces an
> unnecessary
> copy of the result string. I had in mind more like copying pchomp's code
> to count up the trailing newline(s) and then pass a corrected length
> to cstring_to_text_with_len.
Changed.
> You could simplify matters by doing that in all cases, too. It should
> never find anything to remove in the non-indented case, but the check
> should be of negligible cost in context.
I'm not sure I understood it correctly.
The non-indented cases should never find anything and indented cases
with CONTENT strings do not add trailing newlines, so this is only
applicable with DOCUMENT .. INDENT, right?
Something like this would suffice?
if(xmloption_arg != XMLOPTION_DOCUMENT)
result = (text *) xmlBuffer_to_xmltype(buf);
else
{
int len = xmlBufferLength(buf);
const char *xmloutput = (const char *) xmlBufferContent(buf);
while (len > 0 && xmloutput[len - 1] == '\n')
len--;
result = cstring_to_text_with_len(xmloutput, len);
}
If we really agree on manually removing the trailing newlines I will
open a CF entry for this.
Best, Jim
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Remove-trailing-newlines-from-xmlserialize-indent.patch | text/x-patch | 5.3 KB |