micro-optimize nbtcompare.c routines

Lists: pgsql-hackers
From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: micro-optimize nbtcompare.c routines
Date: 2024-09-26 20:17:03
Message-ID: ZvXBP3Rbfmris0yf@nathan
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Here's a patch that adjusts several routines in nbtcompare.c and related
files to use the branchless integer comparison functions added in commit
6b80394. It's probably unlikely this produces a measurable benefit (at
least I've been unable to find any in my admittedly-limited testing), but
in theory it should save a cycle here and there. I was hoping that this
would trim many lines of code, but maintaining the STRESS_SORT_INT_MIN
stuff eats up most of what we save.

Anyway, I don't feel too strongly about this patch, but I went to the
trouble of writing it, and so I figured I'd post it.

--
nathan

Attachment Content-Type Size
v1-0001-micro-optimize-nbtcompare.c-routines.patch text/plain 8.6 KB

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: micro-optimize nbtcompare.c routines
Date: 2024-09-27 02:50:13
Message-ID: CAApHDvrAi259611bx=EiWBjYhmRyxzUMssVW8E31+vHH2Vachw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 27 Sept 2024 at 08:17, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> Here's a patch that adjusts several routines in nbtcompare.c and related
> files to use the branchless integer comparison functions added in commit
> 6b80394. It's probably unlikely this produces a measurable benefit (at
> least I've been unable to find any in my admittedly-limited testing), but
> in theory it should save a cycle here and there. I was hoping that this
> would trim many lines of code, but maintaining the STRESS_SORT_INT_MIN
> stuff eats up most of what we save.

I had been looking at [1] (which I've added your version to now). I
had been surprised to see gcc emitting different code for the first 3
versions. Clang does a better job at figuring out they all do the same
thing and emitting the same code for each.

I played around with the attached (hacked up) qsort.c to see if there
was any difference. Likely function call overhead kills the
performance anyway. There does not seem to be much difference between
them. I've not tested with an inlined comparison function.

Looking at your version, it doesn't look like there's any sort of
improvement in terms of the instructions. Certainly, for clang, it's
worse as it adds a shift left instruction and an additional compare.
No jumps, at least.

What's your reasoning for returning INT_MIN and INT_MAX?

David

[1] https://godbolt.org/z/33T8h151M

Attachment Content-Type Size
qsort.c text/plain 6.6 KB

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: micro-optimize nbtcompare.c routines
Date: 2024-09-27 03:23:43
Message-ID: ZvYlP4IXbZoI4pQr@nathan
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 27, 2024 at 02:50:13PM +1200, David Rowley wrote:
> I had been looking at [1] (which I've added your version to now). I
> had been surprised to see gcc emitting different code for the first 3
> versions. Clang does a better job at figuring out they all do the same
> thing and emitting the same code for each.

Interesting.

> I played around with the attached (hacked up) qsort.c to see if there
> was any difference. Likely function call overhead kills the
> performance anyway. There does not seem to be much difference between
> them. I've not tested with an inlined comparison function.

I'd expect worse performance with the branchless routines for the inlined
case. However, I recall that clang was able to optimize med3() as well as
it can with the branching routines, so that may not always be true.

> Looking at your version, it doesn't look like there's any sort of
> improvement in terms of the instructions. Certainly, for clang, it's
> worse as it adds a shift left instruction and an additional compare.
> No jumps, at least.

I think I may have forgotten to add -O2 when I was inspecting this code
with godbolt.org earlier. *facepalm* The different versions look pretty
comparable with that added.

> What's your reasoning for returning INT_MIN and INT_MAX?

That's just for the compile option added by commit c87cb5f, which IIUC is
intended to test that we correctly handle comparisons that return INT_MIN.

--
nathan


From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: micro-optimize nbtcompare.c routines
Date: 2024-09-27 15:44:00
Message-ID: ZvbSwJL6SnskVDeA@nathan
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I've marked this one as Withdrawn. Apologies for the noise.

--
nathan