Skip to content

pebble: use ReadPropertiesBlock when accessing reader props #4775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 30, 2025

Conversation

xinhaoz
Copy link
Member

@xinhaoz xinhaoz commented May 27, 2025

pebble: close file cache before closing the objstorage provider

It seems that during close, the file cache may attempt to access structures
managed by the objstroage provider during the eviction process, such as sending
to channels used for tracing events. We should close the file cache before the
obj storage provider.


pebble: use ReadPropertiesBlock when accessing reader props

Remove remaining reads of Properites on the reader struct. This commit also
removes the Properties field from the reader.

Closes: #4383


Previous: #4773

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@xinhaoz xinhaoz force-pushed the replace-props-pebble branch 6 times, most recently from c1944f9 to 25b6685 Compare May 28, 2025 16:42
@xinhaoz xinhaoz marked this pull request as ready for review May 28, 2025 17:14
@xinhaoz xinhaoz requested a review from a team as a code owner May 28, 2025 17:14
@xinhaoz xinhaoz requested review from jbowens and RaduBerinde May 28, 2025 17:14
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on @jbowens)


objstorage/objstorageprovider/objiotracing/obj_io_tracing_on.go line 323 at r2 (raw file):

func (g *eventGenerator) flush() {
	if g.buf.num > 0 && !g.t.closed.Load() {
		g.t.workerDataCh <- g.buf

In principle, there is still a race here if the goroutine gets preempted right before sending on the channel.

Do you have the stack trace with the crash when we flushed after Close()? It suggests we were doing object IO after closing the objstorage provider, which shouldn't happen.

@xinhaoz xinhaoz force-pushed the replace-props-pebble branch from 25b6685 to f8cd50f Compare May 28, 2025 20:44
Copy link
Member Author

@xinhaoz xinhaoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 1 unresolved discussion (waiting on @jbowens and @RaduBerinde)


objstorage/objstorageprovider/objiotracing/obj_io_tracing_on.go line 323 at r2 (raw file):

Previously, RaduBerinde wrote…

In principle, there is still a race here if the goroutine gets preempted right before sending on the channel.

Do you have the stack trace with the crash when we flushed after Close()? It suggests we were doing object IO after closing the objstorage provider, which shouldn't happen.

Ahh yes you're right. Here's the stack trace, I also just dropped the commit since it's not actually fixing what we want: https://github.com/cockroachdb/pebble/actions/runs/15286733028/job/42998483391?pr=4775

It seems that the objprovider is closed before the file cache -- should we just close the file cache first or must it be closed last here?

pebble/db.go

Line 1792 in 2fd9454

err = firstError(err, d.objProvider.Close())

@xinhaoz
Copy link
Member Author

xinhaoz commented May 28, 2025

objstorage/objstorageprovider/objiotracing/obj_io_tracing_on.go line 323 at r2 (raw file):

Previously, xinhaoz (Xin Hao Zhang) wrote…

Ahh yes you're right. Here's the stack trace, I also just dropped the commit since it's not actually fixing what we want: https://github.com/cockroachdb/pebble/actions/runs/15286733028/job/42998483391?pr=4775

It seems that the objprovider is closed before the file cache -- should we just close the file cache first or must it be closed last here?

pebble/db.go

Line 1794 in 2fd9454

// If the options include a closer to 'close' the filesystem, close it.

Should also be failing on CI now. Also I was unsure if this is an even an issue with objprovider tracing at all, but I couldn't see anything obvious with these changes since we didn't add any reader.close calls here. LMK if anyone sees anything odd with the file cache interactions here.

@xinhaoz xinhaoz force-pushed the replace-props-pebble branch from f8cd50f to e84181c Compare May 29, 2025 15:02
@xinhaoz xinhaoz requested a review from RaduBerinde May 29, 2025 15:03
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 11 of 11 files at r3, 9 of 10 files at r4, all commit messages.
Reviewable status: 10 of 11 files reviewed, 5 unresolved discussions (waiting on @RaduBerinde)


-- commits line 11 at r4:
nit: Properties


table_stats.go line 512 at r4 (raw file):

	addPhysicalTableStats := func(r *sstable.Reader, _ sstable.ReadEnv) (err error) {
		props, err := r.ReadPropertiesBlock(context.TODO(), nil /* buffer pool */)

This block read is good for now, but I think it's probably worth avoiding in a follow-up by adding NumEntries, RawKeySize and RawValueSize to TableMetadata.TableStats. Want to leave a TODO/ file an issue?


table_stats.go line 518 at r4 (raw file):

		entryCount += file.Stats.NumEntries
		keySum += v.Properties.RawKeySize
		valSum += v.Properties.RawValueSize

I think the existing code here has a bug. It should be scaling the properties for virtual tables, like we do higher up:

			props := r.Properties.CommonProperties
			if meta.Virtual {
				props = r.Properties.GetScaledProperties(meta.TableBacking.Size, meta.Size)
			}

I filed #4790 for this. You could fix it up in this PR, or leave it for now. Up to you. I think we can also combine addPhysicalTableStats and addVirtualTableStats into one now. I think we overlooked that in #4375


ingest.go line 297 at r4 (raw file):

	containsRangeDeletions := r.Attributes.Intersects(sstable.AttributeRangeDels | sstable.AttributeRangeKeyDels)
	if tf.BlockColumnar() || !containsRangeDeletions {

nit: I think we should make the loading of the properties block unconditional here, because going forward we expect almost all tables to be BlockColumnar() and it'll simplify the code a bit

Copy link
Member Author

@xinhaoz xinhaoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 11 files reviewed, 3 unresolved discussions (waiting on @jbowens and @RaduBerinde)


-- commits line 11 at r4:

Previously, jbowens (Jackson Owens) wrote…

nit: Properties

done


ingest.go line 297 at r4 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: I think we should make the loading of the properties block unconditional here, because going forward we expect almost all tables to be BlockColumnar() and it'll simplify the code a bit

Done.


table_stats.go line 512 at r4 (raw file):

Previously, jbowens (Jackson Owens) wrote…

This block read is good for now, but I think it's probably worth avoiding in a follow-up by adding NumEntries, RawKeySize and RawValueSize to TableMetadata.TableStats. Want to leave a TODO/ file an issue?

Filed #4792!


table_stats.go line 518 at r4 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I think the existing code here has a bug. It should be scaling the properties for virtual tables, like we do higher up:

			props := r.Properties.CommonProperties
			if meta.Virtual {
				props = r.Properties.GetScaledProperties(meta.TableBacking.Size, meta.Size)
			}

I filed #4790 for this. You could fix it up in this PR, or leave it for now. Up to you. I think we can also combine addPhysicalTableStats and addVirtualTableStats into one now. I think we overlooked that in #4375

Done.

xinhaoz added 2 commits May 29, 2025 18:13
It seems that during close, the file cache may attempt to access structures
managed by the objstroage provider during the eviction process, such as sending
to channels used for tracing events. We should close the file cache before the
obj storage provider.
Remove remaining reads of `Properties` on the reader struct. This commit also
removes the Properties field from the reader.

Closes: cockroachdb#4383
Closes: cockroachdb#4790

There are no more reads of `Properties` so we can remove this field now.
@xinhaoz xinhaoz force-pushed the replace-props-pebble branch from e84181c to b9a9caa Compare May 29, 2025 22:14
@xinhaoz xinhaoz merged commit 173056b into cockroachdb:master May 30, 2025
6 checks passed
@xinhaoz xinhaoz deleted the replace-props-pebble branch May 30, 2025 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sstable: reduce memory consumption from properties
4 participants