Bug #19700

closed

TestProcess#test_execopts_redirect_open_fifo_interrupt_print is flaky on macOS

Added by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago. Updated about 2 years ago.

Status:
Feedback
Assignee:
-
Target version:
-
[ruby-core:113714]

Description

The test TestProcess#test_execopts_redirect_open_fifo_interrupt_print in test_process.rb is flaky on macOS for me. Sometimes, it just hangs forever.

This test is testing what happens when:

  • You have two processes
  • One is blocked opening a FIFO for reading (to redirect to a child process)
  • The other one sends a signal to that process
  • And then opens the FIFO for writing (which should unblock the child process)
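
The scenario in the list above can be approximated with a small standalone script. This is only a simplified sketch, not the actual test from test_process.rb: the child here calls File.open directly instead of going through a spawn redirect, and the helper name fifo_interrupt_roundtrip, the USR1 signal, the 0.1 second delay and the 10 second timeout are all assumptions made for illustration.

```ruby
require "tmpdir"
require "timeout"

# Hypothetical helper (not part of test_process.rb): returns what the child
# read from the FIFO after being signalled around the time it blocks in open.
def fifo_interrupt_roundtrip(message)
  Dir.mktmpdir do |dir|
    fifo = File.join(dir, "fifo")
    File.mkfifo(fifo)
    ready_r, ready_w = IO.pipe   # child -> parent: "signal handler installed"
    out_r, out_w = IO.pipe       # child -> parent: data read from the FIFO

    child = fork do
      ready_r.close
      out_r.close
      Signal.trap(:USR1) {}      # no-op handler: the signal interrupts open without killing the child
      ready_w.write("r")
      ready_w.close
      data = File.open(fifo, "r") { |f| f.read }  # blocks until a writer opens the FIFO
      out_w.write(data)
      out_w.close
      exit!(0)                   # skip at_exit handlers and tmpdir cleanup in the child
    end
    ready_w.close
    out_w.close

    begin
      # Assumed timeout so the sketch fails instead of hanging forever.
      Timeout.timeout(10) do
        ready_r.read(1)          # wait until the child's handler is installed
        sleep 0.1                # best effort: let the child reach the blocking open
        Process.kill(:USR1, child)
        File.open(fifo, "w") { |f| f.write(message) }  # blocks until the child's open succeeds
        out_r.read               # returns at EOF, once the child closes its end
      end
    ensure
      Process.kill(:KILL, child) rescue nil
      Process.wait(child)
      [ready_r, out_r].each(&:close)
    end
  end
end

puts fifo_interrupt_roundtrip("hello").inspect
```

On a system that behaves the way the test expects, the child should end up reading exactly what the parent wrote. The sleep does not guarantee that the child is already inside open when the signal arrives, so this only approximates the timing the real test exercises.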

When this test hangs forever, for me,

  • The child process is blocked on opening the FIFO (i.e. it's waiting for a writer)
  • But the parent process successfully wrote data to the FIFO already somehow (this shouldn't be possible - the parent should have been blocked opening the FIFO until the open succeeded in the child).
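
The expectation behind the second point is the usual FIFO rule: a blocking open for writing should not return until some process has the FIFO open for reading. One way to observe that rule without blocking is a non-blocking open for writing, which POSIX specifies to fail with ENXIO when there is no reader. The sketch below is illustrative only; fifo_has_reader? is a hypothetical helper and is not taken from the gist or from the Ruby test suite.

```ruby
require "tmpdir"

# Hypothetical helper: reports whether some process currently has the FIFO
# open for reading. A non-blocking open for writing fails with ENXIO when
# there is no reader, instead of blocking as a normal open would.
def fifo_has_reader?(path)
  File.open(path, File::WRONLY | File::NONBLOCK) { true }
rescue Errno::ENXIO
  false
end

Dir.mktmpdir do |dir|
  fifo = File.join(dir, "fifo")
  File.mkfifo(fifo)

  puts fifo_has_reader?(fifo)   # nobody has opened the FIFO for reading yet

  # A non-blocking open for reading returns immediately, even without a writer.
  reader = File.open(fifo, File::RDONLY | File::NONBLOCK)
  puts fifo_has_reader?(fifo)   # now a reader exists
  reader.close
end
```

In the hang described above, the parent's write having gone through while the child is still waiting in open would mean this rule was not upheld for the blocking case.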

Actually I believe this is a bug in macOS. The following program will fail with write (parent): Broken pipe on macOS when run in a loop, but works correctly on Linux: https://gist.github.com/KJTsanaktsidis/fc84b006cfff1bb0b55a2571df825d80

I'm going to open a PR to skip this test on macOS because I believe the operating system is broken here, and as far as I can tell Ruby is doing the correct thing.

Updated by nobu (Nobuyoshi Nakada) about 2 years ago

  • Status changed from Open to Feedback

I haven't seen those failures on macOS.

ProductName:		macOS
ProductVersion:		13.4
BuildVersion:		22F66

Also your test program runs fine.

Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago

I think it's the same failure as these:

Is it an ARM thing perhaps? My MacBook is an M1 on 13.4 (22F66) as well.

I was going to suggest it could have something to do with the CrowdStrike security stack on my Mac (work machine), which definitely gets its hooks into open(2) system calls, but it seems we have a similar failure on RubyCI (which I assume doesn't have any of that stuff on it?). Maybe CrowdStrike makes the flakiness more likely, but it's present regardless?

I'll do a bit of a survey of Macs around the office this week and see if I can find a differentiating factor with that test program.

Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago

It seems from my survey around the office that my test program works on Intel Macs and crashes on ARM ones. I opened a bug report with Apple about this (FB12251512).
