From: "Eregon (Benoit Daloze)"
Date: 2022-01-10T11:59:37+00:00
Subject: [ruby-core:107023] [Ruby master Bug#18465] Make `IO#write` atomic.

Issue #18465 has been updated by Eregon (Benoit Daloze).

ioquatix (Samuel Williams) wrote in #note-6:

> The current implementation before and after this PR makes no such guarantee unfortunately. The best you can do as a user is buffer your own string and call write with that as an argument to get any kind of atomic behaviour, but that only applies to non-sync IOs.

Right, but we could provide this for `puts`, which is very clearly wanted (nobody wants interleaving between the line and the `\n`), if we ensure writev() is used or all strings are joined before a single `write` call.
Re sync IOs, it'd just work with my approach.

> The write lock internally protects `sync=true` IO which has an internal per-IO buffer.

Yes, the write lock feels like the wrong primitive to use here: it's the write buffer's lock, and there might not be one for non-buffered IO (or if there is, then it's just extra cost).

> https://github.com/oracle/truffleruby/blob/bd36e75003a1f2d57dbc947350cb076e9a827cbd/src/main/ruby/truffleruby/core/io.rb#L2375-L2394
>
> How is sync handled? I don't see the internal buffer is used in `def write`.

There is no write buffering in TruffleRuby, we found that:
1) it's not necessary semantically (OTOH it is needed for reading due to `ungetc`, `gets`, etc)
2) it doesn't seem to improve performance in most cases, actually it makes it worse by having extra copies.
3) better memory footprint due to not having a write buffer per IO

> The interface for IO#write has to deal with both buffered and un-buffered operations and so we hooked into the internal read and write functions of IO. The best we can hope for is to tidy up how they are used. We don't provide a strong guarantee between one call to `IO#write` corresponding to one call to `Scheduler#io_write`...
> while it's true in most cases, it's not true in all cases. This is really just an artefact of the complexity of the current implementation in `io.c`.

I think doing my suggestion would fix it: for the writev() case you'd either pass all parts to the hook or join before.
For the non-writev case it would already be joined.

Any concern, however, with taking the approach in https://bugs.ruby-lang.org/issues/18465#note-4?
It provides the stronger guarantee people actually want (e.g., no interleaving with a subprocess sharing stdout/stderr) and seems to only have benefits.

----------------------------------------
Bug #18465: Make `IO#write` atomic.
https://bugs.ruby-lang.org/issues/18465#change-95855

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Right now, `IO#write`, along with everything that calls it (including `IO#puts`), has poorly specified behaviour w.r.t. other fibers/threads that call `IO#write` at the same time. Internally, we have a write lock, however it's only used to lock against individual writes rather than the whole operation. From a user point of view, there is some kind of atomicity, but it's not clearly defined and depends on many factors, e.g. whether `write` or `writev` is used internally.

We propose to make `IO#write` an atomic operation, that is, `IO#write` on a synchronous/buffered IO will always perform the write operation using a lock around the entire operation.

In theory, this should actually be more efficient than the current approach, which may acquire and release the lock several times per operation; however, in practice I'm sure it's almost unnoticeable.

Where it does matter is when interleaved operations invoke the fiber scheduler.
By using a single lock around the entire operation, rather than one or more locks around the system calls, the entire operation is more predictable and behaves more robustly.
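
For illustration only, here is a rough user-level sketch of the two ideas discussed in this thread: joining all parts before a single `write` call, and holding one lock around the entire operation. `LockedWriter` and `demo_locked_writer` are hypothetical names invented for this example; this is not how `io.c` or TruffleRuby implement `IO#write`, just a sketch under those assumptions.

```ruby
require "stringio"

# Hypothetical wrapper (not part of Ruby's IO API): one Mutex is held
# around the whole multi-part write instead of around each piece.
class LockedWriter
  def initialize(io)
    @io = io
    @lock = Mutex.new
  end

  # Join all parts first, then issue a single write while holding the lock,
  # so no other writer going through this wrapper can interleave.
  def write(*parts)
    data = parts.join
    @lock.synchronize { @io.write(data) }
  end

  # puts-like helper: the line and its "\n" go out in the same write.
  def puts(line)
    line = line.to_s
    line += "\n" unless line.end_with?("\n")
    write(line)
  end
end

# Several threads write through the same wrapper; returns the lines
# that ended up in the underlying IO.
def demo_locked_writer(threads: 4, lines: 50)
  sink = StringIO.new
  out = LockedWriter.new(sink)
  workers = threads.times.map do |t|
    Thread.new { lines.times { |i| out.puts("thread=#{t} line=#{i}") } }
  end
  workers.each(&:join)
  sink.string.lines
end
```

Note that a Mutex like this only serialises writers inside one process that all go through the same wrapper. For the subprocess case mentioned above (sharing stdout/stderr), it is the joining into a single `write` call that matters, since a lock in one process cannot constrain another.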