From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core"
Date: 2024-01-25T00:00:46+00:00
Subject: [ruby-core:116433] [Ruby master Bug#20208] Net::HTTP errors with Errno::EAFNOSUPPORT when setting local_host with Addrinfo

Issue #20208 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

Thanks for this report - it was super detailed and made it very easy for me to figure out what's going on!

Firstly, your bisection is right. The AI_ADDRCONFIG flag is what makes the difference here. The flag causes glibc to NOT return IPv6 addresses if the system doesn't have any IPv6 addresses of its own - and the loopback device doesn't count; glibc ignores it when asking "does the system have IPv6 addresses?". This is normally what you want when using the result of `getaddrinfo` for an outbound connection: if you don't have an IPv6 connection to the world, performing AAAA DNS lookups that return results you can't possibly use is pointless, and AI_ADDRCONFIG skips them.

By default, Ruby uses `AI_ADDRCONFIG` for DNS lookups it performs internally as a result of connecting to things. So `TCPSocket.new` etc. perform their DNS lookups with AI_ADDRCONFIG (since Ruby knows the point of the lookup is to make a connection), but other functions like `Addrinfo.getaddrinfo` are by default _not_ called with this flag, since you might be using the results to do something other than connect to them - maybe you're writing dig in Ruby, for example.

The problem with your reproduction is that you are actually trying to connect to localhost, so your loopback IPv6 address is actually relevant here!

Now, on to your reproduction:

```
http = Net::HTTP.new("localhost", 8080)
```

This is going to end up calling into `TCPSocket.open`, which will perform DNS resolution with AI_ADDRCONFIG. Since your system has no non-loopback IPv6 addresses, this means that only '127.0.0.1' gets returned.
Whether or not AI_ADDRCONFIG should return IPv6 results for localhost if the loopback adapter has an IPv6 address is an interesting question, but the current implementation in glibc is that it does not:

```
irb(main):010:0> Addrinfo.getaddrinfo("localhost", 8080, nil, :STREAM, nil, Socket::AI_ADDRCONFIG)
=> [#, #]
irb(main):011:0> system 'ip addr list'
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enp0s31f6: mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether 84:a9:38:35:ea:56 brd ff:ff:ff:ff:ff:ff
3: wlp0s20f3: mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether a0:e7:0b:22:fc:ea brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.249/24 brd 192.168.2.255 scope global dynamic noprefixroute wlp0s20f3
       valid_lft 83114sec preferred_lft 83114sec
```

So, because `getaddrinfo` returned '127.0.0.1', we proceed to create an IPv4 socket for the connection (this is the `AF_INET` socket you see in the strace output).

Then, the next line of your reproduction:

```
http.local_host = Addrinfo.tcp("localhost", 8080).ip_address
```

This is calling `getaddrinfo` to resolve "localhost" so that we can use it as the _local_ side of the connection. Because Ruby does not know what you intend to do with this IP address, it does not make the request with AI_ADDRCONFIG. Thus, you get an IPv6 result returned, since there is an IPv6 address for localhost! This results in the call to `bind(AF_INET6)` in your strace output, and hence the error.

---

I think the problem here is that the test `TestNetHTTPLocalBind#test_bind_to_local_host` (and friends) is wrong.
It _should_ be performing the following sequence of actions (in pseudocode):

* Do `remote_addr = getaddrinfo("host to connect to", AF_UNSPEC, AI_ADDRCONFIG)`
* Then, do `local_bind_addr = getaddrinfo("localhost", remote_addr.address_family)`
* Then, do `socket(remote_addr)`, `bind(local_bind_addr)`, and `connect(remote_addr)`.

That is, we should explicitly specify the address family when looking up the local address, so that it's the same as the address family we're going to use for `remote_addr`.

However, what it's actually doing is:

* Do `remote_addr = getaddrinfo("host to connect to", AF_UNSPEC, AI_ADDRCONFIG)`
* Then, do `local_bind_addr = getaddrinfo("localhost", AF_UNSPEC)`
* Then, do `socket(remote_addr)`, `bind(local_bind_addr)`, and `connect(remote_addr)`.

So there's no guarantee that the local_host it looks up is in the same address family as what it's going to connect to.

Fortunately, `#local_host=` accepts a string, which will be looked up during the connection. So this program _does_ work properly:

```
http = Net::HTTP.new("localhost", 8080)
http.local_host = "localhost"
p http.get("/")
```

If it connects to `::1` (for _whatever_ reason), it will use `::1` as the local addr; and if it connects to `127.0.0.1`, it will use `127.0.0.1` as the local addr.

So tl;dr: I'm going to fix the tests here; I think the implementation behaviour is correct.

----------------------------------------
Bug #20208: Net::HTTP errors with Errno::EAFNOSUPPORT when setting local_host with Addrinfo
https://bugs.ruby-lang.org/issues/20208#change-106455

* Author: jprokop (Jarek Prokop)
* Status: Assigned
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
A bug was found when dealing with Ruby tests downstream.
One of our builders has a specific networking configuration that results in Ruby incorrectly binding a socket and raising Errno::EAFNOSUPPORT, despite localhost being IPv6 capable.

It is reproducible with Ruby 3.3 and reasonably current master (git hash a846d391d38b34fcc4f90adef967c166c923bd56).

Reproduction environment:
The networking configuration has to be in a specific state. The regular interface (such as eth0) has to have IPv6 disabled while localhost is IPv6 enabled.

I have tracked the problem to a commit adding the AI_ADDRCONFIG flag: https://github.com/ruby/ruby/commit/d2ba8ea54a4089959afdeecdd963e3c4ff391748#diff-0a5f5e9afd3efff0444a367dd88aac41bb4de9765c8542b81c1ebcff60ab3b14R99

If I revert the commit, or simply set the 2 ifdefs with `HAVE_CONST_AI_ADDRCONFIG` that are present in the diff to 0, the problem no longer occurs.

I have used vagrant with the fedora/39-cloud-base box and the above mentioned git hash. However, I'd note that I also reproduced it on RHEL 8 and RHEL 9.
The VM has the following interfaces:

~~~
$ ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:e3:aa:c1 brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname ens5
    inet 192.168.122.209/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0
       valid_lft 2099sec preferred_lft 2099sec
    inet6 fe80::f5fe:e8a4:8f83:4a8f/64 scope link tentative noprefixroute
       valid_lft forever preferred_lft forever
~~~

Disable IPv6 of eth0 and leave only lo with IPv6:

~~~
$ sudo sysctl "net.ipv6.conf.eth0.disable_ipv6=1"
~~~

Confirm the result:

~~~
$ ip addr
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:e3:aa:c1 brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname ens5
    inet 192.168.122.209/24 brd 192.168.122.255 scope global dynamic noprefixroute eth0
       valid_lft 3587sec preferred_lft 3587sec
~~~

inet6 is no longer present on eth0, but still present in lo.
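The same state can be checked from Ruby. The following is only a rough sketch: the helper names `non_loopback_ipv6?` and `ipv6_only_on_loopback?` are made up for illustration, and the check is an approximation of "IPv6 exists only on loopback", not a reimplementation of what glibc does for AI_ADDRCONFIG.

```ruby
require "socket"

# Hypothetical helper (not a Ruby or glibc API): true for an IPv6 address
# other than the loopback address ::1. Interfaces without an address give nil.
def non_loopback_ipv6?(addr)
  !addr.nil? && addr.ipv6? && !addr.ipv6_loopback?
end

# Hypothetical helper: true when the only IPv6 address in the list is ::1,
# i.e. the state reached above after disabling IPv6 on eth0.
def ipv6_only_on_loopback?(addrs)
  addrs.any? { |a| a&.ipv6_loopback? } && addrs.none? { |a| non_loopback_ipv6?(a) }
end

if __FILE__ == $PROGRAM_NAME
  # Socket.getifaddrs lists every interface address known to the system.
  addrs = Socket.getifaddrs.map(&:addr)
  puts "IPv6 only on loopback: #{ipv6_only_on_loopback?(addrs)}"
end
```

Note that a link-local `fe80::` address counts as non-loopback in this sketch, which is an assumption made for illustration.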
Then we can copy what TestNetHTTPLocalBind is doing in setup, as that is one of the failing tests, and use it for a reproducer:

~~~
$ ruby -rnet/http -e 'http = Net::HTTP.new("localhost", 8080); http.local_host = Addrinfo.tcp("localhost", 8080).ip_address; p http.get("/")'
/usr/share/ruby/net/http.rb:1603:in `initialize': Failed to open TCP connection to localhost:8080 (Address family not supported by protocol - bind(2) for "::1" port ) (Errno::EAFNOSUPPORT)
	from /usr/share/ruby/net/http.rb:1603:in `open'
	from /usr/share/ruby/net/http.rb:1603:in `block in connect'
	from /usr/share/ruby/timeout.rb:186:in `block in timeout'
	from /usr/share/ruby/timeout.rb:193:in `timeout'
	from /usr/share/ruby/net/http.rb:1601:in `connect'
	from /usr/share/ruby/net/http.rb:1580:in `do_start'
	from /usr/share/ruby/net/http.rb:1569:in `start'
	from /usr/share/ruby/net/http.rb:2297:in `request'
	from /usr/share/ruby/net/http.rb:1917:in `get'
	from -e:1:in `'
/usr/share/ruby/net/http.rb:1603:in `initialize': Address family not supported by protocol - bind(2) for "::1" port (Errno::EAFNOSUPPORT)
	from /usr/share/ruby/net/http.rb:1603:in `open'
	from /usr/share/ruby/net/http.rb:1603:in `block in connect'
	from /usr/share/ruby/timeout.rb:186:in `block in timeout'
	from /usr/share/ruby/timeout.rb:193:in `timeout'
	from /usr/share/ruby/net/http.rb:1601:in `connect'
	from /usr/share/ruby/net/http.rb:1580:in `do_start'
	from /usr/share/ruby/net/http.rb:1569:in `start'
	from /usr/share/ruby/net/http.rb:2297:in `request'
	from /usr/share/ruby/net/http.rb:1917:in `get'
	from -e:1:in `'
~~~

The script:

~~~
http = Net::HTTP.new("localhost", 8080)
http.local_host = Addrinfo.tcp("localhost", 8080).ip_address
p http.get("/")
~~~

Without setting the `http.local_host` attribute using Addrinfo, the reproducer does not fail with EAFNOSUPPORT.
Whether `port` is specified or `nil` does not make a difference.
Whether there is a server listening on 8080 or not does not make a difference either; the script fails with the errno regardless.

I have collected `strace` output that points to a possible cause:

~~~
$ strace ruby -rnet/http -e 'http = Net::HTTP.new("localhost", 8080); http.local_host = Addrinfo.tcp("localhost", 8080).ip_address; p http.get("/")' 2>&1 | grep AF_INET
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 5
bind(5, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EAFNOSUPPORT (Address family not supported by protocol)
~~~

A socket is created with AF_INET and is later bound with AF_INET6, which is not correct behavior as far as I can tell. Full strace is attached.
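One way to avoid the AF_INET / AF_INET6 mismatch seen in the strace is to resolve the local address in the same family as the remote address before binding. The following is a minimal sketch of that idea, not what Net::HTTP does internally; `local_addr_for` and `connect_with_local_bind` are hypothetical helper names introduced only for illustration.

```ruby
require "socket"

# Hypothetical helper: look up +local_host+ restricted to the address family
# of +remote+, so that bind(2) gets an address matching the socket's family.
def local_addr_for(remote, local_host)
  Addrinfo.getaddrinfo(local_host, nil, remote.afamily, :STREAM).first
end

# Hypothetical helper: socket, bind, and connect all in one address family.
def connect_with_local_bind(remote_host, port, local_host)
  # The remote lookup uses AI_ADDRCONFIG, like a connecting lookup would.
  remote = Addrinfo.getaddrinfo(remote_host, port, nil, :STREAM, nil,
                                Socket::AI_ADDRCONFIG).first
  local = local_addr_for(remote, local_host)
  sock = Socket.new(remote.afamily, :STREAM)
  begin
    sock.bind(local)      # same family as the socket, so no EAFNOSUPPORT
    sock.connect(remote)
  rescue StandardError
    sock.close
    raise
  end
  sock
end
```

With this sketch, `connect_with_local_bind("localhost", 8080, "localhost")` would bind to `127.0.0.1` when the remote resolves to `127.0.0.1`, and to `::1` when it resolves to `::1`.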
Observed failures in Ruby test suite related to this issue:

~~~
109) Error:
TestNetHTTPLocalBind#test_bind_to_local_port:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:37337 (Address family not supported by protocol - bind(2) for "::1" port 45395)
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1282:in `test_bind_to_local_port'

110) Error:
TestNetHTTPLocalBind#test_bind_to_local_host:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:46329 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1267:in `test_bind_to_local_host'

111) Error:
TestNetHTTPForceEncoding#test_response_body_encoding_false:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:41749 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1308:in `fe_request'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1312:in `test_response_body_encoding_false'

112) Error:
TestNetHTTPForceEncoding#test_response_body_encoding_string_without_content_type:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:42775 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1308:in `fe_request'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1330:in `test_response_body_encoding_string_without_content_type'

113) Error:
TestNetHTTPForceEncoding#test_response_body_encoding_true_with_content_type:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:36895 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1308:in `fe_request'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1324:in `test_response_body_encoding_true_with_content_type'

114) Error:
TestNetHTTPForceEncoding#test_response_body_encoding_encoding_without_content_type:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:37115 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1308:in `fe_request'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1336:in `test_response_body_encoding_encoding_without_content_type'

115) Error:
TestNetHTTPForceEncoding#test_response_body_encoding_true_without_content_type:
Errno::EAFNOSUPPORT: Failed to open TCP connection to localhost:37799 (Address family not supported by protocol - bind(2) for "::1" port )
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `initialize'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `open'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1603:in `block in connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:186:in `block in timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/timeout.rb:193:in `timeout'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1601:in `connect'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1580:in `do_start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1569:in `start'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:2297:in `request'
    /builddir/build/BUILD/ruby-3.3.0/lib/net/http.rb:1917:in `get'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1308:in `fe_request'
    /builddir/build/BUILD/ruby-3.3.0/test/net/http/test_http.rb:1318:in `test_response_body_encoding_true_without_content_type'
~~~

Related failures from specs:

~~~
1)
An exception occurred during: before :each
TCPSocket#local_address using IPv6 using an implicit hostname the returned Addrinfo uses the correct IP address ERROR
Errno::ECONNREFUSED: Connection refused - connect(2) for nil port 37121
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/local_address_spec.rb:59:in `initialize'
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/local_address_spec.rb:59:in `new'
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/local_address_spec.rb:59:in `block (4 levels) in '
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/local_address_spec.rb:4:in `'

2)
An exception occurred during: before :each
TCPSocket#remote_address using IPv6 using an implicit hostname the returned Addrinfo uses the correct IP address ERROR
Errno::ECONNREFUSED: Connection refused - connect(2) for nil port 39823
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/remote_address_spec.rb:58:in `initialize'
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/remote_address_spec.rb:58:in `new'
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/remote_address_spec.rb:58:in `block (4 levels) in '
/builddir/build/BUILD/ruby-3.3.0/spec/ruby/library/socket/tcpsocket/remote_address_spec.rb:4:in `'
~~~

---Files--------------------------------
strace_log.txt (304 KB)

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/