From: justin.reid@...
Date: 2020-03-05T13:59:49+00:00
Subject: [ruby-core:97371] [Ruby master Bug#16672] net/http leaves original content-length header intact after inflating response

Issue #16672 has been updated by jmreid (Justin Reid).

jeremyevans0 (Jeremy Evans) wrote in #note-8:

> `total_out` doesn't give you the full size of the output until after the input is fully processed. So if there is an exception or other early exit from the block passed to `inflater` before the body is fully inflated, you can end up with an incorrect result. Also, the `Content-Length` header inside the block would still be wrong. You could remove it before the block and only set it on success after the block, that's probably the best way to handle it if you want to modify it.

Ah, understood!

----------------------------------------
Bug #16672: net/http leaves original content-length header intact after inflating response
https://bugs.ruby-lang.org/issues/16672#change-84496

* Author: jmreid (Justin Reid)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
When using net/http to make a request to a resource, the default request headers are the following (when you have ZLIB available):

`"accept-encoding"=>["gzip;q=1.0,deflate;q=0.6,identity;q=0.3"], "accept"=>["*/*"], "user-agent"=>["Ruby"]`

This means that a resource will return a gzipped response if it can provide it.
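To see these defaults without making a request, something like the following should work. It is only a sketch: the `example.com` URL is a placeholder (nothing is fetched), and it assumes Zlib is available to the running Ruby. The last part shows that supplying your own `Accept-Encoding` header stops net/http from adding its default and from decoding the response automatically.

```ruby
require 'net/http'
require 'uri'

# Placeholder URL for illustration only; building a request does not hit the network.
uri = URI('https://example.com/test.js')
req = Net::HTTP::Get.new(uri)

# to_hash maps each header name to an array of values.
req.to_hash.each { |name, values| puts "#{name}: #{values.join(', ')}" }

# decode_content is true when net/http added accept-encoding itself
# and will therefore inflate the response body.
puts "auto-decode: #{req.decode_content}"

# Supplying Accept-Encoding explicitly keeps the caller's value and
# turns automatic decoding off for this request.
plain = Net::HTTP::Get.new(uri, 'Accept-Encoding' => 'identity')
puts "explicit accept-encoding: #{plain['accept-encoding']}"
puts "auto-decode: #{plain.decode_content}"
```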
Take this URL for example: `https://storage.googleapis.com/justin-reid-test/test.js`

This is a JS file that has a `content-length` of `2733` when gzipped and `9995` when inflated:

```
curl "https://storage.googleapis.com/justin-reid-test/test.js" -H "accept-encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3" | wc -c
2733

curl "https://storage.googleapis.com/justin-reid-test/test.js" | wc -c
9995
```

When making a simple request for this asset using net/http:

```
require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)
```

Ruby will (https://github.com/ruby/ruby/blob/f08cd708b11dd5b293986b92bb5e227731665b36/lib/net/http/response.rb#L264-L278):

- Delete the `content-encoding` header
- Inflate the body
- Return the inflated body

The issue here is that Ruby also leaves the `content-length` header set to the value from the original compressed response:

```
require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

puts "Fetching: #{uri}"
puts "Body size using String#bytesize: #{res.body.to_s.bytesize}"
puts "Content-Length response header: #{res.content_length}"
```

Results in:

```
Fetching: https://storage.googleapis.com/justin-reid-test/test.js
Body size using String#bytesize: 9995
Content-Length response header: 2733
```

This means that an incorrect `content-length` header is passed back when net/http makes requests for gzip objects and inflates them.

This issue was noticed when Rack changed their behaviour in how they compute content-length.
They used to compute the content-length for each body, but that changed in 2.0.8:

https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-10b933d2c1fdc82ceecade456c64e1c2L92
https://github.com/rack/rack/issues/1472#issuecomment-574362342

Using `Rack::ContentLength` is now the method they prefer if you need to compute the content-length. However, `Rack::ContentLength` will not try to re-compute the value if that header already exists:

https://github.com/rack/rack/blob/6196377654b7ff7ce7abaecea62bb285d77d53aa/lib/rack/content_length.rb#L21

Should Ruby:

- Do a `self.delete 'content-length'` in the inflater?
- Compute the `content-length` itself and update the header? (Hacky example: https://github.com/ruby/ruby/compare/master...jmreid:content-length)
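For illustration, the ordering suggested in #note-8 (remove `Content-Length` before inflating, set it again only after inflation has succeeded) might look roughly like this outside of net/http. This is a sketch on an in-memory string using plain `Zlib`. The helper name `inflate_with_consistent_length` and the sample headers hash are hypothetical and are not part of net/http or of the linked branch.

```ruby
require 'zlib'

# Hypothetical helper, not a net/http API: inflate a gzipped body and return
# headers that describe the inflated body rather than the compressed one.
def inflate_with_consistent_length(headers, gzipped_body)
  headers = headers.dup                 # leave the caller's hash untouched
  headers.delete('content-encoding')
  # Remove the stale length first so it never sits next to inflated data.
  headers.delete('content-length')
  body = Zlib.gunzip(gzipped_body)      # raises Zlib::GzipFile::Error on bad input
  # Only reached when inflation succeeded.
  headers['content-length'] = body.bytesize.to_s
  [headers, body]
end

# Stand-in for a gzipped response; no network access involved.
original = 'console.log("hello");' * 100
gzipped  = Zlib.gzip(original)
headers  = {
  'content-encoding' => 'gzip',
  'content-length'   => gzipped.bytesize.to_s
}

new_headers, body = inflate_with_consistent_length(headers, gzipped)
puts "Compressed length header: #{headers['content-length']}"
puts "Inflated length header:   #{new_headers['content-length']}"
puts "Body size (bytesize):     #{body.bytesize}"
```

If inflation raises, the helper returns nothing and the duplicated hash is discarded, so a partially correct `content-length` is never exposed.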