From: merch-redmine@...
Date: 2020-03-05T02:19:03+00:00
Subject: [ruby-core:97365] [Ruby master Bug#16672] net/http leaves original content-length header intact after inflating response

Issue #16672 has been updated by jeremyevans0 (Jeremy Evans).


jmreid (Justin Reid) wrote in #note-3:
> The issue here is that this `"content-length"=>["2733"]` value is 2733. The `res.body` at this point is 9995, so `content-length` needs to match that or not exist. Otherwise it causes browsers to only partially download the file.

I don't think I agree with that logic. `content_length` is documented to return the value of the header, not the size of the decompressed body. So the method appears to be operating exactly as documented.

I'm also not sure why you are mentioning browsers in this context. If I open up Chrome and go into Development Tools, when I request `https://storage.googleapis.com/justin-reid-test/test.js` and look at the response headers, I see 2733 for `Content-Length`, not 9995.

The only possible interpretation I can think of where browsers come into play is if you were writing a server, using net/http inside a request handler, and taking the response status, headers, and body returned by net/http and returning them directly as the response.
In which case the proper fix would be not decompressing the body automatically:

```ruby
res = Net::HTTP.get_response(uri){|res| res.decode_content = false}
```

----------------------------------------
Bug #16672: net/http leaves original content-length header intact after inflating response
https://bugs.ruby-lang.org/issues/16672#change-84491

* Author: jmreid (Justin Reid)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
When using net/http to make a request to a resource, the default request headers are the following (when you have ZLIB available):

`"accept-encoding"=>["gzip;q=1.0,deflate;q=0.6,identity;q=0.3"], "accept"=>["*/*"], "user-agent"=>["Ruby"]`

This means that a resource will return a gzipped response if it can provide it.

Take this URL for example: `https://storage.googleapis.com/justin-reid-test/test.js`

This is a JS file that has a `content-length` of `2733` when gzipped and `9995` when inflated:

```
curl "https://storage.googleapis.com/justin-reid-test/test.js" -H "accept-encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3" | wc -c
2733

curl "https://storage.googleapis.com/justin-reid-test/test.js" | wc -c
9995
```

When making a simple request for this asset using net/http:

```
uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)
```

Ruby will (https://github.com/ruby/ruby/blob/f08cd708b11dd5b293986b92bb5e227731665b36/lib/net/http/response.rb#L264-L278):

- Delete the `content-encoding` header
- inflate the body
- return the inflated body

The issue here is that Ruby also leaves the `content-length` header set to the original
request's value:

```
require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

puts "Fetching: https://storage.googleapis.com/justin-reid-test/test.js"
puts "Body size using String#bytesize: #{res.body.to_s.bytesize}"
puts "Content-Length response header: #{res.content_length}"
```

Results in:

```
Fetching: https://storage.googleapis.com/justin-reid-test/test.js
Body size using String#bytesize: 9995
Content-Length response header: 2733
```

This means that an incorrect `content-length` header is passed back when net/http makes requests for gzip objects and inflates them.

This issue was noticed when Rack changed their behaviour in how they compute content-length. They used to compute the content-length for each body, but that changed in 2.0.8:

https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-10b933d2c1fdc82ceecade456c64e1c2L92
https://github.com/rack/rack/issues/1472#issuecomment-574362342

Using `Rack::ContentLength` is now the method they prefer if you need to compute the content-length. However, `Rack::ContentLength` will not try to re-compute the value if that header already exists:

https://github.com/rack/rack/blob/6196377654b7ff7ce7abaecea62bb285d77d53aa/lib/rack/content_length.rb#L21

Should Ruby:

- Do a `self.delete 'content-length'` in the inflater?
- Compute the `content-length` itself and update the header? (Hacky example: https://github.com/ruby/ruby/compare/master...jmreid:content-length)
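
For anyone forwarding an already-inflated net/http response from their own handler, the two options raised above (drop the header, or recompute it) can also be applied on the caller's side. The following is only a sketch, not net/http or Rack behaviour: `headers_for_inflated_body` is a hypothetical helper, and the gzipped payload is simulated with Zlib rather than fetched, with headers in the lowercase-name, array-value shape that `Net::HTTPResponse#to_hash` returns.

```ruby
require 'zlib'

# Hypothetical helper (not part of net/http or Rack): given upstream headers
# and the already-inflated body, return a new header hash that is safe to
# forward. The input hash is not modified.
def headers_for_inflated_body(headers, body, recompute: true)
  stale = %w[content-encoding content-length]
  fixed = headers.reject { |name, _| stale.include?(name.downcase) }
  # Either recompute the length from the inflated body, or leave it absent
  # so that middleware such as Rack::ContentLength can fill it in.
  fixed['content-length'] = [body.bytesize.to_s] if recompute
  fixed
end

# Simulated upstream response: a gzipped body and the headers describing it.
original = 'console.log("hello");' * 100
gzipped  = Zlib.gzip(original)
upstream_headers = {
  'content-encoding' => ['gzip'],
  'content-length'   => [gzipped.bytesize.to_s],
  'content-type'     => ['application/javascript']
}

# Inflate the body, then fix up the headers before forwarding.
inflated  = Zlib.gunzip(gzipped)
forwarded = headers_for_inflated_body(upstream_headers, inflated)

puts "Upstream content-length:  #{upstream_headers['content-length'].first}"
puts "Inflated body bytesize:   #{inflated.bytesize}"
puts "Forwarded content-length: #{forwarded['content-length'].first}"
```

Passing `recompute: false` gives the "delete" variant, which leaves `content-length` unset. If the compressed bytes should be passed through untouched instead, setting `decode_content = false` as suggested in the reply avoids the mismatch altogether.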