wget should follow redirects
Luca0208 opened this issue ยท 4 comments
Pretty much the title. Currently wget won't follow 3xx codes, but since we have the headers available we could use the Location header to redirect.
The current behaviour is rather unintuitive, as it will say "Success" and then not save any file(At least if the body of the response was empty, which is usually the case for a 3xx).
The http
API should handle redirects automatically (up to a limit of 16). What URL are you trying to fetch which doesn't?
I'm testing with wget http://git.io/JLgFg
, which redirects to https://git.io/JLgFg
and then https://raw.githubusercontent.com/SquidDev-CC/CC-Tweaked/mc-1.15.x/src/main/resources/data/computercraft/lua/rom/programs/http/wget.lua
.
The Link that actually didn't work was a dropbox share link, but since it's not my file I don't know if it's okay if I share it here. I can't reproduce it with other links.
I'm gonna ask the person who gave me the link if they're ok with me sharing it here.
I managed to reproduce it.
The link https://www.dropbox.com/s/p4t4796rnszt1xl/testing%20this.txt?dl=1 will cause wget to say "Connecting to [...]... Success." and then not save any file because the body is empty.
It seems that when the URL contains an encoded space(%20
) (or potentially any encoded characters at all) it doesn't follow the redirect but instead saves the body of the 301 Response.
e.g. http://git.io/%20
redirects to https://git.io/%20
but wget instead saves the "You are being redirected" body.
I did a bit of testing and found out that the Problem seems to be in HttpRequestHandler.getRedirect
This method URLDecodes the Location Header before passing it to URI()
which apparently causes a URISyntaxException meaning the Redirect isn't followed.
Removing the decoding and instead passing location
directly to new URI()
looks like it works, however since I don't really know how URI works in Java I'd rather not put it into a PR before getting confirmation that URI()
indeed requires an already escaped URI.