A Brief Look at wget and curl on Ubuntu

Latest recommended article published on 2025-06-18 12:05:17

License: CC 4.0 BY-SA

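This article covers the wget and curl commands on Ubuntu, including their syntax, options, arguments, and examples. wget stands out for its robustness, resumable downloads, and rate limiting, which makes it well suited to large files. curl excels at customising request parameters and suits complex HTTP requests. Each has its own strengths for downloading and for simulating requests.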

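The wget command

The wget command downloads files from a given URL. wget is very robust: it copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the whole file has been retrieved. If the server interrupts the transfer, wget reconnects and continues from where it stopped. This is very useful for downloading large files from servers that limit connection time.

Syntax

wget [options] [URL]

Options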

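-a<logfile>: append a record of the run to the given log file;
-A<suffixes>: download only files with the given suffixes; separate multiple suffixes with commas;
-b: run wget in the background;
-B<URL>: set the base URL used to resolve relative links;
-c: resume a previously interrupted download;
-C<flag>: turn the use of server-side cached data on or off; the default is on (this is the form used by older wget releases; current versions use --no-cache instead);
-d: run in debug mode;
-D<domain list>: set the list of domains to follow, separated by commas;
-e<command>: execute the given command as if it were part of ".wgetrc";
-h: show help;
-i<file>: read the URLs to download from the given file;
-I<directory list>: set the list of directories to follow, separated by commas;
-L: follow relative links only;
-r: download recursively;
-nc: do not overwrite a file that already exists;
-nv: show only updates and errors, not the detailed progress of the run;
-q: show no output at all;
-nh: do not look up host names (older wget releases only);
-v: show detailed progress;
-V: show version information;
--passive-ftp: connect to FTP servers in passive (PASV) mode;
--follow-ftp: follow FTP links found in HTML documents.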

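Arguments

URL: the URL to download.

Examples

Download a single file with wget

wget http://www.linuxde.net/testfile.zip

The example above downloads a file from the network and saves it in the current directory. While it runs, wget shows a progress bar with the percentage completed, the bytes downloaded so far, the current download speed, and the estimated time remaining.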

Download and save under a different name

wget -O wordpress.zip 'http://www.linuxde.net/download.aspx?id=1080'

By default wget names the saved file after whatever follows the last / in the URL, so downloads from dynamic links usually end up with the wrong name.

Wrong: the following example downloads a file and saves it as download.aspx?id=1080:

wget 'http://www.linuxde.net/download.aspx?id=1080'

Even though the downloaded file is a zip archive, it is still named download.aspx?id=1080.

Right: to fix this, use the -O option to specify a file name (the URL is quoted so that the shell does not interpret the ?):

wget -O wordpress.zip 'http://www.linuxde.net/download.aspx?id=1080'
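The naming rule described above can be reproduced with plain shell parameter expansion. This is only an illustrative sketch of "everything after the last /", not wget's actual implementation, and the variable names are made up for the example.

```shell
# Illustrative only: mimic "take everything after the last /".
url='http://www.linuxde.net/download.aspx?id=1080'
default_name="${url##*/}"    # strip the longest prefix that ends in "/"
echo "$default_name"         # prints: download.aspx?id=1080
```

Because the query string is part of that last segment, it ends up in the file name unless -O overrides it.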

Limit the download speed with wget

wget --limit-rate=300k http://www.linuxde.net/testfile.zip

By default wget uses all the bandwidth it can get. When you are about to download a large file and still need bandwidth for other downloads, it makes sense to cap the rate.
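As a rough worked example of what such a cap means in practice, the arithmetic below estimates how long a download takes at 300k. The 100 MiB file size is a made-up figure for illustration, and the sketch assumes the k suffix stands for 1024 bytes per second.

```shell
# Back-of-the-envelope estimate, using integer arithmetic only.
size_kib=$((100 * 1024))     # hypothetical 100 MiB file, expressed in KiB
rate_kib=300                 # the cap from --limit-rate=300k
seconds=$((size_kib / rate_kib))
echo "about ${seconds}s (roughly $((seconds / 60)) min)"
```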

Resume an interrupted download with wget

wget -c http://www.linuxde.net/testfile.zip

wget -c restarts a download that was interrupted. This is very helpful when a large download is cut off by a network problem: the transfer continues from where it stopped instead of starting over.
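Over HTTP, resuming generally works by asking the server for only the missing bytes with a Range request header, which the server has to support. The sketch below shows how such an offset can be derived from a partial file on disk. It is not wget's own code, and the temporary file merely stands in for a partially downloaded testfile.zip.

```shell
# Create a stand-in for a partially downloaded file (1024 bytes of zeros).
partial=$(mktemp)
dd if=/dev/zero of="$partial" bs=1024 count=1 2>/dev/null

# The resume offset is simply the number of bytes already on disk.
offset=$(wc -c < "$partial" | tr -d ' ')
range_header="Range: bytes=${offset}-"
echo "$range_header"         # prints: Range: bytes=1024-

rm -f "$partial"
```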

Download in the background with wget

wget -b http://www.linuxde.net/testfile.zip

Continuing in background, pid 1840.
Output will be written to `wget-log'.

For very large files you can use the -b option to download in the background. Use the following command to watch the progress:

tail -f wget-log

Download with a spoofed User-Agent

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.linuxde.net/testfile.zip

Some sites reject a download request when the User-Agent does not look like a browser. The --user-agent option lets you present a different one.

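Test a download link

When you plan a scheduled download, you should check ahead of the scheduled time that the link is valid. The --spider option performs this check.

wget --spider URL

If the link is correct, wget prints:

Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

This helps ensure the download can go ahead at the scheduled time. If the link is wrong, wget prints an error like this:

wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!

You can use --spider in the following situations:

Checking a link before a scheduled download
Checking periodically whether a site is available
Checking a site's pages for dead links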
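The checks listed above can be scripted by looking at wget's exit status, which is zero on success and non-zero on failure. The following is a minimal sketch; check_link is a hypothetical helper name, not something wget provides.

```shell
# check_link URL: report whether wget --spider considers the URL reachable.
# -q silences wget's own output so that only our one-line verdict is printed.
check_link() {
    if wget --spider -q "$1"; then
        echo "OK   $1"
    else
        echo "DEAD $1"
    fi
}
```

A call such as check_link http://www.linuxde.net/testfile.zip could then be placed in a cron job that runs before the scheduled download.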

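Increase the number of retries

wget --tries=40 URL

A download can fail when the network is unreliable or the file is large. By default wget retries a download up to 20 times. If that is not enough, raise the limit with --tries.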

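Download multiple files

wget -i filelist.txt

First, save the download links to a file (type the URLs after the cat command and end the input with Ctrl-D):

cat > filelist.txt
url1
url2
url3
url4

Then pass this file to wget with the -i option.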
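When the list is produced by a script rather than typed by hand, a here-document avoids the interactive cat. This is a small sketch under that assumption; url1 to url4 are the placeholders from the text, and download_list is a hypothetical wrapper name.

```shell
# Write the URL list without typing it interactively.
# url1..url4 are placeholders; replace them with real links.
list=filelist.txt
cat > "$list" <<'EOF'
url1
url2
url3
url4
EOF

# download_list FILE: hand the whole list to wget in one call.
download_list() {
    wget -i "$1"
}
```

Running download_list filelist.txt then downloads every URL in the file, one per line.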

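Mirror a website

wget --mirror -p --convert-links -P ./LOCAL URL

This downloads an entire website to the local machine.

--mirror turns on mirroring.
-p downloads all files needed to display the HTML pages properly.
--convert-links converts the links to local ones after the download.
-P ./LOCAL saves all files and directories under the given local directory.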

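Exclude a file format from the download

wget --reject=gif URL

Use this when you download a website (for example together with -r or --mirror) but do not want certain files, here GIF images.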

Write download messages to a log file

wget -o download.log URL

Use this when you want the download messages written to a log file instead of shown in the terminal.

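Limit the total download size

wget -Q5m -i filelist.txt

Use this when you want wget to stop once the total downloaded size exceeds 5 MB. Note: the quota has no effect on a single-file download; it only applies when downloading recursively or from a list of URLs, as with -i here.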

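Download only files of a given format

wget -r -A.pdf URL

This is useful in cases such as:

Downloading all images from a website.
Downloading all videos from a website.
Downloading all PDF files from a website.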

FTP downloads

wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD url

wget can also download from FTP links.

Anonymous FTP download with wget:

wget ftp-url

FTP download with user name and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

The curl command

With no options, curl issues a GET request.

$ curl https://www.example.com

The command above sends a GET request to www.example.com and prints the server's response on the command line.

-A

The -A option sets the client's user agent header, User-Agent. curl's default user agent string is curl/[version].

$ curl -A 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36' https://google.com

The command above changes User-Agent to that of the Chrome browser.

$ curl -A '' https://google.com

The command above removes the User-Agent header.

You can also change User-Agent by setting the header directly with the -H option.

$ curl -H 'User-Agent: php/1.0' https://google.com

-b

The -b option sends cookies to the server.

$ curl -b 'foo=bar' https://google.com

The command above generates the header Cookie: foo=bar, sending the server a cookie named foo with the value bar.

$ curl -b 'foo1=bar;foo2=bar2' https://google.com

The command above sends two cookies.

$ curl -b cookies.txt https://www.google.com

The command above reads the local file cookies.txt, which holds cookies set by the server (see the -c option), and sends them to the server.

-c

The -c option writes the cookies set by the server to a file.

$ curl -c cookies.txt https://www.google.com

The command above writes the cookies set by the server's HTTP response to the text file cookies.txt.

-d

The -d option sends the data body of a POST request.

$ curl -d 'login=emma&password=123' -X POST https://google.com/login
# or
$ curl -d 'login=emma' -d 'password=123' -X POST https://google.com/login

When -d is used, the HTTP request automatically gets the header Content-Type: application/x-www-form-urlencoded, and the request method automatically becomes POST, so -X POST can be omitted.

The -d option can also read data from a local text file and send it to the server.

$ curl -d '@data.txt' https://google.com/login

The command above reads the contents of data.txt and sends them to the server as the data body.

--data-urlencode

The --data-urlencode option works like -d and sends the data body of a POST request; the difference is that it URL-encodes the data automatically.

$ curl --data-urlencode 'comment=hello world' https://google.com/login

In the command above, the data hello world contains a space, which has to be URL-encoded.

-e

The -e option sets the HTTP Referer header, which indicates where the request came from.

$ curl -e 'https://google.com?q=example' https://www.example.com

The command above sets the Referer header to https://google.com?q=example.

The same effect can be achieved by adding the Referer header directly with the -H option.

$ curl -H 'Referer: https://google.com?q=example' https://www.example.com

-F

The -F option uploads binary files to the server.

$ curl -F 'file=@photo.png' https://google.com/profile

The command above adds the header Content-Type: multipart/form-data to the HTTP request and uploads the file photo.png as the field file.

The -F option can specify the MIME type.

$ curl -F 'file=@photo.png;type=image/png' https://google.com/profile

The command above sets the MIME type to image/png explicitly. Without type=, curl picks a type itself: it tries to guess from the file extension and falls back to application/octet-stream when it cannot.

The -F option can also specify the file name.

$ curl -F 'file=@photo.png;filename=me.png' https://google.com/profile

In the command above, the original file name is photo.png, but the server receives the file name me.png.

-G

The -G option builds the query string of the URL.

$ curl -G -d 'q=kitties' -d 'count=20' https://google.com/search

The command above sends a GET request; the URL actually requested is https://google.com/search?q=kitties&count=20. If -G is omitted, a POST request is sent instead.
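The way the -d values end up in the URL can be pictured as joining the pairs with & and appending them after a ?. The sketch below only illustrates that joining step; it is not curl's implementation, it performs no URL encoding, and build_url is a hypothetical helper name.

```shell
# build_url BASE PAIR...: join key=value pairs with "&" and append after "?".
build_url() {
    base=$1
    shift
    query=''
    for pair in "$@"; do
        # Prefix an "&" only when query already holds an earlier pair.
        query="${query:+$query&}$pair"
    done
    printf '%s?%s\n' "$base" "$query"
}

result=$(build_url 'https://google.com/search' 'q=kitties' 'count=20')
echo "$result"    # prints: https://google.com/search?q=kitties&count=20
```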

If the data needs URL encoding, combine -G with the --data-urlencode option.

$ curl -G --data-urlencode 'comment=hello world' https://www.example.com

-H

The -H option adds a header to the HTTP request.

$ curl -H 'Accept-Language: en-US' https://google.com

The command above adds the HTTP header Accept-Language: en-US.

$ curl -H 'Accept-Language: en-US' -H 'Secret-Message: xyzzy' https://google.com

The command above adds two HTTP headers.

$ curl -d '{"login": "emma", "pass": "123"}' -H 'Content-Type: application/json' https://google.com/login

The command above sets the request header Content-Type: application/json and then sends JSON data with the -d option.

-i

The -i option prints the HTTP headers of the server's response.

$ curl -i https://www.example.com

After receiving the response, the command above first prints the response headers, then a blank line, then the source of the page.

-I

The -I option sends a HEAD request to the server and then prints the HTTP headers that the server returns.

$ curl -I https://www.example.com

The command above prints the server's response to the HEAD request.

The --head option is equivalent to -I.

$ curl --head https://www.example.com

-k

The -k option skips SSL certificate verification.

$ curl -k https://www.example.com

The command above does not check whether the server's SSL certificate is valid.

-L

The -L option makes the HTTP request follow the server's redirects. By default curl does not follow redirects.

$ curl -L -d 'tweet=hi' https://api.twitter.com/tweet

--limit-rate

The --limit-rate option limits the bandwidth used by the HTTP request and response, which is useful for simulating a slow network.

$ curl --limit-rate 200k https://google.com

The command above limits the bandwidth to 200K bytes per second.

-o

The -o option saves the server's response to a file, similar to what the wget command does.

$ curl -o example.html https://www.example.com

The command above saves www.example.com as example.html.

-O

The -O option saves the server's response to a file and uses the last part of the URL as the file name.

$ curl -O https://www.example.com/foo/bar.html

The command above saves the server's response to a file named bar.html.

-s

The -s option suppresses error messages and progress information.

$ curl -s https://www.example.com

If an error occurs, the command above does not show an error message. If no error occurs, the result is printed as usual.

To make curl produce no output at all, use the following command.

$ curl -s -o /dev/null https://google.com

-S

The -S option makes curl print only error messages. It is usually used together with -s.

$ curl -S -s -o /dev/null https://google.com

The command above prints nothing unless an error occurs.
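Because this combination is silent on success, it fits naturally into a script that branches on curl's exit status. The sketch below assumes that use; is_up and report are hypothetical helper names. Note that the exit status reflects transfer-level failures (DNS, connection, TLS), so an HTTP error page such as a 404 still counts as success here.

```shell
# is_up URL: succeed quietly if curl can complete the transfer.
# On failure, -S lets curl's own error message through on stderr.
is_up() {
    curl -sS -o /dev/null "$1"
}

# report URL: print a one-line verdict based on is_up's exit status.
report() {
    if is_up "$1"; then
        echo "up: $1"
    else
        echo "down: $1"
    fi
}
```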

-u

The -u option sets the user name and password for server authentication.

$ curl -u 'bob:12345' https://google.com/login

The command above sets the user name to bob and the password to 12345, and converts them into the HTTP header Authorization: Basic Ym9iOjEyMzQ1.
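The value after Basic is simply the Base64 encoding of user:password, which can be checked locally. This small sketch assumes the base64 utility from coreutils is available, as it is on a standard Ubuntu installation.

```shell
# Basic auth token = base64("user:password").
# printf '%s' avoids the trailing newline that echo would add.
token=$(printf '%s' 'bob:12345' | base64)
echo "Authorization: Basic $token"   # prints: Authorization: Basic Ym9iOjEyMzQ1
```

Base64 is an encoding, not encryption, so anyone who sees the header can recover the password; this is one reason to send such requests over HTTPS only.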

curl also recognises a user name and password embedded in the URL.

$ curl https://bob:12345@google.com/login

The command above picks up the user name and password from the URL and converts them into the same HTTP header as in the previous example.

$ curl -u 'bob' https://google.com/login

The command above sets only the user name; when it runs, curl prompts for the password.

-v

The -v option prints the whole exchange and is used for debugging.

$ curl -v https://www.example.com

The --trace option can also be used for debugging; it additionally dumps the raw binary data.

$ curl --trace - https://www.example.com

-x

The -x option specifies the proxy for the HTTP request.

$ curl -x socks5://james:cats@myproxy.com:8080 https://www.example.com

The command above sends the HTTP request through the SOCKS5 proxy at myproxy.com:8080.

If no proxy protocol is specified, HTTP is assumed.

$ curl -x james:cats@myproxy.com:8080 https://www.example.com

In the command above, the proxy is contacted over HTTP.

-X

The -X option specifies the method of the HTTP request.

$ curl -X POST https://www.example.com

The command above sends a POST request to https://www.example.com.

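Differences between curl and wget, and when to use each

The basic features of curl and wget overlap a great deal, downloading being one example.

If a distinction must be drawn: curl lets you customise every request parameter, so it is better at simulating web requests; wget supports FTP and recursive retrieval, so it is better at downloading files. By analogy, curl is a browser and wget is Xunlei 9 (a download manager).

1. Download a file

curl -O http://man.linuxde.net/text.iso                    # uppercase O; without -O the content is only printed, not saved
wget http://www.linuxde.net/text.iso                       # no option needed; downloads the file directly

2. Download a file and rename it

curl -o rename.iso http://man.linuxde.net/text.iso         # lowercase o
wget -O rename.zip http://www.linuxde.net/text.iso         # uppercase O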

3. Resume an interrupted download

curl -O -C - http://man.linuxde.net/text.iso               # uppercase O, uppercase C
wget -c http://www.linuxde.net/text.iso                    # lowercase c

4. Limit the download speed

curl --limit-rate 50k -O http://man.linuxde.net/text.iso
wget --limit-rate=50k http://www.linuxde.net/text.iso

5. Show the response headers

curl -I http://man.linuxde.net/text.iso
wget --server-response http://www.linuxde.net/test.iso    # prints the headers, then still downloads the file

6. wget's killer feature: downloading a whole site

wget --mirror -p --convert-links -P /var/www/html http://man.linuxde.net/
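Since the two tools differ mainly in option spelling for simple downloads, a script can hide the difference behind one function. The sketch below wraps the "download and rename" case from item 2, using whichever tool is installed; fetch_as is a hypothetical helper name, not part of curl or wget.

```shell
# fetch_as OUTPUT URL: save URL under the name OUTPUT with whichever tool exists.
fetch_as() {
    if command -v curl >/dev/null 2>&1; then
        curl -o "$1" "$2"        # curl: lowercase -o
    elif command -v wget >/dev/null 2>&1; then
        wget -O "$1" "$2"        # wget: uppercase -O
    else
        echo "neither curl nor wget found" >&2
        return 1
    fi
}
```

For example, fetch_as rename.iso http://www.linuxde.net/text.iso behaves the same on a machine that has only one of the two tools installed.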