Wed. Dec 11th, 2019

How to Use curl to Download Files From the Linux Command Line


The Linux curl command can do a whole lot more than download files. Find out what curl is capable of, and when you should use it instead of wget.

curl vs. wget: What’s the Difference?

People often struggle to identify the relative strengths of the wget and curl commands. The commands do have some functional overlap. They can each retrieve files from remote locations, but that’s where the similarity ends.

wget is a fantastic tool for downloading content and files. It can download files, web pages, and directories. It contains intelligent routines to traverse links in web pages and recursively download content across an entire website. It is unsurpassed as a command-line download manager.

curl satisfies an altogether different need. Yes, it can retrieve files, but it cannot recursively navigate a website looking for content to retrieve. What curl actually does is let you interact with remote systems by making requests to those systems, and retrieving and displaying their responses to you. Those responses might well be web page content and files, but they can also contain data provided via a web service or API as a result of the “question” asked by the curl request.

And curl isn’t limited to websites. curl supports over 20 protocols, including HTTP, HTTPS, SCP, SFTP, and FTP. And arguably, due to its superior handling of Linux pipes, curl can be more easily integrated with other commands and scripts.

The author of curl has a webpage that describes the differences he sees between curl and wget.

Installing curl

Out of the computers used to research this article, Fedora 31 and Manjaro 18.1.0 had curl already installed. curl had to be installed on Ubuntu 18.04 LTS. On Ubuntu, run this command to install it:

sudo apt-get install curl

The curl Version

The --version option makes curl report its version. It also lists all the protocols that it supports.

curl --version
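
In a script, the protocol list can be checked before a transfer is attempted. The following is a minimal sketch, assuming the "Protocols:" line layout that typical curl builds print; supports_protocol is a hypothetical helper name, not part of curl.

```shell
# supports_protocol NAME
# Reads the output of `curl --version` on standard input and succeeds
# (exit status 0) if NAME appears on the "Protocols:" line.
# Hypothetical helper; assumes the usual "Protocols: a b c" layout.
supports_protocol() {
    # Keep only the Protocols line, put each word on its own line,
    # then look for an exact, case-insensitive match.
    grep -i '^Protocols:' | tr ' ' '\n' | grep -qix -- "$1"
}
```

It could be used like this:

curl --version | supports_protocol sftp && echo "SFTP is available"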

Retrieving a Web Page

If we point curl at a web page, it will retrieve it for us.

curl https://www.bbc.com

But its default action is to dump it to the terminal window as source code.

Output from curl displaying web page source code in a terminal window

Beware: If you don’t tell curl you want something stored as a file, it will always dump it to the terminal window. If the file it is retrieving is a binary file, the outcome can be unpredictable. The terminal may try to interpret some of the byte values in the binary file as control characters or escape sequences.

Saving Data to a File

Let’s tell curl to redirect the output into a file:

curl https://www.bbc.com > bbc.html

This time we don’t see the retrieved information; it is sent straight to the file for us. Because there is no page content to display in the terminal window, curl outputs a set of progress information instead.

It didn’t do this in the previous example because the progress information would have been scattered throughout the web page source code, so curl automatically suppressed it.

In this example, curl detects that the output is being redirected to a file and that it is safe to generate the progress information.

curl download progress meter in a terminal window

The information provided is:

  • % Total: The percentage of the whole transfer completed so far, and the total size expected.
  • % Received: The percentage and actual amount of data downloaded so far.
  • % Xferd: The percentage and actual amount of data sent, if data is being uploaded.
  • Average Speed Dload: The average download speed.
  • Average Speed Upload: The average upload speed.
  • Time Total: The estimated total duration of the transfer.
  • Time Spent: The elapsed time so far for this transfer.
  • Time Left: The estimated time left for the transfer to complete.
  • Current Speed: The current transfer speed for this transfer.
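
The progress meter is meant for someone watching the terminal. In a script, similar figures can be captured once the transfer has finished by using curl's --write-out option. This is only a sketch: transfer_summary is a hypothetical helper name, while size_download, speed_download, and time_total are --write-out variables documented in the curl manual.

```shell
# transfer_summary URL FILE
# Downloads URL to FILE without the progress meter, then prints one line:
#   <bytes downloaded> <average bytes per second> <total seconds>
transfer_summary() {
    curl --silent --show-error --output "$2" \
         --write-out '%{size_download} %{speed_download} %{time_total}\n' \
         "$1"
}
```

For example:

transfer_summary https://www.bbc.com bbc.html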

Because we redirected the output from curl to a file, we now have a file called “bbc.html.”

bbc.html file created by curl.

Double-clicking that file will open your default browser so that it displays the retrieved web page.

Retrieved web page displayed in a browser window.

Note that the address in the browser address bar is a local file on this computer, not a remote website.

We don’t have to redirect the output to create a file. We can create a file by using the -o (output) option, and telling curl to create the file. Here we’re using the -o option and providing the name of the file we wish to create: “bbc.html.”

curl -o bbc.html https://www.bbc.com
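
When the -o option is used inside a script, it helps to check whether the transfer actually succeeded. Here is a minimal sketch, assuming a hypothetical helper named fetch_to_file. The options it passes (--fail, --silent, --show-error, --location, and --output, the long form of -o) are standard curl options.

```shell
# fetch_to_file URL FILE
# Saves URL to FILE and reports success or failure via the exit status.
#   --fail        return a non-zero status on HTTP errors (4xx/5xx)
#                 instead of saving the server's error page
#   --silent      hide the progress meter
#   --show-error  still print a message if something goes wrong
#   --location    follow redirects
fetch_to_file() {
    curl --fail --silent --show-error --location --output "$2" "$1"
}
```

A script could then react to the result:

fetch_to_file https://www.bbc.com bbc.html || echo "download failed" >&2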

Using a Progress Bar To Monitor Downloads

To have the text-based download information replaced by a simple progress bar, use the -# (progress bar) option.

curl -# -o bbc.html https://www.bbc.com

Restarting an Interrupted Download

It is easy to restart a download that has been terminated or interrupted. Let’s start a download of a sizeable file. We’ll use the latest Long Term Support build of Ubuntu 18.04. We’re using the --output option to specify the name of the file we wish to save it into: “ubuntu18043.iso.”

curl --output ubuntu18043.iso http://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso

The download starts and works its way towards completion.

Progress of a large download in a terminal window

If we forcibly interrupt the download with Ctrl+C, we’re returned to the command prompt, and the download is abandoned.

To restart the download, use the -C (continue at) option. This causes curl to restart the download at a specified point or offset within the target file. If you use a hyphen - as the offset, curl will look at the already downloaded portion of the file and determine the correct offset to use for itself.

curl -C - --output ubuntu18043.iso http://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso

The download is restarted. curl reports the offset at which it is restarting.

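
On an unreliable connection, the restart can be automated. The sketch below simply repeats the -C - command (written in its long form, --continue-at -) until it succeeds or a limit is reached. The function name resume_download and the CURL_CMD and RETRY_DELAY variables are assumptions made for this example, not curl features.

```shell
# resume_download URL FILE [MAX_ATTEMPTS]
# Re-runs curl with --continue-at - until it succeeds or MAX_ATTEMPTS
# (default 5) is reached.
# CURL_CMD and RETRY_DELAY are hypothetical overrides for this sketch:
# the command to run (default: curl) and the pause between attempts in
# seconds (default: 5).
resume_download() {
    url=$1
    file=$2
    max_attempts=${3:-5}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Each attempt picks up from the bytes already saved in FILE.
        if "${CURL_CMD:-curl}" --continue-at - --output "$file" "$url"; then
            return 0
        fi
        echo "Attempt $attempt of $max_attempts failed" >&2
        # Pause only if another attempt is still to come.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example:

resume_download http://releases.ubuntu.com/18.04.3/ubuntu-18.04.3-desktop-amd64.iso ubuntu18043.iso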

Retrieving HTTP headers

With the -I (head) option, you can retrieve the HTTP headers only. This is the same as sending an HTTP HEAD request to a web server.

curl -I www.twitter.com

This command retrieves information only; it does not download any web pages or files.

Output from curl -I www.twitter.com in a terminal window
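
Because the headers arrive as plain "Name: value" lines, a single header can be picked out by piping the output through a small filter. This is a sketch under that assumption; header_value is a hypothetical helper name, and it relies on the standard awk utility.

```shell
# header_value NAME
# Reads raw HTTP headers on standard input and prints the value of the
# first header called NAME (matched case-insensitively).
header_value() {
    awk -v name="$1" '
    {
        sub(/\r$/, "")            # header lines end in CR LF; drop the CR
        i = index($0, ":")        # split at the first colon only
        if (i == 0) next          # status line or blank line
        if (tolower(substr($0, 1, i - 1)) == tolower(name)) {
            value = substr($0, i + 1)
            sub(/^[ \t]+/, "", value)
            print value
            exit
        }
    }'
}
```

For example, with -s added to hide the progress meter:

curl -s -I www.twitter.com | header_value location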

Downloading Multiple URLs

Using xargs, we can download multiple URLs with a single command. Perhaps we want to download a series of web pages that make up a single article or tutorial.

Copy these URLs to an editor and save them to a file called “urls-to-download.txt.” We can use xargs to treat the content of each line of the text file as a parameter, which it will feed to curl in turn.

https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#0
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#1
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#2
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#3
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#4
https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-ubuntu#5

This is the command we need to use to have xargs pass these URLs to curl one at a time:

xargs -n 1 curl -O < urls-to-download.txt

Note that this command uses the -O (remote file) output option, which uses an uppercase “O.” This option causes curl to save the retrieved file with the same name that the file has on the remote server.

The -n 1 option tells xargs to pass one argument at a time to curl, so each URL in the text file gets its own curl command.

When you run the command, you’ll see multiple downloads start and finish, one after the other.

Output from xargs and curl downloading multiple files

Checking in the file browser shows the multiple files have been downloaded. Each one bears the name it had on the remote server.

Downloaded files in the Nautilus file browser
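
To see what the xargs command would do before downloading anything, it can be run with echo placed in front of curl. The sketch below uses a temporary file of placeholder URLs (the example.com addresses are made up for illustration) and prints one curl command per URL.

```shell
# Build a throwaway list of placeholder URLs (example.com is used purely
# for illustration).
urls_file=$(mktemp)
printf '%s\n' \
    'https://example.com/page-1.html' \
    'https://example.com/page-2.html' \
    'https://example.com/page-3.html' > "$urls_file"

# Putting echo in front of curl makes xargs print each command it would
# run instead of running it: one line, and so one curl call, per URL.
dry_run=$(xargs -n 1 echo curl -O < "$urls_file")
printf '%s\n' "$dry_run"

rm -f "$urls_file"
```

Removing the echo turns the dry run back into the real download command.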

Downloading Files From an FTP Server

Using curl with a File Transfer Protocol (FTP) server is easy, even if you have to authenticate with a username and password. To pass a username and password with curl use the -u (user) option, and type the username, a colon “:”, and the password. Don’t put a space before or after the colon.

The examples here use a free-for-testing FTP server hosted by Rebex. The test FTP site has a pre-set username of “demo”, and the password is “password.” Don’t use this type of weak username and password on a production or “real” FTP server.

curl -u demo:password ftp://test.rebex.net

curl figures out that we’re pointing it at an FTP server, and returns a list of the files that are present on the server.

List of files on a remote FTP server in a terminal window

The only file on this server is a “readme.txt” file, of 403 bytes in length. Let’s retrieve it. Use the same command as a moment ago, with the filename appended to it:

curl -u demo:password ftp://test.rebex.net/readme.txt

The file is retrieved and curl displays its contents in the terminal window.

The contents of a file retrieved from an FTP server displayed in a terminal window

In almost all cases, it is going to be more convenient to have the retrieved file saved to disk for us, rather than displayed in the terminal window. Once more we can use the -O (remote file) output option to have the file saved to disk, with the same filename that it has on the remote server.

curl -O -u demo:password ftp://test.rebex.net/readme.txt

The file is retrieved and saved to disk. We can use ls to check the file details. It has the same name as the file on the FTP server, and it is the same length, 403 bytes.

ls -hl readme.txt


Sending Parameters to Remote Servers

Some remote servers will accept parameters in requests that are sent to them. The parameters might be used to format the returned data, for example, or they may be used to select the exact data that the user wishes to retrieve. It is often possible to interact with web application programming interfaces (APIs) using curl.

As a simple example, the ipify website has an API that can be queried to ascertain your external IP address.

curl https://api.ipify.org

By adding the format parameter to the command, with the value of “json,” we can again request our external IP address, but this time the returned data will be encoded in the JSON format.

curl https://api.ipify.org?format=json
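
Parameter values that contain spaces or other special characters must be URL-encoded before they are sent. One way to handle that, sketched here, is to let curl build the query string with its --get and --data-urlencode options. The names query_api and CURL_CMD are hypothetical and exist only for this example.

```shell
# query_api BASE_URL [NAME=VALUE ...]
# Sends a GET request to BASE_URL with each NAME=VALUE pair URL-encoded
# and appended to the URL as a query string.
# CURL_CMD is a hypothetical override for the command to run
# (default: curl), which is handy for dry runs.
query_api() {
    base_url=$1
    shift
    remaining=$#
    # Rotate through the parameters, re-adding each one to the end of the
    # argument list with --data-urlencode in front of it.
    while [ "$remaining" -gt 0 ]; do
        set -- "$@" --data-urlencode "$1"
        shift
        remaining=$((remaining - 1))
    done
    "${CURL_CMD:-curl}" --silent --show-error --get "$@" "$base_url"
}
```

The JSON request shown above could then be written as:

query_api https://api.ipify.org format=json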

Here’s another example that makes use of a Google API. It returns a JSON object describing a book. The parameter you must provide is the International Standard Book Number (ISBN) of a book. You can find these on the back cover of most books, usually below a barcode. The parameter we’ll use here is “0131103628.”

curl https://www.googleapis.com/books/v1/volumes?q=isbn:0131103628

The returned data is comprehensive:

Google book API data displayed in a terminal window
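
Because curl writes the response to standard output, it can be piped into another tool that picks out just the fields of interest. The sketch below assumes the separate jq utility is installed and that book titles sit at items[].volumeInfo.title in the response; book_titles is a hypothetical helper name.

```shell
# book_titles
# Reads a Google Books "volumes" response on standard input and prints
# the title of each returned item. Requires the separate jq utility; the
# .items[].volumeInfo.title path is an assumption about the response.
book_titles() {
    # The trailing ? keeps jq quiet if the response has no "items" array.
    jq -r '.items[]?.volumeInfo.title'
}
```

The URL is quoted so that the shell passes the ? through to curl untouched:

curl -s 'https://www.googleapis.com/books/v1/volumes?q=isbn:0131103628' | book_titles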

Sometimes curl, Sometimes wget

If I wanted to download content from a website and have the tree-structure of the website searched recursively for that content, I’d use wget.

If I wanted to interact with a remote server or API, and possibly download some files or web pages, I’d use curl. Especially if the protocol was one of the many not supported by wget.

source: howtogeek.com
