Mastering ‘wget’: How to Install and Use in Linux

The ‘wget’ command is your trusty tool for downloading files from the web in Linux. If you’re new to Linux or haven’t used ‘wget’ before, you might find the process a bit daunting. Don’t worry: whether you’re a Linux newbie or a seasoned sysadmin, this guide has something for you.

This guide will walk you through the process of installing and using the ‘wget’ command in Linux. We’ll start with the basics, showing you how to install ‘wget’ on Debian and Ubuntu using the APT package management system, and on CentOS and AlmaLinux using the YUM package manager. We’ll then delve into more advanced topics, like compiling ‘wget’ from source and installing a specific version of ‘wget’. Finally, we’ll wrap up with guidance on how to use the ‘wget’ command and verify that the correct version is installed.

Let’s get started!

TL;DR: How Do I Install and Use the ‘wget’ Command in Linux?

In most Linux distributions, ‘wget’ comes pre-installed. If it isn’t, you can install it on Debian-based distributions like Ubuntu by running the command sudo apt-get install wget. To use ‘wget’, simply type wget [URL] in your terminal.

# To install wget in Debian based distributions like Ubuntu
sudo apt-get install wget

# To use wget to download a file
wget https://example.com/path/to/file

# Output:
# --2022-01-01 00:00:00--  https://example.com/path/to/file
# Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
# Connecting to example.com (example.com)|93.184.216.34|:443... connected.
# HTTP request sent, awaiting response... 200 OK
# Length: unspecified [text/html]
# Saving to: ‘file’
#
#     [ <=>                                   ] 2,818       --.-K/s   in 0s
#
# 2022-01-01 00:00:00 (35.7 MB/s) - ‘file’ saved [2818]

This command will download the file located at the specified URL and save it in your current directory under the same name as it has on the server. If you want to specify a different filename or location, you can do so with the -O option, like this:

wget -O /path/to/save/file https://example.com/path/to/file

This is just a basic way to install and use the ‘wget’ command in Linux, but there’s much more to learn about ‘wget’. Continue reading for more detailed information and advanced usage scenarios.

Mastering the Basics: Installing the ‘wget’ Command in Linux

The ‘wget’ command is a free utility for non-interactive download of files from the web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. It’s designed to be robust, handling a wide range of common situations including timeouts, network problems, and unexpected termination, making it a staple in any Linux user’s toolkit.

Installing ‘wget’ with APT

If you’re using a Debian-based distribution like Ubuntu, you can install ‘wget’ using the Advanced Package Tool (APT). Here’s how:

sudo apt update
sudo apt install wget

# Output:
# Reading package lists... Done
# Building dependency tree
# Reading state information... Done
# wget is already the newest version (1.19.4-1ubuntu2.2).
# 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

These commands refresh your package lists and then install ‘wget’. The output shows whether ‘wget’ was newly installed or was already present.
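Before installing anything, it can help to check whether ‘wget’ is already on your PATH. Here is a minimal sketch of that check; the helper name have_cmd is our own invention for illustration, not part of ‘wget’ or APT.

```shell
# Succeed when the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd wget; then
    echo "wget is already installed: $(command -v wget)"
else
    echo "wget not found; install it with: sudo apt install wget"
fi
```

The same helper works for any other command name, so you can reuse it in setup scripts.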

Installing ‘wget’ with YUM

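For CentOS, AlmaLinux, and other Red Hat-based distributions, you can use the Yellowdog Updater, Modified (YUM) to install ‘wget’. On releases that have moved to DNF, such as AlmaLinux 8 and later, yum remains available as a compatible front end, so the same command works. Here’s the command: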

sudo yum install wget

# Output:
# Loaded plugins: fastestmirror, ovl
# Loading mirror speeds from cached hostfile
#  * base: mirror.umd.edu
#  * epel: mirror.umd.edu
#  * extras: mirror.umd.edu
#  * updates: mirror.umd.edu
# Resolving Dependencies
# --> Running transaction check
# ---> Package wget.x86_64 0:1.14-18.el7_6.1 will be installed
# --> Finished Dependency Resolution

# Dependencies Resolved

# ================================================================================
#  Package    Arch         Version                  Repository              Size
# ================================================================================
# Installing:
#  wget       x86_64       1.14-18.el7_6.1          base                   547 k

# Transaction Summary
# ================================================================================
# Install  1 Package

# Total download size: 547 k
# Installed size: 2.0 M
# Is this ok [y/d/N]: y
# Downloading packages:
# wget-1.14-18.el7_6.1.x86_64.rpm                                             | 547 kB  00:00:01
# Running transaction check
# Running transaction test
# Transaction test succeeded
# Running transaction
#   Installing : wget-1.14-18.el7_6.1.x86_64                                                     1/1
#   Verifying  : wget-1.14-18.el7_6.1.x86_64                                                     1/1

# Installed:
#   wget.x86_64 0:1.14-18.el7_6.1

# Complete!

This command installs ‘wget’ on your system. The output provides a summary of the installation, including the version of ‘wget’ that was installed.
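If you manage a mix of Debian-based and Red Hat-based machines, a small script can pick the right install command for you. This is only a sketch: the function name install_cmd_for is hypothetical, the -y flag (which skips the confirmation prompt) is our addition, and dnf is included on the assumption that some of your systems use it instead of yum.

```shell
# Print the wget install command for a given package manager name.
install_cmd_for() {
    case "$1" in
        apt-get) echo "sudo apt-get install -y wget" ;;
        dnf)     echo "sudo dnf install -y wget" ;;   # assumption: DNF-based systems
        yum)     echo "sudo yum install -y wget" ;;
        *)       return 1 ;;
    esac
}

# Probe for the first package manager that exists on this machine.
for pm in apt-get dnf yum; do
    if command -v "$pm" >/dev/null 2>&1; then
        echo "Detected $pm; run: $(install_cmd_for "$pm")"
        break
    fi
done
```

The script only prints the command, so you can review it before running anything with sudo.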

Installing ‘wget’ from Source Code

Sometimes, you might need to install ‘wget’ from source code. This could be because your distribution’s package manager doesn’t offer the latest version, or you might need to compile ‘wget’ with specific options.

To install ‘wget’ from source, you first need to download the source code, either with an existing copy of ‘wget’ or with another tool like ‘curl’. Here’s how you can download the source code with ‘wget’ and then build it.

wget https://ftp.gnu.org/gnu/wget/wget-latest.tar.gz

tar -xzf wget-latest.tar.gz
cd wget-*/
./configure
make
sudo make install

# What these steps do (descriptions, not literal output):
# wget: downloads the wget source code
# tar: unpacks the source code
# ./configure: prepares the build for your system
# make: compiles wget
# sudo make install: installs wget

This series of commands downloads the latest ‘wget’ source code, unpacks it, configures it for your system, compiles it, and installs it.
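Compiling with specific options usually means passing flags to ./configure. The sketch below builds into a directory you own (so no sudo is needed) and selects OpenSSL as the TLS library. The PREFIX default, the DRY_RUN switch, and the run helper are our own illustrative choices; the script prints the steps by default and only executes them when you set DRY_RUN=0 inside the unpacked source directory.

```shell
# Install location and dry-run switch; both are illustrative defaults.
PREFIX="${PREFIX:-$HOME/.local}"
DRY_RUN="${DRY_RUN:-1}"

# Print each step in dry-run mode; execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Configure, compile, and install wget under $PREFIX.
build_wget() {
    run ./configure --prefix="$PREFIX" --with-ssl=openssl &&
    run make &&
    run make install
}

build_wget
```

After a real build, the binary lands in $PREFIX/bin, so that directory needs to be on your PATH.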

Installing Different Versions of ‘wget’

Installing from Source

If you need to install a specific version of ‘wget’ from source, you can do so by specifying that version when you download the source code. For example, to install ‘wget’ version 1.20.3, you would use the following command:

wget https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz

tar -xzf wget-1.20.3.tar.gz
cd wget-1.20.3/
./configure
make
sudo make install

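# What these steps do (descriptions, not literal output):
# wget: downloads the wget 1.20.3 source code
# tar: unpacks the wget 1.20.3 source code
# ./configure: prepares the wget 1.20.3 build for your system
# make: compiles wget 1.20.3
# sudo make install: installs wget 1.20.3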
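If you switch between versions often, you can derive the tarball URL from the version number instead of editing it by hand. This is a small sketch; the function name wget_tarball_url is hypothetical, and it assumes the release you want follows the same URL layout as the 1.20.3 example.

```shell
# Build the tarball URL for a given wget release.
# Assumes the same directory layout as the 1.20.3 URL used in this guide.
wget_tarball_url() {
    local version="$1"
    echo "https://ftp.gnu.org/gnu/wget/wget-${version}.tar.gz"
}

# Default to 1.20.3 when no version is passed as the first argument.
VERSION="${1:-1.20.3}"
echo "Would download: $(wget_tarball_url "$VERSION")"
```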

Installing with Package Managers

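If you’re using a package manager like APT or YUM, you might be able to install a specific version of ‘wget’ by specifying that version in the install command. The version must exist in your configured repositories, and the version strings below are illustrative. Here’s how you can do it with APT: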

sudo apt-get install wget=1.20.3-1ubuntu1

# Output:
# Reading package lists... Done
# Building dependency tree
# Reading state information... Done
# wget is already the newest version (1.20.3-1ubuntu1).
# 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

And here’s how you can do it with YUM:

sudo yum install wget-1.20.3-1.el7

# Output:
# Loaded plugins: fastestmirror, ovl
# Loading mirror speeds from cached hostfile
#  * base: mirror.umd.edu
#  * epel: mirror.umd.edu
#  * extras: mirror.umd.edu
#  * updates: mirror.umd.edu
# Resolving Dependencies
# --> Running transaction check
# ---> Package wget.x86_64 0:1.20.3-1.el7 will be installed
# --> Finished Dependency Resolution

# Dependencies Resolved

# ================================================================================
#  Package    Arch         Version                  Repository              Size
# ================================================================================
# Installing:
#  wget       x86_64       1.20.3-1.el7             base                   547 k

# Transaction Summary
# ================================================================================
# Install  1 Package

# Total download size: 547 k
# Installed size: 2.0 M
# Is this ok [y/d/N]: y
# Downloading packages:
# wget-1.20.3-1.el7.x86_64.rpm                                             | 547 kB  00:00:01
# Running transaction check
# Running transaction test
# Transaction test succeeded
# Running transaction
#   Installing : wget-1.20.3-1.el7.x86_64                                                     1/1
#   Verifying  : wget-1.20.3-1.el7.x86_64                                                     1/1

# Installed:
#   wget.x86_64 0:1.20.3-1.el7

# Complete!
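A pinned install only works if the repository actually carries that version, so it is worth listing what is available first. The sketch below prints the listing command for each package manager; the function name list_versions_cmd is our own, while apt-cache madison and yum --showduplicates are the standard ways to see every candidate version.

```shell
# Print the command that lists every wget version a package manager can install.
list_versions_cmd() {
    case "$1" in
        apt) echo "apt-cache madison wget" ;;
        yum) echo "yum --showduplicates list wget" ;;
        *)   echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

list_versions_cmd apt
list_versions_cmd yum
```

Run the printed command, then copy the exact version string from its output into your install command.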

Key Changes and Features in Different Versions

Different versions of ‘wget’ come with different features and bug fixes. The NEWS file shipped in each source tarball is the authoritative list of changes, so check it before relying on a feature in a particular release. Note that HTTP/2 support belongs to the separate wget2 project rather than to the ‘wget’ 1.x series.

Here’s a rough summary of changes across ‘wget’ versions; confirm the details in NEWS:

| Version | Key Features and Changes |
|---|---|
| 1.20 | New options, bug fixes |
| 1.19 | HTTP no-store directive, improved cookie handling |
| 1.18 | Bug fixes, improved FTP support |

Using ‘wget’ and Verifying Installation

Basic Usage of ‘wget’

Once you’ve installed ‘wget’, you can use it to download files. Here’s an example:

wget https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz

# Output:
# --2022-01-01 00:00:00--  https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz
# Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.141, 2001:470:142:3::b
# Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.141|:443... connected.
# HTTP request sent, awaiting response... 200 OK
# Length: 4311643 (4.1M) [application/x-gzip]
# Saving to: ‘wget-1.20.3.tar.gz’

# wget-1.20.3.tar.gz          100%[=====================================>]   4.11M  --.-KB/s    in 0.1s

# 2022-01-01 00:00:00 (39.8 MB/s) - ‘wget-1.20.3.tar.gz’ saved [4311643/4311643]

This command downloads the ‘wget’ version 1.20.3 source code from the GNU website.

Verifying ‘wget’ Installation

You can verify that ‘wget’ is installed and check its version with the following command:

wget --version

# Output:
# GNU Wget 1.20.3 built on linux-gnu.

# +digest +https +ipv6 +iri +large-file +nls +ntlm +opie +psl +ssl/openssl

# Wgetrc:
#     /etc/wgetrc (system)
# Locale:
#     /usr/share/locale
# Compile:
#     gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc" 
#     -DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib 
#     -I../../lib -DHAVE_LIBSSL -DNDEBUG -g -O2 -fdebug-prefix-map=/build/wget-XbZoGD/wget-1.20.3=. 
#     -fstack-protector-strong -Wformat -Werror=format-security -DNO_SSLv2 
#     -D_FILE_OFFSET_BITS=64 -g -Wall
# Link:
#     gcc -DHAVE_LIBSSL -DNDEBUG -g -O2 -fdebug-prefix-map=/build/wget-XbZoGD/wget-1.20.3=. 
#     -fstack-protector-strong -Wformat -Werror=format-security -DNO_SSLv2 
#     -D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now 
#     -lpcre -luuid -lidn2 -lssl -lcrypto -lpsl ftp-opie.o openssl.o http-ntlm.o 
#     ../lib/libgnu.a

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.

# GNU Wget is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with Wget.  If not, see <http://www.gnu.org/licenses/>.

This command displays the version of ‘wget’ that is installed, as well as some information about how it was compiled.
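In a script, you usually want just the version number so you can compare it against a minimum. Here is one way to do that, as a sketch: the function names parse_wget_version and version_ge are hypothetical, the 1.19 threshold is an arbitrary example, and the comparison relies on sort -V from GNU coreutils.

```shell
# Extract the version number from the first line of `wget --version`.
parse_wget_version() {
    awk 'NR == 1 && $1 == "GNU" && $2 == "Wget" { print $3 }'
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample line taken from the output above; on a real system use:
#   installed="$(wget --version | parse_wget_version)"
installed="$(echo 'GNU Wget 1.20.3 built on linux-gnu.' | parse_wget_version)"
if version_ge "$installed" "1.19"; then
    echo "wget $installed meets the minimum version"
fi
```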

Exploring Alternatives to ‘wget’ in Linux

While ‘wget’ is a powerful tool for downloading files in Linux, it’s not the only option. There are other command-line tools and graphical download managers that you can use depending on your specific needs and preferences. Let’s explore some alternatives to ‘wget’.

Using ‘curl’ to Download Files in Linux

‘curl’ is another command-line tool for transferring data with URLs. It supports a wider range of protocols than ‘wget’, including SCP, SFTP, TFTP, TELNET, LDAP(S), FILE, POP3, IMAP, SMTP, RTMP and RTSP. ‘curl’ can upload files as well as download them, which makes it a very versatile tool.

Here’s an example of how you can use ‘curl’ to download a file:

curl -O https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz

# Output:
#   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
#                                  Dload  Upload   Total   Spent    Left  Speed
# 100 4208k  100 4208k    0     0  11.2M      0 --:--:-- --:--:-- --:--:-- 11.2M

In this command, the -O option tells ‘curl’ to save the output to a local file with the same name as the remote file. (Note: that is a capital letter O, not a zero.)

While ‘curl’ is more versatile than ‘wget’, it’s also more complex. ‘wget’ is often easier to use if you’re only interested in downloading files.
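Since some systems ship only one of the two tools, scripts often fall back from one to the other. The following is a minimal sketch of that idea; the function name download and the exit code 127 for "no tool found" are our own choices.

```shell
# Download URL ($1) to file ($2) with whichever tool is installed.
download() {
    local url="$1" out="$2"
    if command -v wget >/dev/null 2>&1; then
        wget -q -O "$out" "$url"
    elif command -v curl >/dev/null 2>&1; then
        # -f fails on HTTP errors, -sS hides progress but keeps error messages,
        # -L follows redirects, -o names the output file
        curl -fsSL -o "$out" "$url"
    else
        echo "neither wget nor curl is installed" >&2
        return 127
    fi
}

# Usage (not run here because it needs network access):
# download https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz wget-1.20.3.tar.gz
```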

Using a Graphical Download Manager

If you prefer a graphical user interface, there are several download managers available for Linux. These tools provide features like download queuing, scheduling, and more.

One popular option is uGet, a powerful download manager with a user-friendly interface. It supports downloading files from HTTP, HTTPS, and FTP servers and allows you to organize your downloads into categories.

Here’s how you can install uGet on Ubuntu:

sudo apt-get install uget

# Output:
# Reading package lists... Done
# Building dependency tree
# Reading state information... Done
# The following additional packages will be installed:
#   aria2 uget-common
# The following NEW packages will be installed:
#   aria2 uget uget-common
# 0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
# Need to get 1,267 kB of archives.
# After this operation, 5,120 kB of additional disk space will be used.
# Do you want to continue? [Y/n] Y
# Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 aria2 amd64 1.33.1-1 [1,201 kB]
# Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 uget-common all 2.2.0-1build1 [58.6 kB]
# Get:3 http://archive.ubuntu.com/ubuntu bionic/universe amd64 uget amd64 2.2.0-1build1 [7,468 B]
# Fetched 1,267 kB in 1s (1,063 kB/s)
# Selecting previously unselected package aria2.
# (Reading database ... 160975 files and directories currently installed.)
# Preparing to unpack .../aria2_1.33.1-1_amd64.deb ...
# Unpacking aria2 (1.33.1-1) ...
# Selecting previously unselected package uget-common.
# Preparing to unpack .../uget-common_2.2.0-1build1_all.deb ...
# Unpacking uget-common (2.2.0-1build1) ...
# Selecting previously unselected package uget.
# Preparing to unpack .../uget_2.2.0-1build1_amd64.deb ...
# Unpacking uget (2.2.0-1build1) ...
# Setting up aria2 (1.33.1-1) ...
# Setting up uget-common (2.2.0-1build1) ...
# Setting up uget (2.2.0-1build1) ...
# Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
# Processing triggers for hicolor-icon-theme (0.17-2) ...

This command installs uGet and its dependencies. You can then start uGet from your application menu.

While graphical download managers like uGet provide a user-friendly interface, they’re not as flexible as command-line tools like ‘wget’ and ‘curl’. They’re best suited for users who prefer a GUI over using the command line.

Navigating Common Issues with ‘wget’

While ‘wget’ is a powerful and reliable tool, like any software, you may encounter issues or unexpected behavior. Let’s go over some common problems and their solutions.

‘wget’: command not found

If you try to run ‘wget’ and see an error message like wget: command not found, it means that ‘wget’ is not installed on your system, or it’s not in your system’s PATH.

To solve this, you can install ‘wget’ using your system’s package manager, as we discussed earlier. If ‘wget’ is installed but not in your PATH, you need to add it. Here’s how you can do that:

# Append the directory that contains wget to PATH, then reload your bash configuration
echo 'export PATH="$PATH:/path/to/wget-directory"' >> ~/.bashrc
source ~/.bashrc

In this command, replace /path/to/wget-directory with the directory that contains your ‘wget’ executable (for example /usr/local/bin), not the path to the executable itself.
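Appending to PATH every time a script runs leaves you with duplicate entries. A small check avoids that, sketched below; the function name path_contains and the directory /opt/wget/bin are hypothetical examples.

```shell
# Succeed when directory $1 is already an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /opt/wget/bin is a hypothetical install directory used for illustration.
WGET_DIR="/opt/wget/bin"
if ! path_contains "$WGET_DIR"; then
    PATH="$PATH:$WGET_DIR"
fi
echo "$PATH"
```

This changes PATH only for the current shell session; to make it permanent, put the same logic in ~/.bashrc.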

SSL/TLS Certificate Errors

If you’re trying to download a file from a website that uses SSL/TLS, you might see an error like ERROR: cannot verify example.com's certificate. This can happen if the website’s SSL/TLS certificate is self-signed, expired, or issued by an untrusted certificate authority.

You can bypass this error with the --no-check-certificate option, but this is not recommended because it makes the connection insecure. A better solution is to update your system’s certificate store or add the specific certificate to your trusted certificates.

# Use with caution: this option makes the connection insecure
wget --no-check-certificate https://example.com/path/to/file

# Output:
# --2022-01-01 00:00:00--  https://example.com/path/to/file
# Loading certificate data from /etc/ssl/certs/ca-certificates.crt
# Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
# Connecting to example.com (example.com)|93.184.216.34|:443... connected.
# WARNING: cannot verify example.com's certificate, issued by ‘CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US’:
#   Unable to locally verify the issuer's authority.
# HTTP request sent, awaiting response... 200 OK
# Length: unspecified [text/html]
# Saving to: ‘file’

# file                 [ <=>                ]   2.81K  --.-KB/s    in 0s

# 2022-01-01 00:00:00 (35.7 MB/s) - ‘file’ saved [2874]
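If the server uses a private or self-signed certificate authority, a safer route than disabling verification is to tell ‘wget’ which CA file to trust with --ca-certificate. Here is a sketch of a wrapper that does this; the function name wget_with_ca is hypothetical, and both arguments are placeholders for a PEM file you have obtained and verified yourself and the URL you want.

```shell
# Download $2 while trusting the CA certificate in $1, keeping verification enabled.
wget_with_ca() {
    local ca_file="$1" url="$2"
    if [ ! -r "$ca_file" ]; then
        echo "CA file not readable: $ca_file" >&2
        return 1
    fi
    wget --ca-certificate="$ca_file" "$url"
}

# Usage (placeholder paths; not run here because it needs network access):
# wget_with_ca /path/to/ca.pem https://example.com/path/to/file
```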

404 Not Found Errors

If you see an error like ERROR 404: Not Found, it means that the file you’re trying to download does not exist on the server. Make sure that the URL is correct and that the file exists.

wget https://example.com/path/to/nonexistent/file

# Output:
# --2022-01-01 00:00:00--  https://example.com/path/to/nonexistent/file
# Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
# Connecting to example.com (example.com)|93.184.216.34|:443... connected.
# HTTP request sent, awaiting response... 404 Not Found
# 2022-01-01 00:00:00 ERROR 404: Not Found.
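In scripts, you can detect these failures from the exit status of ‘wget’ instead of reading its messages. The sketch below translates a few of the documented codes; the function name explain_wget_exit is our own, and the full list is in the EXIT STATUS section of man wget.

```shell
# Translate a wget exit status into a short explanation.
explain_wget_exit() {
    case "$1" in
        0) echo "success" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        8) echo "server issued an error response (for example 404)" ;;
        *) echo "other error (see EXIT STATUS in man wget)" ;;
    esac
}

# In a real script: wget "$url"; explain_wget_exit "$?"
explain_wget_exit 8
```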

These are just a few examples of the issues you might encounter when using ‘wget’. If you come across a problem that’s not covered here, remember that you can always refer to the ‘wget’ man page (man wget) or the ‘wget’ info page (info wget) for more information.

Understanding HTTP and FTP Protocols

Before diving deeper into the ‘wget’ command, it’s crucial to understand the protocols it primarily deals with – HTTP and FTP.

HTTP: The Foundation of Data Communication on the Web

HTTP (Hypertext Transfer Protocol) is the protocol that web browsers and servers use to exchange data. It is based on a request-response model. When you type a URL into your browser, your browser sends an HTTP request to the server where the website is hosted. The server then responds with the requested data, allowing your browser to display the website.

# Example of an HTTP request using wget
wget http://example.com

# Output:
# --2022-01-01 00:00:00--  http://example.com/
# Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
# Connecting to example.com (example.com)|93.184.216.34|:80... connected.
# HTTP request sent, awaiting response... 200 OK
# Length: 1256 (1.2K) [text/html]
# Saving to: ‘index.html’

# index.html          100%[===================>]   1.23K  --.-KB/s    in 0s

# 2022-01-01 00:00:00 (63.6 MB/s) - ‘index.html’ saved [1256/1256]

In this example, ‘wget’ sends an HTTP request to example.com and saves the response to a file named index.html.

FTP: The Standard Network Protocol for File Transfers

FTP (File Transfer Protocol) is a standard network protocol used for transferring files between a client and a server on a computer network. ‘wget’ can retrieve files using FTP as well as HTTP.

# Example of an FTP request using wget
wget ftp://ftp.example.com/somefile.txt

# Output:
# --2022-01-01 00:00:00--  ftp://ftp.example.com/somefile.txt
#           => ‘somefile.txt’
# Resolving ftp.example.com (ftp.example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
# Connecting to ftp.example.com (ftp.example.com)|93.184.216.34|:21... connected.
# Logging in as anonymous ... Logged in!
# ==> SYST ... done.    ==> PWD ... done.
# ==> TYPE I ... done.  ==> CWD not needed.
# ==> SIZE somefile.txt ... 1256
# ==> PASV ... done.    ==> RETR somefile.txt ... done.
# Length: 1256 (1.2K) (unauthoritative)

# somefile.txt        100%[===================>]   1.23K  --.-KB/s    in 0s

# 2022-01-01 00:00:00 (63.6 MB/s) - ‘somefile.txt’ saved [1256]

In this example, ‘wget’ retrieves a file from an FTP server and saves it to somefile.txt.

The Importance of Command-Line Downloads in Linux

Being able to download files from the command line is an essential skill for system administrators and developers. It allows you to automate tasks, such as downloading logs or data files, updating software, or deploying applications. With ‘wget’, you can download files directly to your server without needing a graphical interface, which is often the case when working with remote servers.

The Power of ‘wget’ in System Administration and Web Scraping

The ‘wget’ command is not just a tool for downloading files. Its versatility extends into areas like system administration and web scraping, proving its worth as an integral part of any Linux user’s toolkit.

‘wget’ in System Administration

For system administrators, ‘wget’ can be a game-changer. Its ability to download files non-interactively makes it perfect for automating tasks. For example, you can use ‘wget’ in a cron job to download daily or weekly updates, logs, or data dumps.

Here’s an example of how you might set up a cron job to download a daily log file from a web server:

# Edit the crontab file

crontab -e

# Add the following line to the crontab file:

0 0 * * * wget -q -O /path/to/save/logfile.txt http://example.com/path/to/logfile.txt

# What this does:
# The cron entry runs wget every day at midnight (0 0 * * *). The -q option makes wget run quietly, and the -O option specifies the output file. The command downloads logfile.txt from example.com and saves it to /path/to/save/logfile.txt, overwriting the previous day's copy.
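Because -O writes to the same file every night, you may prefer a date-stamped filename. Putting that logic in a small script also sidesteps a crontab quirk: a bare % in a crontab line is treated as a newline unless escaped. This is a sketch only; the names log_target and fetch_log are hypothetical, and the --tries and --timeout values are arbitrary examples.

```shell
# Build a date-stamped target path so each night's download is kept.
log_target() {
    local dir="$1" day="$2"
    echo "${dir}/logfile-${day}.txt"
}

# Hypothetical wrapper that a cron entry could call instead of wget directly.
fetch_log() {
    local target
    target="$(log_target /path/to/save "$(date +%F)")"
    # --tries and --timeout bound how long a flaky server can stall the job.
    wget -q --tries=3 --timeout=30 -O "$target" http://example.com/path/to/logfile.txt
}

# Show the path that would be used for a given day.
log_target /path/to/save 2022-01-01
```

To use it from cron, save the functions in a script that ends with a call to fetch_log, and point the crontab entry at that script.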

‘wget’ in Web Scraping

Web scraping is another area where ‘wget’ shines. With its recursive download feature, you can download entire websites or web pages for offline viewing or data extraction. This is particularly useful for web developers and data scientists.

Here’s an example of how you can use ‘wget’ to download an entire website:

wget --recursive --page-requisites --adjust-extension --convert-links --domains example.com --no-parent https://www.example.com/somepage.html

# What this does:
# This command downloads the page at www.example.com/somepage.html and all assets (like CSS, JS, and images) needed to view it offline. It also follows links to other pages within the same domain, without ascending above the starting directory. --adjust-extension (the current name for the older --html-extension) adds a .html suffix to HTML files that lack one, and --convert-links rewrites links so they work locally. --no-clobber is left out because wget ignores it when --convert-links is also given.
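Recursive downloads can put real load on a server, so it is considerate to throttle them. The bash sketch below collects a few throttling options in an array and prints the resulting command; the specific values (depth 2, one-second wait, 200k rate cap) and the https:// scheme on the URL are illustrative assumptions.

```shell
# Options that throttle a recursive download; the values are illustrative.
polite_opts=(
    --recursive
    --level=2          # follow links at most two hops from the start page
    --wait=1           # pause one second between requests
    --limit-rate=200k  # cap the download rate at about 200 KB/s
    --no-parent
)

# Print the command instead of running it, since it needs network access.
echo wget "${polite_opts[@]}" https://www.example.com/somepage.html
```

Remove the leading echo to run the download for real, and check the site's terms of use before scraping it.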

Diving Deeper into Shell Scripting and Regular Expressions

If you’re interested in harnessing the full power of ‘wget’, it’s worth exploring related concepts like shell scripting and regular expressions. Shell scripts allow you to automate tasks by combining multiple commands into a single script. Regular expressions, on the other hand, can help you match and manipulate text patterns, which is useful when dealing with large amounts of data or complex file structures.
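As a small taste of how the two fit together, the sketch below uses an extended regular expression to pick out only the .tar.gz links from a list of URLs. The function name filter_tarballs and the sample URL list are made up for illustration.

```shell
# Keep only URLs that end in .tar.gz (an extended regular expression).
filter_tarballs() {
    grep -E '\.tar\.gz$'
}

# Sample list; in practice this might come from a file or another command.
urls='https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz
https://ftp.gnu.org/gnu/wget/wget-1.20.3.tar.gz.sig
https://example.com/index.html'

printf '%s\n' "$urls" | filter_tarballs
```

To download the matches instead of printing them, pipe the result into wget --input-file=-, which reads URLs from standard input.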

Further Resources for Mastering ‘wget’

If you want to explore more about ‘wget’, here are some resources that you might find useful:

  1. GNU ‘wget’ Manual: This is the official manual for ‘wget’ from the GNU project. It’s a comprehensive resource that covers all features and options of ‘wget’.

  2. Linux Command Line Basics: This Udemy course covers the basics of the Linux command line, including how to use commands like ‘wget’.

  3. Web Scraping with Bash: This tutorial explains how to use bash command-line tools for web scraping.

Wrapping Up: Mastering the ‘wget’ Command in Linux

In this comprehensive guide, we’ve explored the ins and outs of the ‘wget’ command in Linux, a powerful tool for downloading files from the web.

We kicked things off with the basics, walking you through the installation of ‘wget’ on various Linux distributions. We then delved into more advanced topics, such as compiling ‘wget’ from source, installing a specific version, and automating downloads with cron.

Along the way, we tackled common challenges you might encounter when using ‘wget’, such as command not found errors, SSL/TLS certificate errors, and 404 Not Found errors, providing you with solutions and workarounds for each issue.

We also looked at alternative approaches to file downloading in Linux, comparing ‘wget’ with other methods like the ‘curl’ command and graphical download managers. Here’s a quick comparison of these methods:

| Method | Flexibility | Complexity | User Interface |
|---|---|---|---|
| ‘wget’ | High | Moderate | Command Line |
| ‘curl’ | Very High | High | Command Line |
| Graphical Download Managers | Moderate | Low | Graphical User Interface |

Whether you’re just starting out with ‘wget’ or you’re looking to level up your file downloading skills, we hope this guide has given you a deeper understanding of ‘wget’ and its capabilities.

With its balance of flexibility, power, and ease of use, ‘wget’ is a dependable tool for downloading files in Linux. Happy downloading!