The official website describes wget as follows: "GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS, the most widely used Internet protocols. It is a non-interactive command-line tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc."
In simple words, wget is a tool for retrieving content from the internet. It comes preinstalled on most Linux distributions and is mostly used in cron jobs and scripts, either to download files or to fetch data from an API.
If you ever want to download an entire website from the internet, wget can do the job:
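As a rough sketch of the scripted use mentioned above, the function below fetches a response from an API and writes it to standard output. The function name fetch_json, the endpoint https://api.example.com/status and the WGET override variable are placeholders introduced here for illustration; the retry and timeout values are arbitrary.

```shell
#!/bin/sh
# WGET can be overridden (for example WGET=echo) to print the command
# instead of running it. This variable is an illustrative addition.
WGET="${WGET:-wget}"

# Hypothetical helper: fetch a URL and write the response body to stdout.
#   --quiet              no log or progress output, so only the body is printed
#   --tries, --timeout   give up quickly, which suits unattended cron jobs
#   --output-document=-  write the body to stdout instead of a file
fetch_json() {
    "$WGET" --quiet --tries=3 --timeout=10 --output-document=- "$1"
}
```

Calling fetch_json https://api.example.com/status > status.json from a cron script would save the response to a file.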
$ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --no-parent https://saur.in/
This command downloads the entire https://saur.in website into the current directory (wget creates a directory named after the host): all pages and child pages, together with their CSS, JS, images, videos and so on. It also converts the links so that they work locally and offline.
Options used are:
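- --recursive : Follow the links found on each page and download them in turn. The default maximum depth is 5 levels; use --level to change it. By default wget stays on the starting host. Links to other hosts are followed only if --span-hosts is given, in which case --domains can limit which ones.
- --no-clobber : Don't overwrite existing files. This is useful in two ways: 1) a file that has already been saved is not downloaded again, and 2) an interrupted download can be restarted without re-fetching the files already present. Note that recent wget versions print a warning and ignore --no-clobber when --convert-links is also specified.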
- --page-requisites : Get all the elements that compose the page (Images, CSS, JS, fonts, videos and so on).
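- --html-extension : Save HTML pages with the .html extension, even when the URL does not end in it. Since wget 1.12 this option is called --adjust-extension; the old name is deprecated.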
- --convert-links : Convert all links so that they work locally, off-line.
- --restrict-file-names=windows : Escape characters in file names that are not allowed on Windows, so that the downloaded files work there as well.
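The same command can also be written with wget's short option names, which keeps scripts compact. The sketch below wraps it in a function; the name mirror_site and the WGET override variable are illustrative additions, not part of the original command.

```shell
#!/bin/sh
# WGET can be overridden (for example WGET=echo) to print the command
# instead of running it. This variable is an illustrative addition.
WGET="${WGET:-wget}"

# Hypothetical wrapper around the mirroring command, using short options:
#   -r   --recursive          -nc  --no-clobber
#   -p   --page-requisites    -E   --html-extension (newer name: --adjust-extension)
#   -k   --convert-links      -np  --no-parent
mirror_site() {
    "$WGET" -r -nc -p -E -k --restrict-file-names=windows -np "$1"
}
```

Running mirror_site https://saur.in/ should behave like the long-form command above.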
A few other useful options:
- --domains saur.in : Don't follow links outside the saur.in domain. Use a comma-separated list for multiple domains.
- --no-parent : Never ascend to the parent directory of the starting URL; only pages at or below it are downloaded.
- --quiet : Print no output to the terminal (the default is --verbose).
- --show-progress : Display the progress bar, even when --quiet is used.
- --timestamping : Don't download a file again unless the remote copy is newer than the local one.
- --server-response : Print the headers sent by HTTP servers and the responses sent by FTP servers.
- --ignore-case : Ignore case when matching files/directories.
- --no-directories : Don't create a directory hierarchy. Save all files in the current directory.
- --https-only : In recursive mode, follow only secure HTTPS links.
- --relative : Follow relative links only.
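To illustrate how some of these options combine, the sketch below re-runs a recursive download and fetches only the files that have changed on the server. It uses --timestamping instead of --no-clobber because wget refuses to combine the two. The function name update_mirror and the WGET override variable are hypothetical, and --show-progress and --https-only assume a reasonably recent wget.

```shell
#!/bin/sh
# WGET can be overridden (for example WGET=echo) to print the command
# instead of running it. This variable is an illustrative addition.
WGET="${WGET:-wget}"

# Hypothetical helper: refresh a local copy of a site.
#   --timestamping             skip files that are not newer than the local copy
#   --domains=saur.in          stay within the saur.in domain
#   --https-only               follow only HTTPS links while recursing
#   --quiet --show-progress    suppress the log but keep the progress bar
update_mirror() {
    "$WGET" --recursive --no-parent \
        --timestamping \
        --domains=saur.in \
        --https-only \
        --quiet --show-progress \
        "$1"
}
```

Calling update_mirror https://saur.in/ from a cron job would keep the local copy current without downloading unchanged files again.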