Shell Script To Show All the Internal and External Links From a URL
Last Updated: 22 Feb, 2023
Web developers connect pages to one another to give a site's content a hierarchy; this is called webpage linking. There are two types of webpage linking: internal and external. An internal link points to another page on the same website, while an external link points to a different website or domain. External links also matter for search ranking: links from other sites that point to your website can improve its rank. Here we write a shell script that prints all the links of a page on the terminal screen. The only input to the script is the URL of the webpage whose links we want to fetch.
Note: A website can be accessed in two ways: through a web browser, or through terminal commands, which support a limited set of protocols. Because terminal commands have these limitations, we will also use a terminal-based web browser to connect to the website.
CLI:
For the command line, we will use the tool "lynx". Lynx is a terminal-based web browser that does not display images or other multimedia content, which makes it much faster than other browsers.
# sudo apt install lynx -y
Install lynx terminal browser
Let us list the links on the GeeksForGeeks projects page. Before that, we must understand the relevant options of the lynx browser.
- -dump: This will dump the formatted output of the document.
- -listonly: This will list all the links present at the given URL. It is used together with -dump.
Now apply these options:
# lynx -dump -listonly https://www.geeksforgeeks.org/computer-science-projects/
dump all links on terminal
Or redirect this terminal output to any text file:
# lynx -dump -listonly https://www.geeksforgeeks.org/computer-science-projects/ > links.txt

Now view the links using the cat command:
# cat links.txt
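The saved list still carries the reference numbering that lynx prints in front of each link (lines such as "   1. https://...") and may repeat a URL. As a minimal sketch, assuming that numbered output format, the list can be reduced to bare, unique URLs. The name clean_lynx_list is a hypothetical helper, not a lynx feature.

```shell
#!/bin/bash
# Hypothetical helper: read "lynx -dump -listonly" output on stdin and
# print each URL once, without the leading "N." reference number.
clean_lynx_list() {
    # Keep only lines whose first field looks like "12." and print the
    # second field (the URL); header lines such as "References" are skipped.
    awk '$1 ~ /^[0-9]+\.$/ { print $2 }' | sort -u
}

# Example use (requires lynx and network access, so it is left commented out):
# lynx -dump -listonly "$url" | clean_lynx_list > links.txt
```

Sorting with -u changes the order of the links; drop the sort if the original page order matters.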

Shell Script
We can do all of the work above in a script file, which makes it easy to repeat. There are different ways to extract the links; here we use a regular expression with the "sed" command. First, we download the webpage to a text file, and then we apply the regular expression to that file.
Now we will create a file using the nano editor. Code explanation is given below.
# nano returnLinks.sh

Below is the implementation:
#!/bin/bash
# Read the URL from standard input
read -r urL
# wget downloads this webpage into the file named webpage.txt
# The -O option writes the downloaded content to the given file.
wget -O webpage.txt "$urL"
# Now apply the stream editor (sed) to extract the href values from the file.
# Note: because .* is greedy, this prints only the last href found on each line.
sed -n 's/.*href="\([^"]*\).*/\1/p' webpage.txt
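The sed expression above prints at most one link per input line, because the leading .* is greedy and swallows every href except the last one on that line. Pages that are minified or that place several anchors on one line will therefore lose links. A minimal sketch of one alternative, assuming a grep that supports -o (GNU and BSD grep do) and hrefs written with double quotes; extract_hrefs is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print every double-quoted href value in a file,
# one per output line, even when several appear on the same input line.
extract_hrefs() {
    # grep -o emits each match on its own line; sed then strips the
    # leading href=" and the trailing quote from every match.
    grep -o 'href="[^"]*"' "$1" | sed 's/^href="//; s/"$//'
}

# Example use inside the script, after the wget step:
# extract_hrefs webpage.txt
```

This still misses single-quoted or unquoted attributes and upper-case HREF, so it remains a rough filter and not an HTML parser.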
Give permission to the file:
To execute a file from the terminal, we first make it executable by changing its access mode. Mode 777 would grant read, write, and execute permission to the owner, the group, and all other users, which is more than this script needs. Mode 755 leaves write access with the owner only, while everyone can still read and execute the file.
# chmod 755 returnLinks.sh
Now execute the shell script and give the URL:
# ./returnLinks.sh
shell script returns links
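The script waits for the URL on standard input, which is awkward when it is called from another script. A minimal sketch, assuming we also want to accept the URL as the first command-line argument and to reject anything that is not an http(s) URL; get_url is a hypothetical helper, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: take the URL from the first argument if present,
# otherwise read one line from standard input, then validate the scheme.
get_url() {
    local url="${1:-}"
    if [ -z "$url" ]; then
        read -r url
    fi
    case "$url" in
        http://*|https://*) printf '%s\n' "$url" ;;
        *) echo "error: expected an http(s) URL" >&2; return 1 ;;
    esac
}

# Example use at the top of returnLinks.sh, replacing the plain read:
# urL="$(get_url "$@")" || exit 1
```

With this change the script can be run either as ./returnLinks.sh followed by the URL, or as before by typing or piping the URL in.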
You can also store the output in a file:
The script is the same; only an output redirection is added to the stream editor command so that the output is stored in a file.
#!/bin/bash
# Read the URL from standard input
read -r urL
# wget downloads this webpage into the file named webpage.txt
wget -O webpage.txt "$urL"
# Apply the stream editor (sed) to extract the href values from the file.
# Here the output is redirected to a text file. All the other code is the same.
sed -n 's/.*href="\([^"]*\).*/\1/p' webpage.txt > links.txt

Now open the file links.txt
We will now print the file and check that the links are present.
# cat links.txt
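The file lists internal and external links mixed together. As a rough sketch of how they could be told apart, each absolute link can be compared with the host of the page it came from. The helper name classify_links and the tab-separated output format are assumptions made for this illustration. Hosts are compared literally, so www.example.com and example.com count as different sites.

```shell
#!/bin/bash
# Hypothetical helper: tag every link in a file as internal or external
# relative to the page URL it was collected from.
# Usage: classify_links PAGE_URL LINKS_FILE
classify_links() {
    local page_url="$1" file="$2"
    awk -v page="$page_url" '
        # Return the lower-cased host of an absolute http(s) URL.
        function host_of(u) {
            sub("^https?://", "", u)   # drop the scheme
            sub("[/?#:].*$", "", u)    # drop port, path, query, fragment
            return tolower(u)
        }
        BEGIN { site = host_of(page) }
        # Absolute links: compare the host with the host of the page.
        $0 ~ "^https?://" {
            kind = (host_of($0) == site) ? "internal" : "external"
            print kind "\t" $0
            next
        }
        # Root-relative paths, dot-relative paths and fragments stay on the site.
        $0 ~ "^[/#.]" && $0 !~ "^//" { print "internal\t" $0 }
        # Anything else (mailto:, javascript:, protocol-relative //host) is skipped.
    ' "$file"
}

# Example use, with the same URL that was given to the script:
# classify_links "$urL" links.txt
```

Piping the result through grep for lines starting with "external" or "internal" gives either list on its own.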
