Python | Check for URL in a String
Last Updated: 05 Mar, 2025
We are given a string that may contain one or more URLs and our task is to extract them efficiently. This is useful for web scraping, text processing, and data validation. For example:
Input:
s = "My Profile: https://auth.geeksforgeeks.org/user/Prajjwal%20/articles in the portal of https://www.geeksforgeeks.org/"
Output:
['https://auth.geeksforgeeks.org/user/Prajjwal%20/articles', 'https://www.geeksforgeeks.org/']
Using re.findall()
Python's re (regular expression) module lets us extract patterns such as URLs from text. Its re.findall() function finds all non-overlapping occurrences of a pattern in a given string and returns them as a list.
Python
import re
s = 'My Profile: https://auth.geeksforgeeks.org/user/Rayyyy%20/articles in the portal of https://www.geeksforgeeks.org/'
pattern = r'https?://\S+|www\.\S+'
print("URLs:", re.findall(pattern, s))
Output:
URLs: ['https://auth.geeksforgeeks.org/user/Rayyyy%20/articles', 'https://www.geeksforgeeks.org/']
Explanation:
- r'https?://\S+|www\.\S+' is a regex pattern that matches text starting with http://, https://, or www. and continuing up to the next whitespace character.
- findall() returns all matches as a list.
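One caveat with this pattern is that \S+ also captures punctuation that directly follows a URL, such as a trailing comma or full stop. The following is a minimal sketch of one way to handle that, assuming it is acceptable to strip a small set of trailing punctuation characters. The sample text and the helper name extract_urls are illustrative and not part of the original example.

```python
import re

# Same pattern as above: http://, https://, or www. followed by non-space characters
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')

def extract_urls(text):
    """Return URLs found in text, with common trailing punctuation removed."""
    return [match.rstrip('.,;:!?)') for match in URL_PATTERN.findall(text)]

text = 'Visit https://www.geeksforgeeks.org/, or see www.example.com.'
print(extract_urls(text))
# ['https://www.geeksforgeeks.org/', 'www.example.com']
```

Note that stripping ")" would also damage a URL that legitimately ends with a closing parenthesis, so the set of characters to strip may need adjusting for the text at hand.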
Using urlparse()
The urlparse() function from Python’s urllib.parse module helps break down a URL into its key parts, such as the scheme (http, https), domain name, path, query parameters, and fragments. This function is useful for validating and extracting URLs from text by checking if a word follows a proper URL structure.
Python
from urllib.parse import urlparse
s = 'My Profile: https://auth.geeksforgeeks.org/user/Rayyyy%20/articles in the portal of https://www.geeksforgeeks.org/'
# Split the string into words
split_s = s.split()
# Empty list to collect URLs
urls = []
for word in split_s:
    parsed = urlparse(word)
    if parsed.scheme and parsed.netloc:
        urls.append(word)
print("URLs:", urls)
Output:
URLs: ['https://auth.geeksforgeeks.org/user/Rayyyy%20/articles', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split() splits the string into words.
- urlparse(word) parses each word, and the word is kept only if it has both a scheme (such as http or https) and a network location (domain).
- matching words are added to the urls list using append().
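To make the check on scheme and netloc more concrete, here is a small sketch of what urlparse() returns for a few inputs. The ftp:// address is a made-up example, included to show that the condition used in the loop accepts any scheme, not only http and https.

```python
from urllib.parse import urlparse

parsed = urlparse('https://auth.geeksforgeeks.org/user/Rayyyy%20/articles')
print(parsed.scheme)  # https
print(parsed.netloc)  # auth.geeksforgeeks.org
print(parsed.path)    # /user/Rayyyy%20/articles

# A bare domain has no scheme, so the whole string ends up in path
bare = urlparse('www.geeksforgeeks.org')
print(bool(bare.scheme and bare.netloc))  # False

# Any scheme with a network location passes the scheme-and-netloc check
ftp = urlparse('ftp://files.example.com/data.csv')
print(bool(ftp.scheme and ftp.netloc))  # True

# A stricter check that accepts only web URLs
is_web = ftp.scheme in ('http', 'https') and bool(ftp.netloc)
print(is_web)  # False
```

If only web links are wanted, the stricter condition on the last lines can replace the one used in the loop.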
Using urlextract
urlextract is a third-party library, so we first need to install it by running "pip install urlextract" in the terminal. It offers a pre-built solution for finding URLs in text: its URLExtract class identifies URLs without custom patterns, which makes it a convenient choice when URLs are difficult to extract by hand.
Python
from urlextract import URLExtract
s = 'My Profile: https://auth.geeksforgeeks.org/user/Prajjwal%20/articles in the portal of https://www.geeksforgeeks.org/'
extractor = URLExtract()
urls = extractor.find_urls(s)
print("URLs:", urls)
Output:
URLs: ['https://auth.geeksforgeeks.org/user/Prajjwal%20/articles', 'https://www.geeksforgeeks.org/']
Explanation:
- import URLExtract from the urlextract library.
- URLExtract() creates an extractor object to scan the string.
- find_urls() detects all URLs in s and returns them as a list, so no manual splitting or validation is needed.
Using startswith()
A simple approach is to split the string with .split() and check whether each word starts with "http://" or "https://" using the built-in .startswith() method. Every word that does is added to the list of extracted URLs.
Python
s = 'My Profile: https://auth.geeksforgeeks.org/user/Rayyyy%20/articles in the portal of https://www.geeksforgeeks.org/'
x = s.split()
# Empty list to collect the URLs
res = []
for i in x:
    # Keep only words that begin with a full http:// or https:// prefix
    if i.startswith("https://") or i.startswith("http://"):
        res.append(i)
print("Urls:", res)
Output:
Urls: ['https://auth.geeksforgeeks.org/user/Rayyyy%20/articles', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split() splits the string into words.
- the if statement checks whether each word starts with http:// or https://.
- if it does, the word is added to the list of URLs using .append().
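The same idea can be written more compactly, since startswith() also accepts a tuple of prefixes. The sketch below additionally lowercases each word before the check so that an upper-case scheme is matched. The sample string and the variable names are illustrative assumptions, not part of the original example.

```python
s = 'Docs at HTTPS://www.geeksforgeeks.org/ and http://example.com/page'

# startswith() accepts a tuple and returns True if any prefix matches
prefixes = ('http://', 'https://')

# Lowercase only for the comparison; the original word is what gets stored
urls = [word for word in s.split() if word.lower().startswith(prefixes)]
print(urls)
# ['HTTPS://www.geeksforgeeks.org/', 'http://example.com/page']
```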
Using find() method
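find() is a built-in string method that returns the lowest index at which a substring occurs, or -1 if the substring is not present. A word starts with a given prefix exactly when find() returns 0 for that prefix, so we can use it to identify and extract URLs from a string. Here's how: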
Python
s = 'My Profile: https://auth.geeksforgeeks.org/user/Rayyyy%20/articles in the portal of https://www.geeksforgeeks.org/'
split_s = s.split()
res = []
for i in split_s:
    # find() returns 0 only when the word begins with the prefix
    if i.find("https://") == 0 or i.find("http://") == 0:
        res.append(i)
print("Urls:", res)
Output:
Urls: ['https://auth.geeksforgeeks.org/user/Rayyyy%20/articles', 'https://www.geeksforgeeks.org/']
Explanation:
- s.split() splits the string into words.
- i.find() identifies a URL by checking that the prefix is found at index 0.
- the URLs are added to the list res using .append().
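The comparison with 0 matters because find() reports where a substring occurs, not merely whether it occurs. A short sketch of the three possible outcomes, using an illustrative string in which the URL is glued to the preceding text:

```python
word = 'https://www.geeksforgeeks.org/'
print(word.find('https://'))  # 0: the prefix starts at index 0
print(word.find('http://'))   # -1: 'http://' does not occur in this word

embedded = 'Link:http://example.com'
print(embedded.find('http://'))  # 5: present, but not at the start
```

With the == 0 check, the last word would be skipped even though it contains a URL, which is the same behaviour as the startswith() approach.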