How do I extract all URLs from a string in Python?

In Python, use the re module: import re; urls = re.findall(r'https?://[^\s]+', text). For a no-code solution, use the QuickTextTools URL Extractor.
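
A slightly fuller version of that one-liner, as a minimal sketch. The function name `extract_urls` and the sample text are illustrative assumptions, and the pattern is a simple one for demonstration, not necessarily the one the tool uses. Note the `\s` (whitespace) class in the pattern: the match runs until the next whitespace character.

```python
import re

# Match http:// or https:// followed by a run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://[^\s]+")


def extract_urls(text: str) -> list[str]:
    """Return every http(s) URL found in text, in order of appearance."""
    return URL_PATTERN.findall(text)


sample = (
    "Docs: https://docs.example.com/api/v2 and "
    "http://old.example.com/endpoint?token=abc&format=json"
)
print(extract_urls(sample))
```

Because this pattern stops only at whitespace, punctuation that directly follows a URL (a closing quote, a full stop) ends up inside the match, so real input usually needs some trimming afterwards.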

Does the tool extract mailto: or ftp:// links?

No. The tool currently extracts only http:// and https:// URLs. mailto: and ftp:// links are not extracted in this version.
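
If you need those schemes in your own script, a Python pattern can be widened to cover them. This is a sketch under assumptions: the pattern and the sample text are hypothetical and do not describe how the tool itself works.

```python
import re

# Hypothetical wider pattern: http, https, and ftp URLs, plus mailto: links.
# The group is non-capturing so findall() returns the full match.
MULTI_SCHEME_PATTERN = re.compile(r"(?:https?|ftp)://[^\s]+|mailto:[^\s]+")

text = (
    "Write to mailto:team@example.com or fetch "
    "ftp://files.example.com/pub/a.zip and https://example.com"
)
links = MULTI_SCHEME_PATTERN.findall(text)
print(links)
```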

Does the tool extract bare domains like example.com?

No. Only absolute URLs with http:// or https:// prefixes are extracted. Bare domains are excluded to prevent false positives from regular English text.

QuickTextTools

✨ Professional Text Utilities

URL EXTRACTION TOOL

Extract URLs

🔗 Instantly extract all URLs from any text, HTML, log files, or documents. Perfect for SEO audits and web scraping workflows.

✅ 100% Free · 🔓 No Login Required · ⚡ Instant Browser Tool

Instant Extraction

100% Browser-Side

Deduplication & Sorting

Options: Include http:// · Include https:// · Remove Duplicates · Sort Alphabetically

🔧 How It Works

Extract every URL from any text in just three simple steps

Paste Your Text

Paste any text containing URLs — HTML source, emails, documents, logs, JSON, markdown, or raw text

Choose Options

Filter by protocol, remove duplicates, or sort alphabetically to get exactly the list you need

Extract & Export

Click Extract URLs — copy results to clipboard or download as a .txt file instantly

LEARN THE BASICS

What Is URL Extraction — and When Do You Need It?

Understanding how URL extraction works helps you apply it effectively across SEO, security, development, and research workflows.

What Is URL Extraction?

URL extraction is the process of automatically identifying and pulling out all web addresses from a block of text. Instead of manually scanning thousands of characters for links, a regex-based extractor finds every URL matching a known pattern in milliseconds.

Modern URL extraction tools recognize URLs embedded in HTML attributes, plain sentences, JSON values, log lines, markdown link syntax, and anywhere else a URL might appear — without requiring any markup structure.

Why Not Just Use Ctrl+F?

A manual search finds one URL at a time and requires you to know what you are looking for. URL extraction finds every URL simultaneously, regardless of where it appears, what domain it points to, or how it is embedded in the surrounding text.

For any task involving more than a handful of links — SEO audits, log analysis, security reviews — manual copy-paste is error-prone and time-consuming. Automated extraction is the professional-grade solution.

How the Extraction Regex Works (Under the Hood)

STEP 1 — Pattern Match

A carefully crafted regex scans the entire text for strings beginning with http:// or https:// followed by valid URL characters including paths, query strings, and fragments.

STEP 2 — Filter & Deduplicate

Optional filters drop URLs whose protocol (http:// or https://) is not selected. The deduplication step converts the match array to a Set and back, which removes exact duplicates while keeping the first occurrence of each URL in its original position.

STEP 3 — Output

Matched URLs are joined with newline characters — one URL per line — making the output directly usable in spreadsheets, scripts, and link checkers without any further parsing.
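
The three steps above can be sketched in Python. This is an approximation for illustration: the tool itself runs JavaScript in the browser, and the pattern, the function name `extract`, and its parameters are assumptions rather than the tool's actual code.

```python
import re

# Assumed pattern: stop at whitespace, quotes, and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract(text, include_http=True, include_https=True,
            dedupe=True, sort=False):
    # Step 1: pattern match over the whole input.
    urls = URL_PATTERN.findall(text)

    # Step 2: filter by protocol, then deduplicate.
    urls = [
        u for u in urls
        if (include_https and u.startswith("https://"))
        or (include_http and u.startswith("http://"))
    ]
    if dedupe:
        # dict.fromkeys keeps the first occurrence of each URL, in order.
        urls = list(dict.fromkeys(urls))
    if sort:
        urls = sorted(urls)

    # Step 3: one URL per line.
    return "\n".join(urls)


text = "a https://b.example.com x http://a.example.com y https://b.example.com"
print(extract(text))
```

Python's `dict.fromkeys` plays the role of the Set round-trip here, since both keep insertion order.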

REAL-WORLD USAGE

Who Uses URL Extraction?

From SEO professionals to security researchers — here are the most common scenarios where this tool saves hours of manual work.

Web Scraping & Crawling

Quickly pull all links from a scraped HTML page without writing a parser. Paste the raw HTML source and get a clean list of every URL on the page in seconds — perfect for auditing site structure or building crawl queues.

SEO & Link Auditing

Extract all outbound links from a page's source code or a content export. Run deduplication and sort alphabetically to review your link profile, find broken outbound links, or catalog external references for a backlink audit.

Log File Analysis

Server access logs, application error logs, and API debug logs are filled with URLs. Paste a log excerpt and extract every endpoint, redirect target, or external resource referenced — without manually scanning thousands of lines.

Security & Phishing Analysis

Security analysts paste suspicious email bodies or malicious document text to extract all embedded URLs for threat intelligence review, domain reputation checks, and safe-browsing lookups — without clicking any links.

Research & Citation Management

Researchers paste bibliography sections, reference lists, or academic paper text to extract all cited URLs at once. Export as a .txt file and validate each source link without manually copying one by one.

Content Migration & CMS Work

When migrating content between CMS platforms, extract all media URLs, internal links, and external references from exported HTML or markdown files. Use the list to update references in bulk or verify CDN migration completeness.

PRACTICAL EXAMPLES

What Input Types This Tool Handles

The extractor finds URLs regardless of how they appear in surrounding text.

Plain Text with Embedded URLs

Visit our site at https://www.quicktexttools.in for tools. Documentation lives at https://docs.example.com/api/v2. Legacy endpoint: http://old.example.com/endpoint?token=abc&format=json

✅ All three URLs are extracted correctly, including query string parameters.

HTML Source Code

<a href="https://example.com/page">Link</a>
<img src="https://cdn.example.com/image.png" />
<script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>

✅ Extracts URLs from href, src, and any other attribute — no HTML parsing required.
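
If you reproduce this in a script, the choice of character class matters for HTML. The sketch below compares a whitespace-only pattern with one that also stops at quotes and angle brackets. Both patterns are assumptions for illustration, not the tool's actual regex.

```python
import re

html = (
    '<a href="https://example.com/page">Link</a>\n'
    '<img src="https://cdn.example.com/image.png" />\n'
    '<script src="https://cdn.jsdelivr.net/npm/axios/dist/axios.min.js"></script>'
)

# Naive: runs until whitespace, so it swallows the closing quote and tag.
naive = re.findall(r"https?://[^\s]+", html)

# Safer for markup: also stop at quotes and angle brackets.
html_urls = re.findall(r"https?://[^\s\"'<>]+", html)

print(naive[0])
print(html_urls)
```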

Server Log File Excerpt

2025-06-01 12:34:01 GET https://api.example.com/v1/users 200 142ms
2025-06-01 12:34:02 POST https://api.example.com/v1/orders 201 89ms
2025-06-01 12:34:03 Redirect -> https://cdn.example.com/assets/logo.svg

💡 Paste log lines directly — the tool extracts only the URLs, ignoring timestamps and status codes.

Common Input That Returns No Results

❌ Bare domains without protocol prefix

example.com, www.google.com, docs.github.com

❌ Relative URLs (no domain)

/api/v1/users, ../images/logo.png, #section-2

⚠️ Only absolute URLs with http:// or https:// are extracted. Bare domains and relative paths are excluded to prevent false positives.

TROUBLESHOOTING

Fixing Common URL Extraction Issues

Most issues have simple explanations. Here is how to resolve them quickly.

No URLs found despite text clearly containing links

Cause: The URLs in your text may be bare domains (e.g. example.com) without http:// or https:// prefixes, or they may be relative paths like /page/about.

Fix: Ensure your source text contains full absolute URLs with protocol prefixes. If you are working with HTML that uses relative paths, you may need to prepend the base domain manually.
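
Prepending the base can also be scripted. A minimal sketch using Python's standard `urllib.parse.urljoin`, where the base URL is a hypothetical value standing in for the page the relative paths came from:

```python
from urllib.parse import urljoin

# Hypothetical base: the address of the page the relative paths came from.
BASE_URL = "https://example.com/docs/"

relative_paths = ["/api/v1/users", "../images/logo.png", "#section-2"]

# urljoin resolves each path against the base the way a browser would.
absolute_urls = [urljoin(BASE_URL, path) for path in relative_paths]

for url in absolute_urls:
    print(url)
```

The resolved list can then be pasted into the tool or processed further like any other list of absolute URLs.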

URLs are being cut off or truncated in the output

Cause: The regex conservatively treats punctuation such as ), ], ., or ' as the end of a URL. When one of those characters is actually part of the URL, for example a closing parenthesis in a path, the extracted URL is cut short.

Fix: Review the extracted URLs and manually append any missing trailing characters if needed. Most URLs extracted from clean HTML or plain text will be complete.
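
In your own scripts, the opposite approach is also possible: match greedily, then trim trailing punctuation while keeping a closing parenthesis that has a matching opener. The helper `trim_url` and its punctuation set are assumptions made for this sketch, not the tool's behavior.

```python
# Characters treated as sentence punctuation when they end a URL.
TRAILING = ".,;:!?'\"]"


def trim_url(url: str) -> str:
    """Strip trailing punctuation, keeping balanced closing parentheses."""
    while url:
        last = url[-1]
        if last in TRAILING:
            url = url[:-1]
        elif last == ")" and url.count("(") < url.count(")"):
            # Unmatched ')' most likely belongs to the surrounding sentence.
            url = url[:-1]
        else:
            break
    return url


print(trim_url("https://example.com/page)."))
print(trim_url("https://en.wikipedia.org/wiki/URL_(disambiguation)"))
```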

Duplicate URLs appearing in output even with Remove Duplicates enabled

Cause: Remove Duplicates performs an exact string match. If two URLs are identical except for a trailing slash (https://example.com vs https://example.com/) they are treated as different URLs.

Fix: Both variants are technically distinct URLs. If you need to normalize them, copy the output and use a text manipulation tool to strip trailing slashes before deduplication.
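
If you would rather treat the two variants as one, the normalization can be done in a few lines of Python. A sketch, assuming that ignoring trailing slashes is acceptable for your list (some servers do treat the two forms differently); the function name is hypothetical.

```python
def dedupe_ignoring_trailing_slash(urls):
    """Keep the first occurrence of each URL, comparing without trailing '/'."""
    seen = set()
    result = []
    for url in urls:
        key = url.rstrip("/")  # comparison key only; the output keeps the original
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result


urls = ["https://example.com", "https://example.com/", "https://example.com/a/"]
print(dedupe_ignoring_trailing_slash(urls))
```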

Tool extracts URLs from inside JavaScript or CSS code

Cause: The tool extracts ALL URLs from the text, including those in script tags, stylesheets, or inline styles. This is by design — the tool is format-agnostic.

Fix: After extraction, filter the output list to the domain or path patterns you care about. The Sort Alphabetically option helps visually group URLs by domain for faster manual filtering.
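
For longer lists, filtering by host can be scripted instead of done by eye. A minimal sketch with Python's standard `urllib.parse.urlparse`; the function name `filter_by_host` and the sample URLs are assumptions for illustration.

```python
from urllib.parse import urlparse


def filter_by_host(urls, allowed_hosts):
    """Keep only URLs whose hostname exactly matches one of allowed_hosts."""
    allowed = {host.lower() for host in allowed_hosts}
    # urlparse(...).hostname is lowercased, or None if no host is present.
    return [u for u in urls if (urlparse(u).hostname or "") in allowed]


urls = [
    "https://cdn.example.com/a.js",
    "https://example.com/page",
    "http://other.test/x",
]
print(filter_by_host(urls, ["example.com"]))
```

The match is exact, so subdomains such as cdn.example.com must be listed explicitly if you want them kept.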

Browser freezes when processing very large HTML files

Cause: Extremely large inputs (multi-megabyte HTML files with thousands of URLs) can cause the browser tab to become temporarily unresponsive while the regex processes the full string.

Fix: Wait for processing to complete — most modern browsers recover within a few seconds. For extremely large inputs, consider splitting the file into sections before pasting.
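
For files too large to paste comfortably, a script that works line by line avoids holding one huge string in a single regex pass. A sketch in Python, assuming URLs never span a line break; the pattern and the name `iter_urls` are illustrative, not the tool's implementation.

```python
import re
from typing import Iterable, Iterator

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def iter_urls(lines: Iterable[str]) -> Iterator[str]:
    """Yield URLs one at a time, scanning the input line by line."""
    for line in lines:
        for match in URL_PATTERN.finditer(line):
            yield match.group(0)


log_lines = [
    "2025-06-01 12:34:01 GET https://api.example.com/v1/users 200 142ms",
    "2025-06-01 12:34:02 POST https://api.example.com/v1/orders 201 89ms",
    "2025-06-01 12:34:03 Redirect -> https://cdn.example.com/assets/logo.svg",
]
found = list(iter_urls(log_lines))
print("\n".join(found))

# With a real file, pass the open file object, which yields lines lazily:
# with open("access.log", encoding="utf-8") as f:
#     for url in iter_urls(f):
#         print(url)
```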

Learn More — Guides & Related Resources

How to Extract URLs from Text — Complete Guide

Python, JavaScript, and no-code walkthroughs for every common URL extraction scenario.

Remove Duplicates Tool

Paste your extracted URL list and remove any remaining duplicates in one click.

Sort Lines Tool

Sort your extracted URL list alphabetically to group by domain or identify patterns.

Frequently Asked Questions

Everything you need to know about extracting URLs from text

Do I need to sign up or log in to use this tool?

No. All QuickTextTools are completely free to use online with no login, signup, or account required.

What types of URLs does this tool extract?

The tool extracts all URLs beginning with http:// or https://. This includes plain URLs, URLs embedded in HTML href attributes, markdown links, JSON values, log lines, and raw text.

Does it extract URLs without http/https (bare domains)?

No. The tool currently extracts only absolute URLs with http:// or https:// prefixes, because bare domains like example.com are ambiguous and would produce false positives in regular text. Most real-world use cases involve full URLs.

Is my text sent to any server?

Absolutely not. All URL extraction happens entirely in your browser using JavaScript regex. Your text never leaves your device and is never sent to any server.

Can I extract URLs from HTML source code?

Yes. Paste the full HTML source and the tool will find all URLs in href attributes, src attributes, data attributes, inline styles, and plain text content simultaneously.

What does the Remove Duplicates option do?

When enabled, if the same URL appears multiple times in your text only one copy is kept in the output. This is useful when scraping pages where the same link appears in navigation, body, and footer.

How does the Sort Alphabetically option work?

Extracted URLs are sorted A–Z by their full URL string. This makes it easier to visually group URLs by domain or identify patterns in large URL lists.
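
One consequence of sorting by the full string: in a plain code-point sort, every http:// URL lands ahead of every https:// URL, because ':' sorts before 's'. If you want URLs grouped strictly by host regardless of protocol, a short Python sketch can re-sort the exported list. The function name `sort_by_host` is an assumption for illustration.

```python
from urllib.parse import urlparse


def sort_by_host(urls):
    """Sort by hostname first, then by the full URL as a tie-breaker."""
    return sorted(urls, key=lambda u: (urlparse(u).hostname or "", u))


urls = [
    "https://b.example.com/x",
    "http://b.example.com/a",
    "https://a.example.com/z",
]
plain_sorted = sorted(urls)       # full-string order: http:// entries first
host_sorted = sort_by_host(urls)  # grouped by host
print(plain_sorted)
print(host_sorted)
```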

What is the maximum text size I can process?

Since processing happens in your browser, practical limits depend on your device's RAM. On modern hardware the tool handles multi-megabyte text such as HTML pages, large log files, and concatenated documents, although very large inputs can make the browser tab unresponsive for a few seconds while the regex runs.