FORMATFORGE // KNOWLEDGE_BASE

URL Encoding, Base64, and Web-Safe Strings Explained

Updated: April 2026

Quick Answer

URL encoding (percent encoding) converts unsafe characters into a format that can be transmitted in URLs. Spaces become %20, special characters become percent-encoded hex values, and the URL remains valid across all systems. Understanding encoding rules prevents broken links, double-encoding bugs, and security issues in web applications.

Why URLs Need Encoding

URLs can only contain a limited set of characters defined by RFC 3986. Characters outside this set, including spaces, accented letters, and most punctuation, must be percent-encoded before they can appear in a URL. Without encoding, browsers, servers, and proxies may interpret these characters as structural delimiters rather than data.

For example, a space in a URL path could be misinterpreted as the end of the URL. An ampersand in a query value could be misinterpreted as a parameter separator. Percent encoding ensures that every character is unambiguous.

RFC 3986: The URL Standard

RFC 3986 defines which characters are allowed in URLs without encoding:

| Category | Characters | Notes |
| --- | --- | --- |
| Unreserved (always safe) | A-Z a-z 0-9 - _ . ~ | Never need encoding |
| Reserved (structural meaning) | : / ? # [ ] @ ! $ & ' ( ) * + , ; = | Encode only when used as data, not as delimiters |
| Everything else | Spaces, accented characters, CJK, emoji | Must always be percent-encoded |
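The three categories in the table can be expressed as a small lookup. This is only an illustrative sketch: the `classify` helper and its return labels are names invented here, not part of any library.

```python
import string

# Character sets as listed in RFC 3986 (unreserved and reserved).
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")
RESERVED = set(":/?#[]@!$&'()*+,;=")


def classify(ch: str) -> str:
    """Return the category of a single character (hypothetical helper)."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    return "must-encode"


for ch in ["a", "~", "&", " ", "é"]:
    print(repr(ch), classify(ch))
```

Note that a literal % falls into the last group as well, since it is the escape character itself.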

How Percent Encoding Works

Each byte of the character's UTF-8 representation is written as a percent sign followed by two hex digits. A space (byte 0x20) becomes %20. The letter "é" (UTF-8 bytes 0xC3 0xA9) becomes %C3%A9. Emoji and CJK characters produce longer sequences because they use 3 or 4 UTF-8 bytes.

// Encoding examples
Space       → %20
é (U+00E9)  → %C3%A9
€ (U+20AC)  → %E2%82%AC
日 (U+65E5) → %E6%97%A5
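The byte-by-byte rule described above can be sketched in a few lines of Python. The `percent_encode` function is a hypothetical helper written for illustration; in real code, `urllib.parse.quote` does the same job.

```python
from urllib.parse import quote

UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # safe ASCII byte, keep as-is
        else:
            out.append(f"%{byte:02X}")  # percent sign plus two hex digits
    return "".join(out)


for sample in [" ", "é", "€", "日"]:
    # The hand-written version should agree with the standard library.
    print(repr(sample), percent_encode(sample), quote(sample, safe=""))
```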

Spaces in URLs: %20 vs +

One of the most common sources of confusion is how spaces are encoded. The answer depends on where in the URL the space appears:

| Context | Space encoding | Standard |
| --- | --- | --- |
| URL path | %20 | RFC 3986 |
| Query string (form data) | + | application/x-www-form-urlencoded (HTML forms) |
| Query string (general) | %20 | RFC 3986 |

The + convention comes from HTML form encoding, not from the URL standard itself. When building URLs programmatically, %20 is the safer choice because it works everywhere. Use the URL Encoder / Decoder to see how your specific input gets encoded.
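As a rough illustration of the two conventions, Python's standard library exposes both. The parameter name `q` below is just an example value.

```python
from urllib.parse import quote, unquote, unquote_plus, urlencode

params = {"q": "hello world"}

# Form-style encoding: urlencode uses quote_plus by default.
form_style = urlencode(params)
# RFC 3986-style encoding: ask urlencode to use quote instead.
rfc_style = urlencode(params, quote_via=quote)

print(form_style)  # q=hello+world
print(rfc_style)   # q=hello%20world

# A "+" only means "space" when decoded with form rules.
print(unquote("a+b"))          # a+b
print(unquote_plus("a+b"))     # a b
print(unquote_plus("a%2Bb"))   # a+b
```

The last line shows why a literal plus sign in form data has to be sent as %2B.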

JavaScript Encoding Functions

JavaScript provides three different encoding functions, each with a different scope:

| Function | Encodes | Does NOT encode | Use for |
| --- | --- | --- | --- |
| encodeURIComponent() | Everything except the characters in the next column | A-Z a-z 0-9 - _ . ~ ! ' ( ) * | Query parameter values, path segments |
| encodeURI() | Spaces, non-ASCII characters, and a few unsafe ASCII characters such as " < > | Unreserved characters and reserved characters like : / ? # & = | Encoding a complete URL while preserving structure |
| escape() (deprecated) | Most non-alphanumeric | Various exceptions | Do not use. Deprecated and inconsistent. |

// encodeURIComponent - use for parameter values
const query = encodeURIComponent("hello world & goodbye");
// "hello%20world%20%26%20goodbye"

// encodeURI - use for complete URLs
const url = encodeURI("https://example.com/path with spaces/page");
// "https://example.com/path%20with%20spaces/page"

// Common mistake: using encodeURI for a parameter value
// This does NOT encode & and = which breaks the query string
encodeURI("key=value&other");  // WRONG for parameter values
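To see why leaving & and = unencoded breaks a parameter value, here is a small Python sketch of the same mistake. Passing `safe="&="` to `quote` is used here only to mimic an encoder that leaves those delimiters alone; the value `rock & roll=fun` is an arbitrary example.

```python
from urllib.parse import parse_qs, quote

value = "rock & roll=fun"

# Mistake: delimiters inside the value are left unencoded.
broken = "q=" + quote(value, safe="&=")
# Correct: every reserved character in the value is encoded.
fixed = "q=" + quote(value, safe="")

print(broken)            # q=rock%20&%20roll=fun
print(parse_qs(broken))  # {'q': ['rock '], ' roll': ['fun']}

print(fixed)             # q=rock%20%26%20roll%3Dfun
print(parse_qs(fixed))   # {'q': ['rock & roll=fun']}
```

In the broken version the parser sees two parameters instead of one, which is the same failure the encodeURI example above warns about.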

Python Encoding

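from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Encode a path (quote leaves "/" unencoded by default)
quote("hello world")           # "hello%20world"

# Encode a single path segment that may contain "/"
quote("a/b c", safe="")        # "a%2Fb%20c"

# Encode a query parameter (+ for spaces)
quote_plus("hello world")      # "hello+world"

# Decode
unquote("hello%20world")       # "hello world"
unquote_plus("hello+world")    # "hello world"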
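Putting these functions together, a complete URL can be assembled from raw parts so that each value is encoded exactly once. The `build_url` helper, the example host, and the sample values are all assumptions made for this sketch.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit, urlunsplit


def build_url(base: str, segments: list, params: dict) -> str:
    """Join raw path segments and raw query values onto a base URL."""
    scheme, netloc, path, _query, _fragment = urlsplit(base)
    # safe="" so that a "/" inside a segment is encoded as %2F.
    encoded_segments = "/".join(quote(s, safe="") for s in segments)
    full_path = path.rstrip("/") + "/" + encoded_segments
    # quote_via=quote gives %20 for spaces instead of "+".
    query = urlencode(params, quote_via=quote)
    return urlunsplit((scheme, netloc, full_path, query, ""))


url = build_url(
    "https://example.com/api",
    ["reports", "Q1/Q2 summary"],
    {"tag": "a&b", "note": "50% off"},
)
print(url)
# Decoding the query returns the original raw values.
print(parse_qs(urlsplit(url).query))
```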

Double Encoding: The Most Common URL Bug

Double encoding happens when an already-encoded string gets encoded again. The % character itself gets encoded to %25, turning %20 into %2520. This is one of the most common and hardest-to-debug URL bugs.

How It Happens

// Step 1: Original value
"hello world"

// Step 2: First encoding (correct)
"hello%20world"

// Step 3: Second encoding (bug!)
"hello%2520world"

// The server receives %2520 and decodes once to %20
// The actual value becomes "hello%20world" instead of "hello world"
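The same three steps can be reproduced with Python's `urllib.parse`, as a quick way to see the bug in action:

```python
from urllib.parse import quote, unquote

original = "hello world"
once = quote(original)   # correct: encoded one time
twice = quote(once)      # bug: the "%" itself is encoded to %25

print(once)              # hello%20world
print(twice)             # hello%2520world

# A server decodes once and ends up with the encoded form as its value.
received = unquote(twice)
print(received)          # hello%20world
```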

How to Detect Double Encoding
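One possible heuristic is to look for %25 followed by two hex digits (for example %2520), or to count how many decode passes are needed before the string stops changing. Both helpers below are hypothetical names for this sketch, and both can give false positives when the raw data really does contain a percent sign followed by hex digits.

```python
import re
from urllib.parse import unquote

# "%25" followed by two hex digits, such as %2520 or %253D.
DOUBLE_ENCODED = re.compile(r"%25[0-9A-Fa-f]{2}")


def looks_double_encoded(value: str) -> bool:
    """Heuristic: does the string contain an encoded percent-escape?"""
    return bool(DOUBLE_ENCODED.search(value))


def decode_levels(value: str, limit: int = 5) -> int:
    """Count decode passes until the value stops changing (capped at limit)."""
    levels = 0
    while levels < limit:
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
        levels += 1
    return levels


for candidate in ["hello world", "hello%20world", "hello%2520world"]:
    print(candidate, looks_double_encoded(candidate), decode_levels(candidate))
```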

How to Prevent Double Encoding

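Encode exactly once, at the point where the URL is constructed. Never encode a value that might already be encoded without checking first. If you are unsure whether input is already encoded, decode it first and then re-encode. Be aware that this misreads raw input that legitimately contains a literal percent sequence, so where possible track whether each value is raw or encoded as it moves through your code.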
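A minimal sketch of the decode-then-re-encode approach, assuming the input is a single component (not a whole URL). The name `normalize_component` is made up for this example.

```python
from urllib.parse import quote, unquote


def normalize_component(value: str) -> str:
    """Decode once, then encode once, so raw and encoded input converge."""
    return quote(unquote(value), safe="")


print(normalize_component("hello world"))    # hello%20world
print(normalize_component("hello%20world"))  # hello%20world

# Caveat: raw text that happens to look encoded is misread.
# The raw string "100%25" should encode to "100%2525", but the
# normalizer treats it as already encoded and returns "100%25".
print(quote("100%25", safe=""))              # 100%2525
print(normalize_component("100%25"))         # 100%25
```

Because of that ambiguity, this works best as a cleanup step for input of unknown origin rather than a substitute for encoding once at construction time.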

URL Slugs vs URL Encoding

URL slugs and URL encoding solve different problems. Encoding makes any string safe for URLs by converting unsafe characters to percent sequences. Slugs transform human-readable titles into clean, permanent URL paths using only lowercase letters, digits, and hyphens.

| Approach | Input | Output | Use case |
| --- | --- | --- | --- |
| URL encoding | "Héllo Wörld!" | "H%C3%A9llo%20W%C3%B6rld!" | Query parameters, API values |
| Slug generation | "Héllo Wörld!" | "hello-world" | Page URLs, blog posts, product pages |

Use the URL Slug Generator for clean page URLs. Use the URL Encoder / Decoder for encoding arbitrary values in query strings and API parameters. See the SEO-friendly URL slugs guide for slug best practices.
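To make the contrast concrete, here is one simple way a slug could be produced in Python, next to plain percent encoding of the same title. The `slugify` function is a hypothetical sketch, not the algorithm used by any particular tool; it drops characters it cannot map to ASCII, so titles written entirely in CJK or emoji would come out empty.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify(title: str) -> str:
    """Lowercase ASCII slug: letters, digits, and single hyphens only."""
    # Split accented letters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of other characters with one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


title = "Héllo Wörld!"
print(slugify(title))          # hello-world
print(quote(title, safe="!"))  # H%C3%A9llo%20W%C3%B6rld!
```

Encoding is reversible and keeps the original characters; the slug is lossy by design.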

Base64 Encoding

Base64 is a different encoding scheme that converts binary data into ASCII text. Unlike URL encoding, Base64 is not designed for URLs specifically. It is a general-purpose binary-to-text encoding used for embedding data in text-based formats.

Base64 Variants

| Variant | Alphabet | Padding | Use case |
| --- | --- | --- | --- |
| Standard (RFC 4648) | A-Z a-z 0-9 + / | = | Email attachments (MIME), data storage |
| URL-safe (RFC 4648 §5) | A-Z a-z 0-9 - _ | Optional | URLs, filenames, JWT tokens |
| MIME | Standard + line breaks every 76 chars | = | Email encoding |

Standard Base64 uses + and /, which have special meaning in URLs, and pads its output with =, which is also reserved. URL-safe Base64 replaces + and / with - and _, and the padding may be omitted. JWT tokens use URL-safe Base64 without padding. The Base64 Encoder / Decoder handles both variants.
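The difference between the variants is easy to see with Python's `base64` module. The two input bytes are chosen only because they happen to produce both + and / in the standard alphabet; `b64url_decode` is a hypothetical helper that restores stripped padding before decoding.

```python
import base64

raw = b"\xfb\xff"

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
unpadded = url_safe.rstrip("=")  # the form used in JWTs

print(standard)  # +/8=
print(url_safe)  # -_8=
print(unpadded)  # -_8


def b64url_decode(text: str) -> bytes:
    """Decode URL-safe Base64, adding back any padding that was removed."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


print(b64url_decode(unpadded) == raw)  # True
```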

Data URIs

Base64 enables embedding binary data directly in HTML and CSS using data URIs:

<!-- Embedding a small image in HTML -->
<img src="data:image/png;base64,iVBORw0KGgo..." />

/* Embedding a font in CSS ("ExampleFont" is a placeholder name) */
@font-face {
  font-family: "ExampleFont";
  src: url("data:font/woff2;base64,d09GMg...");
}

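Data URIs are useful for small resources (icons, small images) where eliminating an HTTP request improves performance. For larger resources, separate files with proper caching are more efficient: Base64 output is about one third larger than the original bytes, and inlined data cannot be cached independently of the page or stylesheet that contains it.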
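As a small illustration, a data URI can be built from raw bytes in a couple of lines. The `to_data_uri` helper is a made-up name, and the input here is just the 8-byte PNG file signature rather than a complete image.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a Base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_uri(png_signature, "image/png"))
# data:image/png;base64,iVBORw0KGgo=

# Size cost: every 3 input bytes become 4 output characters.
payload = bytes(3000)
print(len(payload), len(base64.b64encode(payload)))  # 3000 4000
```

This is why every Base64-encoded PNG begins with the same iVBORw0KGgo prefix.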

JWT Tokens: Base64URL in Practice

JSON Web Tokens (JWT) are a practical example of URL-safe Base64 encoding. A JWT consists of three Base64URL-encoded segments separated by dots:

// JWT structure
header.payload.signature

// Each part is Base64URL encoded (no padding)
eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0In0.signature

The header specifies the signing algorithm. The payload contains claims (user data, expiration time, issuer). The signature ensures integrity. Use the JWT Decoder to inspect tokens without needing to write code.

Important: decoding a JWT does not verify it. The payload is only Base64-encoded, not encrypted. Anyone can read it. Verification requires checking the signature against the secret or public key.
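The gap between decoding and verifying can be sketched with the standard library alone. This is a simplified illustration for HS256 only; the function names and the `demo-secret` key are assumptions, and a production verifier would also need to check the header's algorithm and claims such as expiration.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_unverified(token: str):
    """Read header and payload. Anyone can do this; nothing is checked."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the HMAC-SHA256 signature and compare in constant time."""
    signing_input, _, signature_b64 = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url_encode(expected), signature_b64)


token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0In0.signature"
print(decode_unverified(token))             # ({'alg': 'HS256'}, {'sub': '1234'})
print(verify_hs256(token, b"demo-secret"))  # False: placeholder signature
```

The payload reads fine even though the signature is a placeholder, which is exactly why decoded claims must not be trusted until verification succeeds.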

Common Encoding Mistakes

FAQ

Should I use encodeURI or encodeURIComponent?

Use encodeURIComponent for encoding individual values (query parameters, path segments). Use encodeURI only when you need to encode a complete URL while preserving its structure. In most cases, encodeURIComponent is what you want.

Is Base64 encoding the same as encryption?

No. Base64 is a reversible encoding, not encryption. Anyone can decode Base64 data. It provides no security, only format conversion. Never use Base64 to protect sensitive data.

Why do some URLs use + for spaces and others use %20?

The + convention comes from HTML form encoding (application/x-www-form-urlencoded). The %20 convention comes from RFC 3986 (the URL standard). Both are commonly accepted in query strings, but only %20 represents a space in a URL path, where + is treated as a literal plus sign.

How do I know if a URL is double-encoded?

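Look for %25 followed by two hex digits. If you see %2520 (which decodes to %20), the URL was most likely encoded twice. A lone %25 is not proof by itself, because it is also the correct encoding of a literal percent sign. Decode once and check whether the result is a valid encoded URL.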

Can I put emoji in URLs?

Yes, but they must be percent-encoded for reliable transmission. Modern browsers display the decoded form in the address bar, but the actual HTTP request uses the encoded form. A single emoji can produce 12+ characters of percent encoding.

What is the difference between URL encoding and slug generation?

URL encoding preserves the original characters by converting them to percent sequences. Slug generation transforms text into a clean, readable format using only safe characters. They solve different problems and are used in different parts of a URL.
