Quick Answer
URL encoding (percent encoding) converts unsafe characters into a format that can be transmitted in URLs. Spaces become %20, special characters become percent-encoded hex values, and the URL remains valid across all systems. Understanding encoding rules prevents broken links, double-encoding bugs, and security issues in web applications.
Why URLs Need Encoding
URLs can only contain a limited set of characters defined by RFC 3986. Characters outside this set, including spaces, accented letters, and punctuation such as quotes and angle brackets, must be percent-encoded before they can appear in a URL. Reserved characters that appear as data must be encoded too; otherwise browsers, servers, and proxies may interpret them as structural delimiters rather than data.
For example, a space in a URL path could be misinterpreted as the end of the URL. An ampersand in a query value could be misinterpreted as a parameter separator. Percent encoding ensures that every character is unambiguous.
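The ampersand case can be reproduced with Python's standard library. This is a minimal sketch; the parameter name title and the sample value are made up for illustration.

```python
from urllib.parse import parse_qs, quote

value = "Tom & Jerry"

# Unencoded: the ampersand inside the value is read as a parameter separator,
# so everything after it is cut off from the "title" parameter.
broken = parse_qs("title=" + value)

# Encoded: the ampersand travels as %26 and is decoded back into data.
fixed = parse_qs("title=" + quote(value, safe=""))

print(broken)  # {'title': ['Tom ']}
print(fixed)   # {'title': ['Tom & Jerry']}
```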
RFC 3986: The URL Standard
RFC 3986 defines which characters are allowed in URLs without encoding:
| Category | Characters | Notes |
|---|---|---|
| Unreserved (always safe) | A-Z a-z 0-9 - _ . ~ | Never need encoding |
| Reserved (structural meaning) | : / ? # [ ] @ ! $ & ' ( ) * + , ; = | Encode only when used as data, not as delimiters |
| Everything else | Spaces, accented characters, CJK, emoji | Must always be percent-encoded |
How Percent Encoding Works
Each byte of the character's UTF-8 representation is written as a percent sign followed by two hex digits. A space (byte 0x20) becomes %20. The letter "é" (UTF-8 bytes 0xC3 0xA9) becomes %C3%A9. Emoji and CJK characters produce longer sequences because they use 3 or 4 UTF-8 bytes.
// Encoding examples
Space → %20
é (U+00E9) → %C3%A9
€ (U+20AC) → %E2%82%AC
日 (U+65E5) → %E6%97%A5
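The byte-by-byte rule can be written out by hand. The function below is an illustrative sketch, not a replacement for a library encoder; the name percent_encode is hypothetical, and it assumes the strictest policy of encoding everything except the unreserved set.

```python
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Percent-encode every UTF-8 byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)             # safe ASCII passes through unchanged
        else:
            out.append(f"%{byte:02X}")   # percent sign plus two hex digits
    return "".join(out)

for sample in [" ", "é", "€", "日"]:
    print(repr(sample), "->", percent_encode(sample))
```

In real code, urllib.parse.quote(text, safe="") should give the same results for these inputs.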
Spaces in URLs: %20 vs +
One of the most common sources of confusion is how spaces are encoded. The answer depends on where in the URL the space appears:
| Context | Space encoding | Standard |
|---|---|---|
| URL path | %20 | RFC 3986 |
| Query string (form data) | + | application/x-www-form-urlencoded (HTML forms) |
| Query string (general) | %20 | RFC 3986 |
The + convention comes from HTML form encoding, not from the URL standard itself. When building URLs programmatically, %20 is the safer choice because it works everywhere. Use the URL Encoder / Decoder to see how your specific input gets encoded.
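In Python the two conventions map onto separate functions, which makes the difference easy to see. A short sketch using only urllib.parse; the sample strings are arbitrary.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

text = "red shoes"

path_style = quote(text)        # RFC 3986 style: space -> %20
form_style = quote_plus(text)   # form style: space -> +

# urlencode() builds a form-encoded query string, so it uses + by default.
form_query = urlencode({"q": text})
# Passing quote_via=quote switches the same call to %20.
rfc_query = urlencode({"q": text}, quote_via=quote)

# Decoding must match the context: unquote() leaves + alone,
# while unquote_plus() turns it back into a space.
literal_plus = unquote("1+1")
decoded_space = unquote_plus("red+shoes")

print(path_style, form_style, form_query, rfc_query)
```

Decoding a path with the form-style decoder is a common way to lose literal plus signs, so it helps to pick the decoder by URL component rather than by habit.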
JavaScript Encoding Functions
JavaScript provides three different encoding functions, each with a different scope:
| Function | Encodes | Does NOT encode | Use for |
|---|---|---|---|
| encodeURIComponent() | Everything except A-Z a-z 0-9 - _ . ~ ! ' ( ) * | Unreserved characters, plus ! ' ( ) * | Query parameter values, path segments |
| encodeURI() | Spaces, non-ASCII characters, and unsafe ASCII such as " < > | Reserved characters like : / ? # & | Encoding a complete URL while preserving structure |
| escape() (deprecated) | Most non-alphanumeric | Various exceptions | Do not use. Deprecated and inconsistent. |
// encodeURIComponent - use for parameter values
const query = encodeURIComponent("hello world & goodbye");
// "hello%20world%20%26%20goodbye"
// encodeURI - use for complete URLs
const url = encodeURI("https://example.com/path with spaces/page");
// "https://example.com/path%20with%20spaces/page"
// Common mistake: using encodeURI for a parameter value
// This does NOT encode & and = which breaks the query string
encodeURI("key=value&other"); // WRONG for parameter values
Python Encoding
from urllib.parse import quote, quote_plus, unquote
# Encode a path segment
quote("hello world") # "hello%20world"
# Encode a query parameter (+ for spaces)
quote_plus("hello world") # "hello+world"
# Decode
unquote("hello%20world") # "hello world"
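One detail worth knowing is that quote() leaves "/" unencoded by default, so a slash inside a single path segment needs safe="". The helper below is a hypothetical sketch of assembling a URL from raw parts; build_url, the example host, and the sample values are assumptions made for illustration.

```python
from urllib.parse import quote, urlencode

def build_url(base: str, segments: list[str], params: dict[str, str]) -> str:
    """Join encoded path segments and an encoded query onto a base URL."""
    # safe="" also encodes "/", so a slash inside a segment stays data.
    path = "/".join(quote(segment, safe="") for segment in segments)
    # quote_via=quote makes the query use %20 instead of + for spaces.
    query = urlencode(params, quote_via=quote)
    return f"{base}/{path}?{query}"

url = build_url(
    "https://example.com",
    ["reports", "Q1/Q2 summary"],
    {"author": "Zoë", "tag": "a&b"},
)
print(url)
# https://example.com/reports/Q1%2FQ2%20summary?author=Zo%C3%AB&tag=a%26b
```

Each raw value is encoded exactly once, at the point where the URL is assembled.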
Double Encoding: The Most Common URL Bug
Double encoding happens when an already-encoded string gets encoded again. The % character itself gets encoded to %25, turning %20 into %2520. This is one of the most common and hardest-to-debug URL bugs.
How It Happens
// Step 1: Original value
"hello world"
// Step 2: First encoding (correct)
"hello%20world"
// Step 3: Second encoding (bug!)
"hello%2520world"
// The server receives %2520 and decodes once to %20
// The actual value becomes "hello%20world" instead of "hello world"
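The same sequence can be reproduced in a few lines of Python, which is a convenient way to confirm the bug in a test. A minimal sketch:

```python
from urllib.parse import quote, unquote

original = "hello world"
once = quote(original, safe="")   # correct: "hello%20world"
twice = quote(once, safe="")      # bug: the % itself becomes %25

# A server that decodes once recovers the encoded string, not the original.
received = unquote(twice)

print(once)      # hello%20world
print(twice)     # hello%2520world
print(received)  # hello%20world
```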
How to Detect Double Encoding
- Look for %25 in URLs: this is the percent sign itself encoded, which usually indicates double encoding
- Check server logs for URLs containing %2520 (double-encoded space) or %253D (double-encoded equals)
- If a URL works when you paste it in the browser but fails in your application, double encoding is likely the cause
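The log check can be automated with a small heuristic. This is a sketch under the assumption that %25 followed by two hex digits is suspicious; the function name looks_double_encoded is hypothetical, and the check will also flag a value that legitimately contains the literal text "%20", so treat a match as a hint rather than proof.

```python
import re

# %25 followed by two hex digits is an encoded percent sign that itself
# starts a percent sequence, for example %2520 or %253D.
DOUBLE_ENCODED = re.compile(r"%25[0-9A-Fa-f]{2}")

def looks_double_encoded(url: str) -> bool:
    """Return True if the URL contains a pattern typical of double encoding."""
    return DOUBLE_ENCODED.search(url) is not None

print(looks_double_encoded("https://example.com/?q=hello%2520world"))  # True
print(looks_double_encoded("https://example.com/?q=hello%20world"))    # False
print(looks_double_encoded("https://example.com/?discount=100%25"))    # False
```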
How to Prevent Double Encoding
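Encode exactly once, at the point where the URL is constructed, and keep values in their raw form everywhere else. Never encode a value that might already be encoded without checking first. If you are unsure whether input is already encoded, decode it first and then re-encode. Be aware that this fallback is only safe when a raw value cannot itself contain a literal percent sequence such as %20, because decoding would alter it.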
URL Slugs vs URL Encoding
URL slugs and URL encoding solve different problems. Encoding makes any string safe for URLs by converting unsafe characters to percent sequences. Slugs transform human-readable titles into clean, permanent URL paths using only lowercase letters, digits, and hyphens.
| Approach | Input | Output | Use case |
|---|---|---|---|
| URL encoding | "Héllo Wörld!" | "H%C3%A9llo%20W%C3%B6rld!" | Query parameters, API values |
| Slug generation | "Héllo Wörld!" | "hello-world" | Page URLs, blog posts, product pages |
Use the URL Slug Generator for clean page URLs. Use the URL Encoder / Decoder for encoding arbitrary values in query strings and API parameters. See the SEO-friendly URL slugs guide for slug best practices.
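The contrast in the table can be sketched in Python. The slugify function below is a simplified, hypothetical implementation that assumes Latin-script input; characters with no ASCII decomposition (CJK, emoji) are simply dropped, which a production slug generator would need to handle differently.

```python
import re
import unicodedata
from urllib.parse import quote

def slugify(title: str) -> str:
    """Turn a title into lowercase ASCII words joined by hyphens."""
    # Decompose accented letters (é -> e + combining accent), then drop
    # everything that is not ASCII, which removes the accents.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")

title = "Héllo Wörld!"
print(quote(title, safe="!"))  # H%C3%A9llo%20W%C3%B6rld!
print(slugify(title))          # hello-world
```

Encoding is reversible and keeps every character; slug generation is lossy by design.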
Base64 Encoding
Base64 is a different encoding scheme that converts binary data into ASCII text. Unlike URL encoding, Base64 is not designed for URLs specifically; it is a general-purpose binary-to-text encoding used for embedding data in text-based formats.
Base64 Variants
| Variant | Alphabet | Padding | Use case |
|---|---|---|---|
| Standard (RFC 4648) | A-Z a-z 0-9 + / | = | Email attachments (MIME), data storage |
| URL-safe (RFC 4648 §5) | A-Z a-z 0-9 - _ | Optional | URLs, filenames, JWT tokens |
| MIME | Standard + line breaks every 76 chars | = | Email encoding |
Standard Base64 uses + and / which are not URL-safe. URL-safe Base64 replaces these with - and _. JWT tokens use URL-safe Base64 without padding. The Base64 Encoder / Decoder handles both variants.
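The difference between the two alphabets shows up only for inputs that happen to produce the last two symbols. The sketch below uses three bytes chosen to do exactly that, then shows how padding can be stripped and restored; the byte values are arbitrary test data.

```python
import base64

# These three bytes map onto the last two symbols of the Base64 alphabet.
data = bytes([0xFB, 0xFF, 0xFE])
standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

# Padding can be stripped for use in URLs and restored before decoding.
unpadded = base64.urlsafe_b64encode(b"hi").decode("ascii").rstrip("=")
restored = unpadded + "=" * (-len(unpadded) % 4)
round_trip = base64.urlsafe_b64decode(restored)

print(standard)  # +//+
print(url_safe)  # -__-
print(unpadded)  # aGk
```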
Data URIs
Base64 enables embedding binary data directly in HTML and CSS using data URIs:
<!-- Embedding a small image in HTML -->
<img src="data:image/png;base64,iVBORw0KGgo..." />
/* Embedding a font in CSS */
@font-face {
  src: url("data:font/woff2;base64,d09GMg...");
}
Data URIs are useful for small resources (icons, small images) where eliminating an HTTP request improves performance. For larger resources, separate files with proper caching are more efficient.
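A data URI can be generated from raw bytes in a few lines. This is a minimal sketch; the helper name to_data_uri and the tiny SVG payload are illustrative assumptions. It also makes the size cost visible: Base64 emits 4 characters for every 3 input bytes, so the embedded form is roughly a third larger than the original file.

```python
import base64

def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a Base64 data URI (standard alphabet, with padding)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
uri = to_data_uri(svg, "image/svg+xml")

# Compare the raw size with the size of the Base64 portion.
encoded_part = uri.split(",", 1)[1]
print(len(svg), "bytes ->", len(encoded_part), "Base64 characters")
```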
JWT Tokens: Base64URL in Practice
JSON Web Tokens (JWT) are a practical example of URL-safe Base64 encoding. A JWT consists of three Base64URL-encoded segments separated by dots:
// JWT structure
header.payload.signature
// Each part is Base64URL encoded (no padding)
eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0In0.signature
The header specifies the signing algorithm. The payload contains claims (user data, expiration time, issuer). The signature ensures integrity. Use the JWT Decoder to inspect tokens without needing to write code.
Important: decoding a JWT does not verify it. The payload is only Base64-encoded, not encrypted. Anyone can read it. Verification requires checking the signature against the secret or public key.
Common Encoding Mistakes
- Double encoding: Encoding an already-encoded string, turning %20 into %2520
- Using encodeURI for parameter values: This does not encode & and =, breaking query string structure
- Mixing standard and URL-safe Base64: Using + and / in URLs without URL-safe encoding
- Forgetting to encode non-ASCII characters: Accented characters and emoji in URLs work in modern browsers but must be encoded for API calls and older systems
- Confusing encoding with encryption: Base64 and URL encoding are reversible transformations, not security measures
FAQ
Should I use encodeURI or encodeURIComponent?
Use encodeURIComponent for encoding individual values (query parameters, path segments). Use encodeURI only when you need to encode a complete URL while preserving its structure. In most cases, encodeURIComponent is what you want.
Is Base64 encoding the same as encryption?
No. Base64 is a reversible encoding, not encryption. Anyone can decode Base64 data. It provides no security, only format conversion. Never use Base64 to protect sensitive data.
Why do some URLs use + for spaces and others use %20?
The + convention comes from HTML form encoding (application/x-www-form-urlencoded). The %20 convention comes from RFC 3986 (the URL standard). Both are valid in query strings, but only %20 is correct in URL paths, where a + is a literal plus sign.
How do I know if a URL is double-encoded?
Look for %25 sequences. If you see %2520 (which decodes to %20), the URL was encoded twice. Decode once and check if the result is a valid encoded URL.
Can I put emoji in URLs?
Yes, but they must be percent-encoded for reliable transmission. Modern browsers display the decoded form in the address bar, but the actual HTTP request uses the encoded form. A single emoji can produce 12+ characters of percent encoding.
What is the difference between URL encoding and slug generation?
URL encoding preserves the original characters by converting them to percent sequences. Slug generation transforms text into a clean, readable format using only safe characters. They solve different problems and are used in different parts of a URL.
Related Tools
- URL Encoder / Decoder for percent encoding and decoding
- Base64 Encoder / Decoder for binary-to-text encoding
- URL Slug Generator for clean page URLs
- JWT Decoder for inspecting token contents