This page helps you convert between plain text and HTML entity references:
- Encode: converts reserved characters (<, >, &, quotes) into entity references so the browser shows them as text instead of treating them as markup.
- Decode: converts entity references (named like &amp; and numeric like &#169;) back into the characters they represent.

Use it when you need to safely display user-generated text in HTML, debug template output, or convert copied HTML source into readable characters. For security: encoding is one part of preventing cross-site scripting (XSS), but you must also apply the correct escaping for the context (HTML text vs attribute vs JavaScript vs URL).
Tip: If you paste already-encoded strings (for example &lt;) and click Encode again, you may “double-encode” the ampersand (becoming &amp;lt;). That is expected; see Limitations & assumptions.
HTML uses certain characters to define structure:
- < starts a tag (e.g., <div>)
- & starts an entity reference (e.g., &amp;)
- " and ' often delimit attribute values

If those characters appear in content that is meant to be displayed literally (for example, showing a code snippet or displaying a user’s message), they can break the document structure or introduce XSS risk. Entity encoding replaces those characters with safe sequences that the browser interprets as literal text.
HTML supports two main forms of entity references:
- Named: &copy; → ©
- Numeric: &#169; or &#xA9; → ©

Numeric references map directly to Unicode code points. Conceptually, decoding a numeric entity converts an integer code point into a character.
A numeric entity in decimal form uses a code point $n$:
$$\texttt{\&\#}n\texttt{;}$$
A numeric entity in hexadecimal form uses the same code point written in base 16:
$$\texttt{\&\#x}h\texttt{;}$$
Where $h$ is the hexadecimal representation of $n$. (For example, 169 decimal equals A9 hexadecimal, so &#169; and &#xA9; both denote ©.)
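The decimal and hexadecimal forms described above can be sketched in a few lines of Python. The helper names (`to_decimal_ref`, `to_hex_ref`, `from_numeric_ref`) are hypothetical and are not part of this page's tool; the decoder assumes a single well-formed reference and does no validation.

```python
def to_decimal_ref(ch: str) -> str:
    """Return the decimal numeric reference for a single character."""
    return f"&#{ord(ch)};"


def to_hex_ref(ch: str) -> str:
    """Return the hexadecimal numeric reference for a single character."""
    return f"&#x{ord(ch):X};"


def from_numeric_ref(ref: str) -> str:
    """Decode one well-formed numeric reference such as '&#169;' or '&#xA9;'."""
    body = ref[2:-1]  # strip the leading '&#' and the trailing ';'
    if body[:1] in ("x", "X"):
        return chr(int(body[1:], 16))  # hexadecimal form
    return chr(int(body, 10))          # decimal form


print(to_decimal_ref("\u00a9"))       # &#169;
print(to_hex_ref("\u00a9"))           # &#xA9;
print(from_numeric_ref("&#x1F600;"))  # prints the U+1F600 emoji
```

Both forms carry the same integer, so the choice between them is purely about readability.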
| Type | Example | Pros | Cons | Typical use |
|---|---|---|---|---|
| Named entity | &amp; → & | Readable, common for core reserved characters | Not every Unicode character has a named entity | Escaping HTML-reserved characters |
| Numeric (decimal) | &#169; → © | Works for any Unicode code point | Harder to read | Interchange where named entity may be unknown |
| Numeric (hex) | &#x1F600; → 😀 | Compact for some ranges; common in dev tools | Harder to read than named entities | Debugging, code snippets, documentation |
Input:
<script>alert("XSS")</script> & friends
Encoded output (what you can safely display as text in an HTML page):
&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt; &amp; friends
Interpretation: The browser will render the literal characters <script>... as text instead of executing them as markup.
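To reproduce this kind of encoding outside the page, Python's standard `html.escape` is one possible stand-in. This is a sketch under the assumption that the tool escapes the same five characters; the tool's actual implementation is not shown here and may differ (for instance, `html.escape` writes the apostrophe as &#x27;).

```python
import html

raw = '<script>alert("XSS")</script> & friends'

# quote=True also escapes " and ', which matters inside attribute values.
encoded = html.escape(raw, quote=True)

print(encoded)
# &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt; &amp; friends
```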
Input:
Tom &amp; Jerry &copy; 1990
Decoded output:
Tom & Jerry © 1990
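Decoding can be sketched with Python's standard `html.unescape`, which accepts named, decimal, and hexadecimal references. The input string below is an illustrative assumption chosen to decode to the output shown above.

```python
import html

encoded = "Tom &amp; Jerry &copy; 1990"

decoded = html.unescape(encoded)
print(decoded)  # Tom & Jerry © 1990

# Named, decimal, and hexadecimal references all decode to the same character.
same = html.unescape("&copy;&#169;&#xA9;")
print(same)  # ©©©
```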
Input:
&lt;div&gt;
Encode output:
&amp;lt;div&amp;gt;
This is expected: the ampersand in &lt; is itself a reserved character, so it is encoded again.
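The double-encoding behavior is easy to observe with Python's `html` module, used here only as an illustrative stand-in for the page's Encode and Decode buttons. Each decode pass removes exactly one layer of encoding.

```python
import html

once = html.escape("<div>")   # first encode pass
twice = html.escape(once)     # second pass re-encodes each '&'

print(once)   # &lt;div&gt;
print(twice)  # &amp;lt;div&amp;gt;

# Decoding peels off one layer per pass.
decoded_once = html.unescape(twice)
decoded_twice = html.unescape(decoded_once)
print(decoded_once)   # &lt;div&gt;
print(decoded_twice)  # <div>
```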
- Encoding: the output replaces &, <, >, ", and ' with entity references. (Some contexts require additional escaping.)
- Decoding: if the output still shows &lt;, it may be because the input was double-encoded (e.g., &amp;lt;).

For the core reserved characters (&, <, >, quotes), named entities are conventional and readable. Numeric entities are useful when a character doesn’t have a named entity or when you prefer a direct code-point form.
Does decoding turn &lt; into <? Yes. Decoding converts valid entity references into their literal characters. If you have &amp;lt;, decoding once yields &lt;; decoding twice yields <.
Entity encoding helps in the HTML text-node context, but XSS prevention depends on context. For example, values placed inside JavaScript, CSS, or URLs require different escaping/encoding rules. Always use your framework’s recommended output-encoding functions for the exact context.
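As a rough illustration of why context matters, the sketch below escapes the same value for two contexts using only the Python standard library. The `/search?q=` path is a made-up example, and this is not a substitute for a framework's own output-encoding functions; JavaScript and CSS contexts need their own rules and are deliberately left out.

```python
import html
from urllib.parse import quote

user_input = 'a&b "c" <d>'

# HTML text or quoted-attribute context: entity-encode the reserved characters.
as_html = html.escape(user_input, quote=True)

# URL query-value context: percent-encode the value first, then entity-encode
# the finished URL because it will sit inside an HTML attribute.
as_url_value = quote(user_input, safe="")
href = html.escape(f"/search?q={as_url_value}", quote=True)

print(as_html)       # a&amp;b &quot;c&quot; &lt;d&gt;
print(as_url_value)  # a%26b%20%22c%22%20%3Cd%3E
print(href)          # /search?q=a%26b%20%22c%22%20%3Cd%3E
```

The two outputs differ for the same input, which is why a single encoder cannot cover every context.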
Not every character is encoded. Encoding here focuses on the characters that must be escaped for HTML structure. Most Unicode characters can be left as-is in UTF-8 pages. If you need “encode everything to numeric entities,” you’d use a different mode/tool.
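If an "encode everything to numeric entities" mode is needed, one possible approach in Python is the built-in `xmlcharrefreplace` error handler. The function name `encode_all_non_ascii` is hypothetical, and this is a sketch of a separate technique, not a feature of this page's tool.

```python
import html


def encode_all_non_ascii(text: str) -> str:
    """Escape reserved characters, then turn non-ASCII into decimal references."""
    # Escape first: doing it afterwards would re-encode the '&' in '&#233;'.
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


# "\u00e9" is e-acute and "\u00a9" is the copyright sign.
print(encode_all_non_ascii("Caf\u00e9 \u00a9 <b>"))
# Caf&#233; &#169; &lt;b&gt;
```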
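- Encoding targets only the core reserved characters (&, <, >, ", '). It does not attempt to convert all non-ASCII characters into numeric entities.
- Encoding text that is already encoded double-encodes it (for example, &lt; → &amp;lt;).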