Text to Hex: Fast UTF-8 Examples for Developers

Converting text to hex is a fast way to inspect the exact bytes your system is storing or transmitting. It helps debug encoding issues, payload signing, hashing differences, and "invisible character" problems (tabs, non-breaking spaces, odd line endings).

Hex is not magic. It is just a readable way to display bytes. When you can see bytes, you can stop guessing and start proving where a pipeline changes data. This guide gives practical UTF-8 examples you can copy and test.

Hex Basics (What You Are Actually Looking At)

Hexadecimal is a base-16 number system. Each hex pair (00 to FF) represents one byte (0 to 255). When you see a hex string like 48 65 6c 6c 6f, you are looking at bytes displayed in a compact, standard format.

"Text to hex" means: encode the text into bytes (usually UTF-8), then display those bytes as hex pairs. "Hex to text" means: interpret the hex pairs as bytes, then decode bytes back into text using an encoding.
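As a minimal sketch of those two directions, assuming Python 3.8 or later and UTF-8 as the default encoding (the function names text_to_hex and hex_to_text are illustrative, not part of any tool mentioned here):

```python
def text_to_hex(text: str, encoding: str = "utf-8") -> str:
    """Encode text to bytes, then show each byte as a two-digit hex pair."""
    return text.encode(encoding).hex(" ")


def hex_to_text(hex_string: str, encoding: str = "utf-8") -> str:
    """Parse hex pairs into bytes, then decode those bytes back into text."""
    # bytes.fromhex skips whitespace, so spaced and unspaced hex both work.
    return bytes.fromhex(hex_string).decode(encoding)


print(text_to_hex("Hello"))           # 48 65 6c 6c 6f
print(hex_to_text("48 65 6c 6c 6f"))  # Hello
print(hex_to_text("48656c6c6f"))      # Hello
```

The encoding is an explicit parameter because the same text produces different bytes under a different encoding.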

Mini FAQ

Is hex an encoding like Base64?: Both hex and Base64 are ways to write bytes as text. Neither is a character encoding like UTF-8. Hex simply displays each byte value as two hex digits.
Why do people separate hex with spaces?: Spaces make it easier to see byte boundaries. Many tools accept both spaced and unspaced hex.
What encoding is used for "text to hex"?: Typically UTF-8. If your system uses a different encoding, the byte output will differ.

Quick Examples (UTF-8 Bytes You Can Verify)

Start with ASCII. ASCII characters are 1 byte each in UTF-8, so the mapping is straightforward.

Hello becomes 48 65 6c 6c 6f. Test it in Text to Hex.

Now try examples that reveal why byte inspection matters:

Accents: café becomes 63 61 66 c3 a9 (the é is multiple bytes in UTF-8).
Emoji: 😀 becomes f0 9f 98 80 (4 bytes).
Newlines: Line1 and Line2 separated by a line break include 0a for LF; with Windows line endings (CRLF) the break is 0d 0a.
Tabs: a tab character is 09, which explains "why does this not match?" issues in pasted data.
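The examples above can be reproduced with a short script. This is a sketch, assuming Python 3.8 or later; the sample strings use escape sequences so that the exact code points are unambiguous, and the helper name utf8_hex is illustrative.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return text.encode("utf-8").hex(" ")


samples = {
    "ascii": "Hello",
    "accent": "caf\u00e9",       # café with precomposed é (U+00E9)
    "emoji": "\U0001F600",       # the 😀 emoji
    "lf": "Line1\nLine2",        # Unix line ending
    "crlf": "Line1\r\nLine2",    # Windows line ending
    "tab": "a\tb",
}

for label, text in samples.items():
    print(f"{label:>6}: {utf8_hex(text)}")

#  ascii: 48 65 6c 6c 6f
# accent: 63 61 66 c3 a9
#  emoji: f0 9f 98 80
#     lf: 4c 69 6e 65 31 0a 4c 69 6e 65 32
#   crlf: 4c 69 6e 65 31 0d 0a 4c 69 6e 65 32
#    tab: 61 09 62
```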

Mini FAQ

Why is é more than one byte?: UTF-8 is variable length. ASCII characters take 1 byte; every other character takes 2 to 4 bytes. The precomposed é (U+00E9) takes 2: c3 a9.
Why do my hex values differ from someone else's?: Either the input text differs (hidden characters) or the encoding differs (UTF-8 vs a legacy encoding).
How do I find invisible characters?: Convert the text to hex and look for unexpected bytes like 09 (tab) or c2 a0 (non-breaking space in UTF-8).

Round-Trip Check (The Fastest Integrity Test)

A round-trip check answers a simple question: do my bytes decode back into the exact text I started with?

Convert text to hex with Text to Hex.
Copy the hex output.
Decode the hex with Hex to Text.
Compare the decoded output to your original text.

If the round-trip fails, your pipeline is transforming bytes or you are decoding with the wrong encoding. This is especially useful when debugging "works locally but fails in production" issues involving signing, hashing, or storage limits.
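The same check can be scripted. The sketch below assumes Python 3.8 or later; round_trip_ok is a hypothetical helper name, and the Latin-1 decode is only one illustration of what "wrong encoding" can look like.

```python
def round_trip_ok(text: str, encoding: str = "utf-8") -> bool:
    """Text -> hex -> bytes -> text, then compare with the original."""
    hex_output = text.encode(encoding).hex(" ")
    decoded = bytes.fromhex(hex_output).decode(encoding)
    return decoded == text


original = "caf\u00e9"  # café
print(round_trip_ok(original))  # True

# Failure mode: the bytes are UTF-8, but they are decoded as Latin-1.
wrong = original.encode("utf-8").decode("latin-1")
print(wrong == original)                 # False
print(wrong == "caf\u00c3\u00a9")        # True: é turned into two characters
```

When the comparison fails, the hex from each stage shows whether the bytes changed or only the decoding step did.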

Mini FAQ

What does it mean if I see the replacement character �?: It often means the byte sequence is not valid UTF-8 or you decoded with the wrong encoding.
Is round-trip always expected to match exactly?: It should match if no system step normalizes or transforms the text. If normalization is applied, you may see differences even when content looks similar.
What is a good workflow for bug reports?: Capture the original text, the hex bytes, and the decoded text. Bytes make the report unambiguous.
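The first two answers can be made concrete with a small sketch, assuming Python 3.8 or later. The byte sequence 63 61 66 e9 is a hypothetical input chosen for illustration: it is "café" in Latin-1, but a lone e9 is not valid UTF-8.

```python
import unicodedata

# 1) Invalid UTF-8 decoded leniently produces U+FFFD, the replacement character.
latin1_bytes = bytes.fromhex("63 61 66 e9")
decoded = latin1_bytes.decode("utf-8", errors="replace")
print(decoded == "caf\ufffd")  # True

# 2) Normalization changes the bytes even though the text looks the same.
composed = "caf\u00e9"                                # é as one code point
decomposed = unicodedata.normalize("NFD", composed)   # e + combining accent
print(composed.encode("utf-8").hex(" "))    # 63 61 66 c3 a9
print(decomposed.encode("utf-8").hex(" "))  # 63 61 66 65 cc 81
print(composed == decomposed)               # False
```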

UTF-8 vs UTF-16 (Why Byte Counts Surprise People)

Many environments default to UTF-8, but some APIs and libraries expose UTF-16 code units (especially in JavaScript string operations). That can create confusion: a "string length" you see in code is not always the same as byte length in storage or over the network.

Practical example: the emoji 😀 is 4 bytes in UTF-8, but many JavaScript operations treat it as length 2 because it is represented internally as a surrogate pair in UTF-16. That is not a bug in JS; it is a reminder to measure the thing you actually care about: bytes for storage and transport.

When debugging, hex makes this concrete. If your pipeline says it stores "N characters" but your API enforces "N bytes," convert representative text to hex and measure what the system will really send.
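To see the three counts side by side, here is a sketch in Python (the function name measure is illustrative). Python's len counts code points, so the UTF-16 code unit count, which is what JavaScript's string length reports, is derived here from the UTF-16 encoding.

```python
def measure(text: str) -> dict:
    """Count the same text three ways."""
    return {
        "code_points": len(text),
        # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte order mark.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


print(measure("\U0001F600"))
# {'code_points': 1, 'utf16_code_units': 2, 'utf8_bytes': 4}
print(measure("caf\u00e9"))
# {'code_points': 4, 'utf16_code_units': 4, 'utf8_bytes': 5}
```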

Mini FAQ

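Why does my character count not match my byte count?: Because UTF-8 is variable length: ASCII characters take 1 byte and all other characters take 2 to 4. Also check what your "character count" measures, since some languages count UTF-16 code units rather than characters.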
Does hex tell me the encoding?: Hex shows bytes. You still need to know what encoding those bytes represent (UTF-8 is the most common for web systems).
What is the safest way to enforce limits?: Enforce limits at the same level the destination enforces (bytes for storage/API, user-perceived characters for UI), and test with multi-byte examples.
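One way to enforce a byte limit without corrupting UTF-8 is to cut at the limit and then drop any incomplete trailing sequence. The sketch below assumes the destination counts UTF-8 bytes; truncate_utf8 is a hypothetical helper, and it protects code points only, so a multi-code-point emoji or an accent written as a combining mark can still be split.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Shorten text so its UTF-8 encoding fits in max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The slice may end mid-character; errors="ignore" drops that partial tail.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


print(truncate_utf8("caf\u00e9", 5))      # café (5 bytes, fits)
print(truncate_utf8("caf\u00e9", 4))      # caf  (é needs 2 bytes, only 1 left)
print(truncate_utf8("ab\U0001F600", 5))   # ab   (emoji needs 4 bytes, only 3 left)
```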

When to Use Text to Hex (Real Debugging Scenarios)

Comparing payloads across environments: prove whether two systems are producing the same bytes.
API signing and hashing: signatures are computed on bytes; one hidden character changes everything.
Finding invisible whitespace: tabs, CRLF vs LF, non-breaking spaces, zero-width characters.
Investigating truncation: byte limits can cut multi-byte characters and corrupt UTF-8.
Debugging file imports: detect when a CSV is not actually UTF-8.

A practical technique: if two strings look identical but behave differently (do not match, do not hash the same), convert both to hex and compare byte-by-byte. The first differing byte usually points to the root cause.
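A byte-by-byte comparison can be sketched as follows, assuming both strings are encoded as UTF-8. The helper name first_byte_difference and the sample strings are illustrative.

```python
def first_byte_difference(a: str, b: str, encoding: str = "utf-8"):
    """Return (offset, hex_a, hex_b) for the first differing byte, or None.

    If one string is a prefix of the other, the shorter side is reported as None.
    """
    bytes_a, bytes_b = a.encode(encoding), b.encode(encoding)
    for offset, (x, y) in enumerate(zip(bytes_a, bytes_b)):
        if x != y:
            return offset, f"{x:02x}", f"{y:02x}"
    if len(bytes_a) != len(bytes_b):
        shorter = min(len(bytes_a), len(bytes_b))
        tail_a = f"{bytes_a[shorter]:02x}" if len(bytes_a) > shorter else None
        tail_b = f"{bytes_b[shorter]:02x}" if len(bytes_b) > shorter else None
        return shorter, tail_a, tail_b
    return None


# The two strings look alike, but the second contains a non-breaking space.
print(first_byte_difference("total 10", "total\u00a010"))  # (5, '20', 'c2')
print(first_byte_difference("abc", "abc"))                 # None
```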

Mini FAQ

Why does hashing care about bytes?: Hashes operate on bytes. If text is encoded differently, the bytes differ, so the hash differs.
How do I tell if a space is a non-breaking space?: In UTF-8, the non-breaking space (U+00A0) is c2 a0. A normal space is 20.
Can hex help with line ending bugs?: Yes. LF is 0a. CRLF is 0d 0a. That difference causes many cross-platform mismatches.
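To illustrate how these differences affect hashing, the sketch below hashes four strings that look nearly identical. It assumes SHA-256 over UTF-8 bytes purely as an example (your signing scheme may use something else); sha256_hex and the sample strings are illustrative.

```python
import hashlib


def sha256_hex(text: str, encoding: str = "utf-8") -> str:
    """Hash the encoded bytes of text, not the text itself."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


variants = {
    "plain space": "a b",
    "non-breaking space": "a\u00a0b",
    "LF ending": "a b\n",
    "CRLF ending": "a b\r\n",
}

for label, text in variants.items():
    hex_bytes = text.encode("utf-8").hex(" ")
    digest_prefix = sha256_hex(text)[:16]
    print(f"{label:>20}  {hex_bytes:<16}  {digest_prefix}")

# All four digests differ, because all four byte sequences differ.
print(len({sha256_hex(text) for text in variants.values()}))  # 4
```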

Hex vs Base64 vs Binary (Pick the Right View)

Hex and Base64 both represent bytes, but they are used for different reasons:

Hex: best for debugging and byte-by-byte inspection; easy to see boundaries.
Base64: best for transporting bytes through text-only systems; more compact than hex.
Binary: useful for bit-level understanding, but usually too verbose for quick debugging.
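The three views can be compared on the same bytes. This is a sketch using Python's standard library, with "café" in UTF-8 as an arbitrary sample:

```python
import base64

data = "caf\u00e9".encode("utf-8")  # café, 5 bytes in UTF-8

hex_view = data.hex(" ")
base64_view = base64.b64encode(data).decode("ascii")
binary_view = " ".join(f"{byte:08b}" for byte in data)

print(hex_view)     # 63 61 66 c3 a9
print(base64_view)  # Y2Fmw6k=
print(binary_view)  # 01100011 01100001 01100110 11000011 10101001

# Size without separators: hex uses 2 characters per byte,
# Base64 uses 4 characters per 3 bytes (padded), binary uses 8 per byte.
print(len(data.hex()), len(base64_view), len(data) * 8)  # 10 8 40
```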

For Base64 transport debugging, use Text to Base64 and Base64 to Text. For bit-level inspection, use Text to Binary.

Mini FAQ

Should I store data as hex?: Usually no. Hex is great for inspection and logs, but it doubles the size of the data. Store bytes or UTF-8 text directly unless you have a specific reason.
Why is Base64 shorter than hex?: Base64 encodes 3 bytes into 4 characters. Hex encodes 1 byte into 2 characters, so it is less compact.
What should I use for a debug report?: Hex is usually the clearest because it preserves byte boundaries and is easy to compare.
