Clever Uses of Zero-Width Characters
I recently learned about zero-width characters and found them quite interesting, so I’m documenting this here.
Above is an “empty” Weibo post I sent using zero-width spaces, and it worked! However, the same method doesn’t work on WeChat Moments or Twitter, likely because their empty content detection logic filters out zero-width characters.
OK, let’s begin the main content.
Zero-Width Characters
Unicode includes a special class of characters called zero-width characters. The term “zero-width” may be familiar from regular expressions: lookahead and lookbehind assertions are called zero-width assertions because they match a position without consuming any characters.
Zero-width characters are still characters, so they count toward the length of a string, but most software, such as Sina Weibo, browsers, Excel, and WeChat, does not display them.
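To make the "still a character" point concrete, here is a minimal Python sketch. It assumes a naive "is this post empty?" check based on `str.strip()`; that check is only an illustration, not how any of the platforms mentioned above actually validate content.

```python
ZWSP = "\u200b"  # zero-width space

# The string renders like "ab", but it holds three code points.
text = "a" + ZWSP + "b"
print(len(text))  # 3

# A post made only of zero-width spaces looks blank, yet a naive
# emptiness check (hypothetical) does not treat it as empty, because
# U+200B is not classified as whitespace by str.strip().
only_zero_width = ZWSP * 3
looks_empty_to_naive_check = only_zero_width.strip() == ""
print(looks_empty_to_naive_check)  # False
```

A check that wants to reject such posts would have to remove zero-width characters explicitly before testing for emptiness.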
Commonly used zero-width characters include:
- Zero-width space: U+200B
- Zero-width joiner: U+200D
  - Complex emoji commonly use this character to join several characters into a single new one
- Zero-width non-joiner: U+200C
- Zero-width no-break space: U+FEFF
- Left-to-right mark: U+200E
- Right-to-left mark: U+200F
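The code points listed above can be checked against Python's bundled Unicode database, and the zero-width joiner's role in emoji can be sketched in a few lines. The family emoji used here is just one illustrative joiner sequence; whether it renders as a single glyph depends on the font and platform.

```python
import unicodedata

# Look up the official Unicode name of each code point in the list.
CODE_POINTS = ["\u200b", "\u200d", "\u200c", "\ufeff", "\u200e", "\u200f"]
for ch in CODE_POINTS:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# A zero-width joiner glues several emoji into one composed emoji.
ZWJ = "\u200d"
man, woman, girl = "\U0001F468", "\U0001F469", "\U0001F467"
family = ZWJ.join([man, woman, girl])

# Three emoji plus two joiners: five code points behind one visible glyph.
print(len(family))  # 5
```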
Clever Uses
Watermarking
A website with strict copyright requirements can encode each visitor’s username as zero-width characters and hide it in the article text served to that visitor. This doesn’t affect readability, yet every copy carries the reader’s identity. If the article content leaks, simply decode the hidden username to identify the source of the leak.
As shown below, the left side shows what we see, but my name is actually encoded in the text. The right side shows the decoded name.
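One possible way to implement such a watermark is sketched below. The bit mapping (zero-width space for 0, zero-width non-joiner for 1), the function names, and the choice to place the mark after the first character are all assumptions made for illustration; they are not taken from any particular tool, and the sketch assumes the article contains no other U+200B or U+200C characters.

```python
ZERO = "\u200b"  # zero-width space stands for bit 0 (assumed mapping)
ONE = "\u200c"   # zero-width non-joiner stands for bit 1 (assumed mapping)


def encode_watermark(username: str) -> str:
    """Turn the username's UTF-8 bytes into a run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in username.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def embed(article: str, username: str) -> str:
    """Hide the encoded username right after the first character."""
    return article[:1] + encode_watermark(username) + article[1:]


def extract(article: str) -> str:
    """Collect the hidden bits and decode them back into the username."""
    bits = "".join("1" if ch == ONE else "0"
                   for ch in article if ch in (ZERO, ONE))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


marked = embed("Hello, world", "alice")
print(len("Hello, world"), len(marked))  # 12 52
print(extract(marked))                   # alice
```

The marked text looks identical to the original, but it is 40 characters longer: 5 bytes of username at 8 zero-width characters per byte. A real system would likely repeat the mark in several places so that it survives partial copying.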
Evading Keyword Matching
- Insert zero-width characters within keywords to evade some keyword matching programs
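The sketch below shows why a plain substring match misses such a keyword, and how a matcher can defend itself by stripping zero-width characters first. The function names and the keyword are hypothetical; real filtering systems may work quite differently.

```python
import re

ZWSP = "\u200b"
# The zero-width characters listed earlier in this article.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u200e\u200f\ufeff]")


def obfuscate(word: str) -> str:
    """Insert a zero-width space between every pair of characters."""
    return ZWSP.join(word)


def naive_match(text: str, keyword: str) -> bool:
    """Plain substring search, as a simple filter might do."""
    return keyword in text


def normalized_match(text: str, keyword: str) -> bool:
    """Strip zero-width characters before searching."""
    return keyword in ZERO_WIDTH.sub("", text)


message = "this is a " + obfuscate("secret") + " message"
print(naive_match(message, "secret"))       # False
print(normalized_match(message, "secret"))  # True
```

Note that stripping U+200D also breaks joined emoji sequences, so a production filter might normalize only a copy of the text used for matching.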
Invisibility
As mentioned above, these characters are invisible in most software. To see them, create a text file in a development tool such as VS Code or IntelliJ IDEA and paste the content in; the editor will reveal the hidden characters.
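If no such editor is at hand, a few lines of Python can do the same job. This is a minimal sketch; the `reveal` helper and the `<U+XXXX>` display format are arbitrary choices made for illustration.

```python
# The zero-width characters listed earlier in this article.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u200e", "\u200f", "\ufeff"}


def reveal(text: str) -> str:
    """Replace each zero-width character with a visible <U+XXXX> tag."""
    return "".join(f"<U+{ord(ch):04X}>" if ch in INVISIBLE else ch
                   for ch in text)


print(reveal("a\u200bb\ufeffc"))  # a<U+200B>b<U+FEFF>c
```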
Recommended Tools
If you want to implement zero-width character watermarking for text, you can refer to these tools: