Zero Width Characters
The general idea is called a canary trap, and IIRC it was popularized by Tom Clancy. umpox uploaded his source code to GitHub and was kind enough to provide a proof-of-concept demo page. But what exactly are zero-width characters, and why do they matter? Basically, if you copy text that contains these invisible characters, you can expose yourself: whoever planted them can tell that you copied their message. This is a problem for journalists and for organizations that handle confidential text. An attacker can hide data in the text you copy, or identify you as the person who copied the original message.
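To make the idea concrete, here is a minimal sketch of how an identifier could be hidden inside ordinary-looking text. The bit mapping (U+200B for 0, U+200C for 1), the function names `embed_id` and `extract_id`, and the choice to place the payload after the first character are all assumptions for illustration; this is not necessarily the scheme used in umpox's demo.

```python
ZERO = "\u200b"  # ZERO WIDTH SPACE, stands for bit 0 (assumed mapping)
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, stands for bit 1 (assumed mapping)


def embed_id(text: str, reader_id: str) -> str:
    """Hide reader_id inside text as a run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in reader_id.encode("utf-8"))
    hidden = "".join(ONE if bit == "1" else ZERO for bit in bits)
    # Insert the invisible payload right after the first visible character.
    return text[:1] + hidden + text[1:]


def extract_id(text: str) -> str:
    """Recover the hidden identifier from a copied piece of text."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


marked = embed_id("Confidential memo", "alice")
print(marked == "Confidential memo")  # False, although both look identical on screen
print(extract_id(marked))             # alice
```

Each recipient would get a copy with a different hidden identifier, so a leaked copy points back to whoever received it.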
No (real) protection
Right now most browsers, extensions and applications can’t detect these special characters, which means extensions like ‘copyplaintext’ don’t help. The only solution seems to be to copy the text and run it through something like a Diff Checker to reveal whether anything is hidden in it.
Theoretically, you can change the encoding to reveal these characters, but who is really going to do that for every text on every page? The zero-width space is encoded in Unicode as U+200B ZERO WIDTH SPACE (HTML &#8203;). Programs like Notepad++ can show which encoding the pasted text uses, which might help as a workaround.
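A quick way to check a piece of copied text without changing encodings by hand is to list every character that Unicode classifies as an invisible format character. The sketch below uses Python's standard `unicodedata` module; the function name `find_invisible` is made up for this example, and treating the whole "Cf" category as suspicious is an assumption. That category also covers legitimate characters such as the zero-width joiner used in some emoji and scripts, so a hit is a hint and not proof of tracking.

```python
import unicodedata


def find_invisible(text: str) -> list[tuple[int, str, str]]:
    """Return (position, code point, name) for every format character in text."""
    hits = []
    for index, ch in enumerate(text):
        # "Cf" is the Unicode general category for invisible format characters,
        # which includes U+200B ZERO WIDTH SPACE.
        if unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


print(find_invisible("se\u200bcret"))  # [(2, 'U+200B', 'ZERO WIDTH SPACE')]
print(find_invisible("plain text"))    # []
```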
Safetext by David Jacobson is a small open-source utility for checking text.
What can you do?
- Avoid releasing excerpts and raw documents, or protect them with your own watermarks.
- Manually retype excerpts to avoid invisible characters and homoglyphs.
- Get the same documents from multiple leakers to ensure they have the exact same content on a byte-by-byte level and verify the source.
- Keep excerpts short to limit the amount of information shared.
- Before sharing text with others, use a tool like SafeTxt that strips non-whitelisted characters, or a Diff Checker to spot them. This method might not be 100% reliable.
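The whitelist approach from the last point can be sketched in a few lines. This is only an illustration of the general idea, not how SafeTxt itself works: the `ALLOWED` set and the function name `strip_non_whitelisted` are assumptions, and the whitelist here is deliberately strict (printable ASCII plus space, newline and tab), so it also throws away legitimate non-ASCII text such as accented letters.

```python
import string

# Assumed whitelist: ASCII letters, digits, punctuation, space, newline and tab.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")


def strip_non_whitelisted(text: str) -> str:
    """Drop every character that is not on the whitelist."""
    return "".join(ch for ch in text if ch in ALLOWED)


print(strip_non_whitelisted("se\u200bcret\u200c memo"))  # secret memo
print(strip_non_whitelisted("p\u0430y"))                 # py (Cyrillic homoglyph removed)
```

Note that a non-ASCII homoglyph is removed rather than replaced, so the visible text changes, and a watermark built from ordinary ASCII differences (for example extra spaces or altered wording) passes straight through. That is one reason this method is not fully reliable on its own.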
I’m pretty sure some people will now change their applications, extensions and browsers because of these useful findings by umpox. Hopefully we will see solutions soon, because I can already predict that this will be abused as soon as possible. What is really strange is that these invisible characters have been known for a long time, yet this ‘trick’ still works in browsers that claim over and over again to be secure. It is shocking that, after all these years, someone found a way to expose you with this trick. On the other hand, I’m thankful for the information, since sooner or later we might get updates to all our apps that detect or disallow such characters.