
Have you ever opened a document only to get gibberish like this: “€š�舐Þ¥¿Þ¥¿????”?

Mojibake is the Japanese name for the gibberish that is created when a document is opened using the wrong character encoding. In other words, if the recipient’s software assumes an encoding that doesn’t include the letters of a specific language, those letters are replaced with other characters, which often just appear as gibberish to the reader 1.
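
To make the mechanism concrete, here is a minimal Python sketch of text being saved in one encoding and opened in another. The sample word and the choice of UTF-8 and Windows-1252 are assumptions picked for illustration, not taken from any particular document.

```python
# Hypothetical sample text: the word "mojibake" written in Japanese.
original = "文字化け"

# The sender saves the text as UTF-8 bytes.
raw_bytes = original.encode("utf-8")

# The recipient's software wrongly assumes Windows-1252, a Western European
# encoding. Bytes with no Windows-1252 meaning become U+FFFD (the "�" symbol).
garbled = raw_bytes.decode("cp1252", errors="replace")

# Decoding with the encoding that was actually used restores the text.
recovered = raw_bytes.decode("utf-8")
```

Printing `garbled` shows a run of accented Latin letters and symbols instead of the four Japanese characters, while `recovered` matches the original. The bytes never changed; only the interpretation did.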

Problems like this led to the creation of Unicode, which is a universal character set. Before Unicode, “…each manufacturer invented its own encoding to fit its client market and its usage” 2. This wasn’t a huge problem in the early days of computing, as very few users received and read documents in a language different from their mother tongue. “A computer supported only a small number of languages, the user configured his region to support languages of close countries” 3. However, from 1983 onward, as internet usage began to grow, exchanging documents in multiple languages became common. So, unsurprisingly, Mojibake became a problem for the general public as well.

Mojibake was also quite common in Central and Eastern European languages. “Because most computers were unconnected to any network during the 1980s, there were different character encodings for every language with diacritical characters” 4.
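
As a rough illustration of competing encodings for the same language, the sketch below encodes a Polish word with ISO-8859-2 and decodes it with Windows-1250, two encodings that were both used for Central European text. The sample word is an assumption chosen because it contains letters the two encodings place at different byte values.

```python
# Hypothetical sample word containing Polish diacritics.
text = "gęślą"

# Saved on a system that uses ISO-8859-2 (Latin-2).
stored = text.encode("iso8859_2")

# Opened on a system that assumes Windows-1250 instead.
misread = stored.decode("cp1250")

# Some letters survive because both encodings agree on them;
# others turn into unrelated symbols.
changed = [(a, b) for a, b in zip(text, misread) if a != b]
```

Here `changed` lists only the letters that the two encodings disagree on, which is why this kind of Mojibake often looks like mostly readable text with a few odd symbols mixed in.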

English, however, suffers less from Mojibake than other languages because most encodings were designed with English in mind, so its basic, unaccented letters are stored the same way in nearly all of them.
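
One way to see why English tends to survive is to compare the bytes produced by different encodings. The following is a small sketch; the list of encodings and the sample strings are assumptions chosen for illustration.

```python
# A handful of encodings from different regions (an illustrative selection).
encodings = ["ascii", "utf-8", "cp1252", "iso8859_2", "shift_jis", "koi8_r"]

# Plain English text yields the same bytes in every one of them.
english = "Hello, world"
english_forms = {english.encode(name) for name in encodings}

# A German word with an umlaut and a sharp s does not.
german = "Grüße"
german_forms = {german.encode(name) for name in ["utf-8", "cp1252"]}

english_is_stable = len(english_forms) == 1
german_is_stable = len(german_forms) == 1
```

With these inputs, `english_is_stable` is true and `german_is_stable` is false: the unaccented letters share one byte representation, so a wrong guess about the encoding leaves them readable, while the accented letters do not.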

The most affected languages are:

  • Japanese, Mandarin, Cantonese and several other Asian languages
  • Arabic
  • Western European languages like German
  • Cyrillic-based scripts
  • Eastern European languages like Polish
  • Nordic languages like Danish
  • Hungarian
  • Spanish

Language is an incredibly rich and interesting aspect of life and when married with technology can become all the more beautiful and complex. What more can you say? I suppose, just this: €š�舐Þ¥¿Þ�舐¥¿

References

  1. Unknown Author. “What Is Mojibake? – Definition from Techopedia.” Techopedia.com, www.techopedia.com/definition/31986/mojibake.
  2. Stinner, Victor. “Charsets and Encodings.” Programming with Unicode, 2011, p. 15, unicodebook.readthedocs.io/encodings.html#charsets-and-encodings.
  3. Stinner, Victor. “Charsets and Encodings.” Programming with Unicode, 2011, p. 9, unicodebook.readthedocs.io/encodings.html#charsets-and-encodings.
  4. Unknown Author. “Mojibake.” Languages Wiki, worldlanguages.fandom.com/wiki/Mojibake.
