Representing the symbols, characters, and letters used worldwide is no mean feat, but Unicode managed it. How? Tom Scott explains how the web settled on a standard.
This page describes normalization forms for Unicode text. When implementations keep strings in a normalized form, they can be assured that equivalent strings have a unique binary representation. This page also provides examples, additional specifications regarding normalization of Unicode text, and information about conformance testing for Unicode normalization forms.
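As a minimal sketch of why normalization matters, the snippet below uses Python's standard `unicodedata` module to compare two canonically equivalent spellings of "é". The helper name `same_text` and the sample strings are illustrative assumptions, not something taken from the page described above.

```python
import unicodedata

# Two canonically equivalent spellings of the same text.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Without normalization the code point sequences differ.
raw_equal = composed == decomposed

# After normalization both spellings share one representation.
nfc_bytes = unicodedata.normalize("NFC", decomposed).encode("utf-8")
nfd_text = unicodedata.normalize("NFD", composed)
```

Here `raw_equal` is false while `same_text(composed, decomposed)` is true, which is the practical meaning of "a unique binary representation": once both strings are in the same normalization form, a plain byte or code point comparison is enough.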
This tutorial gives you an understanding of the key requirements for implementing writing systems in information technology. It does so by examining real examples from a wide range of modern scripts to discover the features that a computerized implementation must support.
The Unicode Standard is a character coding system designed to support the worldwide interchange, processing, and display of the written texts of the diverse languages and technical disciplines of the modern world. In addition, it supports classical and historical texts of many written languages.