CJK Unified Ideographs – Test for Unicode support in Web browsers

Alan Wood’s Unicode Resources

Test for Unicode support in Web browsers

CJK Unified Ideographs

U+4E00 – U+9FFF   (19968–40959)
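The hexadecimal boundaries and their decimal equivalents can be checked with a few lines of Python. This is only a minimal sketch: the names `CJK_START`, `CJK_END` and `in_cjk_unified` are illustrative and do not come from this page.

```python
# Boundaries of the CJK Unified Ideographs block, as given in the heading above.
CJK_START = 0x4E00
CJK_END = 0x9FFF


def in_cjk_unified(ch: str) -> bool:
    """Return True if the single character ch lies in U+4E00..U+9FFF."""
    return CJK_START <= ord(ch) <= CJK_END


if __name__ == "__main__":
    # Decimal forms of the two boundaries.
    print(CJK_START, CJK_END)
    # Number of code points reserved for the block (not all need be assigned).
    print(CJK_END - CJK_START + 1)
    # A character inside the block and one outside it.
    print(in_cjk_unified("\u4e00"), in_cjk_unified("A"))
```

The count printed is the size of the reserved range, which may be larger than the number of characters actually assigned in any given Unicode version.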

There are far too many of these Chinese, Japanese and Korean ideographs to show in a single HTML document, so only the first and last few are shown. More of these ideographs are found in the CJK Unified Ideographs Extension A, CJK Unified Ideographs Extension B, CJK Unified Ideographs Extension C and CJK Unified Ideographs Extension D ranges. The order follows the traditional KangXi dictionary: characters are grouped by radical, and within each radical those with the fewest strokes come first.

The characters that appear in the “Character” columns of the following tables depend on the browser that you are using, the fonts installed on your computer, and the browser options you have chosen that determine the fonts used to display particular character sets, encodings or languages.

You can find some or all of the characters in this range in the following Unicode fonts:

Arial Unicode MS, Chrysanthi Unicode (few) and Code2000 also contain characters from this range.

To see exactly which characters are included in a particular font, you can use a utility such as Andrew West’s BabelMap, Apple’s TrueEdit, or WunderMoosen’s FontChecker.
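If you prefer a script to a utility, the same check can be roughly sketched in Python. The function below is hypothetical and works on any mapping from code points to glyph names. One way to obtain such a mapping, assuming the third-party fontTools package is installed, is `TTFont(path).getBestCmap()`, but nothing in the sketch depends on that.

```python
def coverage(cmap, start=0x4E00, end=0x9FFF):
    """Count how many code points in start..end appear as keys in cmap.

    cmap is any mapping keyed by integer code point, for example the
    dictionary returned by fontTools' TTFont(path).getBestCmap()
    (an assumption about your environment, not a requirement).
    Returns (present, total).
    """
    total = end - start + 1
    present = sum(1 for cp in range(start, end + 1) if cp in cmap)
    return present, total


if __name__ == "__main__":
    # A made-up mapping standing in for a real font's character map.
    fake_cmap = {0x41: "A", 0x4E00: "uni4E00", 0x4E01: "uni4E01"}
    present, total = coverage(fake_cmap)
    print(f"{present} of {total} code points in the range are mapped")
```

A font that reports only a few mapped code points corresponds to the "(few)" annotation used in the font list.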

You can try out your browser and fonts with simplified Chinese text at http://www.microsoft.com/China/, with traditional Chinese text at http://www.microsoft.com/Taiwan/ and http://www.apple.com.tw/, with Japanese text at http://www.microsoft.com/japan/ and http://www.apple.co.jp/, and with Korean text at http://www.microsoft.com/Korea/ and http://www.applecomputer.co.kr/.

Users of Internet Explorer 5 for Windows can choose to install updates for viewing Chinese, Japanese and Korean Web pages and to provide an interface and an IME in any of those languages.

Users of Macintosh computers running Mac OS 9 can install Apple Language Kits for Chinese, Japanese and Korean, which include IMEs.

A document that displays every character in this range is available at http://www.unicode.org/charts/PDF/U4E00.pdf.


Simplified Chinese

The following table has the lang="zh-Hans" attribute, which should cause your Web browser to use a simplified Chinese font.

Character (decimal)  Decimal  Character (hex)  Hex   Name
一                   19968    一               4E00  <CJK Ideograph, First>
丁                   19969    丁               4E01  <CJK Ideograph, Second>
丂                   19970    丂               4E02  <CJK Ideograph, Third>
七                   19971    七               4E03  <CJK Ideograph, Fourth>
丄                   19972    丄               4E04  <CJK Ideograph, Fifth>
丅                   19973    丅               4E05  <CJK Ideograph, Sixth>
丆                   19974    丆               4E06  <CJK Ideograph, Seventh>
万                   19975    万               4E07  <CJK Ideograph, Eighth>
丈                   19976    丈               4E08  <CJK Ideograph, Ninth>
三                   19977    三               4E09  <CJK Ideograph, Tenth>
龺                   40890    龺               9FBA  <CJK Ideograph, Penultimate>
龻                   40891    龻               9FBB  <CJK Ideograph, Last>
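
The two "Character" columns show the same character written in two ways: as a decimal numeric character reference and as a hexadecimal one. The following Python sketch shows how one such row, and a table carrying a lang attribute, might be generated. The function names `table_row` and `table` and the exact markup are assumptions made for illustration, not the markup this page actually uses.

```python
from html import escape, unescape


def table_row(code_point: int, name: str) -> str:
    """Build one HTML row: the character as a decimal reference, the decimal
    value, the character as a hexadecimal reference, the hex value, the name."""
    return (
        "<tr>"
        f"<td>&#{code_point};</td><td>{code_point}</td>"
        f"<td>&#x{code_point:04X};</td><td>{code_point:04X}</td>"
        f"<td>{escape(name)}</td>"
        "</tr>"
    )


def table(lang: str, rows: list[tuple[int, str]]) -> str:
    """Wrap the rows in a table element carrying the given lang attribute."""
    body = "\n".join(table_row(cp, name) for cp, name in rows)
    return f'<table lang="{escape(lang)}">\n{body}\n</table>'


if __name__ == "__main__":
    print(table("zh-Hans", [(0x4E00, "<CJK Ideograph, First>")]))
    # Both forms of reference decode to the same character.
    print(unescape("&#19968;") == unescape("&#x4E00;"))
```

Calling `table` with "zh-Hant", "ja" or "ko" instead of "zh-Hans" would produce the same rows under a different language tag, which is the only thing that differs between the four tables on this page.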

Traditional Chinese

The following table has the lang="zh-Hant" attribute, which should cause your Web browser to use a traditional Chinese font.

Character (decimal)  Decimal  Character (hex)  Hex   Name
一                   19968    一               4E00  <CJK Ideograph, First>
丁                   19969    丁               4E01  <CJK Ideograph, Second>
丂                   19970    丂               4E02  <CJK Ideograph, Third>
七                   19971    七               4E03  <CJK Ideograph, Fourth>
丄                   19972    丄               4E04  <CJK Ideograph, Fifth>
丅                   19973    丅               4E05  <CJK Ideograph, Sixth>
丆                   19974    丆               4E06  <CJK Ideograph, Seventh>
万                   19975    万               4E07  <CJK Ideograph, Eighth>
丈                   19976    丈               4E08  <CJK Ideograph, Ninth>
三                   19977    三               4E09  <CJK Ideograph, Tenth>
龺                   40890    龺               9FBA  <CJK Ideograph, Penultimate>
龻                   40891    龻               9FBB  <CJK Ideograph, Last>

Japanese

The following table has the lang="ja" attribute, which should cause your Web browser to use a Japanese font.

Character (decimal)  Decimal  Character (hex)  Hex   Name
一                   19968    一               4E00  <CJK Ideograph, First>
丁                   19969    丁               4E01  <CJK Ideograph, Second>
丂                   19970    丂               4E02  <CJK Ideograph, Third>
七                   19971    七               4E03  <CJK Ideograph, Fourth>
丄                   19972    丄               4E04  <CJK Ideograph, Fifth>
丅                   19973    丅               4E05  <CJK Ideograph, Sixth>
丆                   19974    丆               4E06  <CJK Ideograph, Seventh>
万                   19975    万               4E07  <CJK Ideograph, Eighth>
丈                   19976    丈               4E08  <CJK Ideograph, Ninth>
三                   19977    三               4E09  <CJK Ideograph, Tenth>
龺                   40890    龺               9FBA  <CJK Ideograph, Penultimate>
龻                   40891    龻               9FBB  <CJK Ideograph, Last>

Korean

The following table has the lang="ko" attribute, which should cause your Web browser to use a Korean font.

Character (decimal)  Decimal  Character (hex)  Hex   Name
一                   19968    一               4E00  <CJK Ideograph, First>
丁                   19969    丁               4E01  <CJK Ideograph, Second>
丂                   19970    丂               4E02  <CJK Ideograph, Third>
七                   19971    七               4E03  <CJK Ideograph, Fourth>
丄                   19972    丄               4E04  <CJK Ideograph, Fifth>
丅                   19973    丅               4E05  <CJK Ideograph, Sixth>
丆                   19974    丆               4E06  <CJK Ideograph, Seventh>
万                   19975    万               4E07  <CJK Ideograph, Eighth>
丈                   19976    丈               4E08  <CJK Ideograph, Ninth>
三                   19977    三               4E09  <CJK Ideograph, Tenth>
龺                   40890    龺               9FBA  <CJK Ideograph, Penultimate>
龻                   40891    龻               9FBB  <CJK Ideograph, Last>

Copyright © 1999–2010 Alan Wood

The hexadecimal numbers and the character names in the above tables are taken from the Unicode 3.0 Character Database, Copyright © 1991–1999 Unicode, Inc., as contained in UnicodeData-Latest.txt on the Unicode ftp site (ftp://ftp.unicode.org/Public/UNIDATA/) in October 1999.

Created 3rd February 1999   Last updated 7th November 2010
