Thursday, January 18, 2007

Diacritical marks
I searched the MSDN Library for 'html entities'. What did I get?
Character Entities for Special Symbols and BIDI Text

This table lists the named entities (NE) with their equivalent numeric character references (NCR) that are used to escape markup characters and denote spaces and dashes. Other characters in this section apply to internationalization issues, such as the disambiguation of bidirectional (BIDI) text, and special symbols.

Rendered NE NCR Description
C0 Controls and Basic Latin
" &quot; &#34; quotation mark, =apl quote, U0022 ISOnum
& &amp; &#38; ampersand, U0026 ISOnum
< &lt; &#60; less-than sign, U003C ISOnum
> &gt; &#62; greater-than sign, U003E ISOnum
Latin Extended-A
Π&OElig; &#338; Latin capital ligature oe, U0152 ISOlat2
œ &oelig; &#339; Latin small ligature oe, U0153 ISOlat2
Š &Scaron; &#352; Latin capital letter s with caron, U0160 ISOlat2
š &scaron; &#353; Latin small letter s with caron, U0161 ISOlat2
Ÿ &Yuml; &#376; Latin capital letter y with diaeresis, U0178 ISOlat2
Spacing Modifier Letters
ˆ &circ; &#710; modifier letter circumflex accent, U02C6 ISOpub
˜ &tilde; &#732; small tilde, U02DC ISOdia
General Punctuation
[ ] &ensp; &#8194; en space, U2002 ISOpub
[ ] &emsp; &#8195; em space, U2003 ISOpub
[?] &thinsp; &#8201; thin space, U2009 ISOpub
? &zwnj; &#8204; zero width non-joiner, U200C NEW RFC 2070
? &zwj; &#8205; zero width joiner, U200D NEW RFC 2070
? &lrm; &#8206; left-to-right mark, U200E NEW RFC 2070
? &rlm; &#8207; right-to-left mark, U200F NEW RFC 2070
– &ndash; &#8211; en dash, U2013 ISOpub
— &mdash; &#8212; em dash, U2014 ISOpub
‘ &lsquo; &#8216; left single quotation mark, U2018 ISOnum
’ &rsquo; &#8217; right single quotation mark, U2019 ISOnum
‚ &sbquo; &#8218; single low-9 quotation mark, U201A NEW
“ &ldquo; &#8220; left double quotation mark, U201C ISOnum
” &rdquo; &#8221; right double quotation mark, U201D ISOnum
„ &bdquo; &#8222; double low-9 quotation mark, U201E NEW
† &dagger; &#8224; dagger, U2020 ISOpub
‡ &Dagger; &#8225; double dagger, U2021 ISOpub
‰ &permil; &#8240; per mille sign, U2030 ISOtech
‹ &lsaquo; &#8249; single left-pointing angle quotation mark, U2039 ISO proposed
› &rsaquo; &#8250; single right-pointing angle quotation mark, U203A ISO proposed
€ &euro; &#8364; euro sign, U+20AC NEW
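
The NE and NCR columns above can be cross-checked programmatically. The following is a minimal sketch in Python (my choice of language, not something the MSDN page uses), relying on the standard `html` and `html.entities` modules; `ncr_for` is a hypothetical helper name introduced only for this illustration.

```python
from html import unescape
from html.entities import name2codepoint


def ncr_for(name: str) -> str:
    """Build the numeric character reference for a named entity (name given without '&' and ';')."""
    return f"&#{name2codepoint[name]};"


for name in ("lt", "OElig", "scaron", "ndash", "euro"):
    ne = f"&{name};"
    ncr = ncr_for(name)
    # Both notations should decode to the same single character.
    assert unescape(ne) == unescape(ncr)
    print(ne, ncr, f"U+{name2codepoint[name]:04X}")
```

Either notation works in markup; the numeric form is the safer fallback when a parser does not know a given entity name.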


Something else
Character Set Recognition

Microsoft Internet Explorer uses the character set specified for a document to determine how to translate the bytes in the document into characters on the screen or on paper. By default, Internet Explorer uses the character set specified in the HTTP content type returned by the server to determine this translation. If this parameter is not given, Internet Explorer uses the character set specified by the meta element in the document. It uses the user's preferences if no meta element is specified.
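
The fallback order described in this paragraph (HTTP content type first, then the meta element, then the user's preference) can be sketched as a small function. This is only an illustration of the described precedence, not Internet Explorer's actual implementation; `pick_charset` and its parameter names are hypothetical.

```python
from typing import Optional


def pick_charset(http_charset: Optional[str],
                 meta_charset: Optional[str],
                 user_default: str) -> str:
    """Pick a charset label in the order the text describes."""
    if http_charset:
        # 1. charset parameter of the HTTP Content-Type header
        return http_charset.lower()
    if meta_charset:
        # 2. charset declared by the document's meta element
        return meta_charset.lower()
    # 3. the user's preference
    return user_default.lower()


print(pick_charset(None, "windows-1251", "windows-1252"))
```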

You can use the meta element to explicitly set the character set for a document. In this case, set the HTTP-EQUIV attribute to Content-Type and specify a character set identifier in the CONTENT attribute. For example, the following meta element identifies windows-1251 as the character set for the document.

<META HTTP-EQUIV="Content-Type"
CONTENT="text/html; CHARSET=windows-1251">

To apply a character set to an entire document, you must insert the meta element before the body element. For clarity, it should appear as the first element after head, so that all browsers can translate the meta element before the document is parsed. The meta element applies to the document containing it. This means, for example, that a compound document (a document consisting of two or more documents in a set of frames) can use different character sets in different frames.
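
To see what a reader of such a document has to do with the meta element, here is a rough Python sketch that extracts the charset from an HTTP-EQUIV="Content-Type" declaration using the standard `html.parser` module. The class and function names (`MetaCharsetFinder`, `find_meta_charset`) are made up for this example, and a real browser is considerably more forgiving than this.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaCharsetFinder(HTMLParser):
    """Remember the first charset found in a Content-Type meta element."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        a = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if (a.get("http-equiv") or "").lower() != "content-type":
            return
        # CONTENT looks like "text/html; CHARSET=windows-1251"
        for part in (a.get("content") or "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.strip().lower() == "charset" and value.strip():
                self.charset = value.strip().lower()


def find_meta_charset(html_text: str) -> Optional[str]:
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


sample = ('<html><head><META HTTP-EQUIV="Content-Type"\n'
          'CONTENT="text/html; CHARSET=windows-1251"></head><body></body></html>')
print(find_meta_charset(sample))
```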

The following table lists the character sets supported by Internet Explorer 5. Its columns are:

  • Charset Friendly Name. Name used to refer to the character set.
  • Preferred Charset Label. Most common identifier used to set character sets in Internet Explorer. For example, in the previous code sample the Charset Label is windows-1251. These identifiers are used for outbound data.
  • Aliases. Other identifiers that can be used to set character sets. These identifiers are used for inbound data.
  • IE Ver. Versions of Internet Explorer that support the listed character sets.
  • Min OS. Minimum operating system that supports the listed character sets.
  • Code Page. Code page that supports the listed character sets.
  • Family Code Page. Indicates a Microsoft Windows code page that is used to represent all or most of the characters in a charset.
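
The distinction between a preferred label and its aliases can be illustrated with Python's `codecs` module, which keeps its own alias table. That table is Python's, not Internet Explorer's, so the two do not necessarily agree on every alias; `same_charset` is a hypothetical helper name.

```python
import codecs


def same_charset(label_a: str, label_b: str) -> bool:
    """True if both labels resolve to the same codec in Python's alias table."""
    return codecs.lookup(label_a).name == codecs.lookup(label_b).name


# Aliases of Central European (ISO) resolve to one codec ...
print(same_charset("iso-8859-2", "latin2"))
print(same_charset("iso-8859-2", "l2"))
# ... while Central European (Windows) is a different character set.
print(same_charset("iso-8859-2", "windows-1250"))
```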

Charsets in Microsoft Internet Explorer 5

Charset Friendly Name Preferred Charset Label Aliases IE Ver Min OS Code Page Family Code Page
Arabic (ASMO 708) ASMO-708 IE5 Win95 708 1256
Arabic (DOS) DOS-720 IE5 Win95 720 1256
Arabic (ISO) iso-8859-6 arabic, csISOLatinArabic, ECMA-114, ISO_8859-6, ISO_8859-6:1987, iso-ir-127 IE5, IE4 Win95 28596 1256
Arabic (Mac) x-mac-arabic IE5 Win2000 10004 1256
Arabic (Windows) windows-1256 cp1256 IE5 Win95 1256 1256
Baltic (DOS) ibm775 CP500 IE5 Win2000 775 1257
Baltic (ISO) iso-8859-4 csISOLatin4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4, latin4 IE5 Win95 28594 1257
Baltic (Windows) windows-1257 IE5 Win95 1257 1257
Central European (DOS) ibm852 cp852 IE5, IE4 Win95 852 1250
Central European (ISO) iso-8859-2 csISOLatin2, iso_8859-2, iso_8859-2:1987, iso8859-2, iso-ir-101, l2, latin2 IE5, IE4 Win95 28592 1250
Central European (Mac) x-mac-ce IE5 Win2000 10029 1250
Central European (Windows) windows-1250 x-cp1250 IE5 Win95 1250 1250
Chinese Simplified (EUC) EUC-CN x-euc-cn IE5 Win2000 51936 936
Chinese Simplified (GB2312) gb2312 chinese, CN-GB, csGB2312, csGB231280, csISO58GB231280, GB_2312-80, GB231280, GB2312-80, GBK, iso-ir-58 IE5, IE4 Win95 936 936
Chinese Simplified (HZ) hz-gb-2312 IE5, IE4 Win95 52936 936
Chinese Simplified (Mac) x-mac-chinesesimp IE5 Win2000 10008 936
Chinese Traditional (Big5) big5 cn-big5, csbig5, x-x-big5 IE5, IE4 Win95 950 950
Chinese Traditional (CNS) x-Chinese-CNS IE5 Win2000 20000 950
Chinese Traditional (Eten) x-Chinese-Eten IE5 Win2000 20002 950
Chinese Traditional (Mac) x-mac-chinesetrad IE5 Win2000 10002 950
Cyrillic (DOS) cp866 ibm866 IE5, IE4 Win95 866 1251
Cyrillic (ISO) iso-8859-5 csISOLatin5, csISOLatinCyrillic, cyrillic, ISO_8859-5, ISO_8859-5:1988, iso-ir-144, l5 IE5, IE4 Win95 28595 1251
Cyrillic (KOI8-R) koi8-r csKOI8R, koi, koi8, koi8r IE5, IE4 Win95 20866 1251
Cyrillic (KOI8-U) koi8-u koi8-ru IE5 Win95 21866 1251
Cyrillic (Mac) x-mac-cyrillic IE5 Win2000 10007 1251
Cyrillic (Windows) windows-1251 x-cp1251 IE5 Win95 1251 1251
Europa x-Europa IE5 n.a. 29001 1252
German (IA5) x-IA5-German IE5 Win2000 20106 1252
Greek (DOS) ibm737 IE5 Win2000 737 1253
Greek (ISO) iso-8859-7 csISOLatinGreek, ECMA-118, ELOT_928, greek, greek8, ISO_8859-7, ISO_8859-7:1987, iso-ir-126 IE5, IE4 Win95 28597 1253
Greek (Mac) x-mac-greek IE5 Win2000 10006 1253
Greek (Windows) windows-1253 IE5 Win95 1253 1253
Greek, Modern (DOS) ibm869 IE5 Win2000 869 1253
Hebrew (DOS) DOS-862 IE5 Win95 862 1255
Hebrew (ISO-Logical) iso-8859-8-i logical IE5, IE4 Win95 38598 1255
Hebrew (ISO-Visual) iso-8859-8 csISOLatinHebrew, hebrew, ISO_8859-8, ISO_8859-8:1988, ISO-8859-8, iso-ir-138, visual IE5, IE4 Win95 28598 1255
Hebrew (Mac) x-mac-hebrew IE5 Win2000 10005 1255
Hebrew (Windows) windows-1255 ISO_8859-8-I, ISO-8859-8, visual IE5 Win95 1255 1255
IBM EBCDIC (Arabic) x-EBCDIC-Arabic IE5 Win2000 20420 1256
IBM EBCDIC (Cyrillic Russian) x-EBCDIC-CyrillicRussian IE5 Win2000 20880 1251
IBM EBCDIC (Cyrillic Serbian-Bulgarian) x-EBCDIC-CyrillicSerbianBulgarian IE5 Win2000 21025 1251
IBM EBCDIC (Denmark-Norway) x-EBCDIC-DenmarkNorway IE5 Win2000 20277 1252
IBM EBCDIC (Denmark-Norway-Euro) x-ebcdic-denmarknorway-euro IE5 Win2000 1142 1252
IBM EBCDIC (Finland-Sweden) x-EBCDIC-FinlandSweden IE5 Win2000 20278 1252
IBM EBCDIC (Finland-Sweden-Euro) x-ebcdic-finlandsweden-euro IE5 Win2000 1143 1252
IBM EBCDIC (Finland-Sweden-Euro) x-ebcdic-finlandsweden-euro X-EBCDIC-France IE5 Win2000 1143 1252
IBM EBCDIC (France-Euro) x-ebcdic-france-euro IE5 Win2000 1147 1252
IBM EBCDIC (Germany) x-EBCDIC-Germany IE5 Win2000 20273 1252
IBM EBCDIC (Germany-Euro) x-ebcdic-germany-euro IE5 Win2000 1141 1252
IBM EBCDIC (Greek Modern) x-EBCDIC-GreekModern IE5 Win2000 875 1253
IBM EBCDIC (Greek) x-EBCDIC-Greek IE5 Win2000 20423 1253
IBM EBCDIC (Hebrew) x-EBCDIC-Hebrew IE5 Win2000 20424 1255
IBM EBCDIC (Icelandic) x-EBCDIC-Icelandic IE5 Win2000 20871 1252
IBM EBCDIC (Icelandic-Euro) x-ebcdic-icelandic-euro IE5 Win2000 1149 1252
IBM EBCDIC (International-Euro) x-ebcdic-international-euro IE5 Win2000 1148 1252
IBM EBCDIC (Italy) x-EBCDIC-Italy IE5 Win2000 20280 1252
IBM EBCDIC (Italy-Euro) x-ebcdic-italy-euro IE5 Win2000 1144 1252
IBM EBCDIC (Japanese and Japanese Katakana) x-EBCDIC-JapaneseAndKana IE5 Win2000 50930 932
IBM EBCDIC (Japanese and Japanese-Latin) x-EBCDIC-JapaneseAndJapaneseLatin IE5 Win2000 50939 932
IBM EBCDIC (Japanese and US-Canada) x-EBCDIC-JapaneseAndUSCanada IE5 Win2000 50931 932
IBM EBCDIC (Japanese katakana) x-EBCDIC-JapaneseKatakana IE5 Win2000 20290 932
IBM EBCDIC (Korean and Korean Extended) x-EBCDIC-KoreanAndKoreanExtended IE5 Win2000 50933 949
IBM EBCDIC (Korean Extended) x-EBCDIC-KoreanExtended IE5 Win2000 20833 949
IBM EBCDIC (Multilingual Latin-2) CP870 IE5 Win2000 870 1250
IBM EBCDIC (Simplified Chinese) x-EBCDIC-SimplifiedChinese IE5 Win2000 50935 936
IBM EBCDIC (Spain) X-EBCDIC-Spain IE5 Win2000 20284 1252
IBM EBCDIC (Spain-Euro) x-ebcdic-spain-euro IE5 Win2000 1145 1252
IBM EBCDIC (Thai) x-EBCDIC-Thai IE5 Win2000 20838 874
IBM EBCDIC (Traditional Chinese) x-EBCDIC-TraditionalChinese IE5 Win2000 50937 950
IBM EBCDIC (Turkish Latin-5) CP1026 IE5 Win2000 1026 1254
IBM EBCDIC (Turkish) x-EBCDIC-Turkish IE5 Win2000 20905 1254
IBM EBCDIC (UK) x-EBCDIC-UK IE5 Win2000 20285 1252
IBM EBCDIC (UK-Euro) x-ebcdic-uk-euro IE5 Win2000 1146 1252
IBM EBCDIC (US-Canada) ebcdic-cp-us IE5 Win2000 37 1252
IBM EBCDIC (US-Canada-Euro) x-ebcdic-cp-us-euro IE5 Win2000 1140 1252
Icelandic (DOS) ibm861 IE5 Win2000 861 1252
Icelandic (Mac) x-mac-icelandic IE5 Win2000 10079 1252
ISCII Assamese x-iscii-as IE5 Win2000 57006 57006
ISCII Bengali x-iscii-be IE5 Win2000 57003 57003
ISCII Devanagari x-iscii-de IE5 Win2000 57002 57002
ISCII Gujarathi x-iscii-gu IE5 Win2000 57010 57010
ISCII Kannada x-iscii-ka IE5 Win2000 57008 57008
ISCII Malayalam x-iscii-ma IE5 Win2000 57009 57009
ISCII Oriya x-iscii-or IE5 Win2000 57007 57007
ISCII Panjabi x-iscii-pa IE5 Win2000 57011 57011
ISCII Tamil x-iscii-ta IE5 Win2000 57004 57004
ISCII Telugu x-iscii-te IE5 Win2000 57005 57005
Japanese (EUC) euc-jp csEUCPkdFmtJapanese, Extended_UNIX_Code_Packed_Format_for_Japanese, x-euc, x-euc-jp IE5, IE4 Win95 51932 932
Japanese (JIS) iso-2022-jp IE5, IE4 Win95 50220 932
Japanese (JIS-Allow 1 byte Kana - SO/SI) iso-2022-jp _iso-2022-jp$SIO IE5 Win95 50222 932
Japanese (JIS-Allow 1 byte Kana) csISO2022JP _iso-2022-jp IE5 Win95 50221 932
Japanese (Mac) x-mac-japanese IE5 Win2000 10001 932
Japanese (Shift-JIS) shift_jis csShiftJIS, csWindows31J, ms_Kanji, shift-jis, x-ms-cp932, x-sjis IE5, IE4 Win95 932 932
Korean ks_c_5601-1987 csKSC56011987, euc-kr, iso-ir-149, korean, ks_c_5601, ks_c_5601_1987, ks_c_5601-1989, KSC_5601, KSC5601 IE5 Win95 949 949
Korean (EUC) euc-kr csEUCKR IE5 Win95 51949 949
Korean (ISO) iso-2022-kr csISO2022KR IE5 Win95 50225 949
Korean (Johab) Johab IE5 Win2000 1361 1361
Korean (Mac) x-mac-korean IE5 Win2000 10003 949
Latin 3 (ISO) iso-8859-3 csISOLatin3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3, latin3 IE5, IE4 Win95 28593 1254
Latin 9 (ISO) iso-8859-15 csISOLatin9, ISO_8859-15, l9, latin9 IE5 Win95 28605 1252
Norwegian (IA5) x-IA5-Norwegian IE5 Win2000 20108 1252
OEM United States IBM437 437, cp437, csPC8, CodePage437 IE5 Win2000 437 1252
Swedish (IA5) x-IA5-Swedish IE5 Win2000 20107 1252
Thai (Windows) windows-874 DOS-874, iso-8859-11, TIS-620 IE5, IE4 Win95 874 874
Turkish (DOS) ibm857 IE5 Win2000 857 1254
Turkish (ISO) iso-8859-9 csISOLatin5, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5, latin5 IE5 Win95 28599 1254
Turkish (Mac) x-mac-turkish IE5 Win2000 10081 1254
Turkish (Windows) windows-1254 ISO_8859-9, ISO_8859-9:1989, iso-8859-9, iso-ir-148, latin5 IE5 Win95 1254 1254
Unicode unicode utf-16 IE5, IE4 Win95 1200 1200
Unicode (Big-Endian) unicodeFFFE IE5, IE4 Win95 1201 1200
Unicode (UTF-7) utf-7 csUnicode11UTF7, unicode-1-1-utf-7, x-unicode-2-0-utf-7 IE5, IE4 Win95 65000 1200
Unicode (UTF-8) utf-8 unicode-1-1-utf-8, unicode-2-0-utf-8, x-unicode-2-0-utf-8 IE5, IE4 Win95 65001 1200
US-ASCII us-ascii ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, csASCII, IBM367, ISO_646.irv:1991, ISO646-US, iso-ir-6, us IE5 Win95 20127 1252
Vietnamese (Windows) windows-1258 IE5, IE4 Win95 1258 1258
Western European (DOS) ibm850 IE5 Win2000 850 1252
Western European (IA5) x-IA5 IE5 Win2000 20105 1252
Western European (ISO) iso-8859-1 cp819, csISOLatin1, ibm819, iso_8859-1, iso_8859-1:1987, iso8859-1, iso-ir-100, l1, latin1 IE5 Win95 28591 1252
Western European (Mac) macintosh IE5 Win2000 10000 1252
Western European (Windows) Windows-1252 ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, cp819, csASCII, IBM367, ibm819, ISO_646.irv:1991, iso_8859-1, iso_8859-1:1987, ISO646-US, iso8859-1, iso-8859-1, iso-ir-100, iso-ir-6, latin1, us, us-ascii, x-ansi IE5 Win95 1252 1252
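
Why the choice of row in this table matters for diacritical marks: the same bytes give different letters under different code pages. The sketch below uses Python's built-in `cp1250` and `iso8859_2` codecs, which I assume correspond to the Central European (Windows) and Central European (ISO) rows above; the sample sentence is just a common Polish pangram.

```python
text = "zażółć gęślą jaźń"

# Bytes as a windows-1250 page would store them.
raw = text.encode("cp1250")

# Decoded with the charset that was actually used: the text survives.
right = raw.decode("cp1250")

# Decoded as iso-8859-2 instead: several Polish letters come out wrong.
wrong = raw.decode("iso8859_2")

print(right == text)
print(wrong == text)

# One byte, two different letters depending on the code page.
print(b"\xb9".decode("cp1250"), b"\xb9".decode("iso8859_2"))
```

This is the usual reason Polish text shows up with the wrong accented letters when the charset label and the actual encoding disagree.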

Internal Charsets Not for General Use

The following character sets are not for general use, so do not use them to label documents.

Charset Friendly Name Preferred Charset Label Aliases IE Ver Min OS Code Page Family Code Page
User Defined x-user-defined IE5, IE4 Win95 50000 50000
Japanese (Auto-Select) IE5, IE4 Win95 50932 932
Auto-Select IE5 Win95 50001 50001
Korean (Auto-Select) IE5, IE4 Win95 50949 949



1 comment:

Anonymous writes...

Amiable brief and this mail helped me a lot in my college assignment. Thank you for your information.