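htmlspecialchars (http://tw.php.net/manual/en/function.htmlspecialchars.php):
Converts only a handful of special characters (&, ", ', <, >), so it still works correctly even when the string's actual encoding is not specified.
In my test, the single quote (') was not converted. That is the behavior of the default flags before PHP 8.1 (ENT_COMPAT); passing ENT_QUOTES converts it as well.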
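The single-quote behavior described above can be checked with a short sketch. The input string is a made-up example, and 'UTF-8' is assumed as the encoding:

```php
<?php
// Hypothetical input containing both quote styles plus markup.
$input = "It's a \"test\" <b>&</b>";

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $compat, "\n"; // It's a &quot;test&quot; &lt;b&gt;&amp;&lt;/b&gt;
echo $quotes, "\n"; // It&#039;s a &quot;test&quot; &lt;b&gt;&amp;&lt;/b&gt;
```

Passing the flags explicitly keeps the output the same across PHP versions, since the default changed to include ENT_QUOTES in PHP 8.1.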
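htmlentities (http://tw.php.net/manual/en/function.htmlentities.php):
Converts every character that has a named HTML entity, so if the string's correct encoding is not specified, the conversion goes wrong and all of the data turns into garbled text.
Take the following example: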
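<?php
// Assumes this file is saved in Big5, matching the third argument below.
$str='<a href="test.html">htmlentities測試頁面</a>';
// htmlentities() needs the right charset to handle the Chinese text.
echo htmlentities($str, ENT_QUOTES, 'BIG5')."<p>\n\n";
$str='<a href="test.html">htmlspecialchars測試頁面</a>';
// htmlspecialchars() only touches &, ", ', < and >.
echo htmlspecialchars($str, ENT_QUOTES, 'BIG5');
?>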
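If 'BIG5' is removed, any Chinese characters in a string processed by htmlentities() turn into garbled text.
So in terms of compatibility and portability, htmlspecialchars() comes out ahead.
htmlentities() is stricter, but without the correct encoding it produces unexpected results.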
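The difference can be reproduced with UTF-8 instead of Big5. This is a minimal sketch that assumes the script file itself is saved as UTF-8; the sample string is hypothetical:

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal holds UTF-8 bytes.
$str = '<a href="test.html">café 測試</a>';

// Only the markup characters are escaped; é and the Chinese text pass through.
$special = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

// With the correct charset, é becomes &eacute; and the Chinese text is kept.
$entities = htmlentities($str, ENT_QUOTES, 'UTF-8');

// With the wrong charset, each UTF-8 byte is treated as a Latin-1 character,
// so é (bytes C3 A9) turns into &Atilde;&copy;.
$wrong = htmlentities($str, ENT_QUOTES, 'ISO-8859-1');

echo $special, "\n", $entities, "\n", $wrong, "\n";
```

The last call shows the kind of garbled output the text warns about: the function itself works, but it is decoding the bytes with the wrong character set.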
The reverse function is html_entity_decode:
echo html_entity_decode($str, ENT_QUOTES, 'BIG5')."<p>\n\n";
http://tw.php.net/html_entity_decode
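A round trip makes the relationship concrete. The sketch below uses a hypothetical UTF-8 string and passes the same flags and charset to both calls:

```php
<?php
// Hypothetical sample; the file is assumed to be saved as UTF-8.
$original = '<a href="test.html">It\'s a café</a>';

// Encode with htmlentities(), then reverse it with html_entity_decode().
$encoded = htmlentities($original, ENT_QUOTES, 'UTF-8');
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
var_dump($decoded === $original); // bool(true)
```

For strings escaped with htmlspecialchars(), the matching reverse function is htmlspecialchars_decode().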
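Most commonly used character entities:
Result | Description | Entity Name | Entity Number |
---|---|---|---|
(space) | non-breaking space | `&nbsp;` | `&#160;` |
< | less than | `&lt;` | `&#60;` |
> | greater than | `&gt;` | `&#62;` |
& | ampersand | `&amp;` | `&#38;` |
" | double quote | `&quot;` | `&#34;` |
' | single quote | `&apos;` | `&#39;` |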
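PHP can list the replacements it actually applies through get_html_translation_table(). The sketch below prints the table used by htmlspecialchars() with ENT_QUOTES and the HTML 4.01 document type; 'UTF-8' is an assumed encoding:

```php
<?php
// The table htmlspecialchars() uses with ENT_QUOTES | ENT_HTML401.
$table = get_html_translation_table(
    HTML_SPECIALCHARS,
    ENT_QUOTES | ENT_HTML401,
    'UTF-8'
);

// Print each character next to the entity PHP emits for it.
foreach ($table as $char => $entity) {
    echo $char, ' => ', $entity, "\n";
}
```

With ENT_HTML401 the single quote maps to the numeric form &#039; rather than a named entity; PHP emits the named form only for the ENT_XML1, ENT_XHTML, and ENT_HTML5 document types.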
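Other commonly used character entities:
Result | Description | Entity Name | Entity Number |
---|---|---|---|
¢ | cent | `&cent;` | `&#162;` |
£ | pound | `&pound;` | `&#163;` |
¥ | yen | `&yen;` | `&#165;` |
§ | section | `&sect;` | `&#167;` |
© | copyright | `&copy;` | `&#169;` |
® | registered trademark | `&reg;` | `&#174;` |
× | multiplication | `&times;` | `&#215;` |
÷ | division | `&divide;` | `&#247;` |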
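HTML 4.01 supports the ISO 8859-1 (Latin-1) character set.
The lower part of ISO-8859-1 (codes 0-127) is the original 7-bit ASCII standard.
The characters in the upper part of ISO-8859-1 (codes 160-255) all have entity names.
Note: entity names are case-sensitive.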
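Whether letter case matters in an entity name can be checked with html_entity_decode(), which treats names as case-sensitive. A minimal sketch, assuming UTF-8 output:

```php
<?php
// The same letters in different case name different characters.
$upper = html_entity_decode('&Agrave;', ENT_QUOTES, 'UTF-8'); // À (U+00C0)
$lower = html_entity_decode('&agrave;', ENT_QUOTES, 'UTF-8'); // à (U+00E0)

// An all-caps spelling is not a defined name, so it is left untouched.
$bogus = html_entity_decode('&AGRAVE;', ENT_QUOTES, 'UTF-8');

echo $upper, ' ', $lower, ' ', $bogus, "\n";
```

The first two calls return different characters and the third returns its input unchanged, so names must be written exactly as they appear in the tables.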
ASCII Entities with new Entity Names
Result | Description | Entity Name | Entity Number |
---|---|---|---|
" | quotation mark | `&quot;` | `&#34;` |
' | apostrophe | `&apos;` | `&#39;` |
& | ampersand | `&amp;` | `&#38;` |
< | less-than | `&lt;` | `&#60;` |
> | greater-than | `&gt;` | `&#62;` |
ISO 8859-1 Symbol Entities
Result | Description | Entity Name | Entity Number |
---|---|---|---|
(space) | non-breaking space | `&nbsp;` | `&#160;` |
¡ | inverted exclamation mark | `&iexcl;` | `&#161;` |
¤ | currency | `&curren;` | `&#164;` |
¢ | cent | `&cent;` | `&#162;` |
£ | pound | `&pound;` | `&#163;` |
¥ | yen | `&yen;` | `&#165;` |
¦ | broken vertical bar | `&brvbar;` | `&#166;` |
§ | section | `&sect;` | `&#167;` |
¨ | spacing diaeresis | `&uml;` | `&#168;` |
© | copyright | `&copy;` | `&#169;` |
ª | feminine ordinal indicator | `&ordf;` | `&#170;` |
« | angle quotation mark (left) | `&laquo;` | `&#171;` |
¬ | negation | `&not;` | `&#172;` |
(invisible) | soft hyphen | `&shy;` | `&#173;` |
® | registered trademark | `&reg;` | `&#174;` |
™ | trademark | `&trade;` | `&#8482;` |
¯ | spacing macron | `&macr;` | `&#175;` |
° | degree | `&deg;` | `&#176;` |
± | plus-or-minus | `&plusmn;` | `&#177;` |
² | superscript 2 | `&sup2;` | `&#178;` |
³ | superscript 3 | `&sup3;` | `&#179;` |
´ | spacing acute | `&acute;` | `&#180;` |
µ | micro | `&micro;` | `&#181;` |
¶ | paragraph | `&para;` | `&#182;` |
· | middle dot | `&middot;` | `&#183;` |
¸ | spacing cedilla | `&cedil;` | `&#184;` |
¹ | superscript 1 | `&sup1;` | `&#185;` |
º | masculine ordinal indicator | `&ordm;` | `&#186;` |
» | angle quotation mark (right) | `&raquo;` | `&#187;` |
¼ | fraction 1/4 | `&frac14;` | `&#188;` |
½ | fraction 1/2 | `&frac12;` | `&#189;` |
¾ | fraction 3/4 | `&frac34;` | `&#190;` |
¿ | inverted question mark | `&iquest;` | `&#191;` |
× | multiplication | `&times;` | `&#215;` |
÷ | division | `&divide;` | `&#247;` |
ISO 8859-1 Character Entities
Result | Description | Entity Name | Entity Number |
---|---|---|---|
À | capital a, grave accent | `&Agrave;` | `&#192;` |
Á | capital a, acute accent | `&Aacute;` | `&#193;` |
 | capital a, circumflex accent | `&Acirc;` | `&#194;` |
à | capital a, tilde | `&Atilde;` | `&#195;` |
Ä | capital a, umlaut mark | `&Auml;` | `&#196;` |
Å | capital a, ring | `&Aring;` | `&#197;` |
Æ | capital ae | `&AElig;` | `&#198;` |
Ç | capital c, cedilla | `&Ccedil;` | `&#199;` |
È | capital e, grave accent | `&Egrave;` | `&#200;` |
É | capital e, acute accent | `&Eacute;` | `&#201;` |
Ê | capital e, circumflex accent | `&Ecirc;` | `&#202;` |
Ë | capital e, umlaut mark | `&Euml;` | `&#203;` |
Ì | capital i, grave accent | `&Igrave;` | `&#204;` |
Í | capital i, acute accent | `&Iacute;` | `&#205;` |
Î | capital i, circumflex accent | `&Icirc;` | `&#206;` |
Ï | capital i, umlaut mark | `&Iuml;` | `&#207;` |
Ð | capital eth, Icelandic | `&ETH;` | `&#208;` |
Ñ | capital n, tilde | `&Ntilde;` | `&#209;` |
Ò | capital o, grave accent | `&Ograve;` | `&#210;` |
Ó | capital o, acute accent | `&Oacute;` | `&#211;` |
Ô | capital o, circumflex accent | `&Ocirc;` | `&#212;` |
Õ | capital o, tilde | `&Otilde;` | `&#213;` |
Ö | capital o, umlaut mark | `&Ouml;` | `&#214;` |
Ø | capital o, slash | `&Oslash;` | `&#216;` |
Ù | capital u, grave accent | `&Ugrave;` | `&#217;` |
Ú | capital u, acute accent | `&Uacute;` | `&#218;` |
Û | capital u, circumflex accent | `&Ucirc;` | `&#219;` |
Ü | capital u, umlaut mark | `&Uuml;` | `&#220;` |
Ý | capital y, acute accent | `&Yacute;` | `&#221;` |
Þ | capital THORN, Icelandic | `&THORN;` | `&#222;` |
ß | small sharp s, German | `&szlig;` | `&#223;` |
à | small a, grave accent | `&agrave;` | `&#224;` |
á | small a, acute accent | `&aacute;` | `&#225;` |
â | small a, circumflex accent | `&acirc;` | `&#226;` |
ã | small a, tilde | `&atilde;` | `&#227;` |
ä | small a, umlaut mark | `&auml;` | `&#228;` |
å | small a, ring | `&aring;` | `&#229;` |
æ | small ae | `&aelig;` | `&#230;` |
ç | small c, cedilla | `&ccedil;` | `&#231;` |
è | small e, grave accent | `&egrave;` | `&#232;` |
é | small e, acute accent | `&eacute;` | `&#233;` |
ê | small e, circumflex accent | `&ecirc;` | `&#234;` |
ë | small e, umlaut mark | `&euml;` | `&#235;` |
ì | small i, grave accent | `&igrave;` | `&#236;` |
í | small i, acute accent | `&iacute;` | `&#237;` |
î | small i, circumflex accent | `&icirc;` | `&#238;` |
ï | small i, umlaut mark | `&iuml;` | `&#239;` |
ð | small eth, Icelandic | `&eth;` | `&#240;` |
ñ | small n, tilde | `&ntilde;` | `&#241;` |
ò | small o, grave accent | `&ograve;` | `&#242;` |
ó | small o, acute accent | `&oacute;` | `&#243;` |
ô | small o, circumflex accent | `&ocirc;` | `&#244;` |
õ | small o, tilde | `&otilde;` | `&#245;` |
ö | small o, umlaut mark | `&ouml;` | `&#246;` |
ø | small o, slash | `&oslash;` | `&#248;` |
ù | small u, grave accent | `&ugrave;` | `&#249;` |
ú | small u, acute accent | `&uacute;` | `&#250;` |
û | small u, circumflex accent | `&ucirc;` | `&#251;` |
ü | small u, umlaut mark | `&uuml;` | `&#252;` |
ý | small y, acute accent | `&yacute;` | `&#253;` |
þ | small thorn, Icelandic | `&thorn;` | `&#254;` |
ÿ | small y, umlaut mark | `&yuml;` | `&#255;` |
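Some Other Entities supported by HTML
Result | Description | Entity Name | Entity Number |
---|---|---|---|
Œ | capital ligature OE | `&OElig;` | `&#338;` |
œ | small ligature oe | `&oelig;` | `&#339;` |
Š | capital S with caron | `&Scaron;` | `&#352;` |
š | small S with caron | `&scaron;` | `&#353;` |
Ÿ | capital Y with diaeresis | `&Yuml;` | `&#376;` |
ˆ | modifier letter circumflex accent | `&circ;` | `&#710;` |
˜ | small tilde | `&tilde;` | `&#732;` |
(space) | en space | `&ensp;` | `&#8194;` |
(space) | em space | `&emsp;` | `&#8195;` |
(space) | thin space | `&thinsp;` | `&#8201;` |
(invisible) | zero width non-joiner | `&zwnj;` | `&#8204;` |
(invisible) | zero width joiner | `&zwj;` | `&#8205;` |
(invisible) | left-to-right mark | `&lrm;` | `&#8206;` |
(invisible) | right-to-left mark | `&rlm;` | `&#8207;` |
– | en dash | `&ndash;` | `&#8211;` |
— | em dash | `&mdash;` | `&#8212;` |
‘ | left single quotation mark | `&lsquo;` | `&#8216;` |
’ | right single quotation mark | `&rsquo;` | `&#8217;` |
‚ | single low-9 quotation mark | `&sbquo;` | `&#8218;` |
“ | left double quotation mark | `&ldquo;` | `&#8220;` |
” | right double quotation mark | `&rdquo;` | `&#8221;` |
„ | double low-9 quotation mark | `&bdquo;` | `&#8222;` |
† | dagger | `&dagger;` | `&#8224;` |
‡ | double dagger | `&Dagger;` | `&#8225;` |
… | horizontal ellipsis | `&hellip;` | `&#8230;` |
‰ | per mille | `&permil;` | `&#8240;` |
‹ | single left-pointing angle quotation | `&lsaquo;` | `&#8249;` |
› | single right-pointing angle quotation | `&rsaquo;` | `&#8250;` |
€ | euro | `&euro;` | `&#8364;` |
™ | trademark | `&trade;` | `&#8482;` |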
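The Entity Number column is simply the character's Unicode code point in decimal, so it can be computed rather than looked up. The sketch below assumes the mbstring extension is available (mb_ord() needs PHP 7.2 or later); toNumericEntity() is a hypothetical helper name, not a built-in:

```php
<?php
// Hypothetical helper: build the numeric entity for one UTF-8 character.
function toNumericEntity(string $char): string
{
    // mb_ord() returns the Unicode code point as an integer.
    return '&#' . mb_ord($char, 'UTF-8') . ';';
}

echo toNumericEntity("\u{20AC}"), "\n"; // euro sign: &#8364;
echo toNumericEntity("\u{2122}"), "\n"; // trademark: &#8482;

// Decoding the numeric form gives the original character back.
$euro = html_entity_decode('&#8364;', ENT_QUOTES, 'UTF-8');
var_dump($euro === "\u{20AC}"); // bool(true)
```

Numeric entities work for any character, including those that have no entity name.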