[PHP] Converting text to HTML entities - 系統小雜工，專記ㄚ沙不魯的東西

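htmlspecialchars (http://tw.php.net/manual/en/function.htmlspecialchars.php):
Converts only a handful of special characters (&, <, > and the quote characters), so it kept working correctly for me even when the string's actual encoding was not specified. Note that since PHP 5.4 the default encoding is UTF-8, and input that is invalid in the given encoding returns an empty string unless ENT_IGNORE or ENT_SUBSTITUTE is set.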

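In my tests, the single quote (') was not converted. That is the behavior of the ENT_COMPAT flag, which was the default at the time; pass ENT_QUOTES to convert single quotes as well. (Since PHP 8.1 the default flags include ENT_QUOTES.)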
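To make the quote behavior concrete, here is a minimal sketch that passes the flags explicitly. The sample string and variable names are my own, and 'UTF-8' is an assumed encoding, not something from the original test.

```php
<?php
// Hypothetical sample string containing both kinds of quotes.
$str = "It's a \"test\" <b>&</b>";

// ENT_COMPAT: double quotes are converted, single quotes are left alone.
$compat = htmlspecialchars($str, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

echo $compat, "\n"; // It's a &quot;test&quot; &lt;b&gt;&amp;&lt;/b&gt;
echo $quotes, "\n"; // It&#039;s a &quot;test&quot; &lt;b&gt;&amp;&lt;/b&gt;
```

Passing the flag explicitly keeps the result the same across PHP versions, whatever the default happens to be.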

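htmlentities (http://tw.php.net/manual/en/function.htmlentities.php):
Converts every character that has a named HTML entity, so if the correct encoding of the string is not specified, the bytes are misinterpreted and the data comes out garbled.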

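Take the following example:
<?php
// Assumes this file (and therefore both strings) is saved as Big5.
$str='<a href="test.html">htmlentities測試頁面</a>';
// Encoding given explicitly, so the Chinese characters survive.
echo htmlentities($str, ENT_QUOTES, 'BIG5')."<p>\n\n";
$str='<a href="test.html">htmlspecialchars測試頁面</a>';
// Only & < > " ' are converted here.
echo htmlspecialchars($str, ENT_QUOTES, 'BIG5');
?>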
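If 'BIG5' is removed, any Chinese characters in the string processed by htmlentities() turn into garbage.
So in terms of compatibility and portability, htmlspecialchars() comes out ahead.
htmlentities() is more thorough, but without the correct encoding it gives results you did not expect.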
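The difference can be reproduced without a Big5 file. The sketch below is my own variation, not the original test: it uses UTF-8 bytes for "中文" and deliberately passes the wrong encoding ('ISO-8859-1') to show what happens when the encoding does not match the data.

```php
<?php
// "中文" written as explicit UTF-8 bytes, so the result does not depend on
// how this source file is saved.
$text = "\xE4\xB8\xAD\xE6\x96\x87";

// Correct encoding: these characters have no named entity, so both
// functions return the string unchanged.
$entitiesOk = htmlentities($text, ENT_QUOTES, 'UTF-8');
$specialOk  = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Wrong encoding (on purpose): every byte is treated as a separate
// Latin-1 character, and htmlentities() rewrites some of them as entities.
$entitiesBad = htmlentities($text, ENT_QUOTES, 'ISO-8859-1');

// htmlspecialchars() only touches & < > " ', so the bytes pass through.
$specialBad = htmlspecialchars($text, ENT_QUOTES, 'ISO-8859-1');

var_dump($entitiesOk === $text);  // bool(true)
var_dump($specialOk === $text);   // bool(true)
var_dump($entitiesBad === $text); // bool(false)
var_dump($specialBad === $text);  // bool(true)
```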

The reverse function is html_entity_decode:

echo html_entity_decode($str, ENT_QUOTES, 'BIG5')."<p>\n\n";

string html_entity_decode ( string $string [, int $quote_style = ENT_COMPAT [, string $charset = 'UTF-8' ]] )

http://tw.php.net/html_entity_decode
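A quick round trip shows the two functions undoing each other. This is a sketch under my own assumptions: the sample string is made up, and UTF-8 is used instead of Big5 so it runs regardless of how the file is saved.

```php
<?php
// Hypothetical sample; "\xC3\xA9" is "é" as UTF-8 bytes.
$original = "<a href=\"test.html\">caf\xC3\xA9 & 'bar'</a>";

// Encode, then decode with the same flags and the same encoding.
$encoded = htmlentities($original, ENT_QUOTES, 'UTF-8');
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
// &lt;a href=&quot;test.html&quot;&gt;caf&eacute; &amp; &#039;bar&#039;&lt;/a&gt;
var_dump($decoded === $original); // bool(true)
```

The flags and encoding should match on both calls; decoding with ENT_COMPAT, for instance, would leave the &#039; entities in place.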

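The most commonly used character entities:

Result	Description	Entity Name	Entity Number
	non-breaking space	&nbsp;	&#160;
<	less than	&lt;	&#60;
>	greater than	&gt;	&#62;
&	ampersand	&amp;	&#38;
"	double quote	&quot;	&#34;
'	single quote	&apos;	&#39;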

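Other commonly used character entities:

Result	Description	Entity Name	Entity Number
¢	cent	&cent;	&#162;
£	pound	&pound;	&#163;
¥	yen	&yen;	&#165;
§	section	&sect;	&#167;
©	copyright	&copy;	&#169;
®	registered trademark	&reg;	&#174;
×	multiplication	&times;	&#215;
÷	division	&divide;	&#247;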
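Rather than memorizing tables like the ones above, you can ask PHP which characters it maps to which entities. The sketch below uses get_html_translation_table(); the variable names are mine, and UTF-8 with the HTML 4.01 entity set is an assumption.

```php
<?php
// Characters handled by htmlspecialchars() with ENT_QUOTES.
$special = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Characters handled by htmlentities() (a much larger table).
$entities = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Print the small table: one "character <TAB> entity" pair per line.
foreach ($special as $char => $entity) {
    echo $char, "\t", $entity, "\n";
}

echo count($special), " vs ", count($entities), " entries\n";

// "\xC2\xA9" is the copyright sign in UTF-8.
echo $entities["\xC2\xA9"], "\n"; // &copy;
```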

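HTML 4.01 supports the ISO 8859-1 (Latin-1) character set.

The lower part of ISO-8859-1 (codes 0-127) is the original 7-bit ASCII standard.

The characters in the higher part of ISO-8859-1 (codes 160-255) all have entity names.

Note: entity names are case-sensitive.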
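Entity names are case-sensitive, which is easy to check with html_entity_decode(). A minimal sketch, assuming UTF-8 output; the variable names are mine.

```php
<?php
// &Agrave; and &agrave; are different entities (U+00C0 and U+00E0).
$upper = html_entity_decode('&Agrave;', ENT_QUOTES, 'UTF-8');
$lower = html_entity_decode('&agrave;', ENT_QUOTES, 'UTF-8');

// A name with the wrong case is not a known entity and is left as-is.
$unknown = html_entity_decode('&AGRAVE;', ENT_QUOTES, 'UTF-8');

var_dump($upper === "\xC3\x80"); // bool(true): "À" as UTF-8 bytes
var_dump($lower === "\xC3\xA0"); // bool(true): "à" as UTF-8 bytes
var_dump($unknown);              // string(8) "&AGRAVE;"
```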

ASCII Entities with new Entity Names

Result	Description	Entity Name	Entity Number
"	quotation mark	&quot;	&#34;
'	apostrophe	&apos;	&#39;
&	ampersand	&amp;	&#38;
<	less-than	&lt;	&#60;
>	greater-than	&gt;	&#62;

ISO 8859-1 Symbol Entities

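Result	Description	Entity Name	Entity Number
	non-breaking space	&nbsp;	&#160;
¡	inverted exclamation mark	&iexcl;	&#161;
¤	currency	&curren;	&#164;
¢	cent	&cent;	&#162;
£	pound	&pound;	&#163;
¥	yen	&yen;	&#165;
¦	broken vertical bar	&brvbar;	&#166;
§	section	&sect;	&#167;
¨	spacing diaeresis	&uml;	&#168;
©	copyright	&copy;	&#169;
ª	feminine ordinal indicator	&ordf;	&#170;
«	angle quotation mark (left)	&laquo;	&#171;
¬	negation	&not;	&#172;
	soft hyphen	&shy;	&#173;
®	registered trademark	&reg;	&#174;
™	trademark	&trade;	&#8482;
¯	spacing macron	&macr;	&#175;
°	degree	&deg;	&#176;
±	plus-or-minus	&plusmn;	&#177;
²	superscript 2	&sup2;	&#178;
³	superscript 3	&sup3;	&#179;
´	spacing acute	&acute;	&#180;
µ	micro	&micro;	&#181;
¶	paragraph	&para;	&#182;
·	middle dot	&middot;	&#183;
¸	spacing cedilla	&cedil;	&#184;
¹	superscript 1	&sup1;	&#185;
º	masculine ordinal indicator	&ordm;	&#186;
»	angle quotation mark (right)	&raquo;	&#187;
¼	fraction 1/4	&frac14;	&#188;
½	fraction 1/2	&frac12;	&#189;
¾	fraction 3/4	&frac34;	&#190;
¿	inverted question mark	&iquest;	&#191;
×	multiplication	&times;	&#215;
÷	division	&divide;	&#247;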

ISO 8859-1 Character Entities

Result	Description	Entity Name	Entity Number
À	capital a, grave accent	&Agrave;	&#192;
Á	capital a, acute accent	&Aacute;	&#193;
Â	capital a, circumflex accent	&Acirc;	&#194;
Ã	capital a, tilde	&Atilde;	&#195;
Ä	capital a, umlaut mark	&Auml;	&#196;
Å	capital a, ring	&Aring;	&#197;
Æ	capital ae	&AElig;	&#198;
Ç	capital c, cedilla	&Ccedil;	&#199;
È	capital e, grave accent	&Egrave;	&#200;
É	capital e, acute accent	&Eacute;	&#201;
Ê	capital e, circumflex accent	&Ecirc;	&#202;
Ë	capital e, umlaut mark	&Euml;	&#203;
Ì	capital i, grave accent	&Igrave;	&#204;
Í	capital i, acute accent	&Iacute;	&#205;
Î	capital i, circumflex accent	&Icirc;	&#206;
Ï	capital i, umlaut mark	&Iuml;	&#207;
Ð	capital eth, Icelandic	&ETH;	&#208;
Ñ	capital n, tilde	&Ntilde;	&#209;
Ò	capital o, grave accent	&Ograve;	&#210;
Ó	capital o, acute accent	&Oacute;	&#211;
Ô	capital o, circumflex accent	&Ocirc;	&#212;
Õ	capital o, tilde	&Otilde;	&#213;
Ö	capital o, umlaut mark	&Ouml;	&#214;
Ø	capital o, slash	&Oslash;	&#216;
Ù	capital u, grave accent	&Ugrave;	&#217;
Ú	capital u, acute accent	&Uacute;	&#218;
Û	capital u, circumflex accent	&Ucirc;	&#219;
Ü	capital u, umlaut mark	&Uuml;	&#220;
Ý	capital y, acute accent	&Yacute;	&#221;
Þ	capital THORN, Icelandic	&THORN;	&#222;
ß	small sharp s, German	&szlig;	&#223;
à	small a, grave accent	&agrave;	&#224;
á	small a, acute accent	&aacute;	&#225;
â	small a, circumflex accent	&acirc;	&#226;
ã	small a, tilde	&atilde;	&#227;
ä	small a, umlaut mark	&auml;	&#228;
å	small a, ring	&aring;	&#229;
æ	small ae	&aelig;	&#230;
ç	small c, cedilla	&ccedil;	&#231;
è	small e, grave accent	&egrave;	&#232;
é	small e, acute accent	&eacute;	&#233;
ê	small e, circumflex accent	&ecirc;	&#234;
ë	small e, umlaut mark	&euml;	&#235;
ì	small i, grave accent	&igrave;	&#236;
í	small i, acute accent	&iacute;	&#237;
î	small i, circumflex accent	&icirc;	&#238;
ï	small i, umlaut mark	&iuml;	&#239;
ð	small eth, Icelandic	&eth;	&#240;
ñ	small n, tilde	&ntilde;	&#241;
ò	small o, grave accent	&ograve;	&#242;
ó	small o, acute accent	&oacute;	&#243;
ô	small o, circumflex accent	&ocirc;	&#244;
õ	small o, tilde	&otilde;	&#245;
ö	small o, umlaut mark	&ouml;	&#246;
ø	small o, slash	&oslash;	&#248;
ù	small u, grave accent	&ugrave;	&#249;
ú	small u, acute accent	&uacute;	&#250;
û	small u, circumflex accent	&ucirc;	&#251;
ü	small u, umlaut mark	&uuml;	&#252;
ý	small y, acute accent	&yacute;	&#253;
þ	small thorn, Icelandic	&thorn;	&#254;
ÿ	small y, umlaut mark	&yuml;	&#255;

Some Other Entities supported by HTML

Result	Description	Entity Name	Entity Number
Œ	capital ligature OE	&OElig;	&#338;
œ	small ligature oe	&oelig;	&#339;
Š	capital S with caron	&Scaron;	&#352;
š	small s with caron	&scaron;	&#353;
Ÿ	capital Y with diaeresis	&Yuml;	&#376;
ˆ	modifier letter circumflex accent	&circ;	&#710;
˜	small tilde	&tilde;	&#732;
	en space	&ensp;	&#8194;
	em space	&emsp;	&#8195;
	thin space	&thinsp;	&#8201;
	zero width non-joiner	&zwnj;	&#8204;
	zero width joiner	&zwj;	&#8205;
	left-to-right mark	&lrm;	&#8206;
	right-to-left mark	&rlm;	&#8207;
–	en dash	&ndash;	&#8211;
—	em dash	&mdash;	&#8212;
‘	left single quotation mark	&lsquo;	&#8216;
’	right single quotation mark	&rsquo;	&#8217;
‚	single low-9 quotation mark	&sbquo;	&#8218;
“	left double quotation mark	&ldquo;	&#8220;
”	right double quotation mark	&rdquo;	&#8221;
„	double low-9 quotation mark	&bdquo;	&#8222;
†	dagger	&dagger;	&#8224;
‡	double dagger	&Dagger;	&#8225;
…	horizontal ellipsis	&hellip;	&#8230;
‰	per mille	&permil;	&#8240;
‹	single left-pointing angle quotation	&lsaquo;	&#8249;
›	single right-pointing angle quotation	&rsaquo;	&#8250;
€	euro	&euro;	&#8364;
™	trademark	&trade;	&#8482;
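The Entity Name and Entity Number columns are two spellings of the same character. As a small check, here is a sketch using the euro sign; the variable names are my own, UTF-8 is assumed, and the hexadecimal form (&#x20AC;) is not listed in the table but is the same number written in base 16.

```php
<?php
// "\xE2\x82\xAC" is the euro sign (U+20AC) as UTF-8 bytes.
$euro = "\xE2\x82\xAC";

// Named, decimal and hexadecimal forms all decode to the same character.
$fromName    = html_entity_decode('&euro;', ENT_QUOTES, 'UTF-8');
$fromDecimal = html_entity_decode('&#8364;', ENT_QUOTES, 'UTF-8');
$fromHex     = html_entity_decode('&#x20AC;', ENT_QUOTES, 'UTF-8');

var_dump($fromName === $euro, $fromDecimal === $euro, $fromHex === $euro);

// Going the other way, htmlentities() picks the named form.
echo htmlentities($euro, ENT_QUOTES, 'UTF-8'), "\n"; // &euro;
```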

Posted by mming on 痞客邦 (PIXNET)
