On 7/13/2012 2:42 PM, David Starner wrote:
> On Fri, Jul 13, 2012 at 1:29 PM, Jukka K. Korpela <jkorpela_at_cs.tut.fi> wrote:
>> 2025-08-05 22:37, David Starner wrote:
>>
>>> Wikipedia says "The Unicode standard recommends against the BOM for
>>> UTF-8." and refers to page 30 of the Unicode Standard, version 6.0,
>>> that says "Use of a BOM is neither required nor recommended for
>>> UTF-8..." Calling it a myth seems bizarre.
>> “Not recommended” is distinct from “recommends against”.
> I disagree; the meaning of the two phrases overlaps in my idiolect, and
> while it would be somewhat laconic, I might use "not recommended" to
> mean "if you insist on doing that, please give us a chance to get the
> fire extinguisher first",
I can state confidently and unequivocally that it is not used in that
sense in the standard, and reading the whole phrase makes it clear that
it is intended as a statement of neutrality on the part of the Unicode
Standard, one that respects the difference between a character
encoding and a data transmission (or file format) protocol.
>
>> A
>> more appropriate formulation would be “Use of a BOM is not required for UTF-8,
>> but may be used as a signature that indicates, with practical certainty,
>> that data is UTF-8 encoded.”
> In the environment that UTF-8 was developed for, a BOM is a nuisance;
> a BOM will stop the shell from properly interpreting a hashbang, and
> other existing programs will lose the BOM, duplicate the BOM, and
> scatter BOMs throughout files. Given the number of text-like file
> formats (like old-school PNM) and number of scripts depending on
> existing behavior, these aren't going to be changed.
I think it's the cost of doing business. Unix was successful in getting
the web to use UTF-8, rather than UTF-16 etc., as the basis for the
exchange of markup language data. In environments that are predicated
on mandatory conversion TO Unicode, not knowing whether something is
plain "text" or "utf-8" text isn't as benign as it might be in the
Unix environment. Hence the implementation of the UTF-8 BOM there.
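
To make the two positions concrete, here is a minimal Python sketch. It is
only an illustration, not anything taken from the standard, and the helper
name strip_utf8_bom is hypothetical. It shows the BOM acting as a signature
for a consumer that looks for it, and the same three bytes hiding a hashbang
from a consumer that expects "#!" at offset 0.

```python
import codecs

def strip_utf8_bom(data: bytes) -> tuple[bytes, bool]:
    """Return (payload, had_bom); hypothetical helper for illustration."""
    if data.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return data[len(codecs.BOM_UTF8):], True
    return data, False

# A shell script that some editor saved with a UTF-8 BOM in front.
with_bom = codecs.BOM_UTF8 + b"#!/bin/sh\necho hi\n"
payload, had_bom = strip_utf8_bom(with_bom)

# A consumer that checks for the signature can treat the data as UTF-8.
print(had_bom)

# A consumer that expects "#!" as the very first bytes no longer finds it
# until the BOM has been removed.
print(with_bom.startswith(b"#!"), payload.startswith(b"#!"))

# Python's "utf-8-sig" codec drops a leading BOM if one is present and
# decodes BOM-less input the same way as plain "utf-8".
print(with_bom.decode("utf-8-sig") == payload.decode("utf-8"))
```

Whether a given consumer should strip the BOM, keep it, or reject it is a
matter for the file format or protocol in question, not for the encoding.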
> As I said before, Unicode simplified but did not solve the fact that
> text from other operating systems requires some modification before
> working just right. But I don't think that Unicode should recommend
> unconditionally the UTF-8 BOM, because it is problematic in the field
> of use UTF-8 was created for and is still used for.
>
And, as you can see, Unicode, as a standard, is neutral on the issue.
For precisely that reason!
A.
Received on Fri Jul 13 2012 - 17:33:16 CDT
This archive was generated by hypermail 2.2.0 : Fri Jul 13 2012 - 17:33:28 CDT