Re: Re[2]: saving as text

From: Dom Lachowicz
Date: Wed Oct 15 2003 - 08:47:57 EDT


    Hi Alan,

    > Also I made another point: ASCII export substitutes '?'
    > for characters outside the character set, regardless of the
    > existence of standard conventions for imitating them.

    Alan, unfortunately, there are no "standard
    conventions" applicable to plaintext. SGML entity
    references are meant for exactly that: SGML.

    > You missed my point. Consider the XHTML document below. It
    > contains a couple of entities for which there are pretty
    > standard substitutes. (Normal quote for “, consecutive
    > hyphens for —)

    <!ENTITY mdash CDATA "&#8212;" -- em dash, U+2014 ISOpub -->
    <!ENTITY ldquo CDATA "&#8220;" -- left double quotation mark, U+201C ISOnum -->
    <!ENTITY rdquo CDATA "&#8221;" -- right double quotation mark, U+201D ISOnum -->

    As you can see, these don't fit into ASCII at all.
    They're fairly high in the Unicode table: roughly
    8000 code points past the first 128 or 256 that one
    could reasonably call "ASCII".
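
To put numbers on that, here is a small Python sketch (my own illustration, nothing from the AbiWord exporter) that prints the code point of each character named in the entity declarations quoted above and checks it against the 128-entry ASCII range and the 256-entry range. The names `chars`, `code_points`, `fits_ascii` and `fits_8bit` are made up for the example.

```python
# Characters named in the entity declarations quoted above.
chars = {"mdash": "\u2014", "ldquo": "\u201c", "rdquo": "\u201d"}

# Decimal code point of each character.
code_points = {name: ord(ch) for name, ch in chars.items()}


def fits_ascii(ch: str) -> bool:
    """True if the character is one of the 128 ASCII code points (0-127)."""
    return ord(ch) < 128


def fits_8bit(ch: str) -> bool:
    """True if the character is within the first 256 code points (0-255)."""
    return ord(ch) < 256


for name, ch in chars.items():
    print(f"{name}: U+{ord(ch):04X} = {ord(ch)}, "
          f"ASCII: {fits_ascii(ch)}, 8-bit: {fits_8bit(ch)}")
```

None of the three characters fits either range, so a plain ASCII export has to substitute something for each of them.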

    So, what are you asking for exactly? Saving these as
    SGML entities inside of ASCII is just plain wrong. I'm
    not sure that saving them as their rough ASCII
    equivalents is ideal behavior, but it seems more
    reasonable than the SGML entity suggestion.
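
The three options discussed in this thread can be compared with a rough Python sketch. This is only an illustration under my own assumptions, not how any exporter is actually written: `ASCII_FALLBACKS` and `to_ascii_with_fallbacks` are hypothetical names, and the table holds only the substitutes mentioned above.

```python
# Hypothetical fallback table: only the substitutes mentioned in this thread.
ASCII_FALLBACKS = {
    "\u2014": "--",  # em dash -> two hyphens
    "\u201c": '"',   # left double quotation mark -> plain quote
    "\u201d": '"',   # right double quotation mark -> plain quote
}


def to_ascii_with_fallbacks(text: str) -> str:
    """Replace non-ASCII characters with a rough equivalent, or '?' if none is known."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(ASCII_FALLBACKS.get(ch, "?"))
    return "".join(out)


sample = "\u201cHello\u201d \u2014 world"

# Option 1: substitute '?' for everything outside ASCII.
question_marks = sample.encode("ascii", errors="replace").decode("ascii")

# Option 2: rough ASCII equivalents from the table above.
rough = to_ascii_with_fallbacks(sample)

# Option 3: numeric character references, i.e. markup inside plain text.
char_refs = sample.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(question_marks)
print(rough)
print(char_refs)
```

The third form only means something to an SGML/XML parser; a reader of the plain text file just sees the literal ampersand sequences.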

    I'd personally suggest that you save these documents
    as UTF-8 encoded. Just about every text editor in the
    world worth its salt supports UTF-8 now, and it
    preserves your text in its entirety. This sounds like
    correct behavior to me.
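
As a quick check of that claim, a minimal Python sketch (again just an illustration, with a made-up sample string) shows that encoding to UTF-8 and decoding again gives back exactly the same text, at the cost of three bytes for each of these particular characters.

```python
sample = "\u201cHello\u201d \u2014 world"

# Encode to UTF-8 bytes, as a "save as UTF-8" would, then read it back.
encoded = sample.encode("utf-8")
decoded = encoded.decode("utf-8")

print(len(sample), "characters ->", len(encoded), "bytes")
print("round trip preserved text:", decoded == sample)

# The em dash alone becomes a three-byte sequence.
mdash_bytes = "\u2014".encode("utf-8")
print(mdash_bytes.hex(" "))
```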

    Best regards,

