Re: Re[2]: saving as text

From: Dom Lachowicz (domlachowicz@yahoo.com)
Date: Wed Oct 15 2003 - 08:47:57 EDT


    Hi Alan,

    > Also I made another point: ASCII export substitutes '?'
    > for characters outside the character set, regardless of
    > the existence of standard conventions for imitating them.

    Alan, unfortunately, there are no "standard
    conventions" applicable to plaintext. SGML entity
    references are meant for exactly that - SGML
    documents.

    > You missed my point. Consider the xhtml document below.
    > It contains a couple of entities for which there are
    > pretty standard substitutes. (A normal quote for “,
    > consecutive hyphens for —.)

    <!ENTITY mdash CDATA "&#8212;" -- em dash, U+2014 ISOpub -->
    <!ENTITY ldquo CDATA "&#8220;" -- left double quotation mark, U+201C ISOnum -->
    <!ENTITY rdquo CDATA "&#8221;" -- right double quotation mark, U+201D ISOnum -->

    As you can see, these don't fit into ASCII at all.
    They're fairly high in the Unicode table: code points
    8212, 8220 and 8221, roughly 8000 entries past the
    first 128 (ASCII proper) or 256 (Latin-1) that one
    could reasonably call "ASCII".
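
    A quick way to check where these characters sit is a few
    lines of Python. This is only a sketch for illustration; the
    describe() helper is a made-up name, not anything from AbiWord.

```python
import unicodedata

# The three characters behind &mdash;, &ldquo; and &rdquo;.
chars = {"mdash": "\u2014", "ldquo": "\u201c", "rdquo": "\u201d"}

def describe(ch):
    """Return (code point, Unicode name, whether it fits in 7-bit ASCII)."""
    code = ord(ch)
    return code, unicodedata.name(ch), code < 128

for entity, ch in chars.items():
    code, name, fits = describe(ch)
    print(f"&{entity}; U+{code:04X} ({code}) {name} fits_in_ascii={fits}")
```

    None of the three reports as fitting in ASCII, which is why a
    plain ASCII export has to substitute something for them.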

    So, what are you asking for exactly? Saving these as
    SGML entities inside of ASCII is just plain wrong. I'm
    not sure that saving them as their rough ASCII
    equivalents is ideal behavior, but it seems more
    reasonable than the SGML entity suggestion.
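
    To make the two behaviors concrete, here is a rough Python
    sketch comparing blanket '?' substitution with "rough ASCII
    equivalents". The ASCII_FALLBACKS table and the to_ascii()
    name are hypothetical and purely illustrative; this is not
    how AbiWord's exporter is implemented.

```python
# Hypothetical substitution table; the choices are illustrative only.
ASCII_FALLBACKS = {
    "\u2014": "--",  # em dash -> two hyphens
    "\u201c": '"',   # left double quotation mark -> plain quote
    "\u201d": '"',   # right double quotation mark -> plain quote
}

def to_ascii(text, fallbacks=ASCII_FALLBACKS):
    """Swap known characters for ASCII stand-ins, then use '?' for the rest."""
    for ch, replacement in fallbacks.items():
        text = text.replace(ch, replacement)
    # Anything still outside ASCII becomes '?'.
    return text.encode("ascii", errors="replace").decode("ascii")

sample = "\u201cHello\u201d \u2014 world"

# Blanket substitution: every non-ASCII character turns into '?'.
print(sample.encode("ascii", errors="replace").decode("ascii"))

# Rough equivalents first, '?' only as a last resort.
print(to_ascii(sample))
```

    Either way the result is lossy: once the file is written there
    is no telling whether a '"' was originally a curly quote.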

    I'd personally suggest that you save these documents
    as UTF-8 encoded. Just about every text editor in the
    world worth its salt supports UTF-8 now, and it
    preserves your text in its entirety. This sounds like
    correct behavior to me.
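
    As a small illustration of "preserves your text in its
    entirety", the Python sketch below round-trips a sample string
    through UTF-8. The sample text is made up for the example.

```python
sample = "\u201cHello\u201d \u2014 world"

# Encode to UTF-8 bytes, as a "save as UTF-8" would write them to disk.
encoded = sample.encode("utf-8")

# Decoding gives back exactly the original characters.
decoded = encoded.decode("utf-8")
print(decoded == sample)

# ASCII characters stay one byte each; the quotes and dash take more.
print(len(sample), "characters,", len(encoded), "bytes")
```

    The cost is a few extra bytes for each non-ASCII character, and
    nothing is substituted or lost.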

    Best regards,
    Dom



