Re: bafna_p - r31386 - abiword/branches/gsoc2012math/plugins/mathview/xp

From: Prashant Bafna <appu.bafna_at_gmail.com>
Date: Fri Jun 15 2012 - 04:22:47 CEST

>> I don't like that. No xml file should have any character before the
>> first '<'. Either the file is borked, or is not valid xml, or the stream
>> gets corrupted.
>
> I thought you could have the BOM marker? In that case it should be
> identified as such. If not, just ignore. Jean is right anyway.
>
>
> BTW, I'd like to have us (the AbiWord developers) stop writing XML by
> hand and use libxml xmlWriter instead.
>
> But that's a totally different story.
>

Indeed, the error was caused by the byte order mark. Although the
Unicode standard neither requires nor recommends a BOM for UTF-8, many
Windows programs (Notepad, Visual C++, etc.) add the bytes 0xEF, 0xBB,
0xBF at the start of any document saved as UTF-8. Those bytes in turn
end up at the start of the MathML, causing gtkmathview to fail.

I'll revert the earlier commit and identify the BOM as such with
something like:

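// Skip a UTF-8 BOM (EF BB BF) only at the very start of the stream;
// the same byte values elsewhere belong to real characters.
static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
unsigned int matched = 0;       // leading bytes matching the BOM so far
bool atStart = true;
while (pStream->getChar(c))
        {
                uc = static_cast<unsigned char>(c);
                if (atStart && uc == bom[matched])
                {
                        if (++matched == 3) atStart = false;    // whole BOM seen: drop it
                        continue;
                }
                if (atStart && matched > 0)
                        BB.append(bom, matched);        // partial match only: keep those bytes
                atStart = false;
                BB.append(&uc, 1);
        }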
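
To show the idea outside the importer, here is a minimal standalone
sketch that works on a plain std::string instead of the AbiWord stream
and byte buffer. The helper names utf8BomLength and stripUtf8Bom are
hypothetical and not part of AbiWord or gtkmathview. The point is that
only a leading EF BB BF sequence is removed. Bytes 0xBB and 0xBF also
occur as UTF-8 continuation bytes (for example, U+00BB is encoded as
C2 BB), so they have to be left alone anywhere else in the data.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an AbiWord API): returns the number of leading
// bytes occupied by a UTF-8 BOM, i.e. 3 if data starts with EF BB BF, else 0.
std::size_t utf8BomLength(const std::string& data)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    if (data.size() < 3)
        return 0;
    for (std::size_t i = 0; i < 3; ++i)
    {
        if (static_cast<unsigned char>(data[i]) != bom[i])
            return 0;
    }
    return 3;
}

// Returns a copy of data without a leading BOM. All other bytes are kept,
// including any EF, BB or BF bytes that appear later in the buffer.
std::string stripUtf8Bom(const std::string& data)
{
    return data.substr(utf8BomLength(data));
}
```

With this, a buffer that starts with the BOM followed by the MathML
comes back starting at the first '<', and a buffer without a BOM is
returned unchanged.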

Thanks,
Prashant
Received on Fri Jun 15 04:22:56 2012