'Appears to be bogus or invalid document' message on .abw files

From: Michael Teichgräber (gnubert@web.de)
Date: Thu Jul 11 2002 - 13:24:22 EDT

    Hi,

    I have looked at problems with two .abw files that gave me `AbiWord
    cannot open foo.abw. It appears to be a bogus or invalid document'
    messages.

    The first document was created by importing a .doc file and saving it
    as .abw with AbiWord 0.99.3. In detail, this (shortened) .abw file
    produces the error message:

    ---------------------------------------->8----------
    <?xml version="1.0"?>
    <!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 1.0 Strict//EN" "http://www.abisource.com/awml.dtd">
    <abiword xmlns="http://www.abisource.com/awml.dtd" xmlns:awml="http://www.abisource.com/awml.dtd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.99.3" fileformat="1.0" styles="unlocked">
    <!-- ===================================================================== -->
    <!-- This file is an AbiWord document. -->
    <!-- AbiWord is a free, Open Source word processor. -->
    <!-- You may obtain more information about AbiWord at www.abisource.com -->
    <!-- You should not edit this file by hand. -->
    <!-- ===================================================================== -->

    <pagesize pagetype="Letter" orientation="portrait" width="8.500000" height="11.000000" units="in" page-scale="1.000000"/>
    <section props="page-margin-right:1.2500in; section-restart-value:1; section-space-after:0.0000in; page-margin-header:0.5000in; page-margin-left:1.2500in; page-margin-footer:0.5000in; page-margin-top:1.0000in; page-margin-bottom:1.0000in">
    <p props="text-align:left; line-height:1.5; keep-with-next:yes"><c props="lang:de-DE; font-weight:bold; Ñÿ¿ð‘p@ð‘p@.font-family:Arial">A headline</c></p>
    </section>
    </abiword>
    --------8<------------------------------------------

    The significant position is `<p... Ñÿ¿ð‘p@ð‘p@.font-family:Arial ..'
                                       ^^^^^^^^^^^^^^^^^^

    When I remove the part before `font-family:', AbiWord is able to open
    the document.
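
    One way to narrow this down is to check whether the stray bytes form
    valid UTF-8, which is the encoding an XML 1.0 parser assumes when the
    file has neither an encoding declaration nor a byte order mark. The
    following is a minimal sketch, not AbiWord's own code; the function
    name `find_invalid_utf8' and the sample bytes are assumptions made up
    for illustration, and the real file may contain different bytes.

```python
def find_invalid_utf8(data: bytes):
    """Return (line, column, byte_offset) of the first byte sequence that
    is not valid UTF-8, or None if the whole buffer decodes cleanly.
    Line and column are 1-based; byte_offset is 0-based."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        line_start = data.rfind(b"\n", 0, err.start) + 1
        line = data.count(b"\n", 0, err.start) + 1
        column = err.start - line_start + 1
        return line, column, err.start
    return None


# Hypothetical sample: a few non-UTF-8 bytes in front of `font-family',
# loosely modelled on the damaged props attribute quoted above.
sample = b'<c props="font-weight:bold; \xd1\xff\xbf.font-family:Arial">A headline</c>'
print(find_invalid_utf8(sample))
```

    To check a real document, read it in binary mode and pass the bytes
    in, for example find_invalid_utf8(open("foo.abw", "rb").read()).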

    The second document was created by typing plain text into AbiWord
    and saving it as .abw; the following file is an example:

    ---------------------------------------->8----------
    <?xml version="1.0"?>
    <!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 1.0 Strict//EN" "http://www.abisource.com/awml.dtd">
    <abiword xmlns="http://www.abisource.com/awml.dtd" xmlns:awml="http://www.abisource.com/awml.dtd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.99.3" fileformat="1.0" styles="unlocked">
    <!-- ===================================================================== -->
    <!-- This file is an AbiWord document. -->
    <!-- AbiWord is a free, Open Source word processor. -->
    <!-- You may obtain more information about AbiWord at www.abisource.com -->
    <!-- You should not edit this file by hand. -->
    <!-- ===================================================================== -->

    <styles>
    <s type="P" name="Normal" basedon="" followedby="Current Settings" props="font-family:Times New Roman; margin-top:0pt; font-variant:normal; margin-left:0pt; text-indent:0in; widows:2; font-style:normal; font-weight:normal; text-decoration:none; color:000000; line-height:1.0; text-align:left; margin-bottom:0pt; text-position:normal; margin-right:0pt; bgcolor:transparent; font-size:12pt; font-stretch:normal"/>
    </styles>
    <pagesize pagetype="A4" orientation="portrait" width="210.000000" height="297.000000" units="mm" page-scale="1.000000"/>
    <section props="page-margin-footer:0.5in; page-margin-header:0.5in">
    <p style="Normal"><c props="lang:de-DE">A normal Line</c></p>
    </section>
    </abiword>
    --------8<------------------------------------------

    Here, the character with ASCII value 31 at `A normal Line' makes
                                                     ^
    AbiWord show the error message mentioned above. In the (longer)
    original text there were several of these characters. It looks to me
    like a hyphenation permission mark, but I don't know whether AbiWord
    inserted it or whether it was inserted by whoever edited the text (if
    that is technically possible); some passages were probably copied
    by X Window drag-and-drop from a web browser.

    As in the first case, AbiWord is able to read the file once the
    characters with ASCII value 31 (octal 037) are removed.
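
    XML 1.0 does not allow control characters below 0x20 in a document,
    apart from tab, line feed and carriage return, so a character with
    value 31 (0x1F) makes the file not well-formed. The following is a
    minimal sketch for locating and removing such characters. The names
    `find_forbidden_chars' and `strip_forbidden_chars' are hypothetical,
    and the position of the 0x1F character in the sample line is an
    assumption.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_FORBIDDEN = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_forbidden_chars(text):
    """Return a list of (line, column, code point) for every character
    that XML 1.0 does not allow. Line and column are 1-based."""
    hits = []
    # split("\n") rather than splitlines(), because splitlines() also
    # breaks on some of the control characters we are looking for.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for match in _FORBIDDEN.finditer(line):
            hits.append((lineno, match.start() + 1, ord(match.group())))
    return hits


def strip_forbidden_chars(text):
    """Return the text with all forbidden characters removed."""
    return _FORBIDDEN.sub("", text)


# Hypothetical sample; where exactly the 0x1F sat is assumed.
sample = '<p style="Normal"><c props="lang:de-DE">A nor\x1fmal Line</c></p>\n'
for lineno, column, code in find_forbidden_chars(sample):
    print(f"line {lineno}, column {column}: U+{code:04X}")
```

    Running strip_forbidden_chars over the file contents and saving the
    result does the same job as removing the characters by hand.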

    I don't know whether this behaviour should be considered a bug;
    perhaps the current version of AbiWord would no longer create such
    invalid files. Do you have an idea why AbiWord fails to read these
    lines, or why it saved files containing these `invalid' characters?

    As Tim mentioned in http://bugzilla.abisource.com/show_bug.cgi?id=1665,
    it would help if AbiWord reported the position at which it found the
    invalid content.
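
    Until then, an external XML parser can report the position. The
    following is a minimal sketch using Python's expat binding as a
    stand-in; which parser AbiWord itself uses is not stated here, and
    the name `locate_xml_error' and the sample document are hypothetical.

```python
import xml.parsers.expat


def locate_xml_error(data: bytes):
    """Parse the document and return (line, column, message) for the
    first well-formedness error, or None if parsing succeeds.
    expat counts lines from 1 and columns from 0."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


# Hypothetical shortened document with a 0x1F character in the text.
sample = (
    b'<?xml version="1.0"?>\n'
    b"<abiword>\n"
    b"<p>A nor\x1fmal Line</p>\n"
    b"</abiword>\n"
)
print(locate_xml_error(sample))
```

    For a real file, pass in open("foo.abw", "rb").read(); the reported
    line is the one to inspect in a text editor.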

    -- 
    Michael