From: Dom Lachowicz (email@example.com)
Date: Fri Mar 01 2002 - 18:12:56 GMT
> This looks to me like this fix was in the non-plugin code, which I
> believe Dom specified was deprecated. Perhaps you want to add the fix
> in the plugin code?
> Does anyone else see the fact that we have two different sections of
> code in development for the same feature as confusing, misleading, and
The plugin code is being maintained independently from the other HTML
importer. There are some reasons and advantages and disadvantages for
1) Plugin handles HTML, non-well-formed HTML, and XHTML, not just XHTML.
2) Plugin is theoretically more robust.
3) Plugin adds a libxml2 dependency. XHTML importer does not.
4) other stuff that I'm forgetting
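To make point 1 concrete, here is a rough sketch of why a strict XML-style parser and a lenient HTML parser behave differently on sloppy markup. It is not AbiWord code and does not use libxml2; Python's standard xml.etree and html.parser modules stand in for the two approaches, and the sample strings and helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: unclosed <p> and <br>, unquoted attribute value.
SLOPPY = "<html><body><p align=center>Hello<br>world</body></html>"
# The same content written as well-formed XHTML.
STRICT = '<html><body><p align="center">Hello<br/>world</p></body></html>'


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Lenient HTML parser that only gathers text content."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_via_html_parser(markup):
    """Feed markup to the lenient parser and return the collected text."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join(collector.chunks)


if __name__ == "__main__":
    for name, markup in (("strict", STRICT), ("sloppy", SLOPPY)):
        print(name, parses_as_xml(markup), repr(text_via_html_parser(markup)))
```

The strict parser only accepts the well-formed input, while the lenient one recovers the text from both, which is roughly the trade-off between an XHTML-only importer and one built on a forgiving HTML parser.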
So theoretically, the HTML plugin is more robust and can handle more
kinds of inputs. The two importers scratch different itches for
different people. The HTML importer doesn't work on Win32 (though it
probably could be made to), which is a definite drawback.
I don't see these as the same feature but as distinct ones. It just so
happens that XHTML is derived from HTML, so there could theoretically be
some overlap in code. In practice there isn't much.
Honestly, no one has worked on the XHTML importer in quite some time,
and it's showing its age. If Hub or whoever would like to get it
working to fix bug 1406, so be it. It's their time, not mine.
Hopefully after 1.0 we can put our full concentration into the HTML
importer plugin and making that work on a large number of documents and
platforms, but that is by no means a guarantee.
Ideally, the XHTML importer will be deprecated and scrapped post-1.0
when we move most of the import/export architecture over to using
plugins. But for now, it stays.
We're not Mozilla. A real HTML importer is a *ton* of work if we're to
do even a passable job. For now I'm willing to accept a
less-than-perfect implementation, especially considering the load of
crap that is the HTML standard and the larger load of crap that
represents all of the malformed documents on the internet.