a 'mooh' point

clearly an IBM drone

OOXML is defective #2 (depends on proprietary technologies)

A standard is not "free enough" if implementing it depends on the existence of a proprietary technology on a specific platform. Ideally it should be possible simply to buy the specification and implement it without any other financial requirements.

This is where OOXML fails.

OOXML depends heavily on "Object Linking and Embedding Technology", also known as "OLE technology". Section 9.3.3 of the specification deals with how objects are embedded in the file format. The section is divided in two, and the first part specifies how to embed documents otherwise defined in this standard. These documents are defined as

  • Formulas
  • Charts
  • Spreadsheets
  • Text documents
  • Drawings
  • Presentations

This is one of the clear cases where it is obvious that Microsoft continuously tries to preserve its main cash cow: the Microsoft Office ecosystem! OOXML not only depends on Microsoft's proprietary OLE technology; the specification itself also makes it easier to embed its own "cousins" than any other file format. Talk about "first class citizens" of OOXML!

The section goes on telling us about binary objects:

Objects that do not have an XML representation. These objects only have a binary representation [...] (see [OLE]).

WTF? Once again a reference to, and a requirement to use, proprietary technologies like OLE! What if I want to embed my own JLSObjectType? What if I want to embed some object from the Linux world, like Bonobo elements or KParts? The schema elements only emphasize my point:

<draw:object/> and <draw:object-ole/>

Are you also puzzled by this? Well, I don't blame you. To wrap up: we can embed "our own documents" and we can embed everything else. There are even two separate elements from the draw namespace that specify this for us: <draw:object/> and <draw:object-ole/>. The entire schema fragment is included here for your pleasure:

<define name="draw-object">
    <element name="draw:object">
        <ref name="draw-object-attlist"/>
        <choice>
            <ref name="common-draw-data-attlist"/>
            <ref name="office-document"/>
            <ref name="math-math"/>
        </choice>
    </element>
</define>

<define name="draw-object-ole">
    <element name="draw:object-ole">
        <ref name="draw-object-ole-attlist"/>
        <choice>
            <ref name="common-draw-data-attlist"/>
            <ref name="office-binary-data"/>
        </choice>
    </element>
</define>

This is yet another example of Microsoft on one hand claiming "openness" and with the other hand forcing everyone to use their own proprietary, undocumented technology.
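To make the two-element split concrete, here is a minimal Python sketch of how a consumer might tell the two kinds of embedded object apart. The namespace URI for the draw prefix, the wrapper element and the href values are placeholders I made up for illustration; only the element names draw:object and draw:object-ole and the xlink:href attribute come from the text quoted in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the "draw" prefix (an assumption, not from the spec).
DRAW_NS = "urn:example:draw"
# The standard W3C XLink namespace.
XLINK_NS = "http://www.w3.org/1999/xlink"

# Made-up sample content: one XML-backed object and one binary-only object.
SAMPLE = f"""
<doc xmlns:draw="{DRAW_NS}" xmlns:xlink="{XLINK_NS}">
  <draw:frame><draw:object xlink:href="./Object 1"/></draw:frame>
  <draw:frame><draw:object-ole xlink:href="./Object 2"/></draw:frame>
</doc>
"""


def list_embedded_objects(xml_text):
    """Return (kind, href) pairs for every embedded object in document order."""
    kinds = {
        f"{{{DRAW_NS}}}object": "xml",
        f"{{{DRAW_NS}}}object-ole": "binary",
    }
    found = []
    for element in ET.fromstring(xml_text).iter():
        kind = kinds.get(element.tag)
        if kind is not None:
            found.append((kind, element.get(f"{{{XLINK_NS}}}href")))
    return found


for kind, href in list_embedded_objects(SAMPLE):
    print(kind, href)
```

Note that the sketch can only label an object as "xml" or "binary". It says nothing about what the binary blob actually is, which is exactly the complaint above.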

But we're not done:

The embedded object is referenced through an XLink attribute in the enclosing frame-element. The behaviour is described as (bold typeface is my addition, /JLS):

The xlink:href attribute links to the object representation, as follows:

  • For objects that have an [OO]XML representation, the link references the sub package of the object. The object is contained within this sub page exactly as it would as it is a document of its own.
  • For objects that do not have an XML representation, the link references a sub stream of the package that contains the binary representation of the object.
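One way to read the two bullets above, sketched in Python: a "sub package" is a folder inside the Zip archive and a "sub stream" is a single entry. This reading is my assumption, since the quoted text does not spell it out, and every entry name in the demo package is made up.

```python
import io
import zipfile


def classify_href(package, href):
    """Return 'sub package', 'sub stream' or 'missing' for a package-relative href."""
    # Treat "./Object 1" and "Object 1" alike, and ignore a trailing slash.
    path = href[2:] if href.startswith("./") else href
    path = path.rstrip("/")
    names = package.namelist()
    # Assumed meaning of "sub package": other entries live beneath "<path>/".
    if any(name.startswith(path + "/") for name in names):
        return "sub package"
    # Assumed meaning of "sub stream": one entry stored under exactly that name.
    if path in names:
        return "sub stream"
    return "missing"


def build_demo_package():
    """Build a tiny in-memory package; all entry names here are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", "<root/>")
        archive.writestr("Object 1/content.xml", "<root/>")
        archive.writestr("binary/blob.bin", b"\x01\x02\x03")
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


demo = build_demo_package()
print(classify_href(demo, "./Object 1"))
print(classify_href(demo, "./binary/blob.bin"))
print(classify_href(demo, "./Object 9"))
```

The resolver can find the bytes, but it has no way of knowing how to interpret a sub stream once it has it.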

Wow - wait a minute: Is this it? Don't you think a bit of clarification would be in order?

The file format for the physical file is a Zip archive with a number of files and folders in it. But this archive also contains a "TOC" list of the files and the mime type of the entire package. The latter is not an XML file, so where do I put it? Where do I put the TOC file? What if my spreadsheet contains an image? Since the image is not in XML format (it's binary) ... would my entire spreadsheet qualify as having "an XML representation"? And did you notice the part "the link references a sub stream of the package that contains the binary representation of the object."? A stream? Binary representation? Again totally unspecified behaviour, and no one apart from Microsoft and Microsoft Office 2007 will ever be able to implement this.
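The "does it have an XML representation?" question can be poked at with a few lines of Python. This is only a sketch: the entry names (mimetype, toc.xml, content.xml, images/picture.png) and the mime type string are invented for the demo, and "parses as XML" is my stand-in for a definition the quoted text never gives.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def describe_entries(package):
    """Map each entry name to 'xml' or 'not xml' by trying to parse its bytes."""
    report = {}
    for name in package.namelist():
        try:
            ET.fromstring(package.read(name))
            report[name] = "xml"
        except ET.ParseError:
            report[name] = "not xml"
    return report


def build_demo_package():
    """Build a tiny in-memory package; all names and contents are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/x-example")
        archive.writestr("toc.xml", "<toc><entry path='content.xml'/></toc>")
        archive.writestr("content.xml", "<sheet><image href='images/picture.png'/></sheet>")
        # Only the first bytes of a PNG signature, enough to be clearly non-XML.
        archive.writestr("images/picture.png", b"\x89PNG\r\n")
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


report = describe_entries(build_demo_package())
for name, kind in report.items():
    print(name, "->", kind)
```

Even this toy package mixes XML and non-XML entries, so whether the package as a whole counts as having "an XML representation" depends entirely on a definition the reader has to guess.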

Microsoft had a good chance to specify this properly in the beginning. They could have made an open format to enable competition or a format that would stifle competition. So what does Microsoft do? Yup, the anti-competitive choice. Anyone surprised?

Comments (14) -

Amusing. If I had not read the ODF spec, it would have fooled me indeed.

The feeds to your blog are all coming in duplicated -- two entries for every post.  (At least on my PC, with IE7.  No other blog that I subscribe to has this problem.)

You might want to look into this.

P.S.  If they give out Nobel prizes for irony, I'd nominate you.

I understand what you're doing with these "OOXML is defective" posts, in that you're actually talking about ODF, but what's to prevent anti-OOXML forces from citing these posts at the upcoming BRM?  BRM attendees might just look at the title or merely skim the content and assume that OOXML is the defective spec, and thus reject OOXML.  Seems like this is a dangerous game to be playing.

I think Jesper is actually attending the BRM.

It would be funny if a BRM attendee were to cite these blog posts as issues relating to OOXML whilst the article actually contains quoted references from the actual ODF ISO spec.

Hi Ian,

I have noticed this as well - but I have not been able to track down the cause of the issue. One possible thing could be that I sometimes modify an article after its initial publication (I did this with at least the latest article in the "OOXML is defective ..." -series) and maybe this triggers it. Blogs are a dynamic thing so when friends and contacts tell me that they didn't quite get some point, I sometimes modify the article to make my point clearer.

I don't use IE7 for RSS myself (I use Firefox, where I don't experience the problem) but the problems are also present when I use Google Reader.


Well, you are not the first one to point this out to me and I am aware of the inherent "danger". However, I am willing to take the risk. It seems like the "major battle arenas" in the OOXML/ODF discussion are visited by both sides, so I am sure that if one of my articles gets cited when attacking OOXML, it will get noticed.

... but thanks for your concern :)

Hi hAl,

Yes, I will go to the "High-school reunion" in Geneva. But I actually think it would be really, really sad if someone would cite these articles as OOXML flaws. The BRM is a technical gathering with technical people attending, so you could say that anyone citing these articles should not have gone there in the first place. It's not rocket science, really.



And so it happened ...


It doesn't matter much, though ... who uses Usenet anyways?


Jesper wrote:

...who uses Usenet anyways?

Only the dinosaurs. ;)

Well, don't get me wrong - Usenet is a great tool for exchanging information, and the number of posts I have made on Usenet still exceeds the number of blog posts and comments I have ever made by several orders of magnitude (maybe even 10 times).

The trick is that I would like to respond to the guy referencing my article ... but that requires a Usenet reader ... or at least configuration of one. I will have to tweak Thunderbird to do it when I come home :)

If anyone can beat me to it - feel free to answer him and tell him that he should check his sources a bit better.

I know, I still use Usenet a lot. I mostly use it for the groups in the microsoft.public.* hierarchies. Lots of helpful people. These groups are of course also available on Microsoft's web pages, but it's so much easier to read through a dedicated news client. The web interface on Google Groups is no good at all.

Well, the interface I had to Usenet at work was not so good, so I couldn't get a proper overview of what kind of poster Erik Funkenbusch is. Having looked at him and his posting details in more detail, I actually think Erik is very much aware of the irony of this article and is probably messin' with da minds of the angry mob at comp.os.linux.advocacy.



Yes, I noticed after I posted a reply... :)


The feeds to your blog are all coming in duplicated -- two entries for every post. (At least on my PC, with IE7. No other blog that I subscribe to has this problem.)

I made the post "Santa Claus is coming to town" and it has not appeared twice in my feed on Google Reader yet - is it the same on your side?

If it is the case, I think I have the answer to why they appear twice: IE7 (and Google Reader) apparently check whether the posts have been modified ... and I have not modified the post after its initial publication.

