a 'mooh' point

clearly an IBM drone

OOXML is defective #2 (depends on proprietary technologies)

A standard is not "free enough" if implementation of it depends on the existence of a proprietary technology on the specific platform. Ideally it should be possible simply to buy the specification and implement it without any other financial requirements.

This is where OOXML fails.

OOXML heavily depends on "Object Linking and Embedding Technology", also known as "OLE-technology". Section 9.3.3 of the specification deals with how objects are embedded in the file format. The section is divided in two, where the first part specifies how to embed documents otherwise defined in this standard. These documents are defined as

  • Formulas
  • Charts
  • Spreadsheets
  • Text documents
  • Drawings
  • Presentations

This is one of the clear cases where it is obvious that Microsoft continuously tries to preserve their main cash-cow: the Microsoft Office ecosystem! OOXML not only depends on Microsoft's proprietary technology OLE, the specification itself also makes it easier to embed its own "cousins" than any other file format. Talk about "first class citizens" of OOXML!

The section goes on telling us about binary objects:

Objects that do not have an XML representation. These objects only have a binary representation [...] (see [OLE]).

WTF? Once again a reference and requirement to use proprietary technologies like OLE! What if I want to embed my own JLSObjectType? What if I want to embed some object from the Linux-world like Bonobo-elements or KParts? The schema-elements only emphasize my point:

<draw:object/> and <draw:object-ole/>

Are you also puzzled by this? Well, I don't blame you. To wrap up - we can embed "our own documents" and we can embed everything else. There are even two separate elements from the draw-namespace that specify this for us: <draw:object/> and <draw:object-ole/>. The entire schema-fragment is included here for your pleasure:

<define name="draw-object">
    <element name="draw:object">
        <ref name="draw-object-attlist"/>
        <choice>
            <ref name="common-draw-data-attlist"/>
            <ref name="office-document"/>
            <ref name="math-math"/>
        </choice>
    </element>
</define>

<define name="draw-object-ole">
    <element name="draw:object-ole">
        <ref name="draw-object-ole-attlist"/>
        <choice>
            <ref name="common-draw-data-attlist"/>
            <ref name="office-binary-data"/>
        </choice>
    </element>
</define>

This is yet another example of Microsoft on one hand claiming "openness" and with the other hand forcing everyone to use their own proprietary, undocumented technology.

But we're not done:

The embedded object is referenced through an XLink attribute in the enclosing frame-element. The behaviour is described as (bold typeface is my addition, /JLS):

The xlink:href attribute links to the object representation, as follows:

  • For objects that have an [OO]XML representation, the link references the sub package of the object. The object is contained within this sub page exactly as it would as it is a document of its own.
  • For objects that do not have an XML representation, the link references a sub stream of the package that contains the binary representation of the object.

Wow - wait a minute: Is this it? Don't you think a bit of clarification would be in order?

The file format for the physical file is a Zip-archive with a number of files and folders in it. But this archive also contains a "TOC"-list of the files and the mime type of the entire package. The latter is not an XML-file - where do I put this? Where do I put the TOC-file? What if my spreadsheet contains an image? Since the image is not in XML-format (it's binary) ... would my entire spreadsheet qualify as having "an XML representation"? And did you notice the part "the link references a sub stream of the package that contains the binary representation of the object."? A stream? Binary representation? Again totally unspecified behaviour, and no one will ever be able to implement this apart from Microsoft and Microsoft Office 2007.

Microsoft had a good chance to specify this properly in the beginning. They could have made an open format to enable competition or a format that would stifle competition. So what does Microsoft do? Yup, the anti-competitive choice. Anyone surprised?

Interoperability - between what?

What is interoperability, really?

Well, when it comes to document formats, some people seem to think that interoperability is the ability to transform one format to another, and that high-fidelity interoperability can only be achieved when it is possible to perform a complete translation/conversion of format X to format Y.

The basic problem with this premise is that if you were able to do this conversion, it would be the same as being able to make a 1-1 mapping between the functionality and features of format X and format Y (and vice versa). However - this effectively means that format X is actually just a permutation of format Y ... making format X and format Y the same format (pick up your favorite book on mathematical topology to see the details).

When it comes to ODF and OOXML, the case is pretty clear - the two formats are not the same. Sure - they can both define bold text, but there are quite a few differences between the formats. A list of some of them can be found at the ODF-Converter website. I think that the list is the best argument for not being able to do a complete conversion of ODF to OOXML (and back). This was also one of the conclusions of the Fraunhofer/DIN-work in Germany, where they concluded that a full 1-1 mapping between the two formats could not be done.

The key question here is: Is interoperability diminished by this fact?

If you ask Rob's posse, they will almost certainly say "Yes". They will say something like "Microsoft chose not to make OOXML interoperable with the existing ISO-standard ODF and therefore OOXML is a blow to interoperability".

If you ask me, I will say "No". I will say no because the term "interoperability" has been hijacked by the anti-OOXML-lobby in much the same way the SVG-namespace was hijacked by ODF TC. I will say "No" because interoperability means something radically different. The meaning is not rocket science, really ... and usually most people agree with the basic definition of interoperability. A few such definitions are:

Computer Dictionary Online:

http://www.computer-dictionary-online.org/interoperability.htm?q=interoperability

The ability of software and hardware on multiple machines from multiple vendors to communicate.

IEEE: 

http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&isnumber=4683&arnumber=182763&punumber=2267

the ability of two or more systems or components to exchange information and to use the information that has been exchanged

US e-Government Act of 2002:

http://frwebgate.access.gpo.gov/cgi-bin/getdoc.cgi?dbname=107_cong_public_laws&docid=f:publ347.107.pdf

ability of different operating and software systems, applications, and services to communicate and exchange data in an accurate, effective, and consistent manner.

If you also look at the enormous list from Google you will see that none of the definitions talk about the ability to convert formats. Instead they talk about communication between machines, platforms and networks. This is very close to my definition of interoperability when it comes to document formats.

The interoperability gained by using a specific document format is based on the possibility of implementing the format on any kind of platform, in any kind of software, using any kind of operating system. It is based on how concise and clear the language of the specification of the format is, and it depends on how well thought out the specification is.

It has nothing, nothing, nothing to do with the possibility of converting the format to any other format. 

A cry for help

Working almost every day on implementing solutions that support ODF and OOXML, I am naturally tasked (or more appropriately: challenged) with ambiguities in the aforementioned specifications. At first glance ODF has an appealing simplicity and form, and reading the specification is almost like reading a book in natural "prose". However - the ease of reading sadly comes at the expense of clear language. So - as always when implementing any specification, you need to have somewhere to go to ask your technical questions regarding how to implement the damn thing or questions about how to read the devil.

And therein lies my problem:

Where do I go to get answers to these questions about ODF? Where is the website for ODF-development?

I have tried the forums at opendocument.xml.org - but the groups there are almost dead.

I have tried the mailing list for the OpenDocument TC, but it is also almost dead.

So please help me - where do I go?

Update: I almost forgot - I have also prowled the Danish blogosphere where the ongoing battle between OpenXml and ODF usually takes place, but no one has been able to give me any pointers to where they usually get their information about implementing ODF.

(or have I been so heavily stigmatised by being pro-choice that no one wants to help me?)


OOXML is defective #1 (Password hashing)

OOXML has been accused of being rushed, not only through the writing itself but also through certification in both ECMA and ISO. It's a quick accusation to make, but sometimes it can be really tricky to figure out whether such a statement is true or false. But you know, sometimes you stumble over something that really shows you that the specification was rushed through not only preliminary editing but also certification in ISO.

The one thing I noticed was password hashing. As with other document formats, document protection can be defined in multiple ways. There is of course protection of the document itself, but most document formats also allow protection of specific parts of the document or even read-only protection of the document. The way it's usually done is to ask the user for a password, hash it and store it in the document. When the document is opened the next time, the user is prompted for a password, and if its hash matches the stored value - the protection of the document (or parts of it) is released.
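To make that flow concrete, here is a minimal Python sketch of "hash it, store it, compare it later". Everything specific in it is my own assumption for the sake of illustration: SHA-1 as the default algorithm, UTF-8 as the password encoding, Base64 as the text form, and the function names make_protection_key and unlock. None of these come from any specification.

```python
import base64
import hashlib
import hmac


def make_protection_key(password: str, algorithm: str = "sha1") -> str:
    """Hash the password and return the digest as Base64 text.

    The algorithm, the UTF-8 encoding and the Base64 form are all
    illustrative assumptions, not something a specification prescribes.
    """
    digest = hashlib.new(algorithm, password.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def unlock(stored_key: str, attempt: str, algorithm: str = "sha1") -> bool:
    """Return True when the attempted password hashes to the stored key."""
    candidate = make_protection_key(attempt, algorithm)
    # Constant-time comparison of the two ASCII strings.
    return hmac.compare_digest(candidate, stored_key)


stored = make_protection_key("secret")  # what the producer would save
print(unlock(stored, "secret"))  # True
print(unlock(stored, "guess"))   # False
```

Note that the consumer can only call unlock successfully if it knows every one of those choices. Change the algorithm argument on one side and the same password no longer matches.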

Now, this is defined, amongst other places, in section 4.4.1 (Section attributes) where it deals with protection of sections. The text says:

A section can be protected, which means that a user can not edit the section. The text:protected attribute indicates whether or not a section is protected. The user interface must enforce the protection attribute if it is enabled.

This is more or less what I wrote above. It also says:

A user can use the user interface to reset the protection flag, unless the section is further protected by a password. In this case, the user must know the password in order to reset the protection flag. The text:protection-key attribute specifies the password that protects the section. To avoid saving the password directly into the XML file, only a hash value of the password is stored.

And that's it.

WTF? Nothing more? Nothing about how to specify the hashing algorithm? Nothing about how to specify initialization vectors, prepending of zeroes ... nothing?

But wait - what if we look in the schema itself - maybe it's just the descriptive text that is a bit ... ahem ... limited. Ok - the schema says:

<define name="sectionAttr" combine="interleave">
 <optional>
  <attribute name="text:protection-key">
   <ref name="string"/>
  </attribute>
 </optional>
</define>

Dammit - nothing here either. Notice also that it is not possible to store the way the hash-value is persisted. Is it a bit-sequence? A Hex'ed bit-sequence? A Base64-sequence? Nothing!
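To illustrate what that leaves a consumer with, here is a small Python sketch of the guessing game: given a stored value and a password you happen to know, try combinations of hash algorithm, character encoding and text form until one matches. The candidate lists (MD5/SHA-1/SHA-256, UTF-8/UTF-16LE, hex/Base64) and the helper name guess_scheme are my own assumptions; the specification text names none of them, which is exactly the problem.

```python
import base64
import hashlib
from typing import Optional, Tuple

# Candidate choices a producer might have made (all assumptions).
ALGORITHMS = ("md5", "sha1", "sha256")
CHARSETS = ("utf-8", "utf-16-le")
TEXT_FORMS = {
    "hex": lambda digest: digest.hex(),
    "base64": lambda digest: base64.b64encode(digest).decode("ascii"),
}


def guess_scheme(stored_key: str, known_password: str) -> Optional[Tuple[str, str, str]]:
    """Return the first (algorithm, charset, text form) that reproduces stored_key."""
    for algorithm in ALGORITHMS:
        for charset in CHARSETS:
            raw = known_password.encode(charset)
            digest = hashlib.new(algorithm, raw).digest()
            for form_name, to_text in TEXT_FORMS.items():
                if to_text(digest) == stored_key:
                    return algorithm, charset, form_name
    return None  # the producer used something outside our guesses


# Simulate a producer that silently picked SHA-1 + UTF-8 + Base64.
producer_key = base64.b64encode(hashlib.sha1("secret".encode("utf-8")).digest()).decode("ascii")
print(guess_scheme(producer_key, "secret"))  # ('sha1', 'utf-8', 'base64')
```

If the producer added a salt, iterated the hash or used an algorithm outside the list, the function returns None and the consumer is back to square one.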

But wait again - let's look into the file of an actual document with read-only protection. Let's see what is stored in the document. Well, the XML-fragment lists as:

<table:table
 table:name="Ark1"
 table:style-name="ta1"
 table:protected="true"
 table:protection-key="PnKGfjzdfrt6XxQxdTcQVqbmA/7Ro="
 table:print="false"
>

Any clever suggestions for me, as a document consumer, as to what to do with this value? This is truly amazing. On one hand the authors talk about their document format being able to provide true and pure interoperability ... but they haven't specified something as common as document protection. I wonder how they can claim this with a straight face. Interoperability is certainly not enabled by limiting the details of the specification to as little as this ... but maybe they just hope no one will use this feature and thereby have "interoperability by rejection".

I cannot help but wonder: who in their right mind would put up a suggestion for standardisation of a document format that was unspecified in such a central feature as "document protection"? This must be one of those places where

Ratification trumps perfection

Yeah, well ...

Word of recognition from an unexpected side

Today - or was it yesterday? - Patrick Durusau issued an open letter regarding the standardization of OOXML. It is an interesting read - especially for those of us that have worked endless hours in NSBs processing the dispositions of comments from IEC/ISO editor Rex Jaeschke. I will not dig too much into the details of the statement, since I am sure others will do so; I will just quietly note that it is nice once in a while to be appreciated and not only picked at because of our "lack of qualifications" and accusations of being angle-grapping, bribed, paid-for puppets only acting by the will of Microsoft.

Thank you, Patrick!


I will only quote this:

The OpenXML project has made a large amount of progress in terms of the openness of its development. Objections that do not recognize that are focusing on what they want to see and not what is actually happening with OpenXML

Ooh - and one prediction: I think the anti-OOXML-lobby will try to drop this like a hot potato. The Pro-choice side will naturally salute this - and the Pro-ODF side will wait out the storm, quietly mumbling "Nothing to see here, please pass along".

Yes, some of them might even use some of the skills they learned in the third part of the course they took, Hypocrisy 101.

"Talk is silver, but silence is gold"

Do your math - OOXML and OMML (Updated 2008-02-12)

As I promised in my latest article about ODF and MathML, I have worked a bit with the ECMA-equivalents of ODF and MathML: OOXML and OMML (Office Math ML).

A bit of introduction is probably a good idea:

In OOXML, mathematical content is structured using the internal markup language Office Math ML, or OMML for short. OMML is closely tied to the structure of WordprocessingML and the look-and-feel is very similar. In contrast to the ODF-way, OMML is usually inserted inline in the WordprocessingML, whereas in ODF it is kept in a separate part of the package.

Ok - now that that is done with - let's get on with the good stuff!

As in my previous article, I'll work with the same  base equation



Now, as I wrote in the other article, learning MathML is like learning a new (programming)-language, and I can tell you, it is no different with OMML. MathML arranges the mathematical elements by position whereas OMML arranges the mathematical elements by their explicit meaning, so a fraction is created in MathML as (simplified)

<math:mfrac>
  <math:mi>π</math:mi>
  <math:mn>4</math:mn>
</math:mfrac>

and in OMML it is created as (simplified)

<m:f>
  <m:num>
    <m:r>π</m:r>
  </m:num>
  <m:den>
    <m:r>4</m:r>
  </m:den>
</m:f>

So when dealing with MathML and e.g. fractions, we look at a fraction with "something at the top and something at the bottom". When dealing with OMML, we deal with "numerators" and "denominators". It is rather clear to me that any skills learned in MathML are not directly applicable to OMML - and vice versa. It took me about the same amount of time to "get" MathML as it did to "get" OMML. In both cases, I had not worked with the specific ML before. It has taken me about a day to research and write each article.
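To illustrate the two views of a fraction, here is a small Python sketch that maps an OMML fraction onto a MathML one. It is only a toy under stated assumptions: it handles a single <m:f> whose numerator and denominator are plain runs, it decides between <mi> and <mn> with a naive "is it digits?" test, and the helper names token and fraction_to_mathml are hypothetical. It is not the conversion Microsoft ships.

```python
import xml.etree.ElementTree as ET

M = "http://schemas.openxmlformats.org/officeDocument/2006/math"
MML = "http://www.w3.org/1998/Math/MathML"

# A fraction with explicit numerator/denominator roles (OMML view).
OMML_FRACTION = f"""
<m:f xmlns:m="{M}">
  <m:num><m:r><m:t>π</m:t></m:r></m:num>
  <m:den><m:r><m:t>4</m:t></m:r></m:den>
</m:f>
"""


def token(text: str) -> ET.Element:
    """Wrap text in <mn> for digits and <mi> for anything else (naive rule)."""
    tag = "mn" if text.isdigit() else "mi"
    element = ET.Element(f"{{{MML}}}{tag}")
    element.text = text
    return element


def fraction_to_mathml(fraction: ET.Element) -> ET.Element:
    """Map <m:num>/<m:den> onto the first and second child of <mfrac>."""
    mfrac = ET.Element(f"{{{MML}}}mfrac")
    for role in ("num", "den"):  # the order becomes the position in MathML
        part = fraction.find(f"{{{M}}}{role}")
        text = "".join(t.text or "" for t in part.iter(f"{{{M}}}t"))
        mfrac.append(token(text))
    return mfrac


ET.register_namespace("math", MML)
mfrac = fraction_to_mathml(ET.fromstring(OMML_FRACTION))
print(ET.tostring(mfrac, encoding="unicode"))
```

The interesting line is the loop over ("num", "den"): in OMML the role is carried by the element name, while in MathML it is carried purely by child order.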

Anyway - back to the plot.

As always I work with my friend, "the minimal OOXML-file". It is an OOXML-file stripped of all the junk and cut down to the bare minimum - not even a single unused namespace declaration is left behind. You can see the minimal file here: Minimal OOXML.docx (1,16 kb).

So my task was a two-step task: since OOXML is rather new, there is not that much information about OMML out there. So as a first step I created a sample equation using Word 2007 to get a feeling for what it's all about. Then I found Part 4 of the OOXML-spec, located section 7 and started to put the OMML together. The OMML I ended up with was this:

<m:oMathPara>
  <m:oMath>
    <m:r>
      <w:rPr>
        <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
      </w:rPr>
      <m:t>cos</m:t>
    </m:r>
    <m:f>
      <m:num>
        <m:r>
          <w:rPr>
            <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
          </w:rPr>
          <m:t>π</m:t>
        </m:r>
      </m:num>
      <m:den>
        <m:r>
          <m:t>4</m:t>
        </m:r>
      </m:den>
    </m:f>
    <m:r>
      <m:t>=</m:t>
    </m:r>
    <m:f>
      <m:num>
        <m:rad>
          <m:radPr>
          </m:radPr>
          <m:deg/>
          <m:e>
            <m:r>
              <m:t>2</m:t>
            </m:r>
          </m:e>
        </m:rad>
      </m:num>
      <m:den>
        <m:r>
          <m:t>2</m:t>
        </m:r>
      </m:den>
    </m:f>
  </m:oMath>
</m:oMathPara>

I bet you are now thinking what I was thinking: what the f***? That's a lot of markup! Well, the reason why there is so much markup is that each piece of text/data in the equation is encapsulated in a "run"-element that enables additional styling. If all this additional markup including other property-markup is removed, the result is this:

<m:oMathPara>
  <m:oMath>
    cos
    <m:f>
      <m:num>π</m:num>
      <m:den>4</m:den>
    </m:f>
    =
    <m:f>
      <m:num>
        <m:rad>
          <m:e>2</m:e>
        </m:rad>
      </m:num>
      <m:den>2</m:den>
    </m:f>
  </m:oMath>
</m:oMathPara>

Ain't that purdy?

The OOXML-file with the equation is available here: minimal ooxml with math.docx (1,25 kb). It displays like this in Microsoft Office 2007:

Why not just use MathML?

Before I go into the details with converting from MathML to OMML, I think it is appropriate to pause and look at how MathML and OMML differ from each other. As I noted above there is quite a lot of "overhead" in OMML with everything being encapsulated in "runs". But there is a reason for this. The overhead enables us to do a couple of things that we cannot do with MathML.

Everything fits

You can put virtually everything into an OMML-formula that you can put into a normal WordprocessingML-fragment. As Murray Sargent puts it:

Word needs to allow users to embed arbitrary span-level material (basically anything you can put into a Word paragraph) in math zones and MathML is geared toward allowing only math in math zones. A subsidiary consideration is the desire to have an XML that corresponds closely to the internal format, aiding performance and offering readily achievable robustness. Since both MathML and OMML are XMLs, XSLTs can (and have) been created to convert one into the other. So it seems you can have your cake and eat it too. Thank you XML!

MathML allows some styling of the individual text fragments in the equations, but that's basically it.

WordprocessingML look-and-feel is preserved

To me it is really nice to work with markup for equations that is similar to the markup surrounding it. If I were to use MathML inline instead of OMML, the markup would be completely different from the markup around it. You can say that using MathML enables you to reuse any MathML-skills you might have in advance. Similarly you can say that using OMML for equations enables you to reuse the skills you have from working with WordprocessingML. It's kind of a "give-and-take"-situation.

Revision-control (change-tracking) is possible

Having the overhead enables change-tracking on the same granular level as with your regular text. You can track changes in your equations on a character-by-character basis. In Word 2007 it looks like this when I make a modification to the equation (multiply the second fraction with "2" and remove the cosine-function from the first fraction).

 

 

The markup enabling this is here (for removing the cosine function, where "w:del" means "delete"):

<w:del w:id="0" w:author=" Jesper Lund Stocholm" w:date="2008-01-30T10:41:00Z">
  <m:r>
    <w:rPr>
      <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
    </w:rPr>
    <m:t>cos</m:t>
  </m:r>
</w:del>

This is not at all possible when using MathML out-of-the-box. You cannot merge the MathML with other markup like this, and if you use MathML as it is done in ODF (i.e. not "inline") it is simply impossible (at least as far as I can see). MathML in ODF is treated as an external object, which means that it is encapsulated in an OpenDocument Draw frame. The markup for one of the files I used in the other article is like this:

<text:p text:style-name="Standard">
 <draw:frame
   draw:style-name="fr1"
   draw:name="Objekt1"
   text:anchor-type="as-char"
   svg:width="2.418cm"
   svg:height="1.034cm"
   draw:z-index="0"
 >
  <draw:object
    xlink:href="./MathML"
    xlink:type="simple"
    xlink:show="embed"
    xlink:actuate="onLoad"
  />
  <draw:image
    xlink:href="./ObjectReplacements/MathML"
    xlink:type="simple"
    xlink:show="embed"
    xlink:actuate="onLoad"
  />
 </draw:frame>
</text:p>

If I wanted to change some text like "Display equation below"  to "Disrply equation below" (add an 'r' and delete an 'a') in ODT, it would look something like this:

<text:p>
  Dis<text:change-start text:change-id="ct102825880"/>
  r<text:change-end text:change-id="ct102825880"/>
  pl<text:change text:change-id="ct102844952"/>
  y equation below
</text:p>

So registration of the changes is - as with OOXML - merged into the text being modified. I think you could mark the whole equation as "modified" in ODF by putting a <text:change-start>-element around the complete <draw:object>-element, but I am not sure it would work. Also, OpenOffice.org doesn't seem to register changes to MathML-zones at all. Using OpenOffice.org it looks like this

 

(I changed the denominator of the first fraction to "54") 

 

I cannot say that there are (or are not) other areas where MathML just doesn't cut it - these were just a couple of those that I have experienced myself. I do believe, though, that the examples above warrant the simple question:

Why the hell did OASIS ODF TC decide to use MathML in the first place?

Interoperability

Interoperability is clearly what the young kids want these days - so let's see what we can do with mathematical content. MathML and OMML are clearly two different markup languages, but is it possible to convert between them? Fortunately it is. Microsoft Office 2007 allows copy/paste of MathML into OMML-equations and it can even export OMML to MathML. Luckily for us the logic around this is not embedded into some fancy place in Microsoft Office 2007 - it is done using simple XSLT-transformations. They have made the stylesheets OMML2MML.XSL and MML2OMML.XSL, and if you apply these to either your OMML or MathML, it is translated to the other. Just for the fun of it I tried to convert the OMML-version of the equation to MathML. All I did was to find the OMML2MML.XSL and insert a single line (the xml-stylesheet processing instruction) in the XML-file document.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="OMML2MML.XSL"?>
<w:document
  xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  >
  <w:body>
    <w:p>
      <m:oMathPara>
        <m:oMath>
          <m:r>
            <w:rPr>
              <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
            </w:rPr>
            <m:t>cos</m:t>
          </m:r>
...

(and then I processed the file using my favorite XSLT-translator)

I'm sure - if you are a "technical" person - you have found yourself using/writing some code and just before you press "Compile" or "Run" you think: "This is sooo not gonna work". This was one of those situations for me - but you know what, it actually worked on the first try. The MathML generated is this

<?xml version="1.0" encoding="utf-8"?>
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <mml:mi mathvariant="italic">cos</mml:mi>
  <mml:mfrac>
    <mml:mrow>
      <mml:mi>π</mml:mi>
    </mml:mrow>
    <mml:mrow>
      <mml:mn>4</mml:mn>
    </mml:mrow>
  </mml:mfrac>
  <mml:mo>=</mml:mo>
  <mml:mfrac>
    <mml:mrow>
      <mml:mroot>
        <mml:mrow>
          <mml:mn>2</mml:mn>
        </mml:mrow>
        <mml:mrow />
      </mml:mroot>
    </mml:mrow>
    <mml:mrow>
      <mml:mn>2</mml:mn>
    </mml:mrow>
  </mml:mfrac>
</mml:math>

... and it validates as well (using Amaya and changing the XML-file from a UTF-16 file to UTF-8)

Et voilà

Now, wouldn't it be cool if the MathML generated from the OMML could be used in an ODT-document? You know what ... it can! I took the MathML above and inserted it into the MathML-zone of the ODF-package of one of the documents I made for the ODF/MathML-article. The file is available here: minimal-mathml-omml-inject.odt (1,31 kb).

The result of opening the file using OpenOffice.org:

In the words of Murray Sargent, I guess you can have your cake and eat it too after all.


Update:

When writing my post about where to get help for ODF-development I suddenly remembered that I missed a part of this article: "The quirks". Because - naturally there are quirks with using OMML with Microsoft Office 2007 ... just as there were with MathML and OpenOffice.org.

Now, if you take another look at the OMML/XML-fragment I created, there were two parts I really couldn't figure out a way to remove:

<m:oMathPara>
  <m:oMath>
    <m:r>
      <w:rPr>
        <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
      </w:rPr>
      <m:t>cos</m:t>
    </m:r>
    <m:f>
      <m:num>
        <m:r>
          <w:rPr>
            <w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
          </w:rPr>
          <m:t>π</m:t>
        </m:r>
      </m:num>

Now, the <w:rPr>-elements should have absolutely nothing to do with the content of the <m:t>-element - or more correctly, the visibility of the text in the <m:t>-element should not depend on the existence of a <w:rPr>-element. But if the two <w:rPr>-sections are omitted, the "cos"-text as well as the π-sign are not displayed. I really have no idea why this is so, so if you do, please let me know. Maybe one of the Microsoft Office 2007-Math guys could step in here?

Do your math - ODF and MathML

When I studied at DTU (Technical University of Denmark) I basically lived in the Department of Mathematics. I did my bachelor project there and I did my thesis there. I think it would be fair to say that math is really in my blood (or was).

Of course - in those days we wrote our equations in LaTeX (not the suit) and I remember how we laughed diabolically at our fellow students that did their papers in e.g. Microsoft Word and had to use the really, really annoying "Equation Editor" (shudder). I remember how we also laughed at the students that did pictures and graphs in e.g. Adobe PhotoShop or Visio (before it was acquired by Microsoft, afaik), coz everybody knew that it had to be done using xFig ... the program with the worst possible UI ever ... at least in those days.

For the purpose of these articles (an article about Microsoft Office 2007 and OMML will follow shortly) I dug into my thesis and looked at how math was displayed using LaTeX. I created a "reference equation" to use when trying to display some math in either ODF or OOXML. The test equation I made was this:

\begin{equation}
    \cos\Big(\frac{\pi}{4}\Big) = \Big(\frac{\sqrt{2}}{2}\Big)
\end{equation}

For those of you not speaking LaTeX fluently - you should consult the "Not so short introduction to LaTeX" chapter 3 - or simply behold the equation below:

  

In ODF mathematical notations are done using MathML (section 12.5) - a W3C-standard for displaying mathematical content. The mathematical content is embedded in the ODF-package as an object and as far as I can see, it is not possible to use MathML inline in the content of the paragraphs of the document itself. I have earlier talked about ODF being vague and this is imo one of the places where some clarity could help.

But - learning MathML is like learning a new language ... it doesn't really make sense in the beginning. So I started to poke around a bit on the W3C-website in search of some tools or tutorials that would help me figure out what MathML is all about. I eventually found a W3C tool called Amaya. It's a MathML/SVG-tool developed by W3C and I used this tool to create the MathML for the base equation above. In Amaya it looks like this:

 

 

The interesting part, of course, is the MathML created by Amaya. The MathML (slightly modified, but validated) is listed below

<?xml version="1.0" encoding="utf-8" ?>
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mtext>cos</mtext>
    <mo>(</mo>
    <mfrac>
      <mi>&pi;</mi>
      <mn>4</mn>
    </mfrac>
    <mo>)</mo>
    <mi>=</mi>
    <mo>(</mo>
    <mfrac>
      <msqrt>
        <mn>2</mn>
      </msqrt>
      <mn>2</mn>
    </mfrac>
    <mo>)</mo>
  </mrow>
</math>

If you look at the XML, it is pretty easy to identify the different parts of the equation.

So - in theory I should be able to put this into an ODF-document and it would be displayed when opening the document using OpenOffice.org - the reference implementation of ODF. 

Let's see


Step 1

Create an ODF-document using OpenOffice.org with a mathematical formula embedded.

Now, this was the easy part. I cannot figure out how to insert a regular "Pi"-sign in the formula, but the formula looks just fine. The file is available here: math.odt (9,72 kb). It looks like this:

 


 

Step 2

Clean the file of all the disturbing crap that the application puts in by default

This was a bit more tricky, since somehow it seems that the mathematical formula can only be contained in a file called "content.xml" - otherwise OpenOffice.org simply shuts down. Also, I have removed all meta-data, styling, extra namespace-declarations, embedded thumbnails and the graphical representation of the formula. The cut-down ODT-file is available here: math-minimal.odt (1,43 kb). The visual representation is exactly like that of the original file.
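For those who want to play with a cut-down package themselves, here is a rough Python sketch of how such a Zip-archive could be assembled. It is not how I made math-minimal.odt, and several things in it are assumptions: the sub-object folder name "MathML" is simply taken from the xlink:href that OpenOffice.org wrote in my other file, the two XML strings passed in are placeholders rather than real documents, and the function name build_package is hypothetical. I make no promise that OpenOffice.org opens the result.

```python
import io
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"

# Manifest listing the document, its content part and the formula sub-object.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
  <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/"/>
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="content.xml"/>
  <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.formula" manifest:full-path="MathML/"/>
  <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="MathML/content.xml"/>
</manifest:manifest>
"""


def build_package(document_xml: str, formula_xml: str) -> bytes:
    """Return the bytes of a Zip-archive laid out like a minimal ODT package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # The mimetype entry is written first and stored uncompressed.
        package.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        package.writestr("META-INF/manifest.xml", MANIFEST)
        package.writestr("content.xml", document_xml)
        # The formula lives in its own sub-package, again named content.xml.
        package.writestr("MathML/content.xml", formula_xml)
    return buffer.getvalue()


# Placeholder payloads; a real package needs real ODF and MathML content.
data = build_package("<placeholder/>", "<placeholder/>")
with zipfile.ZipFile(io.BytesIO(data)) as package:
    print(package.namelist())
```

The point of the sketch is the layout: the formula is not inline in the text document but sits in its own folder with its own content.xml.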

Step 3

Inspect the MathML in the application created MathML-file

The MathML created by OpenOffice.org looks like this: 

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE math:math PUBLIC "-//OpenOffice.org//DTD Modified W3C MathML 1.01//EN" "math.dtd">
<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
  <math:semantics>
    <math:mrow>
      <math:mi>cos</math:mi>
      <math:mrow>
        <math:mfenced math:open="" math:close="">
          <math:mfrac>
            <math:mi math:fontstyle="italic">pi</math:mi>
            <math:mn>4</math:mn>
          </math:mfrac>
        </math:mfenced>
        <math:mo math:stretchy="false">=</math:mo>
        <math:mfenced math:open="" math:close="">
          <math:mfrac>
            <math:msqrt>
              <math:mn>2</math:mn>
            </math:msqrt>
            <math:mn>2</math:mn>
          </math:mfrac>
        </math:mfenced>
      </math:mrow>
    </math:mrow>
    <math:annotation math:encoding="StarMath 5.0">cos left ( pi over 4 right )  = left (sqrt{2}  over 2  right )</math:annotation>
  </math:semantics>
</math:math>

There are a couple of things to note about this. Firstly, I don't understand the DOCTYPE declaration

<!DOCTYPE math:math PUBLIC "-//OpenOffice.org//DTD Modified W3C MathML 1.01//EN" "math.dtd">

The doctype should not matter at all - and why they chose to use a "DTD Modified W3C MathML 1.01" is beyond me. I'm not saying it's an error - I just don't get it. Enlighten me, please. Secondly, the MathML created looks different from the MathML created by Amaya. However - just as the same paragraph can be presented in all sorts of ways using HTML, and the same equation can be written in different ways (e.g. sin²(x) + cos²(x) = 1 is basically the same as a² + b² = c²), the same equation can be created in an endless myriad of ways using MathML. Thirdly, there are two distinct ways in which the OOo MathML differs from the MathML of Amaya. Notice how it uses the <mfenced>-element to make a parenthesis instead of <mo>)</mo>. There is really no difference - however I tend to think that using the <mfenced>-element is slightly more sophisticated than the <mo>-element, but it's just a personal belief. Also, look at the usage of the <semantics>- and <annotation>-elements. This is actually really cool. The <semantics>-element is used to provide "meaning" to the MathML-markup, and the content of the <annotation>-element maps the MathML markup directly to the corresponding expression tree. Also, OpenOffice.org allows you to type in the annotation directly, thereby giving you some of the ease of writing LaTeX by hand.
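To see what the annotation carries, a small Python sketch can pull it out of the markup with the standard library. `read_annotations` and the shortened `SAMPLE` document are made up for illustration; note that OOo writes the encoding attribute with the math: prefix, so it has to be looked up as a namespace-qualified attribute.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Shortened version of the OOo output above (DOCTYPE left out on purpose).
SAMPLE = """<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">
  <math:semantics>
    <math:mrow>
      <math:mi>cos</math:mi>
    </math:mrow>
    <math:annotation math:encoding="StarMath 5.0">cos left ( pi over 4 right )</math:annotation>
  </math:semantics>
</math:math>"""


def read_annotations(xml_text):
    """Return (encoding, text) pairs for every <annotation> element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for node in root.iter(f"{{{MATHML_NS}}}annotation"):
        # Try the prefixed (namespace-qualified) attribute first, then the plain one.
        encoding = node.get(f"{{{MATHML_NS}}}encoding") or node.get("encoding")
        pairs.append((encoding, node.text))
    return pairs


annotations = read_annotations(SAMPLE)
```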

Step 4

Validate the MathML-file using W3C-validator or Amaya 

The picture below shows the content.xml loaded and displayed in Amaya. The green dot in the bottom right corner indicates that the MathML is valid. I have also tested embedding the MathML in an HTML-document and validating it with the W3C-validator, and the result is the same.

 

 

Super!

Step 5

Insert the MathML created by Amaya into the ODT-file and open the file using OpenOffice.org

Now, I have previously created the formula using Amaya and I just have to inject it into the ODT-file. I did, and the file is available here: mathml-minimal-error.odt (1,23 kb). The result is, however, not as I expected:

 


 

Ok - but as you might have noticed, all elements in the OOo MathML-file were namespace-prefixed, so maybe this will do the trick. I tried this as well but with the same result. File is available here: mathml-minimal-nsprefix-error.odt (1,24 kb).
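That the prefix made no difference is what one would expect from a namespace-aware parser: the prefix is only shorthand for the namespace name. A minimal Python sketch of that point follows, where `expanded_names` and the two tiny documents are made up for illustration and are not the actual test files.

```python
import xml.etree.ElementTree as ET

# The same fragment, once with a prefix and once with a default namespace.
PREFIXED = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mi>x</math:mi></math:math>"
)
DEFAULT = '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>'


def expanded_names(xml_text):
    """List the namespace-expanded tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


same = expanded_names(PREFIXED) == expanded_names(DEFAULT)
```

Both documents expand to the same element names, so whatever goes wrong in OpenOffice.org is unlikely to be caused by the prefix alone.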

Final step

Figure out what the hell is wrong 

I finally figured out what is wrong with the way OpenOffice.org handles MathML-content. It turns out that if I take the Amaya MathML (without ns-prefix) and insert it into the original content.xml-file while preserving the DOCTYPE-declaration, it works almost as expected. File is available here: mathml-minimal-inject-succes.odt (1,30 kb).



Well, some errors are introduced. The Π-character is not displayed and the equation is displayed in bold. The equal-sign has disappeared as well.

Just for the fun of it I took the MathML-file generated by OpenOffice.org and removed the <semantics>-element as well as the <annotation>-element. File is available here: mathml-minimal-inject-no-semantics.odt (1,35 kb). The result when opening it in OpenOffice.org is .. well ... sad:



I have absolutely no idea why it displays it like this. Removing the <semantics>-element and <annotation>-element should have no effect on the visual representation of the equation.

Conclusion?

Well, I don't really know what to conclude. Most of the things I have shown above are imo due to errors in the implementation of OpenOffice.org, where MathML is clearly not implemented correctly. It seems that there are some unwritten rules for how MathML is supposed to be used when working with it in OpenOffice.org, but they seem rather unclear and weird to me.

But how OpenOffice.org behaves is really not important to me - some implementations of ODF are better than others, and maybe other implementations do a better job at displaying MathML. The point should be how the specification says it should be used. Luckily the ODF-spec only talks about how MathML is used in a single place - section 12.5 Mathematical Content. It says that "Mathematical content is represented by MathML 2.0 (see [MathML])". The RelaxNG-snippet provided also tells us that you can put everything into a "math area", <math:math>:

<?xml version="1.0" encoding="UTF-8" ?>
<define name="math-math">
    <element name="math:math">
        <ref name="mathMarkup" />
    </element>
</define>
<!-- To avoid inclusion of the complete MathML schema, anything -->
<!-- is allowed within a math:math top-level element -->
<define name="mathMarkup">
    <zeroOrMore>
        <choice>
            <attribute>
                <anyName />
            </attribute>
            <text />
            <element>
                <anyName />
                <ref name="mathMarkup" />
            </element>
        </choice>
    </zeroOrMore>
</define>

So basically, all bets are off. I can only begin to wonder how other implementations of ODF use MathML.
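To illustrate how permissive this pattern is: going by the schema fragment above, a <math:math> element may contain elements that are not MathML at all. Here is a minimal Python sketch that lists such foreign elements; `foreign_elements` and the <banana> element are made up for illustration, and this kind of check is something a cautious consumer might add itself, not something the ODF schema asks for.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A made-up document that the "anything is allowed" pattern would accept.
ODD_BUT_ALLOWED = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mi>x</math:mi><banana>7</banana></math:math>"
)


def foreign_elements(xml_text):
    """Return the tags below the root that are not in the MathML namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + MATHML_NS + "}"
    return [
        element.tag
        for element in root.iter()
        if element is not root and not element.tag.startswith(prefix)
    ]


foreign = foreign_elements(ODD_BUT_ALLOWED)
```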

And a small appetizer:

As soon as I get the time for it, I'll write an article like this one about Office 2007 and OMML. I will investigate how to mark up mathematical content using OMML and I will also try to use the XSL-files provided by Microsoft in Office 2007 to create XSLT-translations of my base equation from OMML to MathML and vice versa.

... stay tuned ... 


Embrace and extend - SVG in ODF revisited

One of the attack-vectors on OOXML has been the lack of reuse of existing standards. Specifically, it comes up in the discussions of DrawingML vs. SVG and OMML vs. MathML ... both of which are relatively interesting subjects. The argument has been about why Microsoft chose not to reuse SVG and created DrawingML instead - and likewise with MathML and OMML.

Now, some of the arguments for reusing existing standards are:

  • Reuse of other people's code
    As a programmer, I love this - there is nothing more satisfying than being able to reuse something that others have made an effort to produce
  • Increase quality
    If something is an existing standard, someone else has probably reviewed it and the worst bugs have likely been removed.
  • Brain cycle reuse
    If you reuse some work already defined, you will probably be able to find someone in your organization that has skills in this area - and you avoid the costs of re-educating them to use a new tool.

So, with respect to ODF, it has tried to reuse as many standards as possible, so e.g. mathematical content is done using MathML and vector graphics are supposedly done using SVG. Microsoft has chosen a different path where they have created new formats of their own, so mathematical content is done using OMML (Office Math Markup Language) and vector graphics are done using DrawingML.

A couple of weeks ago I heard some rumours that ODF had not only used SVG as its vector graphics format but had even extended it beyond the standardized format. My initial response was that this had to be wrong. One of the cornerstones of ODF is precisely that it reuses existing standards and that there is a "clean cut" between ODF and the standard it utilizes. This way I would be able to buy/acquire some library that supports SVG and simply incorporate it in my product implementing ODF. But if the referenced standard is extended, I will either experience less functionality because the extensions are not part of the standard, or I could experience crashing code when I try to pass the extended format to the external library - at least if it performs e.g. DTD/schema validation and finds that invalid elements are present in the input.

So what did I do?

Basically I started by doing a random text-search in the ODF-spec for occurrences of "[SVG]". One of the first things that caught my attention was the entry in section 1.3 Namespaces, Table 2, where it says:

Prefix: svg
Description: For elements and attributes that are compatible to elements or attributes defined in [SVG].
Namespace: urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0


The term "compatible to elements or attributes" seems quite odd to me, since it should not be necessary to specify this if the referenced standard is not extended. I did another quick search and I stumbled over these sections of the specification:

  • 14.14.2 SVG Gradients
  • 15.13.13 Line Join

Let me quickly walk through the contents of each section.

14.14.2 SVG Gradients

Section 14.14.2 says, amongst other things:

In addition to the gradients specified in section 14.14.1, gradient may be defined by the SVG gradient elements <linearGradient> and <radialGradient> as specified in §13.2 of [SVG].

Cool!

Now, the section goes on as

The following rules apply to SVG gradients if they are used in documents in OpenDocument format:

  • The gradients must get a name. It is specified by the draw:name attribute.
  • For <linearGradient>, only the attributes gradientTransform, x1, y1, x2, y2 and spreadMethod will be evaluated.
  • For <radialGradient>, only the attributes gradientTransform, cx, cy, r, fx, fy and spreadMethod will be evaluated.
  • The gradient will be calculated like having a gradientUnits of objectBoundingBox, regardless what the actual value of the attribute is.
  • The only child element that is evaluated is <stop>.
  • For <stop>, only the attributes offset, stop-color and stop-opacity will be evaluated.

So, to be able to determine if ODF is only referencing SVG, we need to look at section 13.2 in the SVG spec. It says:

<!ELEMENT %SVG.linearGradient.qname; %SVG.linearGradient.content; >
<!-- end of SVG.linearGradient.element -->]]>
<!ENTITY % SVG.linearGradient.attlist "INCLUDE" >
<![%SVG.linearGradient.attlist;[
<!ATTLIST %SVG.linearGradient.qname;
    %SVG.Core.attrib;
    %SVG.Style.attrib;
    %SVG.Color.attrib;
    %SVG.Gradient.attrib;
    %SVG.XLink.attrib;
    %SVG.External.attrib;
    x1 %Coordinate.datatype; #IMPLIED
    y1 %Coordinate.datatype; #IMPLIED
    x2 %Coordinate.datatype; #IMPLIED
    y2 %Coordinate.datatype; #IMPLIED
    gradientUnits ( userSpaceOnUse | objectBoundingBox ) #IMPLIED
    gradientTransform %TransformList.datatype; #IMPLIED
    spreadMethod ( pad | reflect | repeat ) #IMPLIED  
>

So it seems that at least the attribute gradientUnits is not used in the ODF-adapted version of SVG.

If we look at <radialGradient>, we need to cross reference with the corresponding  DTD in SVG. It says:

<!ENTITY % SVG.radialGradient.extra.content "" >
<!ENTITY % SVG.radialGradient.element "INCLUDE" >
<![%SVG.radialGradient.element;[
<!ENTITY % SVG.radialGradient.content
    "(( %SVG.Description.class; )*, ( %SVG.stop.qname; | %SVG.animate.qname;
    | %SVG.set.qname; | %SVG.animateTransform.qname;
    %SVG.radialGradient.extra.content; )*)"
>
<!ELEMENT %SVG.radialGradient.qname; %SVG.radialGradient.content; >
<!-- end of SVG.radialGradient.element -->]]>
<!ENTITY % SVG.radialGradient.attlist "INCLUDE" >
<![%SVG.radialGradient.attlist;[
<!ATTLIST %SVG.radialGradient.qname;
    %SVG.Core.attrib;
    %SVG.Style.attrib;
    %SVG.Color.attrib;
    %SVG.Gradient.attrib;
    %SVG.XLink.attrib;
    %SVG.External.attrib;
    cx %Coordinate.datatype; #IMPLIED
    cy %Coordinate.datatype; #IMPLIED
    r %Length.datatype; #IMPLIED
    fx %Coordinate.datatype; #IMPLIED
    fy %Coordinate.datatype; #IMPLIED
    gradientUnits ( userSpaceOnUse | objectBoundingBox ) #IMPLIED
    gradientTransform %TransformList.datatype; #IMPLIED
    spreadMethod ( pad | reflect | repeat ) #IMPLIED
>

So here the attribute gradientUnits is not used either.

But luckily the good guys at the ODF TC have solved this mystery for us - they have decided that the (not evaluated) attribute gradientUnits is treated as having a value of "objectBoundingBox", regardless of the value actually passed. It's a bit odd, but I suppose it has something to do with the way the SVG-fragments position themselves around the other objects in the document.
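Read literally, the rules quoted from section 14.14.2 amount to a filter on the SVG attributes. Here is a minimal Python sketch of that reading. The attribute lists are taken from the quoted rules, while the function name `effective_attributes` and the dictionary-based representation are assumptions for illustration (the required draw:name attribute is not modelled).

```python
# Attributes that the quoted ODF rules say will be evaluated.
EVALUATED = {
    "linearGradient": {"gradientTransform", "x1", "y1", "x2", "y2", "spreadMethod"},
    "radialGradient": {"gradientTransform", "cx", "cy", "r", "fx", "fy", "spreadMethod"},
}


def effective_attributes(element_name, attributes):
    """Return the attributes an ODF consumer would act on, per the quoted rules."""
    kept = {
        name: value
        for name, value in attributes.items()
        if name in EVALUATED[element_name]
    }
    # gradientUnits is always treated as objectBoundingBox, whatever was written.
    kept["gradientUnits"] = "objectBoundingBox"
    return kept


example = effective_attributes(
    "linearGradient",
    {"id": "g1", "x1": "0", "x2": "1", "gradientUnits": "userSpaceOnUse"},
)
```

In the example, a gradient authored with userSpaceOnUse comes out as objectBoundingBox, which is the kind of silent change of meaning a stock SVG library would not make.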

15.13.13 Line Join

The contents of section 15.13.13 are:

The attribute draw:stroke-linejoin specifies the shape at the corners of paths or other vector shapes, when they are stroked. The values are the same as for [SVG]'s stroke-linejoin attribute, except that the attribute in addition to the values supported by SVG may have the value middle, which means that the mean value between the joints is used.

They have even been so kind to provide us with a schema fragment defining the possible usage of this feature in ODF:

<define name="style-graphic-properties-attlist" combine="interleave">
    <optional>
        <attribute name="draw:stroke-linejoin">
            <choice>
                <value>miter</value>
                <value>round</value>
                <value>bevel</value>
                <value>middle</value>
                <value>none</value>
                <value>inherit</value>
            </choice>
        </attribute>
    </optional>
</define>

Compare this with the DTD of SVG (Appendix A.1.7 Paint Attribute Model):

<!ENTITY % SVG.stroke-linejoin.attrib
    "stroke-linejoin ( miter | round | bevel | inherit ) #IMPLIED"
>

So the attribute value "middle"  is indeed an addition to SVG.
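The comparison can be made mechanical with two sets, using only the values from the two fragments quoted above. The names `SVG_LINEJOIN`, `ODF_LINEJOIN` and `odf_only_values` are made up for this sketch.

```python
# Values copied from the SVG DTD fragment and the ODF schema fragment above.
SVG_LINEJOIN = {"miter", "round", "bevel", "inherit"}
ODF_LINEJOIN = {"miter", "round", "bevel", "middle", "none", "inherit"}


def odf_only_values():
    """Values the ODF schema accepts that the SVG DTD does not."""
    return sorted(ODF_LINEJOIN - SVG_LINEJOIN)


extra = odf_only_values()
```

Going only by the two quoted fragments, the difference contains not just middle but also none, which the prose of section 15.13.13 does not mention.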

Conclusion 

You might be wondering if a couple of additions to/exclusions from SVG are really worth an entire article, and you kinda have a point. However, the devil lies in the details.

The modifications to SVG (even if they are minor) are bad enough as they are, because they basically kill high-fidelity interoperability when using existing SVG-libraries. When you restrict the usage of some component (the limitation on the value of gradientUnits) you basically lose control of how existing data behaves. And when you extend a standard (the addition of the value middle to the stroke-linejoin attribute) you lose control of how your own data behaves when it is used in other scenarios. You know, this is exactly what Microsoft did when they extended not only CSS but also JavaScript. Maybe the memory of the ODF-founders is not that great, but I certainly remember the loads of crap-work we had to do in the late nineties when creating web-pages for "IE5-compatible" browsers and "the rest". In fact - this nightmare still haunts us with the Microsoft additions to JavaScript. Maybe they just thought: "If Microsoft pulled it off, so can we". I think that's a bad choice.

Also, you should note that ODF does not use SVG "as such" at all. It uses fragments of SVG, i.e. elements with the same names and attributes, and then fits them into the overall architecture of ODF. This is hardly "just referencing". As the paragraph quoted above (stroke-linejoin) says, the elements specifying this are not SVG-elements. They are similar to SVG-elements and even extended beyond them. I actually find it really hard to see or understand how the ODF TC can claim - with a straight face - that ODF only references SVG. I suppose that if I made my own JLSMarkup for document formats and used an element called <body>, I would also be able to claim that I was reusing W3C XHTML 1.0. I just don't find it the right thing to do.

My only surprise is that this has not surfaced until now. How anyone can sit down and read the ODF spec (whether pro-ODF or pro-choice) and not be just a little confused about the claim of "just referencing existing standards" is a bit mind-boggling to me. I suppose ECMA could do the same with OOXML and claim "reuse of the HTML DOM in the OOXML-architecture", since a WordprocessingML-document contains both a <body>-element and a <p>-element.

Post scriptum

On his blog, Brian Jones speculated in his last comment on the thread "Why all the secrecy?" whether you could take an existing SVG-drawing, put it into an ODF-document and expect it to work. Well, just like OOXML, ODF has no limitations on what kind of data you might want to put into it, so usage of SVG in an ODF-document is indeed possible from a technical/architectural point of view. It is not a format question but an implementation-specific question. However - will it work?

ODF has several ways to embed data into the document. The two relevant means are inclusion of an SVG-drawing as an image and inclusion of an SVG-image as an object. ODF supports two ways to embed an object, as stipulated in section 9.3.3:

A document in OpenDocument format can contain two types of objects, as follows:

  1. Objects that have an OpenDocument representation. These objects are:
    1. Formulas (represented as [MathML])
    2. Charts
    3. Spreadsheets
    4. Text documents
    5. Drawings
    6. Presentations
  2. Objects that do not have an XML representation. These objects only have a binary Representation, An example for this kind of objects OLE objects (see [OLE]).

 

Well, SVG is clearly XML but it is not an "OpenDocument representation" - but then again, neither is MathML, so I'll opt for using these two methods when trying to embed an SVG-drawing into an ODT-document:

  • Insert the SVG-drawing as an image
  • Insert the SVG-drawing as an XML part using the <draw:object>-element as specified in section 9.3.3 of the ODF spec.

I'll use the latest and greatest release of OOo, OpenOffice 2.3.1 DA, to try to display the files. You can see the SVG-file here: ex.svg (482,00 bytes)

Insert SVG as an image

I have created a small ODT-document and added the SVG-file to it. I have added an SVG-image to content.xml as a regular image and put the SVG-file in a folder by itself. The XML-file content.xml is displayed here below.

<?xml version="1.0" encoding="UTF-8" ?>
<office:document-content
 xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
 xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
>
 <office:body>
  <office:text>
   <text:p >Test of insertion of SVG-image in ODT-document</text:p>
   <text:p >
    <draw:frame
     draw:style-name="fr1"
     draw:name="grafik1"
     text:anchor-type="paragraph"
     svg:width="17cm"
     svg:height="13cm"
     draw:z-index="0">
       <draw:image
      xlink:href="SVG/ex.svg"
      xlink:type="simple"
      xlink:show="embed"
      xlink:actuate="onLoad" />
      </draw:frame>
   </text:p>
  </office:text>
 </office:body>
</office:document-content>

As can be seen, the SVG-image is simply added as a regular image using the ODF-modified version of SVG. The ODT-file is available here: test svg image.odt (1,48 kb). Anyone want to take a guess on what the result of opening this file will be?

 

 

Insert SVG as an "XML-object"

As noted above, ODF allows insertion of objects with an "XML-representation" as just a text file. The construction of the ODF-package is a bit more complicated and I'd be happy if anyone could tell me if I made a mistake - and what the correct way would be. As the basis for my file I have used an ODT-file with a formula in MathML embedded, and so I'll just again show the contents of the content.xml-file here below.

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
  office:version="1.0">
  <office:body>
    <office:text>
      <text:p >Test of insertion of SVG in OOo</text:p>
      <text:p >
        <draw:frame
          draw:name="My SVG drawing [JLS]"
          text:anchor-type="as-char"
          svg:width="1.011cm"
          svg:height="0.467cm"
          draw:z-index="0"
        >
          <draw:object
            xlink:href="./SVG"
            xlink:type="simple"
            xlink:show="embed"
            xlink:actuate="onLoad"
           />
        </draw:frame>
      </text:p>
    </office:text>
  </office:body>
</office:document-content>

Again an xlink reference to the SVG-file is "simply" added to content.xml. The ODT-file is available here: test insert svg.odt (1,48 kb). Anyone want to take a guess on what the result of opening this file will be?

 

 

 

So it seems to recognize the SVG filetype - it just doesn't understand how to process it.

I have a feeling that I might have made an error in the manifest, so I'll include it here and hopefully someone can pinpoint if there is an error:

<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/"/>
 <manifest:file-entry manifest:media-type="image/svg+xml" manifest:full-path="SVG/ex.svg"/>
 <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.image" manifest:full-path="SVG/"/>
</manifest:manifest>
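For anyone who wants to reproduce the package, here is a minimal Python sketch of how such a zip file could be assembled. It assumes the usual ODF packaging convention that the mimetype entry comes first and is stored uncompressed; the function name `build_package` and the placeholder contents are made up, and the sketch makes no claim about whether the manifest above is correct.

```python
import io
import zipfile


def build_package(target, mimetype, parts):
    """Write an ODF-style zip: 'mimetype' first and uncompressed, then the parts."""
    with zipfile.ZipFile(target, "w") as package:
        package.writestr("mimetype", mimetype, compress_type=zipfile.ZIP_STORED)
        for name, data in parts.items():
            package.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


# Placeholder contents; in a real test these would be the files shown above.
odt = io.BytesIO()
build_package(
    odt,
    "application/vnd.oasis.opendocument.text",
    {
        "content.xml": "<placeholder/>",
        "META-INF/manifest.xml": "<placeholder/>",
        "SVG/ex.svg": "<placeholder/>",
    },
)
```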

OOo and SVG

I have said before that the devil lies in the details - but here it actually lies right up-front. You see - OpenOffice.org does not presently (version 2.3.1) support SVG. It doesn't support SVG as regular images and it does not support SVG as a way of providing vector graphics or "line art". You can import SVG-images with OOo, but they are converted to OpenDocument Draw, and OpenDocument Draw data can be exported to SVG. The import/export is not done by OOo itself but by a filter that converts the SVG into the internal ODF Draw format. Support for SVG is apparently the single most requested feature in OOo, so maybe it will soon be a part of OOo. Also take a look at the "General note" on the "Unsupported SVG features"-page of the filter:

SVG and what's named SVG-compatible in OpenDocument is really different. Therefore, the import filter can only approximate the SVG contents.

Ooh - and incidentally - the way ODF and OOo handle SVG is exactly the same way OOXML and Microsoft Office 2007 handle MathML.


ECMA has sent out the final responses

Yesterday was the day when the last responses from ECMA were made available to the national bodies around the world. ECMA has thereby responded to all of the roughly 3500 comments that came in during the processing of DIS 29500 in the summer/autumn of 2007.

While working with the standard and following the discussions about it over the summer, I could not help thinking that a great many of the comments were pure nonsense or at best irrelevant. They seemed to be written on the principle of "if I deliberately try to misunderstand this - where is that easiest to do?" (e.g. OLE). Clearly there were many good comments, but many of them were factually rubbish.

But I must admit, now that I sit and look at the result of the processing of the comments, that the total body of comments has resulted in a standard that in many ways is better than it was before. The standard has quite simply become more precisely worded and generally easier to use. That is definitely worth an appreciative nod to all the people who (whether they are for or against OOXML) have combed through the proposed standard. Thank you! It is worth emphasizing that the standard has not been completely redone - rather, it has been improved in a number of areas where it needed polishing. The architecture itself is the same, so the energy you may have spent on adopting the existing ECMA-376 is definitely not wasted. The points where I think the biggest improvements have been made are:

  • There is no longer any requirement to use VML in new documents
  • Country codes must now be specified as defined in RFC-4646
  • It is clearer that OOXML should use existing, well-tested hash algorithms such as those specified in FIPS-180
  • The conformance requirements have become clearer
  • The famous "leap year bug" is now marked as deprecated
  • It is possible to use dates before 1900
  • The formula specifications for spreadsheets are now described in EBNF-notation

And what about the rest of the many comments, such as "Compatibility-elements"? Well - I just mentioned the parts that I think are the most important (and I have surely forgotten some other important ones).


Another bucket of responses from ECMA

ECMA is now ready with another bucket of responses to the various countries in connection with the work on DIS 29500. From their press release it can be seen that ECMA has now responded to 92% of the 3500 comments received, and it looks like they will manage to deliver all responses before the deadline on Monday, 14 January 2008. Of the responses to the Danish comments only very few now remain to be dealt with, and it will be exciting to see what ECMA answers on the last points.

One thing I have had a hard time figuring out is whether ECMA will be allowed by ISO/IEC to publish a new consolidated standard with all corrections included. Do any of you readers have this information? As far as I read the JTC1-directives, they may not publish the individual dispositions themselves, nor the comments from the countries, so the only way to get the responses to the comments out is presumably to publish the final, full report. Personally I don't think ECMA will publish the full, revised standard until after the BRM in February - but regardless of the outcome, it is a bit of a choice between cholera and plague.

I must honestly admit that I have enjoyed the peace and quiet to work during the last months after 2 September 2007, and especially after ECMA began circulating the responses to the individual countries. Clearly there has not been much debate - actually much less than I had expected - but the individual countries are also in a completely different situation now. In the first part of the 5-month ballot period it was, in my eyes, a clear advantage that OOXML was discussed so broadly, because it uncovered a long list of shortcomings and awkward aspects of the standard. I doubt that the individual countries could have delivered the same work had it not been for GrokDoc, IBM, Andy and others who combed through the OOXML-spec for errors. There was an imminent risk that the countries would simply have voted "abstain" because they could not understand the spec - just as they did with ODF in December 2006. Fortunately they did not, and the situation now is that the individual countries must assess whether ECMA's responses to the individual comments are good enough. That is of course work of an entirely different nature, and it is my view that here we do not need the key figures from the other side of the river.

But - it will be exciting to see the final result of ECMA TC45's work.
