a 'mooh' point

clearly an IBM drone

Moving towards OOXML(S)

Some time ago I wrote a bit about what Microsoft had managed to get into Microsoft Office 2010 CTP1 (or rather, about the parts I had tested). As you might recall, the results were rather slim, so I wrote to Microsoft to ask whether that was really it. Many have feared that Microsoft will never, ever care at all about the strict conformance clause of ISO/IEC 29500, and my tests were clearly a sign that they were right. Heck, some even mentioned that "the only choice for Microsoft is to avoid adding new BRM features in their OOXML files".

On the other hand, I have always regarded big companies like Novell, IBM, ORACLE etc. as rather simplistic in their development cycles - they'll always choose the path of least resistance. Microsoft is no different here, and moving towards strict while side-tracking nasty legacy stuff like VML is clearly an attempt to make the development path easier in the future.

The list I got was this:

File type | Feature | Comment | Status
DOCX | Ink Drawings | Previously used VML, now uses DrawingML | Warning, yellow traffic light
XLSX | Ink Drawings | Previously used VML, now uses DrawingML | Warning, yellow traffic light
PPTX | Ink Drawings | Previously used VML, now uses DrawingML | Warning, yellow traffic light
DOCX | Legacy Diagrams | Previously used VML, now uses DrawingML | Warning, yellow traffic light
XLSX | Legacy Diagrams | Previously used VML, now uses DrawingML | Warning, yellow traffic light
PPTX | Legacy Diagrams | Previously used VML, now uses DrawingML | Warning, yellow traffic light
DOCX | Drawing Shapes | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Textboxes | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | WordArt | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Groups | Previously used VML, now uses DrawingML | Warning, yellow traffic light
XLSX | Form Controls | Previously used VML, now uses DrawingML - except on "chart sheets" | Warning, yellow traffic light
XLSX | ActiveX Objects | Previously used VML, now uses DrawingML | Warning, yellow traffic light
PPTX | ActiveX Objects | Previously used VML, now uses DrawingML | Warning, yellow traffic light
XLSX | OLE Objects | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
XLSX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
PPTX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
XLSX | ISO-dates | Can persist dates in ISO-8601 format and avoids the "evil" serial dates | Success, green traffic light

(The last four were added by me and didn't appear on the list from Microsoft.)

Now, "someone" once wrote to me that you shouldn't make any decisions based on what Microsoft says they will do - you should wait until they actually act. I couldn't agree more, so I tried to test the list I received. I have tested the lines marked with a green traffic light by first creating a document in Microsoft Office 2007 to verify the usage of e.g. VML, and then creating the same document in Microsoft Office 2010 Beta [']. The lines marked with yellow traffic lights have not been tested by me, since I frankly don't have the Office skills to create such a file (what the hell is an "Ink Drawing", btw?). If anyone can test this, I'd be happy to update the list. Also, regarding the lines about ST_OnOff, I have tried to create files that would contain the "bad" on/off values, but I haven't succeeded in this. That is not the same as deterministically verifying that it cannot be done, so again - if you can create a file in Microsoft Office 2010 with the bad values, send it to me and I'll update the list.
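
If you want to repeat the VML check on your own files, something along these lines might do. It is only a sketch of how I would go about it, not the procedure behind the list above: it opens the package as a zip file and reports the parts that contain elements in the VML namespace. The function names and the little demo package are made up for the illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# ElementTree writes qualified names as "{namespace-uri}local-name".
VML_TAG_PREFIX = "{urn:schemas-microsoft-com:vml}"


def parts_using_vml(package):
    """Return the names of the parts in an OOXML package that contain VML elements.

    `package` is a file path or a binary file-like object.
    """
    hits = []
    with zipfile.ZipFile(package) as archive:
        for name in archive.namelist():
            if not name.endswith((".xml", ".vml")):
                continue
            try:
                root = ET.fromstring(archive.read(name))
            except ET.ParseError:
                continue  # skip parts that are not well-formed XML
            # A namespace declaration alone does not count; look for elements.
            if any(el.tag.startswith(VML_TAG_PREFIX) for el in root.iter()):
                hits.append(name)
    return sorted(hits)


def build_demo_package():
    """Build a tiny in-memory zip standing in for a real DOCX (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "word/document.xml",
            '<doc xmlns:v="urn:schemas-microsoft-com:vml"><v:shape id="s1"/></doc>',
        )
        # Declares the VML namespace but never uses it.
        archive.writestr(
            "word/styles.xml",
            '<styles xmlns:v="urn:schemas-microsoft-com:vml"/>',
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(parts_using_vml(build_demo_package()))
```

Parsing the parts instead of grepping for the namespace URI matters, since a part can declare the VML namespace on its root element without using it.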

So getting back to "don't trust Microsoft as far as you can throw them", this is in no way a definitive list. The list is based on Microsoft Office 2010 Beta, and much can happen before the final RTM - both in the right direction with even more things being fixed, but also in the wrong direction with things being pulled off the list again (WinFS, anyone?). But for those of us not implementing complete Office suites but "merely" interacting with the ecosystem by generating files, this is undoubtedly good news. Add to this that Microsoft confirmed a few TC-calls ago in WG4 that, pending the current AMD1-ballot, Microsoft would add the new namespaces of strict files to the white-list of known namespaces in Office 2010. This effectively means that Microsoft Office will be able to load (some) strict files, and if you just happen to generate PPTX-files with embedded objects, you'll likely never again have to generate markup like this:

[code:xml]<w:object w:dxaOrig="15" w:dyaOrig="15">
    <v:shapetype
        id="_x0000_t75"
        coordsize="21600,21600"
        o:spt="75"
        o:preferrelative="t"
        path="m@4@5l@4@11@9@11@9@5xe"
        filled="f"
        stroked="f">
        <v:stroke joinstyle="miter"/>
            <v:formulas>
                <v:f eqn="if lineDrawn pixelLineWidth 0"/>
                <v:f eqn="sum @0 1 0"/>
                <v:f eqn="sum 0 0 @1"/>
                <v:f eqn="prod @2 1 2"/>
                <v:f eqn="prod @3 21600 pixelWidth"/>
                <v:f eqn="prod @3 21600 pixelHeight"/>
                <v:f eqn="sum @0 0 1"/>
                <v:f eqn="prod @6 1 2"/>
                <v:f eqn="prod @7 21600 pixelWidth"/>
                <v:f eqn="sum @8 21600 0"/>
                <v:f eqn="prod @7 21600 pixelHeight"/>
                <v:f eqn="sum @10 21600 0"/>
            </v:formulas>
            <v:path
                o:extrusionok="f"
                gradientshapeok="t"
                o:connecttype="rect"/>
        <o:lock v:ext="edit" aspectratio="t"/>
    </v:shapetype>
    <v:shape
        id="_x0000_i1025"
        type="#_x0000_t75"
        style="width:.75pt;height:.75pt"
        o:ole="">
        <v:imagedata r:id="rId4" o:title=""/>
    </v:shape>
    <o:OLEObject
        Type="Embed"
        ProgID="opendocument.WriterDocument.1"
        ShapeID="_x0000_i1025"
        DrawAspect="Content"
        ObjectID="_1327745060"
        r:id="rId5"/>
</w:object>[/code]

['] I have tried, in vain, to get my hands on the latest pre-release edition of Microsoft Office 2010 ... so much for being a drone, when you can't get your hands on the latest bits :-(

 

Microsoft Office 2010 Beta, ODF and leap-year-bug

Some time ago I did some tests of Excel in Microsoft Office 2010 (CTP). Those tests were about OOXML - a test of the ODF-support was missing.

One of the things that is in OOXML but missing from ODF is the leap-year-bug ... although most of us probably don't miss it all that much. The leap-year-bug is the good ol' Lotus 1-2-3 bug that treats 1900 as a leap year. As a consequence, calculations that combine dates in the range from January 1st 1900 to February 28th 1900 with dates after this period will be off by one day.

Since Microsoft Office supports (a subset of) ODF, I thought it'd be fun to look at how Excel 2010 handles the leap-year-bug.

The first thing to do is to show how the leap-year-bug is handled by Excel:

So adding a day to February 28th 1900 results in the non-existent date February 29th 1900, and if you subtract February 27th 1900 from March 2nd 1900 (you'd expect a value of 3), you actually get a value of 4.
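
For those who like to see the arithmetic spelled out, here is a small sketch that emulates the buggy serial numbers next to a correct calendar. It is my own emulation of the behaviour described above, not anything taken from Excel, and the name `lotus_serial` is made up.

```python
from datetime import date

# In the 1900 date system, serial number 1 is January 1st 1900.
DAY_ZERO = date(1899, 12, 31)


def lotus_serial(year, month, day):
    """Serial number of a date on or after 1900-01-01, leap-year-bug included."""
    if (year, month, day) == (1900, 2, 29):
        return 60  # the date that never existed
    real_days = (date(year, month, day) - DAY_ZERO).days
    # From March 1st 1900 onwards every serial is shifted by the phantom day.
    return real_days + 1 if real_days >= 60 else real_days


if __name__ == "__main__":
    buggy = lotus_serial(1900, 3, 2) - lotus_serial(1900, 2, 27)
    correct = (date(1900, 3, 2) - date(1900, 2, 27)).days
    print(buggy, correct)  # 4 3
```

Python's own `date` type refuses to represent February 29th 1900, which is why the phantom day has to be special-cased.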

So what will happen if you save this spreadsheet in ODF-format and open it again in Excel? You might expect that, since it was round-tripped through a format not supporting the leap-year-bug, the calculations would now be correct.

... but you'd be wrong. The result is exactly the same:

As I was, you might be wondering how the hell that was possible. But a simple inspection of the markup generated by Microsoft Excel 2010 reveals the answer:

[code:xml]<table:table-cell
  office:value-type="date"
  office:date-value="1900-02-29T00:00:00"
  table:formula="msoxl:=A2+1">
  <text:p>29-02-1900</text:p>
</table:table-cell>[/code]

A quick-and-dirty conclusion to this would be that Microsoft Excel 2010 violates not only ODF but also xsd:dateTime, since February 29th 1900 is not a valid xsd:dateTime value. However, an inspection of ODF reveals that this is not the case. Microsoft Office claims conformance to ODF 1.1, and ODF 1.1 states the following about the value-space of the attribute "office:date-value" (Section 16.1, p 702):

A dateOrDateTime value is essentially an [xmlschema-2] date and time value with an optional time component. In other words, it may contain either a date, or a date and time value.

So strictly (*giggle*) speaking, Microsoft Office 2010 does not violate ODF 1.1.

However - specifying an invalid date in an attribute that might contain xsd:date values is not very smart, dear Microsoft. Those of us wanting to use standard libraries to process the content of an ODF-document will likely get unpredictable results when trying to parse this invalid date. Heck, even .NET's DateTime.Parse()-method throws an exception when trying to parse this value.
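
It is not only .NET that chokes. As a rough sketch, assuming you read the cell with Python's standard library, the same thing happens. The namespace declarations were added by me to make the fragment parse on its own, and `read_date_value` is a made-up helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# The cell from above, with namespace declarations added so it parses alone.
CELL = """\
<table:table-cell
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:value-type="date"
    office:date-value="1900-02-29T00:00:00"
    table:formula="msoxl:=A2+1">
  <text:p>29-02-1900</text:p>
</table:table-cell>
"""


def read_date_value(cell_xml):
    """Return the cell's office:date-value as a datetime, or None if unusable."""
    cell = ET.fromstring(cell_xml)
    raw = cell.get(f"{{{OFFICE_NS}}}date-value")
    if raw is None:
        return None
    try:
        return datetime.fromisoformat(raw)
    except ValueError:
        # 1900 was not a leap year, so February 29th 1900 is rejected.
        return None


if __name__ == "__main__":
    print(read_date_value(CELL))  # None
```

Whether a consumer should return nothing, fall back to the text content or give up entirely is a design decision; the point is only that the value cannot be read as a date.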

Also, ODF TC has tightened up the prose in ODF 1.2 and it is now:

A dateOrDateTime value is either an [xmlschema-2] date value or an [xmlschema-2] dateTime value.

So Microsoft Office 2010 might not violate it now - but it will when ODF 1.2 comes out.

Extending ODF

Microsoft could always opt for extending ODF using the extension mechanism (adding elements and attributes in a foreign namespace). So Microsoft could choose to add their own attribute to the <office:spreadsheet>-element, saying something like

[code:xml]<office:spreadsheet mso:EnableLeapYear="true"/>[/code]

The problem with this approach is that it comes into conflict with the new conformance clauses of ODF, where a clear distinction between "normal" documents and "extended" documents is made. Procurement-wise it is a big no-no only to support extended documents (look what happened in Denmark!), and Microsoft risks that some government somewhere decides not to use Microsoft Office due to lack of conformance to the "normal" conformance clause of ODF 1.2.

Thus, Microsoft needs to find another solution ...

Configuration to the rescue!

Luckily for Microsoft (and we all know how picky they are wrt "preserving functionality" etc), there is a fully compliant way out of this while still preserving the leap-year-bug in spreadsheets - regardless of persistence format.

As you probably know, the so-called config-item-sets are a gold-mine of endless possibilities. Originally (until ODF 1.1) the purpose of these elements and attributes was to store application specific settings, like (and this is a quote from ODF 1.1) "document settings, for example a default printer or view settings, for example zoom level". In ODF 1.2, all bets are off and there are no restrictions on the usage of the elements. The config-item-set elements were never meant to be an extension mechanism (according to the ODF TC co-chair from Sun/ORACLE - go figure), but OpenOffice.org uses them extensively - in fact, when creating a "blank" text-document, spreadsheet or presentation in OpenOffice.org, a total of 228 settings (76 for text documents, 66 for spreadsheets and 86 for presentations), none of which are described in ODF, are defined in the settings.xml-file of the packages. Somehow ODF TC has not found it necessary to include usage of config-item-sets in the "extended conformance clause", so a document can claim 100% conformance to ODF 1.2 "normal documents" while throwing dozens of settings into config-item-set elements. So the solution to claim conformance to ODF while enabling the leap-year-bug is simply

[code:xml]<config:config-item-set config:name="mso:spreadsheet-settings">
  <config:config-item
    config:name="EnableLeapYearBug"
    config:type="boolean">true</config:config-item>
</config:config-item-set>[/code]

This should be combined with this markup for the specific cell

[code:xml]<table:table-cell
  office:value-type="float"
  office:value="60"
  table:formula="msoxl:=A2+1">
  <!--<text:p>29-02-1900</text:p>-->
</table:table-cell>[/code]

(and you don't really need the bit I have commented out).
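
To see how a consumer could play along, here is a rough sketch. Everything in it is hypothetical, just like the setting itself: it looks for the made-up EnableLeapYearBug item and, if it is switched on, renders the float value 60 as the date that never was. The function names are mine, and the namespace declaration was added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"

# The hypothetical setting from above, with the namespace declared.
SETTINGS = """\
<config:config-item-set
    xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
    config:name="mso:spreadsheet-settings">
  <config:config-item config:name="EnableLeapYearBug" config:type="boolean">
    true
  </config:config-item>
</config:config-item-set>
"""


def leap_year_bug_enabled(settings_xml):
    """Return True if the (hypothetical) EnableLeapYearBug item is set to true."""
    root = ET.fromstring(settings_xml)
    for item in root.iter(f"{{{CONFIG_NS}}}config-item"):
        if item.get(f"{{{CONFIG_NS}}}name") == "EnableLeapYearBug":
            return (item.text or "").strip() == "true"
    return False


def display_date(serial, bug_enabled):
    """Render a 1900-system serial number as DD-MM-YYYY."""
    if bug_enabled:
        if serial == 60:
            return "29-02-1900"  # the phantom day
        if serial > 60:
            serial -= 1  # undo the shift caused by the phantom day
    real = date(1899, 12, 31) + timedelta(days=serial)
    return real.strftime("%d-%m-%Y")


if __name__ == "__main__":
    print(display_date(60, leap_year_bug_enabled(SETTINGS)))  # 29-02-1900
```

Without the setting, the same value 60 comes out as 01-03-1900, so the cell content alone no longer tells you which date was meant.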

I don't know about you, but I find this just darn right fantastic!

:-)