Microsoft’s Doug Mahugh: Inside the real OOXML debate

The Office Open XML format may or may not be ratified by the ISO, but in either event, it will still be a driving force in millions of the world’s offices. So one way or the other, its senior product manager tells us, the interoperability debate will be resolved.

From the outside, the debate over whether the International Organization for Standardization should formally ratify Microsoft’s Office Open XML format as international standard DIS 29500 seems almost completely political. And during last month’s ballot resolution meeting (BRM), the reports on how that debate was proceeding were so wild and uncorroborated as to be almost unintelligible.

Perhaps the only people making sense of it all were on the inside, and one of them was Microsoft’s OOXML senior product manager, Doug Mahugh. The purpose of the BRM was not to cast a final vote or verdict, but instead to bring to a head the literally thousands of concerns software engineers and concerned parties from around the world raised about the format’s viability. But with some notable exceptions, most of those concerns were not at all political — in fact, they may not even be the kind of stuff one writes BetaNews articles about.

Source: BetaNews

What’s truly interesting about the process is that despite all the apparent politics along the outskirts, at the core, the debate appears to be centered around making things work. BetaNews spoke to Mahugh at length about his impressions of what the process may end up teaching us all, regardless of its eventual outcome.

We began by relating to him the story of Beihang University’s ongoing efforts to build a translator between OpenDocument Format and the Chinese standard UOF. Though Sun Microsystems Chairman Scott McNealy had suggested that those two formats could be better suited for the community if they were merged, a Beihang team report last spring indicated that such a feat might be technically impossible. The primary reason was that the basic grounding concepts for the two formats were somewhat different: they start, if you will, from different positions.

Some say ODF and OOXML start from different positions. Doug Mahugh was in Beihang at the time the report was written, and met with the translator project’s developers.

SCOTT FULTON, BetaNews: You’re dealing with a team that, just a few weeks ago, bound itself to a decree of interoperability. But are there times in your line of work where your duty to be open smashes head-on with your ability to provide a one-to-one mapping between the specifications you deal with and those in other people’s minds?

DOUG MAHUGH, Senior Product Manager, Office Open XML, Microsoft: The way I would answer that is to step back and look at just interoperability between document formats in general — let’s say, between PDF and HTML. It’s always very difficult to do this one-to-one mapping approach. You can always find things that map — for example, the concept of a paragraph. UOF, ODF, [and] Open XML all have that concept of some structural unit called a “paragraph.” So that’s kind of low-hanging fruit, that’s an easy one. But once you get into how to style text and how to organize the semantics or the structure of the document, each format reflects a slightly different philosophy. And I think that’s what the folks at Beihang University came up against.

I was there doing a workshop on Open XML for the people you’re talking about, the first week of April of last year. I was there during their research phase that led to the report. But [with regard to] that one-to-one mapping, I saw some of their work there, and it is a complicated, subjective task. If there’s one thing that has often been missing in the public debate about it, it’s the fact that there are subjective interpretations or decisions to be made about how that mapping could best work. There’s not just one canonical mapping out there that everyone could use; it’s very much a design question, and there are many different ways to approach it. So I think that’s what the UOF group has run into [is] just the reality and the complexity of making all those decisions, given that they are subjective decisions and they all fit together in one big story of interoperability.
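To make the “paragraph” example concrete, here is a minimal sketch in Python of the easy part of such a mapping. It is an illustration only, not how the Beihang translator or any shipping converter works. The two namespace URIs and the element names (`w:p`, `w:r`, `w:t`, `text:p`) come from the published specifications; the function name `ooxml_paragraph_to_odf` and the sample markup are hypothetical. Notice that the sketch simply throws the bold formatting away, which is exactly where the subjective design decisions Mahugh describes would begin.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the two specifications.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def ooxml_paragraph_to_odf(w_p):
    """Hypothetical helper: map one <w:p> to an ODF <text:p>, text only."""
    # Join every <w:t> text node in document order. Run-level formatting
    # (<w:rPr>) is deliberately dropped; mapping it is the hard part.
    text = "".join(t.text or "" for t in w_p.iter(f"{{{W_NS}}}t"))
    odf_p = ET.Element(f"{{{TEXT_NS}}}p")
    odf_p.text = text
    return odf_p


# Made-up sample: one paragraph with a bold run followed by a plain run.
SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Hello, </w:t></w:r>"
    "<w:r><w:t>world</w:t></w:r>"
    "</w:p>"
)

converted = ooxml_paragraph_to_odf(ET.fromstring(SAMPLE))
print(converted.tag)
print(converted.text)
```

The structural unit carries over cleanly, but the styling does not: a real translator would have to decide whether bold becomes an ODF automatic style, a named style, or something else, and different teams could reasonably choose differently.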

SCOTT FULTON: I got the feeling that, all through this debate, there were folks who came to the conclusion that since Open Document Format was already standardized, that would have become already the objective namespace for how things should be mapped internationally…and that anyone else’s interpretation, be it China’s or Microsoft’s or Corel’s, thus becomes a subjective interpretation that must be objectively unstrung, like undoing strands of spaghetti, in order to make it parallel with the objective interpretation. There’s another line of thinking that says ODF itself is a subjective interpretation [one among many], but that the standardization process is really a ratification of that subjective interpretation as a viable approach.

DOUG MAHUGH: On the comparisons to ODF, one thing that I think is an interesting aspect is, look at Patrick Durusau’s recent statements as the editor of the original ODF spec. I spent a lot of time with Patrick in the US V1 [INCITS] technical committee, and I know he has a fairly consistent view on that. He’s a big fan of ODF, and he’ll tell you point-blank that ODF is his favorite document format, and has some pride of authorship there. But at the same time, the idea of extending ODF to include everything Open XML does, his view is that they started from different goals, and that in the case of the Open XML format, compatibility with this huge corpus of existing documents was the fundamental goal, originally.

In ODF, there was a different set of goals that started with the StarOffice formats, of course, and then they tried to come up with a more generic, minimal subset approach, if you will. And it’s not clear that it’s possible to extend that as far as would be necessary to encapsulate everything Open XML does, technically, without breaking some of the design assumptions there. An interesting point of reference on that: The DIN working group in Germany that is an international consortium of people looking at these details of how those two formats might or might not map to one another, that’s where we expect to see some of the most definitive work, defining what these philosophical differences might be, and what the technical solutions to some of those challenges might be. It’s a big, complicated topic, and nobody right now knows exactly what the best mapping would be, but there is some debate about whether it would even be technically possible to extend ODF as far as would be necessary to encapsulate the design goals of Open XML, in addition to some of its original design goals.

SCOTT FULTON: I’ve read some of Durusau’s recent statements — but you know him personally, I just have bits and pieces of semantics to go on. I gather that his impression is that the process of putting a format like his and like yours under international scrutiny can only improve it, and can in so doing only help weed the garden, if you will.

DOUG MAHUGH: Yes, and I think you see that very much reflected in the evolution of Patrick’s view. Keep in mind that, last July — this is all public now, the V1 votes that took place — Patrick, at that time, voted for disapproval of DIS 29500, and now he is very publicly in favor of approval of DIS 29500, and as he would tell you, that’s based on his view that the process has worked well over the last nine months. He’ll be quick to say it hasn’t been flawless, and he has specific ideas of how little things might be improved, but overall, he feels that the DIS 29500 spec is in much better shape now than it was nine months ago, and based on that, that’s why he’s now recommending approval. This is how standards are supposed to work, we all get together and hammer it out.

So have the online community and the development community grown so out of sync with one another’s interests that one can’t tell any more when the other is solving a problem rather than creating new ones? We asked Microsoft’s Doug Mahugh about the rather curious perceptions of last week’s ISO Ballot Resolution Meeting:

DOUG MAHUGH: What was really great about the BRM was, you had all these experts in the room talking about specific details. So it wasn’t quite like the more abstract discussions you see in the blogosphere; people would talk about these specific words in this specific section of the spec, and it was kind of refreshing how specific and concrete the discussion was there.

SCOTT FULTON: I got the impression that folks in the blogosphere might have preferred it if this were like “WWE Smackdown,” or some similarly polarized form of thumb wrestling going on.

DOUG MAHUGH: Yeah, let’s face it, some of the details of something like the specification are not really exciting, flashy things. It’s low-level, technical details, and if you have a blog and you want to drive traffic, that’s not always the best strategy to talk about those little details in the standards process.

SCOTT FULTON: The debate over how to implement drop caps, when the drop cap is a graphic.

DOUG MAHUGH: Exactly, it’s hard to spin that into a controversy. And that’s what was cool about the BRM. There was just so much meaty, technical discussion. We just had such a cast of experts.

The standards process, Mahugh told us, is almost on a separate track from not only what tech media bloggers are talking about, but also from what everyday developers are focused on. Developers, he said, aren’t even asking him about the ISO. They want to know when the next set of tools will be available. Part of that question was answered this morning, with the announcement that version 1 of the Office Open XML SDK would be released to developers next month, with the CTP for version 2 to follow as soon as this July.

DOUG MAHUGH: On a personal note, I’ve traveled a lot and done some of these developer workshops, and amongst the people who are really writing code around the formats, this news about the SDK is very positive news for those people. It’s something I’ve fielded a lot of questions about over the last nine months since we released the CTP.

SCOTT FULTON: Assuming all goes well with the standards process, what does Microsoft do at this point to help foster a development community around this new standard, which then no longer really belongs to it exclusively?

DOUG MAHUGH: That’s a great question…We have a few things that we’re involved in: There’s the Open XML Developer group Web site, and there, [you’ll find] many code samples already, and we have some people we’re working with who are going to put more code samples up there. In fact, in the next few days, we’ll have some samples around the new SDK. That’s the centerpiece of our strategy for building the developer community, and getting them the tools they want.

Then also, we have a few specific bloggers who have gone into quite a bit of technical detail in the recent past, and we’ll be doing a lot more of that. Eric White in particular, also James Newton-King, and of course, Brian Jones’ blog and my blog, after we get beyond the standards process here, I think we’ll be able to get to more technical stuff as well. So a combination of blogging and the Open XML Developer site is the main way that we’re looking at driving that.

SCOTT FULTON: And if there’s a negative outcome to the standards process, I take it that won’t drive down momentum toward continuing this as an interoperability project? Don’t you still have the leverage you need to drive an international community of developers for outside applications, line-of-business applications, OBAs around the OOXML format?

DOUG MAHUGH: At this point, so close to the end of the process, I don’t really want to speculate about outcomes. But I would say in general terms that the work to be done to empower developers with tools to work with the Open XML format, and how we can educate the developer community on the benefits of the format, and how to work with them, that doesn’t really change based on anything to do with the standards process. The same group of developers is going to be interested in building the same sorts of applications, regardless of how things play out on that side of the fence.

To me, that’s something kinda fun about planning some of the developer awareness, developer education work that we have coming up: It’s not really affected by some of these other things, it’s very much an independent track, getting developers up to speed.