Striking a Balance: Simplicity vs. Richness in XML Architecture
Our devices are getting smarter, and developers are finding interesting ways to improve the effectiveness of applications that make our lives easier. For example, we can talk to Siri, Cortana, Alexa, or a range of other personal assistant apps, speaking a request or asking a variety of questions, and receive useful or perhaps intentionally funny responses. We also receive unintentionally funny responses or responses that are not what we are looking for, such as when Siri responded to a request for help with a gambling problem by providing a list of casinos. We are, after all, in the early stages of artificial intelligence. Currently, much of what our computers know still has to be defined by us humans.
We provide much of that intelligence through markup. The richer our markup, the more control we have over our content. To illustrate, we can go beyond using markup to distinguish between paragraphs, lists, and other document constructs to defining multiple types of lists that might contain different categories of information or need specific formatting (author lists, glossary lists, and so on). With that understanding, schemas have bloomed. DITA is the best-known example; the latest DITA specification defines over 600 elements. However, OASIS, the organization responsible for developing DITA, is not the only group to maintain a policy of inclusion. We’ve seen a number of proprietary schemas developed in this fashion: XML architects often err on the side of caution, creating elements that are rarely used, or that are not needed yet but may one day be useful.
In addition to creating schemas with a lot of vocabulary, XML architects have also been known to create schemas with fairly complex grammar. Sometimes the structure of the XML document relies heavily on nesting of elements. Some schemas will have a wide variety of required and optional elements and attributes. Furthermore, schema design sometimes mixes use of elements and attributes such that inconsistencies cause confusion.
The problem is that, for now, we rely on people to add XML markup to documents. Traditionally, XML authoring systems have been powerful yet difficult to use. Either the system requires users to be experienced and well trained or the authoring tool needs to have been highly customized. SyncroSoft’s <oXygen/> XML editor, for example, can be used with any XML schema you choose to define and is fairly easy for anyone with development experience to customize, but the out-of-the-box experience for the end user is not at all intuitive for someone new to XML. It’s easy for the new user to insert an element in an invalid position, and though <oXygen/> prompts the user with warnings and explanations, it can still be a frustrating experience for someone who doesn’t have the context for interpreting those warnings.
For these reasons, many organizations have resisted implementing XML-based workflows for their business-critical content. As a result, we’re in the midst of a counter-revolution in document XML intended to simplify XML. We can see evidence of this counter-revolution in schema movements, such as that of Lightweight DITA. Such proposals not only trim down the number of elements to a bare minimum but also remove much of the nesting and other structural complexities. We also see the trend to simplify document XML in tools like Simply XML and Quark XML Author, two XML authoring systems that allow users to continue authoring in the familiar environment of Microsoft Word. Quark Author, a web-based XML authoring system, uses both a lean XML schema called Smart Content on the back end and an easy-to-use interface on the front end to take XML simplicity to the next level.
The challenge, of course, is not to oversimplify; otherwise the markup loses the very power that XML was intended to provide. For instance, CommonMark, a flavor of Markdown, was dreamed up as an alternative to XML/HTML that lets users type in a text editor and still get some special rendering, but if you need anything more complex than emphasis or lists, you literally have to type the required HTML markup into the editor. CommonMark, then, has a very limited application.
Life will continue to get easier for content creators as our tools get smarter, but while humans are responsible for the richness of intelligent content, a balance needs to be struck between simple and powerful. Go too simple, such as with CommonMark, and your authors lose the ability to add much context to the words written. On the opposite end of the spectrum, a schema too descriptive becomes difficult for authors to use effectively. The sweet spot is XML nirvana.
About Autumn Cuellar
Autumn Cuellar has had a long and happy history with XML. As a researcher at the University of Auckland in New Zealand, Autumn co-authored a metadata specification, explored the use of ontologies for advancing biological research, and developed CellML, an XML language for describing biological models. Since leaving the academic world, Autumn has been delighted to share her enthusiasm for XML in technical and enterprise applications. Previously at Design Science, her roles included MathML evangelism and working with standards bodies to provide guidance for inclusion of MathML in such standards as DITA and PDF/UA. Now at Quark Software, Autumn provides her XML expertise to organizations seeking to mask the XML for a better non-technical user experience.
Note: This article was originally published in the June 2017 issue of CIDM eNews.
XML vs JSON
A funny thing happened on the way to XML’s world domination of the dissemination of written, document-oriented content: the data exchange world hijacked XML’s value and kept it for many years. Now JSON has the attention of web developers for data transactions – is XML in the way?
Getting Our Definitions Straight
For data (as used here, ‘data’ refers to relational or otherwise highly structured, discrete information such as financial data), XML and JSON are two sides of the same data description coin: either can be called and the game will be played. JSON works best for web-only developers, but learning XML isn’t too hard, and the supporting resources are widely available, many of them free and open source.
For documents (as used here, ‘documents’ means a mix of authored prose, multimedia, and data meant for presentation to a content consumer), XML is still the dominant open-standard format for semantically rich content automation applications such as Quark Enterprise Solutions and modern word processing tools such as the Microsoft Office suite – though the purpose, use, and value of XML are significantly different between these document-focused solutions.
History Lessons for Your Markup Language of the Day
XML became an official W3C recommendation in February of 1998. At my previous company, two team members worked on the XML standard for several years alongside a who’s who of document and hypertext technologists. The whole idea of XML, as driven by Jon Bosak, then at Sun Microsystems, was to take the benefits of SGML (Standard Generalized Markup Language) and apply them to this new thing called “The World Wide Web.”
I remember how excited we all were when the spec was finally approved. So much attention was now being paid to our corner of the high-tech universe and the idea of having semantic XML content on the web was, to us at least, so clearly valuable. But then, the data jocks overwhelmed us document kids like a high school basketball team coming on the court after the band warms up the crowd.
EDI (electronic data interchange) methods have been around since the early days of computing. By the time XML became a recommendation, the data world was already building a new EDI method that used the web’s HTTP to transport messages and data payloads, with the data package built using XML syntax. This EDI method was called SOAP (Simple Object Access Protocol), and when released by Microsoft and others in 1999, it very quickly became the main hype of XML’s value. All of us document folks were left playing the sad trombone sound while we continued our efforts to make semantically rich content’s value accessible and available to all (and still do today!).
Of course, all was not perfect for XML as an EDI solution. XML is a fairly verbose markup language and therefore the XML data payload can be multiple times larger than the data set it’s describing. And XML requires a robust parser, which has its own rules that were originally targeting document requirements, not the needs of more compact data structures. And lastly, many browsers were slow to adopt XML as a web standard.
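To make the verbosity point concrete, here is a minimal Python sketch that serializes the same small record as XML and as JSON and compares the byte counts with the raw values. The record, its field names, and the `quote` wrapper element are invented for illustration; real payloads will differ, and the ratio depends heavily on the shape of the data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for illustration only.
record = {"ticker": "ABC", "price": "12.50", "currency": "USD"}

# XML form: one child element per field, so every name appears twice.
root = ET.Element("quote")
for name, value in record.items():
    ET.SubElement(root, name).text = value
xml_bytes = ET.tostring(root, encoding="utf-8")

# JSON form of the same record, without optional whitespace.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# The bare values alone, as a rough lower bound on the data itself.
raw_bytes = "".join(record.values()).encode("utf-8")

print("raw:", len(raw_bytes), "json:", len(json_bytes), "xml:", len(xml_bytes))
```

For a record this small, the markup outweighs the data in both formats; XML simply pays more because each element name is written in both the opening and closing tag.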
It’s not XML vs. JSON, It’s Selecting the Right Tool for the Job
An oversimplified answer to “What is the right tool?” looks something like this:
- XML for documents (written content)
- JSON for data (transactions over the web)
Of course there are still many systems that offer SOAP APIs. Further still, the more modern REST (representational state transfer) web API doesn’t really care about the payload format, so many systems may provide both XML and JSON responses (as does Quark Publishing Platform – developer’s choice). But there are definitely gray areas when trying to determine if XML or JSON is the best fit.
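As a rough illustration of a REST endpoint that doesn’t care about the payload format, the following Python sketch serializes one record as either XML or JSON depending on the request’s Accept header. The `render` function, the `asset` record, and its field names are hypothetical; this is not the Quark Publishing Platform API, and the header check is deliberately simplified.

```python
import json
import xml.etree.ElementTree as ET


def render(asset: dict, accept: str) -> tuple[str, str]:
    """Serialize one asset as XML or JSON, based on a simplified Accept check.

    Real content negotiation also weighs q-values and wildcards; this sketch
    only looks for the XML media type and otherwise falls back to JSON.
    """
    if "application/xml" in accept:
        root = ET.Element("asset")
        for key, value in asset.items():
            ET.SubElement(root, key).text = str(value)
        return "application/xml", ET.tostring(root, encoding="unicode")
    return "application/json", json.dumps(asset)


# Hypothetical asset record; the field names are assumptions.
asset = {"id": 42, "name": "report.pdf"}
print(render(asset, "application/xml"))
print(render(asset, "application/json"))
```

The same underlying record feeds both branches, which is why offering both formats is usually a matter of developer’s choice rather than a difference in capability.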
Several standards exist that are used for transacting files and metadata between parties including:
- RIXML (Research Information Exchange Markup Language) used in Financial Services Investment Research publishing
- eCTD (electronic common technical document) used in pharmaceuticals for transmitting drug research to the FDA (US Food and Drug Administration)
- And other, more general metadata standards such as Dublin Core, XMP, and more
What these standards share is the use of XML to describe a package of documents in a way that lets the receiver of the package automate the handling of that package. For RIXML and eCTD, the payload mostly consists of PDF documents. The XML is used to hold the metadata that describes the package (producer, purpose, date, a description of each attached file, etc.). For the metadata “driver” or “backbone” file, XML made sense for many reasons, not the least of which was the contributors developing these standards were XML-knowledgeable folks and the tools and methodologies for creating these standards as XML were widely available.
Of course 27 characters isn’t particularly meaningful, but multiply the size of those messages by 10, 100, or 1000 and the size difference becomes meaningful. Yegor concludes that JSON is great for data sent to dynamic web pages, but he recommends XML for all other purposes.
However, his arguments against JSON were already being addressed (as he admits toward the end of the article) as the JSON world brought more tools to the party such as JSONPath and JSON rules files with validating JSON parsers. JSON features are now reasonably on par with XML, though of course still focused on solving the challenges of transacting data.
A Little More about RIXML – A Good Test Case
If you are technically minded and curious, it might be worth reviewing the RIXML Data Dictionary, jumping to page 21 where the data dictionary begins in earnest. It takes a little over 100 pages to document the entire data semantics structure of the main areas of concern (not including the “sidecars” as RIXML calls them). This results in a metadata file describing the payload for a transaction of what is typically one or more PDF documents.
There is no reason why that structure couldn’t be represented as JSON, but there’s no particularly good reason to do so either. Ultimately what matters is which system receives the RIXML and document payload. In the case of most RIXML processing systems it is likely a backend server using Java or .NET code to parse the RIXML file and then update a database and file system according to agreed-upon business rules.
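The backend step described above can be sketched in a few lines. The example uses Python rather than Java or .NET purely for brevity, and the manifest below is a made-up stand-in: its element and attribute names are assumptions for this sketch and are not taken from the RIXML schema. The `files` table layout is likewise hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical package manifest. These element and attribute names are
# invented for this sketch and are NOT taken from the RIXML schema.
MANIFEST = """\
<package producer="Example Bank" date="2017-06-01">
  <file name="outlook.pdf" description="Quarterly outlook"/>
  <file name="update.pdf" description="Sector update"/>
</package>
"""


def load_package(conn: sqlite3.Connection, manifest_xml: str) -> int:
    """Parse the manifest and insert one row per attached file."""
    root = ET.fromstring(manifest_xml)
    rows = [
        (root.get("producer"), root.get("date"), f.get("name"), f.get("description"))
        for f in root.findall("file")
    ]
    conn.executemany(
        "INSERT INTO files (producer, published, name, description) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (producer TEXT, published TEXT, name TEXT, description TEXT)"
)
print(load_package(conn, MANIFEST), "files recorded")
```

Once the metadata is in the database, the original XML file no longer needs to be consulted for day-to-day queries.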
For example, take a distributor of financial research information that is produced and sent to the distributor by multiple different banks (the reason RIXML was created to begin with!). They receive the RIXML package, process it, store the information in their database or content management system and then present some portion of that information on a web page for subscribers to access. They don’t present the entire RIXML metadata – most of that would be useless to the research consumer. And a RIXML package isn’t really dynamic either – for a particular package, the metadata doesn’t change very frequently, if ever.
The distributor’s system isn’t going to rely on creating a subset of the original RIXML file to send to the browser. No, they’re going to query the system, because it is their single source of truth for content that is available. Delivering query results from the system as JSON to the browser is easier than creating or re-parsing or modifying the original RIXML file.
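A minimal sketch of that last step, assuming the distributor keeps the processed metadata in a relational store: query the store, keep only the fields a subscriber needs, and hand the result to the browser as JSON. The `research` table, its columns, and the `research_as_json` helper are all hypothetical names chosen for this example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts by column name
conn.execute(
    "CREATE TABLE research (title TEXT, producer TEXT, published TEXT, internal_code TEXT)"
)
conn.executemany(
    "INSERT INTO research VALUES (?, ?, ?, ?)",
    [
        ("Quarterly outlook", "Example Bank", "2017-06-01", "X-17"),
        ("Sector update", "Another Bank", "2017-06-02", "X-18"),
    ],
)


def research_as_json(conn, producer):
    """Return only the subscriber-facing fields for one producer, as JSON."""
    rows = conn.execute(
        "SELECT title, published FROM research WHERE producer = ? ORDER BY published",
        (producer,),
    ).fetchall()
    return json.dumps([dict(row) for row in rows])


print(research_as_json(conn, "Example Bank"))
```

Note that the internal bookkeeping column never leaves the server, which mirrors the point that most of the package metadata is useless to the research consumer.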
So an argument could be made to support both XML and JSON in RIXML (and by extension other metadata standards). Unfortunately for the JSON-only audience, the expense to recast those 100 pages of specifications for XML as JSON is non-zero, and a one-time conversion of the XML to JSON is not developer friendly. And for all of this additional effort the benefit would only apply to those that have yet to learn XML.
Long Story Bygones
There are, and always will be, waves of new technology that provide an alternative to, overlap with, completely replace, or partially supplant an existing technology. At the time of XML’s development, its use for transacting data was secondary to its original purpose, which shows just how hungry the data world was for a better-defined standard for transactions. That a more purpose-built, data-friendly format, JSON, was created at close to the same time also highlights how much need there was for improvement and standardization in data transactions.
However, XML is still a fantastic technology for handling documents, metadata, and data, and especially adept at merging all three into a common structure that can be utilized by software for automation and by humans for authoring and consuming. If you are not exclusively processing discrete data transactions, there is a lot of benefit to understanding and utilizing XML and the rich toolsets that are available.
If you are purely a web-data jockey, it would still benefit you to learn XML and the associated tools because: a) you’re likely to run into a system that provides only XML; and b) having some XML skills would open up opportunities to do cool things in the document content domain.
Is Your Architecture Truly Open?
Enterprise system architecture can be evaluated from many different perspectives. Similar to conventional building architecture, a solid system design must consider several different criteria to maximize the pragmatic features of the engineered construction. When we talk about Enterprise Architecture, several criteria are commonly used to gauge the success or failure of the architecture:
- Ease of management
- Ease of integration
- Response to failure
- Reporting and analytics for the above
Among the various best practices is the question raised in the title around open architectures.
Why does open architecture matter?
When engineering an enterprise solution architecture, many criteria matter. Open architecture matters because practically every enterprise solution must meet the non-functional requirement of co-existing in an established technical ecosystem. Enterprise systems must interact with each other to accomplish business tasks and these systems may be provided by separate vendors, built in-house, or rely on 3rd party APIs as a layer for interfacing with various web services and systems.
This article highlights the benefits of truly open and relevant standards as related to enterprise architecture.
Some key terms
Open architecture differs from open source. Open source software involves sharing raw source code to facilitate crowdsharing benefits in building out the code. Open architecture is focused on easily decoupling data from the proprietary layers of the code, so that data can easily be transferred with other systems and business logic. It also has to do with the architecture being easily and highly extensible on the back end, so that any front end (user experience or integration to other applications) can be applied through a robust, fully-featured API. System to system communications must be easy, reliable, and efficient.
Even when an architecture is implemented by proprietary technologies, we still consider it open if the architecture is able to facilitate data transfer in and out of the system and support data transfers across systems. In addition to direct API support, this includes access to system functionality via meaningful layers of abstraction to serve as the glue between business rules and other enterprise systems.
Best practices for enterprise system architecture
Quark Publishing Platform is an Enterprise Java Web Application built on the open source Spring Framework. That means it’s scalable, secure, incredibly extensible, and easily adopted by IT departments with sophisticated and challenging requirements. We support major investment banks with their extremely complex IT requirements, as well as small 20-person shops that just need “out of the box” to work well.
Extensibility is core to every product we build and every enhancement we make:
- We support everything from adding a custom service, which can automatically be exposed through our REST interface, to rebuilding the entire web-based user experience for a custom business portal using nothing but our APIs.
- We support many server-side Java integration points (such as JMS), as well as our robust RESTful interfaces which expose API access to the features a customer needs (and we use the same APIs ourselves for our cross-product integration).
- When there is a gap in the API based on a new customer requirement, we can address it very quickly. In response to customers’ requests, we have added new content markup in Quark Author to drive new publishing features in QuarkXPress Server – all in a single development cycle. This was true for “regions” in Quark Author and in our next release cycle (due in September) we’ll be adding two more: Index term markup and Index sort/format publishing.
- In the content design space, Quark – via QuarkXPress – invented the use of an SDK for 3rd parties to extend the application in ways that were extremely useful. We even created a marketplace for XTensions, as they’re known, that at one time was larger than many software companies in their entirety. Other desktop design software vendors directly duplicated that model in their products. The XTensions model was certainly the first such commercial retail “add-in” marketplace in the 1990s that current Quark CTO Dave White was aware of – long before he was directly involved with Quark.
A tremendous amount of Quark capabilities come from our integration with XML:
- Authoring tools content models
- Most of our products’ configuration files
- REST API posts and responses (also available in robust, modern JSON)
- Even our QuarkXPress Modifier format for automated publishing
It’s a common best practice to avoid attaching your data to a proprietary system or application. At a recent professional conference, Eliot Kimber humorously compared some highly proprietary content management systems to roach hotels: “content checks in but it doesn’t check out.”
At the simplest level, an enterprise component content management system architecture must provide rich methods to extract data. While this is commonly available in one-off operations, it’s important for enterprise systems to consider orders of magnitude when it comes to the execution of any single task. In other words, extracting a single asset or collection of assets from your system is only the beginning. Enterprise applications often need to import or extract data based on a number of factors including business process, queries based on traceability or auditing, or when making global transformations to data entering or exiting an enterprise system. Doing it once is part of the answer, but doing it at scale is often required by customers with requirements for automation during multiple steps of the content life cycle.
How is this done? It depends on the system, but a common best practice is to provide import/export features to various data formats or even .zip archives representing a data dump of variable scope. It’s even better when these transactions can be managed by REST-based calls which may be invoked programmatically. Better still, rich REST-based APIs should provide a headless and efficient means of extracting all assets, including every version and all metadata for each asset, without introducing proprietary structures that interfere with the usefulness of the extract. These mechanisms do very little good if the customer’s original data structure is changed or rendered useless, for example when references use a proprietary model instead of a standards-based approach like a URI pattern. Yes, this still happens.
Every elegant design should seek to simplify the most common tasks for end users, but also consider the administrative needs required to properly care for and feed the system. It’s the rare design that also takes into consideration these additional unforeseen use cases and exposes additional layers of tooling to simplify configuration without high-cost services. Every system architecture should also consider its own end of life and the necessity to interact across initially incompatible systems in the enterprise ecosystem.
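As a sketch of what such an extract might look like, the following Python example writes every version of every asset to a .zip archive along with a JSON manifest, keeping cross-references as plain URIs rather than internal identifiers. The in-memory `ASSETS` repository, the `urn:example:asset:` URI pattern, the manifest layout, and the `export_all` function are all assumptions made for illustration; a real system would expose something like this behind a REST call.

```python
import io
import json
import zipfile

# Hypothetical in-memory repository: asset id -> list of versions.
# The URI scheme and field names are assumptions made for this sketch.
ASSETS = {
    "intro": [
        {"version": 1, "body": "Draft intro", "refs": []},
        {"version": 2, "body": "Final intro", "refs": ["urn:example:asset:legal"]},
    ],
    "legal": [
        {"version": 1, "body": "Legal notice", "refs": []},
    ],
}


def export_all(assets: dict) -> bytes:
    """Write every version of every asset, plus a JSON manifest, to a zip."""
    manifest = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for asset_id, versions in assets.items():
            for v in versions:
                path = f"{asset_id}/v{v['version']}.txt"
                archive.writestr(path, v["body"])
                manifest.append({
                    "uri": f"urn:example:asset:{asset_id}",
                    "version": v["version"],
                    "path": path,
                    "refs": v["refs"],  # references stay as plain URIs
                })
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


data = export_all(ASSETS)
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    print(archive.namelist())
```

Because the manifest carries only open formats and URI references, a receiving system can rebuild the relationships between assets without knowing anything about the system that produced the archive.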
How current is your understanding of the solution architecture?
From time to time, technologists conduct due diligence and feasibility analysis to help them arrive at decisions on acquiring new/replacement products or services. More often, such research is based on some quick Internet searches. To remain relevant, the best technologists will read as much as they can every day. They will also go beyond the research to get hands-on experience with those same products and services on a periodic basis to better understand exactly how the offerings have changed and validate or invalidate their understanding. Sometimes the technology changes significantly and in very good ways. At other times, the review is nothing more than the regurgitation of some fluffy buzzwords used as clickbait and the technology has actually stagnated. How will you know the difference? For example, one vendor might state “A robust API for integration with any other system,” which sounds pretty good. But what if you learned that the vendor only offered their API technology in an older standard called CORBA, widely considered a dead technology since 2004? Understanding how to ask the right questions is crucial to making good decisions.
Change is abundant, increasing in frequency, and far-reaching in scope. Technologies come and go, and yet it’s common for many businesses not to see a return on technology investment for 3 to 5 years. Therefore, the solution architecture must be robust enough to withstand and embrace the inevitable changes. These may be impossible to predict in every case. If the system is to remain relevant and deliver its value over the system’s life cycle, this additional layer of research is invaluable. Almost every vendor is willing to give a demo or build a snazzy web site, but how many of them will stand up a live proof of concept solution and let you play with it to test it out against your actual requirements versus the filtered language that appears in an RFP or its response? And how much of your RFP is devoted to ensuring that reasonable architectural requirements are identified?
Open means easy integration
Easy integration starts with proven architectural frameworks for web-based applications. Those frameworks are constantly evolving and changing, so any solid architecture will build out further tooling, back-end improvements, 3rd party technology partnerships, and flexibility for the continuously evolving front-end frameworks as well.
Let’s look at another example of easy integration. Many organizations manage assets using proprietary formats and aren’t ready to make the full transition to XML across the enterprise. That’s why the Quark Publishing Platform also supports managing InDesign documents and components (though we don’t provide any automation of InDesign documents). Otherwise, InDesign is treated as a first-class application and content type for those that need it. Similarly, MS Office documents have robust support and some, such as Excel, PowerPoint, and Visio, can be a source of reusable components in XML. Of course, Platform also supports reusable components for QuarkXPress projects, Quark XTensions, and QuarkXPress Server publishing channels to assemble reusable pipelines for omni-channel publishing and delivery to HTML5, ePub, App Studio, PDF, 3rd party ECMs like FileNet and SharePoint, and more.
Why proprietary can be good
Quark owns all of the major technology in our enterprise system architecture that provides the business value for content automation. One of the strengths of having a broad solution stack is that we can manage and synchronously release updates and enhancements according to our own schedule and prioritization, without having to wait for a 3rd party to decide whether they agree, will prioritize the work, or can deliver in a timely manner.
Of course, we still integrate several components into our system that do rely on 3rd parties. It makes perfect sense in many cases, so we carefully evaluate and select those open-source and proprietary partners who work with us to provide the best value for our customers at a reasonable cost. One significant driving factor is the underlying technical architecture and how responsive these 3rd party vendors can be to us. Just like our customers expect high quality and tight turnaround for fixes and features, we expect the same of our partners and appreciate when we have a strong rapport based on results.
Even proprietary technologies can and must play more nicely together. The underlying Quark technical infrastructure has recently captured some attention for supporting other proprietary formats.
A key measure for proving an architecture starts with asking the right questions. A truly open architecture is not present if it is dominated by proprietary software that simply imports/exports formats from one product to another. We must dig deeper to examine the underlying technical landscape.
Here are some questions worth asking:
- If key proprietary components are removed or replaced, does the architecture still hold together?
- How well can the architecture be extended by standard web-based technologies and modern development frameworks?
- If your solution is focused on content automation for the creation, management, publishing, delivery, and analytics of business-critical content, how easy is it to mix and match authoring tools for the content?
- How many different types of content can truly be managed as components which can be assembled for publishing, reviewed and approved, and ultimately delivered to various formats using reusable channels?
- How easily can you replace the publishing engine used to render various output formats?
- At the database layer, how many different databases are supported?
- Do your requirements force you into a relational database model?
- Do your requirements include supporting a native XML database and why?
- How well can your data tier scale horizontally and vertically for increases in assets and transactions by orders of magnitude?
- How do you know?
- Do you have benchmarks or test cases which help add quantitative analysis to the discussion?
- How responsive is a vendor to your needs as a customer and how well can that vendor address end-to-end enhancements at the velocity of business?
- Does the vendor have more than one or two reference customers in production for over a year who can back up their claims?