Archived at Pineapplesoft 
  The “XML/EDI FAQ” page was archived in 2003 to preserve the original content of June 1999.  

  European XML/EDI Pilot Project

The project for this FAQ (Frequently Asked Questions) was conceived during the December 1998 meeting of the CEN/ISSS XML/EDI Pilot Project. It aims to answer basic questions on XML/EDI, particularly for people with some EDIFACT background.

All the CEN workshop members are encouraged to contribute to this FAQ or challenge its contents. Please email your questions and answers to [don't email questions anymore, the FAQ is no longer actively maintained]. We also invite you to visit the project homepage <>, where you will find more information on how to participate in the project.

Edited by: Jacques Garderet (Team Leader), Origin
  Benoît Marchal, Pineapplesoft

1. What is XML?

XML, the eXtensible Markup Language, is a syntax designed for use in a web environment. It is being developed by the World Wide Web Consortium (W3C), which has a leading role in the development of Web standards. See <>.

XML conveys contents and structure, not presentation and behaviour. The structure of XML documents can be (but need not be) formally described in a Document Type Definition (DTD); software tools can validate an XML document against a DTD definition.

To put it simply, each logical part (known as an XML element) is surrounded by tags. Together with attributes, tags are used to structure information. XML elements may be repeated and nested to any depth, allowing for instance several <orderlines> in an <order>.
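
As a minimal sketch of this nesting, the following Python fragment reads a small order document with the standard library. Only the <order> and <orderlines> names come from the text above; the child elements, their values and the read_order function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical order: the <order> and <orderlines> names follow the text,
# the child elements and their values are invented for this sketch.
ORDER_XML = """
<order>
  <orderlines><part>A-100</part><quantity>36</quantity></orderlines>
  <orderlines><part>B-200</part><quantity>12</quantity></orderlines>
</order>
"""

def read_order(xml_text):
    """Return one (part, quantity) pair per <orderlines> element."""
    order = ET.fromstring(xml_text.strip())
    return [
        (line.findtext("part"), int(line.findtext("quantity")))
        for line in order.findall("orderlines")
    ]

if __name__ == "__main__":
    for part, quantity in read_order(ORDER_XML):
        print(part, quantity)
```

The repeated <orderlines> elements come back as a list, and each nested child is reached by name rather than by position.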

The XML syntax has rapidly established itself as a popular format for the exchange of information on the web. Around this syntax, a large set of companion standards and tools is being developed. The acronym XML often refers to the whole family of standards and products built around the XML syntax; the products are often written in the Java programming language. The XML Document Object Model, XML Schema, XLink/XPointer, the Extensible Stylesheet Language (XSL) and the Resource Description Framework (RDF) are some of these companion standards.

2. What does markup mean?

Mark-up refers to annotations in the margin or in the body of a document for styling, etc. For example, when you review a document and note in the margin that the title should be in bold, you are marking up the document.

Electronic markup (note there is no hyphen) refers to information embedded in an electronic document to convey styling or structural information.

3. What are tags?

Tags are delimiters for XML elements. XML elements are enclosed between opening and closing tags.

Opening tags are of the form <element-name>, whereas closing tags are of the form </element-name>. These tags may be chosen to be human readable, e.g. <quantity>36</quantity> but they need not be, e.g. <g11>36</g11>.

4. What are XML attributes?

XML attributes are additional information attached to an XML element; they are pairs of attribute name and attribute value, e.g. target="EUR". XML attributes are enclosed in the opening tag, according to the following syntax:

<element-name attribute1="value" attribute2="value"/>
or <element-name attribute1="value" attribute2="value">data</element-name>
e.g. <currency reference="GBP" target="EUR">1.4276</currency>
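
A short sketch of how a receiving program might read these attributes, using Python's standard library. The <currency> element is the one shown above; the read_rate function is a name chosen for this illustration.

```python
import xml.etree.ElementTree as ET

def read_rate(xml_text):
    """Return (reference, target, rate) from a <currency> element."""
    element = ET.fromstring(xml_text)
    # Attributes are looked up by name; the rate itself is the element's data.
    return element.get("reference"), element.get("target"), float(element.text)

if __name__ == "__main__":
    print(read_rate('<currency reference="GBP" target="EUR">1.4276</currency>'))
```

The attributes qualify the value, while the value itself stays in the element content.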

5. What is a DTD?

A Document Type Definition (DTD) is a formal description of the full structure of an XML document (such as an XML/EDI message) with relevant sets of values (allowed and defaulted values). It is used for example to check the syntactical and structural validity of an incoming XML/EDI message (e.g. correct sequence and nesting of XML/EDI elements) and validate the transmitted values - by comparing them with the allowed values listed in the DTD.

In that sense, DTDs play a role similar to the existing UNSM (United Nations/EDIFACT Standard Message) specifications (e.g. structure of a message type, however without the functional description). Considering that DTDs may convey default and fixed values for attributes/qualifiers, DTDs come close to MIGs (Message Implementation Guidelines), with the added advantage of possible automated implementation. To sum up, we could say that a DTD is half-way between a UNSM and a MIG.

6. How to build a DTD?

A few EDI software tools can automatically generate DTDs strictly based on UNSMs (or any of their subsets). This has the advantage of immediate compatibility with existing corporate EDI applications (strict compliance with the same EDIFACT standard). It may be argued, however, that EDIFACT information used for process control should be mapped to attributes of XML elements rather than to separate XML elements; cf. the recommendation of the CEN/ISSS Electronic Commerce workshop in April 1999 that DTDs should be "derived" from EDIFACT messages using both the MIGs (CEN/ISSS Generic MIG Register) and the semantic entries defined in the BSR (ISO Basic Semantic Register).

7. Who is responsible for Internet standards?

Formally, the Internet Engineering Task Force (IETF) <> is responsible for promoting Internet standards. Recently, however, the World Wide Web Consortium (W3C) <> has taken a leading role in the development of web standards.

8. What is XML/EDI?

This is the present attempt to combine the best features of traditional EDI (which has broad industry support) with the improvements in technology offered by XML. It is EDI with XML, or XML with EDI, depending on the perspective.

In an XML/EDI message the EDI information is explicitly labelled using tag names. Reference may be made via the Internet to a Document Type Definition (DTD), which contains the structure declarations and the relevant sets of code values.

Web browsers are expected to support XML, and therefore XML/EDI messages. Like EDI messages, XML/EDI messages could be transmitted in any way: e-mail, VAN, Internet, etc.

9. What is XSL?

This is the Extensible Stylesheet Language, developed by the W3C primarily to control the way information is presented on screen. XSL is used to display the information structured in XML; presentation aspects are dissociated from data structure (unlike HTML, which tries to perform both functions). XSL involves the creation of Graphical User Interface (GUI) form objects out of an XML document.

XSL also provides some key document-handling facilities beyond styling, for instance transformations, rearrangement of elements and extraction of a table of contents from a document.

10. How does XML/EDI relate to forms?

It is expected that for SMEs, web-based forms will be an important mechanism to create and read XML/EDI messages. These forms may be linked to local database systems.

11. Does XML/EDI replace EDIFACT?

The answer is yes and no. Yes, in the long term XML/EDI may replace EDIFACT, at least for some categories of partners such as SMEs. No, because XML/EDI builds on the EDIFACT foundations in terms of semantic contents (message types, segments and data elements) and related UN code lists. In that sense, it opens new opportunities for EDIFACT. For more information on EDIFACT see <>.

12. What are the benefits of XML/EDI over existing approaches?

There is no simple answer to this question. Arguments vary depending on the speaker. The general consensus however is that XML is the most promising approach to move EDI technologies to the Internet. In particular XML/EDI could significantly lower the costs of EDI and thus should bring in SMEs.

Another argument is that XML is versatile and benefits from broad industry support. This opens new areas for EDI such as direct integration with the major databases and off-the-shelf business packages, browser software, multimedia, extranets (i.e. a secure Internet network shared among several corporations), etc. All these are to be built with the same core technology.

13. Any other added value of XML/EDI compared with traditional EDI?

It will be easier to incorporate components such as digital signatures, smartcard authorisation, routing instructions, spreadsheets, graphs and Internet-like add-ons into an EDI message, thus making use of the intrinsic flexibility and extensibility of XML.

Because XML is a self-describing format (the structure is apparent in the document), tools can analyse the structure and automatically derive mappings, conversion, import rules, etc.

Consequently the task of mapping is expected to be easier through the use of extended functions in existing user tools (word processor, database, etc.). It is anticipated that search engines could also retrieve information held in XML/EDI messages through the use of embedded tags.

14. What will be left of traditional EDI?

Standards that have been established between large companies will remain to increase readability/predictability in the data flows, e.g. UN/EDIFACT standard components and code lists.

Industry-wide DTDs will still be needed, in the same way as the different EDIFACT subsets used now in different sectors, i.e. there cannot be an Invoice DTD that is both universal and simple! There will still be a need for partners' agreements to refer to specific repositories, sets of DTDs and sets of codes (currency, country, etc.).

However, it is expected that the self-describing nature of XML/EDI (a message and its DTD together form a formalised MIG) will simplify mappings.

15. What is a repository used for?

In the context of XML/EDI, a repository is a web site holding data accessible to a community of users. Such a site may contain definitions of Message Implementation Guidelines (MIGs) and equivalent DTDs. 'Indirect' validation of an incoming XML/EDI message could be performed by referring to a DTD which in turn could refer to another web site holding a code set (e.g. ISO currency codes).

Reference to a repository may be given in a DTD to indicate, for example, that elements whose names are qualified by BSR are to follow the rules laid down on the site <>.

A repository may also be accessed through the xsl:import option of XSL.

16. Where can I find more information on repositories?

Visit these web sites:

17. What is SIMPL-EDI?

This is EDI using a small part (or subset) of a full standard message such as a United Nations/EDIFACT Standard Message (UNSM) type. It is hoped that the concept of SIMPL-EDI will represent core implementation guides, based on the principles of simplified business processes.

Note: SIMPL-EDI is not a typo; it is the preferred spelling according to SIMAC (the ad hoc working group on Simpl-EDI, forms and Web based EDI).

18. How do XML/EDI messages compare with EDIFACT messages in terms of number of transmitted characters?

First indications are that an XML/EDI message is two to three times longer than the equivalent EDIFACT message.

19. How to decrease the number of elements to be transmitted in an XML/EDI message?

When referring to a DTD defined for an industry sector, it is possible to use default values indicated in the DTD. For example, the default value for "Currency format" may be ISO 4217 (a format of 3 alphabetical characters such as EUR, GBP, USD). The default value for "Date format" may be CCYYMMDD. There is then no need to transmit such default values any more; this corresponds to an automatic implementation of some aspects of message implementation.
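
The effect of such defaults on the receiving side can be sketched in Python as follows. This is only an illustration under stated assumptions: the defaults are hard-coded in a dictionary instead of being read from a real DTD, and the element names, attribute names and the attributes_with_defaults function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: defaults that a sector DTD might declare, hard-coded here.
# The element and attribute names are hypothetical.
DTD_DEFAULTS = {
    "Amount": {"currencyFormat": "ISO 4217"},
    "DeliveryDate": {"dateFormat": "CCYYMMDD"},
}

def attributes_with_defaults(xml_text):
    """Parse one element and return its attributes with defaults filled in."""
    element = ET.fromstring(xml_text)
    attributes = dict(DTD_DEFAULTS.get(element.tag, {}))
    attributes.update(element.attrib)  # transmitted values override defaults
    return attributes

if __name__ == "__main__":
    # The sender omits dateFormat; the receiver still knows how to read the date.
    print(attributes_with_defaults("<DeliveryDate>19990630</DeliveryDate>"))
```

A sender only needs to transmit an attribute when its value differs from the agreed default.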

20. How can an XML document be generated out of an EDIFACT message?

One may refer to a solution proposed by Suli Ding at <>. It involves a template file that describes the relationship between an EDIFACT message and a valid XML document (itself validated against one DTD).
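
The general idea of a template relating EDIFACT segments to XML elements can be sketched in Python as below. This is not Suli Ding's solution, only an illustration: the TEMPLATE dictionary, the XML element names and the sample segments are assumptions, and the sketch assumes the default EDIFACT separators (' + :) while ignoring the release character.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: EDIFACT segment tag -> (XML element name, component names).
TEMPLATE = {
    "QTY": ("quantity", ["qualifier", "value"]),
    "CUX": ("currency", ["qualifier", "code"]),
}

def edifact_to_xml(message):
    """Convert simple EDIFACT segments into an XML string using TEMPLATE."""
    root = ET.Element("message")
    for segment in message.split("'"):
        segment = segment.strip()
        if not segment:
            continue  # skip the empty piece after the last terminator
        tag, _, data = segment.partition("+")
        element_name, component_names = TEMPLATE[tag]
        element = ET.SubElement(root, element_name)
        # Only the first composite data element is mapped in this sketch.
        components = data.split("+")[0].split(":")
        for name, value in zip(component_names, components):
            ET.SubElement(element, name).text = value
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(edifact_to_xml("QTY+21:36'CUX+2:EUR'"))
```

A real converter would also handle service segments, the release character and validation of the result against a DTD, none of which is attempted here.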

21. Equivalent expressions between XML/EDI and EDIFACT

XML/EDI expression = EDIFACT equivalent

  • data typing = validation rules
  • DTD = formal (computer-processable) document half-way between an UNSM and a MIG
  • namespaces = no equivalent
  • parser = validation/conversion/translation program
  • XML/EDI 'element' = EDIFACT component such as message type, segment or data element
  • (element1, element2)+, * or ? = a repeatable group of segments
  • XML/EDI 'attribute' = XML element which may be an attribute of another XML element. An XML/EDI attribute does not necessarily correspond to an EDIFACT qualifier data element.

22. What is the meaning of the symbols | , + * and ? which may appear with the element names in a DTD?

Symbol "|" is used to indicate an OR relationship, e.g. (ContractNo | RefNo).
Symbol "," is used to indicate an AND relationship, e.g. (Quantity, Deliver).
Symbol "+" means the element must occur one or more times.
No symbol means the element must occur exactly once.
Symbol "*" means the element may occur zero or more times.
Symbol "?" means the element may occur zero times or once.

This would correspond to the following EDIFACT Mandatory and Conditional status:

  • + means M1, M2, … Mn
  • nothing means M1
  • * means C1, C2, … Cn
  • ? means C1

For example, "(Quantity, Deliver?)+" means that the following must be present and may be repeated:

  • Quantity and (if needed) Deliver
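
These occurrence rules can be tried out with a small Python sketch that translates a content model into a regular expression over the sequence of child element names. It is a simplification for illustration only, not a DTD validator: it handles element names, parentheses and the symbols described above, and the function names are hypothetical.

```python
import re

def model_to_regex(model):
    """Translate a DTD content model into a regex over space-terminated names."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|,?*+]", model):
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # a sequence is plain concatenation in a regex
        elif token in ")|?*+":
            parts.append(token)  # same meaning in DTDs and in regexes
        else:
            # Each element name is followed by one space in the text to match.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def conforms(model, children):
    """Return True if the list of child element names satisfies the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

if __name__ == "__main__":
    model = "(Quantity, Deliver?)+"
    print(conforms(model, ["Quantity", "Deliver", "Quantity"]))
    print(conforms(model, ["Deliver"]))
```

The translation is short because the DTD symbols |, +, * and ? carry the same meaning in regular expressions; only the comma needs special treatment.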

Complete XML language specifications may be found at <>

Thank You

Special thanks to Martin Bryan, The SGML Center, <mtbryan (at)>; Rik Drummond, Drummond Group and EC Consultancy, <drummond (at)>; Andrew Hinchley <ce12 (at)>; David Webber, Gnosis, <gnosis_ (at)> and Jim J. Yang, KITH, <jim.yang (at)> for suggestions and corrections.

Last update: June 1999.

Design, XSL coding & photo: PineappleSoft OnLine.