JATS/NLM XML Export
- Caitlin Gebhard
- Sylvia Hunter
Your implementation may have a custom export filter designed to create XML files that are valid according to one or more varieties of the NLM Book DTD, the BITS (Book Interchange Tag Suite) DTD, the JATS (Journal Article Tag Suite) DTD, and/or the NLM Journal Publishing DTD.
When exporting files from eXtyles, select the appropriate option from the eXtyles Export menu. This option will convert the file from a Word document to XML and perform a validating parse of the XML against the DTD. The new file is by default saved with the same name as the Word file, except that it has an .XML file extension. The following documentation has been developed with generic paragraph style names. The specific names used for some elements (e.g. Book Information in a book review) will vary according to the specific paragraph style names used within your eXtyles configuration.
Before creating the export, it is very important to style the document correctly and run all eXtyles functions as specified. In addition, correct copyediting of the document to your publication’s style, especially with regard to the front matter (e.g. authors, affiliations, and footnotes), is essential. Failure to style and edit the document correctly prior to export may result in parsing errors or the production of incorrect XML.
The latest book DTD that is currently supported by eXtyles is version 1.0 of the BITS DTD. For journal articles, the latest version of the DTD that eXtyles supports is version 1.1 of JATS. This documentation is written from the perspective of these latest versions of the journal and book DTDs; pertinent differences in earlier DTD versions are noted.
General Export Notes and Best Practices
Never use an XML file created by eXtyles that has not parsed successfully.
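As a quick sanity check outside eXtyles, a script can confirm that an exported file is at least well-formed before it is passed downstream. The sketch below is an illustration only and is not part of eXtyles: `expected_xml_path` and `is_well_formed` are hypothetical helper names. Python's standard-library parser checks well-formedness only and does not validate against the JATS or BITS DTD, so it cannot replace the validating parse performed during export.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def expected_xml_path(word_file: str) -> Path:
    """Return the default export name: same name as the Word file, .XML extension."""
    return Path(word_file).with_suffix(".XML")


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A file that fails this check should certainly not be used; a file that passes it still needs to have parsed successfully in eXtyles.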
The following items should be kept in mind when preparing manuscripts for export:
- Only use paragraph styles that appear on the eXtyles Paragraph Styling palette. Any other styles may result in XML parsing errors, or possibly XML that parses but is not correct. A warning is given during export if any non-conformant paragraph styles are used.
- All tables, or content styled with the Table elements of the paragraph styles palette, must be in Word table format. Otherwise, an export is likely to produce parsing errors. All content for each table should be in a single Word table. Splitting content for a single table across multiple Word tables will likely result in incorrect XML and may result in parsing errors.
- If one Word table immediately follows another, there must be a blank paragraph between the tables. This blank paragraph cannot have any of the table paragraph styles applied to it. It may be styled as a regular paragraph or left as "Normal". To avoid the need for blank paragraphs between tables, Inera recommends that table titles and footnotes be placed outside of the Word table as ordinary paragraphs.
- Information that is not intended to be in a table in the XML should not be set in a Word table. For example, if a table has been used to format the layout of figures and captions visually in the Word document for creation of author proofs, such items should be "de-tabled" before export.
- Avoid using ranges of display-item citations. Although eXtyles supports ranges of numbered literature citations (e.g. "see refs 12–15"), ranges of display items (e.g. "Figs 1–4" or "Tables 1–3") are not recommended because an electronic product offers the reader no text to click on to reach, say, Fig. 2 in the first example. Although such ranges may parse and can be set up to yield usable XML, it may be easier to reword these citations as "Figs 1, 2, 3, and 4" and "Tables 1, 2, and 3". If you need to use ranges of display items in your content, please contact eXtyles-support@inera.com for more information and a possible alteration to your export filter.
- When possible, place paragraphs in the same order as they appear in print. eXtyles is quite flexible about the order of front-matter or metadata paragraphs and will automatically reorder them according to the DTD requirements during export. Even so, print order is always the best choice, with one exception: footnotes may appear at the end of the Word file. When author/affiliation information appears at the end of an item (e.g. editorials, letters), it may remain at the end of the Word document; eXtyles will place the elements in the front matter according to the DTD requirements.
- All appendix matter must be labelled. Each appendix must start either with a paragraph styled as an appendix head or with a figure or table that has (a) "Appendix" in the label (e.g. "Appendix 1") or (b) an uppercase letter preceding the number (e.g. "Table A1").
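Two of the rules above are mechanical enough to sketch in code: rewording display-item ranges as explicit lists, and checking whether a figure or table label marks appendix matter. The following is a minimal sketch under stated assumptions and is not an eXtyles feature. The function names are hypothetical, the set of label spellings (`Fig`, `Figs`, `Figures`, `Tables`) is an assumption to adjust to house style, and the appendix pattern is one reading of "an uppercase letter preceding the number".

```python
import re

# Matches a display-item range such as "Figs 1–4" or "Tables 1-3".
# The label spellings listed here are assumptions, not an eXtyles rule.
_RANGE = re.compile(r"\b(Figs?\.?|Figures|Tables)\s+(\d+)\s*[–-]\s*(\d+)")

# "Appendix" anywhere in the label, or an uppercase letter directly
# followed by digits (e.g. the "A1" in "Table A1").
_APPENDIX_LABEL = re.compile(r"Appendix|\b[A-Z]\d+")


def expand_display_ranges(text: str) -> str:
    """Rewrite display-item ranges as explicit lists of numbers."""

    def _expand(match: re.Match) -> str:
        label = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        if end <= start:
            return match.group(0)  # not a forward range; leave untouched
        numbers = [str(n) for n in range(start, end + 1)]
        if len(numbers) == 2:
            listed = " and ".join(numbers)
        else:
            listed = ", ".join(numbers[:-1]) + ", and " + numbers[-1]
        return f"{label} {listed}"

    return _RANGE.sub(_expand, text)


def is_appendix_label(label: str) -> bool:
    """Return True if a figure/table label identifies appendix matter."""
    return bool(_APPENDIX_LABEL.search(label))
```

Ranges of numbered literature citations such as "see refs 12–15" are deliberately left alone, since eXtyles supports them. Any rewording produced this way should still be reviewed by an editor before export.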
Preparing the Word File for Export
The key to obtaining correct XML is a correctly structured Word file. Besides using Word paragraph and character styles correctly and keeping paragraphs out of "forbidden" locations, certain document elements need to be structured appropriately for the XML to be correct.
eXtyles is designed to identify various sub-elements within certain paragraph styles automatically. This not only saves you time but also reduces the number of styles that your configuration requires. Cases where eXtyles automatically tags content on export are illustrated within this section.