eXtyles automatically parses keywords, abbreviations, and glossary paragraphs into XML keyword groups or definition lists, provided they follow certain standard editorial styles. In particular, each paragraph must use its separator characters consistently so that it can be segmented into its elements.

Keywords

Unless your export filter specifies otherwise, the strongest separator characters for keywords are:

  • semicolons,

  • em dashes,

  • and bullets.

These characters are highly unlikely to occur within the text of a single keyword. If eXtyles finds one of these characters within the keywords paragraph, it will use that character to separate the paragraph into individual keywords.

If eXtyles finds none of these strong characters in the keywords paragraph, it will try the following weaker characters in an attempt to parse the list:

  • tabs

  • em spaces

  • commas (followed by a space)

  • slashes with spaces on either side

Example:

The following Word paragraph will be correctly processed during XML export:

Keywords: bone marrow-derived macrophages; dihydrofolate; site-directed mutagenesis

to yield the following XML:

Code Block
<kwd-group>
<title><bold>Keywords:</bold> </title>
<kwd>bone marrow-derived macrophages</kwd>
<kwd>dihydrofolate</kwd>
<kwd>site-directed mutagenesis</kwd>
</kwd-group>
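
The strong-then-weak fallback described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the eXtyles implementation: the function names `split_keywords` and `to_kwd_group`, the order in which separators are tried within each tier, and the handling of the "Keywords:" label are all assumptions, and the `<title>` element is omitted for brevity.

```python
from xml.sax.saxutils import escape

# Separator tiers taken from the lists above. The order inside each tier and
# the label handling are assumptions of this sketch, not documented behavior.
STRONG_SEPARATORS = [";", "\u2014", "\u2022"]      # semicolon, em dash, bullet
WEAK_SEPARATORS = ["\t", "\u2003", ", ", " / "]    # tab, em space, comma + space, spaced slash


def split_keywords(paragraph):
    """Split a keywords paragraph on the first separator found, strong before weak."""
    body = paragraph.strip()
    # Hypothetical label handling: drop a leading "Keywords:" if present.
    if body.lower().startswith("keywords:"):
        body = body[len("keywords:"):].strip()
    for sep in STRONG_SEPARATORS + WEAK_SEPARATORS:
        if sep in body:
            return [part.strip() for part in body.split(sep) if part.strip()]
    return [body]  # no separator found: treat the paragraph as one keyword


def to_kwd_group(keywords):
    """Wrap each keyword in <kwd> inside a <kwd-group> (title omitted)."""
    lines = ["<kwd-group>"]
    lines += [f"<kwd>{escape(k)}</kwd>" for k in keywords]
    lines.append("</kwd-group>")
    return "\n".join(lines)


if __name__ == "__main__":
    example = ("Keywords: bone marrow-derived macrophages; dihydrofolate; "
               "site-directed mutagenesis")
    print(to_kwd_group(split_keywords(example)))
```

Because a semicolon is found first, the hyphens inside the individual keywords are never considered as separators, which is why the strong characters are the safer choice.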

eXtyles will also automatically parse specialized keyword vocabularies such as PACS, OCIS, or JEL codes.

Tip

Find out more about the Keywords paragraph style in the eXtyles JATS Style Guide.

Translated Keywords

The following example demonstrates the use of the Keywords (Translated) paragraph style:

Screenshot in Draft View of a keywords list in Russian. The paragraph style used is 'Keywords_Translated'

which yields this XML:

Code Block
<kwd-group kwd-group-type="translator" xml:lang="ru">
    <title>Ключевые слова: </title>
    <kwd>математика</kwd>
    <kwd>Microsoft Word</kwd>
    <kwd>XML</kwd>
    <kwd>JATS</kwd>
</kwd-group>

Abbreviations

For the purposes of eXtyles, an "Abbreviations" paragraph is a list of two or more term–definition pairs in a single paragraph.

For parsing of abbreviations, eXtyles uses strong and weak characters in turn to attempt to separate the paragraph into its elements. The strong characters are:

  • tabs,

  • colons and semicolons,

  • and em dashes.

Weak characters are en dashes and commas.

Consistent use of strong separator characters is crucial to obtaining accurate XML.

Examples:

Abbreviations paragraphs that will yield correct XML include:

ExPEC, extra-intestinal pathogenic Escherichia coli; SSTI, skin and soft-tissue infection; UTI, urinary tract infection.

ExPEC — extra-intestinal pathogenic Escherichia coli: SSTI — skin and soft-tissue infection: UTI — urinary tract infection.

These various styles will yield the following XML:

Code Block
<def-list list-type="simple" list-content="abbreviations">
<def-item>
<term>ExPEC</term>
<def><p>extra-intestinal pathogenic <italic>Escherichia coli</italic></p></def>
</def-item>
<def-item>
<term>SSTI</term>
<def><p>skin and soft-tissue infection</p></def>
</def-item>
<def-item>
<term>UTI</term>
<def><p>urinary tract infection</p></def>
</def-item>
</def-list>

The character that separates an abbreviation from its definition, and the character that separates one abbreviation–definition pair from the next, generally cannot appear within an abbreviation or definition.

Example:

The following abbreviations list would not be parsed correctly because a comma appears in the first definition:

Abbreviations: ExPEC, extra-intestinal, pathogenic Escherichia coli; UTI, urinary tract infection.
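
To make the rule concrete, here is a minimal Python sketch of a two-level split: one character between pairs, a different one between term and definition. It is not how eXtyles is implemented. The function name `parse_abbreviations`, the candidate orderings, and the choice to raise an error on an ambiguous entry are assumptions made for illustration; the source only states that such a paragraph will not be parsed correctly.

```python
# Candidate separators, based on the strong and weak characters listed above.
# The ordering and the split into two roles are assumptions of this sketch.
PAIR_SEPARATORS = [";", ":", "\t", "\u2014"]                # between pairs
TERM_SEPARATORS = ["\u2014", "\t", ":", "\u2013", ","]      # inside a pair


def parse_abbreviations(paragraph):
    """Return a list of (term, definition) tuples from one paragraph."""
    body = paragraph.strip()
    # Hypothetical label handling: drop a leading "Abbreviations:" if present.
    if body.lower().startswith("abbreviations:"):
        body = body[len("abbreviations:"):]
    body = body.strip().rstrip(".")

    pair_sep = next((s for s in PAIR_SEPARATORS if s in body), None)
    if pair_sep is None:
        raise ValueError("no separator found between pairs")
    pairs = [p.strip() for p in body.split(pair_sep) if p.strip()]

    # The term separator must differ from the pair separator and occur in every pair.
    term_sep = next(
        (s for s in TERM_SEPARATORS
         if s != pair_sep and all(s in p for p in pairs)),
        None,
    )
    if term_sep is None:
        raise ValueError("no consistent separator between term and definition")

    result = []
    for pair in pairs:
        parts = [x.strip() for x in pair.split(term_sep)]
        if len(parts) != 2:
            # The separator also occurs inside the term or definition.
            raise ValueError(f"ambiguous entry: {pair!r}")
        result.append((parts[0], parts[1]))
    return result
```

With this sketch, both well-formed example paragraphs above produce the same three pairs, while the paragraph with the extra comma is rejected because its first entry splits into three parts instead of two.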

Tip

Find out more about the Abbreviations paragraph style in the eXtyles JATS Style Guide.

Translated Abbreviations

The following example demonstrates the use of the Abbreviations (Translated) paragraph style:

Screenshot in Draft View of an abbreviations paragraph in Russian. The paragraph style used is 'Abbreviations_Translated'

which yields this XML:

Code Block
<def-list list-type="simple" list-content="abbreviations">
    ...
    <def-item>
        <term>XML</term>
        <def>
            <p>расширяемый язык разметки</p>
        </def>
    </def-item>
</def-list>

Glossary

For the purposes of eXtyles, a “Glossary” is a multi-paragraph list with one term–definition pair per paragraph.

This definition allows rather more flexibility in style. As with abbreviations, consistent use of a strong separator character is crucial to obtaining accurate XML.

The glossary section must always appear in the <back> section of your XML.

The following paragraphs would parse to give correct XML:

1) Term in bold

ExPEC Extra-intestinal pathogenic Escherichia coli

SSTI Skin and soft-tissue infection

UTI Urinary tract infection

2) Tab separating term and definition

ExPEC   Extra-intestinal pathogenic Escherichia coli

SSTI      Skin and soft-tissue infection

UTI        Urinary tract infection

3) Colon, equals sign, or other "strong" separator separating term and definition

ExPEC: Extra-intestinal pathogenic Escherichia coli

SSTI: Skin and soft-tissue infection

UTI: Urinary tract infection

Here is what Options 2 and 3 look like in a Word document:

Screenshot in Draft View with non-printing characters turned on. The image is of an example glossary (the same text as options 2 and 3 above). The paragraph styles visible from the top down are Glossary_Head, Glossary_Section, Glossary_Entry (x3), Glossary_Section, Glossary_Entry (x3)

These various styles each yield the following XML:

Code Block
<glossary>
<def-list>
<def-item>
<term>ExPEC</term>
<def><p>Extra-intestinal pathogenic <italic>Escherichia coli</italic></p></def>
</def-item>
<def-item>
<term>SSTI</term>
<def><p>Skin and soft-tissue infection</p></def>
</def-item>
<def-item>
<term>UTI</term>
<def><p>Urinary tract infection</p></def>
</def-item>
</def-list>
</glossary>
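
As a rough illustration of the one-pair-per-paragraph model, the following Python sketch splits each glossary paragraph on the first tab, colon, or equals sign it finds and assembles the entries into a `<glossary>` element. The names `parse_glossary_entry` and `to_glossary_xml` and the separator order are assumptions, not eXtyles behavior. The sketch works on plain text only, so it covers options 2 and 3 but not the bold-term option, and it does not reproduce inline markup such as `<italic>`.

```python
from xml.sax.saxutils import escape

# Separators tried in order for each paragraph (an assumed order).
GLOSSARY_SEPARATORS = ["\t", ":", "="]


def parse_glossary_entry(paragraph):
    """Split one glossary paragraph into (term, definition) at the first separator."""
    for sep in GLOSSARY_SEPARATORS:
        if sep in paragraph:
            term, definition = paragraph.split(sep, 1)
            return term.strip(), definition.strip()
    raise ValueError(f"no separator found in entry: {paragraph!r}")


def to_glossary_xml(paragraphs):
    """Build a <glossary> element with one <def-item> per paragraph."""
    lines = ["<glossary>", "<def-list>"]
    for paragraph in paragraphs:
        term, definition = parse_glossary_entry(paragraph)
        lines += [
            "<def-item>",
            f"<term>{escape(term)}</term>",
            f"<def><p>{escape(definition)}</p></def>",
            "</def-item>",
        ]
    lines += ["</def-list>", "</glossary>"]
    return "\n".join(lines)


if __name__ == "__main__":
    entries = [
        "ExPEC\tExtra-intestinal pathogenic Escherichia coli",
        "SSTI: Skin and soft-tissue infection",
        "UTI: Urinary tract infection",
    ]
    print(to_glossary_xml(entries))
```

Splitting only at the first separator means a later colon inside a definition stays part of the definition, which is one reason a per-paragraph glossary tolerates more stylistic variation than a single-paragraph abbreviations list.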

Tip

Find out more about the Glossary paragraph style in the eXtyles JATS Style Guide.

Definition List

Similar to a Glossary, a "Definition List" is a multi-paragraph list with one term–definition pair per paragraph.

As with abbreviations and glossary entries, consistent use of a strong separator character is crucial to obtaining accurate XML.

The following example demonstrates the use of the Definition List paragraph style:

Screenshot in Draft View of an equation followed by various defined terms. The visible paragraph styles from the top down are Paragraph, Equation, Paragraph_Continued, Definition_List (x3)

which yields this XML:

Code Block
<def-list>
    <def-item>
        <term>E</term>
        <def><p>energy</p></def>
    </def-item>
    <def-item>
        <term><italic>m</italic></term>
        <def><p>mass</p></def>
    </def-item>
    <def-item>
        <term><italic>c</italic></term>
        <def><p>the speed of light in a vacuum</p></def>
    </def-item>
</def-list>