Cleanup
Normalize Document
Operation | Description |
---|---|
Style Whole Document with Default Style | All paragraph styles are converted to the Default Style (the Word style “Normal,” unless a specific override is a part of your configuration). |
Auto-Style
Operation | Description |
---|---|
Regular Body Paragraphs with Style | All regular body paragraphs are converted to the style selected in the drop-down box. Local italic and boldface markup is retained. |
Tables | Tables created with the Word Table Editor are automatically styled with Table Head, Table Body, and Table Footnote paragraph styles. Table titles are automatically styled even if they are outside of the table. |
References | Numbered references, or unnumbered references that follow a heading such as “References,” are automatically styled as Reference paragraphs. |
Auto-Style Regular Body Paragraphs Criteria
The eXtyles Auto-Style Regular Body Paragraphs with Style option in the Cleanup dialog can be set to automatically apply your organization’s custom body paragraph style to all regular body paragraphs in the Word file, dramatically reducing the time spent manually applying paragraph styles to the manuscript.
The style will be applied to left-aligned paragraphs that start with an uppercase letter unless one of the following criteria is met:
- The paragraph is styled as MTDisplayEquation (i.e., a MathType Display or Right-numbered equation), Footnote Text, or Endnote Text. To configure this criterion, please contact your eXtyles representative.
- The paragraph is not a Body Text level paragraph in Word’s Outline view.
- The paragraph begins with a left indent.
- The entire paragraph is in the selected font for Preformatted text (e.g., Courier—this font can be preset in your configuration or can be selected from the Cleanup dialog) or in the font of Word’s HTML Preformatted style. To configure this criterion, please contact your eXtyles representative.
- The entire paragraph is styled with an ALL CAPS or small caps font or character style.
- The paragraph is shorter than the default 60 characters and ends with a period.
- The paragraph is manually entered in all capital letters and is shorter than the default 60 characters. To configure this criterion, please contact your eXtyles representative.
- The paragraph starts with an inline shape (e.g., an embedded graphic).
- The paragraph starts with a word that suggests a table or figure caption (e.g., “Table,” “Figure,” “Fig.,” “Plate”), followed by punctuation (period, colon, em or en dash, or hyphen), a table or figure number followed by punctuation, or the end of the paragraph. The lists of caption words include words for “figure” and “table” in a number of European languages; to configure these lists, please contact your eXtyles representative.
- The first character of the paragraph is bold, italic, underline, superscript, raised, or lowered.
- The first character of the paragraph is a digit.
- The first character of the paragraph is a special character in a special character font (e.g., Symbol). Special characters include but are not limited to: opening parentheses, square brackets, angle brackets and braces, asterisk, daggers, raised dot, dashes, and bullets.
- The first character of the paragraph is an opening parenthesis, square bracket, angle bracket or brace, asterisk, raised dot, or dash, and is not in a special character font.
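The criteria above can be read as a single yes/no test per paragraph. The following is only a rough Python sketch of a subset of them, not how eXtyles is implemented: the `Para` record, its field names, and the caption word list are hypothetical stand-ins for information that Word and your configuration would supply. The font-based, Outline-level, and inline-shape criteria are omitted.

```python
import re
from dataclasses import dataclass

# Illustrative subsets only; the real lists live in the eXtyles configuration.
SKIPPED_STYLES = {"MTDisplayEquation", "Footnote Text", "Endnote Text"}
CAPTION_RE = re.compile(
    r"^(?:Table|Figure|Fig\.|Plate)"      # caption word
    r"(?:\s*[.:\u2014\u2013-]"            # punctuation right after it
    r"|\s+\d+\s*[.:\u2014\u2013-]"        # or a number followed by punctuation
    r"|\s*$)"                             # or the end of the paragraph
)


@dataclass
class Para:
    """Hypothetical, simplified view of a Word paragraph."""
    text: str
    style: str = "Normal"
    left_aligned: bool = True
    left_indent: float = 0.0
    first_char_formatted: bool = False  # bold, italic, underline, superscript, ...


def is_regular_body(p: Para, short_limit: int = 60) -> bool:
    """Return True if the paragraph would receive the body style in this sketch."""
    t = p.text
    # Base rule: left-aligned and starting with an uppercase letter.
    # This also rules out paragraphs that start with a digit or a bracket.
    if not t or not p.left_aligned or not t[0].isupper():
        return False
    short = len(t) < short_limit
    if p.style in SKIPPED_STYLES:
        return False
    if p.left_indent > 0:
        return False
    if p.first_char_formatted:
        return False
    if short and t.endswith("."):   # short paragraph ending with a period
        return False
    if short and t.isupper():       # short paragraph typed in all capitals
        return False
    if CAPTION_RE.match(t):         # looks like a table or figure caption
        return False
    return True
```

For example, `is_regular_body(Para("INTRODUCTION"))` is false because the paragraph is short and typed in all capitals, so it would be left for manual styling.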
White Space (Whole Document) Settings
The following White Space Cleanup options are performed on the entire document.
Operation | Description |
---|---|
Remove Section Breaks | Word’s section breaks are removed. |
Remove Page Breaks | Word’s page breaks are removed. |
Remove Column Breaks | Word’s column breaks are removed. |
Remove Space Between a Number and % | Spaces between a number and the % sign are removed. |
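As an illustration of the last option in the table, the effect on plain text can be approximated with one regular expression. This is a sketch of the described behavior only; the function name is hypothetical, and it assumes ordinary space characters between the digit and the % sign.

```python
import re


def remove_space_before_percent(text: str) -> str:
    """Delete ordinary spaces between a digit and a following % sign."""
    return re.sub(r"(\d) +%", r"\1%", text)
```

With this sketch, "a 12.5 % rise" becomes "a 12.5% rise", while text without a digit before the % sign is left alone.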
White Space (Font-Sensitive) Settings
The following White Space Cleanup options may be set to exclude text in a specified font. Please see the description of this feature following the table for more information.
Operation | Description |
---|---|
Convert Tabs to Spaces* | Each tab character is converted to a single space. |
Remove Multiple Spaces | Multiple spaces are converted to a single space. Note that this step is performed after converting tabs to spaces, so all extra white space is removed. |
Remove Start and End Paragraph Spaces | Leading and trailing spaces in paragraphs are removed. Note that this step is performed after converting tabs to spaces, so all extra white space is removed. Non-breaking spaces and soft returns are also removed. |
Remove Blank Paragraphs | Blank paragraphs are removed. |
*Tabs are required for the proper XML export of some elements (e.g., between a term and definition in an abbreviations list). Because of this, the USGS default cleanup setting is to retain all tabs in the document. However, if during your visual scan of the document you notice that there is no term and definition content in the document (e.g., Abbreviations, DMU), you may choose to have eXtyles Cleanup remove tabs by selecting this option.
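The order of these operations matters: tabs are converted first (if that option is selected), then runs of spaces are collapsed, then paragraph edges are trimmed, and finally blank paragraphs are dropped. The sketch below models that order on plain strings. It is an approximation under stated assumptions, not the eXtyles implementation: paragraphs are modeled as Python strings, a soft return is modeled as the character `\x0b`, and non-breaking spaces and soft returns are assumed to be removed only at the start and end of a paragraph.

```python
import re
from typing import List


def clean_paragraph(text: str, convert_tabs: bool = False) -> str:
    """Apply the font-sensitive white space steps to one paragraph, in order."""
    if convert_tabs:
        # Each tab becomes a single space (off by default, so tabs are retained).
        text = text.replace("\t", " ")
    # Collapse runs of spaces, including any spaces produced from tabs.
    text = re.sub(r" {2,}", " ", text)
    # Trim spaces, non-breaking spaces, and soft returns at both ends.
    return text.strip(" \u00a0\x0b")


def clean_document(paragraphs: List[str], convert_tabs: bool = False) -> List[str]:
    """Clean every paragraph, then drop the ones that end up blank."""
    cleaned = [clean_paragraph(p, convert_tabs) for p in paragraphs]
    return [p for p in cleaned if p]
```

With `convert_tabs` left at `False`, a term-and-definition line such as `"  DMU\tDescription of map units "` keeps its tab and loses only the leading and trailing spaces.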
White Space Cleanup: Excluding Paragraphs in a Specified Font
In some documents, you might wish to run white space–related Cleanup features but exclude certain paragraphs from this type of cleanup. For instance, programming code often has intentional blank lines and intentional leading spaces that should not be removed during Cleanup, although the rest of the document should have no blank lines. The same is true for verse and poetry, in which white space is often intentional and should be retained.
So that users don’t have to forgo use of the white space Cleanup features entirely, eXtyles allows users to specify that paragraphs in a particular font should not be affected by some of the white space–related Cleanup options. That is, you can set computer code in Courier and specify that paragraphs in Courier should not be affected by white space cleanup operations; you can set verse in a font such as Garamond and specify that paragraphs in Garamond should not be affected by white space cleanup operations.
You can protect additional text by selecting a specific font to exclude from white-space cleanup.
The ability to exclude paragraphs from white space–related Cleanup features is optional. If you prefer, you may still run all Cleanup features on all paragraphs in your documents by leaving the Exclude Text in Font checkbox unchecked.
The following subset of white space Cleanup options can optionally be set to skip paragraphs in a specified font:
- Convert Tabs to Spaces
- Remove Multiple Spaces
- Remove Start and End Paragraph Spaces
- Remove Blank Paragraphs
To use this feature, check your document for elements that eXtyles should skip when running these options. After Activation, eXtyles will automatically protect paragraphs that are in a constant-width font or that are correctly tagged with a paragraph style that is set in a constant-width font. You should check any verse/poetry paragraphs to make sure they’re in a font that isn’t used elsewhere in the document and apply such a font if necessary.
In addition, the paragraphs in question must be entirely in the excluded font; paragraphs in a mix of fonts will not be excluded from these Cleanup operations. For instance, if you have programming code paragraphs that contain both Courier and Courier New, select the entire code block and change the font to either Courier or Courier New. Then specify that font as the excluded font in the Cleanup dialog. Similarly, set the font of any verse or poetry sections that should not be affected by these options to a font that doesn’t appear elsewhere in the document, and then specify the chosen font as the excluded font in the Cleanup dialog.
eXtyles will then not apply the white space–related Cleanup functions to paragraphs in the specified font, which means that tabs, extra spaces, leading and trailing spaces, and blank paragraphs will be retained for paragraphs that are in the specified font.
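The "entirely in the excluded font" rule can be sketched as follows. This is a simplified model with hypothetical names rather than the eXtyles implementation: each paragraph is a list of `(text, font)` runs, a blank paragraph is a single empty run that still carries a font, and only the space-collapsing, trimming, and blank-paragraph steps are shown.

```python
import re
from typing import List, Tuple

Run = Tuple[str, str]  # (text, font name): a hypothetical stand-in for a Word run


def is_protected(runs: List[Run], excluded_font: str) -> bool:
    """A paragraph is protected only if every run uses the excluded font."""
    return bool(runs) and all(font == excluded_font for _, font in runs)


def cleanup(paragraphs: List[List[Run]], excluded_font: str) -> List[str]:
    """Clean white space in all paragraphs except the protected ones."""
    result = []
    for runs in paragraphs:
        text = "".join(run_text for run_text, _ in runs)
        if is_protected(runs, excluded_font):
            result.append(text)  # kept verbatim, even if blank
            continue
        text = re.sub(r" {2,}", " ", text).strip(" ")
        if text:  # blank unprotected paragraphs are dropped
            result.append(text)
    return result
```

In this model, a code line that mixes Courier and Courier New is not protected when the excluded font is Courier, so its leading spaces are removed. This is why the whole block should be set in a single font first.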
Authors sometimes apply a code (monospace) font instead of using a monospaced paragraph style. eXtyles protects these paragraphs from white space Cleanup as well.
During Activation, eXtyles checks the document for paragraphs that are in a monospace font but not a monospace-font paragraph style. eXtyles then applies a monospace-font paragraph style (HTML Preformatted) to exclude these paragraphs from the subset of white space operations during Cleanup. The HTML Preformatted paragraphs can then be styled as usual during the Style Paragraphs stage.
By applying the HTML Preformatted paragraph style during Activation, eXtyles protects paragraphs that are in a monospace font from white space cleanup.
When you use this feature and exclude paragraphs in the specified font, the Cleanup features Normalize Document and Auto-Style will also automatically skip paragraphs in the specified font, so that code or verse or otherwise protected paragraphs will not be tagged with your base paragraph style.
Typographic Settings
Operation | Description |
---|---|
Remove Optional Hyphens | Discretionary hyphens are removed. |
Soft Returns | Soft returns—that is, line breaks inserted with Shift + Enter—are converted to spaces or hard returns. |
Character Style to Face Markup Conversion
Operation | Description |
---|---|
Built-in Styles | Converts Word’s built-in character styles, such as Emphasis and Strong, to plain face markup. This conversion may exclude the Word character styles for “Hyperlink,” “Footnote Reference,” “Endnote Reference,” and “Comment Reference,” depending on your configuration. |
User-Defined Styles | Converts cases of user-defined character styles to plain face markup. This conversion is most important in preparing the document for advanced processing functions. |
Auto Text to Plain Text
Operation | Description |
---|---|
Word Fields | Word fields are converted to text. In particular, fields generated by add-in products such as EndNote are automatically converted to plain text. |
Auto-Numbered Headings and List | Numbers generated with Word’s auto-numbering feature in headings and in numbered lists are converted to plain text. |
Comments, Bookmarks, and Hidden Text
Operation | Description |
---|---|
Remove Word Comments | Remove Word comments in the file. The default is to remove all comments. You can also remove comments from specific reviewers by selecting the reviewer from the drop-down list. |
Remove Word Bookmarks | Remove Word bookmarks from the file. We recommend selecting this option to avoid conflicts with eXtyles Advanced Processes. |
Hidden Text | Warns when hidden text is present in the document and handles it in one of three ways: (1) Flag with Comments (default), (2) Make Not Hidden, or (3) Leave As-Is. |
Graphics
Operation | Description |
---|---|
Leave Graphics/Remove Graphics/Export and Remove Graphics | eXtyles Cleanup can remove all graphics from your manuscript, allowing eXtyles functions to work faster. The default setting is to leave graphics as they are (“Leave Graphics”); alternatively, you can select “Remove Graphics” or “Export and Remove Graphics.” If you choose to export graphics, they will be placed in a subfolder of the document and named according to the document. Equations and Excel worksheets are not removed when this option is selected. Advanced configuration capabilities for this option are available. |
Graphics replacement text box | When you choose to remove graphics, they are replaced with a text string to identify where they were located and to identify to which exported file they correlate. The default text is “[INSERT FIGURE %Z]” where “%Z” becomes “001,” “002,” and so on, as sequential graphics are removed. This text may be altered to meet your specific needs. Note: This option is disabled unless you select “Remove Graphics” or “Export and Remove Graphics.” |
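To illustrate how the replacement text expands, the sketch below substitutes a three-digit, zero-padded sequence number for "%Z" in the template. The function name is hypothetical, and the behavior beyond 999 graphics is not described here, so the sketch makes no claim about it.

```python
def graphic_placeholder(template: str, index: int) -> str:
    """Replace %Z in the template with a zero-padded sequence number."""
    return template.replace("%Z", f"{index:03d}")


# Placeholders for the first three removed graphics, using the default text.
placeholders = [graphic_placeholder("[INSERT FIGURE %Z]", n) for n in range(1, 4)]
```

A custom template such as "<<FIG %Z>>" would expand the same way under this sketch.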
Tables
Operation | Description |
---|---|
Remove Borders | Removes borders on all cells and on the entire table. Unchecked by default. |
Center Columns | Centers text in all table columns. Unchecked by default. |
Remove Shading | Removes all shading in tables. Unchecked by default. |
Add Top/Bottom Border | Adds an x-pt. rule to top and bottom of all tables. Unchecked by default. |
Left Justify First Column | Left-justifies text in first column of all tables. Unchecked by default. |
Insert Space in Empty Cell | Inserts a single non-breaking space in empty table cells. Checked by default. |
Add Header Border | Removes any specific width settings and applies auto-fit to all table cells. Unchecked by default. |
Remove 1st-Line Indent | Unindents the first line of all table cells. Unchecked by default. |
Auto-Fit Contents | Applies auto-fit to all table cells. Unchecked by default. |
Extract Equation Numbers
Starting in May 2020, eXtyles can locate equation numbers that have been added inside Word Equation Builder objects and move them outside of the math object. This is necessary for Citation Matching to work correctly (i.e., the correct application of the cite_eq character style to citations of equations), because eXtyles will not detect the equation number if it is inside the Equation Builder math. Only equation numbers that are placed after the equation and are the last content in the paragraph will be moved by this functionality.
Detable Non-Table Content
Occasionally, content in your document may be incorrectly formatted in a Word table. For example, the author may have used table cells to achieve white space between an equation and its number.
This content should not remain in table cells, and eXtyles Cleanup can automatically “detable” such content.
For all of the following functions, content will not be detabled if:
- The table is preceded by a paragraph that eXtyles identifies as a table title
- The table is followed by a table title that is not itself followed by another table
- The first row of the table contains a table title
Single-Row Tables
This Cleanup option finds all tables that have just one row and converts them to tab-separated text if the contents of the table look like an equation, or to paragraph-separated text otherwise.
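The choice of separator can be sketched as below. How eXtyles decides that a row "looks like an equation" is not documented here, so the test is passed in as a parameter, and `naive_equation_test` is a deliberately crude, hypothetical stand-in. The table-title exceptions listed earlier are not modeled.

```python
import re
from typing import Callable, List


def naive_equation_test(cells: List[str]) -> bool:
    """Hypothetical heuristic: an '=' somewhere and a final cell like '(3)'."""
    return (
        any("=" in cell for cell in cells)
        and re.fullmatch(r"\(\d+\)", cells[-1].strip()) is not None
    )


def detable_single_row(
    cells: List[str],
    looks_like_equation: Callable[[List[str]], bool] = naive_equation_test,
) -> str:
    """Join the cells of a one-row table with tabs (equation) or paragraph breaks."""
    separator = "\t" if looks_like_equation(cells) else "\n"
    return separator.join(cells)
```

Under this sketch, a row holding an equation and its number becomes one tab-separated line, while a row of ordinary text cells becomes one paragraph per cell.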
Equations
Finds all tables that have content in just two columns (discounting empty cells), with an equation (i.e., an Equation 3.0 or MathType object, or a Word Equation Builder object) in one cell and an equation number in the other cell, regardless of the number of rows. These tables are then converted to tab-separated text.
Figures
Finds all tables that contain only (1) images, (2) figure captions, or (3) panel captions or panel labels. If the table contains more than one figure caption, the table is converted to paragraph-separated text; otherwise, it is converted to tab-separated text.
References
Finds all tables that are preceded by a reference list title and contain no more than two columns of content, and converts them to tab-separated text. This option is selected by default on the Cleanup dialog.
Additional Cleanup Dialog Features
The Cleanup dialog contains three buttons, located along the right-hand side of the dialog box, to allow rapid setting selections:
- Set All: Select all options in the dialog.
- Clear All: Clear all options in the dialog. Click this button first if you want to run just a single operation, such as removing comments from a specific reviewer only.
- Reset: Return all check boxes in the dialog to their default settings.