Introduction
FME Form 2024.2 introduced a Hub Package for reading and writing Microsoft Word files. With the MSWordStyler transformer, users can now read .docx files and extract information such as heading levels, styles, and paragraph text. This information is stored in format attributes, which can be exposed either in the reader itself or with an AttributeExposer by importing them from the Data Cache.
Transformer Properties
MSWord Reader
Open FME Workbench and select New. Click on the add data box in the centre and type MSWord.
Navigate to the location of the Word document on your computer and double-click on it. No additional parameters are required to read the file in, so you can click OK, or click on Format Attributes to expose specific msword attributes.
An AttributeExposer can also be used to access additional attributes. In the transformer parameters window, click on Import and select From Data Cache. This option requires users to run the workspace first so the file is loaded into the data cache.
Some of the attributes you can expose are:
Bullets:
- Indent level
- Type of list - bullets, checklist, numbered, etc.
- The text next to each bullet point
Font:
- Styles like bold, italic, and underlined
- The color of the text, if applied
- Size and name of the font. The default font and size will show as <missing>
Headings:
- Heading Level
- Heading text
Paragraphs:
- Paragraph style
- Paragraph text
Images:
- Information about the image is extracted, such as the filepath and size
Tables:
- Table styles and size are provided, as well as the information within each cell
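To give a feel for where these values come from, here is a minimal Python sketch that pulls similar information straight out of a .docx file using only the standard library. It is not how the MSWord reader is implemented, and the function name read_paragraphs and the dictionary keys are hypothetical names chosen for this illustration, not FME attribute names. It assumes the document uses the built-in English heading style ids (Heading1, Heading2, and so on) and only inspects the first run of each paragraph for font settings. Values that are not set explicitly in the document come back as None, which is comparable to a default font and size showing as <missing>.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _attr(element, name):
    """Return a w:-namespaced attribute value, or None if the element or attribute is absent."""
    if element is None:
        return None
    return element.get(f"{{{W}}}{name}")


def read_paragraphs(docx):
    """Return one dict per paragraph with style, heading level, text, font name and size.

    `docx` is a path or a binary file-like object (hypothetical helper for illustration).
    """
    # A .docx file is a ZIP archive; the body text lives in word/document.xml
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    rows = []
    for para in root.iter(f"{{{W}}}p"):
        style = _attr(para.find("w:pPr/w:pStyle", NS), "val")
        # Assumption: built-in heading style ids look like "Heading1", "Heading2", ...
        match = re.fullmatch(r"Heading(\d)", style or "")

        font_name = font_size = None
        run_props = para.find("w:r/w:rPr", NS)  # properties of the first run only
        if run_props is not None:
            font_name = _attr(run_props.find("w:rFonts", NS), "ascii")
            half_points = _attr(run_props.find("w:sz", NS), "val")
            # w:sz stores the size in half-points
            font_size = int(half_points) / 2 if half_points else None

        rows.append({
            "style": style,
            "heading_level": int(match.group(1)) if match else None,
            "text": "".join(t.text or "" for t in para.iter(f"{{{W}}}t")),
            "font_name": font_name,
            "font_size": font_size,
        })
    return rows
```

Calling read_paragraphs("people.docx") (a hypothetical file name) returns a list of dictionaries, one per paragraph, including paragraphs inside table cells.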
If an attribute has no value, the Data Preview table shows <missing> for it. Below is an example of a list of people in a Word document.
MSWord Writer
The MSWord writer is a simple way to write a new .docx file. Combined with the MSWordStyler transformer, it lets you apply predefined and custom styles to headings, tables, and paragraphs based on the format attribute data in the file.
For details on creating the template file, please see our article How to Create a Microsoft Word Base File to Use With the MSWordStyler Transformer.