Messages retrieved in HTML format are fine for viewing, but the markup makes them poorly suited to extracting data. Fortunately, most email clients also include a 'plain text' portion, which is a version of the HTML message without any of the HTML tags. If that plain text portion exists, ThinkAutomation automatically uses it for extraction.
What If There Is No Plain Text?
If the plain text portion of the message does not exist, ThinkAutomation will create one by removing all the HTML tags itself. However, the resulting plain text may need additional extraction options, depending on how the original message was formatted.
To set up your extraction fields properly, use the Find & Extract Helper field in the Field Extraction form. Paste the complete HTML source code of the message into this field. ThinkAutomation will then automatically convert it to plain text. The resulting plain text is what ThinkAutomation will 'see' during field extraction.
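To see why tag-stripped text can need extra extraction options, it can help to strip a small HTML fragment yourself. The Python sketch below is only an illustration using the standard library; it is not ThinkAutomation's own conversion, and the names TagStripper, html_to_text and the sample message are all hypothetical.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Keeps text content and drops all tags (illustrative only)."""

    # Assumption: these tags start a new line in the plain text output.
    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML fragment with tags removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()
    raw = "".join(stripper.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Hypothetical message body laid out as a two-column table.
sample = (
    "<table>"
    "<tr><td>Name:</td><td>Jane Smith</td></tr>"
    "<tr><td>Order:</td><td>10042</td></tr>"
    "</table>"
)

print(html_to_text(sample))
```

In this sketch the table cells carry no whitespace between them, so each label runs straight into its value (for example, the first line becomes "Name:Jane Smith"). Layout details of this kind in the original message are the reason the converted text may need different extraction settings than a message that arrived with its own plain text portion.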
Extracting From HTML Manually Using Scripts
The MSG_Html field contains the HTML portion of the message, whilst the MSG_Body field contains the plain text. You can use MSG_Html in extraction scripts if you want to extract using specific HTML tags.
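As a rough illustration of tag-based extraction, the sketch below reads the table cell that follows a given label cell. It is written in Python purely to show the idea and does not use ThinkAutomation's script environment; the variable msg_html stands in for the contents of MSG_Html, and CellCollector and value_after_label are hypothetical names.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) still belongs to the cell.
        if self._in_cell:
            self.cells[-1] += data


def value_after_label(html, label):
    """Return the text of the cell that follows the cell equal to `label`."""
    collector = CellCollector()
    collector.feed(html)
    collector.close()
    cells = [cell.strip() for cell in collector.cells]
    for index, text in enumerate(cells[:-1]):
        if text == label:
            return cells[index + 1]
    return None  # label not found


# Hypothetical stand-in for the contents of MSG_Html.
msg_html = (
    "<table>"
    "<tr><td>Name:</td><td>Jane <b>Smith</b></td></tr>"
    "<tr><td>Order:</td><td>10042</td></tr>"
    "</table>"
)

order_number = value_after_label(msg_html, "Order:")
customer_name = value_after_label(msg_html, "Name:")
```

Working from the tags keeps the label and value cells separate even when the tag-stripped plain text would run them together. The sketch assumes simple, non-nested tables; messages with nested tables or other layouts would need a different approach.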