Regular Expressions

Regular expressions can be used when searching and extracting fields from message text.

 

The regular expression syntax used by ThinkAutomation can match a wide range of text patterns. The special characters it recognizes are listed below.

 

Char      Description

^         Beginning of a string.

$         End of a string.

.         Any character.

[list]    Any character in list. For example, "[AEIOU]" matches any single uppercase vowel.

[^list]   Any character not in list. For example, "[^ ]" matches any character except a space.

[A-Z]     Any character between 'A' and 'Z'. For example, "[0-9]" matches any single digit.

?         Repeat previous character zero or one time. For example, "10?" matches "1" and "10".

*         Repeat previous character zero or more times. For example, "10*" matches "1", "10", "1000", etc.

+         Repeat previous character one or more times. For example, "10+" matches "10", "1000", etc.

\         Escape next character. This is required to match any of the special characters that are part of the syntax. For example, "\.\*\+\\" matches ".*+\". It is also required to encode some special non-printable characters (such as tabs), listed below.
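
The characters in the table above can be tried out quickly outside ThinkAutomation. The following is a minimal sketch that uses Python's `re` module, which is an assumption made for illustration only. ThinkAutomation uses its own engine, but these basic constructs behave the same way in most regular expression engines. The helper name `first_match` and the sample strings are hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Each entry: (pattern, sample text, expected first match).
samples = [
    (r"^Order", "Order 42 shipped", "Order"),       # ^ anchors at the start
    (r"shipped$", "Order 42 shipped", "shipped"),   # $ anchors at the end
    (r"[AEIOU]", "xyzEbc", "E"),                    # any character in list
    (r"[^ ]", "  hi", "h"),                         # any character except a space
    (r"[0-9]", "abc7", "7"),                        # any single digit
    (r"10?", "1", "1"),                             # zero or one "0"
    (r"10*", "1000", "1000"),                       # zero or more "0"
    (r"10+", "1", None),                            # needs at least one "0"
    (r"\.\*\+\\", "a.*+\\b", ".*+\\"),              # escaped special characters
]

for pattern, text, expected in samples:
    assert first_match(pattern, text) == expected
```

Each assertion passes silently; changing a sample string is a quick way to see how a single special character affects the match.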

 

In addition to the characters listed above, there are seven special characters encoded using the backslash. These are listed below:

 

\a Bell (Chr(7))

\b Backspace (Chr(8))

\f Formfeed (Chr(12))

\n New line (Chr(10), vbLf)

\r Carriage return (Chr(13), vbCr)

\t Horizontal tab (Chr(9), vbTab)

\v Vertical tab (Chr(11))
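
As a rough illustration of the tab, carriage return and new line escapes listed above, the sketch below splits a tab-separated line and checks its line ending. It uses Python's `re` module as a stand-in (an assumption, not ThinkAutomation's own engine), and the sample line is invented for the example.

```python
import re

# Hypothetical tab-separated line ending in carriage return + line feed.
line = "SKU-1001\tWidget\t4\r\n"

# \t matches a horizontal tab (Chr(9)), so it can be used as a field separator.
fields = re.split(r"\t", line.rstrip("\r\n"))

# \r\n matches a carriage return (Chr(13)) followed by a new line (Chr(10)).
ends_with_crlf = re.search(r"\r\n$", line) is not None
```

Here `fields` holds the three values between the tabs, and `ends_with_crlf` is true because the line ends with the two-character sequence.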

 

For example:

"^stuff"                 ' any string starting with "stuff"

"stuff$"                 ' any string ending with "stuff"

"o.d"                    ' "old", "odd", "ord", etc

"o[ld]d"                 ' "old" or "odd" only

"o[^l]d"                 ' "odd", "ord", but not "old"

"od?"                    ' "o" or "od"

"od*"                    ' "o", "od", "odd", etc

"od+"                    ' "od", "odd", etc

"[A-Z][a-z]*"            ' any capitalized word (one uppercase letter followed by lowercase letters)

"[0-9]+"                 ' any stream of digits

"\."                     ' decimal point (needs escape character)

"[1-9][0-9]*"            ' any stream of digits not starting with 0

"[+\-]?[0-9]*[\.]?[0-9]*" ' any number with optional sign and decimal point (needs two escape characters)

"[A-Z0-9]+ [0-9]+[A-Z]+" ' matches text shaped like a UK post code

"[a-zA-Z0-9._-]+@[a-zA-Z0-9_-]+\.[a-zA-Z][a-zA-Z.]*[a-zA-Z]" ' matches most email addresses

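
To see how some of the longer patterns above behave on real text, here is a minimal sketch that applies them with Python's `re` module. Python is an assumption used only for illustration, and the sample message is invented. The number pattern is worth checking in particular: every part of it is optional, so it also matches an empty string.

```python
import re

# Patterns copied from the examples above.
EMAIL = r"[a-zA-Z0-9._-]+@[a-zA-Z0-9_-]+\.[a-zA-Z][a-zA-Z.]*[a-zA-Z]"
POSTCODE = r"[A-Z0-9]+ [0-9]+[A-Z]+"
NUMBER = r"[+\-]?[0-9]*[\.]?[0-9]*"

# Hypothetical message text.
message = "Contact jane.doe@example.co.uk at ST4 7QB for order 1045."

# re.search returns the first match found when scanning left to right.
email = re.search(EMAIL, message).group(0)
postcode = re.search(POSTCODE, message).group(0)

# re.fullmatch only succeeds when the whole string fits the pattern.
number_ok = re.fullmatch(NUMBER, "-12.50") is not None
empty_ok = re.fullmatch(NUMBER, "") is not None  # True: all parts are optional
```

Because the number pattern can match nothing at all, it is safest to combine it with surrounding text or anchors when extracting values.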
 

 

Additional ThinkAutomation Regular Expressions

In both the Look For and Then Look For fields you can include a number of control characters in addition to regular expressions:

<CR>      Carriage return

<LF>      Line feed

<CRLF>    Carriage return/line feed

<TAB>     Tab

<ESC>     Escape

*         When used on its own, the * character finds the next character that is not a space or a control character.

<xxx>     Where xxx is the ASCII character code

 

This can be useful when searching for data. For example, suppose the text contains:

 

Your serial number is:

1234-5678

 

To extract the serial number, look for 'Your serial number is:' and then look for '<CRLF>', because the serial number is on the next line. Alternatively, look for 'Your serial number is:' and then look for '*', which finds the first visible character after 'Your serial number is:'.
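
The two approaches can be approximated outside ThinkAutomation as follows. This is only a sketch under stated assumptions: it uses Python's `re` module, the helper `extract_after` is hypothetical, and it emulates the Look For and Then Look For behaviour described above rather than reproducing how ThinkAutomation implements it.

```python
import re

def extract_after(text, look_for, then_look_for):
    """Return the rest of the line found after look_for and then_look_for.

    Hypothetical helper: look_for is treated as literal text,
    then_look_for as a regular expression.
    """
    pattern = re.escape(look_for) + then_look_for + r"([^\r\n]*)"
    m = re.search(pattern, text)
    return m.group(1) if m else None

# Sample message using carriage return/line feed line endings.
message = "Hello,\r\nYour serial number is:\r\n1234-5678\r\nThank you."

# Stand-in for Then Look For '<CRLF>': the value starts after the line break.
serial_crlf = extract_after(message, "Your serial number is:", r"\r\n")

# Rough stand-in for Then Look For '*': skip spaces and control characters
# and start at the next visible character.
serial_star = extract_after(message, "Your serial number is:", r"[\x00-\x20]*")
```

Both calls return the serial number from the sample message. The second form is more tolerant, since it also works when the value follows on the same line after a space.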

 

ThinkAutomation regular expressions are compatible with the Microsoft RegEx Library. For more regular expression examples, please see:

 

http://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx

 

http://regexone.com/

 

http://en.wikipedia.org/wiki/Regular_expression

 

 

ThinkAutomation © Parker Software 2016