348
01-28008-0013-20050204
Fortinet Inc.
Using Perl regular expressions
Spam filter
Regular expression vs. wildcard match pattern
In Perl regular expressions, the ‘.’ character matches any single character. It is similar to
the ‘?’ character in a wildcard match pattern. As a result:
• fortinet.com not only matches fortinet.com but also matches fortinetacom,
fortinetbcom, fortinetccom and so on.
To match a special character such as ‘.’ or ‘*’ literally, precede it with the escape
character ‘\’. For example:
• To match fortinet.com, the regular expression should be: fortinet\.com
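The difference between the unescaped and escaped dot can be checked with a short script. This is a minimal sketch that uses Python's re module as a stand-in for the FortiGate regular expression engine (an assumption; the two agree on the constructs shown here). The variable names are illustrative only.

```python
import re

# Assumption: Python's re behaves like Perl regular expressions for '.' and '\.'.
unescaped = re.compile(r"fortinet.com")   # '.' matches any single character
escaped = re.compile(r"fortinet\.com")    # '\.' matches a literal dot only

hosts = ["fortinet.com", "fortinetacom", "fortinetbcom"]

# For each host: (matched by unescaped pattern, matched by escaped pattern)
dot_results = {
    host: (bool(unescaped.search(host)), bool(escaped.search(host)))
    for host in hosts
}

for host, (loose, strict) in dot_results.items():
    print(f"{host:14} unescaped={loose} escaped={strict}")
```

Only fortinet.com is matched by the escaped pattern, while the unescaped pattern accepts all three strings.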
In Perl regular expressions, ‘*’ matches the character before it 0 or more times,
not any character 0 or more times. For example:
• forti*\.com matches fortiiii.com but does not match fortinet.com
To match any character 0 or more times, use ‘.*’ where ‘.’ means any character and
‘*’ means 0 or more times. For example, the wildcard match pattern forti*.com
should therefore be written as forti.*\.com.
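The following sketch compares the two meanings of ‘*’. It assumes Python's re module behaves like the device's Perl-style engine for these constructs, and it uses fnmatch.fnmatchcase from the Python standard library as an illustrative wildcard matcher; neither is part of the FortiGate unit. The regular expression equivalent of the wildcard forti*.com is written here as forti.*\.com.

```python
import re
from fnmatch import fnmatchcase

# Assumption: Python's re stands in for the Perl-style engine on the device.
star_on_i = re.compile(r"forti*\.com")    # '*' repeats only the preceding 'i'
any_chars = re.compile(r"forti.*\.com")   # '.*' repeats any character

star_results = {}
for host in ["fortiiii.com", "fortinet.com"]:
    star_results[host] = (
        bool(star_on_i.fullmatch(host)),   # regex: forti*\.com
        bool(any_chars.fullmatch(host)),   # regex: forti.*\.com
        fnmatchcase(host, "forti*.com"),   # wildcard: forti*.com
    )

for host, row in star_results.items():
    print(host, row)
```

The wildcard pattern and forti.*\.com accept both host names, whereas forti*\.com accepts only fortiiii.com.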
Word boundary
In Perl regular expressions, a pattern does not have an implicit word boundary. For
example, the regular expression “test” matches not only the word “test” but also
any word that contains “test”, such as “atest”, “mytest”, “testimony”, and
“atestb”. The notation “\b” specifies a word boundary. To match exactly the word
“test”, the expression should be \btest\b.
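The effect of \b can be illustrated with a small check. This sketch again assumes Python's re module as a stand-in for the device's engine; the sample phrase “a test case” is an added illustration and does not come from the guide.

```python
import re

# Assumption: \b behaves the same in Python's re as in Perl regular expressions.
bare = re.compile(r"test")         # no word boundary: matches inside longer words
bounded = re.compile(r"\btest\b")  # matches "test" only as a whole word

samples = ["test", "atest", "mytest", "testimony", "atestb", "a test case"]

# For each sample: (matched by bare pattern, matched by bounded pattern)
boundary_results = {
    s: (bool(bare.search(s)), bool(bounded.search(s))) for s in samples
}

for s, (loose, strict) in boundary_results.items():
    print(f"{s:12} test={loose} \\btest\\b={strict}")
```

Every sample contains “test”, but only “test” and “a test case” contain it as a separate word.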
Case sensitivity
Regular expression pattern matching is case sensitive in the Web and Spam filters. To
make a word or phrase case insensitive, end the regular expression with the modifier
/i
For example,
/bad language/i
blocks all instances of “bad language” regardless of case.
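Python does not use the /pattern/i notation, so the sketch below uses the re.IGNORECASE flag as the closest equivalent of the /i modifier. It is an illustration of the matching behavior only, under the assumption that the device treats /i the way Perl does.

```python
import re

# re.IGNORECASE plays the role of Perl's trailing /i modifier (assumed equivalent).
sensitive = re.compile(r"bad language")
insensitive = re.compile(r"bad language", re.IGNORECASE)

phrases = ["bad language", "Bad Language", "BAD LANGUAGE"]

# For each phrase: (case-sensitive match, case-insensitive match)
case_results = {
    p: (bool(sensitive.search(p)), bool(insensitive.search(p))) for p in phrases
}

for p, (exact, any_case) in case_results.items():
    print(f"{p:14} sensitive={exact} insensitive={any_case}")
```

Without the flag only the all-lowercase phrase matches; with it, all three variants match.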
Table 32: Perl regular expression formats
Expression   Matches
abc          abc (that exact character sequence, but anywhere in the string)
^abc         abc at the beginning of the string
abc$         abc at the end of the string
a|b          either of a and b
^abc|abc$    the string abc at the beginning or at the end of the string
ab{2,4}c     an a followed by two, three or four b's followed by a c
ab{2,}c      an a followed by at least two b's followed by a c
ab*c         an a followed by any number (zero or more) of b's followed by a c
ab+c         an a followed by one or more b's followed by a c
ab?c         an a followed by an optional b followed by a c; that is, either abc or ac
a.c          an a followed by any single character (not newline) followed by a c
a\.c         a.c exactly
[abc]        any one of a, b and c
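The rows of Table 32 can be exercised with a small self-check. This is a sketch under the assumption that Python's re module agrees with the device's Perl-style engine for these expressions; the sample strings and the names TABLE and check are illustrative and do not come from the guide.

```python
import re

# Each entry: (expression, strings that should match, strings that should not).
TABLE = [
    (r"abc",       ["xxabcxx"],             ["abx"]),
    (r"^abc",      ["abcdef"],              ["xabc"]),
    (r"abc$",      ["xabc"],                ["abcx"]),
    (r"a|b",       ["a", "b"],              ["c"]),
    (r"^abc|abc$", ["abcxx", "xxabc"],      ["xabcx"]),
    (r"ab{2,4}c",  ["abbc", "abbbbc"],      ["abc", "abbbbbc"]),
    (r"ab{2,}c",   ["abbc", "abbbbbbc"],    ["abc"]),
    (r"ab*c",      ["ac", "abc", "abbbc"],  ["adc"]),
    (r"ab+c",      ["abc", "abbc"],         ["ac"]),
    (r"ab?c",      ["abc", "ac"],           ["abbc"]),
    (r"a.c",       ["abc", "a-c"],          ["ac", "a\nc"]),
    (r"a\.c",      ["a.c"],                 ["abc"]),
    (r"[abc]",     ["a", "b", "c"],         ["xyz"]),
]

def check(table):
    """Return a list of (expression, string, problem) for rows that misbehave."""
    failures = []
    for pattern, hits, misses in table:
        rx = re.compile(pattern)
        for s in hits:
            if not rx.search(s):
                failures.append((pattern, s, "expected a match"))
        for s in misses:
            if rx.search(s):
                failures.append((pattern, s, "expected no match"))
    return failures

failures = check(TABLE)
print("all rows behave as described" if not failures else failures)
```

The check uses search rather than a full-string match because, as the first row states, an expression without anchors matches anywhere in the string.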