Introduction to PowerShell Regex - Regular Expression
Typical jobs for Regex are to find patterns in text, and to replace
individual
characters or even whole words.
It's often when numbers mix with text that confusion occurs, and that's when you need a
PowerShell script to solve the problem. For example, telephone numbers and bank sort codes can
be tricky to process because they contain dashes, or a specific grouping of
numbers. Keep in mind that it's rare that you would use regex in
isolation; therefore, my examples are designed to help you master this one technique so
that you can incorporate pattern recognition into a bigger script.
It all started back in the days when DOS was king. How we loved
typing the wildcard * if we wanted to display all files. What PowerShell's
regex
does is refine 'all' so that you can filter a sub-set of data
into the output. It's defining the subset that makes regex so potent,
yet so difficult to control unless you are an expert in its logic and its
syntax.
The problem for beginners is that regex has a bewildering array of wacky
syntactic symbols.
As a newbie you may find that other people's examples do not make sense. Furthermore, you
realize that if you experiment, one wrong character means the command will not
produce the desired results. My mission is to give you a grounding in
the basic structure of regex; from there it will be over to you to employ regex to solve your particular problem.
Start with -match
Before investigating Regex, it helps if you gain experience by testing
comparison operators such as -match, -like or -contains. Please note
that my examples contain variables; while they aren't strictly necessary,
$Variables help me to identify different sections of the expression.
$Name = "Alan Thomas 1949"
$Name -match "Alan"
$Matches # Built-in variable that holds the text just matched
Expected PowerShell result: True
Note in passing, PowerShell also creates a $Matches variable which I use
for troubleshooting unexpected results.
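The three operators mentioned above behave quite differently, and a short side-by-side test may help. This is only a sketch; the variable names ($ByMatch, $Found and so on) are my own labels for illustration.

```powershell
$Name = "Alan Thomas 1949"

# -match runs a regex search anywhere in the string.
$ByMatch = $Name -match "Thomas"
# After a successful -match, $Matches[0] holds the text that matched.
$Found = $Matches[0]

# -like uses wildcards, so it needs the * on either side.
$ByLike = $Name -like "*Thomas*"

# -contains tests membership of a collection, not part of a string.
$ByContains = $Name -contains "Thomas"

"match: $ByMatch  like: $ByLike  contains: $ByContains"
# match: True  like: True  contains: False
```

Only -match uses regular expressions, which is why the rest of this page concentrates on it.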
[Regex]::IsMatch() can be considered the same as -cmatch (case
sensitive match), whereas plain -match ignores case. When developers work with regular expressions they often prefer to work with [Regex]::IsMatch();
professionals say that this method is nearer the underlying .NET class, System.Text.RegularExpressions.Regex.
However, Guy favours -match or -cmatch as they are shorter, and -cmatch seems to produce exactly
the same results.
$Name = "Alan Thomas 1949"
[Regex]::IsMatch($Name,"Alan")
The expected result is True
In the above example [Regex] calls for the method IsMatch(). Then
it's up to us to
supply two values, the input string ($Name) followed by a comma, and the
pattern to match ("Alan"). While I have chosen to control the input
via a $Variable, you could simplify the expression thus:
[Regex]::IsMatch("Alan Thomas 1949","Alan")
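To see where the operators and the static method differ, try a lower-case pattern against the same name. This is a minimal sketch; the variable names are my own, and the third argument uses the standard .NET RegexOptions enumeration.

```powershell
$Name = "Alan Thomas 1949"

$Insensitive = $Name -match "alan"              # True: -match ignores case
$Sensitive = $Name -cmatch "alan"               # False: -cmatch respects case
$StaticCall = [Regex]::IsMatch($Name, "alan")   # False: case sensitive by default

# Passing an option makes the static method behave like -match.
$IgnoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$StaticLoose = [Regex]::IsMatch($Name, "alan", $IgnoreCase)   # True

"$Insensitive $Sensitive $StaticCall $StaticLoose"
```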
Quotes, or "speech marks", play a key role with
Regex. In simple patterns such as these it does not
matter if you use single or double quotes. The only difference between
the example above and the example below is the type of quotes.
However, double quotes
come into their own when the PowerShell expression contains a $variable that needs
expanding; single quotes would treat the $variable as a literal.
$Name = 'Alan Thomas 1949'
[Regex]::IsMatch($Name,'Alan')
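The difference between the quote styles only shows up once a variable appears inside the pattern. Here is a small sketch, assuming a hypothetical $First variable that holds the first name.

```powershell
$First = "Alan"
$Text = "Alan Thomas 1949"

# Double quotes: PowerShell expands $First, so the pattern becomes Alan Thomas.
$Expanded = [Regex]::IsMatch($Text, "$First Thomas")   # True

# Single quotes: the pattern is the literal characters $First Thomas.
# The regex engine then reads $ as the end-of-string anchor, so nothing matches.
$Literal = [Regex]::IsMatch($Text, '$First Thomas')    # False

"$Expanded $Literal"
```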
The style of brackets is always significant in PowerShell. IsMatch()
requires round brackets (parentheses), whereas a portion of the
search string such as [a-z] needs a pair of square brackets. If you see
pattern matching code with {curly} brackets, they often refer to
quantifiers. Once again, follow the correct bracket syntax, or else you will
get unexpected results.
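All three bracket styles can appear in one short command. This sketch uses made-up test strings purely for illustration.

```powershell
# ( ) round brackets enclose the arguments to IsMatch
# [ ] square brackets define a set of allowed characters
# { } curly brackets say how many times to repeat
$ThreeLetters = [Regex]::IsMatch("abc", "[a-z]{3}")   # True: three letters in a row
$TwoLetters = [Regex]::IsMatch("ab1", "[a-z]{3}")     # False: only two letters in a row

"$ThreeLetters $TwoLetters"
```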
Let us suppose we wanted to test the data for either the name Alan or Alun.
Here is a simple pattern matching example where the third letter can be either
'a' or 'u'. Another method would be to employ the period '.', which stands for any single character; however,
that would require "Al.n",
and not "Al[.]n", because inside square brackets the period is just a literal full stop. Another use of square brackets is to check for
a digit, for example [0-9]. In this case we use the dash to tell PowerShell
to expect a contiguous range of digits from zero to nine.
$Name = "Alan Thomas 1949"
[Regex]::IsMatch($Name,"Al[au]n")
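To compare the square bracket technique with the period, it helps to run both against a short list of names. This is a sketch only; the name "Alen" and the variable names are my own additions for testing.

```powershell
$Names = "Alan", "Alun", "Alen"

# [au] accepts only 'a' or 'u' as the third letter.
$ByClass = @($Names | Where-Object { [Regex]::IsMatch($_, "Al[au]n") })   # Alan, Alun

# The period accepts any single character, so Alen slips through as well.
$ByPeriod = @($Names | Where-Object { [Regex]::IsMatch($_, "Al.n") })     # all three

# [.] is a literal full stop, so none of the names match.
$ByLiteralDot = @($Names | Where-Object { [Regex]::IsMatch($_, "Al[.]n") })

# [0-9] four times in a row finds the year.
$HasYear = [Regex]::IsMatch("Alan Thomas 1949", "[0-9][0-9][0-9][0-9]")  # True
```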
Use of +
At first I could not see the point of incorporating the + symbol in regex
expressions, but then I had a particular problem: some people spell their
name Allan. How could I cope with this double ll? The answer was
to insert a plus into the pattern, thus:
$Name = "Allan Thomas 1949"
[Regex]::IsMatch($Name,"Al+[au]n")
In summary, I now have a pattern that finds Alun, Alan and
Allan. Without the + it's particularly difficult to find Allan as it
contains 5 letters whereas Alan and Alun only have 4 letters. + means
1-n matches of the preceding character (* means 0-n matches). To see the power, and point, of this
symbol,
try removing the plus.
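Here is one way to try all three variations in a single run. It is a sketch only; the misspelling "Aan" is my own invention, added to show what the * allows, and $Results is just a label for the output table.

```powershell
$Names = "Alan", "Alun", "Allan", "Aan"
$Results = [ordered]@{}

foreach ($Pattern in "Al[au]n", "Al+[au]n", "Al*[au]n") {
    # Collect the names that each pattern accepts.
    $Hits = @($Names | Where-Object { [Regex]::IsMatch($_, $Pattern) })
    $Results[$Pattern] = $Hits -join ", "
}

$Results
# Al[au]n   finds Alan, Alun
# Al+[au]n  finds Alan, Alun, Allan
# Al*[au]n  finds Alan, Alun, Allan, Aan
```

The last line is a reminder that * can be too generous, because zero occurrences of 'l' also counts as a match.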
Backslash has several roles in regex. Firstly, it introduces
special constructs such as the anchor \b (word boundary), or character classes; for
example, \s means whitespace and not the letter 's'. The
backslash is also the escape character that turns a special symbol into a literal, for example the period '.' in
an IP address pattern, where \. means a real dot.
A pattern that employs \d (digit) can match the basic format
of an IP address; however, it does not test for numbers bigger than 255.
\d tests for a number (as opposed to a letter), while the curly
brackets {1,3} mean containing 1, 2, or 3 digits.
Matching the Beginning and End of Strings - Anchors
In many countries Thomas could be a first name or a surname. If we wanted
to search for only a last name of "Thomas" then we would append the $; thus
we need Thomas$. Naturally, we have to assume that the surname
would be at the very end of the data input.
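A quick test of the $ anchor, together with its partner ^ for the beginning of a string, might look like this. The names are sample data of my own, and I use single quotes so that PowerShell leaves the $ alone.

```powershell
# $ pins the pattern to the end of the string.
$Surname = [Regex]::IsMatch("Alan Thomas", 'Thomas$')      # True
$FirstName = [Regex]::IsMatch("Thomas Jones", 'Thomas$')   # False

# ^ pins the pattern to the beginning.
$StartsWith = [Regex]::IsMatch("Thomas Jones", '^Thomas')  # True

# A trailing year defeats the $ anchor, as the assumption above warns.
$WithYear = [Regex]::IsMatch("Alan Thomas 1949", 'Thomas$')  # False

"$Surname $FirstName $StartsWith $WithYear"
```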
$strText = "The man called Mr Grey wore the big red coat."
$Pattern = "the"
$matched = [regex]::matches($strText, $pattern)
"Result of using the match method, we get the following:"
$matched | format-table index, length, value -auto
Note: This example will only find one instance of 'the'. To make the
search case insensitive try introducing (?i) before 'the', thus:
$Pattern = "(?i)the". As a result you should find two
instances of 'the'.
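To confirm the counts for yourself, compare the two patterns side by side. This is a minimal sketch; $Sensitive and $Insensitive are my own variable names.

```powershell
$strText = "The man called Mr Grey wore the big red coat."

$Sensitive = [Regex]::Matches($strText, "the")
$Insensitive = [Regex]::Matches($strText, "(?i)the")

"Case sensitive: $($Sensitive.Count), case insensitive: $($Insensitive.Count)"
# Case sensitive: 1, case insensitive: 2

# List where each case-insensitive match starts.
$Insensitive | ForEach-Object { "Index $($_.Index): $($_.Value)" }
```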
Alternative Match Techniques
The purpose is to check that you have a single block of text with no spaces.
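One possible way to make that check, and I stress that the pattern is my own assumption rather than anything shown on this page, is to combine the ^ and $ anchors with \S, which means any character that is not whitespace.

```powershell
# ^ start of string, \S+ one or more non-space characters, $ end of string.
$Pattern = '^\S+$'

$OneBlock = [Regex]::IsMatch("AlanThomas1949", $Pattern)      # True
$WithSpaces = [Regex]::IsMatch("Alan Thomas 1949", $Pattern)  # False

"$OneBlock $WithSpaces"
```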
$strText = "The man wearing the gray overcoat"
$Pattern = "gray" # [regex]::replace is case sensitive, so use lower case
$New = "grey"
$strReplace = [regex]::replace($strText, $pattern, "$New")
"We will now replace $Pattern with $New :"
$strReplace
The key command here is replace, as in [Regex]::replace. Observe
how replace has three arguments: the input text, the pattern to search for
and finally, the replacement text.
Notice in passing that because we employ the double quotes PowerShell
expands the variables $Pattern and $New.
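PowerShell also has a -replace operator, which is shorter to type. One difference worth testing is case: the static method is case sensitive by default, while the operator is not. This is a sketch with my own variable names.

```powershell
$strText = "The man wearing the gray overcoat"

# The static method respects case, so "Gray" does not find "gray".
$Strict = [Regex]::Replace($strText, "Gray", "Grey")

# The -replace operator ignores case, so the word is swapped.
$Loose = $strText -replace "Gray", "Grey"

$Strict   # The man wearing the gray overcoat
$Loose    # The man wearing the Grey overcoat
```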
Regex is bigger and better than the old DOS * wildcard. The
only problem is that the increased ability to control regular expressions
brings greater complexity for the beginner. As ever, my advice is to
start slowly, choose a simple example and then build on success. The
key to mastering regex is to understand the syntax.
Please write in if you see errors of any kind. Please report any factual mistakes, grammatical errors or broken links; I will be happy not only to correct the fault, but also to give you credit.