Regular expressions and metacharacters in PowerShell
Patterns
When administrators worked exclusively at the command line, they could impress the ordinary user with the endless rows of cryptic letter and number combinations (e.g., (\d{1,4}\.){4}(\d{1,4})), which then changed entries in text files as if by magic.
Even though system administrators today do a large part of their work with graphical tools that provide a convenient interface, the use of regular expressions (regex) significantly facilitates the work. This is true, in particular, when you need to automate and simplify tasks with the use of PowerShell scripts.
PowerShell with Regex
If you have already developed or used some PowerShell scripts, you will typically have come into contact with regular expressions – even if you were perhaps not aware of it. The following example illustrates this very well:
$An_Array = @('somethingno1', 'somethingno2', 'morestuff')
$An_Array | Where-Object {$_ -match 'something'}
Here, you first create an array of strings and then launch a query that only displays the first two elements of the array, because the third element does not match the 'something' pattern. The -match operator can also be used without the Where-Object cmdlet. Thus, calling:
'somethingno1' -match 'something'
returns the value True because the search pattern was found in the string, whereas calling:
'somethingno1' -match 'nothing'
logically returns False. The -replace operator also works with regular expressions such as
'The book is good' -replace 'The book', 'The ITA book'
which then returns the string The ITA book is good. The -replace operator compares, finds the matching string The book, and replaces it with The ITA book before output. Thus, the purpose of regular expressions is summarized as follows: They are mainly used for making comparisons or replacing values and characters. In addition to operators for direct comparison of values, such as -eq (equals) and -gt (greater than), the similarity operators -like and -notlike, the replacement operator -replace, and the match operators -match and -notmatch all belong to the comparison operators category. The -replace operator, as in the example here, and the -match and -notmatch operators can all handle regex. The two following calls thus produce exactly the same output on screen:
> Get-Service | where {$_.status -like "running"}
> Get-Service | where {$_.status -match "running"}
Please note that this query is not case sensitive: It does not distinguish between upper- and lowercase. Both calls will find services whose status is displayed as running or Running. If you need a comparison that is explicitly case sensitive, use the -cmatch operator. To make it clear to any other user reading your shell script that you do not want to differentiate between upper- and lowercase, use the -imatch operator, which works in the same way as -match.
Both calls display all of the services that are active (running) on the system, which could initially lead to confusion with many PowerShell beginners. However, the -like operator does not understand regular expressions. It works only with wildcards such as the asterisk (*), which stands for any number of characters, and the question mark (?), which stands for exactly one character. Without a wildcard, -like compares the entire string, whereas -match finds the pattern anywhere in the string. Therefore, comparisons can be made in a far more accurate and meaningful way by using -match with the help of metacharacters. Metacharacters are most responsible for the bad reputation of regular expressions, because they make your command line look like hieroglyphics.
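The difference between wildcard and regex comparison is easiest to see on a plain string. The following is a minimal sketch; the variable names are chosen for illustration only and do not come from the article.

```powershell
# Compare how -like (wildcards) and -match (regex) treat the same input.
$status = 'Running'

$likeExact    = $status -like 'run'     # False: without a wildcard, -like must match the whole string
$likeWildcard = $status -like 'run*'    # True: * stands for any number of characters
$matchPart    = $status -match 'run'    # True: -match finds the pattern anywhere and ignores case
$matchCase    = $status -cmatch 'run'   # False: -cmatch is case sensitive, and the string starts with 'R'

$likeExact, $likeWildcard, $matchPart, $matchCase
```

In the Get-Service example, both operators return the same services only because the pattern running happens to cover the entire status text.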
Patterns and Metacharacters
Regular expressions are patterns (character strings) that describe data. Such an expression always represents a certain type of data in the search pattern and often includes metacharacters. Some of the most important of these characters used in PowerShell scripts are:
. ^ $ [ ] { } * ? + \
This list is not exhaustive and only reflects a small selection of the metacharacters available in PowerShell. In many cases, you want to determine whether a string that stands for a file name starts with a specific letter or has a particular extension. Three metacharacters known as quantifiers are useful here: the asterisk *, plus +, and question mark ?. Each quantifier applies to the character that directly precedes it. The asterisk means that this character can occur any number of times, or not at all, which means the expression will be true even if the character you are looking for is not in the string. In contrast, the plus sign means that the character must occur at least once but can be repeated any number of times. Finally, the question mark means that the character occurs once or not at all. Thus, the call
> 'something.txt' -match 'i*'
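returns the value True. The reason, however, is not that an i was found: Because i* also accepts zero occurrences of i, the pattern already matches the empty string at the very start of the input and therefore returns True for any string. In this type of search pattern with an * metacharacter, it does not matter where the i is found, or whether it is found at all. The following call also returns True: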
> 'Thatistheone.txt' -match 'i*'
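To see how the three quantifiers differ, it can help to run them against strings that do and do not contain the character in question. This is a small sketch; the test strings xyz, color, colour, and colouur are made up for illustration.

```powershell
# * allows zero occurrences, so it matches even when no i is present.
$starNoI     = 'xyz' -match 'i*'             # True
$starText    = 'something.txt' -match 'i*'   # True
$starMatched = $Matches[0]                   # empty string: zero i's matched at the start

# + requires at least one occurrence.
$plusNoI     = 'xyz' -match 'i+'             # False
$plusText    = 'something.txt' -match 'i+'   # True
$plusMatched = $Matches[0]                   # i

# ? makes the preceding character optional (zero or one occurrence).
$optNone = 'color'   -match 'colou?r'        # True
$optOne  = 'colour'  -match 'colou?r'        # True
$optTwo  = 'colouur' -match 'colou?r'        # False: two u's are one too many
```

If the goal is to test whether a character is present at all, the plus sign (or simply the bare character) is usually the safer choice than the asterisk.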
It would make more sense to determine whether the letter i, for which you are searching, occurs at the start of the character string. To do so, use the ^ metacharacter (circumflex accent or hat). After calling
> 'itssomething.txt' -match '^i'
the shell returns True, whereas the call
> 'Thatistheone.txt' -match '^i'
returns the value False. If you are looking for a character at the end of the string, you can use the dollar sign $, which must then be placed at the end of the pattern. Thus,
> 'something.txt' -match 't$'
returns the value True.
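The two anchors can be combined with the backslash from the metacharacter list to test a file name for a particular extension. The following sketch assumes the hypothetical file names something.txt and notes_txt; note that an unescaped dot is itself a metacharacter that stands for any single character.

```powershell
$file = 'something.txt'

$startsWithS = $file -match '^s'        # True: the string begins with s
$endsWithT   = $file -match 't$'        # True: the string ends with t
$isTxt       = $file -match '\.txt$'    # True: a literal dot followed by txt at the end

# Without the backslash, the dot matches any character, including the underscore.
$looseDot  = 'notes_txt' -match '.txt$'    # True
$strictDot = 'notes_txt' -match '\.txt$'   # False
```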
You can read the $matches variable, which is automatically created and filled by a successful -match call and in which the corresponding hash table is stored. Enter
> 'something.txt' -match 't$'
True
> $matches
Name                           Value
0                              t
to check the $matches variable.
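Because $matches is an ordinary hash table, a script can also read the matched text directly by its key instead of printing the whole table. A minimal sketch, with illustrative variable names:

```powershell
$found = 'something.txt' -match 't$'

$tableType = $Matches.GetType().Name   # Hashtable
$wholeHit  = $Matches[0]               # t: key 0 holds the text matched by the entire pattern

if ($found) {
    "Matched text: $wholeHit"
}
```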
Letters and Numbers
The regular expressions in PowerShell use character classes, such as those that are also available in Microsoft .NET Framework 3.5. If you want to use the -match operator to determine whether, for example, a string contains a word character (a letter, a digit, or an underscore), then use
> $Teststring='Programming'
> $Teststring -match "\w"
which returns True. In this case, it is important that you type a lowercase w after the escape character \, which is used here to keep the w from being interpreted as a literal letter w. In contrast to PowerShell's normal behavior, upper- and lowercase are distinguished here. If you use the call
> $Teststring -match "\W"
PowerShell checks for non-word characters instead, which means the expression would be True if the string contained a space or a punctuation mark, for example. A number would not be enough, because digits also count as word characters. Because Programming consists of letters only, the comparison finds no match and a value of False is returned. A failed comparison does not overwrite $matches, so reading the variable at this point still shows the character P that was found by the previous \w call. PowerShell always scans the string from left to right and stops as soon as the pattern matches for the first time. This also works when comparing numbers, which you can do with the following call (Figure 1):
> $Teststring='Programm456ing'
> $Teststring -match "\d"
True
> $matches
Name                           Value
0                              4
The shell stops the comparison after finding the first digit in the string. Because this is not practical in many cases, the behavior of the comparison can be changed using a quantifier such as the plus sign.
> $Teststring -match "\d+"
True
> $matches
Name                           Value
0                              456
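The character classes discussed here can be compared side by side on the same test string. The sketch below uses illustrative variable names; the string 'Programm 456' with a space is an assumption added only to show a successful \W match.

```powershell
$text = 'Programm456ing'

$hasWord   = $text -match '\w'
$firstWord = $Matches[0]            # P: the first word character

$hasDigit   = $text -match '\d'
$firstDigit = $Matches[0]           # 4: the first digit

$hasRun   = $text -match '\d+'
$digitRun = $Matches[0]             # 456: + extends the match over the whole run of digits

# \W looks for a non-word character; letters and digits do not qualify.
$hasNonWord = $text -match '\W'     # False
$staleHit   = $Matches[0]           # still 456: a failed match leaves $Matches untouched

$hasSpace = 'Programm 456' -match '\W'   # True: the space is a non-word character
```

Because a failed comparison leaves the previous content in $matches, it is worth checking the Boolean result before reading the variable.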
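Curly brackets help improve precision and let you specify how many times the preceding character must occur in the string. The basic syntax of such a call is {min,max}, where min is the minimum and max the maximum number of occurrences. If you just have one number between the curly brackets, PowerShell checks for exactly this number of consecutive characters. Because the search is not anchored, a longer run of digits still matches, but $matches then contains only the first two digits,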
> $Teststring -match "\d{2}"
True
> $matches
Name                           Value
0                              45
whereas calling
> $Teststring -match "\d{2,3}"
returns True if the character string contains at least two consecutive digits and stores up to three of them in $matches.
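The effect of the different brace forms can be checked by reading $matches after each call. This is a sketch with illustrative variable names; the open-ended form {2,} is not covered in the text above and is included only for comparison.

```powershell
$text = 'Programm456ing'

$exactTwo    = $text -match '\d{2}'
$exactTwoHit = $Matches[0]          # 45: exactly two digits are consumed

$twoToThree    = $text -match '\d{2,3}'
$twoToThreeHit = $Matches[0]        # 456: the quantifier takes as many digits as allowed

$needsFour = $text -match '\d{4}'   # False: the string has only three consecutive digits

$twoOrMore    = $text -match '\d{2,}'
$twoOrMoreHit = $Matches[0]         # 456: no upper limit, so the whole run is matched
```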