Shell practice: Introduction to the sed stream editor

Quick Edit

Sample Data

To exercise your sed skills, you can use the textdata.txt file in Listing 1. This file contains empty lines, typos, and other errors. The second sample file I'll use in this article is called testlist.txt (Listing 2); it contains dates formatted in a number of different ways.

Listing 1

textdata.txt

chris hemsworth - Thor 0885465468798746
Scarlett Johansson - Black Widow 08755466584
Robert Downey - Iron Man 0987654321
Mark Ruffalo - Hulk 0405458765143321
Chris Evans - Captain America 0548/9988776655
Jeremy renner - Hawkeye 555/8812470
Tom Hiddleston  - Loki 87841487014848
Samuel Jackson - Nick Fury 043/956026386
Cobie Smulders - Maria Hill 23514560145
Hugh jackman - Wolverine 801539193
Paul Rudd - Ant Man 497349000

Listing 2

testlist.txt

22 April 1984
 7.04.1985
30 March 1986
19 April 1987
03.04.1988
26 March 1989
15 April 1990
31-March-1991
19 April 1992
11 April 1993
 3 April 1994
16. April 1995
 7 April 1996
30 March 1997
12 April 1998

Regular Expressions

Regular expressions describe string patterns in sed. The more regex constructs you use, the more complex the statement becomes and the harder the command is to read. Some characters have a special meaning both to the shell and in regular expressions, so you need to "escape" them with the \ character when you mean them literally (Table 1). The construct [ABC] matches any one of the characters A, B, or C, whereas the construct /ABC/ matches lines that contain exactly the string ABC.
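The difference between a character list, a literal string, and an escaped character can be sketched as follows. This is a minimal example that assumes GNU sed on Linux; the file name sample.txt and the temporary directory are hypothetical, and the three lines are copied from Listing 1.

```shell
# Build a three-line sample in a temporary directory (subset of Listing 1)
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
chris hemsworth - Thor 0885465468798746
Chris Evans - Captain America 0548/9988776655
Paul Rudd - Ant Man 497349000
EOF

# /Chris/ matches only lines that contain exactly that string
exact=$(sed -n '/Chris/p' "$tmp/sample.txt")

# [Cc]hris accepts either an uppercase or a lowercase first letter
either=$(sed -n '/[Cc]hris/p' "$tmp/sample.txt")

# The slash in the phone number is also the sed separator, so escape it with \
slash=$(sed -n '/0548\/99/p' "$tmp/sample.txt")

printf '%s\n---\n%s\n---\n%s\n' "$exact" "$either" "$slash"
rm -r "$tmp"
```

The single quotes around each sed statement keep the shell from interpreting the special characters before sed sees them.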

Table 1

Special Characters

Character Function
( Opens statement
) Ends statement
{ Opens optional statement
} Closes optional statement
[ Opens a list of characters
] Closes a list of characters
" Masks a statement in which shell variables are resolved
' Masks a statement in which shell variables are not resolved
` Encloses a statement block
. Any character other than a newline
, Separates parameters, such as line items
: Sets labels (t and b command)
$ End of document, end of line or last line
& Placeholder for search patterns, included in the replacement statement
| OR (regex separator)
/ Separator in editing commands
^ Beginning of line, or negation in a search pattern
\ Escape character
! After a line number or pattern: apply the command to all lines except this one
* 0 or any number of times
+ Pattern present at least once
= Output line number
\n Newline, line feed
\t Tab character
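A few of the characters from Table 1 in action, as a hedged sketch that assumes GNU sed. The file sample.txt is a hypothetical three-line excerpt (two entries from Listing 1 with an empty line between them).

```shell
tmp=$(mktemp -d)
# Two entries from Listing 1 separated by an empty line
printf '%s\n' 'Paul Rudd - Ant Man 497349000' '' \
    'Hugh jackman - Wolverine 801539193' > "$tmp/sample.txt"

# ^$ is "beginning of line directly followed by end of line", i.e., an empty line
no_blank=$(sed '/^$/d' "$tmp/sample.txt")

# & in the replacement stands for whatever the search pattern matched;
# [0-9][0-9]*$ is a run of one or more digits at the end of the line
tagged=$(sed -n 's/[0-9][0-9]*$/<&>/p' "$tmp/sample.txt")

# $ as an address means the last line; ! inverts the address,
# so this deletes every line except the last one
last_only=$(sed '$!d' "$tmp/sample.txt")

printf '%s\n---\n%s\n---\n%s\n' "$no_blank" "$tagged" "$last_only"
rm -r "$tmp"
```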

Options and Editing Commands

Confusingly, sed has both options and commands with options. As is usual in Linux, options are preceded by the - character. The command options follow the command. Tables 2-4 provide an overview.

Table 2

Sed Options

Action Option
Execute command (can usually be omitted) -e
Disable data buffering -u
Treat files separately -s
Use extended regex -r
Edit file in place (optionally create backup file) -i[FILEEXTENSION]
Read and execute script file -f [SCRIPTFILE]
Suppress (unaffected) text areas -n
Show version --version
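The most frequently used options from Table 2 can be tried out like this. The sketch assumes GNU sed; sample.txt is a hypothetical two-line excerpt of Listing 1, and the .bak extension is an arbitrary choice.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Robert Downey - Iron Man 0987654321' \
    'Mark Ruffalo - Hulk 0405458765143321' > "$tmp/sample.txt"

# -n suppresses the automatic output; only lines printed with p appear
only_hulk=$(sed -n '/Hulk/p' "$tmp/sample.txt")

# Several -e statements are applied to each line in the order given
both=$(sed -e 's/Robert/Rob/' -e 's/Mark/Marc/' "$tmp/sample.txt")

# -i.bak changes the file itself and keeps the original as sample.txt.bak
# (the extension follows -i directly, without a space)
sed -i.bak 's/Iron Man/Ironman/' "$tmp/sample.txt"
edited=$(head -n 1 "$tmp/sample.txt")
backup=$(head -n 1 "$tmp/sample.txt.bak")

printf '%s\n---\n%s\n---\n%s\n%s\n' "$only_hulk" "$both" "$edited" "$backup"
rm -r "$tmp"
```

Without -i, sed never touches the input file; it only writes the result to standard output.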

Table 3

Editing Commands

Action Command
Add lines above this one i
Add lines below this one a
Output this line p
Output this line with a maximum length l [LENGTH]
Replace characters with others y
End sed q
Replace this line with new text c
Delete this line d
Search and replace s
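The editing commands in Table 3 might be used as in the following sketch. It assumes GNU sed (the two-line form of i and c shown here should also work with other sed implementations); sample.txt, the header text, and the REDACTED placeholder are hypothetical.

```shell
tmp=$(mktemp -d)
cat > "$tmp/sample.txt" <<'EOF'
chris hemsworth - Thor 0885465468798746
Scarlett Johansson - Black Widow 08755466584
Robert Downey - Iron Man 0987654321
EOF

# i inserts a line above line 1 (the text follows on the next line)
with_header=$(sed '1i\
Actor - Role Phone' "$tmp/sample.txt")

# q ends sed after line 2, so only the first two lines are output
first_two=$(sed '2q' "$tmp/sample.txt")

# d deletes line 2
no_second=$(sed '2d' "$tmp/sample.txt")

# y replaces single characters: every hyphen becomes a colon
colons=$(sed 'y/-/:/' "$tmp/sample.txt")

# c replaces the whole of line 2 with new text
replaced=$(sed '2c\
REDACTED' "$tmp/sample.txt")

printf '%s\n---\n%s\n' "$with_header" "$replaced"
rm -r "$tmp"
```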

Table 4

Editing Command Options

Action Option
Output line number =
All occurrences g
Outputs modified line with the s editing command p
Write the edited line to a file w
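The effect of the command options in Table 4 is easiest to see side by side. The following sketch assumes GNU sed; dates.txt and changed.txt are hypothetical file names, and the two dates are taken from Listing 2.

```shell
tmp=$(mktemp -d)
printf '%s\n' '03.04.1988' '26 March 1989' > "$tmp/dates.txt"

# Without g, only the first dot on each line is replaced
first_only=$(sed 's/\./-/' "$tmp/dates.txt")

# With g, every dot on the line is replaced
all=$(sed 's/\./-/g' "$tmp/dates.txt")

# -n together with p outputs only the lines that s actually changed
changed=$(sed -n 's/\./-/gp' "$tmp/dates.txt")

# w writes the changed lines to the file named after it
sed -n "s/\./-/gw $tmp/changed.txt" "$tmp/dates.txt"
written=$(cat "$tmp/changed.txt")

# = outputs the number of each line that matches the address
line_no=$(sed -n '/March/=' "$tmp/dates.txt")

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
    "$first_only" "$all" "$changed" "$written" "$line_no"
rm -r "$tmp"
```

The dot is escaped as \. because an unescaped dot would match any character.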
