Shell practice: Introduction to the sed stream editor

Quick Edit

Searching

A search pattern can serve as an address: It selects the lines on which a command, such as a replacement, acts. Search patterns can be regular expressions. Table 5 shows some of the possibilities, with some examples in Table 6, and Figures 2-7 show some of the results. In the examples, sed sometimes reads a data stream from a pipe and sometimes accesses a text file directly.

Table 5

Patterns and Addressing

Action                                          Pattern
All lines                                       (no address)
Line 25                                         25
Not line 25                                     25!
Lines 10 through 20                             10,20
Last line                                       $
Not pattern                                     /PATTERN/!
Character at beginning of line                  /^CHAR/
String                                          /STRING/
Character set                                   /[CHARS]/
Alphabetic character                            [:alpha:]
Lowercase                                       [:lower:]
Uppercase                                       [:upper:]
Alphanumeric                                    [:alnum:]
Digit                                           [:digit:]
Hexadecimal digit                               [:xdigit:]
Space and tab                                   [:blank:]
Whitespace (space, tab, newline, etc.)          [:space:]
Control character                               [:cntrl:]
Printable characters (no control characters)    [:print:]
Visible characters (without spaces)             [:graph:]
Punctuation                                     [:punct:]

The named character classes must themselves appear inside a bracket expression, for example, /[[:digit:]]/.
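
The address forms in Table 5 can be tried out with a short script. The following is a minimal sketch, not taken from the article: the sample lines are hypothetical stand-ins for textdata.txt, and the variable names are chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; the article's textdata.txt is not reproduced here.
sample=$(mktemp)
printf '%s\n' 'alpha 1' 'Bravo 2' '' 'charlie 3' 'Delta 4' > "$sample"

# Line number address: print only line 2.
line2=$(sed -n '2p' "$sample")

# Range address: print lines 4 through 5.
range=$(sed -n '4,5p' "$sample")

# $ addresses the last line of the input.
last=$(sed -n '$p' "$sample")

# Regex address with a named class inside a bracket expression:
# lines that begin with an uppercase letter.
upper=$(sed -n '/^[[:upper:]]/p' "$sample")

printf 'line 2:    %s\n' "$line2"   # Bravo 2
printf 'last line: %s\n' "$last"    # Delta 4
printf 'uppercase start:\n%s\n' "$upper"
printf 'lines 4-5:\n%s\n' "$range"

rm -f "$sample"
```

Appending ! to any of these addresses inverts the selection, so '2!p' prints every line except the second.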

Table 6

Sample Searches and Patterns

Search for   Pattern   Example   Figure
Term, name   '/TERM/'   cat textdata.txt | sed -n '/Evans/p'
All lines containing 'man' or 'Man'   '/[Mm]an/'   sed -n '/[Mm]an/p' textdata.txt   2
All lines except 3 through 5   '3,5!'   sed -n '3,5!p' textdata.txt   3
All lines except those containing 'Man'   '/Man/!'   sed -n '/Man/!p' textdata.txt   4
Lines containing 'H' or 'G'   '/[HG]/'   sed -n '/[HG]/p' textdata.txt
Lines not containing 'H' or 'G'   '/[HG]/!'   sed -n '/[HG]/!p' textdata.txt   5
Line 3   '3'   cat textdata.txt | sed -n '3p'
Last line   '$'   cat textdata.txt | sed -n '$p'
Pattern range: Do not output the lines from one containing 'R' plus a following character through the next one containing 'M' plus a following character   '/R./,/M./!'   sed -n '/R./,/M./!p' textdata.txt   6
All lines containing at least one alphanumeric character (i.e., no empty or whitespace-only lines)   '/[[:alnum:]]/'   cat textdata.txt | sed -n '/[[:alnum:]]/p'   7
Figure 3: Output all lines except the third through fifth.
Figure 4: Output all lines except those containing 'Man'.
Figure 5: Output all lines except those containing 'H' or 'G'.
Figure 6: Output all lines except those in a range that starts at a line containing 'R' and ends at the next line containing 'M'.
Figure 7: Output all lines that contain alphanumeric characters (no empty lines or lines containing only spaces, tabs, etc.).

Note the output of the following command:

sed -n '/[C]/,/[c]/!'p textdata.txt
chris hemsworth - Thor ...
Scarlett Johansson - Black Widow ...
Robert Downey - Iron Man ...
Mark Ruffalo - Hulk ...
Paul Rudd - Ant Man ...

Only lines that do not fall within a range are output. Each range starts at a line containing C (Chris Evans, Cobie Smulders) and ends at the next line containing c (Samuel Jackson, Hugh jackman). If you reverse the letters, putting the lowercase c before the uppercase,

sed -n '/[c]/,/[C]/!'p textdata.txt
Jeremy renner - Hawkeye ...
Tom Hiddleston - Loki ...

you only get two lines. Everything between chris and Chris (inclusive), between Jackson and Cobie (just those lines), and between jackman and EOF is suppressed.
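
The range behavior is easier to follow on a very small input. The sketch below uses six hypothetical names (not the contents of textdata.txt) to show that swapping the start and end patterns changes which lines fall inside a range.

```shell
#!/bin/sh
# Hypothetical six-line sample; not the article's textdata.txt.
names=$(mktemp)
printf '%s\n' 'chris' 'Anna' 'Cobie' 'Bob' 'jack' 'Dora' > "$names"

# A range opens at a line containing C and closes at the next line
# containing c. Here that covers Cobie, Bob, and jack; the ! prints
# everything outside the range.
outside_Cc=$(sed -n '/C/,/c/!p' "$names")

# Reversed: a range opens at a line containing c and closes at the next
# line containing C. chris..Cobie is one range; jack opens a second range
# that never closes, so it runs to the end of the input.
outside_cC=$(sed -n '/c/,/C/!p' "$names")

printf 'outside /C/,/c/:\n%s\n' "$outside_Cc"   # chris, Anna, Dora
printf 'outside /c/,/C/:\n%s\n' "$outside_cC"   # Bob

rm -f "$names"
```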

If you want to be absolutely certain that sed is doing what you want, you can combine several simple calls with pipes. The following command suppresses empty lines and lines with Man (Figure 8):

Figure 8: Processing several instances of sed using a pipe.
cat textdata.txt | sed -n '/[[:alnum:]]/p' | sed -n '/Man/!p'
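
As a rough comparison, the same filtering can be done either with two piped sed calls or with one call that takes two -e expressions. This is a sketch on hypothetical sample lines; the single-call variant is an alternative shown for illustration, not the article's method.

```shell
#!/bin/sh
# Hypothetical sample with an empty line and a whitespace-only line.
data=$(mktemp)
printf '%s\n' 'Iron Man' '' 'Thor' '   ' 'Ant Man' 'Hulk' > "$data"

# Two simple calls joined by a pipe: keep lines with at least one
# alphanumeric character, then drop lines containing Man.
piped=$(sed -n '/[[:alnum:]]/p' "$data" | sed -n '/Man/!p')

# One call with two expressions: delete what is not wanted and let sed
# print the rest by default (no -n).
single=$(sed -e '/[[:alnum:]]/!d' -e '/Man/d' "$data")

printf 'piped:\n%s\n' "$piped"     # Thor, Hulk
printf 'single:\n%s\n' "$single"   # Thor, Hulk

rm -f "$data"
```

The piped form is easier to debug step by step, because each stage can be run on its own and inspected.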

Substituting and Removing

With the s command, you can replace matched expressions. The search string and the replacement string do not need to be the same length. The detailed syntax is shown in Figure 9.

Figure 9: Syntax of the search-and-replace statement.

You can limit the search and replace statement to specific lines by preceding the command with the line number, as shown here:

sed -n '5s/OLD/NEW/p' [TEXTFILE]

Or, for a range of lines, use:

sed -n '1,4s/OLD/NEW/p' [TEXTFILE]

You can also suppress changes to certain lines using the exclamation mark:

sed -n '20,80!s/OLD/NEW/p' [TEXTFILE]

Furthermore, you can limit changes to lines that contain certain strings or patterns that are not the same as those used for the search-and-replace statement,

sed -n '/PATTERN/s/OLD/NEW/gp' [TEXTFILE]

and you can delete the matched string by replacing it with an empty string (s/OLD//).

By default, only the first occurrence of the search string on each line is processed. To replace all instances, add the g (global) flag at the end of the statement. With the -n option set, sed prints nothing on its own, so if you want to see what's going on, add the p (print) flag. You can also write results to an output file with w (write). Table 7 shows some short examples.

Table 7

Sample Search and Replace

Action   Example   Figure
Replace pattern at the first occurrence only   cat textdata.txt | sed -n 's/e/E/p'   10
Replace pattern at every occurrence   cat textdata.txt | sed -n 's/e/E/gp'   10
Delete the word 'Man'   sed -n 's/Man//gp' textdata.txt   11
Replace 'Iron' with 'Tin' on line 4   cat textdata.txt | sed -n '4s/Iron/Tin/gp'   12
Replace '0' with '089' on all lines containing 'Man' or 'man'   sed -n '/[Mm]an/s/0/089/gp' textdata.txt   13
Replace '0' with '089' on all lines except those containing 'Man' or 'man'   sed -n '/[Mm]an/!s/0/089/gp' textdata.txt   14
Delete all digits, slashes (/), and hyphens (-)   cat textdata.txt | sed -n 's/[0-9\/-]//gp'   15
Figure 10: Using the "global" (g) flag.
Figure 11: Deleting the word 'Man'.
Figure 12: Limiting the search and replace to one line.
Figure 13: Limiting the search and replace to selected lines.
Figure 14: Excluding lines for the search-and-replace statement.
Figure 15: Deleting numbers and symbols from lines.
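
The address and flag combinations described above can be checked side by side. The following sketch runs them against three hypothetical lines; the file contents and variable names are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sample data for the s command.
f=$(mktemp)
out=$(mktemp)
printf '%s\n' 'one two one' 'one three' 'four one one' > "$f"

# Without g, only the first match on each line is replaced.
first=$(sed -n 's/one/1/p' "$f")

# With g, every match on each line is replaced.
all=$(sed -n 's/one/1/gp' "$f")

# Restrict the replacement to line 2, or to everything outside lines 1-2.
line2=$(sed -n '2s/one/1/p' "$f")
not_range=$(sed -n '1,2!s/one/1/gp' "$f")

# Restrict the replacement to lines matching a different pattern.
by_pattern=$(sed -n '/three/s/one/1/gp' "$f")

# An empty replacement deletes the match.
deleted=$(sed -n '2s/one //p' "$f")

# The w flag writes each changed line to a file; it must be the last flag.
sed -n "3s/one/1/gw $out" "$f"
written=$(cat "$out")

printf 'first only:\n%s\n' "$first"   # 1 two one / 1 three / four 1 one
printf 'global:\n%s\n' "$all"         # 1 two 1 / 1 three / four 1 1
printf 'line 2: %s\n' "$line2"        # 1 three
printf 'not 1-2: %s\n' "$not_range"   # four 1 1
printf 'by pattern: %s\n' "$by_pattern"   # 1 three
printf 'deleted: %s\n' "$deleted"     # three
printf 'written: %s\n' "$written"     # four 1 1

rm -f "$f" "$out"
```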

The more complex example in Figure 16 converts the inconsistently formatted dates in the testlist.txt file to a common, unified (European) date format, DD/MM/YYYY. Be sure to press Enter immediately after the backslash at the line's end. Alternatively, you can omit the backslash and let the command wrap; the pipe character connects with the lines that follow. However, this results in a less readable screen display.

Figure 16: Formatting dates.

The first line reads the list, and each of the following lines pipes its output to the next command. The steps are: Replace any leading space characters with the number shown in the figure; replace any minus signs in dates with spaces; replace any month written as a word with its numeric value followed by a slash; and replace any two-digit number at the beginning of a line (^, then 0 through 3, then any digit) together with the space character that follows with "itself" (&) followed by a slash.

To reuse part of the search pattern in the replacement, enclose that part in parentheses, which you have to escape with \ (that is, \( and \)). The final sed statements delete all remaining space characters (through the s command's g flag).
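
Because Figure 16 is not reproduced here, the following is only a sketch of the same idea, not the article's command. It assumes hypothetical input of the form D-Mon-YYYY with an optional leading space, handles just two month names, and uses escaped parentheses with \1, \2, and \3 to carry the matched parts into the replacement.

```shell
#!/bin/sh
# Hypothetical input; the article's testlist.txt is not reproduced here.
dates=$(printf '%s\n' ' 3-May-2015' '12-Jun-2015' '12-Jun-2015' |
  # Pad a single-digit day that has a leading space with a zero.
  sed 's/^ /0/' |
  # Replace the hyphens between the date parts with spaces.
  sed 's/-/ /g' |
  # Replace month names with numbers (only two months in this sketch).
  sed -e 's/May/05/' -e 's/Jun/06/' |
  # Capture day, month, and year, then rejoin them with slashes.
  # The # delimiter avoids having to escape the slashes.
  sed 's#^\([0-3][0-9]\) \([01][0-9]\) \([0-9]\{4\}\)$#\1/\2/\3#' |
  # Collapse adjacent duplicate lines.
  uniq)

printf '%s\n' "$dates"
# 03/05/2015
# 12/06/2015
```

Running each stage on its own, by cutting the pipeline off after that stage, is a quick way to see what every substitution contributes.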

The uniq command on the last line removes adjacent duplicate lines, so each repeated line is output only once. You can also "carry over" all or part of the matched string into the replacement with the & character. Check out the following example:

echo "happy" | sed -n 's/happy/un&/p'

This example replaces happy with unhappy. You can also convert characters from lowercase to uppercase:

cat textdata.txt | sed -n 's/[[:lower:]]/\U&/gp'

The \U before the & indicates that the matched text must be converted to uppercase (\U and \L are GNU sed extensions). You can do the following:

cat textdata.txt | sed -n 's/[[:upper:]]/\L&/gp'

to convert from uppercase to lowercase.
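
The case conversions can be verified on a single line of input. This sketch assumes GNU sed, because \U, \L, and the \b word boundary are GNU extensions; the input strings are hypothetical.

```shell
#!/bin/sh
# Assumes GNU sed: \U, \L, and \b are not part of POSIX sed.

# Convert every lowercase letter to uppercase.
upper=$(echo 'Iron Man' | sed 's/[[:lower:]]/\U&/g')

# Convert every uppercase letter to lowercase.
lower=$(echo 'Iron Man' | sed 's/[[:upper:]]/\L&/g')

# Capitalize only the first letter of each word: \b anchors the match
# at a word boundary, so only word-initial lowercase letters are hit.
capitalized=$(echo 'paul rudd' | sed 's/\b[[:lower:]]/\U&/g')

printf '%s\n' "$upper"         # IRON MAN
printf '%s\n' "$lower"         # iron man
printf '%s\n' "$capitalized"   # Paul Rudd
```

On systems without GNU sed, tr '[:lower:]' '[:upper:]' is a portable way to convert whole lines.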

Character Replacement

For one-for-one character replacement, use the y command. The pattern should contain all the characters that need to be replaced, and the replacement should have the same number of characters. The command uses y in place of s; because y has no p flag, -n should be omitted:

sed 'y/SEARCH_CHARS/REPLACEMENT_CHARS/' [TEXTFILE]

The example in Figure 17 substitutes an uppercase C for the first character only in those lines of textdata.txt that begin with a lowercase c.

Figure 17: One-for-one character replacement.
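
Since Figure 17 is not reproduced here, the sketch below illustrates the difference between y and s on hypothetical input. Note that y translates every listed character on the addressed lines, whereas an anchored s command changes only the first character.

```shell
#!/bin/sh
# Hypothetical sample lines.
f=$(mktemp)
printf '%s\n' 'chris pratt' 'Scarlett' 'cobie cc' > "$f"

# y maps each character in the first list to the character at the same
# position in the second list, everywhere on the line.
mapped=$(echo 'chris evans' | sed 'y/abc/ABC/')

# With an address, y acts only on lines beginning with c, but it still
# converts every c on those lines.
all_c=$(sed '/^c/y/c/C/' "$f")

# To change only the leading character, an anchored s command is enough.
first_c=$(sed 's/^c/C/' "$f")

printf '%s\n' "$mapped"                 # Chris evAns
printf 'y with address:\n%s\n' "$all_c"     # Chris pratt / Scarlett / Cobie CC
printf 's with anchor:\n%s\n' "$first_c"    # Chris pratt / Scarlett / Cobie cc

rm -f "$f"
```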

You use c to replace entire lines,

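sed '/PATTERN/c\REPLACEMENT' [TEXTFILE]

or like this:

sed 'Nc\REPLACEMENT' [TEXTFILE]

where N is a line number. This one-line form is accepted by GNU sed; other versions expect a newline after the backslash.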

The example in Figure 18 replaces an empty line with a series of dashes.

Figure 18: Replacing an entire line matched to a search pattern.

In place of a search pattern, you can use line numbers. Be aware that even if you specify multiple line numbers they will all be replaced by a single instance of the replacement string. If you choose three lines, for example, it will look like the first line is replaced and the second and third lines are deleted.

The top example in Figure 19 replaces the blank line with a series of hash marks. The bottom example removes lines 2 through 4 and inserts a given line of text.

Figure 19: Replacing whole lines by line number.
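
The two cases from Figures 18 and 19 can be approximated as follows. This is a sketch on hypothetical input, written in the portable form of the c command with the replacement text on the line after the backslash.

```shell
#!/bin/sh
# Hypothetical five-line sample with one empty line.
f=$(mktemp)
printf '%s\n' 'first' '' 'third' 'fourth' 'fifth' > "$f"

# Replace every empty line with a row of dashes.
dashed=$(sed '/^$/c\
----------' "$f")

# Replace lines 2 through 4: the whole range collapses into a single
# copy of the replacement text.
collapsed=$(sed '2,4c\
replaced' "$f")

printf 'pattern address:\n%s\n' "$dashed"
printf 'range address:\n%s\n' "$collapsed"   # first / replaced / fifth

rm -f "$f"
```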

The d command deletes lines that match a pattern or line number (N):

sed '/PATTERN/d' [TEXTFILE]
sed 'Nd' [TEXTFILE]

Using the commands in Figure 20, you can search for and delete an empty line and then delete the fourth line.

Figure 20: Deleting lines.
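
One detail worth checking when combining deletions is which line "the fourth line" refers to. The sketch below, on hypothetical input, contrasts two expressions in one sed call with two piped calls; it is an illustration, not the commands from Figure 20.

```shell
#!/bin/sh
# Hypothetical five-line sample with one empty line.
f=$(mktemp)
printf '%s\n' 'one' '' 'three' 'four' 'five' > "$f"

# Delete empty lines.
no_empty=$(sed '/^$/d' "$f")

# In a single call, line numbers always count input lines, so 4d removes
# the original fourth line ("four").
one_call=$(sed -e '/^$/d' -e '4d' "$f")

# In a pipeline, the second sed numbers the lines it receives, so 4d
# removes the fourth remaining line ("five").
two_calls=$(sed '/^$/d' "$f" | sed '4d')

printf 'no empty lines:\n%s\n' "$no_empty"   # one / three / four / five
printf 'one call:\n%s\n' "$one_call"         # one / three / five
printf 'two calls:\n%s\n' "$two_calls"       # one / three / four

rm -f "$f"
```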
