sed: Ignore whitespace while matching a pattern

In my previous articles I shared the sed options that can be used to perform case-insensitive actions (search, replace, etc.) in a file and to delete all blank lines from a file.

Now suppose you have a set of words or a line that you want to match and act on, but you do not know how much whitespace separates each word or string.

For example, I have a file /tmp/file with the content below

Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuset
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct

I would like to match "Initializing cgroup subsys cpuset" and replace it, but I do not want the match to depend on the exact number of spaces between the words.

A normal search and replace looks like this:

# sed 's/Initializing cgroup subsys cpuset/NEW CONTENT/g' /tmp/file
Dec 14 13:29:13 cc01-pgd-002a kernel: NEW CONTENT
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct

It worked as expected, but what if there were some extra spaces between the words, as below?

# cat /tmp/file
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing   cgroup subsys   cpuset
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct

Trying the same command as above:

# sed 's/Initializing cgroup subsys cpuset/NEW CONTENT/g' /tmp/file
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing   cgroup subsys   cpuset
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct

It did not work, because the single literal spaces in the pattern no longer match the input.

So we have to ignore the amount of whitespace between the words to get a match:

# sed 's/Initializing\s*cgroup\s*subsys\s*cpuset/NEW CONTENT/g' /tmp/file
Dec 14 13:29:13 cc01-pgd-002a kernel: NEW CONTENT
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct
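
To reproduce this without touching a real log, here is a minimal sketch that assumes GNU sed (where \s is available). The temporary file and the variable names are illustrative only, and one line uses a tab between words to show that \s covers more than plain spaces:

```shell
#!/bin/sh
# Sample data: extra spaces on line 1, a tab between words on line 2.
sample=$(mktemp)
printf 'kernel: Initializing   cgroup subsys   cpuset\n' > "$sample"
printf 'kernel: Initializing\tcgroup subsys cpuset\n' >> "$sample"
printf 'kernel: Initializing cgroup subsys cpu\n' >> "$sample"

# \s* (a GNU sed extension) matches zero or more whitespace characters.
replaced=$(sed 's/Initializing\s*cgroup\s*subsys\s*cpuset/NEW CONTENT/g' "$sample")

printf '%s\n' "$replaced"
rm -f "$sample"
```

With GNU sed, the first two lines should both come back as "kernel: NEW CONTENT", while the line ending in "cpu" is left alone.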

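It worked. The pattern relies on two pieces:

\s   matches one whitespace character (a GNU sed extension, equivalent to [[:space:]]; the narrower class [[:blank:]] matches only space and tab)
*    means zero or more of the preceding item

So \s* matches any run of whitespace, including none at all. The same search can also be written with the POSIX class [[:blank:]]: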

# sed 's/Initializing[[:blank:]]*cgroup[[:blank:]]*subsys[[:blank:]]*cpuset/NEW CONTENT/g' /tmp/file
Dec 14 13:29:13 cc01-pgd-002a kernel: NEW CONTENT
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpu
Dec 14 13:29:13 cc01-pgd-002a kernel: Initializing cgroup subsys cpuacct
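
One caveat with `*`: because it also matches zero characters, the patterns above would match the words run together as well (Initializingcgroupsubsyscpuset). If at least one space or tab must separate the words, a possible variation is sketched below. It assumes a throwaway file created with mktemp, and the variable names are made up for illustration:

```shell
#!/bin/sh
# Build a throwaway sample file (path comes from mktemp, not /tmp/file).
sample=$(mktemp)
printf '%s\n' \
  'kernel: Initializing   cgroup subsys   cpuset' \
  'kernel: Initializingcgroupsubsyscpuset' \
  'kernel: Initializing cgroup subsys cpu' > "$sample"

# [[:blank:]][[:blank:]]* means "one blank, then zero or more blanks",
# i.e. at least one space or tab; this is plain POSIX sed syntax.
b='[[:blank:]][[:blank:]]*'
result=$(sed "s/Initializing${b}cgroup${b}subsys${b}cpuset/NEW CONTENT/g" "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Only the first line is rewritten here; the run-together line and the "cpu" line pass through unchanged. Whether zero separators should count as a match depends on the data, so treat this as an option rather than a correction.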

 
