Commit 5d5b07b

examples for multicharacter RS
1 parent 1eac064 commit 5d5b07b

1 file changed: 80 additions & 0 deletions

gnu_awk.md

@@ -14,6 +14,7 @@
* [Case Insensitive filtering](#case-insensitive-filtering)
* [Changing record separators](#changing-record-separators)
* [Paragraph mode](#paragraph-mode)
* [Multicharacter RS](#multicharacter-rs)
* [Substitute functions](#substitute-functions)
* [Inplace file editing](#inplace-file-editing)
* [Using shell variables](#using-shell-variables)
@@ -443,6 +444,18 @@ No doubt you like it too

$ # if extra newline at end is undesirable, can use
$ awk -v RS= '/it/{print c++ ? "\n" $0 : $0}' sample.txt

$ # based on number of lines in each paragraph
$ awk -F'\n' -v RS= -v ORS='\n\n' 'NF==1' sample.txt
Hello World

$ awk -F'\n' -v RS= -v ORS='\n\n' 'NF==2 && /do/' sample.txt
Just do-it
Believe it

Much ado about nothing
He he he

```

* Re-structuring paragraphs
@@ -472,6 +485,73 @@ Much ado about nothing. He he he

```

<br>

#### <a name="multicharacter-rs"></a>Multicharacter RS

* Using a marker string such as `Error` or `Warning` as the record separator

```bash
$ cat report.log
blah blah
Error: something went wrong
more blah
whatever
Error: something surely went wrong
some text
some more text
blah blah blah

$ awk -v RS='Error:' 'NR==1' report.log
blah blah

$ # filter 'Error:' block matching particular string
$ # to preserve formatting, use: '/whatever/{print RS $0}'
$ awk -v RS='Error:' '/whatever/' report.log
something went wrong
more blah
whatever

$ # blocks with more than 3 lines
$ # splitting string with 3 newlines will yield 4 fields
$ awk -F'\n' -v RS='Error:' 'NF>4{print RS $0}' report.log
Error: something surely went wrong
some text
some more text
blah blah blah

```
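
The comment above suggests `'/whatever/{print RS $0}'` to keep the marker in the output. A minimal sketch of that variant follows, assuming an awk that accepts a multicharacter `RS` (such as gawk). The `printf` line only recreates the sample `report.log` shown above so that the block is self-contained, and setting `ORS` to an empty string is an illustrative choice, not something the original example uses.

```bash
# recreate the sample input (same contents as report.log above)
printf '%s\n' 'blah blah' 'Error: something went wrong' 'more blah' 'whatever' \
    'Error: something surely went wrong' 'some text' 'some more text' \
    'blah blah blah' > report.log

# prefix RS to put the marker back in front of the matching block
# each record already ends with the newline that preceded the next marker,
# so an empty ORS avoids printing an extra blank line
awk -v RS='Error:' -v ORS= '/whatever/{print RS $0}' report.log
# Error: something went wrong
# more blah
# whatever
```

Leaving `ORS` at its default would print the same block followed by one additional empty line.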

* Regular expression based `RS`
* the `RT` variable contains the text matched by `RS` for the current record
* Note that the entire input is treated as a single string, so the `^` and `$` anchors apply only once, not on every line

```bash
$ s='Sample123string54with908numbers'
$ printf '%s' "$s" | awk -v RS='[0-9]+' 'NR==1'
Sample

$ # note the relationship between record and separators
$ printf '%s' "$s" | awk -v RS='[0-9]+' '{print NR " : " $0 " - " RT}'
1 : Sample - 123
2 : string - 54
3 : with - 908
4 : numbers -

$ # need to be careful of empty records
$ printf '123string54with908' | awk -v RS='[0-9]+' '{print NR " : " $0}'
1 :
2 : string
3 : with
$ # and newline at end of input
$ printf '123string54with908\n' | awk -v RS='[0-9]+' '{print NR " : " $0}'
1 :
2 : string
3 : with
4 :

```
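
One way to sidestep the empty and newline-only records shown above is to filter on `NF`. This is a minimal sketch, assuming default field splitting (where a record holding only a newline has no fields) and an awk that accepts a regular expression as `RS` (such as gawk). The counter `n` is a hypothetical name used here to renumber the records that remain.

```bash
# NF is 0 for the empty first record and for the newline-only last record,
# so only 'string' and 'with' are printed; n renumbers the surviving records
printf '123string54with908\n' | awk -v RS='[0-9]+' 'NF{print ++n " : " $0}'
# 1 : string
# 2 : with
```

Filtering on `NF` rather than on the record length also drops the final record, which is not empty but contains only the trailing newline.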

* See also [gawk manual - Records](https://www.gnu.org/software/gawk/manual/html_node/Records.html#Records)

<br>
