|
14 | 14 | * [Case Insensitive filtering](#case-insensitive-filtering) |
15 | 15 | * [Changing record separators](#changing-record-separators) |
16 | 16 | * [Paragraph mode](#paragraph-mode) |
| 17 | + * [Multicharacter RS](#multicharacter-rs) |
17 | 18 | * [Substitute functions](#substitute-functions) |
18 | 19 | * [Inplace file editing](#inplace-file-editing) |
19 | 20 | * [Using shell variables](#using-shell-variables) |
@@ -443,6 +444,18 @@ No doubt you like it too |
443 | 444 |
|
444 | 445 | $ # if extra newline at end is undesirable, can use |
445 | 446 | $ awk -v RS= '/it/{print c++ ? "\n" $0 : $0}' sample.txt |
| 447 | + |
| 448 | +$ # based on number of lines in each paragraph |
| 449 | +$ awk -F'\n' -v RS= -v ORS='\n\n' 'NF==1' sample.txt |
| 450 | +Hello World |
| 451 | + |
| 452 | +$ awk -F'\n' -v RS= -v ORS='\n\n' 'NF==2 && /do/' sample.txt |
| 453 | +Just do-it |
| 454 | +Believe it |
| 455 | + |
| 456 | +Much ado about nothing |
| 457 | +He he he |
| 458 | + |
446 | 459 | ``` |
447 | 460 |
|
448 | 461 | * Re-structuring paragraphs |
@@ -472,6 +485,73 @@ Much ado about nothing. He he he |
472 | 485 |
|
473 | 486 | ``` |
474 | 487 |
|
| 488 | +<br> |
| 489 | + |
| 490 | +#### <a name="multicharacter-rs"></a>Multicharacter RS |
| 491 | + |
| 492 | +* `RS` can be a multicharacter string, for example a marker like `Error` or `Warning` |
| 493 | + |
| 494 | +```bash |
| 495 | +$ cat report.log |
| 496 | +blah blah |
| 497 | +Error: something went wrong |
| 498 | +more blah |
| 499 | +whatever |
| 500 | +Error: something surely went wrong |
| 501 | +some text |
| 502 | +some more text |
| 503 | +blah blah blah |
| 504 | + |
| 505 | +$ awk -v RS='Error:' 'NR==1' report.log |
| 506 | +blah blah |
| 507 | + |
| 508 | +$ # filter 'Error:' block matching particular string |
| 509 | +$ # to preserve formatting, use: '/whatever/{print RS $0}' |
| 510 | +$ awk -v RS='Error:' '/whatever/' report.log |
| 511 | + something went wrong |
| 512 | +more blah |
| 513 | +whatever |
| 514 | + |
| 515 | +$ # blocks with more than 3 lines |
| 516 | +$ # splitting a string with 3 newlines will yield 4 fields |
| 517 | +$ awk -F'\n' -v RS='Error:' 'NF>4{print RS $0}' report.log |
| 518 | +Error: something surely went wrong |
| 519 | +some text |
| 520 | +some more text |
| 521 | +blah blah blah |
| 522 | + |
| 523 | +``` |
| 524 | + |
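The comment above about preserving formatting with `print RS $0` can be sketched as a small shell function. This is only an illustration: `error_blocks` and the `pat` variable are hypothetical names, and it assumes `gawk` or another `awk` that accepts a multicharacter `RS`.

```bash
# Hypothetical helper (not part of the original text):
# print the 'Error:' blocks of file $2 that match regex $1,
# with the 'Error:' marker put back at the start of each block.
error_blocks() {
    # NR > 1 skips the text before the first 'Error:' marker.
    # printf is used instead of print so that no extra newline is
    # added after the newline each block already ends with.
    awk -v RS='Error:' -v pat="$1" \
        'NR > 1 && $0 ~ pat { printf "%s%s", RS, $0 }' "$2"
}

# usage: error_blocks 'whatever' report.log
```

Because the pattern is passed with `-v`, backslash escapes in it are processed by `awk` first, so this sketch is best suited to simple patterns.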
| 525 | +* Regular expression based `RS` |
| 526 | + * the `RT` variable will contain the string matched by `RS` |
| 527 | +* Note that the entire input is treated as a single string, so the `^` and `$` anchors match only at the start and end of the input, not at every line |
| 528 | + |
| 529 | +```bash |
| 530 | +$ s='Sample123string54with908numbers' |
| 531 | +$ printf "$s" | awk -v RS='[0-9]+' 'NR==1' |
| 532 | +Sample |
| 533 | + |
| 534 | +$ # note the relationship between record and separators |
| 535 | +$ printf "$s" | awk -v RS='[0-9]+' '{print NR " : " $0 " - " RT}' |
| 536 | +1 : Sample - 123 |
| 537 | +2 : string - 54 |
| 538 | +3 : with - 908 |
| 539 | +4 : numbers - |
| 540 | + |
| 541 | +$ # need to be careful of empty records |
| 542 | +$ printf '123string54with908' | awk -v RS='[0-9]+' '{print NR " : " $0}' |
| 543 | +1 : |
| 544 | +2 : string |
| 545 | +3 : with |
| 546 | +$ # and newline at end of input |
| 547 | +$ printf '123string54with908\n' | awk -v RS='[0-9]+' '{print NR " : " $0}' |
| 548 | +1 : |
| 549 | +2 : string |
| 550 | +3 : with |
| 551 | +4 : |
| 552 | + |
| 553 | +``` |
| 554 | + |
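One possible way to deal with the empty records shown above is to print a record only when it has at least one field. The sketch below assumes the default `FS` and an `awk` that supports a regular expression `RS` (such as `gawk`); `split_on_digits` and the counter `n` are hypothetical names used only for illustration.

```bash
# Hypothetical helper (not part of the original text):
# split stdin on runs of digits and print only the records that
# contain at least one field, numbered with a separate counter.
split_on_digits() {
    # With the default FS, an empty record or one holding only a
    # newline has NF equal to 0, so the NF condition filters it out.
    awk -v RS='[0-9]+' 'NF { print ++n " : " $0 }'
}

# usage: printf '123string54with908\n' | split_on_digits
```

A separate counter is used instead of `NR` so that the numbering does not show gaps for the skipped records. Records consisting only of spaces or tabs are dropped as well.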
475 | 555 | * See also [gawk manual - Records](https://www.gnu.org/software/gawk/manual/html_node/Records.html#Records) |
476 | 556 |
|
477 | 557 | <br> |