awk Cheat Sheet
I needed to crunch some data quickly and decided awk was the right tool for the job. But every time I use awk, I have to go read the manual, so I decided it’s time for a cheat sheet.
Structure of an awk script
Invoke awk with a script like so:
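A minimal sketch of the general shape, using made-up sample data: the program is passed as one single-quoted argument and consists of optional BEGIN and END blocks around pattern { action } rules. The variable name structure_out is only there to capture the output for this illustration.

```shell
# Longer programs can live in a file instead: awk -f script.awk input.txt
# (both file names are hypothetical).
structure_out=$(printf 'apple 3\nbanana 5\n' | awk '
BEGIN  { print "start" }          # runs once, before any input is read
$2 > 4 { print $1 }               # pattern { action }, tested against each record
END    { print NR " records" }    # runs once, after the last record
')
printf '%s\n' "$structure_out"
```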
Matching
Match every line: a rule without a pattern applies to every record. Awk tests each record against every rule in the script and executes the action of every rule that matches.
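As a small sketch with invented input, the first rule below has no pattern and so runs for every record, while the second runs in addition whenever its pattern matches:

```shell
every_out=$(printf 'one\ntwo\n' | awk '
      { print "A: " $0 }   # no pattern: runs for every record
/two/ { print "B: " $0 }   # runs as well when the record contains "two"
')
printf '%s\n' "$every_out"
```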
Match blank lines:
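Two possible ways to do this, sketched on made-up input: the regular expression /^$/ matches only truly empty lines, while NF == 0 also catches lines that contain nothing but whitespace.

```shell
# The second input line holds two spaces, the third is empty.
blank_strict=$(printf 'a\n  \n\nb\n' | awk '/^$/ { n++ } END { print n + 0 }')
blank_loose=$(printf 'a\n  \n\nb\n' | awk 'NF == 0 { n++ } END { print n + 0 }')
printf '%s %s\n' "$blank_strict" "$blank_loose"
```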
Match on columns:
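For illustration, assuming a made-up two-column input of names and roles, a pattern can compare a single field against a string or test it against a regular expression with ~:

```shell
col_eq=$(printf 'alice admin\nbob user\ncarol admin\n' | awk '$2 == "admin" { print $1 }')
col_re=$(printf 'alice admin\nbob user\ncarol admin\n' | awk '$1 ~ /^b/ { print $2 }')
printf '%s\n' "$col_eq" "$col_re"
```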
Relational operators to match columns:
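The usual comparison operators (==, !=, <, <=, >, >=) work on fields. A short sketch on invented name and age data:

```shell
rel_gt=$(printf 'alice 30\nbob 25\ncarol 41\n' | awk '$2 > 26 { print $1 }')
rel_le=$(printf 'alice 30\nbob 25\ncarol 41\n' | awk '$2 <= 30 { print $1 }')
printf '%s\n' "$rel_gt" "$rel_le"
```

Fields that look like numbers are compared numerically, so 9 sorts before 10 here.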
Negate match:
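A sketch of both forms of negation on made-up input: ! in front of a pattern inverts it, and !~ is the negated form of the ~ match operator.

```shell
neg_line=$(printf 'keep\n# comment\nalso keep\n' | awk '!/^#/ { print }')
neg_col=$(printf 'alice admin\nbob user\n' | awk '$2 !~ /admin/ { print $1 }')
printf '%s\n' "$neg_line" "$neg_col"
```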
Input and Output
Awk splits the input into records on the RS (Record Separator) variable, which defaults to a newline.
Each input record is split into fields via the FS variable (Field Separator)
or via the -F command line flag.
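The following sketch, on invented colon- and semicolon-separated input, sets the field separator both ways and then changes the record separator:

```shell
fs_flag=$(printf 'root:x:0\ndaemon:x:1\n' | awk -F: '{ print $1 }')
fs_var=$(printf 'root:x:0\ndaemon:x:1\n' | awk 'BEGIN { FS = ":" } { print $3 }')
# With RS set to ";" the single input line is read as three records.
rs_out=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
printf '%s\n' "$fs_flag" "$fs_var" "$rs_out"
```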
Individual fields can be addressed with $<field index>; for example, $1 returns
the first field, $2 the second, and so on. $0 returns the whole record.
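For example, on a made-up three-word line:

```shell
field_out=$(printf 'alpha beta gamma\n' | awk '{ print $2; print $1; print $0 }')
printf '%s\n' "$field_out"
```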
Similarly to RS and FS, awk supports record and field separators for output formatting,
called ORS (Output Record Separator) and OFS (Output Field Separator).
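A small sketch with invented input: OFS is placed between the comma-separated arguments of print, and ORS is written after each print.

```shell
ofs_out=$(printf 'a b\nc d\n' | awk 'BEGIN { OFS = ","; ORS = ";" } { print $1, $2 }')
printf '%s\n' "$ofs_out"
```

Note that print $1 $2 (without the comma) concatenates the fields and does not use OFS.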
The printf function allows more control over formatting:
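As an illustration on made-up data, the format below left-aligns the name in a six-character column and prints the number five characters wide with two decimals. Unlike print, printf does not append ORS, so the newline has to be written explicitly.

```shell
printf_out=$(printf 'pi 3.14159\ne 2.71828\n' | awk '{ printf "%-6s|%5.2f\n", $1, $2 }')
printf '%s\n' "$printf_out"
```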
Variables
A variable is assigned with a name, the assignment operator, and an expression:
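A minimal sketch; the variable names x, y, and name are arbitrary:

```shell
assign_out=$(awk 'BEGIN { x = 2; y = x * 3 + 1; name = "awk"; print name, y }')
printf '%s\n' "$assign_out"
```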
Variables have both a numeric and a string value, and awk will use whichever is appropriate. Strings
that do not begin with a number have a numeric value of 0.
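The sketch below shows the conversions in both directions with arbitrary variable names: arithmetic forces the numeric value, and concatenation forces the string value. The variable unset is deliberately never assigned.

```shell
coerce_out=$(awk 'BEGIN { s = "abc"; n = "42"; print s + 0, n + 1, n "1", unset + 0 }')
printf '%s\n' "$coerce_out"
```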
Variables can be passed into awk as var=value arguments placed after the script on the command line:
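A sketch with a hypothetical variable named prefix. The trailing - tells awk to read standard input after the assignment has been processed:

```shell
cli_out=$(printf 'x\ny\n' | awk '{ print prefix $0 }' prefix='> ' -)
printf '%s\n' "$cli_out"
```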
Variables passed this way are not available in BEGIN blocks. To bind a variable before BEGIN runs, use -v var=value:
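The difference can be seen in the sketch below, which uses a hypothetical variable named greeting. With -v the value is visible inside BEGIN, while the trailing assignment has not been processed yet when BEGIN runs:

```shell
v_out=$(awk -v greeting=hello 'BEGIN { print "[" greeting "]" }')
late_out=$(awk 'BEGIN { print "[" greeting "]" }' greeting=hello)
printf '%s %s\n' "$v_out" "$late_out"
```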
Arrays can be used just like variables and don’t require initialization. Arrays are associative, i.e. both numbers and strings can be used as indexes.
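A short sketch that counts occurrences in made-up input; the array names count and squares are arbitrary:

```shell
array_out=$(printf 'a\nb\na\n' | awk '
    { count[$1]++ }          # string index, element is created on first use
END { squares[2] = 4; print count["a"], count["b"], squares[2] }
')
printf '%s\n' "$array_out"
```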
Predefined Variables
RS: Record separator
FS: Field separator
NR: Number of input records processed so far, aka the line number
NF: Number of fields in the current record
ORS: Output record separator
OFS: Output field separator
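For instance, printing NR and NF for each record of a made-up two-line input:

```shell
vars_out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF }')
printf '%s\n' "$vars_out"
```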
Control Flow
Awk supports if, if-else, if-else-if-else, and the ternary operator expr ? value-if-true : value-if-false:
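A sketch that classifies a few invented numbers; the variable names sign and parity are arbitrary:

```shell
flow_out=$(printf '%s\n' -3 0 4 | awk '{
  if ($1 < 0) {
    sign = "negative"
  } else if ($1 == 0) {
    sign = "zero"
  } else {
    sign = "positive"
  }
  parity = ($1 % 2 == 0) ? "even" : "odd"   # ternary picks one of two values
  print $1, sign, parity
}')
printf '%s\n' "$flow_out"
```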
In terms of loops, awk has while, do-while, and for loops. The for loop can be used like a traditional C-style for loop, or in a simplified form for traversing an array’s indexes:
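All four loop forms in one sketch, with arbitrary variable names. The order in which for-in visits the indexes is unspecified, so the example only counts them:

```shell
loop_out=$(awk 'BEGIN {
  for (i = 1; i <= 3; i++) total += i      # C-style for: 1 + 2 + 3
  while (n < 3) n++                        # while: condition is tested first
  do { k++ } while (k < 2)                 # do-while: body runs at least once
  seen["x"] = 1; seen["y"] = 1
  for (key in seen) count++                # for-in: visits every index once
  print total, n, k, count
}')
printf '%s\n' "$loop_out"
```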
Furthermore, awk has the continue and break keywords, which do exactly what you would think. There are also the exit and next keywords. exit does what you would expect and exits the script, although END blocks will still be executed. next stops processing the current record and causes the
next record to be read.
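A sketch of all four keywords on made-up input. In the first command, record 2 is skipped by next and exit stops at record 4, yet the END block still prints. The second command uses continue and break inside a loop:

```shell
skip_out=$(printf '%s\n' 1 2 3 4 5 | awk '
$1 == 2 { next }             # skip this record; the rules below never see it
$1 == 4 { exit }             # stop reading input; END still runs
        { print "row " $1 }
END     { print "done" }
')
loop_ctl=$(awk 'BEGIN {
  for (i = 1; i <= 5; i++) {
    if (i == 2) continue     # jump to the next iteration
    if (i == 4) break        # leave the loop entirely
    picked = picked i        # append the loop counter to a string
  }
  print picked
}')
printf '%s\n' "$skip_out" "$loop_ctl"
```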