AWK is a versatile text-processing language commonly used for data manipulation and analysis. One of its main strengths is how well it combines with shell scripting, which lets us pair AWK's processing power with the flexibility of shell variables. For that combination to work reliably, we need to understand how to pass shell variables into AWK programs and use them correctly.
This article explains the ways to make shell variables available to AWK: embedding them in the script, feeding them as input, assigning them to AWK variables, passing them as arguments through ARGV, and reading them from the environment through ENVIRON. By the end, we will have a working grasp of each technique and of where it falls short.
Embed a Shell Variable in AWK
One way to use the value of a shell variable in AWK is to interpolate it directly into a string inside the script:
$ awk 'BEGIN { print "shell_var='"$var"'" }'
Output:
awk: Runs the AWK command-line utility.
'BEGIN { print "shell_var='"$var"'" }': The AWK script. Most of it sits inside single quotes, which stop the shell from expanding anything.
BEGIN: An AWK pattern whose action runs before any input is processed.
print "shell_var='"$var"'": The first single quote closes the quoted script, "$var" is expanded by the shell inside double quotes, and the next single quote reopens the script. AWK therefore receives print "shell_var=VALUE", where VALUE stands for the content of the var shell variable. The single quotes themselves are not printed.
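As a minimal sketch of the quoting described above, the following assigns a hypothetical value to `var` (the value is an assumption made for illustration) and captures what AWK prints:

```shell
# Hypothetical value, chosen only for this illustration
var='value'

# The shell pastes the expansion of "$var" into the program text,
# so AWK sees: BEGIN { print "shell_var=value" }
embedded=$(awk 'BEGIN { print "shell_var='"$var"'" }')
printf '%s\n' "$embedded"   # prints: shell_var=value
```

Because the value becomes part of the program text, a value that contains a double quote or a backslash changes the script itself, so this form is best kept to values we control.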
Taking the Direct Value as an Input
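Many shells, including Bash, Zsh, and ksh, support here-strings, which pass a string to a command on its standard input. AWK then reads the value of the variable as input data rather than as part of the script.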
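A minimal sketch of the here-string approach, assuming Bash (the `<<<` operator is not part of POSIX sh) and a hypothetical value for `var`:

```shell
# Hypothetical value, chosen only for this illustration
var='value'

# <<< feeds the string (plus a trailing newline) to awk's standard input,
# so the value arrives as the input record $0
from_stdin=$(awk '{ print "shell_var=" $0 }' <<< "$var")
printf '%s\n' "$from_stdin"   # prints: shell_var=value
```

Since the value is ordinary input, a value that spans several lines reaches AWK as several records, one per line.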
Predefined Variables in AWK
Before running a script, AWK lets us assign values to its own variables. Notably, an AWK variable does not need to share the name of the shell variable that supplies its value.
If an assigned value includes an escape sequence such as \t or \n, AWK interprets it:
Output:
ue
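The effect can be reproduced with a small sketch. The value below is hypothetical, and the assignment uses the -v flag; the second command only counts the lines that AWK prints:

```shell
# Hypothetical value containing the two-character sequences \t and \n
var='\tval\nue'

# During the assignment, \t becomes a tab and \n becomes a newline
awk -v v="$var" 'BEGIN { print "shell_var=" v }'

# Count the lines AWK printed: the embedded \n splits the value in two
line_count=$(awk -v v="$var" 'BEGIN { print "shell_var=" v }' | awk 'END { print NR }')
printf '%s\n' "$line_count"   # prints: 2
```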
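Regular-expression metacharacters such as the pipe symbol "|" need a doubled backslash when an assigned variable is meant to match them literally: the assignment consumes one level of backslashes, and the regular-expression engine consumes the next.
AWK offers two ways to make these assignments.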
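A minimal sketch of the double escaping, using a hypothetical pattern `re` that should match the literal text `a|b` and nothing else:

```shell
# Hypothetical pattern: match the literal text a|b
# The single-quoted 'a\\|b' reaches AWK unchanged; the assignment reduces
# \\ to \ , and the regex engine then reads \| as a literal pipe
matches=$(printf '%s\n' 'a|b' 'ab' 'a' | awk -v re='a\\|b' '$0 ~ re')
printf '%s\n' "$matches"   # prints: a|b
```

With a single backslash, the result depends on how a given AWK implementation treats the unknown escape `\|`, which is why the doubled form is the safer choice.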
1. Using -v: The -v flag of AWK is followed by a space, a variable name, an "=" equals sign, and the variable value. Crucially, the value can be an interpolated shell variable:
$ awk -v Hard="$Hard" 'BEGIN { print "shell_Hard=" Hard }'
Output:
Each variable can then be utilized normally throughout the script.
We supply -v as many times as we need for multiple variables:
$ Boy='Girl'
$ awk -v Happy="$Happy" -v Boy="$Boy" 'BEGIN { print "Happy=" Happy "\n" "Boy=" Boy }'
Output:
Boy=Girl
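As a further sketch, -v also works for values used in comparisons, and the AWK names need not match the shell names. The variables `threshold` and `label` and the sample rows are assumptions made for illustration:

```shell
# Hypothetical shell variables for this illustration
threshold=50
label='pass'

# The AWK names (min, tag) differ from the shell names (threshold, label)
report=$(printf '%s\n' 'ann 72' 'bob 41' 'cy 50' |
  awk -v min="$threshold" -v tag="$label" '$2 >= min { print $1, tag }')
printf '%s\n' "$report"
# prints:
# ann pass
# cy pass
```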
2. Direct Variables: As with the preceding option, we can assign variables by placing name=value operands straight after the script or script file:
$ var=20
$ echo $'row1\nrow2' | awk 'BEGIN { print var } { print $0, NR "*" var "=" NR*var }' var="$var"
Output:

row1 1*20=20
row2 2*20=40
The first output line is empty. Variables assigned this way are not yet set when the BEGIN block runs, because AWK processes each assignment only when it reaches that operand, after BEGIN has finished. As a result, var is unset in the first print but holds its value in every later one. This method is equivalent to using -v as long as we don't need the values in a BEGIN block.
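The difference between the two assignment styles can be sketched side by side. The value 20 and the bracketed marker text are assumptions made for illustration:

```shell
# Hypothetical value for this illustration
var=20

# Operand assignment: var is still unset while BEGIN runs
late=$(printf 'row1\nrow2\n' |
  awk 'BEGIN { print "BEGIN sees [" var "]" } { print $0, NR * var }' var="$var")
printf '%s\n' "$late"
# prints:
# BEGIN sees []
# row1 20
# row2 40

# -v assignment: var is already set when BEGIN runs
early=$(awk -v var="$var" 'BEGIN { print "BEGIN sees [" var "]" }')
printf '%s\n' "$early"   # prints: BEGIN sees [20]
```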
Passing Multiple Shell Variables to AWK Using ARGV
We can also pass the values of shell variables as ordinary command-line arguments and read them from the ARGV array:
$ var2='Lucy2'
$ awk 'BEGIN { print "shell_var1=" ARGV[1] "\n" "shell_var2=" ARGV[2] }' "$var1" "$var2"
Output:
shell_var2=Lucy2
This is a robust way of capturing any variable value without worrying about escaping, because AWK stores each argument in ARGV unchanged.
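A minimal sketch of that robustness, using a hypothetical value full of backslashes. The second command is an assumption about a common follow-up need, namely reading input as well: it copies the value and deletes the ARGV entry so that AWK does not try to open it as a file.

```shell
# Hypothetical value containing backslashes
value='C:\new\table'

# ARGV keeps the argument as is: \n and \t stay two characters each
via_argv=$(awk 'BEGIN { print ARGV[1] }' "$value")
printf '%s\n' "$via_argv"   # prints: C:\new\table

# With main rules, AWK would try to open ARGV[1] as a file,
# so copy the value and delete the entry before input is read
tagged=$(printf 'line\n' |
  awk 'BEGIN { v = ARGV[1]; delete ARGV[1] } { print v ": " $0 }' "$value")
printf '%s\n' "$tagged"   # prints: C:\new\table: line
```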
Accessing the Shell Variables in AWK Using ENVIRON
The ENVIRON special variable is a string-indexed array that holds the environment variables AWK inherited from the shell:
$ awk 'BEGIN { print ENVIRON["sourav"] }'
One slight disadvantage is that we must export the variable:
$ export var2='Sad'
$ var3='ok'
$ awk 'BEGIN { print "shell_var1=" ENVIRON["var1"] "\n" "shell_var2=" ENVIRON["var2"] "\n" "shell_var3=" ENVIRON["var3"] }'
Output:
shell_var2=Sad
shell_var3=
As a result, the value of the non-exported “$var3” variable cannot be obtained using ENVIRON.
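When a permanent export is not wanted, one option is a NAME=value prefix on the awk command, which places the variable in the environment of that single command. A minimal sketch, assuming a hypothetical non-exported `var3` that is not already present in the environment:

```shell
# Hypothetical, non-exported shell variable
var3='ok'

# Without export, awk's environment has no var3
unset_view=$(awk 'BEGIN { print "shell_var3=" ENVIRON["var3"] }')
printf '%s\n' "$unset_view"   # prints: shell_var3=

# A NAME=value prefix exports var3 for this one command only
prefixed=$(var3="$var3" awk 'BEGIN { print "shell_var3=" ENVIRON["var3"] }')
printf '%s\n' "$prefixed"   # prints: shell_var3=ok
```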
Conclusion
This article covered several ways to pass shell variables to AWK: embedding them in the script, feeding them as input, assigning them with -v or as operands after the script, passing them through ARGV, and reading them from ENVIRON. Knowing when each one applies lets us combine AWK and shell scripting to handle demanding data-processing jobs.