
How to Allow Awk to Use Shell Variables

AWK is a versatile text-processing language that is commonly used for data manipulation and analysis. One of its main strengths is how well it works with shell scripts, which lets us combine the power of AWK with the flexibility of shell variables. To make that combination work smoothly, we need to understand how to pass shell variables into AWK programs and use them correctly.

This article explains the different ways of making shell variables available to AWK and shows the commands that each one requires. By the end of this tutorial, we will have a working grasp of how to combine AWK and shell scripting for data processing tasks.

Embed a Shell Variable in AWK

One approach to using the value of a shell variable in AWK is to interpolate it directly into a string inside the script:

$ var='DOG'

$ awk 'BEGIN { print "shell_var='"$var"'" }'

Output:

shell_var=DOG

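awk: Runs the AWK command-line utility.

'BEGIN { print "shell_var='"$var"'" }': This is the AWK script. It is built from two single-quoted pieces with the double-quoted "$var" shell expansion between them.

BEGIN: This is an AWK pattern that specifies the actions to be taken before the input processing begins.

print "shell_var='"$var"'": The first single quote after shell_var= closes the single-quoted part of the script so that the shell can expand "$var", and the next single quote reopens it. AWK therefore receives print "shell_var=DOG". Because the value becomes part of the AWK source code, this approach is only safe for values that contain no quotes or backslashes.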
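One caveat of embedding is worth seeing in practice. The sketch below uses a made-up value that contains double quotes (the value is an assumption for illustration). Since the shell pastes the value into the program text, AWK parses it as code rather than as data:

```shell
# Hypothetical value that contains double quotes
var='DOG" "CAT'

# After shell expansion, AWK is handed this program:
#   BEGIN { print "shell_var=DOG" "CAT" }
# so the embedded quotes split the string into two concatenated literals.
embedded=$(awk 'BEGIN { print "shell_var='"$var"'" }')
echo "$embedded"   # prints shell_var=DOGCAT
```

The quotes and the space are silently lost, which is why the methods in the following sections are usually preferable for arbitrary values.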

Taking the Direct Value as an Input

Many shells, including Bash, Zsh, and ksh, can pass a value to the standard input of a command using a here-string:

$ cat <<< 'sunny'

Output:

sunny
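The same mechanism can hand the value of a shell variable to AWK on standard input. A minimal sketch, assuming a shell with here-string support such as Bash and an illustrative variable named var:

```shell
var='sunny'

# The here-string feeds the value to AWK as a single input record,
# which the script reads as $0.
result=$(awk '{ print "shell_var=" $0 }' <<< "$var")
echo "$result"   # prints shell_var=sunny
```

A value that spans several lines would arrive as several records, so this fits single-line values best.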

Predefined Variables in AWK

Before running a script, AWK allows us to assign values to its internal variables. Notably, the names of the shell and AWK variables can differ.

If the variable values include an escape sequence, AWK interprets it as follows:

$ awk -v JOY='\tval\nue' 'BEGIN { print "JOY=" JOY }'

Output:

JOY= val

ue

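Because of this escape processing, a backslash that must reach AWK intact has to be doubled. This matters for regular expressions: to match a metacharacter such as the “|” pipe symbol literally, the backslash that escapes it must be written twice in the assigned value.

There are two ways to predefine a variable. Let’s look into both possibilities.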
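As an illustration of the doubled backslash, here is a small sketch that uses a -v variable as a dynamic regular expression. The variable name re and the sample lines are assumptions chosen for the demonstration:

```shell
# Hypothetical sample input: only the first line contains a literal "|"
input=$(printf '%s\n' 'a|b' 'ab' 'a')

# The single quotes pass a\\|b to AWK unchanged. AWK's escape processing
# turns \\ into \, leaving the regex a\|b, which matches a literal pipe.
matches=$(printf '%s\n' "$input" | awk -v re='a\\|b' '$0 ~ re { print "match: " $0 }')
echo "$matches"   # prints match: a|b
```

With an unescaped pipe, the value would instead act as an alternation and match all three lines.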

1. Using -v: The -v flag of AWK is followed by a space, a variable name, an “=” equals sign, and the variable value. The value, crucially, can be an interpolated shell variable:

$ Hard='Rock'

$ awk -v Hard="$Hard" 'BEGIN { print "shell_Hard=" Hard }'

Output:

shell_Hard=Rock

Each variable can then be used like any other AWK variable throughout the script, including in the BEGIN block.
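For instance, the AWK variable does not need to share the name of the shell variable, and it can appear in patterns as well as in actions. A short sketch, where the names threshold and min and the sample numbers are made up for the example:

```shell
threshold=15   # hypothetical shell variable

# The value is exposed to AWK as "min" and used in a pattern that
# selects the input lines whose first field is greater than it.
above=$(printf '%s\n' 10 20 30 | awk -v min="$threshold" '$1 > min { print $1 }')
echo "$above"   # prints 20 and 30 on separate lines
```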

We supply -v as many times as we need for multiple variables:

$ Happy='Sad'

$ Boy='Girl'

$ awk -v Happy="$Happy" -v Boy="$Boy" 'BEGIN { print "Happy=" Happy "\n" "Boy=" Boy }'

Output:

Happy=Sad

Boy=Girl

2. Direct Variables: Similar to the preceding option, variable assignments can be placed as arguments after the script or script file:

$ var=20

$ echo $'row1\nrow2' | awk 'BEGIN { print var } { print $0, NR "*" var "=" NR*var }' var="$var"

Output:

row1 1*20=20

row2 2*20=40

The first output line is empty. This is because AWK processes assignments given as arguments only after the BEGIN block has run, so var is still undefined in the first print but has its value for every input record. This method is equivalent to using -v as long as we don’t need the values in a BEGIN block.
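Because these assignments are processed in order along with the file arguments, a variable can also take a different value for each input file. A sketch of that behavior, using two throwaway files created with mktemp (the file names and the tag variable are assumptions for the demonstration):

```shell
tmpdir=$(mktemp -d)
printf 'a\n' > "$tmpdir/first.txt"
printf 'b\n' > "$tmpdir/second.txt"

# Each assignment takes effect just before the file that follows it is read.
tagged=$(awk '{ print tag ":" $0 }' tag=one "$tmpdir/first.txt" tag=two "$tmpdir/second.txt")
echo "$tagged"   # prints one:a then two:b

rm -rf "$tmpdir"
```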

Passing Multiple Shell Variables to AWK Using ARGV

We can also pass the values of shell variables as ordinary command-line arguments and read them from the ARGV array:

$ var1='Lucy'

$ var2='Lucy2'

$ awk 'BEGIN { print "shell_var1=" ARGV[1] "\n" "shell_var2=" ARGV[2] }' "$var1" "$var2"

Output:

shell_var1=Lucy

shell_var2=Lucy2

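This is a reliable method for capturing any variable value without having to worry about escaping, because AWK does not interpret escape sequences in ARGV elements. Note that it works here because the script has only a BEGIN block; otherwise, AWK would treat the arguments as input file names.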
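When the script also has rules that read input, AWK would try to open these arguments as files. One way around that, sketched here under the assumption that the input arrives on standard input, is to copy the value in the BEGIN block and then blank out the ARGV element, which AWK skips:

```shell
var1='Lucy'

# Save the argument, then set ARGV[1] to the empty string so AWK does not
# treat it as a file name and falls back to reading standard input.
labeled=$(printf 'row1\nrow2\n' |
  awk 'BEGIN { name = ARGV[1]; ARGV[1] = "" } { print name ": " $0 }' "$var1")
echo "$labeled"   # prints Lucy: row1 then Lucy: row2
```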


Accessing the Shell Variables in AWK Using ENVIRON

All environment variables are accessible through the ENVIRON special variable, which is a string-indexed array:

$ export sourav='data'

$ awk 'BEGIN { print ENVIRON["sourav"] }'
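Unlike -v, ENVIRON hands values over exactly as they are stored, with no escape-sequence processing. The sketch below compares the two for a hypothetical value containing backslashes (the variable name win_path is made up for the example):

```shell
export win_path='C:\temp\new'   # hypothetical value with backslashes

# -v interprets \t and \n as a tab and a newline ...
via_v=$(awk -v p="$win_path" 'BEGIN { print p }')
# ... while ENVIRON returns the characters unchanged.
via_env=$(awk 'BEGIN { print ENVIRON["win_path"] }')

printf '%s\n' "$via_env"   # prints C:\temp\new
```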

One slight disadvantage is that we must export the variable:

$ export var1='Happy'

$ export var2='Sad'

$ var3='ok'

$ awk 'BEGIN { print "shell_var1=" ENVIRON["var1"] "\n" "shell_var2=" ENVIRON["var2"] "\n" "shell_var3=" ENVIRON["var3"] }'

Output:

shell_var1=Happy

shell_var2=Sad

shell_var3=

As a result, the value of the non-exported “$var3” variable cannot be obtained using ENVIRON.
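If exporting the variable for the whole session is undesirable, the shell can place it in the environment of a single command by writing the assignment in front of that command. A minimal sketch that reuses var3 from above:

```shell
var3='ok'   # not exported

# The assignment prefix puts var3 into the environment of this one awk
# process only; the shell's own var3 stays unexported.
scoped=$(var3="$var3" awk 'BEGIN { print "shell_var3=" ENVIRON["var3"] }')
echo "$scoped"   # prints shell_var3=ok
```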

Conclusion

This article looked at the ways of passing shell variables to AWK: embedding them in the script, supplying them as input with a here-string, predefining them with -v or with direct assignments, passing them through ARGV, and reading them from ENVIRON. Understanding these techniques lets us combine AWK and shell scripting to handle data processing tasks with ease.
