How to master the AWK command to process text under Linux?

Francis

The AWK command is a powerful and versatile tool for processing and transforming text data in Linux. Whether extracting information, filtering rows, reformatting output, or performing calculations, AWK can make your life easier with just a few lines of code. In this article, you will learn how to use the AWK command for text manipulation in Linux.

What is the AWK command?

AWK is an interpreted programming language, and the awk command that runs it is available in the Linux terminal. Its name comes from the initials of its creators: Alfred Aho, Peter Weinberger and Brian Kernighan. AWK was originally designed to process files structured into fields separated by delimiters, such as CSV files or the /etc/passwd file. AWK can also be applied to less regular text, such as HTML or XML files, although it does not parse their structure. AWK is not an object-oriented programming language, but it allows you to define functions and variables: variables are global by default, and function parameters act as local variables. It also has control structures like loops and conditions.
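As a small illustration of functions and control structures, here is a sketch with a user-defined function and a for loop. The function name square and its extra parameter tmp are hypothetical; listing tmp as an additional parameter is the usual AWK idiom for obtaining a local variable.

```shell
# Define a function and call it from a loop in the BEGIN block.
# "square" and "tmp" are illustrative names, not part of AWK itself.
squares=$(awk '
  function square(x,   tmp) {   # tmp is local because it is a parameter
    tmp = x * x
    return tmp
  }
  BEGIN {
    for (i = 1; i <= 3; i++)
      print i, square(i)
  }')
echo "$squares"
```

Because the program only has a BEGIN block, AWK does not wait for any input here.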

The general syntax of the AWK command is as follows:

awk [options] 'program' [files]

The program is a series of instructions that define patterns to search for in each line of the file and actions to perform when a pattern is found. The options allow you to modify the behavior of the AWK command, such as the choice of field delimiter or the output format.
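To make the pattern and action structure concrete, here is a minimal sketch that feeds three made-up lines to AWK on standard input. The pattern $2 > 5 selects lines, and the action print $1 runs only for those lines.

```shell
# Feed three sample lines to AWK; keep the names whose number exceeds 5.
selected=$(printf 'apple 3\nbanana 12\ncherry 7\n' |
  awk '$2 > 5 { print $1 }')
echo "$selected"
```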

How to print text with the AWK command?

The AWK command can be used to print a message to the terminal. If you run the AWK command without any pattern and without an input file, with just a print action, AWK reads from standard input and prints the message every time you press Enter.

For example, if you type:

awk '{print "Hello"}'

And you press Enter several times, you get:

Hello
Hello
Hello

To stop the AWK command, you can press Ctrl+C.

If you want to print the contents of a file with a header and a footer, you can use the BEGIN block, which runs before the file is read, and the END block, which runs after the file has been read. For example, if you have a file named test.txt that contains:

This is a test
AWK is a great tool
Linux is the best operating system

You can print the contents of the file with the following command:

awk 'BEGIN {print "Here is the content of the test.txt file:"} {print} END {print "End of file"}' test.txt

Which gives:

Here is the content of the test.txt file:
This is a test
AWK is a great tool
Linux is the best operating system
End of file

The {print} command with no arguments prints the entire line. You can also print a specific field using the $n notation, where n is the field number. By default, fields are separated by spaces or tabs, but you can change the delimiter with the -F option.

For example, if you want to print the first and third fields of the /etc/passwd file, which are separated by a colon (:), you can use the following command:

awk -F: '{print $1 " " $3}' /etc/passwd

Which gives something like:

root 0
daemon 1
bin 2
sys 3
sync 4
games 5
man 6
lp 7
mail 8
news 9
uucp 10
proxy 13
www-data 33
...
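A related sketch: when the items of a print statement are separated by commas, AWK joins them with the output field separator OFS (a space by default). The sample lines below are made up in the style of /etc/passwd so that the result does not depend on the local system.

```shell
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# -F: splits on colons; -v OFS sets the separator used between print items.
# $NF is the last field of the line (here, the login shell).
shells=$(printf '%s\n' "$sample" | awk -F: -v OFS=' -> ' '{ print $1, $NF }')
echo "$shells"
```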

You can also print arithmetic expressions or character strings with the AWK command. For example, if you want to print the square of the second field of each line of the test.txt file, you can use the following command:

awk '{print $2^2}' test.txt

Because the second field of every line in test.txt is the word is, which AWK treats as 0 in a numeric context, this gives:

0
0
0

If you want to print the number of lines in the test.txt file, you can use the special variable NR, which contains the number of the current line. In the END block, NR holds the number of the last line read, which is the total line count. For example, you can use the following command:

awk 'END {print NR}' test.txt

Which gives:

3

How to filter text with the AWK command?

The AWK command can be used to filter text based on patterns or conditions. If you specify a pattern before an action, AWK only performs the action on the lines that match the pattern. The pattern can be a regular expression, a comparison, a logical operation, or a combination of these.

For example, if you want to print the lines of the test.txt file that contain the word Linux, you can use the following command:

awk '/Linux/ {print}' test.txt

Which gives:

Linux is the best operating system

If you want to print the lines of the /etc/passwd file that have a UID greater than 1000, you can use the following command:

awk -F: '$3 > 1000 {print}' /etc/passwd

Which gives something like:

libvirt-qemu:x:64055:139:Libvirt Qemu,,,:/var/lib/libvirt:/usr/sbin/nologin
snapd-range-524288-root:x:524288:524288::/nonexistent:/bin/false
snap_daemon:x:584788:584788::/nonexistent:/bin/false
...

You can also use the logical operators && (and), || (or) and ! (not) to combine patterns. For example, if you want to print the lines of the /etc/passwd file that have a UID greater than 1000 and a shell other than /usr/sbin/nologin, you can use the following command:

awk -F: '$3 > 1000 && $7 != "/usr/sbin/nologin" {print}' /etc/passwd

Which gives something like:

snapd-range-524288-root:x:524288:524288::/nonexistent:/bin/false
snap_daemon:x:584788:584788::/nonexistent:/bin/false
...
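Comparisons can also be combined with the regular expression operators ~ (matches) and !~ (does not match). The following sketch, run on made-up passwd-style lines, keeps accounts with a UID of at least 1000 whose shell is neither nologin nor false.

```shell
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/bash
backup:x:1001:1001::/var/backups:/usr/sbin/nologin
bob:x:1002:1002:Bob:/home/bob:/bin/zsh
svc:x:1003:1003::/nonexistent:/bin/false'

# Print the login name and shell of accounts that pass both tests.
logins=$(printf '%s\n' "$sample" |
  awk -F: '$3 >= 1000 && $7 !~ /(nologin|false)$/ { print $1, $7 }')
echo "$logins"
```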

How to edit text with the AWK command?

The AWK command can be used to modify text using built-in functions or special variables. For example, the gsub function replaces all occurrences of a pattern with another string, which lets you replace spaces with hyphens, and the special variable OFS defines the output field separator. GNU AWK (gawk) also provides the strftime function, which formats a Unix timestamp (a number of seconds since the epoch). For example, assuming the first field of each line of a file holds such a timestamp (the file name timestamps.txt is only an illustration), you can use the following command:

awk '{print strftime("%d/%m/%Y %H:%M:%S", $1)}' timestamps.txt

Which gives something like:

30/10/2021 16:13:49
31/10/2021 17:14:50
01/11/2021 18:15:51

You can consult the AWK command manual to find out the other available functions and variables.
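As a sketch of the two techniques mentioned in this section, the commands below replace the spaces of a sample line with hyphens, first with gsub and then by changing OFS. Assigning $1 = $1 is a common idiom that makes AWK rebuild the line with the new separator.

```shell
line='This is a test'

# gsub replaces every match of the pattern in $0 (the whole line).
with_gsub=$(printf '%s\n' "$line" | awk '{ gsub(/ /, "-"); print }')

# Changing OFS only takes effect once a field is assigned, hence $1 = $1.
with_ofs=$(printf '%s\n' "$line" | awk -v OFS='-' '{ $1 = $1; print }')

echo "$with_gsub"
echo "$with_ofs"
```

The two approaches differ on input with repeated spaces: gsub turns every space into a hyphen, while rebuilding the line with OFS puts exactly one hyphen between fields.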

How to use a for loop with the AWK command?

The AWK command can be used to perform for loops, for example over the fields of a line or the elements of an array. AWK has two forms of the for loop:

for (initialization; condition; increment) action
for (variable in array) action

The first form works like the for loop in C. In the second form, variable successively takes each index of the array (in no guaranteed order), and action is the action to perform in each iteration.
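The for (variable in array) form is typically used with associative arrays. The sketch below counts how often each word appears in a made-up list; because the iteration order is not specified, the output is passed through sort.

```shell
# seen[] is an associative array indexed by the first field of each line.
counts=$(printf 'red\nblue\nred\ngreen\nblue\nred\n' |
  awk '{ seen[$1]++ } END { for (color in seen) print color, seen[color] }' |
  LC_ALL=C sort)
echo "$counts"
```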

For example, if you want to print the fields of each line in reverse order, you can use the for loop with the special variable NF, which contains the number of fields in the current line. For example, if you have a test.txt file that contains:

This is a test
AWK is a great tool
Linux is the best operating system

You can reverse the order of the fields with the following command:

awk '{for (i=NF; i>0; i--) printf "%s%s", $i, (i>1 ? " " : "\n")}' test.txt

Which gives:

test a is This
tool great a is AWK
system operating best the is Linux

AWK already reads its input line by line, so you do not need a for loop to iterate through the lines of a file. The special variable FNR contains the line number within the current file. For example, if you want to print the even line numbers of the test.txt file, you can use the following command:

awk 'FNR%2==0 {print FNR}' test.txt

Which gives:

2
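To see how FNR differs from NR, the following sketch runs AWK on two small temporary files (their names and contents are made up). NR keeps counting across both files, while FNR starts again at 1 for the second file.

```shell
# Create two small files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# Print the global line number, the per-file line number and the line itself.
counters=$(awk '{ print NR, FNR, $0 }' "$tmpdir/first.txt" "$tmpdir/second.txt")
echo "$counters"

rm -rf "$tmpdir"
```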

How to run an AWK script?

To run an AWK script, you can save it in a file with the .awk extension, start it with a line that names the interpreter (#!/usr/bin/awk -f), and give it execution rights with the chmod +x command. Then you can run the script with the command ./script_name.awk [files].

For example, if you have a script named hello.awk that contains:

#!/usr/bin/awk -f
BEGIN {print "Hello"}

You can run the script with the following command:

./hello.awk

Which gives:

Hello
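A script file can also be run with awk -f, which does not depend on the execute bit or on the interpreter path in the first line. In this sketch the file name count.awk and its contents are hypothetical.

```shell
tmpdir=$(mktemp -d)

# Write a small AWK program to a file (quoted heredoc: no shell expansion).
cat > "$tmpdir/count.awk" <<'EOF'
# Count the non-empty lines of the input.
NF > 0 { n++ }
END    { print "non-empty lines:", n + 0 }
EOF

# Run the program with -f and feed it three lines, one of them empty.
report=$(printf 'one\n\nthree\n' | awk -f "$tmpdir/count.awk")
echo "$report"

rm -rf "$tmpdir"
```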

How to pass arguments to an AWK script?

To pass arguments to an AWK script, you can use two methods:

The first method is to use the -v option with an assignment of the form variable=value. For example, if you want to pass two arguments named var1 and var2 to your hello.awk script, you can use the following command:

awk -v var1=hello -v var2=world -f hello.awk

In your hello.awk script, you can access the arguments through the variables var1 and var2. They are written without a $ sign, because in AWK the $ operator refers to a field of the current line. For example, if your script contains:

#!/usr/bin/awk -f
BEGIN {print var1 " " var2}

You obtain:

hello world

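The second method is to use the special array ARGV, which contains the arguments passed to the script. For example, if you want to pass two unnamed arguments to your hello.awk script, you can use the following command:

awk -f hello.awk hello world

In your hello.awk script, you can access the arguments with the ARGV[1] and ARGV[2] indices. For example, if your script contains:

#!/usr/bin/awk -f
BEGIN {print ARGV[1] " " ARGV[2]}

You obtain:

hello world

This works here because the script only has a BEGIN block. If the script also contained rules that process input, AWK would try to open hello and world as input files.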
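AWK also accepts variable=value assignments among the file arguments. Unlike -v, such an assignment is processed only when AWK reaches it in the argument list, so the variable is not yet set in the BEGIN block. A sketch with the hypothetical variable names early and late:

```shell
# "-" tells AWK to read standard input after processing the assignment late=2.
timing=$(printf 'x\n' |
  awk -v early=1 '
    BEGIN { print "BEGIN:", early, "[" late "]" }
    { print "main:", early, late }' late=2 -)
echo "$timing"
```

The empty brackets on the BEGIN line show that late has no value yet at that point.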

FAQs

What is the difference between AWK and GAWK?

GAWK is the GNU implementation of AWK. It adds features to the original language, such as true multidimensional arrays (arrays of arrays), additional built-in functions (for example strftime and gensub) and extra command-line options.

How to debug an AWK script?

To debug an AWK script with GAWK, you can use the --lint option (also written -W lint), which displays warning messages about potential errors in the script. You can also use the --dump-variables option (also written -W dump-variables), which writes the global variables and their final values to a file (awkvars.out by default) at the end of the script execution.

How to use the AWK command to sort data?

To sort data, you can use the sort command in combination with AWK. For example, if you want to sort the users in the /etc/passwd file by their UIDs, you can use the following command:

awk -F: '{print $1, $3}' /etc/passwd | sort -n -k2

By combining AWK with other commands (here sort ), you can easily go much further in displaying and organizing data.
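The same pipeline can be tried on made-up data so that the result is predictable. In this sketch, -k2,2n restricts the numeric sort key to the second column.

```shell
sample='games:x:5:60
root:x:0:0
mail:x:8:8
bin:x:2:2'

# AWK selects the name and UID; sort orders the lines by UID.
by_uid=$(printf '%s\n' "$sample" | awk -F: '{ print $1, $3 }' | sort -k2,2n)
echo "$by_uid"
```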

How to print the word count of a file with the AWK command?

To print the number of words in a file with the AWK command, you can use the special variable NF, which contains the number of fields in the current line. Using a for loop, you can count the words in each line and add them to a total (adding NF to the total on each line, with total += NF, gives the same result). Using the special pattern END, you can print the final result. For example, if you have a file named test.txt that contains:

This is a test
AWK is a great tool
Linux is the best operating system

You can print the word count of the file with the following command:

awk '{for (i=1; i<=NF; i++) total++} END {print total}' test.txt

Which gives:

15

How to use the AWK command to extract data from a CSV file?

To use the AWK command to extract data from a comma-separated values (CSV) file, you can use the -F option to set the field separator to a comma. For example, if you have a file named test.csv that contains:

name,first name,age
Alice,Dupont,25
Bob,Martin,32
Charles,Durand,28

You can extract the name and age of each person with the following command:

awk -F"," '{print $1 " " $3}' test.csv

Which gives:

name age
Alice 25
Bob 32
Charles 28
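If the fields of a CSV file are followed by spaces after the commas, a regular expression can be used as the separator. This is a sketch for simple files only: AWK's field splitting does not understand quoted fields that contain commas. The data below is made up.

```shell
csv='name, first name, age
Alice, Dupont, 25
Bob, Martin, 32'

# The separator is a comma followed by any number of spaces.
# NR > 1 skips the header line.
pairs=$(printf '%s\n' "$csv" | awk -F', *' 'NR > 1 { print $1, $3 }')
echo "$pairs"
```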

How to filter data with the AWK command?

The AWK command allows you to filter data based on patterns, which are regular expressions or logical conditions. A pattern is placed before its action, and the action is enclosed in curly braces. For example, if you want to display the lines of the test.csv file that contain the name Alice, you can use the following pattern:

awk -F"," '/Alice/ {print}' test.csv

Which gives:

Alice,Dupont,25

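If you want to display the lines of the test.csv file for people older than 30, you can use the following pattern. The NR > 1 condition skips the header line, whose third field (age) is not a number and would otherwise be compared as a string:

awk -F"," 'NR > 1 && $3 > 30 {print}' test.csv

Which gives:

Bob,Martin,32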
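One detail worth checking when a file has a header row: a field that does not look like a number, such as the column title age, is compared as a string, so a bare $3 > 30 can also match the header. The sketch below shows this on made-up data, and one way to avoid it by forcing a numeric comparison with $3 + 0.

```shell
csv='name,first name,age
Alice,Dupont,25
Bob,Martin,32'

# String comparison: "age" sorts after "30", so the header line matches too.
naive=$(printf '%s\n' "$csv" | awk -F, '$3 > 30')

# Adding 0 converts the field to a number; "age" becomes 0 and is rejected.
numeric=$(printf '%s\n' "$csv" | awk -F, '$3 + 0 > 30')

echo "$naive"
echo "$numeric"
```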

You can combine multiple patterns with the logical operators && (and), || (or) and ! (not). For example, if you want to display the lines of the test.csv file that have a name starting with C or an age less than 10, you can use the following pattern:

awk -F"," '($1 ~ /^C/) || ($3 < 10) {print}' test.csv

Which gives:

Charles,Durand,28

How to calculate statistics with the AWK command?

The AWK command allows you to calculate statistics on numerical data in a file, such as the sum, average, minimum or maximum. To do this, use variables to store intermediate values and update them on each line. Using the special END pattern, you can display the final result. For example, if you want to calculate the sum and average of the ages in the test.csv file, you can use the following program:

awk -F"," 'NR>1 {sum+=$3; count++} END {print "Sum: " sum; print "Average: " sum/count}' test.csv

Which gives:

Sum: 85
Average: 28.3333

Explanations:

We use the -F"," option to define the field separator as a comma.
We use the condition NR>1 to ignore the first line of the file, which contains the column names.
We use the sum and count variables to accumulate the sum and the number of ages. We use the += operator to add the value of the third field ($3) to sum, and the ++ operator to increment count.
We use the END block to display the final result. We use the / operator to calculate the average by dividing the sum by the count.

Likewise, if you want to calculate the minimum and maximum ages in the test.csv file, you can use the following program:

awk -F"," 'NR>1 {if (min=="") min=max=$3; if ($3<min) min=$3; if ($3>max) max=$3} END {print "Min: " min; print "Max: " max}' test.csv

Which gives:

Min: 25
Max: 32

Explanations:

We use the -F"," option to define the field separator as a comma.
We use the condition NR>1 to ignore the first line of the file, which contains the column names.
We use the min and max variables to store the minimum and maximum ages. We initialize these variables with the value of the third field ($3) if they are empty (""). We use the < and > operators to compare values and update the variables if necessary.
We use the END block to display the final result.
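The two programs above can be combined into a single pass over the data. The following is a sketch on made-up ages, with printf used to control the number of decimals; the variable names are illustrative.

```shell
stats=$(printf 'name,first name,age\nAlice,Dupont,25\nBob,Martin,32\nCharles,Durand,28\n' |
  awk -F, '
    NR == 1 { next }                 # skip the header row
    {
      age = $3 + 0                   # force a numeric value
      sum += age
      count++
      if (count == 1 || age < min) min = age
      if (count == 1 || age > max) max = age
    }
    END {
      if (count > 0)                 # avoid dividing by zero on empty input
        printf "count=%d sum=%d avg=%.2f min=%d max=%d\n", count, sum, sum / count, min, max
    }')
echo "$stats"
```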

Conclusion

The AWK command is an essential tool for manipulating text under Linux. It allows you to perform complex tasks in a few lines of code, such as extracting, filtering, modifying or calculating data. It offers great flexibility thanks to its patterns, actions, functions and variables. It can be combined with other Linux commands, such as sort, to expand its possibilities. If you want to learn more about the AWK command, you can consult the manual or the many tutorials available on the Internet.
