
The AWK command is a text-processing utility available on virtually every Linux and Unix system. It reads input line by line, splits each line into fields, and applies pattern-action rules you define to filter, transform, or report on that data. Named after its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan, AWK was written in 1977 and remains one of the most reliable tools in a Linux administrator’s toolkit.
AWK is especially useful when you need to extract columns from structured text, parse log files, process CSV data, or produce quick reports directly in a shell pipeline, without writing a full script.
In this tutorial, you will learn the AWK syntax, work through practical examples covering pattern matching, field separators, and built-in variables, and see how AWK compares to other text tools like grep and sed.
Key Takeaways:
- AWK splits each input line into fields ($1, $2, …, $NF) using a field separator you specify.
- The basic syntax is awk 'pattern { action }' filename. Omitting the pattern applies the action to every input line.
- Use the -F flag to set the field separator (for example, -F, for CSV, -F: for colon-delimited files).
- Built-in variables include NR (current line number), NF (number of fields in the current line), FS (input field separator), OFS (output field separator), RS (input record separator), and ORS (output record separator).
- The BEGIN block runs before any lines are processed and is ideal for initializing variables or printing headers.
- The END block runs after all lines are processed, which makes it a good place to print summaries, totals, or final calculations.
- For simple searches use grep, for in-place edits use sed, but for anything involving fields, column math, or custom formatted output, AWK is usually the best tool.
To follow this tutorial, you will need:
- A Linux system with AWK installed. Many distributions include GNU AWK (gawk) by default. You can confirm with awk --version.
AWK is a pattern-scanning and text-processing language. You give it a set of rules in the form pattern { action }, and it applies those rules to every line of its input. When a line matches the pattern, AWK runs the action. When there is no pattern, the action runs on every line.
GNU AWK (gawk) is the most widely used implementation today and is the default awk on many Linux distributions, although some, such as Debian and Ubuntu, ship mawk by default. It extends the POSIX AWK standard with additional features such as better Unicode support and advanced string functions. You can verify which version you have with:
awk --version
The output on a typical Linux system looks like:
GNU Awk 5.2.1, API 3.2, PMA Avon 8-g1, (GNU MPFR 4.2.1, GNU MP 6.3.0)
grep, sed, and AWK all process text, but each has a different focus. grep searches for lines that match a pattern and prints them. sed applies substitution and editing operations to a stream of lines. AWK does both and goes further: it understands fields, supports arithmetic, and can produce formatted reports.
A quick comparison:
| Tool | Best used for | Field-aware | Arithmetic | Multiline logic |
|---|---|---|---|---|
| grep | Finding lines that match a pattern | No | No | No |
| sed | Search and replace, line editing | No | No | Limited |
| awk | Column extraction, math, reporting | Yes | Yes | Yes |
For more detail on each tool, see the DigitalOcean tutorials on the grep command and the sed command.
Working with structured text files on Unix or Linux often means repeating the same tedious parsing steps. AWK offers a streamlined approach: you specify the patterns and actions, and let the tool handle the rest.
The basic AWK syntax is:
awk options 'pattern { action }' input-file
You can also redirect output to a file:
awk options 'pattern { action }' input-file > output-file
Common options include:
- -F sets the field separator (for example, -F: for colon-delimited files).
- -f reads AWK rules from a script file instead of the command line.
- -v assigns a variable before processing begins.
Every AWK rule has two parts: a pattern and an action block wrapped in curly braces. Either part can be omitted, but not both:
# Print every line (no pattern, so the action runs on every line)
awk '{ print }' file.txt
# Print only lines that match a pattern (no action, so the default action, print, applies)
awk '/error/' file.txt
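The -v option listed above is easiest to understand with a small example. The following is a minimal sketch: the variable name limit and the inline sample data are assumptions chosen for illustration. The pattern compares the second field against the value passed in from the shell.

```shell
# Pass a shell value into AWK with -v; "limit" is a hypothetical variable name.
result=$(printf 'Phone 999\nWatch 299\nCamera 799\n' |
  awk -v limit=500 '$2 > limit { print $1 }')
echo "$result"
```

Because the threshold lives outside the AWK program, the same rule can be reused with a different limit without editing the quoted program text.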
AWK is handy for working with structured text, especially when you need to select, transform, or summarize columns quickly. Before diving into more advanced examples, let’s look at the basic way to run AWK commands in Linux.
The examples in this tutorial use a file called file.txt. Create it now so you can follow along:
cat > file.txt << 'EOF'
Item Model Country Cost
Phone iPhone USA 999
Laptop MacBook USA 1299
Watch Galaxy Korea 299
Tablet iPad USA 499
Camera Sony Japan 799
EOF
The file has four columns: Item, Model, Country, and Cost.
Pass the filename as the last argument:
awk '{ print $0 }' file.txt
$0 refers to the entire line. The output is:
Item Model Country Cost
Phone iPhone USA 999
Laptop MacBook USA 1299
Watch Galaxy Korea 299
Tablet iPad USA 499
Camera Sony Japan 799
You can pipe input directly into AWK instead of reading from a file:
echo "Alice Engineering 95000" | awk '{ print $1, "works in", $2 }'
The output is:
Alice works in Engineering
For longer or reusable AWK programs, save your rules to a file and pass it with the -f flag:
# Save rules to a script file
echo '{ print $1, $4 }' > print_item_cost.awk
# Run it
awk -f print_item_cost.awk file.txt
The output is:
Item Cost
Phone 999
Laptop 1299
Watch 299
Tablet 499
Camera 799
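A script file can also hold several rules, one per line, along with comments. The sketch below assumes a hypothetical script name, report.awk, and recreates a short sample of the data in a temporary directory so that it runs on its own.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
printf 'Item Model Country Cost\nPhone iPhone USA 999\nWatch Galaxy Korea 299\n' > "$workdir/sample.txt"

# report.awk is a hypothetical script name; the quoted heredoc keeps $1 and $4 literal.
cat > "$workdir/report.awk" << 'EOF'
# Skip the header row, then print the item and its cost.
NR > 1 { print $1, $4 }
# After the last line, report how many data rows were seen.
END    { print "Rows:", NR - 1 }
EOF

result=$(awk -f "$workdir/report.awk" "$workdir/sample.txt")
echo "$result"
rm -r "$workdir"
```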
AWK splits each input line into fields based on a separator. By default, that separator is any whitespace (spaces or tabs).
Fields are numbered starting from $1. $0 always refers to the entire line.
To print the second and third columns:
awk '{ print $2 "\t" $3 }' file.txt
The output is:
Model Country
iPhone USA
MacBook USA
Galaxy Korea
iPad USA
Sony Japan
Use the -F flag when your data uses a delimiter other than whitespace. This is common with CSV files or system files like /etc/passwd, which uses colons.
# Print the first field (username) from /etc/passwd
awk -F: '{ print $1 }' /etc/passwd | head -5
The output is:
root
daemon
bin
sys
sync
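The separator given to -F can also be a regular expression, which helps when a file mixes delimiters. A minimal sketch, using made-up inline data in which fields are separated by either a comma or a semicolon:

```shell
# A bracket expression makes -F treat "," and ";" as equivalent separators.
result=$(printf 'alice,dev;london\nbob;ops,paris\n' | awk -F'[,;]' '{ print $1, $3 }')
echo "$result"
```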
$NF always refers to the last field, regardless of how many fields a line has: NF holds the number of fields in the current line, so $NF is the field at that position.
# Print the first and last fields
awk '{ print $1, $NF }' file.txt
The output is:
Item Cost
Phone 999
Laptop 1299
Watch 299
Tablet 499
Camera 799
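Because NF is an ordinary number, it can take part in arithmetic inside a field reference. As a small illustration on an inline copy of two sample rows, $(NF-1) selects the second-to-last field:

```shell
# $(NF-1) is the field just before the last one; the parentheses are required.
result=$(printf 'Phone iPhone USA 999\nWatch Galaxy Korea 299\n' | awk '{ print $(NF-1), $NF }')
echo "$result"
```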
AWK provides several built-in variables that give you information about the current input or let you control output formatting.
| Variable | Description |
|---|---|
NR |
The current record (line) number, counting from 1 |
NF |
The number of fields in the current line |
FS |
The input field separator (default: whitespace) |
RS |
The input record separator (default: newline) |
OFS |
The output field separator (default: space) |
ORS |
The output record separator (default: newline) |
Print each line with its line number using NR:
awk '{ print NR, $0 }' file.txt
The output is:
1 Item Model Country Cost
2 Phone iPhone USA 999
3 Laptop MacBook USA 1299
4 Watch Galaxy Korea 299
5 Tablet iPad USA 499
6 Camera Sony Japan 799
Use OFS to change the output delimiter. The following example outputs comma-separated values:
awk 'BEGIN { OFS="," } { print $1, $2, $3 }' file.txt
The output is:
Item,Model,Country
Phone,iPhone,USA
Laptop,MacBook,USA
Watch,Galaxy,Korea
Tablet,iPad,USA
Camera,Sony,Japan
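OFS is applied when print receives comma-separated arguments, as above, or when AWK rebuilds the record after a field assignment. A common idiom, sketched here on an inline copy of two sample rows, assigns a field to itself to force that rebuild, so that print with no arguments emits the whole line with the new separator:

```shell
# $1 = $1 changes nothing in the data but makes AWK rejoin the fields with OFS.
result=$(printf 'Phone iPhone USA 999\nWatch Galaxy Korea 299\n' |
  awk 'BEGIN { OFS = "," } { $1 = $1; print }')
echo "$result"
```

Without the assignment, print would output the original line unchanged, because $0 is only rejoined when a field is modified.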
Print the number of fields in each line using NF:
awk '{ print NF, $0 }' file.txt
The output is:
4 Item Model Country Cost
4 Phone iPhone USA 999
4 Laptop MacBook USA 1299
4 Watch Galaxy Korea 299
4 Tablet iPad USA 499
4 Camera Sony Japan 799
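The RS variable from the table is useful when a record spans several lines. One such case, sketched here with invented inline data, is input where records are separated by blank lines. Setting RS to an empty string selects this paragraph mode, and setting FS to a newline makes each line of a record its own field:

```shell
# RS = "" switches to paragraph mode; FS = "\n" makes each line one field.
result=$(printf 'Alice\nEngineering\n\nBob\nMarketing\n' |
  awk 'BEGIN { RS = ""; FS = "\n" } { print NR ": " $1 " (" $2 ")" }')
echo "$result"
```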
Let’s look at some common ways you can match and process text using AWK pattern matching.
To print all lines that contain the letter o:
awk '/o/ { print $0 }' file.txt
The output is:
Item Model Country Cost
Phone iPhone USA 999
Laptop MacBook USA 1299
Watch Galaxy Korea 299
Camera Sony Japan 799
To count how many lines match a pattern, use a counter and an END block:
awk '/a/ { ++cnt } END { print "Count =", cnt }' file.txt
The output is:
Count = 4
AWK supports full regular expressions inside the /pattern/ syntax. To match lines that start with an uppercase letter followed by a lowercase letter:
awk '/^[A-Z][a-z]/ { print $0 }' file.txt
Every line in file.txt matches, including the header, so the output is:
Item Model Country Cost
Phone iPhone USA 999
Laptop MacBook USA 1299
Watch Galaxy Korea 299
Tablet iPad USA 499
Camera Sony Japan 799
To match a specific field rather than the entire line, use a comparison instead of a regex:
# Print lines where Country (field 3) is USA
awk '$3 == "USA" { print $0 }' file.txt
The output is:
Phone iPhone USA 999
Laptop MacBook USA 1299
Tablet iPad USA 499
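To apply a regular expression to a single field instead of the whole line, use the ~ match operator (or !~ to negate it). A minimal sketch on an inline copy of three sample rows:

```shell
# $2 ~ /^i/ is true only when the second field starts with a lowercase "i".
result=$(printf 'Phone iPhone USA 999\nLaptop MacBook USA 1299\nTablet iPad USA 499\n' |
  awk '$2 ~ /^i/ { print $1, $2 }')
echo "$result"
```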
You can also filter by numeric value:
# Print lines where Cost (field 4) is greater than 500
awk '$4 > 500 { print $0 }' file.txt
The output is:
Item Model Country Cost
Phone iPhone USA 999
Laptop MacBook USA 1299
Camera Sony Japan 799
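The header row appears in this output because Cost is not a number, so AWK falls back to a string comparison, and in a typical locale the string "Cost" sorts after "500". To keep the header out of a numeric filter, combine the comparison with NR > 1. A sketch on an inline copy of part of the sample data:

```shell
# NR > 1 skips the first line, so only data rows reach the numeric comparison.
result=$(printf 'Item Model Country Cost\nPhone iPhone USA 999\nWatch Galaxy Korea 299\nCamera Sony Japan 799\n' |
  awk 'NR > 1 && $4 > 500 { print $1, $4 }')
echo "$result"
```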
To print only lines with more than 20 characters, use the built-in length function:
awk 'length($0) > 20' file.txt
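To see which lines pass that test, it can help to print each line's length next to it. A small sketch on an inline copy of three rows from the sample file:

```shell
# length($0) counts the characters in the whole line.
result=$(printf 'Phone iPhone USA 999\nLaptop MacBook USA 1299\nTablet iPad USA 499\n' |
  awk '{ print length($0), $0 }')
echo "$result"
```

Of these three rows, only the Laptop row is longer than 20 characters; the Phone row is exactly 20, so it does not pass a strict greater-than test.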
AWK gives you more than line-by-line filtering. The BEGIN and END blocks let you set things up before any input arrives and wrap up with summary actions after everything’s processed.
BEGIN and END are special patterns in AWK:
- The BEGIN block runs once before AWK reads any input. Use it to print headers, set variables, or configure separators.
- The END block runs once after AWK has finished reading all input. Use it to print summaries or totals.
For example, to print a custom header before the data rows:
awk 'BEGIN { print "Item\tCost"; print "----\t----" } NR > 1 { print $1 "\t" $4 }' file.txt
The output is:
Item Cost
---- ----
Phone 999
Laptop 1299
Watch 299
Tablet 499
Camera 799
awk 'BEGIN { print "Starting AWK processing..." } { print $1 } END { print "Done. Processed " NR " lines." }' file.txt
The output is:
Starting AWK processing...
Item
Phone
Laptop
Watch
Tablet
Camera
Done. Processed 6 lines.
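For aligned columns, printf gives finer control than print. The following is a minimal sketch that combines BEGIN, a main rule, and END on an inline copy of part of the sample data; the column widths are arbitrary choices made for this illustration.

```shell
# %-8s left-aligns text in 8 columns; %6d right-aligns an integer in 6 columns.
result=$(printf 'Item Model Country Cost\nPhone iPhone USA 999\nLaptop MacBook USA 1299\n' |
  awk 'BEGIN  { printf "%-8s %6s\n", "Item", "Cost" }
       NR > 1 { printf "%-8s %6d\n", $1, $4; total += $4 }
       END    { printf "%-8s %6d\n", "Total", total }')
echo "$result"
```

Unlike print, printf does not append a newline, so each format string ends with \n.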
Sometimes data processing requires more than basic matching. When a task calls for branching on a value or repeating an operation across fields, AWK’s control-flow features, such as conditionals and loops, let you express that logic inside an action block.
AWK supports if, if-else, and nested conditionals inside action blocks.
# Label items as expensive or affordable based on Cost
awk 'NR > 1 { if ($4 > 500) print $1, "is expensive"; else print $1, "is affordable" }' file.txt
The output is:
Phone is expensive
Laptop is expensive
Watch is affordable
Tablet is affordable
Camera is expensive
You can combine multiple conditions using && (and) or || (or):
# Print USA items that cost more than 500
awk '$3 == "USA" && $4 > 500 { print $0 }' file.txt
The output is:
Phone iPhone USA 999
Laptop MacBook USA 1299
AWK supports for and while loops. This is useful when you want to iterate over fields in a line:
# Print each field on its own line, with its field number
awk 'NR == 2 { for (i = 1; i <= NF; i++) print "Field " i ": " $i }' file.txt
The output is:
Field 1: Phone
Field 2: iPhone
Field 3: USA
Field 4: 999
A while loop works similarly:
awk 'BEGIN { i = 1; while (i <= 5) { print "Line:", i; i++ } }'
The output is:
Line: 1
Line: 2
Line: 3
Line: 4
Line: 5
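Loops over fields are also handy for per-line arithmetic. As an illustration with made-up inline numbers, this sketch adds up every field on each line:

```shell
# Reset sum for each line, then add every field from 1 through NF.
result=$(printf '1 2 3\n10 20\n' |
  awk '{ sum = 0; for (i = 1; i <= NF; i++) sum += $i; print "Line " NR " sum:", sum }')
echo "$result"
```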
Let’s take a look at some practical examples to see how AWK can help with common data tasks right in your terminal.
The /etc/passwd file uses colons as delimiters and stores one user per line. AWK with -F: makes it easy to extract specific fields.
# Print the username and default shell for each user
awk -F: '{ print $1, $7 }' /etc/passwd | head -5
The output (on a typical Ubuntu system) is:
root /bin/bash
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
To print only users whose shell is /bin/bash:
awk -F: '$7 == "/bin/bash" { print $1 }' /etc/passwd
Create a sample CSV file to work with:
cat > employees.csv << 'EOF'
Name,Department,Salary
Alice,Engineering,95000
Bob,Marketing,72000
Charlie,Engineering,88000
Diana,HR,65000
Eve,Marketing,78000
EOF
Print each employee’s name and department, skipping the header row:
awk -F, 'NR > 1 { print $1, "works in", $2 }' employees.csv
The output is:
Alice works in Engineering
Bob works in Marketing
Charlie works in Engineering
Diana works in HR
Eve works in Marketing
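The same separator works with comparisons on individual columns. A minimal sketch, using an inline copy of part of the sample CSV, that keeps only the Engineering rows. Note that -F, is only safe for simple CSV like this; quoted fields that contain commas would be split incorrectly.

```shell
# Keep rows whose second comma-separated field is exactly "Engineering".
result=$(printf 'Name,Department,Salary\nAlice,Engineering,95000\nBob,Marketing,72000\nCharlie,Engineering,88000\n' |
  awk -F, '$2 == "Engineering" { print $1, $3 }')
echo "$result"
```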
Create a sample log file:
cat > access.log << 'EOF'
192.168.1.1 GET /index.html 200 1024
192.168.1.2 POST /login 404 512
192.168.1.3 GET /about.html 200 2048
192.168.1.1 GET /contact.html 500 256
192.168.1.4 GET /index.html 200 1024
EOF
Print all IP addresses that returned a 200 status code:
awk '$4 == 200 { print $1 }' access.log
The output is:
192.168.1.1
192.168.1.3
192.168.1.4
Count the number of 404 errors:
awk '$4 == 404 { count++ } END { print "404 errors:", count }' access.log
The output is:
404 errors: 1
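Numeric columns in a log can be summed the same way. This sketch recreates three lines of the sample log inline and totals the fifth column for successful requests. Treating that column as the response size in bytes is an assumption; the sample log does not label its columns.

```shell
# Add up column 5 only for lines whose status (column 4) is 200.
result=$(printf '192.168.1.1 GET /index.html 200 1024\n192.168.1.2 POST /login 404 512\n192.168.1.3 GET /about.html 200 2048\n' |
  awk '$4 == 200 { bytes += $5 } END { print "Bytes for 200 responses:", bytes }')
echo "$result"
```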
To add up all values in the Cost column of file.txt (skipping the header row):
awk 'NR > 1 { sum += $4 } END { print "Total cost:", sum }' file.txt
The output is:
Total cost: 3895
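To compute an average, keep a count alongside the sum and divide in the END block. This example averages the Salary column of employees.csv: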
awk -F, 'NR > 1 { sum += $3; count++ } END { print "Average salary:", sum / count }' employees.csv
The output is:
Average salary: 79600
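If the file contains only a header, count stays at zero and the division fails. A defensive variant, sketched here on a header-only inline input, checks the count before dividing:

```shell
# With no data rows, count is unset (treated as 0), so the else branch runs.
result=$(printf 'Name,Department,Salary\n' |
  awk -F, 'NR > 1 { sum += $3; count++ }
           END { if (count > 0) print "Average salary:", sum / count
                 else print "No data rows" }')
echo "$result"
```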
If you want to save the output of your AWK command, use the > redirection operator:
awk '/a/ { print $3 "\t" $4 }' file.txt > output.txt
Verify the result with cat:
cat output.txt
The output is:
USA 1299
Korea 299
USA 499
Japan 799
| Scenario | Best tool |
|---|---|
| Search for lines containing a pattern | grep |
| Find and replace text across a file | sed |
| Extract a specific column from structured text | awk |
| Perform arithmetic on column values | awk |
| Produce formatted reports or summaries | awk |
| Simple one-line substitutions in a script | sed |
| Filter log lines by a fixed string | grep |
| Parse CSV or colon-delimited files | awk |
Use grep when you just need to know whether a pattern exists and want matching lines returned. Use sed when your goal is to substitute or delete text. Reach for awk when you need to work with fields, run comparisons, or produce output that depends on values across multiple columns.
For deeper reading, see the DigitalOcean tutorials on the grep command and the sed command.
AWK is a powerful text-processing utility available on Unix and Linux systems. It is designed for pattern scanning and reporting, allowing users to process files and streams line by line. For each input line, AWK automatically splits the line into fields based on a delimiter (by default, whitespace) and lets you apply user-defined rules or operations—called pattern-action pairs—that can filter records, extract columns, perform arithmetic, generate reports, and more. AWK’s ability to work directly with columns makes it particularly useful for managing structured or delimited data such as logs, CSV, or configuration files.
To run AWK, you typically use the syntax:
awk 'pattern { action }' filename
This means that for each line in the file, AWK evaluates the pattern and, if it’s matched (or omitted), executes the associated action. You can also stream data into AWK by piping output from another command, as in command | awk '{ action }', making it easy to integrate AWK into shell pipelines. AWK can accept multiple input files at once, and you can use command-line options like -F to specify field separators or -f to load more complex AWK programs from an external file. This flexibility makes AWK useful in many automation and text processing scenarios.
What does awk '{ print $1 }' do?
The command awk '{ print $1 }' prints the first field (column) from every input line. AWK divides each line into fields using whitespace as the default delimiter, so $1 refers to the first separated word or value.
For example, given the line Name Age Country, the command prints only Name. This is commonly used to quickly extract the first column from structured data.
What does awk '{ print $2 }' do?
awk '{ print $2 }' extracts and prints the second field from each line of input. For instance, given the line John 30 Engineer, the command prints 30, since $2 identifies the second field. The same approach works for any column of tabular or space-separated data.
The -F flag in AWK sets the field separator, which controls how AWK splits each line into fields. By default, AWK uses whitespace, but you can use -F to specify any delimiter character or regular expression. For example, awk -F: '{ print $1 }' /etc/passwd instructs AWK to use a colon (:) as the field separator, which is useful for files like /etc/passwd where fields are colon-delimited. This makes AWK extremely flexible for working with different types of structured data.
NR and NF are two of AWK’s most important built-in variables. NR stands for “Number of Records” and contains the current line (record) number, incrementing as AWK processes each line of input. This is useful for numbering output lines or performing actions only on specific lines.
NF stands for “Number of Fields” and stores the number of fields (columns) in the current line, which is especially useful when lines have a variable number of columns, or you want to access the last field using $NF. Both variables play a key role in filtering, formatting, and controlling AWK’s flow.
gawk is the GNU implementation of the AWK language, and it is the default AWK interpreter on most modern Linux distributions. While traditional AWK conforms to the POSIX standard, gawk extends the language with additional features, such as better Unicode and multibyte character support, more advanced string and arithmetic functions, built-in networking, and enhanced command-line options. Most scripts written for standard AWK will work in gawk, but gawk also enables more complex text-processing scenarios thanks to its extra capabilities.
AWK is ideal for quick, one-liner tasks involving field extraction, column-based filtering, or inline text manipulation—especially within shell pipelines or when dealing with structured tabular data. If your task requires advanced programming constructs, complex data manipulation, or integration with other systems, Python might be a better choice due to its rich libraries and maintainability for larger scripts. On the other hand, sed excels at simple, line-by-line substitutions or edits within streams or text files. In summary, use AWK for fast, field-aware processing; Python for complex scripting; and sed for simple text replacements.
AWK is a straightforward but powerful tool for text processing on Linux and Unix systems. Whether you need to pull a specific column from a file, sum up values, parse a log, or decide whether AWK, grep, or sed is the right tool for a job, the examples in this tutorial give you a solid foundation to work from.