Bash: How to Calculate the Average of a Column


You can use the following awk syntax in Bash to calculate the average value of a column in a file:

awk '{ sum_values += $2 } END { print sum_values/NR }' team_data.txt

This particular example will return the average of the values in the second column of the file named team_data.txt.

Here is what each part of the command does:

  • For each line in the file, add the value in column 2 to a variable named sum_values.
  • After the last line has been read (the END block), print the sum of the values divided by the total number of records (NR).

To calculate the mean of a different column in the file, simply change $2 to a different value.

For example, use $4 to calculate the mean value of the fourth column in the file.
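If the column number changes often, one option is to pass it to awk as a variable instead of editing the program text. The sketch below assumes a small made-up file with four whitespace-separated columns; the variable name col and all of the data values are hypothetical and chosen only for illustration.

```shell
# Hypothetical sample data: name plus three numeric columns (values made up)
data_file=$(mktemp)
printf '%s\n' \
  'A 10 4 7' \
  'B 20 6 9' \
  'C 30 8 11' > "$data_file"

# -v col=N sets an awk variable; $col then refers to field number N on each line
avg_col2=$(awk -v col=2 '{ sum_values += $col } END { print sum_values/NR }' "$data_file")
avg_col4=$(awk -v col=4 '{ sum_values += $col } END { print sum_values/NR }' "$data_file")

echo "Column 2 average: $avg_col2"
echo "Column 4 average: $avg_col4"

rm -f "$data_file"
```

With this made-up input, the column 2 average is 20 and the column 4 average is 9.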

The following example shows how to use this syntax in practice.

Example: How to Calculate Average of Column Using Bash

Suppose we have a file named team_data.txt that contains information about various basketball teams, including:

  • The name of the team
  • The total points scored by the team
  • The total assists recorded by the team

First, we can use the cat command to view the contents of this file by running cat team_data.txt.
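A file with this layout can be recreated with the sketch below. The points values in the second column are the ones used in the manual check later in this example; the team names and the assist counts in the third column are placeholders assumed purely for illustration.

```shell
# Recreate a file with the three-column layout described above.
# Column 2 (points) uses the values from this example.
# Column 1 (team names) and column 3 (assists) are hypothetical placeholders.
cat > team_data.txt <<'EOF'
TeamA 78 22
TeamB 85 29
TeamC 93 31
TeamD 90 27
TeamE 91 30
TeamF 103 35
TeamG 99 33
TeamH 100 34
EOF

# View the contents of the file
cat team_data.txt
```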

Suppose that we would like to calculate the average value of the second column in the file, which represents the points scored by each team.

We can use the following syntax to do so:

awk '{ sum_values += $2 } END { print sum_values/NR }' team_data.txt

The following screenshot shows how to use this syntax in practice:

[Screenshot: Bash calculate average value of column]

This returns a value of 92.375, which represents the average value in the second column of the file.

We can verify this is correct by manually calculating the average:

  • Average: (78+85+93+90+91+103+99+100) / 8 = 92.375

This matches the value that we calculated.
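Dividing by NR assumes that every line in the file is a data row. If the file might contain blank lines, a more cautious variant is to count only the lines that actually have a second field. The sketch below is one way to do that; the team names are placeholders, the blank lines are added on purpose, and the variable name count is an assumption of this example.

```shell
# Sample input: the eight points values from this example, with two blank
# lines inserted on purpose. Team names are placeholders, not the real ones.
data_file=$(mktemp)
printf '%s\n' \
  'TeamA 78' 'TeamB 85' 'TeamC 93' '' \
  'TeamD 90' 'TeamE 91' 'TeamF 103' '' \
  'TeamG 99' 'TeamH 100' > "$data_file"

# Dividing by NR counts the blank lines too, so the result is too low
nr_avg=$(awk '{ sum_values += $2 } END { print sum_values/NR }' "$data_file")

# Counting only lines with at least two fields (NF >= 2) skips the blank lines;
# the count > 0 check avoids a division by zero on empty input
counted_avg=$(awk 'NF >= 2 { sum_values += $2; count++ }
                   END { if (count > 0) print sum_values/count }' "$data_file")

echo "Dividing by NR:    $nr_avg"
echo "Dividing by count: $counted_avg"

rm -f "$data_file"
```

With this input, dividing by NR gives 73.9 because NR is 10, while dividing by the count of data rows gives 92.375.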

If you would like to calculate the average value for a specific range of elements in a given column, you can also do so.

For example, you could use the following syntax to calculate the average value in the second column only for elements 2 through 5:

awk 'NR==2, NR==5 { sum_values += $2 } END { print sum_values/4}' team_data.txt

The following screenshot shows how to use this syntax in practice:

[Screenshot: Bash calculate average of specific elements in column]

This returns a value of 89.75, which represents the average of elements 2 through 5 in the second column of the file.

We can verify this is correct by manually calculating the average of these elements:

  • Average: (85+93+90+91) / 4 = 89.75

This matches the value that we calculated.

Note: Instead of dividing by NR (total number of records) we manually changed this value to 4, since we were calculating the average of four elements.
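A hard-coded divisor has to be updated by hand whenever the range changes. As a minimal sketch, the range bounds can be passed in as variables and the matching rows counted inside awk, so the divisor always agrees with the range. The variable names first, last, and count are assumptions of this example, and the team names are placeholders.

```shell
# Points values from this example; team names are placeholders
data_file=$(mktemp)
printf '%s\n' \
  'TeamA 78' 'TeamB 85' 'TeamC 93' 'TeamD 90' \
  'TeamE 91' 'TeamF 103' 'TeamG 99' 'TeamH 100' > "$data_file"

# Sum column 2 only for records first..last and count how many rows matched,
# then divide by that count instead of a hard-coded number
range_avg=$(awk -v first=2 -v last=5 '
  NR >= first && NR <= last { sum_values += $2; count++ }
  END { if (count > 0) print sum_values/count }
' "$data_file")

echo "Average of rows 2 through 5: $range_avg"

rm -f "$data_file"
```

For rows 2 through 5 this prints 89.75, the same result as dividing by 4 manually.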

Related Tutorials

The following tutorials explain how to perform other common tasks in Bash:

Bash: How to Count Number of Unique Lines in File
Bash: How to Count Number of Characters in String
Bash: How to Count Number of Columns in File
