Bash: How to Delete Duplicate Lines in File


Often you may want to use Bash to delete duplicate lines in a particular file.

You can use the following syntax with awk to do so:

awk '!seen[$0]++' city_names.txt

This particular example will print the contents of the text file named city_names.txt with all duplicate lines removed. The seen array keeps a count of each line, so awk only prints a line the first time it appears. Note that by default this command writes the result to the terminal and does not modify the file itself.
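If you'd like to save the de-duplicated output rather than just display it, you can redirect it to a new file (the output file name used here is just an example):

awk '!seen[$0]++' city_names.txt > city_names_unique.txt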

Note that if you’d like to modify the file in-place, then you can use the -i inplace option (supported in GNU Awk version 4.1 and later) as follows:

awk -i inplace '!seen[$0]++' city_names.txt

This will delete duplicate lines in the city_names.txt file in-place.
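If your version of awk does not support the -i inplace option, a portable alternative is to write the output to a temporary file and then replace the original file (the temporary file name used here is just an example):

awk '!seen[$0]++' city_names.txt > city_names.tmp && mv city_names.tmp city_names.txt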

The following example shows how to use this syntax in practice.

Example: Use Bash to Delete Duplicate Lines in File

Suppose that we have a file named city_names.txt that contains the names of various U.S. cities.

We can use the cat command to view the contents of this file:
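cat city_names.txt

For the purposes of this example, suppose the output looks like this, with Atlanta and Dallas each appearing on more than one line (the exact contents shown here are illustrative):

Atlanta
Boston
Dallas
Houston
Atlanta
Chicago
Dallas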

To view which lines in the file have duplicates, we can first use the following command (sorting the file first is necessary because uniq -d only detects duplicate lines that are adjacent):

sort city_names.txt | uniq -d

Assuming the illustrative file contents shown above, running this command produces the following output:
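Atlanta
Dallas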

This tells us that the following city names occur on more than one line in the file:

  • Atlanta
  • Dallas

We can then use the following syntax to delete duplicate lines in the file:

awk -i inplace '!seen[$0]++' city_names.txt

After running this command, we can use the cat command once more to view the contents of the file:
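cat city_names.txt

Assuming the illustrative contents from earlier, the output now looks like this:

Atlanta
Boston
Dallas
Houston
Chicago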

Notice that all of the duplicate lines have been deleted and there is now only one occurrence of both Dallas and Atlanta.

Related Tutorials

The following tutorials explain how to perform other common tasks in Bash:

Bash: How to List All Files in Folder by Size
Bash: How to List All Files in Folder Without Extension
Bash: How to List All Files Matching a Specific Pattern
