From LQWiki

Uniq is a command for removing successive duplicate lines from files.

Usage: uniq [options] [input_file] [output_file]

Unusually, uniq accepts both an input file and an output file as arguments, so no redirection is required. A "-" in either position stands for the corresponding standard stream. For instance,

uniq - foo

will read stdin and store output in file "foo" while

uniq foo -

will read from file "foo" and display on stdout.
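As a minimal sketch of the argument forms described here, the following can be tried in a scratch directory. The temporary directory and the file names in.txt, out.txt and out2.txt are illustrative assumptions, not part of uniq itself.

```shell
# Scratch directory and file names are hypothetical, chosen for illustration.
tmpdir=$(mktemp -d)
printf 'foo\nfoo\nbar\n' > "$tmpdir/in.txt"

# Both arguments given: read in.txt and write out.txt, with no redirection.
uniq "$tmpdir/in.txt" "$tmpdir/out.txt"

# "-" as the input argument: read stdin and write out2.txt.
printf 'foo\nfoo\nbar\n' | uniq - "$tmpdir/out2.txt"

# Only the input argument given: the result goes to stdout.
result=$(uniq "$tmpdir/in.txt")

cat "$tmpdir/out.txt"
```

Note that the output file is overwritten without warning, so a mistyped second argument can destroy an existing file.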

It is usual to pipe data from sort to uniq, since uniq only compares adjacent lines. For example, the input

foo
foo
baz
bar
baz
Baz

will be processed as

foo
baz
bar
baz
Baz

because the input is unsorted and the comparison is case-sensitive: only the adjacent pair of "foo" lines is collapsed.
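The difference that sorting makes can be sketched as follows, using the same six sample lines. LC_ALL=C is an assumption added here to pin byte-order collation so that the sorted result is reproducible; the variable names are illustrative.

```shell
# The six sample lines from the example above.
input=$(printf 'foo\nfoo\nbaz\nbar\nbaz\nBaz')

# uniq alone: only the adjacent "foo" pair is collapsed.
unsorted=$(printf '%s\n' "$input" | uniq)

# sort first so that equal lines become adjacent, then uniq removes them all.
# LC_ALL=C forces byte-order collation, where upper case sorts before lower case.
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq)

printf '%s\n---\n%s\n' "$unsorted" "$sorted"
```

In the sorted case "Baz" and "baz" still count as different lines, because the comparison remains case-sensitive.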

Among the many options (see the man pages for full details), -c is useful for counting how many copies of each line there are, and -i makes the matching case-insensitive. With the sample lines above saved in a file named foobarbaz:

$ sort -f foobarbaz | uniq -ci | sort -r
      3 Baz
      2 foo
      1 bar

(Note that to get case-insensitive counts, sort must be made case-insensitive as well with -f, because otherwise, in byte-order collation such as the C locale, sort would place 'Baz' before 'bar' and so separate it from the 'baz' lines. Alternatively, tr or sed could avoid the issue by converting every line to a single case before sorting. Indeed, if counting is not needed and no other options are wanted, sort -fu foobarbaz eliminates the need for piping to uniq at all.)
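The two alternatives mentioned in this note might look like the sketch below. The awk step is an addition made here only to normalise the padding that uniq -c puts in front of each count, and LC_ALL=C is again an assumption for reproducible ordering.

```shell
# The six sample lines used throughout this page.
input=$(printf 'foo\nfoo\nbaz\nbar\nbaz\nBaz')

# Fold everything to lower case first, so sort and uniq need no case options.
# awk reprints "count line" with a single space instead of the padded count.
folded=$(printf '%s\n' "$input" | tr '[:upper:]' '[:lower:]' \
  | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')

# sort -fu alone: one line per case-insensitive group, without counts.
deduped=$(printf '%s\n' "$input" | LC_ALL=C sort -fu)

printf '%s\n---\n%s\n' "$folded" "$deduped"
```

The tr approach changes the text of the lines themselves, whereas sort -f and uniq -i leave the surviving lines in their original case.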

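The -s, -f and -w options give more flexibility when comparing columnar lines, by ignoring the beginnings or endings of lines: -f skips a number of leading fields, -s skips a number of leading characters, and -w limits the comparison to a given number of characters.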
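A short sketch of each of these three options follows, assuming GNU uniq (the -w option in particular may be missing from other implementations). The sample lines are made up for illustration.

```shell
# -f 1: skip the first whitespace-separated field (here a time column).
by_field=$(printf '09:01 login\n09:02 login\n09:03 logout\n' | uniq -f 1)

# -s 3: skip the first three characters of each line before comparing.
by_chars=$(printf 'a: same\nb: same\nc: diff\n' | uniq -s 3)

# -w 3: compare no more than the first three characters of each line.
by_width=$(printf 'foo-1\nfoo-2\nbar-1\n' | uniq -w 3)

printf '%s\n---\n%s\n---\n%s\n' "$by_field" "$by_chars" "$by_width"
```

In each case uniq prints the first line of a matching group in full, including the part that was ignored during comparison.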

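The -d, -D, and -u options control which lines are output: -d prints only lines that are repeated (one per group), -D prints every line of each repeated group, and -u prints only lines that are not repeated.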
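The effect of these output options can be compared on one already sorted input, as in the sketch below. It assumes GNU uniq, since -D is not available in every implementation; the sample data is illustrative.

```shell
# Already sorted sample: one "bar", two "baz", three "foo".
input=$(printf 'bar\nbaz\nbaz\nfoo\nfoo\nfoo')

# -d: one line for each group that occurs more than once.
dups=$(printf '%s\n' "$input" | uniq -d)

# -D: every line belonging to a group that occurs more than once.
all_dups=$(printf '%s\n' "$input" | uniq -D)

# -u: only the lines that occur exactly once.
singles=$(printf '%s\n' "$input" | uniq -u)

printf '%s\n---\n%s\n---\n%s\n' "$dups" "$all_dups" "$singles"
```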

