Write Your Own uniq Tool
This challenge is to build your own version of the Unix command line tool uniq.
I’m a big fan of the Unix Philosophy and the command line tools it has spawned. By chaining them together you can create complex and powerful software from simple building blocks. That’s a great model to follow for the functions and modules in the software you write and the components you use in the systems you build as a software developer.
The Challenge - Building uniq
Let’s check out the man page on uniq to find out what it does:
The uniq utility reads the specified input_file comparing adjacent lines, and writes a copy of each unique input line to the output_file. If input_file is a single dash (‘-’) or absent, the standard input is read. If output_file is absent, standard output is used for output. The second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are not adjacent, so it may be necessary to sort the files first.
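To make the adjacency rule in that description concrete, here is a small Python illustration. It is only a sketch: the helper name collapse_adjacent is made up for this example, and it leans on itertools.groupby from the standard library rather than anything uniq itself does internally.

```python
from itertools import groupby


def collapse_adjacent(lines):
    """Keep one copy of each run of identical adjacent lines."""
    # groupby starts a new group every time the value changes,
    # so each group corresponds to one run of adjacent duplicates.
    return [line for line, _run in groupby(lines)]


lines = ["b", "a", "a", "b"]
print(collapse_adjacent(lines))          # ['b', 'a', 'b']
print(collapse_adjacent(sorted(lines)))  # ['a', 'b']
```

The two "b" lines survive in the first call because they are not next to each other, which is why the man page suggests sorting first.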
Like most programming languages, this challenge is zero indexed, so we begin with step zero!
For this step, I’ll leave you to set up your IDE / editor of choice and programming language of choice. After that, here’s what I’d like you to do to be ready to test your solution.
Download the test data from my Dropbox here and unzip it.
In this step your goal is to read an input file and handle the default behaviour, which is to remove duplicate adjacent lines.
For example, given a file test.txt containing:
line1
line2
line2
line3
line4
your program should output:
% uniq test.txt
line1
line2
line3
line4
You can test your program using the above or on the countries list:
% uniq countries.txt | wc -l
246
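If you want a starting point for the default behaviour, one possible approach is to remember only the previous line and skip any line that matches it. The sketch below is in Python; the function names unique_adjacent and uniq_file are invented for this example, and it assumes the input is UTF-8 text.

```python
def unique_adjacent(lines):
    """Yield each line unless it equals the line immediately before it."""
    previous = None  # no real line is None, so the first line always passes
    for line in lines:
        if line != previous:
            yield line
        previous = line


def uniq_file(path):
    """Return the lines of the file at `path` with adjacent duplicates removed."""
    with open(path, encoding="utf-8") as handle:
        # Strip the newline so a final line without one still compares equal.
        stripped = (line.rstrip("\n") for line in handle)
        return list(unique_adjacent(stripped))


sample = ["line1", "line2", "line2", "line3", "line4"]
print(list(unique_adjacent(sample)))  # ['line1', 'line2', 'line3', 'line4']
```

Because it only keeps one line in memory at a time, this approach works on input of any size.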
In this step your goal is to read from standard input and write out to a file. When the input argument is - you should read from standard input. When an output file is specified you should write to that rather than standard output.
For example, read from standard input and write to standard out:
% cat test.txt | uniq -
line1
line2
line3
line4
Read from standard input and write to a file:
% cat test.txt | uniq - out.txt
% cat out.txt
line1
line2
line3
line4
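One way to handle the choice between files and the standard streams is to hide it behind two small helpers, so the de-duplication loop never needs to know where its data comes from. This is a Python sketch under assumptions: open_input, open_output and copy_unique are hypothetical names, and the files are assumed to be UTF-8 text.

```python
import sys
from contextlib import contextmanager


@contextmanager
def open_input(path=None):
    """Yield standard input when path is None or '-', otherwise the named file."""
    if path is None or path == "-":
        yield sys.stdin
    else:
        with open(path, encoding="utf-8") as handle:
            yield handle


@contextmanager
def open_output(path=None):
    """Yield standard output when no path is given, otherwise the named file."""
    if path is None:
        yield sys.stdout
    else:
        with open(path, "w", encoding="utf-8") as handle:
            yield handle


def copy_unique(input_path=None, output_path=None):
    """Copy input to output, dropping adjacent duplicate lines."""
    with open_input(input_path) as source, open_output(output_path) as sink:
        previous = None
        for line in source:
            line = line.rstrip("\n")
            if line != previous:
                sink.write(line + "\n")
            previous = line
```

With these helpers, copy_unique("-") corresponds to the first example above and copy_unique("-", "out.txt") to the second.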
In this step your goal is to support the count option: -c or --count. This adds a new first column that contains the number of times each line occurs consecutively in the input.
% uniq -c test.txt
   1 line1
   2 line2
   1 line3
   1 line4
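Counting means you can no longer print a line the moment you first see it, because you don't know its count until the run ends. A possible Python sketch follows. The name count_runs is hypothetical, and the field width of 4 in the print call is an assumption, so check what the uniq on your system prints.

```python
def count_runs(lines):
    """Yield (count, line) for each run of identical adjacent lines."""
    count = 0
    previous = None
    for line in lines:
        if count and line == previous:
            count += 1  # still inside the current run
        else:
            if count:
                yield count, previous  # the previous run has just ended
            previous, count = line, 1
    if count:
        yield count, previous  # flush the final run


sample = ["line1", "line2", "line2", "line3", "line4"]
for count, line in count_runs(sample):
    # Width 4 is an assumption about the column format.
    print(f"{count:4d} {line}")
```

The final flush after the loop is easy to forget; without it the last line of the file never gets printed.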
In this step your goal is to implement the repeated option: -d or --repeated. When this option is specified your program should output only the lines that are repeated, printing one copy of each.
% uniq -d test.txt
line2
Try the command uniq -d countries.txt to find out which countries’ cuisine my kids love!
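For the repeated option, one possible approach is to measure the length of each run and keep only the runs longer than one. Here is a short Python sketch; repeated_lines is a made-up name and itertools.groupby is a convenience from the standard library, not a requirement of the challenge.

```python
from itertools import groupby


def repeated_lines(lines):
    """Yield one copy of each line that occurs more than once in a row."""
    for line, run in groupby(lines):
        if sum(1 for _ in run) > 1:  # length of this run of adjacent copies
            yield line


sample = ["line1", "line2", "line2", "line3", "line4"]
print(list(repeated_lines(sample)))  # ['line2']
```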
In this step your goal is to output only unique lines - that is the -u option. When implemented you should be able to do this:
% uniq -u test.txt
line1
line3
line4
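The -u option is the mirror image of the repeated option: keep a line only when its run has exactly one member. A minimal Python sketch, with unique_only as a hypothetical name:

```python
from itertools import groupby


def unique_only(lines):
    """Yield only the lines that are not followed by an identical line."""
    for line, run in groupby(lines):
        if len(list(run)) == 1:  # the run contains a single copy
            yield line


sample = ["line1", "line2", "line2", "line3", "line4"]
print(list(unique_only(sample)))  # ['line1', 'line3', 'line4']
```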
In this step your goal is to support a combination of command line options, for example:
% uniq -c -d countries.txt
Each of the repeated countries should be reported with a count of 2.
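Combining options gets easier if the flags are parsed once and then treated as independent filters over the runs. The Python sketch below uses argparse from the standard library. The names build_parser and format_runs are invented for this example, the count column width of 4 is an assumption, and in this sketch passing both -d and -u prints nothing, which is a design choice you may want to compare with the uniq on your system.

```python
import argparse
from itertools import groupby


def build_parser():
    """Build a parser for the options covered in this challenge."""
    parser = argparse.ArgumentParser(prog="uniq")
    parser.add_argument("-c", "--count", action="store_true")
    parser.add_argument("-d", "--repeated", action="store_true")
    parser.add_argument("-u", dest="unique", action="store_true")
    parser.add_argument("input_file", nargs="?", default="-")
    parser.add_argument("output_file", nargs="?")
    return parser


def format_runs(lines, count=False, repeated=False, unique=False):
    """Yield output lines for each run, honouring the selected options."""
    for line, run in groupby(lines):
        size = len(list(run))
        if size > 1 and unique:      # -u hides lines that are repeated
            continue
        if size == 1 and repeated:   # -d hides lines that occur once
            continue
        # Width 4 for the count column is an assumption.
        yield f"{size:4d} {line}" if count else line


args = build_parser().parse_args(["-c", "-d", "countries.txt"])
print(args.count, args.repeated, args.input_file)  # True True countries.txt

sample = ["line1", "line2", "line2", "line3", "line4"]
print(list(format_runs(sample, count=True, repeated=True)))  # ['   2 line2']
```

Keeping the option handling in one function means each new flag is one more condition rather than a new code path.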
Once you’ve done that - congratulations, you’ve built a lite version of the Unix command line tool uniq!
Help Others by Sharing Your Solutions!
If you think your solution is an example other developers can learn from, please share it: put it on GitHub, GitLab or elsewhere. Then let me know by sending me a message via Twitter or LinkedIn, or just post about it there and tag me.
Get The Challenges By Email
If you would like to receive the coding challenges by email, you can subscribe to the weekly newsletter on Substack here: