Apia - Needle in a Haystack
#diff #bash #awk
Type: Do
Description: The directory /home/admin/data contains multiple files, all with the same content. One of these files has been modified by adding a single word. You need to identify that word and put it in the solution file (with or without a trailing newline; both are accepted).
Test: md5sum /home/admin/solution should return 55aba155290288b58e9b778c8f616560 or 2eeefea9fc4b16ea624bed5c67a49d80
Check My Solution: The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can
Notes and solution:
First, we see how many files are in the target directory.
cd /home/admin/data
ls
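
If only the number of files is of interest, the listing can be piped to wc -l. A minimal sketch, using a hypothetical scratch directory in place of /home/admin/data:

```shell
# Hypothetical scratch directory standing in for /home/admin/data.
demo=$(mktemp -d)
for i in 0 1 2; do
  echo "same content" > "$demo/file$i.txt"
done

# When its output is piped, ls prints one name per line, so wc -l counts the entries.
file_count=$(ls "$demo" | wc -l)
echo "$file_count"
```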

We'll use file0.txt as the baseline for comparing the rest of the files. If file0.txt is itself the odd one out, the diff command will report every other file as different; if not, it'll only show the one file that differs from the others.
We'll list all the files with the ls command and pass them as arguments (using xargs) to the diff command:
ls | xargs diff -c --color --from-file file0.txt

where
- --color highlights the changes between files (green meaning a line was added and red, deleted).
- -c adds context to the result of the command, like where the change was found and in which file.
- --from-file makes it possible to compare file0.txt with multiple files.
And now we see that the file that's different from the rest is file76.txt.
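
As a cross-check, the odd file can also be located by checksum instead of diff: identical files share a hash, so the one hash that appears only once belongs to the modified file. This is an alternative sketch, not the approach used in these notes. It assumes GNU md5sum and GNU uniq (for the -w option), and the directory and file contents are made up for the example:

```shell
# Hypothetical scratch directory standing in for /home/admin/data.
demo=$(mktemp -d)
for i in 0 1 2 3; do
  echo "the quick brown fox jumps over the lazy dog" > "$demo/file$i.txt"
done
# Modify one file by adding a word.
echo "the quick brown needle fox jumps over the lazy dog" > "$demo/file2.txt"

# Hash every file, sort by hash, and keep only the line whose hash occurs once.
# -w32 makes uniq compare just the 32-character MD5 digest at the start of each line.
odd_file=$(cd "$demo" && md5sum -- * | sort | uniq -u -w32 | awk '{ print $2 }')
echo "$odd_file"   # prints: file2.txt
```

Note that this only works when exactly one file differs; if the baseline assumption is wrong, uniq -u may print several lines or none.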
However, each line consists of a lot of text, so it is difficult to spot the difference between the two files.
At this point, it's only a matter of reading word by word to find the odd one out. Who has time for that?
The proposed solution is to put each word on its own line, save that list in a new file for each of the two files, and then compare the two converted files using diff.
- Separate text into a list of words: To achieve this, we'll use the awk command with its gsub function to substitute each space with a new line.
cat file0.txt | awk '{ gsub(" ", "\n") } 1'

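where the 1 at the end is a pattern that is always true; with no action attached to it, awk applies its default action and prints $0 (the line as modified by gsub) for every input line.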
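
To see what the gsub call and the trailing 1 do, here is a small sketch on a made-up three-word line (the sample text is hypothetical). The second command spells out the print explicitly and should produce the same result:

```shell
# gsub replaces every space in the current line with a newline;
# the bare 1 is an always-true pattern, so awk prints the modified line.
words=$(printf 'alpha beta gamma\n' | awk '{ gsub(" ", "\n") } 1')

# Equivalent form with an explicit print instead of the 1 shorthand.
words_explicit=$(printf 'alpha beta gamma\n' | awk '{ gsub(" ", "\n"); print }')

printf '%s\n' "$words"
```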
- Save each result in a temporary file, to diff them later.
cat file0.txt | awk '{ gsub(" ", "\n") } 1' > tempfile1.txt
cat file76.txt | awk '{ gsub(" ", "\n") } 1' > tempfile2.txt
- Compare the two new files:
diff --color tempfile1.txt tempfile2.txt

And now we see that the secret word is eureka.
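
The whole word-level comparison, plus writing the answer to the solution file, can be sketched end to end as follows. This is an illustration under assumptions rather than the exact commands used above: it swaps awk for tr, works on made-up sample files in a scratch directory, and writes to a stand-in path instead of /home/admin/solution.

```shell
# Hypothetical stand-ins for file0.txt and file76.txt.
demo=$(mktemp -d)
echo "the quick brown fox jumps over the lazy dog" > "$demo/file0.txt"
echo "the quick brown needle fox jumps over the lazy dog" > "$demo/file76.txt"

# One word per line; -s squeezes runs of spaces so no empty lines appear.
tr -s ' ' '\n' < "$demo/file0.txt"  > "$demo/words0.txt"
tr -s ' ' '\n' < "$demo/file76.txt" > "$demo/words76.txt"

# In diff's default output, lines present only in the second file start with "> ".
# diff exits with status 1 when the files differ, so "|| true" keeps
# strict-mode shells (set -e, pipefail) from aborting here.
added_word=$( { diff "$demo/words0.txt" "$demo/words76.txt" || true; } | sed -n 's/^> //p')

# Assumption: on the server this path would be /home/admin/solution.
solution_file="$demo/solution"
printf '%s\n' "$added_word" > "$solution_file"
cat "$solution_file"      # prints: needle

# On the server, the resulting hash is what gets compared with the values in the Test line.
md5sum "$solution_file"
```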
