Saint Paul - Merge Many CSVs files
#bash #csv #tail #head
Type: Do
Description: Join (merge) all the 338 files in /home/admin/polldayregistrations_enregistjourduscrutin?????.csv into one single /home/admin/all.csv file with the contents of all the CSV files in any order. There should be only one line with the names of the columns as a header.
Test: The "Check My Solution" button runs the script /home/admin/agent/check.sh, which you can see and execute.
Notes and solution
To merge of of the CSV files it might be tempting to just concatenate all of their contents together in the all.csv solution file
cat *.csv > all.csv
but this leads to headers being repeated all thorough the resulting file.
So, to fix this, instead of concatenating all lines of all files, concatenate the common header (from the first file) and then the content from each file using a recursive search.
head -n 1 polldayregistrations_enregistjourduscrutin10001.csv > all.csv && tail -n+2 -q *.csv >> all.csv
Let's break down this command
- First add the header from the first file to the target file
head -n 1 polldayregistrations_enregistjourduscrutin10001.csv > all.csv
- Then, add the content from all of the files present in the folder at the same time
&&
, this is to prevent adding the target file (all.csv) in this operation, thus adding the header two times.
tail -n+2 -q *.csv >> all.csv
Now a merged file was correctly created without repeating headers.