Answers
This is much simpler than the linked question. All you need is:
File1:
VIJAY|ACTIVE|2
TAHA|ACTIVE|3
File2:
TAHA|ACTIVE
In the above scenario, I need to delete the records if col1 of File1 = col2 of File2 and col1 of File1 is not
equal to col2 of File2. The output should be File1 after removing the unwanted records.
Answer :
VIJAY|ACTIVE|2
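One way to arrive at this output is sketched below. It rests on two assumptions: File1 holds the two three-field records and File2 holds the single record TAHA|ACTIVE, and a File1 record is dropped when its first field also appears as the first field of a File2 record. That matching rule was chosen because it reproduces the answer shown; it is not a confirmed reading of the question.

```shell
# Throwaway copies of the sample data (assumed layout, see above).
tmp=$(mktemp -d)
printf 'VIJAY|ACTIVE|2\nTAHA|ACTIVE|3\n' > "$tmp/File1"
printf 'TAHA|ACTIVE\n' > "$tmp/File2"

# Pass 1 reads File2 and records each first field as an array key.
# Pass 2 reads File1 and prints only records whose first field was not recorded.
kept=$(awk -F'|' 'NR==FNR { seen[$1]; next } !($1 in seen)' "$tmp/File2" "$tmp/File1")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

File2 is listed first so that its keys are known before File1 is filtered.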
AWK command to compare two columns of different files and
print required columns from both files.
compare_columns_file.sh
# Assumes comma-separated input and bash (for process substitution).
awk -F',' 'NR==FNR{date[$1]=$2;next} ($2 in date){print $0 "," date[$2]}' \
  <(sort -k1 file2.csv) <(sort -k2 file1.csv) > file3.csv
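As a rough illustration of the two-pass idea in this command, the sketch below runs it on a few comma-separated rows. The rows are made-up excerpts written to temporary files (the LAB99 row is invented to show a non-match), and the sort step is left out because the lookup does not depend on input order.

```shell
tmp=$(mktemp -d)
# Made-up comma-separated sample rows; LAB99 has no counterpart in file2.csv.
printf 'CMV_1235,LAB06,56-1,Fail\nCMV_2137,LAB84,54-4,Pass\nCMV_9999,LAB99,00-0,Pass\n' > "$tmp/file1.csv"
printf 'LAB06,18/01/2016\nLAB18,10/01/2016\nLAB84,18/02/2016\n' > "$tmp/file2.csv"

# Pass 1 (file2.csv): store the date under its labnumber.
# Pass 2 (file1.csv): when field 2 is a known labnumber, append its date.
joined=$(awk -F',' 'NR==FNR { date[$1] = $2; next }
                    ($2 in date) { print $0 "," date[$2] }' "$tmp/file2.csv" "$tmp/file1.csv")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

Testing membership with `in` avoids matching rows whose second field is empty, which a plain string comparison against an unset array element would let through.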
compare_columns_in_file_question.txt
#Question
> I need to match strings between the two files and print to a third file. Data look like this:
#File 1
dbID labnumber myID Status
CMV_1235 LAB06 56-1 Fail
CMV_1236 LAB14 57-1 Fail
CMV_2137 LAB84 54-4 Pass
CMV_2238 LAB85 50-3
CMV_C131 LAB21 51-2 Pass
#File 2
labnumber date
LAB06 18/01/2016
LAB14 27/04/2016
LAB18 10/01/2016
LAB21 9/02/2016
LAB69 4/03/2016
LAB84 18/02/2016
LAB22 18/03/2016
LAB85 27/03/2016
(Not totally overlapping: there may be samples in file 1 but not file 2 and vice versa)
So, if labnumber matches in file 1 and file 2, print all of that line from file 1, then print the
relevant date from the matching line in file 2, into a third file
file1.csv
dbID labnumber myID Status
CMV_1235 LAB06 56-1 Fail
CMV_1236 LAB14 57-1 Fail
CMV_2137 LAB84 54-4 Pass
CMV_2238 LAB85 50-3
CMV_C131 LAB21 51-2 Pass
file2.csv
labnumber date
LAB06 18/01/2016
LAB14 27/04/2016
LAB18 10/01/2016
LAB21 9/02/2016
LAB69 4/03/2016
LAB84 18/02/2016
LAB22 18/03/2016
LAB85 27/03/2016
Answer
Explanation
Solution
---------
How it works:
FNR==NR
When awk is given two (or more) input files, FNR resets to 1 on the first line of
each new file, whereas NR continues incrementing from where it left off. By checking
FNR==NR we are essentially checking whether we are currently parsing the first file.
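To make the two counters concrete, here is a small sketch that prints both for every line of two throwaway files. The file names first.txt and second.txt are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'c\nd\ne\n' > "$tmp/second.txt"

# For each input line: file name, NR, FNR, and whether FNR==NR holds.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (FNR == NR ? "first" : "later") }' first.txt second.txt)
printf '%s\n' "$nr_fnr"
rm -rf "$tmp"
```

The third output line is `second.txt 3 1 later`: NR has reached 3 while FNR has restarted at 1. One caveat: if the first file is empty, FNR==NR also holds while reading the second file, so the test no longer identifies the first file.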
a[$1]++
If we are parsing the first file (see above), create an entry in an associative array with the first field
$1 as the key and post-increment its value by 1. This essentially lets us build a 'seen' list.
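A minimal sketch of such a 'seen' list, using three made-up input lines piped in on standard input:

```shell
# Count how often each first field occurs, then look up three keys.
# "z" never occurs; adding 0 forces its unset value to print as a number.
seen=$(printf 'x\ny\nx\n' | awk '{ a[$1]++ } END { print a["x"], a["y"], a["z"] + 0 }')
printf '%s\n' "$seen"
```

This prints `2 1 0`: keys that were seen hold a positive count, and a key that was never seen evaluates to 0.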
next
This command tells awk not to process any further commands and to read in the next record
and start over. We do this because file1 is only meant to populate the associative array.
!a[$1]
This pattern is only reached when FNR==NR is false, i.e. we are not parsing file1 and thus must
be parsing file2. We then use the first field $1 of file2 as the key to index into the 'seen' list
created earlier. If the value is unset (which awk treats as 0), we didn't see the key in file1 and
therefore print this line. Conversely, if the value is non-zero, we did see it in file1 and the
line is not printed.
Note that !a[$1] is equivalent to !a[$1]{print} because the default action when one is not
given is to print the entire line.
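Putting the pieces explained above together, the one-liner would presumably look like the sketch below. The assembled command and the sample keys are assumptions based on the explanation, and file1 and file2 here are temporary files. The second invocation spells out the default action to show that both forms give the same result.

```shell
tmp=$(mktemp -d)
# Made-up sample keys: LAB18 and LAB22 occur only in file2.
printf 'LAB06\nLAB14\nLAB84\n' > "$tmp/file1"
printf 'LAB06\nLAB18\nLAB84\nLAB22\n' > "$tmp/file2"

# Lines of file2 whose first field never appeared in file1.
only_in_file2=$(awk 'FNR==NR { a[$1]++; next } !a[$1]' "$tmp/file1" "$tmp/file2")
# Same program with the default action written explicitly.
explicit=$(awk 'FNR==NR { a[$1]++; next } !a[$1] { print }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$only_in_file2"
rm -rf "$tmp"
```

With this input the command prints LAB18 and LAB22, in the order they occur in file2.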