Hi,
I have a CSV file containing links to a site and the number of times each link was hit. It contains lakhs (hundreds of thousands) of such entries. Some links appear more than once; for example, the following link is present in the file multiple times:
http://www.xyz.com/link1, 15
http://www.xyz.com/link1, 13
http://www.xyz.com/link1, 17
http://www.xyz.com/link1, 18
http://www.xyz.com/link1, 11
Now I want to combine these entries into a single entry in the file, with the hits added up, like:
http://www.xyz.com/link1, 74
I have a program which reads lines from the first file and writes them to a second file. Whenever a duplicate link comes up, it does not write a new entry to the second file but just adds its value to the hit count of the existing entry.
In the end, the second file contains the unique links and the total number of times each was hit.
But for every link I read from the first file, I have to compare it with each link already in the second file in a loop, so the work grows with the square of the number of entries. This is taking a lot of time (days).
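The repeated comparison can in principle be avoided by keeping running totals in a hash map keyed by link, so each line is processed only once. Below is a minimal sketch in Python, not a definitive implementation. It assumes every line has the form `link, hits` with no header row, and that the set of unique links fits in memory. The names `combine_hits` and `combine_file` and the file paths are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def combine_hits(lines):
    """Sum the hit counts per link in a single pass over the lines."""
    totals = defaultdict(int)
    for line in lines:
        # Split on the LAST comma, so a comma inside the URL does not break parsing.
        link, sep, hits = line.rpartition(",")
        if not sep or not hits.strip():
            continue  # skip blank lines and lines without a hit count
        # Dictionary lookup replaces the loop over the second file.
        totals[link.strip()] += int(hits)
    return dict(totals)


def combine_file(src_path, dst_path):
    """Read src_path, write one 'link, total' line per unique link to dst_path."""
    with open(src_path) as src:
        totals = combine_hits(src)
    with open(dst_path, "w") as dst:
        for link, hits in totals.items():
            dst.write(f"{link}, {hits}\n")


# The sample entries from above, used as a quick check.
sample = [
    "http://www.xyz.com/link1, 15",
    "http://www.xyz.com/link1, 13",
    "http://www.xyz.com/link1, 17",
    "http://www.xyz.com/link1, 18",
    "http://www.xyz.com/link1, 11",
]
print(combine_hits(sample))  # {'http://www.xyz.com/link1': 74}
```

For a real file the call would be something like `combine_file("hits.csv", "combined.csv")`, where both file names are placeholders. No regex is involved; the speed-up comes from the dictionary lookup.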
Can anyone suggest a better solution to this? Someone who has worked with regex might be able to help.
If I have posted this in the wrong place on the site, please send me the link to where I should post this topic.
Thanks in advance.