I receive an Excel file (.xlsx) with multiple sheets for my project. The number of records ranges from 15k to 70k per sheet. I need to perform the following tasks on this data and then convert it to CSV. Converting to CSV first and then processing the data is also fine.
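| call_no | uniq_no | Type | Strength | Description |
|---------|---------|------|-----------|-------------|
| 2456 | 15 | TX | SomeSting | SomeSting |
| 5263 | 15 | BLL | SomeSting | SomeSting |
| 4263 | 162 | TX | | SomeSting |
| 2369 | 215 | LH | SomeSting | |
| 4269 | 426 | BLL | SomeSting | SomeSting |
| 7412 | 162 | TX | SomeSting | SomeSting |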
As per the requirement, I need to:
- Find duplicate values in column 'uniq_no' and delete all duplicate records except the original record (first record).
- Replace blank cells with data. (Just simple "find blank and replace with a value" logic.)
- Remove spaces/tabs in any cell. (This point is not important; it's more of a side quest.)
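Expected output:

| call_no | uniq_no | Type | Strength | Description |
|---------|---------|------|-----------|-------------|
| 2456 | 15 | TX | SomeSting | SomeSting |
| 4263 | 162 | TX | **NewDATA** | SomeSting |
| 2369 | 215 | LH | SomeSting | **NewDATA** |
| 4269 | 426 | BLL | SomeSting | SomeSting |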
This is a routine task for me. I have a fair knowledge of shell scripting, so if anyone can guide me with even a rough outline of a script for this, I can do the tweaks at my end. Please help.
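For reference, the three steps could be sketched roughly as below. This is only a minimal sketch in Python rather than shell, and it assumes each sheet has already been exported to a comma-separated file with a header row. The function names `clean_rows` and `clean_csv` and the `NewDATA` filler are placeholders, and "remove spaces/tabs" is taken here to mean trimming leading and trailing whitespace in each cell.

```python
import csv
import io


def clean_rows(rows, key="uniq_no", fill="NewDATA"):
    """Yield cleaned rows: cells trimmed, duplicates dropped, blanks filled."""
    seen = set()
    for row in rows:
        # Trim leading/trailing spaces and tabs; treat missing cells as blank.
        row = {col: (val or "").strip() for col, val in row.items()}
        # Keep only the first record seen for each key value.
        if row[key] in seen:
            continue
        seen.add(row[key])
        # Replace any blank cell with the filler value.
        yield {col: val if val else fill for col, val in row.items()}


def clean_csv(text, key="uniq_no", fill="NewDATA"):
    """Clean CSV text and return the result as CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(clean_rows(reader, key, fill))
    return out.getvalue()


# Sample data from the question, written out as CSV (an assumed export format).
SAMPLE = """call_no,uniq_no,Type,Strength,Description
2456,15,TX,SomeSting,SomeSting
5263,15,BLL,SomeSting,SomeSting
4263,162,TX,,SomeSting
2369,215,LH,SomeSting,
4269,426,BLL,SomeSting,SomeSting
7412,162,TX,SomeSting,SomeSting
"""

if __name__ == "__main__":
    print(clean_csv(SAMPLE), end="")
```

Because rows are processed one at a time and only the set of seen `uniq_no` values is kept in memory, this approach should be comfortable at 15k to 70k records per sheet. Each sheet would be passed through separately.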