If you can define precisely what a junk character is, you can write a regular expression that finds the junk characters and removes them (or even parse the input by hand to strip them out). Is junk anything that doesn't look like a path? Or is it anything preceded by a comma (including the comma itself)? Are junk characters always separated from the good characters by whitespace? I'm guessing not, since there is some ",8" stuff right after the paths, but I don't know whether that is really junk or not.
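To make the second possibility concrete, here is a minimal Python sketch. It assumes (and this is only a guess until you confirm the rule) that junk is a comma plus whatever non-whitespace characters follow it. The function name `strip_junk` and the sample paths are made up for illustration; they are not taken from your data.

```python
import re

# Assumed rule (not confirmed by the question): junk is a comma
# followed by any run of non-whitespace characters, e.g. ",8".
JUNK_AFTER_COMMA = re.compile(r",\S*")


def strip_junk(line: str) -> str:
    """Remove every comma together with the non-whitespace run after it."""
    return JUNK_AFTER_COMMA.sub("", line)


# Hypothetical input; the real data may look different.
sample = "/usr/local/bin/tool,8 /var/log/app.log,8"
print(strip_junk(sample))  # /usr/local/bin/tool /var/log/app.log
```

If the real rule turns out to be "anything that doesn't look like a path", the pattern would need to change; this one would also damage any path that legitimately contains a comma.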