Bill de hÓra has a nice writeup on parsing junk markup. I think anyone who has tried to extract data from markup has faced this problem. Refusing to read junk markup on the grounds that producing clean markup is the author's responsibility seems idealistic.

