Okay, so. I have a #PDF and a #DOCX file. And I’d like to compare them. And since I’m a programmer, I don’t want to compare them visually, but with a #diff. But how?
Like this.
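# pdfcat: dump a PDF's text to stdout via Ghostscript's txtwrite device
alias pdfcat='gs -q -sDEVICE=txtwrite -o-'
# doccat: convert a DOCX to plain text on stdout via pandoc
alias doccat='pandoc -t plain'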
pdfcat a.pdf > a.txt
doccat b.docx > b.txt
git diff --no-index --word-diff a.txt b.txt
And since we’re using --word-diff, it doesn’t matter that the two files use _wildly_ different line wrapping.
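If you want to see the line-wrapping claim for yourself without hunting for a PDF first, here is a minimal sketch. The two text files and their contents are made up for illustration; they hold the same sentence wrapped differently, with a single word changed.

```shell
# Scratch directory so the demo leaves nothing behind.
tmp=$(mktemp -d)

# Same sentence, wrapped differently, with one word changed (quick -> slow).
# These file names and contents are hypothetical stand-ins for a.txt and b.txt.
printf 'the quick brown fox\njumps over the lazy dog\n' > "$tmp/a.txt"
printf 'the slow brown\nfox jumps over\nthe lazy dog\n' > "$tmp/b.txt"

# git diff exits with status 1 when the files differ, hence the "|| true".
out=$(git --no-pager diff --no-index --word-diff "$tmp/a.txt" "$tmp/b.txt" || true)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The only change marked in the output should be the swapped word, shown in the default word-diff notation as [-quick-] and {+slow+}. The moved line breaks are not flagged as changes.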