I used to do this for a living.
I built a cloud-based document management system that would take scanned pages, OCR them, and store them as PDFs.
We used Google Cloud Vision, which was overkill, but my CEO had a hard-on for Google.
Tesseract should be all you need: https://guides.library.illinois.edu/c.php?g=347520&p=4121426
Although, may I suggest using DjVu instead of PDF? DjVu is a better archival format. It’s much simpler and usually results in smaller file sizes. Many PDF viewers already support it. But I don’t know exactly what your use case is, so that may not be an option.
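If you do stick with Tesseract and PDF, here is a rough sketch of the scan-to-searchable-PDF step. It assumes the `tesseract` binary is installed and on your PATH with the English language data. The function names and the output folder layout are made up for illustration; this is not the code from the system I described.

```python
import subprocess
from pathlib import Path


def tesseract_pdf_command(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    """Build the Tesseract CLI call for a searchable PDF.

    Tesseract takes an output *base* name and appends ".pdf" itself.
    """
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_to_pdf(image: Path, out_dir: Path, lang: str = "eng") -> Path:
    """OCR one scanned page into a searchable PDF and return the PDF path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_base = out_dir / image.stem
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(tesseract_pdf_command(image, out_base, lang), check=True)
    return out_base.with_name(out_base.name + ".pdf")
```

Calling `ocr_to_pdf(Path("scan.png"), Path("out"))` should leave you with `out/scan.pdf`, with the recognized text layered behind the page image so it is searchable.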