Bobinas P4G

Notices by Dan Jones (danjones000@fedi.absturztau.be)

  1. Dan Jones (danjones000@fedi.absturztau.be)'s status on Monday, 14-Feb-2022 08:28:13 UTC
    in reply to
    • Amolith

    @amolith

    I used to do this for a living.

    I built a cloud-based document management system that would take scanned pages, OCR them, and store them as PDFs.

    We used Google Cloud Vision, which was overkill, but my CEO had a hard-on for Google.

    Tesseract should be all you need: https://guides.library.illinois.edu/c.php?g=347520&p=4121426
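
As a rough illustration of the scan-to-searchable-PDF step described above, here is a minimal sketch that shells out to the Tesseract command line (`tesseract <image> <outputbase> -l <lang> pdf`). The function names, file names, and the default language `eng` are assumptions made for the example, not details from the original system, and it assumes the `tesseract` executable is installed and on the PATH.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def tesseract_pdf_command(image: Path, output_base: Path, lang: str = "eng") -> list[str]:
    """Build the argv for: tesseract <image> <output_base> -l <lang> pdf

    Hypothetical helper; the trailing "pdf" selects Tesseract's PDF output
    config, which produces a PDF with a searchable text layer.
    """
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_to_pdf(image: Path, output_base: Path, lang: str = "eng") -> Path:
    """Run Tesseract on one scanned page and return the path of the PDF it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(tesseract_pdf_command(image, output_base, lang), check=True)
    # Tesseract appends ".pdf" to the output base name itself.
    return output_base.with_name(output_base.name + ".pdf")
```

With a hypothetical scan named `page-001.png`, calling `ocr_to_pdf(Path("page-001.png"), Path("page-001"))` would be expected to write `page-001.pdf` next to it. Building the command in a separate function keeps it easy to inspect or log before anything is run.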

    Although, may I suggest using DjVu instead of PDF? DjVu is a better archival format. It’s much simpler and usually results in smaller file sizes. Many PDF viewers already support it. But I don’t know exactly what your use case is, so that may not be an option.

    In conversation Monday, 14-Feb-2022 08:28:13 UTC from fedi.absturztau.be

    Attachments

    1. LibGuides: Introduction to OCR and Searchable PDFs: Using Tesseract
      from Scholarly Commons
      Learn OCR best practices and how to begin an OCR project using ABBYY FineReader, Adobe Acrobat Pro, or Tesseract with this guide.
  2. Dan Jones (danjones000@fedi.absturztau.be)'s status on Tuesday, 05-Oct-2021 18:31:09 UTC
    in reply to
    • muesli
    • Mr. Teatime

    @fribbledom @Mr_Teatime

    I like the term “willfully ignorant”

    In conversation Tuesday, 05-Oct-2021 18:31:09 UTC from fedi.absturztau.be

    Dan Jones

    Husband, father, Christian, Mormon, PHP Developer, etc.

Bobinas P4G is a social network. It runs on GNU social, version 2.0.1-beta0, available under the GNU Affero General Public License.

All Bobinas P4G content and data are available under the Creative Commons Attribution 3.0 license.