If you wish to add a single language pack, you could do the following:

* Download the desired ``.traineddata`` file from the `tessdata <https://github.com/tesseract-ocr/tessdata>`_ repository. Let's use Hebrew in this example (``heb.traineddata``).
* Copy the file to ``/home/user/downloads/heb.traineddata``.
* Create a new container based on the ocrmypdf-tess4 image and jump into it with a terminal:

  .. code-block:: bash

     host$ docker run -v /home/user/downloads:/home/docker -it --entrypoint /bin/bash ocrmypdf-tess4

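
The download and copy steps above can also be scripted. The following is only a minimal sketch: the helper names ``tessdata_url`` and ``fetch_traineddata`` are hypothetical, and the ``main`` branch and raw-file URL layout are assumptions about the tessdata repository that should be checked before use.

.. code-block:: bash

   #!/usr/bin/env bash
   # Sketch only: the "main" branch and the raw-file URL layout are assumptions.

   # Print the assumed download URL for a language code such as "heb".
   tessdata_url() {
       printf 'https://github.com/tesseract-ocr/tessdata/raw/main/%s.traineddata\n' "$1"
   }

   # Hypothetical helper: download one language pack into a directory.
   fetch_traineddata() {
       local lang="$1" dest_dir="$2"
       mkdir -p "$dest_dir"
       # -f: fail on HTTP errors; -L: follow redirects to the raw file.
       curl -fL -o "$dest_dir/$lang.traineddata" "$(tessdata_url "$lang")"
   }

For the Hebrew example, ``fetch_traineddata heb /home/user/downloads`` would place the file in the directory that the ``docker run`` command mounts into the container.
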
* The latest version of Ghostscript (9.19 as of this writing) has unfixed bugs in Unicode handling that generate invalid character maps, so Ghostscript cannot be used for PDF/A conversion.
* The default "hocr" PDF renderer does not handle Asian fonts properly.