Christine Straub
237d04c896
feat: improve natural reading order by filtering OCR results (#1768)
### Summary
Some `OCR` elements whose text contains only spaces have a bounding box
that spans the full page width, which prevents the `xycut` sorting from
working as expected. The logic that parses OCR results now removes any
element whose text contains only spaces (more than one space).
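
The filtering described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: the `OcrElement` class, the `drop_whitespace_only` helper, and the sample coordinates are hypothetical, and it assumes that any element whose text is empty or whitespace-only can be dropped before sorting.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OcrElement:
    """Hypothetical stand-in for a parsed OCR result."""
    text: str
    # Bounding box as (x1, y1, x2, y2) in page coordinates.
    bbox: Tuple[float, float, float, float]


def drop_whitespace_only(elements: List[OcrElement]) -> List[OcrElement]:
    """Keep only elements whose text has at least one non-whitespace character.

    Assumption: empty strings are dropped along with space-only strings.
    """
    return [el for el in elements if el.text.strip()]


page_width = 1000.0
elements = [
    OcrElement("Title", (50.0, 20.0, 400.0, 60.0)),
    # Space-only element spanning the full page width; a box like this
    # is what can disturb a cut-based reading-order sort.
    OcrElement("   ", (0.0, 70.0, page_width, 90.0)),
    OcrElement("Body text", (50.0, 100.0, 900.0, 140.0)),
]

kept = drop_whitespace_only(elements)
print([el.text for el in kept])
```

Filtering before sorting keeps the sort itself unchanged; only the input is cleaned.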
---------
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: christinestraub <christinestraub@users.noreply.github.com>
2023-10-16 23:05:55 +00:00