Michele Dolfi
|
1de2e4f924
|
feat: export document pages as multimodal output (#54)
* feat: export document pages as multimodal output
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* create a single parquet output
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* add loading into HF datasets library
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* renaming
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
* cleanup
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
---------
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
|
2024-09-03 15:05:35 +02:00 |
|