gagb
c0127af120
Merge pull request #72 from CharlesCNorton/patch-1
Fix LLM terms
2024-12-16 17:06:24 -08:00
gagb
33cb5015eb
Merge branch 'main' into patch-1
2024-12-16 17:04:44 -08:00
gagb
cf13b7e657
Merge pull request #73 from CharlesCNorton/patch-2
Fix LLM terminology in code
2024-12-16 17:04:33 -08:00
gagb
874eba6265
Merge branch 'main' into patch-2
2024-12-16 16:59:22 -08:00
gagb
c3fa2934b9
Run pre-commit
2024-12-16 16:56:52 -08:00
gagb
736e7d9a7e
Merge branch 'main' into patch-1
2024-12-16 16:53:58 -08:00
gagb
19c111251b
Merge pull request #60 from madduci/main
Added Dockerfile
2024-12-16 16:42:26 -08:00
gagb
360c2dd95f
Merge branch 'main' into main
2024-12-16 16:35:50 -08:00
kevinbabou
87846cf5f8
rm setup.py
2024-12-16 16:28:44 -08:00
kevinbabou
33638f1fe6
feature: add argument parsing and setup.py file for cli tool capability
2024-12-16 16:28:44 -08:00
gagb
73776b2c0f
Merge pull request #50 from narumiruna/youtube-transcript-languages
Support specifying YouTube transcript language
2024-12-16 16:23:20 -08:00
gagb
2d3ffeade1
Merge branch 'main' into youtube-transcript-languages
2024-12-16 16:20:35 -08:00
gagb
51c1453699
Merge pull request #48 from Soulter/main
Fix: pass the kwargs to _convert method when converting a URL file
2024-12-16 16:19:09 -08:00
gagb
ae4669107c
Merge branch 'main' into main
2024-12-16 16:01:59 -08:00
gagb
dbc727615d
Merge branch 'main' into main
2024-12-16 15:48:49 -08:00
gagb
b0115cf971
Merge branch 'main' into youtube-transcript-languages
2024-12-16 15:47:38 -08:00
gagb
5cf8474f37
Merge pull request #44 from Y-Kim-64/main
Exclude test files from language statistics using linguist-vendored
2024-12-16 15:35:19 -08:00
gagb
83dc81170b
Merge branch 'main' into main
2024-12-16 15:29:33 -08:00
gagb
e7a2e20d93
Merge pull request #39 from SH4DOW4RE/main
Catching pydub's warning of ffmpeg or avconv missing
2024-12-16 15:28:53 -08:00
gagb
980abd3a60
Merge branch 'main' into main
2024-12-16 15:24:58 -08:00
afourney
afaff11ef0
Merge branch 'main' into main
2024-12-16 14:40:58 -08:00
afourney
6587e0f097
Merge branch 'main' into patch-1
2024-12-16 14:27:26 -08:00
afourney
978c8763aa
Merge pull request #38 from VillePuuska/support-comments-in-docx
Add passing style_map kwarg to Mammoth when converting docx to allow keeping comments
2024-12-16 14:26:55 -08:00
afourney
e7636656d8
Merge branch 'main' into support-comments-in-docx
2024-12-16 14:23:14 -08:00
afourney
ddc1bebea4
Merge branch 'main' into patch-2
2024-12-16 14:20:16 -08:00
afourney
fa1f496d51
Merge branch 'main' into patch-1
2024-12-16 14:18:20 -08:00
afourney
da779dd125
Merge pull request #33 from nyosegawa/feature/add-pptx-chart-support
Add PPTX chart support
2024-12-16 14:11:49 -08:00
afourney
12ce5e95b2
Merge branch 'main' into feature/add-pptx-chart-support
2024-12-16 14:06:14 -08:00
gagb
6dad1cca96
Merge pull request #22 from Josh-XT/main
Add zip handling
2024-12-16 13:56:25 -08:00
gagb
9e6a19987b
Merge branch 'main' into main
2024-12-16 13:51:39 -08:00
gagb
ed91e8b534
Merge pull request #19 from brc-dd/fix/18
Fix character decoding issues with text-like files
2024-12-16 13:49:48 -08:00
gagb
aeff2cb5ae
Merge branch 'main' into fix/18
2024-12-16 13:46:17 -08:00
gagb
c9c7d98d30
Merge pull request #11 from simonw/patch-2
CLI usage instructions
2024-12-16 13:45:05 -08:00
gagb
e7d9b5546a
Merge branch 'main' into patch-2
2024-12-16 13:42:28 -08:00
CharlesCNorton
ed651aeb16
Fix LLM terminology in code
Replaced all occurrences of mlm_client and mlm_model with llm_client and llm_model for consistent terminology when referencing Large Language Models (LLMs).
2024-12-16 16:23:52 -05:00
CharlesCNorton
3d9f3f3e5b
Fix LLM terms
Updated all instances of mlm_client and mlm_model to llm_client and llm_model in the README. The previous names are incorrect for configuring Large Language Models (LLMs): "MLM" typically refers to Masked Language Models, which are unrelated to the intended functionality. This change aligns the documentation with standard naming conventions for LLM configuration parameters and makes it clearer for users integrating with LLMs such as OpenAI's GPT models.
2024-12-16 16:23:03 -05:00
Om Gupta
a3208f2bd0
feat: Add IpynbConverter
- Implemented IpynbConverter class for converting Jupyter Notebook (.ipynb) files into Markdown format.
- Supports markdown cells, code cells and raw cells.
- First markdown heading is used as the title if no title is found in notebook metadata.
- Created a test notebook (`test_notebook.ipynb`) to verify the functionality of the converter.
2024-12-17 01:00:41 +05:30
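The behavior listed in the IpynbConverter commit body can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the commit: the function name `notebook_to_markdown`, the `FENCE` constant, the returned dictionary shape, and the assumption that the title lives under a `title` key in the notebook metadata are all hypothetical.

```python
import json

# Built programmatically so this example does not embed a literal code fence.
FENCE = "`" * 3


def notebook_to_markdown(notebook_json: str) -> dict:
    """Convert notebook JSON text to Markdown (hypothetical sketch)."""
    nb = json.loads(notebook_json)
    # Assumption: the title, when present, is stored under metadata["title"].
    title = nb.get("metadata", {}).get("title")
    parts = []

    for cell in nb.get("cells", []):
        # A cell's "source" may be a single string or a list of lines;
        # "".join handles both.
        source = "".join(cell.get("source", []))
        cell_type = cell.get("cell_type")

        if cell_type == "markdown":
            parts.append(source)
            # Fall back to the first level-one heading when no title is set.
            if title is None:
                for line in source.splitlines():
                    if line.startswith("# "):
                        title = line[2:].strip()
                        break
        elif cell_type == "code":
            parts.append(FENCE + "python\n" + source + "\n" + FENCE)
        elif cell_type == "raw":
            parts.append(FENCE + "\n" + source + "\n" + FENCE)

    return {"title": title, "text": "\n\n".join(parts)}
```

Calling `notebook_to_markdown(open("test_notebook.ipynb").read())` would return the title and the Markdown text; code cells are labeled as Python here only for simplicity, since a notebook may use another kernel language.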
Divit
ad01da308d
fix issue #65
2024-12-16 21:48:33 +05:30
CyberNobie
010f841008
Ensure hatch is installed before running tests
2024-12-16 18:47:24 +05:30
Michele Adduci
5fc03b6415
Added UID as argument
2024-12-16 13:11:13 +01:00
Michele Adduci
013b022427
Added Docker Image for using markitdown in a sandboxed environment
2024-12-16 13:08:15 +01:00
narumi
695100d5d8
Support specifying YouTube transcript language
2024-12-16 13:16:00 +08:00
Soulter
d66ef5fcca
Update README to introduce the customized mlm_prompt
2024-12-16 12:08:51 +08:00
Soulter
c168703d5e
Pass the kwargs to _convert method when converting a URL file
2024-12-16 11:41:39 +08:00
Yeonjun
3548c96dd3
Create .gitattributes
Mark test files as linguist-vendored
2024-12-16 09:21:07 +09:00
SH4DOW4RE
1559d9d163
pre-commit ran
2024-12-15 22:15:20 +01:00
SH4DOW4RE
b7f5662ffd
PR: Catching pydub's warning of ffmpeg or avconv missing
2024-12-15 17:29:14 +01:00
Ville Puuska
0a7203b876
add style_map prop to MarkItDown class
2024-12-15 17:23:57 +02:00
Ville Puuska
0704b0b6ff
pass 'style_map' kwarg to mammoth when converting docx
2024-12-15 16:59:21 +02:00
sakasegawa
0dd4e95584
Remove _is_chart
2024-12-15 21:14:58 +09:00