6 Commits

Author SHA1 Message Date
Ronny H
d80abf0714
Reorganized the Examples section in Documentation & add Databricks example (#1855)
To test:
> cd docs && make html

Change logs:
* Examples are reorganized to have their own page
* Removed two old examples, i.e. "file-utils" & "sentiment analysis".
* Added two examples: "RAG with Unstructured, LangChain, and ChromaDB" &
"Multi-Files Processing with S3 Connector and API"
* Reorganized and added detailed API documentation: (i) usage, (ii)
SDKs, (iii) Azure Marketplace, (iv) AWS Marketplace, (v) parameters and
validation errors
2023-11-30 01:24:43 +00:00
Matt Robinson
5db94fdee6
docs: add getting started section and remove outdated docs (#277)
* add getting started section to the docs

* remove old examples

* update example notebook

* change to convert_to_dict

* various and sundry edits
2023-02-27 15:10:53 +00:00
Matt Robinson
eba4c80b1e
feat: get_directory_file_info for exploring a directory of files (#142)
* added python-pptx to requirements

* added filetype detection for powerpoint

* add more filetypes to detect

* more tests

* added tests for filetype

* reorder document types

* tests for get_directory_file_info

* added docs for get_directory_file_info

* bump version

* Word -> Office

* added test for filetype

* add group by filetype example
2023-01-11 12:40:50 -05:00
Matt Robinson
b14f6ac9bd
feat: extract metadata from .docx, .xlsx, and .jpg (#113)
* add python-docx dependency

* added function for extracting metadata from word documents

* add openpyxl

* added get_jpg_metadata; fixed typing

* bump changelog

* added pillow to dependencies
2022-12-26 09:34:36 -05:00
Matt Robinson
836f156582
docs: Add LabelStudio sentiment analysis example (#24)
* added documentation on how to use unstructured with labelstudio

* hard code risk narrative for docs

* link to create project call
2022-10-10 08:27:01 -04:00
Matt Robinson
5f40c78f25
Initial Release
2022-09-26 14:55:20 -07:00