Mirror of https://github.com/Unstructured-IO/unstructured.git (synced 2025-07-12 11:35:53 +00:00)

### Description

This PR is two-fold:

**Embeddings:**
* Embeddings are incorporated into the Sharepoint source connector, which now calls out to OpenAI and creates embeddings if the flag is passed in and the API key is provided.

**Writing vector content (embeddings) to Azure cognitive search index:**
* The schema for the index expected to exist in Azure has been updated to include the vector field type, and a test script has been added that pushes the embedding content produced by the Sharepoint connector to the index.

Some important notes about other changes in here:
* The embedding code had to be updated to patch the `to_dict` method on elements so that `embeddings` is added to the dict output when it is present. The code originally added the embedding content to the elements, but that content was lost when `to_dict` was called to save the content as JSON.
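
The `to_dict` patch described in the last note can be illustrated with a minimal sketch. This is not the code from this PR: the `Element` class below is a hypothetical stand-in rather than the library's real element class, and `attach_embedding` is an assumed helper name. It only shows the general idea of wrapping an element's `to_dict` so that an attached embedding survives serialization to JSON.

```python
import json
import types
from typing import Any, Dict, List


class Element:
    """Hypothetical stand-in for an element class (not the library's real one)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def to_dict(self) -> Dict[str, Any]:
        return {"type": type(self).__name__, "text": self.text}


def attach_embedding(element: Element, embedding: List[float]) -> Element:
    """Assumed helper: store the embedding and patch to_dict to include it."""
    element.embeddings = embedding
    # Capture the original bound method before replacing it on the instance.
    original_to_dict = element.to_dict

    def to_dict_with_embeddings(self: Element) -> Dict[str, Any]:
        data = original_to_dict()
        # Only add the key when an embedding was actually attached.
        if getattr(self, "embeddings", None) is not None:
            data["embeddings"] = self.embeddings
        return data

    # Patch only this instance; other elements keep the unpatched method.
    element.to_dict = types.MethodType(to_dict_with_embeddings, element)
    return element


element = attach_embedding(Element("hello"), [0.1, 0.2, 0.3])
serialized = json.dumps(element.to_dict())
```

Patching the instance rather than the class keeps the change local to elements that actually received an embedding, so elements processed without the embedding flag serialize exactly as before.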