mirror of
https://github.com/run-llama/llama-hub.git
synced 2025-08-31 20:07:51 +00:00
47 lines
2.2 KiB
Markdown
47 lines
2.2 KiB
Markdown
# Confluence Loader
|
|
|
|
This loader loads pages from a given Confluence cloud instance. The user needs to specify the base URL for a Confluence
|
|
instance to initialize the ConfluenceReader - base URL needs to end with `/wiki`. The user can optionally specify
|
|
OAuth 2.0 credentials to authenticate with the Confluence instance. If no credentials are specified, the loader will
|
|
look for `CONFLUENCE_USERNAME` and `CONFLUENCE_API_TOKEN` environment variables to proceed with basic authentication.
|
|
|
|
For more on authenticating using OAuth 2.0, checkout:
|
|
- https://atlassian-python-api.readthedocs.io/index.html
|
|
- https://developer.atlassian.com/cloud/confluence/oauth-2-3lo-apps/
|
|
|
|
User then specify a list of `page_ids` and/or `space_key` to load in the corresponding pages into Document objects, if
|
|
both are specified the union of both sets will be returned. User can also specify a boolean `include_attachments` to
|
|
include attachments, this is set to `False` by default, if set to `True` all attachments will be downloaded and
|
|
ConfluenceReader will extract the text from the attachments and add it to the Document object. Currently supported attachment types are: PDF, PNG, JPEG/JPG, SVG, Word and Excel.
|
|
|
|
Hint: `space_key` and `page_id` can both be found in the URL of a page in Confluence - https://yoursite.atlassian.com/wiki/spaces/<space_key>/pages/<page_id>
|
|
|
|
## Usage
|
|
|
|
Here's an example usage of the ConfluenceReader.
|
|
|
|
```python
|
|
from llama_index import download_loader
|
|
|
|
ConfluenceReader = download_loader('ConfluenceReader')
|
|
|
|
token = {
|
|
access_token: "<access_token>",
|
|
token_type: "<token_type>"
|
|
}
|
|
oauth2_dict = {
|
|
"client_id": "<client_id>",
|
|
"token": token
|
|
}
|
|
|
|
base_url = "https://yoursite.atlassian.com/wiki"
|
|
|
|
page_ids = ["<page_id_1>", "<page_id_2>", "<page_id_3"]
|
|
space_key = "<space_key>"
|
|
|
|
reader = ConfluenceReader(base_url=base_url, oauth2=oauth2_dict)
|
|
documents = reader.load_data(space_key=space_key, page_ids=page_ids, include_attachments=True)
|
|
```
|
|
|
|
This loader is designed to be used as a way to load data into [LlamaIndex](https://github.com/jerryjliu/gpt_index/tree/main/gpt_index) and/or subsequently used as a Tool in a [LangChain](https://github.com/hwchase17/langchain) Agent. See [here](https://github.com/emptycrown/llama-hub/tree/main) for examples.
|