# Google Doc Loader

This loader takes in IDs of Google Docs and parses their text into `Document`s. You can extract a Google Doc's ID directly from its URL. For example, the ID of `https://docs.google.com/document/d/1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec/edit` is `1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec`.
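
If you have full URLs rather than bare IDs, the extraction described above can be sketched with a regular expression. The helper name `extract_gdoc_id` is hypothetical and not part of the loader, and the pattern assumes the URL follows the `/document/d/<ID>/` shape shown in the example.

```python
import re
from typing import Optional

# Assumption: Google Docs URLs contain the ID right after "/document/d/",
# as in the example URL above. Other URL shapes are not handled.
_DOC_ID_PATTERN = re.compile(r"/document/d/([a-zA-Z0-9_-]+)")


def extract_gdoc_id(url: str) -> Optional[str]:
    """Return the document ID embedded in a Google Docs URL, or None if absent."""
    match = _DOC_ID_PATTERN.search(url)
    return match.group(1) if match else None


url = "https://docs.google.com/document/d/1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec/edit"
gdoc_ids = [extract_gdoc_id(url)]
print(gdoc_ids)
```

The resulting `gdoc_ids` list can then be passed to the loader as `document_ids`.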

As a prerequisite, you need to register with Google and generate a `credentials.json` file, placed in the directory where you run this loader. See [Google's guide to creating credentials](https://developers.google.com/workspace/guides/create-credentials) for instructions.
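
Because the loader expects `credentials.json` in the working directory, a quick pre-flight check can catch a missing or malformed file early. The following is a minimal sketch: `load_credentials_file` is a hypothetical helper, not part of the loader, and it only confirms that the file exists and parses as JSON. It does not validate the credentials with Google.

```python
import json
from pathlib import Path


def load_credentials_file(directory: str = ".") -> dict:
    """Read credentials.json from `directory`, failing early if it is missing.

    Hypothetical helper for illustration; the loader performs its own
    credential handling, which this does not replace.
    """
    path = Path(directory) / "credentials.json"
    if not path.is_file():
        raise FileNotFoundError(f"Expected a credentials file at {path}")
    with path.open() as f:
        # json.load raises json.JSONDecodeError if the file is not valid JSON.
        return json.load(f)
```

Calling `load_credentials_file()` before constructing the reader turns a missing file into an immediate, readable error.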

## Usage

To use this loader, pass in a list of Google Doc IDs.

```python
from llama_index import download_loader

GoogleDocsReader = download_loader('GoogleDocsReader')

gdoc_ids = ['1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec']
loader = GoogleDocsReader()
documents = loader.load_data(document_ids=gdoc_ids)
```

## Examples

This loader is designed to load data into [LlamaIndex](https://github.com/jerryjliu/gpt_index/tree/main/gpt_index), and the resulting index can then be used as a Tool in a [LangChain](https://github.com/hwchase17/langchain) Agent.

### LlamaIndex

```python
from llama_index import GPTVectorStoreIndex, download_loader

GoogleDocsReader = download_loader('GoogleDocsReader')

gdoc_ids = ['1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec']
loader = GoogleDocsReader()
documents = loader.load_data(document_ids=gdoc_ids)
index = GPTVectorStoreIndex.from_documents(documents)
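# Since the 0.6.0 API, queries go through a query engine rather than index.query()
query_engine = index.as_query_engine()
response = query_engine.query('Where did the author go to school?')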
```

### LangChain

Note: Make sure you change the description of the `Tool` to match your use case.

```python
from llama_index import GPTVectorStoreIndex, download_loader
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory

GoogleDocsReader = download_loader('GoogleDocsReader')

gdoc_ids = ['1wf-y2pd9C878Oh-FmLH7Q_BQkljdm6TQal-c1pUfrec']
loader = GoogleDocsReader()
documents = loader.load_data(document_ids=gdoc_ids)
index = GPTVectorStoreIndex.from_documents(documents)

tools = [
    Tool(
        name="Google Doc Index",
        # Query via a query engine (0.6.0 API) and return plain text to the agent
        func=lambda q: str(index.as_query_engine().query(q)),
        description="Useful when you want to answer questions about the Google Documents.",
    ),
]
llm = OpenAI(temperature=0)
memory = ConversationBufferMemory(memory_key="chat_history")
agent_chain = initialize_agent(
    tools, llm, agent="zero-shot-react-description", memory=memory
)

output = agent_chain.run(input="Where did the author go to school?")
```