fix: postgres destination connector serialization (#2411)

This fixes the serialization of the Postgres destination connector.
Presence of the _client object breaks serialization due to TypeError:
cannot pickle '_thread.lock' object. This removes that object before
serialization.
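
The failure and the fix described above can be reproduced outside the library. The following is a minimal sketch, not the connector's actual code: `FakeClient` and `DemoConnector` are hypothetical names, and the standard library's `dataclasses.asdict` stands in for the project's `_asdict` helper on the assumption that both deep-copy field values.

```python
import copy
import threading
import typing as t
from dataclasses import asdict, dataclass, field


class FakeClient:
    """Hypothetical stand-in for a database client that owns a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()


@dataclass
class DemoConnector:
    """Hypothetical connector; not the real SqlDestinationConnector."""

    table_name: str
    _client: t.Optional[t.Any] = field(init=False, default=None)

    def connect(self) -> None:
        self._client = FakeClient()

    def to_dict(self) -> dict:
        # Shallow-copy so the live connector keeps its client,
        # then clear the client on the copy before serializing.
        self_cp = copy.copy(self)
        self_cp._client = None
        return asdict(self_cp)


connector = DemoConnector(table_name="elements")
connector.connect()

# Naive serialization deep-copies _client, which holds a lock.
try:
    asdict(connector)
    naive_error = None
except TypeError as exc:
    naive_error = str(exc)

# The patched path drops the client first and succeeds.
serialized = connector.to_dict()
```

Because only the shallow copy is modified, the original connector still holds its client after `to_dict()` returns. A connector rebuilt from the dictionary has to reconnect.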
ryannikolaidis 2024-01-17 09:39:32 -08:00 committed by GitHub
parent ae24136238
commit f23f20c1dc
GPG Key ID: B5690EEEBB952194
3 changed files with 17 additions and 2 deletions

@@ -1,4 +1,4 @@
-## 0.12.1-dev10
+## 0.12.1-dev11
 ### Enhancements
@@ -21,6 +21,7 @@
 * **Pin version of unstructured-client** Set minimum version of unstructured-client to avoid raising a TypeError when passing `api_key_auth` to `UnstructuredClient`
 * **Fix the serialization of the Pinecone destination connector.** Presence of the PineconeIndex object breaks serialization due to TypeError: cannot pickle '_thread.lock' object. This removes that object before serialization.
 * **Fix the serialization of the Elasticsearch destination connector.** Presence of the _client object breaks serialization due to TypeError: cannot pickle '_thread.lock' object. This removes that object before serialization.
+* **Fix the serialization of the Postgres destination connector.** Presence of the _client object breaks serialization due to TypeError: cannot pickle '_thread.lock' object. This removes that object before serialization.
 * **Fix documentation and sample code for Chroma.** Was pointing to wrong examples..
 ## 0.12.0

@@ -1 +1 @@
-__version__ = "0.12.1-dev10" # pragma: no cover
+__version__ = "0.12.1-dev11" # pragma: no cover

@@ -1,9 +1,11 @@
+import copy
 import json
 import typing as t
 import uuid
 from dataclasses import dataclass, field
 from unstructured.ingest.enhanced_dataclass import enhanced_field
+from unstructured.ingest.enhanced_dataclass.core import _asdict
 from unstructured.ingest.error import DestinationConnectionError
 from unstructured.ingest.interfaces import (
     AccessConfig,
@@ -68,6 +70,18 @@ class SqlDestinationConnector(BaseDestinationConnector):
     connector_config: SimpleSqlConfig
     _client: t.Optional[t.Any] = field(init=False, default=None)
+    def to_dict(self, **kwargs):
+        """
+        The _client variable in this dataclass breaks deepcopy due to:
+        TypeError: cannot pickle '_thread.lock' object
+        When serializing, remove it, meaning client data will need to be reinitialized
+        when deserialized
+        """
+        self_cp = copy.copy(self)
+        if hasattr(self_cp, "_client"):
+            setattr(self_cp, "_client", None)
+        return _asdict(self_cp, **kwargs)
     @property
     def client(self):
         if self._client is None: