haystack/releasenotes/notes/document-backward-compatible-07c4151e98ef8511.yaml
Silvano Cerza 76d5142bb8
Refactor: Document serialization and backward compatibility (#6180)
* Rework Document serialisation

* Make Document backward compatible

* Fix InMemoryDocumentStore filters

* Fix InMemoryDocumentStore.bm25_retrieval

* Add release notes

* Fix pylint failures

* Enhance Document kwargs handling and docstrings

* cosmetics

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 17:03:06 +01:00


---
prelude: >
    The `Document` serialisation has changed substantially, which makes it easier to implement
    new Document Stores.
    The most notable change is that the `Document.flatten()` method has been removed.
    `Document.to_dict()` now has a `flatten` parameter, which defaults to `True` for backward compatibility.
    When set, it places metadata keys at the same level as the other `Document` fields when converting to `dict`.
    `to_json` and `from_json` have been removed, as `to_dict` and `from_dict` already handle serialisation
    of `dataframe` and `blob` fields.
    `metadata` must now contain only primitives that can be serialised to JSON without custom encoders.
    A Document Store that needs custom serialisation can implement its own logic.
    `Document` has also been made backward compatible, so it can be created from dictionaries
    structured like the legacy 1.x `Document`. Legacy fields are converted automatically to
    their new counterparts, or ignored if there is none.
preview:
  - |
    Refactor Document serialisation and make it backward compatible with Haystack 1.x.