mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-12-25 06:04:53 +00:00
### Description
Move all fsspec connectors over to the new framework.

A variety of bugs were also found and fixed in this PR:
* The custom json mixin used for the enhanced dataclass would break if typing was quoted. That was fixed. A check was also added to the enhanced dataclass to prevent `InitVar` from being used in the root dataclass, since this breaks serialization.
* Hashing for the partitioner was using the filename of the raw file being partitioned rather than the file name of the file data generated from indexing. This means that multiple files could result in the same partition hash when the recursive flag is passed in. This was updated to use the file data file name instead.

---------

Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: rbiseck3 <rbiseck3@users.noreply.github.com>
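The hash collision described in the second fix can be illustrated with a minimal sketch. This is not the project's actual code: `partition_hash`, the `strategy` setting, and the `nested-1/...` path are hypothetical names chosen for illustration, and the sketch assumes the hash is derived from a file name plus the partitioner settings. It shows one plausible reading of the bug: two files with the same base name in different directories collide when only the base name is hashed.

```python
import hashlib
import json
from pathlib import PurePosixPath


def partition_hash(name: str, settings: dict) -> str:
    """Hypothetical helper: hash a file name together with partitioner settings."""
    payload = json.dumps({"name": name, **settings}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Two remote files that share a base name, as a recursive listing might yield.
# The first path is invented for this example.
remote_paths = [
    "nested-1/ideas-page.html",
    "nested-2/nested/ideas-page.html",
]
settings = {"strategy": "auto"}  # assumed setting, for illustration only

# Hashing only the base name: both files map to the same hash.
by_base_name = {partition_hash(PurePosixPath(p).name, settings) for p in remote_paths}

# Hashing a name that is unique per indexed file: the hashes stay distinct.
by_unique_name = {partition_hash(p, settings) for p in remote_paths}

print(len(by_base_name), len(by_unique_name))
```

Any per-file identifier that is unique across the recursive listing would avoid the collision; the full remote path is used here only because it is the simplest such identifier.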
24 lines
2.3 KiB
JSON
[
    {
        "type": "Table",
        "element_id": "32bc8af17151389d3e80f65036f8e65b",
        "text": "January 2023 ( Someone fed my essays into GPT to make something that could answer\nquestions based on them, then asked it where good ideas come from. The\nanswer was ok, but not what I would have said. This is what I would have said.) The way to get new ideas is to notice anomalies: what seems strange,\nor missing, or broken? You can see anomalies in everyday life (much\nof standup comedy is based on this), but the best place to look for\nthem is at the frontiers of knowledge. Knowledge grows fractally.\nFrom a distance its edges look smooth, but when you learn enough\nto get close to one, you'll notice it's full of gaps. These gaps\nwill seem obvious; it will seem inexplicable that no one has tried\nx or wondered about y. In the best case, exploring such gaps yields\nwhole new fractal buds.",
        "metadata": {
            "text_as_html": "<table><tr><td></td><td></td><td>January 2023 ( Someone fed my essays into GPT to make something that could answer<br/>questions based on them, then asked it where good ideas come from. The<br/>answer was ok, but not what I would have said. This is what I would have said.) The way to get new ideas is to notice anomalies: what seems strange,<br/>or missing, or broken? You can see anomalies in everyday life (much<br/>of standup comedy is based on this), but the best place to look for<br/>them is at the frontiers of knowledge. Knowledge grows fractally.<br/>From a distance its edges look smooth, but when you learn enough<br/>to get close to one, you'll notice it's full of gaps. These gaps<br/>will seem obvious; it will seem inexplicable that no one has tried<br/>x or wondered about y. In the best case, exploring such gaps yields<br/>whole new fractal buds.</td></tr></table>",
            "languages": [
                "eng"
            ],
            "filetype": "text/html",
            "data_source": {
                "url": "gs://utic-test-ingest-fixtures/nested-2/nested/ideas-page.html",
                "version": "113813498010717860141768546590661839404",
                "record_locator": {
                    "protocol": "gs",
                    "remote_file_path": "gs://utic-test-ingest-fixtures/"
                },
                "date_created": "1687304904.586",
                "date_modified": "1687304904.586"
            }
        }
    }
]