unstructured/CHANGELOG.md
Matt Robinson 08e091c5a9
chore: Reorganize partition bricks under partition directory (#76)
* move partition_pdf to partition folder

* move partition.py

* refactor partioning bricks into partition diretory

* import to nlp for backward compatibility

* update docs

* update version and bump changelog

* fix typo in changelog

* update readme reference
2022-11-21 22:27:23 +00:00

1.3 KiB

0.3.0-dev2

  • Removing the local PDF parsing code and any dependencies and tests.
  • Reorganizes the staging bricks in the unstructured.partition module

0.2.6

  • Small change to how _read is placed within the inheritance structure since it doesn't really apply to pdf
  • Add partitioning brick for calling the document image analysis API

0.2.5

  • Update python requirement to >=3.7

0.2.4

  • Add alternative way of importing Final to support google colab

0.2.3

  • Add cleaning bricks for removing prefixes and postfixes
  • Add cleaning bricks for extracting text before and after a pattern

0.2.2

  • Add staging brick for Datasaur

0.2.1

  • Added brick to convert an ISD dictionary to a list of elements
  • Update PDFDocument to use the from_file method
  • Added staging brick for CSV format for ISD (Initial Structured Data) format.
  • Added staging brick for separating text into attention window size chunks for transformers.
  • Added staging brick for LabelBox.
  • Added ability to upload LabelStudio predictions
  • Added utility function for JSONL reading and writing
  • Added staging brick for CSV format for Prodigy
  • Added staging brick for Prodigy
  • Added ability to upload LabelStudio annotations
  • Added text_field and id_field to stage_for_label_studio signature

0.2.0

  • Initial release of unstructured