sparkbrains 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2b88890210 
							
						 
					 
					
						
						
							
							docs: customize sphinx doc theme ( #192 )  
						
						
						
						
						* feature: adding a feature for customizing color theme of sphinx docs
* fix: adding changelog and comments
* Adding css for changing colors of sidebar
* fix: removing changelog description 
						
						
							
						
					 
					
						2023-02-06 17:30:55 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							782b4352ec 
							
						 
					 
					
						
						
							
							build(deps): weekly dependency update; reduce dependabot frequency ( #194 )  
						
						
						
						
						* deps: pip-compile to update dependencies
* bump version
* linting, linting, linting
* typo 
						
						
							
						
					 
					
						2023-02-06 16:39:29 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							014585e872 
							
						 
					 
					
						
						
							
							fix: preserve the order of shapes in partition_pptx output ( #193 )  
						
						
						
						
						* order the shapes top to bottom and left to right
* added tests for ordering
* update change log and bump version
* more tests
* don't need enumerate
* n -> on 
						
						
							
 
						
					 
					
						2023-02-03 22:12:33 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a7ca58e0bc 
							
						 
					 
					
						
						
							
							fix: more english words; split on punctuation ( #191 )  
						
						
						
						
						* add a bigger list of english words
* update thresholds and add tests
* update docs; bump version
* fix version
* add additional english words back in
* linting, linting, linting
* add slashes
* work -> word 
						
						
							
						
					 
					
						2023-02-02 17:25:47 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0589344ff7 
							
						 
					 
					
						
						
							
							fix: require a minimum prop of alpha characters for titles and narrative text ( #190 )  
						
						
						
						
						* added alpha ratio check
* added tests for alpha ratio
* bump changelog and update docs
* update changelog/version; update docs
* ofr -> or 
						
						
							
						
					 
					
						2023-02-02 14:59:04 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1230a163fd 
							
						 
					 
					
						
						
							
							feat: set a user controlled max word length for titles ( #189 )  
						
						
						
						
						* update the docs
* add option for title max word length
* bump version; update changelog
* change max length to 12
* docs updates
* to -> too 
						
						
							
						
					 
					
						2023-02-01 19:32:16 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2d08fcbf83 
							
						 
					 
					
						
						
							
							fix: titles and narrative text need at least one english word ( #188 )  
						
						
						
						
						* added check for english words
* update docs
* at least one word needs to have multiple characters
* bump change log 
						
						
							
						
					 
					
						2023-02-01 09:10:48 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d0bf8904fa 
							
						 
					 
					
						
						
							
							docs: example notebooks from community repo ( #187 )  
						
						
						
						
							
						
					 
					
						2023-01-31 10:37:32 -05:00 
						 
				 
			
				
					
						
							
							
								sparkbrains 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							243bf7ed5e 
							
						 
					 
					
						
						
							
							test: Increase coverage ( #181 )  
						
						
						
						
							
						
					 
					
						2023-01-30 22:47:09 -08:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f36e514c6d 
							
						 
					 
					
						
						
							
							build(deps): weekly dependency bump ( #183 )  
						
						
						
						
							
						
					 
					
						2023-01-30 11:05:48 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e6cfde5c4a 
							
						 
					 
					
						
						
							
							fix: no UserWarning when partition_pdf is called ( #179 )  
						
						
						
						
							
						
					 
					
						2023-01-27 12:08:18 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							339c133326 
							
						 
					 
					
						
						
							
							fix: cleanup from live .docx tests ( #177 )  
						
						
						
						
						* add env var for cap threshold; raise default threshold
* update docs and tests
* added check for ending in a comma
* update docs
* no caps check for all upper text
* capture Text in html and text
* check category in Text equality check
* lower case all caps before checking for verbs
* added check for us city/state/zip
* added address type
* add address to html
* add address to text
* fix for text tests; escape for large text segments
* refactor regex for readability
* update comment
* additional test for text with linebreaks
* update docs
* update changelog
* update elements docs
* remove old comment
* case -> cast
* type fix 
						
						
							
						
					 
					
						2023-01-26 15:52:25 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1ce8447ba7 
							
						 
					 
					
						
						
							
							build(deps): bump unstructured inference; compile from setup.py ( #176 )  
						
						
						
						
						* bump unstructured inference; compile from setup.py
* bump version
* compile the local-inference extra
* linting, linting, linting 
						
						
							
 
						
					 
					
						2023-01-25 16:32:57 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							26a5546152 
							
						 
					 
					
						
						
							
							fix: handle xml filetype detection on amazon linux ( #173 )  
						
						
						
						
						* fix: handle xml filetype detection on amazon linux
* option for html or xml
* fix typo
* back to dev tag 
						
						
							
						
					 
					
						2023-01-25 11:20:01 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3b6546515d 
							
						 
					 
					
						
						
							
							docs: add links to linkedin and slack ( #175 )  
						
						
						
						
							
						
					 
					
						2023-01-24 13:51:10 -08:00 
						 
				 
			
				
					
						
							
							
								qued 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d2909ac688 
							
						 
					 
					
						
						
							
							chore: update all deps ( #172 )  
						
						
						
						
							
						
					 
					
						2023-01-23 13:03:02 -06:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8b6c5fac9d 
							
						 
					 
					
						
						
							
							feat: basic PowerPoint parsing in partition_pptx ( #166 )  
						
						
						
						
						* partition pptx and tests
* add partition_pptx to auto
* update doc types in readme
* add pptx docs
* bump version
* remove extra whitespace
* partition -> partitioning 
						
						
							
						
					 
					
						2023-01-23 17:03:09 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8d3e616846 
							
						 
					 
					
						
						
							
							feat: add ability to parse LayoutElement lists ( #165 )  
						
						
						
						
						* added ability to split list items
* changelog and version bump
* retrigger ci 
						
						
							
						
					 
					
						2023-01-20 08:55:11 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c1822911a5 
							
						 
					 
					
						
						
							
							chore: return Element objects in partition_pdf and partition_image ( #164 )  
						
						
						
						
						* helper function to convert to element
* test for element types
* fix for healthcheck url
* version bump
* note on coordinates
* mention FigureCaption
* test_shared -> test_common
* add check boxes for checkbox template
* update changelog 
						
						
							
						
					 
					
						2023-01-19 14:29:28 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							59f972d739 
							
						 
					 
					
						
						
							
							build(deps): add requests as a base dependency ( #162 )  
						
						
						
						
						* build(deps): add `requests` as a base dependency
* linting, linting, linting
* changelog typo 
						
						
							
 
						
					 
					
						2023-01-18 16:36:23 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							74ce2ae6e5 
							
						 
					 
					
						
						
							
							fix: update detect_filetype to properly handle older office files ( #161 )  
						
						
						
						
							
						
					 
					
						2023-01-18 11:18:20 -05:00 
						 
				 
			
				
					
						
							
							
								Mallori Harrell 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							08ccee0acb 
							
						 
					 
					
						
						
							
							chore: Fix parse received data ( #143 )  
						
						
						
						
						* fix parse_received data 
						
						
							
						
					 
					
						2023-01-17 16:36:44 -06:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							749f9c6be8 
							
						 
					 
					
						
						
							
							fix: avoid divide by zero in exceeds_cap_ratio ( #160 )  
						
						
						
						
							
						
					 
					
						2023-01-17 15:22:12 -05:00 
						 
				 
			
				
					
						
							
							
								gokullan 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5d9183dc99 
							
						 
					 
					
						
						
							
							chore: graceful exit if sed is an old version ( #157 )  
						
						
						
						
							
						
					 
					
						2023-01-17 18:11:14 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							9c3c14e94d 
							
						 
					 
					
						
						
							
							fix: resolves UnicodeDecodeError in partition_email for emails with attachments ( #158 )  
						
						
						
						
						* split emails by \n=
* added test for equivalence between html and plain text
* changelog and bump version
* add check for content disposition 
						
						
							
 
						
					 
					
						2023-01-17 11:33:45 -05:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7ed5f71e30 
							
						 
					 
					
						
						
							
							build(deps): Bump packaging from 22.0 to 23.0 in /requirements ( #156 )  
						
						
						
						
						Bumps [packaging](https://github.com/pypa/packaging ) from 22.0 to 23.0.
- [Release notes](https://github.com/pypa/packaging/releases )
- [Changelog](https://github.com/pypa/packaging/blob/main/CHANGELOG.rst )
- [Commits](https://github.com/pypa/packaging/compare/22.0...23.0 )
---
updated-dependencies:
- dependency-name: packaging
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-17 11:03:03 -05:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							04c1813c7f 
							
						 
					 
					
						
						
							
							build(deps): Bump filelock from 3.8.2 to 3.9.0 in /requirements ( #152 )  
						
						
						
						
						Bumps [filelock](https://github.com/tox-dev/py-filelock ) from 3.8.2 to 3.9.0.
- [Release notes](https://github.com/tox-dev/py-filelock/releases )
- [Changelog](https://github.com/tox-dev/py-filelock/blob/main/docs/changelog.rst )
- [Commits](https://github.com/tox-dev/py-filelock/compare/3.8.2...3.9.0 )
---
updated-dependencies:
- dependency-name: filelock
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructured.io> 
						
						
							
						
					 
					
						2023-01-17 15:40:26 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							49392a2955 
							
						 
					 
					
						
						
							
							build(deps): Bump requests from 2.28.1 to 2.28.2 in /requirements ( #154 )  
						
						
						
						
						Bumps [requests](https://github.com/psf/requests ) from 2.28.1 to 2.28.2.
- [Release notes](https://github.com/psf/requests/releases )
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md )
- [Commits](https://github.com/psf/requests/compare/v2.28.1...v2.28.2 )
---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-17 10:28:23 -05:00 
						 
				 
			
				
					
						
							
							
								qued 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8abf1f119d 
							
						 
					 
					
						
						
							
							feat: partition image ( #144 )  
						
						
						
						
						Adds partition_image to partition image file types, which is integrated into the partition brick. This relies on the 0.2.2 version of unstructured-inference. 
						
						
							
						
					 
					
						2023-01-13 22:24:13 -06:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							419c0867d3 
							
						 
					 
					
						
						
							
							build(deps): bump unstructured_inference version range ( #151 )  
						
						
						
						
						* bump unstructured-inference to 0.2.3
* bump version 
						
						
							
 
						
					 
					
						2023-01-13 22:21:36 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f12240c5e7 
							
						 
					 
					
						
						
							
							feat: add support for .txt files in partition ( #150 )  
						
						
						
						
						* added partition_text for auto
* rename partition_text tests
* bump version and update docs 
						
						
							
						
					 
					
						2023-01-13 16:39:53 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							eba4c80b1e 
							
						 
					 
					
						
						
							
							feat: get_directory_file_info for exploring a directory of files ( #142 )  
						
						
						
						
						* added python-pptx to requirements
* added filetype detection for powerpoint
* add more filetypes to detect
* more tests
* added tests for filetype
* reorder document types
* tests for get_directory_file_info
* added docs for get_directory_file_info
* bump version
* Word -> Office
* added test for filetype
* add group by filetype example 
						
						
							
 
						
					 
					
						2023-01-11 12:40:50 -05:00 
						 
				 
			
				
					
						
							
							
								qued 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7e3af6c609 
							
						 
					 
					
						
						
							
							chore: remove extra requirements.txt ( #140 )  
						
						
						
						
							
						
					 
					
						2023-01-10 22:12:10 -06:00 
						 
				 
			
				
					
						
							
							
								Mallori Harrell 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e0feba83f6 
							
						 
					 
					
						
						
							
							feat: Add Image element and find_embedded_image function ( #130 )  
						
						
						
						
						* add find_embedded_image 
						
						
							
						
					 
					
						2023-01-09 19:49:19 -06:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7b3b594ee5 
							
						 
					 
					
						
						
							
							fix: correct make install-ci target ( #138 )  
						
						
						
						
						* fix install-ci make target
* add note to readme about libmagic
* remove mydoc.docx
* remove local-inference 
						
						
							
						
					 
					
						2023-01-09 17:03:09 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5376bc510f 
							
						 
					 
					
						
						
							
							feat: generic partition brick with filetype detection ( #132 )  
						
						
						
						
						* add python-magic
* first pass on filetype detection
* tests for filetype detection
* more tests for file detection
* added tests for error conditions
* install libmagic dev in github
* libmagic install instructions
* pattern for checking email files
* support reading .eml in rb mode
* add auto partition function
* auto tests for email
* auto tests for docx
* added tests for html
* add pdf and html tests
* linting, linting, linting
* added docs for auto partitioning
* update readme with generic partition brick
* bumped version
* added test for bad type
* detect .docx files from application/octet-stream
* linting, linting, linting
* identify xlsx from octet stream
* install poppler in ci
* fix mocks; test for unknown type
* install poppler utils
* install in one line
* only poppler-utils
* file extension logic from application/octet-stream
* install local inference for ci
* install detectron2
* removing unused dockerfile 
						
						
							
						
					 
					
						2023-01-09 16:15:14 -05:00 
						 
				 
			
				
					
						
							
							
								Mallori Harrell 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d7a00046a9 
							
						 
					 
					
						
						
							
							feat: Add new functionality to parse text and header of emails ( #111 )  
						
						
						
						
						* partition_text function 
						
						
							
						
					 
					
						2023-01-09 17:08:08 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7fb8713527 
							
						 
					 
					
						
						
							
							build(deps): Bump black from 22.10.0 to 22.12.0 in /requirements ( #137 )  
						
						
						
						
						Bumps [black](https://github.com/psf/black ) from 22.10.0 to 22.12.0.
- [Release notes](https://github.com/psf/black/releases )
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md )
- [Commits](https://github.com/psf/black/compare/22.10.0...22.12.0 )
---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-09 16:54:26 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5129809f00 
							
						 
					 
					
						
						
							
							build(deps): Bump numpy from 1.23.5 to 1.24.1 in /requirements ( #136 )  
						
						
						
						
						Bumps [numpy](https://github.com/numpy/numpy ) from 1.23.5 to 1.24.1.
- [Release notes](https://github.com/numpy/numpy/releases )
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst )
- [Commits](https://github.com/numpy/numpy/compare/v1.23.5...v1.24.1 )
---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-09 16:44:09 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f06189b292 
							
						 
					 
					
						
						
							
							build(deps-dev): Bump jupyter-core from 5.1.0 to 5.1.3 in /requirements ( #134 )  
						
						
						
						
						Bumps [jupyter-core](https://github.com/jupyter/jupyter_core ) from 5.1.0 to 5.1.3.
- [Release notes](https://github.com/jupyter/jupyter_core/releases )
- [Changelog](https://github.com/jupyter/jupyter_core/blob/main/CHANGELOG.md )
- [Commits](https://github.com/jupyter/jupyter_core/compare/v5.1.0...v5.1.3 )
---
updated-dependencies:
- dependency-name: jupyter-core
  dependency-type: direct:development
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructured.io> 
						
						
							
						
					 
					
						2023-01-09 16:32:31 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ca8e1ee9f3 
							
						 
					 
					
						
						
							
							build(deps): Bump pydantic from 1.10.2 to 1.10.4 in /requirements ( #133 )  
						
						
						
						
						Bumps [pydantic](https://github.com/pydantic/pydantic ) from 1.10.2 to 1.10.4.
- [Release notes](https://github.com/pydantic/pydantic/releases )
- [Changelog](https://github.com/pydantic/pydantic/blob/v1.10.4/HISTORY.md )
- [Commits](https://github.com/pydantic/pydantic/compare/v1.10.2...v1.10.4 )
---
updated-dependencies:
- dependency-name: pydantic
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-09 11:22:14 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fee95b643c 
							
						 
					 
					
						
						
							
							feat: add partition_docx for Word documents ( #131 )  
						
						
						
						
						* first pass on docx parsing
* linting, linting, linting
* test docx with filename
* added documentation
* more tests; version bump
* typo
* another typo
* another typo!
* it -> its
* save -> saved
* remove None since it's the default argument 
						
						
							
						
					 
					
						2023-01-05 20:13:39 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							33b983fbf0 
							
						 
					 
					
						
						
							
							docs: instructions on how to install on Windows + conda ( #129 )  
						
						
						
						
						* add environment.yml
* instructions on how to install base package and detectron2
* added instructions on paddleocr
* remove covers
* install -> to install
* specified the shell
* updated example snippets
* update environment.yml
* updated the repo reference
* no more ands! 
						
						
							
						
					 
					
						2023-01-05 16:21:44 +00:00 
						 
				 
			
				
					
						
							
							
								Sebastian Laverde Alfonso 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							5a47eb06e9 
							
						 
					 
					
						
						
							
							feat: new bricks for removing and extracting ordered bullets ( #128 )  
						
						
						
						
						* feat: new cleaning brick for ordered bullets
* test: add test for cleaning ordered bullets
* feat: new brick for extracting ordered bullets
* test: add test for extracting ordered bullets
* docs: update CHANGELOG and bump new dev version
* chore: change extract ordered bullets return type to tuple
* chore: made tidy
* chore: regex to split on pattern instead of built-in
* chore: catch ValueError, made tidy and fix incompatible type
* chore: assertion statements in one line of code
* docs: add documentation for new clean and extract bricks to bricks.rst
* docs: refactor CHANGELOG 0.3.5.dev5 to dev6 with new bullets
* docs: update CHANGELOG 0.3.6-dev0 changes and bump version
Co-authored-by: Sebastian Laverde <sebastian@unstructured.io> 
						
						
							
						
					 
					
						2023-01-05 17:06:26 +01:00 
						 
				 
			
				
					
						
							
							
								qued 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a75499d465 
							
						 
					 
					
						
						
							
							feat: local inference ( #125 )  
						
						
						
						
						Splits partition_pdf into two paths, one used for local inference when url is None, another for inference via api when url is a string. 
						
						
							
 
						
					 
					
						2023-01-04 16:19:05 -06:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							17045aed80 
							
						 
					 
					
						
						
							
							feat: add convert_to_dataframe staging brick ( #127 )  
						
						
						
						
						* add pandas to deps; pip-compile
* staging brick to convert elements to dataframe
* bump version
* add convert_to_dataframe docs
* bump wheel version
* typo fix
* typo fix 2! 
						
						
							
						
					 
					
						2023-01-04 12:04:59 -05:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							445533745c 
							
						 
					 
					
						
						
							
							feat: helper functions to identify and extract phone numbers ( #124 )  
						
						
						
						
						* added pattern for finding phone numbers
* added cleaning brick for extracting phone numbers
* add docs
* changelog and bump version
* switch to us phone numbers
* bump dev version 
						
						
							
						
					 
					
						2023-01-03 13:31:05 -05:00 
						 
				 
			
				
					
						
							
							
								Mallori Harrell 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							509ad4951c 
							
						 
					 
					
						
						
							
							feat: Add extract_attachment_info ( #112 )  
						
						
						
						
						* Adds function to extract attachments and their metadata from eml files 
						
						
							
						
					 
					
						2023-01-03 11:41:54 -06:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							456735735c 
							
						 
					 
					
						
						
							
							build(deps): Bump pillow from 9.3.0 to 9.4.0 in /requirements ( #120 )  
						
						
						
						
						Bumps [pillow](https://github.com/python-pillow/Pillow ) from 9.3.0 to 9.4.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases )
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst )
- [Commits](https://github.com/python-pillow/Pillow/compare/9.3.0...9.4.0 )
---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-02 16:52:53 +00:00 
						 
				 
			
				
					
						
							
							
								dependabot[bot] 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f80d05c7b0 
							
						 
					 
					
						
						
							
							build(deps): Bump pytz from 2022.6 to 2022.7 in /requirements ( #122 )  
						
						
						
						
						Bumps [pytz](https://github.com/stub42/pytz ) from 2022.6 to 2022.7.
- [Release notes](https://github.com/stub42/pytz/releases )
- [Commits](https://github.com/stub42/pytz/compare/release_2022.6...release_2022.7 )
---
updated-dependencies:
- dependency-name: pytz
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> 
						
						
							
						
					 
					
						2023-01-02 16:42:49 +00:00