cragwolfe 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							13d3559fa4 
							
						 
					 
					
						
						
							
chore: rename Element's "date" field to "last_modified" (#997)
						
						
						
						
Change the Element's date field name to the more specific last_modified so there is less room for confusion about what that field represents.
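As a rough illustration of what a `last_modified` value could hold, here is a minimal sketch that reads a file's modification time and stores it as an ISO 8601 string. `SketchMetadata` and `get_last_modified` are hypothetical names invented for this example. They are not the library's actual classes or helpers, and the UTC/ISO 8601 format is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SketchMetadata:
    """Hypothetical stand-in for an element's metadata record."""

    filename: Optional[str] = None
    # Named for what it represents: when the source file was last changed.
    last_modified: Optional[str] = None


def get_last_modified(path: str) -> str:
    """Return the file's modification time as an ISO 8601 string in UTC.

    Assumption: UTC with second precision; the real format may differ.
    """
    mtime = os.path.getmtime(path)
    return datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat(timespec="seconds")


if __name__ == "__main__":
    # Create a throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
        handle.write(b"example")
        path = handle.name
    try:
        meta = SketchMetadata(
            filename=os.path.basename(path),
            last_modified=get_last_modified(path),
        )
        print(meta)
    finally:
        os.remove(path)
```

A field called `last_modified` says which date it holds, whereas a bare `date` could equally mean a creation, sent, or processing date.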
						
						
					 
					
						2023-08-01 02:55:43 +00:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d9aed66b65 
							
						 
					 
					
						
						
							
feat: add document date for remaining file types (#930) (#969)
						
						
						
						
* feat: add document date for remaining file types (#930)
* feat: add functions for getting modification date
* feat: add date field to metadata from csv file
* feat: add tests for csv partition
* feat: add date field to metadata from html file
* feat: add tests for html partition
* fix: return file name only if possible
* feat: add csv tests
* fix: renaming
* feat: add field metadata_date as date of last mod
* feat: add tests for partition_docx
* feat: add field metadata_date to .doc file
* feat: add tests for partition_doc
* feat: add metadata_date  to .epub file
* feat: add tests for partition_epub
* fix: fix test mocking
* feat: add metadata_date for image partition
* feat: add test for image partition
* feat: add coordinate system argument
* feat: add date to element metadata
* feat: add metadata_date for JSON partition
* feat: add test for JSON partition
* fix: rename variable
* feat: add metadata_date for md partition
* feat: add test for md partition
* feat: update doc string
* feat: add metadata_date for .odt partition
* feat: update .odt string
* feat: add metadata_date for .org partition
* feat: add tests for .org partition
* feat: add metadata_date for .pdf partition
* feat: add tests for .pdf partition
* feat: add metadata_date for .pptx partition
* feat: add metadata_date for .ppt partition
* feat: add tests for .ppt partition
* feat: add tests for .pptx partition
* feat: add metadata_date for .rst partition
* feat: add tests for .rst partition
* fix: get modification date after file checking
* feat: add tests for .rtf partition
* feat: add tests for .rtf partition
* feat: add metadata_date for .txt partition
* fix: rename argument
* feat: add tests for .txt partition
* feat: update doc string rst partition function
* feat: add metadata_date for .tsv partition
* feat: add tests for .tsv partition
* feat: add metadata_date for .xlsx partition
* feat: add tests for .xlsx partition
* fix: clean up
* feat: add tests for .xml partition
* feat: add tests for .xml partition
* fix: use `or` instead of `if`
* fix: fix epub tests
* fix: remove not used code
* fix: add try block for getting file name
* fix: applying linter changes
* fix: fix test_partition_file
* feat: add metadata_date for email
* feat: add test for email partition
* feat: add metadata_date for msg
* feat: add tests for msg partition
* feat: update CHANGELOG file
* fix: update partitions doc string
* don't push
* fix: clean up code
* linting, linting, linting
* remove unnecessary example doc
* update version and changelog
* ingest-test-fixtures-update
* set metadata date in test
---------
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
* ingest-test-fixtures-update
* Update ingest test fixtures (#970)
Co-authored-by: MthwRobinson <MthwRobinson@users.noreply.github.com>
* Revert "Update ingest test fixtures (#970)"
This reverts commit 1d182ae474b3545b15551fffc15977757d552cd2.
* remove date from metadata in outputs
* update docstring ordering
* remove print
* remove print
* remove print
* linting, linting, linting
* fix version and test
* fix changelog
* fix changelog
* update version
---------
Co-authored-by: kravetsmic <79907559+kravetsmic@users.noreply.github.com>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: MthwRobinson <MthwRobinson@users.noreply.github.com> 
						
						
					 
					
						2023-07-26 15:10:14 -04:00 
						 
				 
			
				
					
						
							
							
								John 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dc6d7d7268 
							
						 
					 
					
						
						
							
feat: add metadata_filename parameter across all partition functions (#811)
						
						
						
						
						* fix conflicts
* add tests and clean metadata_filename in partitions
* fix test_email and remove comments
* make tidy/check
* update changelog and version
* fix tests
* make tidy again 
						
						
					 
					
						2023-07-05 16:02:22 -04:00 
						 
				 
			
				
					
						
							
							
								John 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e9fdbb0943 
							
						 
					 
					
						
						
							
feat: add include_metadata across all partition functions (#853)
						
						
						
						
						* add include_metadata kwarg and tests to parsers
add exclude_metadata to docx
add test for doc to exclude metadata
add include_metadata kwarg to email
add include_metadata kwarg to epub
add include_metadata kwarg to json
add exclude_metadata tests to md
add include_metadata kwarg and tests for msg parse
add include_metadata kwarg and tests for odt parse
add include_metadata kwarg and tests for org parse
add include_metadata kwarg and tests for ppt and pptx parse
add include_metadata kwarg and tests for rst parse
add include_metadata kwarg and tests for rtf parse
add include_metadata tests for text parse
add include_metadata tests for tsv parse
add include_metadata tests for xlsx parse
add include_metadata tests for xml parse
* WIP add include_metadata to partition_pdf
* add include_metadata tests to partition_pdf
* make tidy/check
* update changelog and version
* change test asserts and move docstring logic to process_metadata
* make tidy
* fix tests asserts
* linting, linting, linting
* sync versions
* skip api call test not on main
---------
Co-authored-by: Matt Robinson <mrobinson@unstructured.io>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> 
						
						
					 
					
						2023-06-30 10:44:46 -04:00 
						 
				 
			
				
					
						
							
							
								Matt Robinson 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c82fdb6a89 
							
						 
					 
					
						
						
							
feat: partition_rst for reStructuredText documents (#725)
						
						
						
						
						* add example rst file
* filetype detection for rst files
* add partition_rst function
* add partition_rst to auto
* update readme
* update docs
* changelog and version
* pandocs -> pandoc
* fix typo 
						
						
					 
					
						2023-06-12 19:31:10 +00:00