KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0171082cc5 
							
						 
					 
					
						
						
							
							fix create dialog bug ( #982 )  
						
						
						
						
						### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue) 
						
						
					 
					
						2024-05-30 09:25:05 +08:00 
						 
				 
			
				
					
						
							
							
								Zhedong Cen 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8dd45459be 
							
						 
					 
					
						
						
							
							Add support for HTML file ( #973 )  
						
						
						
						
						### What problem does this PR solve?
Add support for HTML file
### Type of change
- [x] New Feature (non-breaking change which adds functionality) 
						
						
					 
					
						2024-05-30 09:12:55 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7013d7f620 
							
						 
					 
					
						
						
							
							refine text decode ( #657 )  
						
						
						
						
						### What problem does this PR solve?
#651  
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue) 
						
						
					 
					
						2024-05-07 12:25:47 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8c07992b6c 
							
						 
					 
					
						
						
							
							refine code ( #595 )  
						
						
						
						
						### What problem does this PR solve?
### Type of change
- [x] Refactoring 
						
						
					 
					
						2024-04-28 19:13:33 +08:00 
						 
				 
			
				
					
						
							
							
								Jin Hai 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f1c98aad6b 
							
						 
					 
					
						
						
							
							Update version info ( #564 )  
						
						
						
						
						### What problem does this PR solve?
### Type of change
- [x] Documentation Update
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com> 
						
						
					 
					
						2024-04-26 20:07:26 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							369400c483 
							
						 
					 
					
						
						
							
							fix bug of table in docx ( #510 )  
						
						
						
						
						### What problem does this PR solve?
#509  
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue) 
						
						
					 
					
						2024-04-23 19:10:33 +08:00 
						 
				 
			
				
					
						
							
							
								chrysanthemum-boy 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							72384b191d 
							
						 
					 
					
						
						
							
							Add .doc file parser. ( #497 )  
						
						
						
						
						### What problem does this PR solve?
Add `.doc` file parser, using tika.
```
pip install tika
```
```
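from io import BytesIO

from tika import parser


def extract_text_from_doc_bytes(doc_bytes):
    # Wrap the raw bytes in a file-like object before handing them to tika.
    file_like_object = BytesIO(doc_bytes)
    parsed = parser.from_buffer(file_like_object)
    # "content" may be None when tika extracts no text from the document.
    return parsed["content"]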
```
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: chrysanthemum-boy <fannc@qq.com> 
						
						
					 
					
						2024-04-23 15:31:43 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							0dfc8ddc0f 
							
						 
					 
					
						
						
							
							enlarge docker memory usage ( #501 )  
						
						
						
						
						### What problem does this PR solve?
### Type of change
- [x] Refactoring 
						
						
					 
					
						2024-04-23 14:41:10 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a38e163035 
							
						 
					 
					
						
						
							
							remove doc from supported processing types ( #488 )  
						
						
						
						
						### What problem does this PR solve?
#474  
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue) 
						
						
					 
					
						2024-04-22 15:46:09 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ed6081845a 
							
						 
					 
					
						
						
							
							Support many encodings for text files. ( #458 )  
						
						
						
						
						### What problem does this PR solve?
#384 
### Type of change
- [x] Performance Improvement 
						
						
					 
					
						2024-04-19 18:02:53 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f6c7204002 
							
						 
					 
					
						
						
							
							refine log format ( #312 )  
						
						
						
						
						### What problem does this PR solve?
Issue link: #264
### Type of change
- [x] Documentation Update
- [x] Refactoring 
						
						
					 
					
						2024-04-11 10:13:43 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							fd7fcb5baf 
							
						 
					 
					
						
						
							
							apply PEP 8 formatting ( #155 )  
						
						
						
						
					 
					
						2024-03-27 11:33:46 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							f6aee7f230 
							
						 
					 
					
						
						
							
							add use layout or not option ( #145 )  
						
						
						
						
						* add use layout or not option
* trivial 
						
						
					 
					
						2024-03-22 19:21:09 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							602038ac49 
							
						 
					 
					
						
						
							
							fix task canceling bug ( #98 )  
						
						
						
						
					 
					
						2024-03-05 16:33:47 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8a57f2afd5 
							
						 
					 
					
						
						
							
							change callback strategy, add timezone to docker ( #96 )  
						
						
						
						
					 
					
						2024-03-05 12:08:41 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7bfaf0df29 
							
						 
					 
					
						
						
							
							fix position extraction bug ( #93 )  
						
						
						
						
						* fix position extraction bug
* remove delimiter for naive parser 
						
						
					 
					
						2024-03-04 17:08:35 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							685b4d8a95 
							
						 
					 
					
						
						
							
							fix table desc bugs, add positions to chunks ( #91 )  
						
						
						
						
					 
					
						2024-03-04 14:42:26 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8a726fb04b 
							
						 
					 
					
						
						
							
							solve task execution issues ( #90 )  
						
						
						
						
					 
					
						2024-03-01 19:48:01 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							7fd1eca582 
							
						 
					 
					
						
						
							
							init README of deepdoc, add picture processor. ( #71 )  
						
						
						
						
						* init README of deepdoc, add picture processor.
* add resume parsing 
						
						
					 
					
						2024-02-23 18:28:12 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							cacd36c5e1 
							
						 
					 
					
						
						
							
							use onnx models, new deepdoc ( #68 )  
						
						
						
						
					 
					
						2024-02-21 16:32:38 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
						
						
							
						
						
							a8294f2168 
							
						 
					 
					
						
						
							
							Refine resume parts and fix bugs in retrieval using SQL ( #66 )  
						
						
						
						
					 
					
						2024-02-19 19:22:17 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
						
						
							
						
						
							407b2523b6 
							
						 
					 
					
						
						
							
							remove unused code, separate layout detection out as a new API. Add new RAG method 'table' ( #55 )  
						
						
						
						
					 
					
						2024-02-05 18:08:17 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
						
						
							
						
						
							51482f3e2a 
							
						 
					 
					
						
						
							
							Some document API refined. ( #53 )  
						
						
						
						
						Add naive chunking method to RAG 
						
						
					 
					
						2024-02-02 19:21:37 +08:00 
						 
				 
			
				
					
						
							
							
								KevinHuSh 
							
						 
					 
					
						
						
						
						
							
						
						
							e6acaf6738 
							
						 
					 
					
						
						
							
							Add Q&A and Book, fix task running bugs ( #50 )  
						
						
						
						
					 
					
						2024-02-01 18:53:56 +08:00