278 Commits

Author SHA1 Message Date
EmptyCrown
8948cff28d skip folders in s3 2023-02-10 16:09:05 -08:00
Jesse Zhang
adaa2d78a6 Update README.md 2023-02-10 15:57:04 -08:00
EmptyCrown
e5dc38be2b Account for nested files in s3 reader 2023-02-10 15:44:56 -08:00
EmptyCrown
cd3219de68 Remove mentions of verbose 2023-02-10 14:47:37 -08:00
EmptyCrown
201f15d9ee No need for pathlib in readmes 2023-02-10 14:38:18 -08:00
EmptyCrown
65be920cd7 Clean up knowledge base loader 2023-02-10 12:01:36 -08:00
Jesse Zhang
faaeab7599 S3 reader plus documentation (#18)
* Working s3 reader plus documentation

* Lint
2023-02-10 10:52:15 -08:00
Jesse Zhang
ff124e5b3f Update README.md 2023-02-10 10:31:43 -08:00
EmptyCrown
7869c1a331 Update README 2023-02-10 09:04:27 -08:00
Jesse Zhang
a70e60c94b Remote reader (#17)
* Small bug fixes

* Remote loader for pages/files

* Add to library
2023-02-09 17:27:20 -08:00
EmptyCrown
9fbe79afa3 Unstructured readme 2023-02-09 08:34:27 -08:00
EmptyCrown
0ff027d210 Fix bug with unstructured 2023-02-09 08:30:33 -08:00
EmptyCrown
6ec52ecd2f Small bug fixes 2023-02-09 00:55:35 -08:00
Jerry Liu
b53ab52c84 cr (#16)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-08 23:12:44 -08:00
Jesse Zhang
f2dbcdf76a Quick addition of readthedocs formatter (#15) 2023-02-08 17:42:33 -08:00
Ajinkya Indulkar
6c194fad66 🐛 fix file loading in PandasCSVReader (#14) 2023-02-08 12:58:27 -08:00
EmptyCrown
457a53d811 Rename knowledge base 2023-02-07 22:14:38 -08:00
Jesse Zhang
e0fe338fe6 Unstructured.io loader (#12)
* Unstructured.io loader

* Formatting python in readme

* Added split_documents arg

* Readme tweak
2023-02-07 22:12:24 -08:00
Jason Fan
e01bd659ce PR to add a loader that crawls and scrapes online knowledge bases (#13)
* add knowledgebase reader for web

* Create __init__.py

* Create README.md

* Update library.json

* Update README.md

* moved playwright import into function body

* added examples and details of use cases
2023-02-07 21:45:03 -08:00
Jerry Liu
579381f027 cr (#10)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-07 13:56:23 -08:00
EmptyCrown
4a418c02fe Refactors mbox loader 2023-02-06 23:54:33 -08:00
Jerry Liu
12d5ce89b8 cr (#8)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-06 23:34:00 -08:00
Jerry Liu
09006d989e cr (#9)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-06 23:32:41 -08:00
Jerry Liu
1ef7129041 update README (#7)
* cr

* cr

---------

Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-02-06 21:04:51 -08:00
EmptyCrown
e801266bc2 Change name to llama 2023-02-06 13:41:03 -08:00
Jesse Zhang
301bb4f477 Update README.md 2023-02-06 13:35:46 -08:00
Jesse Zhang
11dcd9a88c Update README.md 2023-02-06 13:35:23 -08:00
EmptyCrown
7b56b2f16a Updated some READMEs 2023-02-05 23:35:08 -08:00
EmptyCrown
07f1ab0acd Format 2023-02-05 17:57:03 -08:00
EmptyCrown
07e97723da README updates 2023-02-05 17:56:28 -08:00
EmptyCrown
99b36697e3 Typo 2023-02-05 17:25:19 -08:00
EmptyCrown
e94b59bcf2 Fixed the READMEs 2023-02-05 09:48:56 -08:00
Jerry Liu
85f5c0bcda cr (#6)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-04 23:23:21 -08:00
EmptyCrown
166a9ee91d Added to database readme 2023-02-04 16:28:47 -08:00
Jesse Zhang
f8f4188634 Merge pull request #4 from jerryjliu/jerry/add_readmes
[WIP] add readmes
2023-02-04 16:25:31 -08:00
Jerry Liu
9c4961172b cr 2023-02-04 15:47:33 -08:00
Jerry Liu
19241b026b Merge remote-tracking branch 'upstream/main' into jerry/add_readmes 2023-02-04 14:41:09 -08:00
EmptyCrown
15ec6fe991 Verbose setting 2023-02-04 13:00:54 -08:00
EmptyCrown
87ff1fa50d Fix 2023-02-04 12:36:35 -08:00
EmptyCrown
b9c172556e Added the two readers for arxiv and pubmed papers 2023-02-04 12:34:15 -08:00
EmptyCrown
e7c03d1a08 Added READMEs 2023-02-04 11:44:45 -08:00
EmptyCrown
2a66d09da3 fix tests 2023-02-04 01:47:14 -08:00
EmptyCrown
9155bb9512 author of docx 2023-02-04 01:46:48 -08:00
Jesse Zhang
dad96a0921 Merge pull request #5 from emptycrown/jerry/fix_ci
fix CI
2023-02-04 01:43:48 -08:00
Jerry Liu
cc0114e044 cr 2023-02-04 01:15:52 -08:00
EmptyCrown
96b7b329c5 No longer using this as a package (for now) 2023-02-04 01:13:33 -08:00
EmptyCrown
916c9dac6a Delete gpt index schemas 2023-02-04 01:12:40 -08:00
Jesse Zhang
98b5689a2a Update README.md 2023-02-04 00:51:55 -08:00
Jesse Zhang
649bce058c Merge pull request #2 from jerryjliu/jerry/add_gh_workflow
Add a unit test, Github workflow, formatted some files
2023-02-04 00:49:09 -08:00
Jerry Liu
f3fe198d17 delete lint 2023-02-04 00:16:01 -08:00