28 Commits

Author SHA1 Message Date
Yichen Yan
b5deccc84d
relax fixed requirements 2023-03-10 15:02:18 +08:00
Jerry Liu
aff12c3b3c
Merge pull request #91 from masatootake/fix_simple_web
Fix to support the latest version of langchain
2023-03-07 10:55:14 -08:00
Jesse Zhang
feb57ebf62
Fix substack 2023-03-07 10:50:48 -08:00
Jesse Zhang
84da0d7920
Update requirements.txt 2023-03-07 10:37:44 -08:00
Masato Otake
d2b9b4b031 Fix to support the latest version of langchain 2023-03-07 23:15:55 +09:00
Smyja
f082d4608d added include_url_in_text parameter 2023-03-03 23:07:10 +01:00
Smyja
f9c7f31f5f added logging 2023-03-02 22:08:17 +01:00
Smyja
ef9a6a2c07 added reference links 2023-03-02 19:52:14 +01:00
EmptyCrown
457e7888e9 Cleanup 2023-02-24 23:39:32 -08:00
EmptyCrown
9ed101e30b Fix substack 2023-02-24 10:49:50 -08:00
Jerry Liu
e97bb81915
swap out gpt_index imports for llama_index imports (#49)
* cr

* cr

* cr

---------

Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-02-20 21:46:58 -08:00
EmptyCrown
615c21a2c8 Cleanup 2023-02-15 20:19:37 -08:00
EmptyCrown
6fb47cf7f6 Update bs4 readme 2023-02-15 09:20:06 -08:00
Smyja
23ae4928cb
Extended BeautifulSoup Reader loader. (#37)
* Extended BeautifulSoup Reader loader.

* removed link slice

* removed broken docs link list.

* lint and added metadata back

* fixed urljoin issue

* resolved the import issue and added typing
2023-02-15 09:17:34 -08:00
Jerry Liu
46176a1829
cr (#31)
Co-authored-by: Jerry Liu <jerry@robustintelligence.com>
2023-02-12 21:38:00 -08:00
EmptyCrown
65be920cd7 Clean up knowledge base loader 2023-02-10 12:01:36 -08:00
Jesse Zhang
a70e60c94b
Remote reader (#17)
* Small bug fixes

* Remote loader for pages/files

* Add to library
2023-02-09 17:27:20 -08:00
Jesse Zhang
f2dbcdf76a
Quick addition of readthedocs formatter (#15) 2023-02-08 17:42:33 -08:00
EmptyCrown
457a53d811 Rename knowledge base 2023-02-07 22:14:38 -08:00
Jason Fan
e01bd659ce
PR to add a loader that crawls and scrapes online knowledge bases (#13)
* add knowledgebase reader for web

* Create __init__.py

* Create README.md

* Update library.json

* Update README.md

* moved playwright import into function body

* added examples and details of use cases
2023-02-07 21:45:03 -08:00
EmptyCrown
7b56b2f16a Updated some READMEs 2023-02-05 23:35:08 -08:00
EmptyCrown
e7c03d1a08 Added READMEs 2023-02-04 11:44:45 -08:00
Jerry Liu
629dd1ee91 cr 2023-02-03 23:38:12 -08:00
EmptyCrown
bfdbf24330 Fix current READMEs 2023-02-03 21:15:15 -08:00
EmptyCrown
98156973a8 Requirements txt implemented 2023-02-03 20:41:20 -08:00
EmptyCrown
5cfa2a5098 Updated library 2023-02-02 23:31:03 -08:00
EmptyCrown
0de1c320bd Trafilatura readme 2023-02-02 00:53:20 -08:00
EmptyCrown
36568ca809 Added web and instructions 2023-02-01 22:44:43 -08:00