| Nicolas | 7a31306be5 | Nick: url normalization + max metadata size | 2024-12-30 20:04:22 -03:00 |
| Nicolas | bf9d41d0b2 | Nick: index exploration | 2024-12-30 19:37:48 -03:00 |
| Nicolas | 0847a6038e | Merge pull request #1014 from mendableai/nsc/extract-url-trace: /extract URL trace | 2024-12-30 19:00:58 -03:00 |
| Gergő Móricz | 71a8f7452c | fix(WebScraper/sitemap): await urlsHandler to fix race condition (tag: v1.1.1) | 2024-12-30 16:09:22 +01:00 |
| Nicolas | 8ae34a0d31 | Nick: rm .xml from isFile | 2024-12-30 11:57:01 -03:00 |
| Gergő Móricz | 9005757de3 | fix(queue-worker): do not follow redirect URLs if they are not allowed by the crawl options | 2024-12-30 14:41:31 +01:00 |
| Gergő Móricz | 4d1f92f4c8 | fix(scrapeURL/fetch): block loopback and link-local IPs | 2024-12-29 17:35:14 +01:00 |
| Nicolas | e255301005 | Update index.ts | 2024-12-27 21:31:29 -03:00 |
| Nicolas | c1fa5a44ae | Merge pull request #1016 from mendableai/mog/mineru: feat(scrapeURL/pdf): switch to MU (FIR-356) | 2024-12-27 21:19:48 -03:00 |
| Nicolas | 1eca61bffb | Update index.ts | 2024-12-27 20:59:18 -03:00 |
| Nicolas | f9d55efba8 | Update index.ts | 2024-12-27 20:54:26 -03:00 |
| Nicolas | b8d7f9f257 | Nick: we are using runpod | 2024-12-27 19:59:05 -03:00 |
| Nicolas | 5fcf3fa97e | Merge branch 'main' into mog/mineru | 2024-12-27 19:53:09 -03:00 |
| Nicolas | a431cafa47 | Merge pull request #991 from RutamBhagat/rust-sdk-conditionally-enforce-api-key: feat(rust-sdk): Make API key optional for self-hosted instances | 2024-12-27 19:07:01 -03:00 |
| Nicolas | 65cf4cd74e | Merge pull request #1013 from yujunhui/main: fix: merge mock success data | 2024-12-27 19:04:04 -03:00 |
| Nicolas | 05d5f84d87 | Merge pull request #1018 from mendableai/feat/add-favicon-metadata: [FIR-37] feat: extract and return favicon URL during scraping | 2024-12-27 17:44:03 -03:00 |
| Nicolas | eba5fda9a1 | Merge pull request #955 from mendableai/rafa/fix-default-on-schema-llm-extract: fixed optional+default bug on llm schema | 2024-12-27 16:33:04 -03:00 |
| Ademílson F. Tonato | a4cf814f70 | feat: return favicon url when scraping | 2024-12-27 19:18:53 +00:00 |
| Gergő Móricz | 0421f81020 | Sitemap fixes (#1010): sitemap fixes iter 1; feat(sitemap): dedupe improvements. Co-authored-by: Nicolas <nicolascamara29@gmail.com> | 2024-12-27 19:59:26 +01:00 |
| Nicolas | 6851281beb | Update __init__.py | 2024-12-27 15:46:00 -03:00 |
| Nicolas | cd08be7f37 | Merge pull request #990 from RutamBhagat/python-sdk-conditionally-enforce-api-key: feat(python-sdk): Make API key optional for self-hosted instances | 2024-12-27 15:43:37 -03:00 |
| Nicolas | c5b6495e48 | Merge pull request #1015 from mendableai/nsc/improves-sitemap-fetching: Improves sitemap fetching (tag: v1.1.0) | 2024-12-27 14:41:04 -03:00 |
| Nicolas | 2ea0e9a241 | Merge pull request #1003 from RutamBhagat/credit-usage-api-docs: docs(credit-usage-api): add new endpoint documentation for credit usage | 2024-12-27 13:59:54 -03:00 |
| Nicolas | e8f0a22ebe | Update v1-openapi.json | 2024-12-27 13:59:43 -03:00 |
| Nicolas | f7cfbba651 | Merge branch 'main' into pr/1003 | 2024-12-27 13:59:24 -03:00 |
| Nicolas | 1abb544e3e | Update index.test.ts | 2024-12-27 13:59:09 -03:00 |
| Gergő Móricz | 4772951313 | feat(scrapeURL/fire-engine): explicitly delete job after scrape | 2024-12-27 16:44:41 +01:00 |
| Gergő Móricz | 0b55fb836b | feat(scrapeURL/pdf): switch to MinerU | 2024-12-27 16:37:32 +01:00 |
| Nicolas | ece95e97f4 | Merge branch 'main' into nsc/extract-url-trace | 2024-12-26 21:28:51 -03:00 |
| Gergő Móricz | c543f4f76c | feat(scrapeURL/pdf): update mock Blob implementation to pass TypeScript | 2024-12-26 20:31:51 +01:00 |
| Gergő Móricz | f15ef0e758 | feat(scrapeURL/fire-engine/chrome-cdp): handle file downloads | 2024-12-26 20:29:09 +01:00 |
| Nicolas | 4451c4f671 | Nick: | 2024-12-26 13:51:20 -03:00 |
| Nicolas | 37f258b73f | Merge pull request #974 from mendableai/fix-sdk/next-in-when-502: [bug/JS-SDK]Added check for object and trycatch as workaround for 502s | 2024-12-26 12:53:55 -03:00 |
| Nicolas | bcc18e1c07 | Merge branch 'main' into fix-sdk/next-in-when-502 | 2024-12-26 12:53:10 -03:00 |
| Nicolas | 4f65d350a3 | Update package.json | 2024-12-26 12:52:52 -03:00 |
| Nicolas | 4332f18a8f | Nick: making it optional for the user | 2024-12-26 12:43:58 -03:00 |
| Nicolas | 233f347f5e | Nick: refactor | 2024-12-26 12:41:37 -03:00 |
| Nicolas | f467a3ae6c | Nick: init | 2024-12-26 12:21:46 -03:00 |
| yujunhui | 2f39bdddd9 | fix: merge mock success data | 2024-12-26 17:56:30 +08:00 |
| Nicolas | c911aad228 | Update package.json | 2024-12-23 18:48:03 -03:00 |
| Nicolas | b1a5625b22 | Revert "Merge pull request #997 from mendableai/feat/sdk-without-ws": This reverts commit 53cda5f81c53d3de35925c610ce083923ca09fbe, reversing changes made to 51f79b55efadc53243a8c22d86bb2d08d878d524. | 2024-12-23 18:45:51 -03:00 |
| Nicolas | 18ceaf10a5 | Update .gitignore | 2024-12-23 18:42:05 -03:00 |
| Nicolas | 53cda5f81c | Merge pull request #997 from mendableai/feat/sdk-without-ws: feat: Support environments without ws by dynamically importing WebSocket module with error handling | 2024-12-23 18:41:38 -03:00 |
| Nicolas | 0c1c4f2ede | Merge branch 'main' into feat/sdk-without-ws | 2024-12-23 18:41:31 -03:00 |
| Nicolas | 51f79b55ef | Merge pull request #1005 from RutamBhagat/contributing-md-docker-compose: docs(CONTRIBUTING.md): Add Docker Compose setup instructions | 2024-12-23 13:38:20 -03:00 |
| Nicolas | 67c643ad1c | Merge pull request #989 from RutamBhagat/js-sdk-conditionally-enforce-api-key: feat(js-sdk): Make API key optional for self-hosted instances | 2024-12-23 12:56:38 -03:00 |
| RutamBhagat | 7366f36e39 | docs(CONTRIBUTING.md): Add Docker Compose setup instructions to CONTRIBUTING.md | 2024-12-21 07:03:16 -08:00 |
| RutamBhagat | ca2d3dc6d2 | docs(credit-usage-api): add new endpoint documentation for credit usage | 2024-12-21 06:24:53 -08:00 |
| Thomas Kosmas | 199bd2d1f4 | Merge branch 'main' into feat/sdk-without-ws | 2024-12-21 02:32:09 +02:00 |
| Thomas Kosmas | a9d31c8e42 | Merge branch 'main' into feat/sdk-without-ws | 2024-12-21 02:30:40 +02:00 |