rafaelsideguide | 9ad06fdf56 | added fire-engine fallback for getting sitemaps | 2024-07-09 16:07:53 -03:00
Rafael Miller | f0f449fe51 | Merge pull request #336 from snippet/allow-external-content-links; [Proposal] new feature allowExternalContentLinks | 2024-07-02 09:45:21 -03:00
Jeff Pereira | a5fb45988c | new feature allowExternalContentLinks | 2024-06-28 17:23:40 -07:00
Eric Ciarla | 70fcf2ce03 | init | 2024-06-28 16:39:09 -04:00
rafaelsideguide | 3ebdf93342 | removed console.logs | 2024-06-24 16:43:12 -03:00
rafaelsideguide | 21d29de819 | testing crawl with new.abb.com case; many unnecessary console.logs for tracing the code execution | 2024-06-24 16:25:07 -03:00
Eric Ciarla | 34e37c5671 | Add unit tests to replace e2e | 2024-06-15 16:43:37 -04:00
Eric Ciarla | a6b7197737 | Fix for maxDepth | 2024-06-14 19:40:37 -04:00
Eric Ciarla | 2c5f5c0ea2 | Merge branch 'main' into feat/maxDepthRelative | 2024-06-14 11:49:12 -04:00
Rafael Miller | f9c7ca9388 | Merge branch 'main' into feat/issue-266 | 2024-06-14 11:47:58 -03:00
Rafael Miller | 3e2e76311c | Merge branch 'main' into feat/issue-205 | 2024-06-14 11:25:20 -03:00
Eric Ciarla | 59451754f5 | Add tests | 2024-06-14 10:14:07 -04:00
Eric Ciarla | 71c98d8b80 | Update logic | 2024-06-13 18:00:52 -04:00
Eric Ciarla | 095951aa4d | Update test | 2024-06-13 17:40:00 -04:00
Eric Ciarla | 5e8aa92788 | Update index.ts | 2024-06-13 17:33:13 -04:00
Eric Ciarla | 65d63bae45 | Update index.ts | 2024-06-13 17:17:44 -04:00
Eric Ciarla | 32e814bedc | Update index.ts | 2024-06-13 17:02:30 -04:00
rafaelsideguide | bb859ae9a7 | Added metadata.pageStatusCode and metadata.pageError properties to the responses | 2024-06-13 17:08:40 -03:00
rafaelsideguide | 676d6e8ab5 | Added pageOptions.removeTags | 2024-06-13 10:51:05 -03:00
rafaelsideguide | e37d151404 | added parsePDF option to pageOptions; user can decide if they are going to let us take care of the parse or they are going to parse the pdf by themselves | 2024-06-12 15:06:47 -03:00
rafaelsideguide | dc6acbf1f0 | Merge remote-tracking branch 'origin/main' into feat/allowbackwardcrawling-option | 2024-06-12 11:01:05 -03:00
Nicolas | 520739c9f4 | Nick: fixed bugs associated with absolute path replacements | 2024-06-11 12:43:16 -07:00
rafaelsideguide | ee282c3d55 | Added allowBackwardCrawling option | 2024-06-11 15:24:39 -03:00
Nicolas | f6b06ac27a | Nick: ignoreSitemap, better crawling algo | 2024-06-10 18:12:41 -07:00
Nicolas | 3091f0134c | Nick: | 2024-06-10 16:27:10 -07:00
Nicolas | b4c6819a54 | Nick: | 2024-06-05 11:11:09 -07:00
Rafael Miller | 02fe470e20 | Merge pull request #148 from mendableai/nsc/improvemnts-fixes-misc; Better fallbacks for initial crawl start | 2024-06-04 14:31:10 -03:00
rafaelsideguide | 6920ec8a61 | bugfixing. already on main | 2024-06-04 11:05:50 -03:00
Nicolas | 918059ee9e | Merge branch 'main' into nsc/improvemnts-fixes-misc | 2024-06-03 16:46:02 -07:00
Nicolas | df6c3d1e7d | Merge branch 'main' into detect-pdfs | 2024-05-17 09:55:51 -07:00
Nicolas | 9d635cb2a3 | Nick: docx support | 2024-05-16 11:48:02 -07:00
Nicolas | 098db17913 | Update index.ts | 2024-05-15 17:37:09 -07:00
Nicolas | 6ca368327f | Merge branch 'main' into test/crawl-options | 2024-05-15 17:18:25 -07:00
Nicolas | ade4e05cff | Nick: working | 2024-05-15 17:13:04 -07:00
Nicolas | bfccaf670d | Nick: fixes most of it | 2024-05-15 15:30:37 -07:00
rafaelsideguide | d91043376c | not working yet | 2024-05-15 18:54:40 -03:00
rafaelsideguide | fa014defc7 | Fixing child links only bug | 2024-05-15 18:35:09 -03:00
Nicolas | 2ba743fb1a | Merge pull request #27 from eltociear/patch-1; refactor: fix typo in WebScraper/index.ts | 2024-05-15 13:28:38 -07:00
Nicolas | 1b0d6341d3 | Update index.ts | 2024-05-15 11:48:12 -07:00
Nicolas | d10f81e7fe | Nick: fixes | 2024-05-15 11:28:20 -07:00
Nicolas | 87570bdfa1 | Update index.ts | 2024-05-15 11:06:03 -07:00
Ikko Eltociear Ashimine | e91c122c69 | Merge branch 'main' into patch-1 | 2024-05-15 12:14:52 +09:00
Nicolas | a0fdc6f7c6 | Nick: | 2024-05-14 12:12:40 -07:00
Nicolas | 7f31959be7 | Nick: | 2024-05-14 12:04:36 -07:00
Nicolas | 8a72cf556b | Nick: | 2024-05-13 21:10:58 -07:00
Nicolas | 26a092f780 | Update index.ts | 2024-05-13 21:04:49 -07:00
Nicolas | 8101cbee37 | Update index.ts | 2024-05-13 21:02:47 -07:00
Nicolas | 86b8439844 | Nick: | 2024-05-13 20:51:42 -07:00
Nicolas | a96fc5b96d | Nick: 4x speed | 2024-05-13 20:45:11 -07:00
rafaelsideguide | 8eb2e95f19 | Cleaned up | 2024-05-13 16:13:10 -03:00