Nicolas | c00cd21308 | Nick: adds support for mobile web scraping | 2024-10-29 14:10:40 -03:00
Thomas Kosmas | bd55464b52 | skipTlsVerification | 2024-10-22 22:28:02 +03:00
Nicolas | b4f6a0f919 | Nick: geolocation | 2024-10-15 21:12:33 -03:00
Nicolas | db161ac55a | Nick: press + write | 2024-09-20 19:45:23 -04:00
Gergő Móricz | d663bbf0ca | feat(actions): add scroll | 2024-09-20 21:41:53 +02:00
Gergő Móricz | 3dd912ec91 | feat(actions): add typeText, pressKey, fix playwright screenshot/waitFor | 2024-09-20 21:02:53 +02:00
Gergő Móricz | 093c064bff | feat(v1): add public actions api | 2024-09-18 20:39:25 +02:00
Gergő Móricz | 42d677fe3c | feat(fire-engine): port waitFor and screenshot to use actions | 2024-09-18 20:04:54 +02:00
rafaelsideguide | 8c1097e9e1 | fix: pageOptions | 2024-09-05 14:16:31 -03:00
Nicolas | 41eb620959 | Nick: prompt option, still need to convert to new structured outputs | 2024-08-29 21:00:57 -03:00
Nicolas | 49e1cb7ca0 | Nick: | 2024-08-29 20:08:06 -03:00
Gergő Móricz | e7f267b6fe | Merge branch 'main' into v1-webscraper | 2024-08-23 17:21:54 +02:00
rafaelsideguide | ecd472356b | added variables to beta customers | 2024-08-19 16:41:54 -03:00
rafaelsideguide | 7a61325500 | map + search + scrape markdown bug | 2024-08-16 17:57:11 -03:00
rafaelsideguide | 3f998b688d | scrape ready | 2024-08-16 15:14:37 -03:00
Gergő Móricz | 29f0d9ec94 | propagate priority to fire-engine | 2024-08-15 19:04:46 +02:00
Nicolas | 3321ca9398 | Merge pull request #504 from mendableai/feat/fullpage-screenshot: [Feat] Added fullpagescreenshot capabilities | 2024-08-06 13:52:29 -04:00
rafaelsideguide | 3edc3a3d15 | added fullpagescreenshot capabilities, wip on fire-engine side | 2024-08-05 18:17:37 -03:00
rafaelsideguide | f32e8de156 | fixes the empty excludes.filter undefined bug | 2024-08-05 18:13:31 -03:00
Nicolas | 4c9d62f6d3 | Nick: fixing sitemap fallback | 2024-07-26 18:25:44 -04:00
Gergo Moricz | 7cd9bf92e3 | feat: scrape event logging to DB | 2024-07-24 14:31:25 +02:00
Caleb Peffer | d39d3be649 | Caleb: now extracting and returning a list of all links on the page for a customer | 2024-07-16 18:38:03 -07:00
Nicolas | e098e88ea7 | Nick: | 2024-07-12 22:02:08 -04:00
Rafael Miller | f0f449fe51 | Merge pull request #336 from snippet/allow-external-content-links: [Proposal] new feature allowExternalContentLinks | 2024-07-02 09:45:21 -03:00
Jeff Pereira | a5fb45988c | new feature allowExternalContentLinks | 2024-06-28 17:23:40 -07:00
Eric Ciarla | 87b54488d3 | update to includeRawHtml | 2024-06-28 17:07:47 -04:00
Eric Ciarla | 70fcf2ce03 | init | 2024-06-28 16:39:09 -04:00
Nicolas | 1d4907acc9 | Nick: | 2024-06-26 21:02:58 -03:00
Rafael Miller | f9c7ca9388 | Merge branch 'main' into feat/issue-266 | 2024-06-14 11:47:58 -03:00
Rafael Miller | 3e2e76311c | Merge branch 'main' into feat/issue-205 | 2024-06-14 11:25:20 -03:00
rafaelsideguide | bb859ae9a7 | Added metadata.pageStatusCode and metadata.pageError properties to the responses | 2024-06-13 17:08:40 -03:00
rafaelsideguide | 676d6e8ab5 | Added pageOptions.removeTags | 2024-06-13 10:51:05 -03:00
rafaelsideguide | e37d151404 | added parsePDF option to pageOptions; user can decide if they are going to let us take care of the parse or they are going to parse the pdf by themselves | 2024-06-12 15:06:47 -03:00
rafaelsideguide | dc6acbf1f0 | Merge remote-tracking branch 'origin/main' into feat/allowbackwardcrawling-option | 2024-06-12 11:01:05 -03:00
Nicolas | 520739c9f4 | Nick: fixed bugs associated with absolute path replacements | 2024-06-11 12:43:16 -07:00
rafaelsideguide | ee282c3d55 | Added allowBackwardCrawling option | 2024-06-11 15:24:39 -03:00
Nicolas | f6b06ac27a | Nick: ignoreSitemap, better crawling algo | 2024-06-10 18:12:41 -07:00
Nicolas | b4c6819a54 | Nick: | 2024-06-05 11:11:09 -07:00
Nicolas | 6bea803120 | Nick: | 2024-05-31 15:39:54 -07:00
Nicolas | 6c939d534d | Nick: small refactor | 2024-05-29 19:43:51 -07:00
Eric Ciarla | a0e404f94e | init commit | 2024-05-29 18:56:57 -04:00
Nicolas | 1b3547dcf2 | Nick: | 2024-05-28 12:56:24 -07:00
Nicolas | 77a79b5a79 | Nick: max num tokens for llm extract (for now) + slice the max | 2024-05-20 17:07:38 -07:00
Nicolas | 8a72cf556b | Nick: | 2024-05-13 21:10:58 -07:00
Nicolas | a96fc5b96d | Nick: 4x speed | 2024-05-13 20:45:11 -07:00
Nicolas | dcedb8d798 | Merge branch 'main' into feat/max-depth | 2024-05-07 10:20:49 -07:00
Nicolas | 6505bf6bf2 | Merge branch 'main' into feat/max-depth | 2024-05-07 10:20:44 -07:00
Nicolas | bdbee963f7 | Merge branch 'main' into nsc/cancel-job | 2024-05-07 10:13:43 -07:00
rafaelsideguide | e1f52c538f | nested includeHtml inside pageOptions | 2024-05-07 13:40:24 -03:00
rafaelsideguide | 83f3408634 | Added max depth option | 2024-05-07 11:06:26 -03:00