123 Commits

Author SHA1 Message Date
Nicolas
3c2bfe2da2 Update single_url.ts 2024-09-17 01:58:47 -04:00
Nicolas
18b024c238 Update single_url.ts 2024-09-17 01:41:46 -04:00
Nicolas
a4039bd008 Revert "Update single_url.ts"
This reverts commit 0f8c0a570dca877d14d590e6002eaffd345a3927.
2024-09-16 23:36:38 -04:00
Nicolas
0f8c0a570d Update single_url.ts 2024-09-16 21:44:56 -04:00
Nicolas
17e419a7fb Nick: 2024-09-09 21:06:23 -03:00
rafaelsideguide
8c1097e9e1 fix: pageOptions 2024-09-05 14:16:31 -03:00
rafaelsideguide
b301ffc922 added missing variables 2024-09-05 13:57:26 -03:00
Nicolas
08a9cb8db4 Merge branch 'main' into pr/516 2024-09-02 23:32:23 -03:00
Nicolas
49e1cb7ca0 Nick: 2024-08-29 20:08:06 -03:00
rafaelsideguide
ef2d8d012b Merge branch 'v1-webscraper' of https://github.com/mendableai/firecrawl into v1-webscraper 2024-08-28 14:07:31 -03:00
rafaelsideguide
5cbf0dcaf5 fix(v1): includeTags 2024-08-28 14:07:28 -03:00
Nicolas
ecd07be49e Nick: fixed issues 2024-08-28 13:17:22 -03:00
Nicolas
4d0acc9722 Merge branch 'main' into v1-webscraper 2024-08-26 16:22:05 -03:00
Nicolas
173f4ee1bf Nick: chrome cdp main | simple autoscaler 2024-08-23 20:09:59 -03:00
Gergő Móricz
e7f267b6fe Merge branch 'main' into v1-webscraper 2024-08-23 17:21:54 +02:00
rafaelsideguide
7473b74021 fix: html and rawlhtmls for pdfs 2024-08-22 15:15:45 -03:00
rafaelsideguide
fe2e8c0b7a includehtml fix 2024-08-21 15:54:00 -03:00
Gergő Móricz
1368f9a87f fix: treat existing screenshot as a scraper success condition 2024-08-20 22:24:18 +02:00
rafaelsideguide
ecd472356b added variables to beta customers 2024-08-19 16:41:54 -03:00
rafaelsideguide
7a61325500 map + search + scrape markdown bug 2024-08-16 17:57:11 -03:00
rafaelsideguide
3f998b688d scrape ready 2024-08-16 15:14:37 -03:00
Gergő Móricz
29f0d9ec94 propagate priority to fire-engine 2024-08-15 19:04:46 +02:00
Rafael Miller
76160a38db Update single_url.ts 2024-08-12 17:57:00 -03:00
Rafael Miller
7c339ea125 Update single_url.ts 2024-08-12 17:55:10 -03:00
rafaelsideguide
c3aeed510b Update single_url.ts 2024-08-12 16:40:31 -03:00
Kevin Swiber
ba2af74adf Ensuring USE_DB_AUTHENTICATION is true in single URL scraper. 2024-08-09 15:29:18 -07:00
Gergő Móricz
5fc7fcb77c Merge branch 'main' into feat/queue-scrapes 2024-08-07 16:35:44 +02:00
Gergo Moricz
b60ee30dba fix(single_url): accept 500 2024-08-06 18:00:56 +02:00
rafaelsideguide
4d24a99d50 fix params 2024-08-06 09:34:43 -03:00
rafaelsideguide
3edc3a3d15 added fullpagescreenshot capabilities, wip on fire-engine side 2024-08-05 18:17:37 -03:00
Nicolas
7b813883ef Nick: first layer 2024-07-29 20:31:51 -04:00
rafaelsideguide
96cec2a673 fix checking scrape log success content length 2024-07-26 12:00:52 -03:00
Nicolas
f82ca3be17 Nick: 2024-07-25 19:53:29 -04:00
Nicolas
01fab6e036 Update single_url.ts 2024-07-25 17:51:41 -04:00
Nicolas
56042d090c Update single_url.ts 2024-07-25 17:48:44 -04:00
Nicolas
3242872503 Update single_url.ts 2024-07-25 17:43:55 -04:00
Gergo Moricz
4d35ad073c feat(monitoring/scrape): include url, worker, response_size 2024-07-24 16:43:39 +02:00
Gergo Moricz
64bcedeefc fix(monitoring): bad success check on scrape 2024-07-24 16:21:59 +02:00
Gergo Moricz
7cd9bf92e3 feat: scrape event logging to DB 2024-07-24 14:31:25 +02:00
rafaelsideguide
6208ecdbc0 added logger 2024-07-23 17:30:46 -03:00
Nicolas
d2de01d342 Nick: fixes 2024-07-18 13:19:44 -04:00
Nicolas
f11137352c Merge branch 'main' into feat/fire-engine-chrome-cdp 2024-07-18 12:48:42 -04:00
Caleb Peffer
c5d1e7260d Caleb: made changes per Rafaels requests 2024-07-17 11:29:05 -07:00
Caleb Peffer
d39d3be649 Caleb: now extracting and returning a list of all links on the page for a customer 2024-07-16 18:38:03 -07:00
Thomas Kosmas
5c65ec58e5 Support chrome-cdp and restructure sitemap fire-engine support. 2024-07-15 18:40:43 +03:00
Nicolas
066d92f643 Update single_url.ts 2024-07-03 18:38:17 -03:00
Nicolas
90c54c32fd Nick: refactor 2024-07-03 18:01:17 -03:00
Nicolas
90cf799a3c Update single_url.ts 2024-07-03 17:56:21 -03:00
Nicolas
b36406e465 Nick: log scrpaers 2024-07-03 17:28:53 -03:00
rafaelsideguide
7b7154ba1e bugfixed pageStatusCode 2024-07-02 10:51:35 -03:00