Nicolas | 664ba69f08 | Nick: f-eng monitoring test | 2024-12-14 21:40:46 -03:00
Nicolas | ccbae4b155 | Update auth.ts | 2024-12-14 00:20:14 -03:00
Gergő Móricz | 4b5014d7fe | feat(v1/batch/scrape): add ignoreInvalidURLs option | 2024-12-14 01:11:43 +01:00
Gergő Móricz | e74e4bcefc | feat(runWebScraper): retry a scrape max 3 times in a crawl if the status code is failure | 2024-12-14 00:54:05 +01:00
Nicolas | 3b0d192d1b | Update types.ts | 2024-12-12 18:14:11 -03:00
Nicolas | e22a0b596c | Nick: custom metadata | 2024-12-12 13:30:00 -03:00
Nicolas | 8a1c404918 | Nick: revert trailing comma | 2024-12-11 19:51:08 -03:00
Nicolas | 52f2e733e2 | Nick: fixes | 2024-12-11 19:48:22 -03:00
Nicolas | 00335e2ba9 | Nick: fixed prettier | 2024-12-11 19:46:11 -03:00
Gergő Móricz | 85cbfbb5bb | fix(crawl): disable smart wait. This increases the reliability/deterministic-ness of crawls. | 2024-12-10 21:12:31 +01:00
Gergő Móricz | 6776aee1c3 | feat(auth): extend rate limiter logging to make it easier to debug | 2024-12-09 19:29:32 +01:00
Nicolas | 4d287bb77f | Nick: moving acuc temp to read replica | 2024-12-06 13:06:26 -03:00
Gergő Móricz | 845c2744a9 | feat(app): add extra crawl logging (app-side only for now) | 2024-12-05 20:50:36 +01:00
Gergő Móricz | cce94289ee | fix(v1/batch/scrape): horrid memory usage | 2024-12-05 20:49:28 +01:00
Gergő Móricz | f8e619b5df | fix(crawl-status): returnvalue filtering on active jobs | 2024-12-05 18:20:21 +01:00
Gergő Móricz | 41d859203f | feat(v1/batch/scrape): appendToId | 2024-12-04 23:35:29 +01:00
Gergő Móricz | 7bde034020 | auth: log team id | 2024-12-04 23:12:55 +01:00
Nicolas | 64546f1259 | Update types.ts | 2024-12-04 18:00:51 -03:00
Nicolas | f7207f91b4 | Nick: temp e-s-1 | 2024-12-04 16:25:43 -03:00
Gergő Móricz | 88a16b18a3 | fix(crawl-status): ts error | 2024-12-04 17:55:51 +01:00
Gergő Móricz | d8613899e3 | fix(crawl-status): handle failed jobs (oops) | 2024-12-04 17:52:47 +01:00
Gergő Móricz | 712a138404 | fix(crawl-status): hard error bug | 2024-12-04 17:47:37 +01:00
Nicolas | 52806807a1 | Nick: crawl fixes | 2024-12-03 16:25:55 -03:00
Nicolas | 1477ab2359 | Nick: log clear ACUC cache | 2024-12-03 12:15:09 -03:00
Nicolas | 4bb46ed152 | Nick: extract prompt fixes and limit the number of urls | 2024-12-01 20:29:03 -03:00
rafaelmmiller | 5ddb7eb922 | parameter | 2024-11-29 16:44:54 -03:00
rafaelmmiller | 943bbae88d | fixed nested data inside extract | 2024-11-27 18:29:37 -03:00
Nicolas | 5522d6af7d | Update extract.ts | 2024-11-26 15:01:42 -03:00
Nicolas | 8a26f08b14 | Update extract.ts | 2024-11-24 20:37:58 -08:00
Nicolas | 2513efc971 | Update extract.ts | 2024-11-24 20:31:38 -08:00
Nicolas | 30def84c0a | Nick: scrape timeout + warnings | 2024-11-24 19:44:51 -08:00
Nicolas | b693c6c23b | Update extract.ts | 2024-11-24 19:36:18 -08:00
Nicolas | 6fbfeafe38 | Nick: fixed map settings | 2024-11-20 16:51:13 -08:00
Nicolas | aaddbdc1bc | Update map.ts | 2024-11-20 16:47:07 -08:00
Nicolas | c78dae178b | Merge branch 'main' into nsc/new-extract | 2024-11-20 16:41:13 -08:00
Nicolas | 945183ffbd | Update extract.ts | 2024-11-20 16:40:55 -08:00
Nicolas | d196b9d93d | Update extract.ts | 2024-11-20 13:16:36 -08:00
Nicolas | 9512d81e05 | Update extract.ts | 2024-11-20 13:15:52 -08:00
Nicolas | 3de4997f4d | Loggin num tokens | 2024-11-20 13:09:46 -08:00
Nicolas | 769f08c10d | Billing and log for extract | 2024-11-20 13:08:09 -08:00
Nicolas | 0e4e9a3b37 | Nick: | 2024-11-20 13:01:36 -08:00
Nicolas | 67a2989874 | Nick: fixes | 2024-11-20 12:48:10 -08:00
Gergő Móricz | 79a75e088a | feat(crawl): allowSubdomain | 2024-11-19 18:38:59 +01:00
rafaelmmiller | 53134b7c85 | Rafa: removed throw error and added map to requests | 2024-11-19 09:34:52 -03:00
rafaelmmiller | 36cf49c959 | Merge remote-tracking branch 'origin/main' into nsc/new-extract | 2024-11-19 09:34:08 -03:00
rafaelmmiller | 77e152cba8 | added team_id to scrape-status endpoint | 2024-11-18 15:02:00 -03:00
Gergő Móricz | 1b032b05fa | fix(map): make sitemapOnly simpler | 2024-11-15 21:14:32 +01:00
Gergő Móricz | a4d3dba865 | fix(map): ignore limit when using sitemapOnly | 2024-11-15 21:03:20 +01:00
Gergő Móricz | 7b02c45dd0 | fix(v1/types): better timeout primitives | 2024-11-15 19:35:54 +01:00
Gergő Móricz | c95a4a26c9 | fix(v1/batch/scrape): raise default timeout | 2024-11-15 18:58:03 +01:00