Fix: batch improvements and bug fixes (9 commits) #105
Open
MorsH14 wants to merge 12 commits into
…nditions
- youtubeURL.ts: add require.main guard to prevent auto-execution on import; also accept URLs from CLI args so the script is reusable
- WebsiteScraping.ts: same require.main guard + accept URLs from CLI args or TRAIN_WEBSITES env var instead of hardcoded French medical sites
- routes/api.ts: move /logout before requireAuth so expired tokens can still clear their own cookie; add targetAccount validation to GET and POST /scrape-followers (missing param caused navigation to instagram.com/undefined/)
- TrainWithAudio.ts: only delete local audio file after successful processing; the finally block was destroying the file even on upload failure (data loss)
- IgClient.ts: fix scrapeFollowers limit calculation so maxFollowers=0 returns all followers instead of empty array; add 'paid partnership with' to default ad markers to match .env.example
- IG-bot/index.ts: remove schedulePost from IInstagramClient interface and class; the implementation always threw, making the interface contract unsatisfiable
- api/agent/index.ts: remove the Express router that was defined but never mounted anywhere; keep only the getShouldExitInteractions/set exports
- Agent/index.ts + utils/index.ts: fix concurrent API key rotation; a module-level shared Set caused one request's key exhaustion to poison other requests; triedKeys is now per-call and passed through recursive retries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
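As a rough illustration of the targetAccount validation mentioned for routes/api.ts, here is a minimal sketch, not the PR's actual code. The helper name `normalizeTargetAccount` and the username pattern (letters, digits, dots and underscores, up to 30 characters) are assumptions made for this example.

```typescript
// Hypothetical helper: the name and the username pattern are assumptions,
// not taken from the PR. It returns a cleaned username, or null when the
// parameter is missing or malformed, so the route can answer 400 instead of
// navigating to instagram.com/undefined/.
function normalizeTargetAccount(raw: unknown): string | null {
  if (typeof raw !== "string") {
    return null; // a missing query/body param arrives as undefined
  }
  const username = raw.trim().replace(/^@/, "");
  if (!/^[A-Za-z0-9._]{1,30}$/.test(username)) {
    return null; // empty, too long, or contains path/URL characters
  }
  return username;
}
```

A route handler could call this first and return a 400 response when the result is null, before any browser navigation is attempted.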
…tation
- src/test/index.ts: check the err param in the download callback before attempting to upload media; previously a failed download still called twitterClient.v1.uploadMedia on a missing/incomplete file
- src/models/Tweet.ts: make imageUrl optional (required: false, default: null) to match saveTweetData which stores imageUrl || null; required: true caused silent Mongoose validation failures when imageUrl was absent
- package.json: add mime-types to dependencies; sample/Audio.ts imports it directly but it was only listed under @types/mime-types in devDependencies, causing a missing-module error at runtime
- src/utils/index.ts: propagate triedKeys Set through handleError recursive retries so 429 rotation stops correctly after exhausting all keys; previously each handleError call created a fresh Set, allowing infinite key cycling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
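The download-callback guard described for src/test/index.ts can be sketched roughly as follows. The `DownloadFn` and `UploadFn` shapes are hypothetical stand-ins for the real download helper and `twitterClient.v1.uploadMedia`; only the early return on `err` reflects the fix described above.

```typescript
// Hypothetical shapes: these stand in for the real download helper and the
// Twitter upload call; they are assumptions for illustration only.
type DownloadFn = (url: string, dest: string, done: (err: Error | null) => void) => void;
type UploadFn = (path: string) => void;

function downloadThenUpload(
  url: string,
  dest: string,
  download: DownloadFn,
  uploadMedia: UploadFn,
  onError: (err: Error) => void,
): void {
  download(url, dest, (err) => {
    if (err) {
      // Stop here: the file at `dest` may be missing or incomplete.
      onError(err);
      return;
    }
    uploadMedia(dest);
  });
}
```

Passing the collaborators in as parameters is only a convenience for this sketch; it makes the failure path easy to exercise without touching the network.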
…riedKeys propagation
- check-env.js: only require MONGODB_URI when MONGODB_REQUIRED != 'false'; previously blocked all non-MongoDB setups from passing the env check
- logger.ts: use process.cwd()/logs instead of __dirname/../logs; the old path resolved to build/logs/ after tsc while DailyRotateFile wrote to cwd/logs/, making mkdirSync create a directory Winston never used
- WebsiteScraping.ts: normalize baseUrl to end with '/' before startsWith check to prevent cross-domain crawl (e.g. example.com matching example.com.evil.com)
- Agent/index.ts: pass triedKeys to handleError so 503 retry paths do not reset key rotation state and re-exhaust already-tried keys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
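The baseUrl normalization in WebsiteScraping.ts is easiest to see with a small comparison. This is a sketch under assumptions: the function names `naiveSameSite` and `isSameSite` are invented here, and the real crawler may apply additional checks.

```typescript
// Naive check: a bare prefix match lets a look-alike host through.
function naiveSameSite(baseUrl: string, candidate: string): boolean {
  return candidate.startsWith(baseUrl);
}

// Normalized check: force a trailing '/' on both sides so the prefix must
// end at a path boundary. Function names are hypothetical.
function isSameSite(baseUrl: string, candidate: string): boolean {
  const normalizedBase = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  const normalizedCandidate = candidate.endsWith("/") ? candidate : candidate + "/";
  return normalizedCandidate.startsWith(normalizedBase);
}
```

With a base of `https://example.com`, the naive version accepts `https://example.com.evil.com/x`, while the normalized version rejects it and still accepts `https://example.com/about`.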
…ze limit
- secret/index.ts: replace manual cookie regex /token=([^;]+)/ with req.cookies.token; the regex matched substrings so a cookie named e.g. 'supertoken' appearing before 'token' in the header would return the wrong value (cookie-parser already runs before all routes)
- routes/api.ts: sanitize targetAccount before embedding in Content-Disposition filename; unsanitized user input could inject quotes or other characters that break the header attribute syntax
- app.ts: add 1mb limit to express.json() to prevent memory exhaustion from arbitrarily large JSON request bodies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
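The substring problem with the cookie regex can be reproduced in a few lines. The exact-name lookup below is a hand-rolled stand-in used only to show the difference; the PR itself relies on `req.cookies.token` from cookie-parser, and this sketch skips details such as value decoding.

```typescript
// Old approach: the unanchored regex finds "token=" inside "supertoken=".
function tokenViaRegex(cookieHeader: string): string | undefined {
  const match = cookieHeader.match(/token=([^;]+)/);
  return match ? match[1] : undefined;
}

// Illustrative exact-name lookup (hypothetical helper, not cookie-parser):
// split on ';' and compare the whole cookie name.
function cookieByExactName(cookieHeader: string, cookieName: string): string | undefined {
  const parts = cookieHeader.split(";");
  for (const part of parts) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === cookieName) {
      return part.slice(eq + 1).trim();
    }
  }
  return undefined;
}
```

For the header `supertoken=abc; token=xyz`, the regex version returns `abc`, whereas the exact-name lookup returns `xyz`.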
The inner loop guarded with `followers.length < maxFollowers`, but when maxFollowers is undefined (unlimited mode), that becomes `x < undefined`, which is always false, so no followers were ever pushed. Using the already-computed `limit` variable (Infinity when unlimited, maxFollowers+4 when bounded) makes unlimited scraping work correctly and also ensures the bounded path collects enough entries to survive the slice(4) offset applied after the loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
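The loop guard described in this commit can be sketched as below. The function name `collectFollowers` and the flat `candidates` array are assumptions that replace the real page-scraping loop; the `limit` computation and the `slice(4)` offset follow the commit message.

```typescript
// Simplified stand-in for the scraping loop; `candidates` replaces the
// usernames read from the page, and the function name is hypothetical.
function collectFollowers(candidates: string[], maxFollowers?: number): string[] {
  const skip = 4; // offset removed after the loop, per the commit message
  // Infinity when unlimited, maxFollowers + 4 when bounded.
  const limit = maxFollowers === undefined ? Infinity : maxFollowers + skip;
  const followers: string[] = [];
  for (const username of candidates) {
    // Comparing against `limit` instead of `maxFollowers` avoids the
    // `x < undefined` comparison, which is always false.
    if (followers.length >= limit) break;
    followers.push(username);
  }
  return followers.slice(skip);
}
```

With ten candidates, the unlimited call keeps the last six entries, and a call with `maxFollowers = 3` collects seven and returns three after the offset.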
…e.ts)
The module-level currentApiKeyIndex and triedApiKeys were shared across all concurrent calls, so parallel training requests corrupted each other's key rotation state. This is the same class of bug fixed earlier in Agent/index.ts. Replaced with caller-owned apiKeyIndex/triedKeys parameters (defaulting to 0/empty set) passed through all recursive retry calls, making each invocation fully isolated. Also removed unnecessary `await` on the synchronous getYouTubeTranscriptSchema() call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
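A minimal sketch of caller-owned rotation state, assuming a simplified synchronous call. The names `callWithRotation` and `isRateLimited` are invented for illustration, and the real code is asynchronous; what carries over is that `apiKeyIndex` and `triedKeys` are parameters with per-call defaults and are passed through each recursive retry.

```typescript
// Treat an error carrying status 429 as "this key is rate limited".
// This error shape is an assumption for the sketch.
function isRateLimited(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: number }).status === 429;
}

// Hypothetical helper: state lives in the parameters, not at module level.
// The default `new Set()` is created afresh for every top-level call.
function callWithRotation<T>(
  keys: string[],
  attempt: (apiKey: string) => T,
  apiKeyIndex: number = 0,
  triedKeys: Set<number> = new Set<number>(),
): T {
  if (keys.length === 0) throw new Error("no API keys configured");
  triedKeys.add(apiKeyIndex);
  try {
    return attempt(keys[apiKeyIndex]);
  } catch (err) {
    if (!isRateLimited(err)) throw err;
    if (triedKeys.size >= keys.length) throw new Error("all API keys exhausted");
    const nextIndex = (apiKeyIndex + 1) % keys.length;
    // Pass the same Set along so the retry remembers what was tried.
    return callWithRotation(keys, attempt, nextIndex, triedKeys);
  }
}
```

Because nothing is stored outside the call, one request exhausting every key does not affect the next request, which starts again with an empty set.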
Without a transform configuration, Jest would attempt to execute .ts test files as plain JavaScript and fail immediately. The config wires up ts-jest as the TypeScript transformer and restricts the test glob to *.test.ts so that only proper test files are discovered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
instagram-ai-network was declared external: true, which requires the network to exist before docker compose up is run; otherwise Docker errors with 'network declared as external, but could not be found'. Changed to driver: bridge so Compose creates the network automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
closeIgClient('default') left Puppeteer browsers for all other
account keys open on SIGTERM/SIGINT. Added closeAllIgClients() that
iterates the igClients Map and closes every active client, then
updated the shutdown sequence to use it instead of the single-key call.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
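The shutdown change can be sketched as follows. The `ClosableClient` interface and the per-client error handling are assumptions for this example; the commit message only states that `closeAllIgClients()` iterates the igClients Map and closes every active client.

```typescript
// Hypothetical minimal interface for whatever wraps the Puppeteer browser.
interface ClosableClient {
  close(): Promise<void>;
}

// Close every client in the map, not just the 'default' key. Catching per
// client (an assumption here) keeps one failed close from skipping the rest.
async function closeAllIgClients(clients: Map<string, ClosableClient>): Promise<void> {
  const pending: Promise<void>[] = [];
  clients.forEach((client, accountKey) => {
    pending.push(
      client.close().catch((err: unknown) => {
        console.error("failed to close client for " + accountKey, err);
      }),
    );
  });
  await Promise.all(pending);
}
```

A SIGTERM/SIGINT handler would await this function before exiting the process, so that no browser is left running for non-default account keys.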
Owner
Thanks @MorsH14 — solid batch of fixes. A few are still worth landing on `main`:
Some overlap with other PRs: #103 (same `didRelogin` fix), #106 (body limits/CORS), #107 (input sanitization), #100 (session cookies, already partly on `main`). #101 / MongoDB / jest.config changes are superseded by recent `main` work. Could you rebase and open a focused PR with just the items above? Happy to merge once CI is green.
- .env.example: keep TRAIN_WEBSITES= and add upstream rate limit env vars
- docker-compose.yml: add restart: unless-stopped, drop obsolete network section
- scripts/check-env.js: adopt upstream's DATABASE_URL check (MongoDB -> Postgres)
- src/services/index.ts: use closeDB() instead of mongoose.disconnect()
- src/utils/index.ts: take upstream's triedKeys propagation fix in handleError
- src/client/Instagram.ts: trivial comment whitespace
- src/client/IG-bot/IgClient.ts: keep our unlimited-followers fix (upstream's requestedCount=0 approach returns empty for unlimited requests)
- src/routes/api.ts: use sanitizeFilename() and scrapeLimiter from upstream
- src/models/Tweet.ts: accept upstream deletion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Run prettier --write . to fix formatting across all files (package-lock pins prettier@3.8.1; CI now uses the same version)
- Remove jest.config.js (our addition); superseded by upstream's jest.config.cjs + jest.ts-transformer.cjs, which already covers .test.ts files

All CI checks now green locally:
- lint: 0 errors (1 unused-var warning, not a failure)
- typecheck: clean
- format:check: all 143 files pass
- test: 116/116 pass across 15 suites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This PR contains 9 bug fixes and improvements made today: