Conversation

muneeb-ashraf

No description provided.

muneeb-ashraf and others added 30 commits August 24, 2025 22:43
- Refactored the Express-style router to a standard Next.js API route handler.
- Added explicit types for `NextApiRequest` and `NextApiResponse`.
- Added types for parameters within the `page.$$eval()` callback to resolve implicit 'any' errors.
- Added null checks for `textContent` to fix build failures under strict mode.
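The strict-mode fixes above come down to null-checking `textContent` inside the `page.$$eval()` callback. A minimal sketch of that narrowing, with the element type reduced to the one field involved (the helper name is invented for illustration):

```typescript
// Under strict TypeScript, DOM textContent is `string | null`, so the
// callback passed to page.$$eval("td", ...) must narrow it before use.
type CellLike = { textContent: string | null };

// Mirrors the shape of a $$eval callback, but runs on plain objects too.
export function extractCellText(cells: CellLike[]): string[] {
  return cells
    .map((cell) => cell.textContent) // string | null under strict mode
    .filter((text): text is string => text !== null) // the null check that fixes the build
    .map((text) => text.trim());
}
```

The type-predicate filter is what lets the compiler treat the remaining values as `string` rather than `string | null`.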
- Replaced `puppeteer` with `puppeteer-core` and `@sparticuz/chromium` to enable running in a serverless environment.
- Updated the Puppeteer launch logic in the `scrapeCompany` API route to use the executable provided by `@sparticuz/chromium`.
- Updated dependency versions to be compatible.
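The serverless launch wiring might look like the following sketch. The helper is hypothetical; the comment shows roughly how `@sparticuz/chromium` and `puppeteer-core` are typically combined, with the packaged binary's path fed into the launch call:

```typescript
// Hypothetical helper assembling the launch options described above.
// At runtime the values would come from @sparticuz/chromium, e.g.:
//   const chromium = (await import("@sparticuz/chromium")).default;
//   const puppeteer = await import("puppeteer-core");
//   const browser = await puppeteer.launch(
//     buildLaunchOptions(await chromium.executablePath(), chromium.args)
//   );
export function buildLaunchOptions(executablePath: string, args: string[]) {
  return {
    args,            // Chromium flags tuned for the serverless sandbox
    executablePath,  // the binary shipped by @sparticuz/chromium
    headless: true,  // no display server in a serverless function
  };
}
```

Keeping the option-building separate from the launch call makes the configuration testable without a browser.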
- Pinned `@sparticuz/chromium` to `123.0.1` and `puppeteer-core` to `22.7.1`.
- This is to resolve the `libnss3.so: cannot open shared object file` error on Vercel.
- Using exact versions prevents `npm` from installing newer, potentially incompatible patch versions that may rely on system libraries not present in the Vercel runtime.
- Adds an `engines` field to `package.json` to specify the Node.js version for the Vercel deployment.
- This is to fix the `libnss3.so: cannot open shared object file` error by forcing Vercel to use the Node.js 18 runtime, which is expected to have the necessary system libraries for the pre-compiled Chromium binary.
- Aligns the project configuration with the official Vercel guide for deploying Puppeteer.
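Taken together, the pinned versions and the `engines` field amount to a `package.json` along these lines (only the relevant fields shown; versions and Node major taken from the commit messages above, everything else elided):

```json
{
  "engines": {
    "node": "18.x"
  },
  "dependencies": {
    "@sparticuz/chromium": "123.0.1",
    "puppeteer-core": "22.7.1"
  }
}
```

Note the absence of `^` or `~` prefixes: exact versions are what stop `npm` from drifting onto an incompatible patch release.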
- Adds `serverExternalPackages` to `next.config.js` to prevent bundling of puppeteer-core and @sparticuz/chromium.
- Refactors the API route to use dynamic imports for puppeteer packages, which is the recommended best practice.
- This should be the definitive fix for the runtime errors on Vercel.
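The `next.config.js` change described above would look roughly like this, assuming the option name from the commit message:

```javascript
// next.config.js — keep the browser packages out of the serverless bundle
// so their native binaries are resolved from node_modules at runtime.
module.exports = {
  serverExternalPackages: ["puppeteer-core", "@sparticuz/chromium"],
};
```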
Upgrades @sparticuz/chromium to v138 and puppeteer-core to v24.

This is to provide a Chromium binary that is compatible with the ARM64 architecture used in modern Vercel runtimes.

This should be the definitive fix for the persistent "libnss3.so: cannot open shared object file" error, which was likely caused by an architecture mismatch.
- Removes unsupported `serverExternalPackages` property from `next.config.js` for compatibility with Next.js v13.2.4.
- Wraps the result of `page.pdf()` in `Buffer.from()` to fix a type error in `pdf.ts` caused by a puppeteer-core update.
- Updates the `puppeteer.launch()` call in `scrapeCompany.ts` to align with the new API of `@sparticuz/chromium` v138.
- Replaces the deprecated `chromium.defaultViewport` and `chromium.headless` properties with a manual viewport object and the new `headless: "shell"` mode.
- This resolves the final build error.
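The `Buffer.from()` fix reflects that recent `puppeteer-core` versions return a `Uint8Array` from `page.pdf()` rather than a `Buffer`. A sketch of the wrapper (function name invented; the launch-option values in the comment are illustrative, not from the PR):

```typescript
// In the route this would be used as:
//   const pdf = toResponseBuffer(await page.pdf({ format: "A4" }));
// alongside launch options such as:
//   headless: "shell",
//   defaultViewport: { width: 1280, height: 720 },
export function toResponseBuffer(pdfBytes: Uint8Array): Buffer {
  return Buffer.from(pdfBytes); // re-wrap the bytes so res.send() gets a Buffer
}
```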
- Temporarily modifies the scrapeCompany API to return the full page HTML instead of scraped data.
- This is to allow for analysis of the page structure to build a correct and robust scraper.
- Replaces the simple table-based scraper with a more robust, generic scraper.
- The new scraper iterates through all `td` elements, looking for key-value pairs based on text ending in a colon.
- Includes a special case to handle the DBA Name, which does not have a label.
- This should now correctly extract all data from the company details page.
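The colon-based pairing can be sketched as a pure function over the cell texts (an assumed shape; in the route the equivalent logic runs inside `page.$$eval()` in the browser context, and the DBA Name special case is omitted here):

```typescript
// A cell whose text ends in a colon is treated as a label; the next
// cell, if any, is taken as its value.
export function pairCells(cellTexts: string[]): Record<string, string> {
  const result: Record<string, string> = {};
  for (let i = 0; i < cellTexts.length; i++) {
    const text = (cellTexts[i] ?? "").trim();
    if (text.endsWith(":") && i + 1 < cellTexts.length) {
      const key = text.slice(0, -1).trim(); // drop the trailing colon
      result[key] = (cellTexts[i + 1] ?? "").trim();
    }
  }
  return result;
}
```

Because it keys off the colon convention rather than table position, the same pass survives layout changes that would break a row/column-indexed scraper.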
- Refactors the scraper into separate handlers for name, license, and combined searches.
- Implements case-insensitive name searching.
- Returns a list of close matches if no exact name match is found.
- Implements a validation flow to compare name and license search results.
- Standardizes all API responses into a `{ status, ... }` format.
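One way the standardized envelope and the case-insensitive matching could be modeled; every field name beyond `status` is an assumption, since the PR does not list them:

```typescript
// A discriminated union makes the `{ status, ... }` contract explicit.
type ScrapeResponse =
  | { status: "ok"; data: Record<string, string> }
  | { status: "close_matches"; matches: string[] }
  | { status: "error"; message: string };

export function closeMatches(matches: string[]): ScrapeResponse {
  return { status: "close_matches", matches };
}

// Case-insensitive name comparison, as used by the name-search flow.
export function namesMatch(a: string, b: string): boolean {
  return a.trim().toLowerCase() === b.trim().toLowerCase();
}
```

Callers can then switch on `status` and get the matching payload type narrowed for free.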
- Adds a `.filter(a => a.textContent)` to the `page.$$eval` calls in `searchByName` and `searchByLicense`.
- This prevents a runtime error if an anchor tag has no text content.
- Resolves the TypeScript build failure.
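Extracted from the browser context, the guard is just a truthiness filter; the anchor type below is narrowed to the fields involved, as a sketch:

```typescript
type AnchorLike = { textContent: string | null; href: string };

// .filter(a => a.textContent) drops anchors whose text is null or "",
// so later code can safely call .trim() on what remains.
export function withText(anchors: AnchorLike[]): AnchorLike[] {
  return anchors.filter((a) => a.textContent);
}
```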
- Changes the search result selector from a specific table-based one to a general `a`-tag selector.
- Adds a filter to only process links whose `href` contains `licensenum=`, ensuring only relevant company links are considered.
- This resolves the regression where no companies were being found after a search.
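The `href` filter can likewise be expressed over plain anchor-shaped objects (a sketch, not the PR's exact code):

```typescript
type AnchorLike = { textContent: string | null; href: string };

// Collect every anchor on the results page, then keep only those whose
// href carries a licensenum query parameter — i.e. the company links.
export function companyLinks(anchors: AnchorLike[]): AnchorLike[] {
  return anchors.filter((a) => a.href.includes("licensenum="));
}
```

Selecting broadly and filtering on a stable URL fragment is more resilient than a selector tied to the page's table markup.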
- Rewrites the core performSearch function to be more robust.
- Adds explicit `page.waitForSelector()` calls before every click and type action to prevent race conditions.
- This should resolve the "No results found" error by ensuring the script does not interact with elements before they are ready.
- Updates the button selectors in the performSearch function to be more specific, using both class and name attributes.
- This increases the reliability of the automation and should prevent the search flow from failing.
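The wait-before-interact pattern from the last two commits can be sketched with hypothetical helpers against a narrowed `Page` interface; the method names match `puppeteer-core`'s real API, but the helpers themselves are invented:

```typescript
// A stand-in for the slice of puppeteer's Page API used here.
interface PageLike {
  waitForSelector(selector: string): Promise<unknown>;
  click(selector: string): Promise<void>;
  type(selector: string, text: string): Promise<void>;
}

export async function safeType(page: PageLike, selector: string, text: string): Promise<void> {
  await page.waitForSelector(selector); // never type into an element that is not ready
  await page.type(selector, text);
}

export async function safeClick(page: PageLike, selector: string): Promise<void> {
  await page.waitForSelector(selector);
  // A compound selector such as 'input.button[name="Search"]' pins down
  // the exact control, in the spirit of the last commit above.
  await page.click(selector);
}
```

Routing every interaction through helpers like these guarantees the wait is never forgotten on a new code path.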

vercel bot commented Aug 27, 2025

@muneeb-ashraf is attempting to deploy a commit to the Joel Griffith's projects Team on Vercel.

A member of the Team first needs to authorize it.
