Use the Next.js `robots.ts` setup to create separate `robots.txt` files per environment. For non-prod environments, set a disallow on everything to prevent Google from indexing QA.
Pull request overview
This PR implements environment-specific robots.txt generation using Next.js's robots.ts functionality to prevent search engine indexing of non-production environments while maintaining selective bot blocking in production.
Changes:
- Removed static robots.txt file from public directory
- Added dynamic robots.ts handler that generates environment-specific rules
- Configured complete disallow for non-production environments to prevent indexing
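As a rough illustration of the changes listed above, the environment switch might look like the sketch below. It is not the PR's actual file: `buildRobots` and the `env` parameter are hypothetical names, the local `Robots` type is a hand-written stand-in for Next.js's `MetadataRoute.Robots` so the snippet runs without the `next` package, and how the environment is detected is an assumption.

```typescript
// Local stand-ins for the shape of Next.js's MetadataRoute.Robots,
// declared here so the sketch runs without the `next` package.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};
type Robots = { rules: RobotsRule | RobotsRule[] };

// Hypothetical helper: `env` is assumed to come from whichever
// environment variable the deployment uses to mark production.
function buildRobots(env: string | undefined): Robots {
  if (env !== "production") {
    // Non-production (QA, preview, local): block every crawler.
    return { rules: { userAgent: "*", disallow: "/" } };
  }
  // Production: block only the specific bots named in this PR.
  return {
    rules: [
      { userAgent: "Bytedance", disallow: "/" },
      { userAgent: "Bytespider", disallow: "/" },
      { userAgent: "GPTBot", disallow: "/" },
    ],
  };
}
```

In a real `app/robots.ts`, the default export would presumably return the result of a function like this, with the environment value read from the deployment's configuration.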
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| public/robots.txt | Removed static robots.txt file to enable dynamic generation |
| app/robots.ts | Added dynamic robots handler with environment-specific rules |
```ts
return {
  rules: [
    {
      userAgent: "Bytedance",
      disallow: "/",
    },
    {
      userAgent: "Bytespider",
      disallow: "/",
    },
    {
      userAgent: "GPTBot",
      disallow: "/",
    },
  ],
};
```
The production rules block specific bots but don't specify behavior for other crawlers. This means bots like Googlebot will have unrestricted access by default. Consider adding an explicit rule for '*' user agent with 'allow: /' to make the intent clear, or add 'allow' rules for the blocked bots if they should only be blocked from specific paths.
gonna ignore this to maintain the exact existing robots.txt
```ts
rules: {
  userAgent: "*",
  disallow: "/",
},
```
The rules property should be an array to match the MetadataRoute.Robots type definition and maintain consistency with the production configuration. Change 'rules: {' to 'rules: [{' and close with '}]'.
```diff
- rules: {
-   userAgent: "*",
-   disallow: "/",
- },
+ rules: [
+   {
+     userAgent: "*",
+     disallow: "/",
+   },
+ ],
```
The Robots type can be an array or single object, gonna ignore this
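To make the object-or-array point concrete, here is a minimal sketch of a renderer that accepts either form and produces the same text for both. This is a simplified, hypothetical `renderRobots` helper, not Next.js's own serializer; the local `RobotsRule` type is an assumed stand-in for the rule shape, and the exact output formatting is an assumption.

```typescript
// Assumed stand-in for a single robots rule; not imported from `next`.
type RobotsRule = {
  userAgent?: string | string[];
  allow?: string | string[];
  disallow?: string | string[];
};

// Normalize a string-or-array field to an array.
function asList(value: string | string[] | undefined): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical renderer: accepts a single rule object or an array of rules.
function renderRobots(rules: RobotsRule | RobotsRule[]): string {
  const list = Array.isArray(rules) ? rules : [rules];
  const blocks = list.map((rule) => {
    const agents = asList(rule.userAgent);
    // Fall back to "*" when no user agent is given (an assumption of this sketch).
    const lines = (agents.length > 0 ? agents : ["*"]).map((a) => `User-Agent: ${a}`);
    for (const path of asList(rule.allow)) lines.push(`Allow: ${path}`);
    for (const path of asList(rule.disallow)) lines.push(`Disallow: ${path}`);
    return lines.join("\n");
  });
  return blocks.join("\n\n") + "\n";
}
```

With this sketch, `renderRobots({ userAgent: "*", disallow: "/" })` and `renderRobots([{ userAgent: "*", disallow: "/" }])` return identical strings, which is why the single-object form in the non-production branch is equivalent to the suggested array form.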
Ticket:
This PR does the following:
Use the Next.js `robots.ts` setup to create separate `robots.txt` files per environment. For non-prod environments, set a disallow on everything to prevent Google from indexing QA.
Open questions
How has this been tested? How should a reviewer test this?
See the generated robots file on vercel: https://digital-collections-git-robots-ts-nypl.vercel.app/robots.txt
Accessibility concerns or updates
Checklist: