A powerful automated web crawler that performs login, button clicking, and generates PDF reports. Runs on a daily schedule and is deployable to Netlify.
- ✅ Automated browser login with credentials
- ✅ Automated button clicking
- ✅ PDF report generation with execution details
- ✅ Daily scheduling at 10:00 AM
- ✅ Screenshot capture for verification
- ✅ Comprehensive logging system
- ✅ Web API for manual execution
- ✅ Netlify deployment ready
- ✅ Error handling and retry logic
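The retry logic in the last feature typically amounts to re-running a failed step after a delay. A minimal sketch (the attempt count and backoff below are illustrative assumptions, not the project's actual defaults):

```javascript
// withRetry: run an async step, retrying on failure with a growing delay.
// The attempt count and delay are illustrative defaults, not project values.
async function withRetry(fn, { attempts = 3, delayMs = 1000 } = {}) {
  let lastErr;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn(); // success: return immediately
    } catch (err) {
      lastErr = err;
      if (i < attempts) {
        // linear backoff: wait a little longer after each failed attempt
        await new Promise((resolve) => setTimeout(resolve, delayMs * i));
      }
    }
  }
  throw lastErr; // all attempts failed
}
```

Something like `await withRetry(() => crawler.login())` would then tolerate a flaky login up to three times before giving up.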
Install dependencies:

```bash
npm install
```

Copy the example environment file and configure it:

```bash
copy .env.example .env
```

Edit `.env` with your website details:
```env
# Website Configuration
TARGET_URL=https://your-website.com/login
LOGIN_USERNAME=your_username
LOGIN_PASSWORD=your_password

# CSS Selectors for your website
USERNAME_SELECTOR=#username
PASSWORD_SELECTOR=#password
LOGIN_BUTTON_SELECTOR=#login-button
PUNCH_BUTTON_SELECTOR=#punch-button

# Scheduling (10 AM daily)
CRON_SCHEDULE=0 10 * * *
TIMEZONE=Asia/Kolkata
```

```bash
# Run once for testing
npm start

# Run with web server
npm run dev
```

- Connect your repository to Netlify
- Set environment variables in Netlify dashboard
- Deploy!
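A minimal `netlify.toml` matching the project layout might look like this (a sketch; adjust the paths if your publish or functions directories differ):

```toml
# netlify.toml — build settings matching the project structure below
[build]
  publish = "dist"        # web dashboard
  functions = "functions" # Netlify Functions directory
```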
You need to inspect your target website and find the CSS selectors for:
- Username field: usually `#username`, `#email`, or `input[name="username"]`
- Password field: usually `#password` or `input[name="password"]`
- Login button: usually `#login`, `button[type="submit"]`, or `.login-btn`
- Punch button: the button you want to click after login
- Open your target website in Chrome
- Right-click on the element you want to select
- Choose "Inspect" or "Inspect Element"
- Right-click on the highlighted HTML
- Choose "Copy" > "Copy selector"
The `CRON_SCHEDULE` uses standard cron syntax:

- `0 10 * * *` = 10:00 AM daily
- `0 9 * * 1-5` = 9:00 AM weekdays only
- `0 */2 * * *` = every 2 hours
```
autopunch/
├── src/
│   ├── index.js            # Main application
│   ├── server.js           # Web API server
│   ├── config/
│   │   └── config.js       # Configuration management
│   ├── crawler/
│   │   └── webCrawler.js   # Selenium web crawler
│   ├── scheduler/
│   │   └── scheduler.js    # Cron job scheduler
│   └── utils/
│       ├── logger.js       # Logging utility
│       └── pdfGenerator.js # PDF report generator
├── functions/              # Netlify Functions
│   ├── api.js              # Main API handler
│   └── scheduler.js        # Scheduled function
├── dist/
│   └── index.html          # Web dashboard
├── reports/                # Generated PDF reports
├── logs/                   # Application logs
└── screenshots/            # Captured screenshots
```
When running the web server:
- `GET /` - Dashboard
- `GET /health` - Health check
- `POST /api/run` - Manual execution
- `GET /api/logs` - View recent logs
- `GET /api/schedule` - View scheduled tasks
Set these in your Netlify dashboard:
```env
TARGET_URL=https://your-website.com/login
LOGIN_USERNAME=your_username
LOGIN_PASSWORD=your_password
USERNAME_SELECTOR=#username
PASSWORD_SELECTOR=#password
LOGIN_BUTTON_SELECTOR=#login-button
PUNCH_BUTTON_SELECTOR=#punch-button
BROWSER_HEADLESS=true
CRON_SCHEDULE=0 10 * * *
TIMEZONE=Asia/Kolkata
```
Netlify doesn't support cron jobs directly. You have two options:
1. GitHub Actions (Recommended):
   - Create `.github/workflows/schedule.yml`
   - Use GitHub Actions to trigger your Netlify function on a schedule
2. External Cron Service:
   - Use a service like cron-job.org
   - Set up a webhook to your Netlify function
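For the GitHub Actions route, a workflow along these lines would work (a sketch: the function URL is a placeholder you must replace, and note that GitHub's cron runs in UTC, so 10:00 Asia/Kolkata is 04:30 UTC):

```yaml
# .github/workflows/schedule.yml (sketch — replace the URL with your site's)
name: daily-autopunch
on:
  schedule:
    - cron: '30 4 * * *'  # 10:00 Asia/Kolkata is 04:30 UTC
  workflow_dispatch:       # allow manual triggering from the Actions tab
jobs:
  trigger:
    runs-on: ubuntu-latest
    steps:
      - name: Call the Netlify function
        run: curl -fsS -X POST "https://your-site.netlify.app/.netlify/functions/api"
```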
- Store credentials securely in environment variables
- Use HTTPS URLs only
- Consider using encrypted credential storage
- Review website terms of service before automating
- Selectors not found: Inspect the website and update CSS selectors
- Login failed: Check credentials and selectors
- Timeout errors: Increase the `BROWSER_TIMEOUT` value
- Chrome driver issues: Update Chrome and chromedriver
Set `BROWSER_HEADLESS=false` to see the browser in action.
Check the `logs` directory for detailed execution logs:
```bash
# View recent logs
npm run logs

# Or check the files directly
type logs\autopunch.log
```

MIT License - see LICENSE file for details.
If you encounter any issues:
- Check the logs for detailed error messages
- Verify your website selectors are correct
- Test with `BROWSER_HEADLESS=false` to see what's happening
- Ensure your website allows automated access
Note: This tool is for educational purposes. Always respect website terms of service and rate limits.