Yellow Pages Scraper: A Python tool to extract business information from Yellow Pages, offering command-line and Streamlit interfaces for seamless data retrieval.
🌐 Live Demo: Yellow Pages Scraper Streamlit
This Python script allows you to scrape business data from Yellow Pages. It provides both command-line and Streamlit interface options for ease of use.
- Scrape business data based on search terms and geographical location.
- Choose starting page number for scraping.
- Save the scraped data to a CSV file.
- Streamlit interface for interactive usage.
Clone the repository:
git clone https://github.com/adil6572/YP-business-scraper.git
cd YP-business-scraper
Install the required dependencies:
pip install -r requirements.txt
Run the scraper from the command line using the following syntax:
python CLI-interface.py "search terms" "location terms" --start_page 2 --filename output.csv
Replace "search terms", "location terms", and output.csv with your desired parameters.
Run the Streamlit interface:
streamlit run GUI-interface.py
Visit the provided URL in your web browser to use the interactive interface.
Run the scraper from the command line:
python CLI-interface.py "roofing" "Ravenswood, Chicago, IL" --start_page 2 --filename roofing.csv
Run the Streamlit interface:
streamlit run GUI-interface.py
Parameters:
- search_terms: Search terms for scraping (required).
- geo_location_terms: Geographical location terms for scraping (required).
- --start_page: Start page number for scraping (default: 1).
- --filename: CSV filename for output (default: business_data.csv).
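The parameters above map naturally onto an `argparse` parser. The following is a minimal sketch of how such a command line could be declared; it is not taken from `CLI-interface.py`, and the function name `build_parser` and the help strings are assumptions made for illustration. Only the argument names and defaults come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Scrape business data from Yellow Pages."
    )
    # Two required positional arguments, in the documented order.
    parser.add_argument("search_terms", help="Search terms for scraping")
    parser.add_argument(
        "geo_location_terms", help="Geographical location terms for scraping"
    )
    # Optional flags with the defaults stated in this README.
    parser.add_argument(
        "--start_page", type=int, default=1, help="Start page number for scraping"
    )
    parser.add_argument(
        "--filename", default="business_data.csv", help="CSV filename for output"
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["roofing", "Ravenswood, Chicago, IL", "--start_page", "2"]
)
print(args.search_terms, args.geo_location_terms, args.start_page, args.filename)
```

Omitting `--start_page` or `--filename` falls back to the documented defaults, while omitting either positional argument makes `argparse` exit with a usage error.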
The scraped data is saved to a CSV file with the following columns:
- Rank
- Business Name
- Phone Number
- Business Page
- Website
- Category
- Rating
- Street Name
- Locality
- Region
- Zipcode
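Once a CSV has been written, it can be loaded with the standard `csv` module. The sketch below assumes the header row uses exactly the column names listed above, which this README does not confirm, so adjust `EXPECTED_COLUMNS` if the real file differs. The helper `load_rows` is hypothetical, and the sample row is placeholder data, not real scraper output.

```python
import csv
import io

# Column names as listed in this README; the exact header spelling is an assumption.
EXPECTED_COLUMNS = [
    "Rank", "Business Name", "Phone Number", "Business Page", "Website",
    "Category", "Rating", "Street Name", "Locality", "Region", "Zipcode",
]


def load_rows(handle):
    """Read rows as dicts and fail early if an expected column is missing."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return list(reader)


# Build a tiny in-memory file with one placeholder row so the sketch is runnable.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=EXPECTED_COLUMNS)
writer.writeheader()
writer.writerow({
    "Rank": "1",
    "Business Name": "Example Roofing Co",
    "Phone Number": "555-0100",
    "Locality": "Chicago",
    "Region": "IL",
})
buffer.seek(0)

rows = load_rows(buffer)
print(rows[0]["Business Name"], rows[0]["Locality"])
```

For a real run, replace the in-memory buffer with `open("business_data.csv", newline="", encoding="utf-8")`, using whatever filename was passed via `--filename`.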
We welcome contributions from the community! If you'd like to contribute to the project, please follow these guidelines:
- Fork the Repository: Fork the project on GitHub and clone your fork.
- Create a New Branch: Create a new branch for your contribution using a descriptive name. For example:
  git checkout -b feature/new-feature
- Make Changes: Make your changes or additions to the code.
- Commit Changes: Commit your changes with a clear and concise commit message.
  git commit -m "Add new feature"
- Push Changes: Push your changes to your fork on GitHub.
  git push origin feature/new-feature
- Create a Pull Request: Open a pull request on the main repository. Provide a clear title and description of your changes.
If you encounter any issues or have suggestions, please create an issue on GitHub. When reporting issues, please include:
- A clear and descriptive title.
- A detailed description of the issue, including steps to reproduce.
- The version of the software you are using.
- Any relevant error messages or screenshots.
We appreciate your help in improving this project!
Planned enhancements:
- Proxy Support: Integrate proxy support for enhanced scraping.
- Detailed UI: Enhance the Streamlit interface with more detailed and user-friendly features.
- Multiple Search Terms and Locations: Allow users to input multiple search terms and locations for a comprehensive search.
- Multithreading: Implement multithreading to speed up the scraping experience.
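Until multiple search terms and locations are supported directly, a small wrapper can approximate that behavior by invoking the documented CLI once per combination. This is only a sketch under stated assumptions: `build_commands` and `slugify` are hypothetical helpers, and the per-combination output filename scheme is invented for illustration. It only builds the command lists and does not run them.

```python
import itertools
import re


def slugify(text: str) -> str:
    """Turn free text into a filename-safe fragment (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_commands(search_terms, locations, start_page=1):
    """Return one CLI invocation per (search term, location) pair."""
    commands = []
    for term, location in itertools.product(search_terms, locations):
        # Assumed naming scheme: one CSV per combination.
        filename = f"{slugify(term)}_{slugify(location)}.csv"
        commands.append([
            "python", "CLI-interface.py", term, location,
            "--start_page", str(start_page),
            "--filename", filename,
        ])
    return commands


commands = build_commands(["roofing", "plumbing"], ["Ravenswood, Chicago, IL"])
for command in commands:
    print(command)
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root to run the scrapes one after another.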
This project is licensed under the MIT License - see the LICENSE file for details.
Thank you for exploring our Yellow Pages Scraper. We hope you find it useful and enjoyable! If you have any feedback, suggestions, or just want to say hello, feel free to reach out.
Happy scraping and have a fantastic day! 🚀✨
