Changes from 2 commits
40 commits
5c64a71
Added new feature in the existing GST calculator
tanujbordikar Aug 4, 2023
0b9e78c
Added new feature in the existing GST calculator
tanujbordikar Aug 4, 2023
2cffc8f
Improved the calculator by using tkinter
tanujbordikar Aug 4, 2023
0ad6ecc
added the screenshot with tkinter gui and also created a new document…
tanujbordikar Aug 5, 2023
369dcbe
added techCrunch.py
Mihan786Chistie Aug 5, 2023
af89634
updated techCrunch.py
Mihan786Chistie Aug 5, 2023
f1ae4ed
updated techCrunch.py
Mihan786Chistie Aug 5, 2023
58c7c6d
added README.md
Mihan786Chistie Aug 5, 2023
3f15a8f
updated techCrunch.py
Mihan786Chistie Aug 5, 2023
2dc2121
added requirements.txt
Mihan786Chistie Aug 5, 2023
e5d8c91
Pixel Art Generator Script Added
andoriyaprashant Aug 5, 2023
eeda4e1
Lint Fix
andoriyaprashant Aug 5, 2023
ac34009
culturally-inspired names imaginary
Swapnil-2503 Aug 5, 2023
66af624
morphological transformations
invigorzz313 Aug 5, 2023
31a905e
Infinite Runner with Obstacles Script Added
andoriyaprashant Aug 5, 2023
46c36bf
Gomoku_game.py
Shikhar9425 Aug 6, 2023
ac814f2
README.md
Shikhar9425 Aug 6, 2023
0171cee
Adding code, README file
MrResilient Aug 8, 2023
2328db2
Adding code, README file
MrResilient Aug 8, 2023
8a9853c
Merge branch 'cont1' of https://github.com/Shivansh-Jain-github/Amazi…
MrResilient Aug 8, 2023
eec9f8b
Merge pull request #2687 from Shivansh-Jain-github/master
1e9abhi1e10 Aug 8, 2023
14d473a
Merge pull request #2686 from Shivansh-Jain-github/cont1
1e9abhi1e10 Aug 8, 2023
a741426
Completed payment receipt project
Yashika-Agrawal Aug 8, 2023
2d2d94b
Merge pull request #2691 from Yashika-Agrawal/Payment
1e9abhi1e10 Aug 8, 2023
3250119
Adding code, README.md file
MrResilient Aug 8, 2023
84f20e8
Adding code, README.md file
MrResilient Aug 8, 2023
83f766a
Add commit
MrResilient Aug 8, 2023
51cf405
Delete
MrResilient Aug 8, 2023
fabfd2c
delete
MrResilient Aug 8, 2023
034c8d4
Merge pull request #2696 from Shivansh-Jain-github/CONT1
1e9abhi1e10 Aug 8, 2023
42b6943
Merge pull request #2695 from Shivansh-Jain-github/cont1
1e9abhi1e10 Aug 8, 2023
2efacbb
Merge pull request #2668 from Shikhar9425/master-8
1e9abhi1e10 Aug 8, 2023
57fe6d4
Merge pull request #2639 from Swapnil-2503/culturally-inspired-names
1e9abhi1e10 Aug 8, 2023
bfb1e89
Merge pull request #2635 from Mihan786Chistie/techCrunch
1e9abhi1e10 Aug 8, 2023
b25ce3d
Merge pull request #2650 from andoriyaprashant/branch28
1e9abhi1e10 Aug 8, 2023
a9dee15
Merge pull request #2637 from andoriyaprashant/branch27
1e9abhi1e10 Aug 8, 2023
8890205
Merge pull request #2640 from invigorzz313/morphtransforms
1e9abhi1e10 Aug 8, 2023
148eaea
Merge pull request #2631 from tanujbordikar/screenshot
1e9abhi1e10 Aug 8, 2023
4d01156
True False Automation
MrResilient Aug 8, 2023
c58725c
WhatsApp_timer_messenger code
MrResilient Aug 8, 2023
27 changes: 27 additions & 0 deletions Web Scrapping using Beautiful Soup/README.md
@@ -0,0 +1,27 @@
# Web Scraping with Beautiful Soup

This script performs web scraping on a CodeChef problem statement webpage using the Beautiful Soup library in Python.

## Description

The Python script utilizes the `requests` and `BeautifulSoup` libraries to extract information from a CodeChef problem statement webpage. It demonstrates the following actions:

- Printing the title of the webpage.
- Finding and printing all links on the page.
- Extracting text from paragraphs.
- Extracting image URLs.
- Counting and categorizing HTML tags.
- Filtering and printing valid links.
- Saving extracted data to a text file.

## Prerequisites

Ensure you have the following libraries installed:

- `requests`
- `beautifulsoup4`

You can install them with the following command:

```bash
pip install requests beautifulsoup4
```
57 changes: 57 additions & 0 deletions Web Scrapping using Beautiful Soup/code.py
@@ -0,0 +1,57 @@
import requests
from bs4 import BeautifulSoup
import re

url = 'https://www.codechef.com/problems/TWORANGES?tab=statement'
response = requests.get(url, timeout=10)  # timeout so a stalled connection cannot hang the script
soup = BeautifulSoup(response.content, 'html.parser')

# Print the title of the webpage
print(f"Title: {soup.title.text}\n")

# Find and print all links on the page
print("Links on the page:")
for link in soup.find_all('a'):
    print(link.get('href'))

# Extract text from paragraphs
print("\nText from paragraphs:")
for paragraph in soup.find_all('p'):
    print(paragraph.text)

# Extract image URLs
print("\nImage URLs:")
for img in soup.find_all('img'):
    img_url = img.get('src')
    if img_url:
        print(img_url)

# Count and categorize tags
print("\nTag counts:")
tag_counts = {}
for tag in soup.find_all():
    tag_name = tag.name
    if tag_name:
        tag_counts[tag_name] = tag_counts.get(tag_name, 0) + 1

for tag, count in tag_counts.items():
    print(f"{tag}: {count}")

# Filter and print valid links
print("\nValid links:")
for link in soup.find_all('a'):
    href = link.get('href')
    if href and re.match(r'^https?://', href):
        print(href)

# Save data to a file
with open('webpage_data.txt', 'w', encoding='utf-8') as file:  # explicit encoding for non-ASCII page text
    file.write(f"Title: {soup.title.text}\n\n")
    file.write("Links on the page:\n")
    for link in soup.find_all('a'):
        file.write(f"{link.get('href')}\n")
    file.write("\nText from paragraphs:\n")
    for paragraph in soup.find_all('p'):
        file.write(f"{paragraph.text}\n")

print("\nData saved to 'webpage_data.txt'")
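
The "valid links" step in the script keeps only hrefs that already start with `http://` or `https://`, so relative links such as `/contests` are dropped. One possible alternative, sketched below, resolves each href against the page URL before filtering. The helper name `absolute_links` is hypothetical and not part of this pull request. It uses only the standard library and takes plain href strings (or `None`), such as the values returned by `link.get('href')`.

```python
from urllib.parse import urljoin, urlparse


def absolute_links(base_url, hrefs):
    """Resolve hrefs against base_url; return unique http(s) URLs in first-seen order.

    Hypothetical helper for illustration: not part of the original script.
    """
    seen = set()
    result = []
    for href in hrefs:
        if not href:
            # Skip missing (None) or empty href attributes.
            continue
        absolute = urljoin(base_url, href.strip())
        if urlparse(absolute).scheme not in ("http", "https"):
            # Skip mailto:, javascript:, and other non-web schemes.
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

In the script above it could be called as `absolute_links(url, (a.get('href') for a in soup.find_all('a')))`, assuming `url` and `soup` are defined as in `code.py`.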