crawl4ai version
0.4.248
Expected Behavior
When I crawl a page that contains an HTML table (for example: https://www.german-tigers.de/trainingszeiten.php), the table should be exported correctly, at least in cleaned_html. In the html field of the result the table is intact, probably because that output is raw and not cleaned. The table should be just as complete in cleaned_html. If columns or rows are missing, that is a bug.
Current Behavior
Empty columns in an HTML table are removed. This makes the table invalid, and an LLM cannot properly extract data from it because the table is already wrong in cleaned_html.
Is this reproducible?
Yes
Inputs Causing the Bug
- Test URL (https://www.german-tigers.de/trainingszeiten.php)
- Use AsyncWebCrawler and run .arun() on that URL. No config is needed. Check the cleaned_html output and you will see that the table is wrong.
Steps to Reproduce
import asyncio

from crawl4ai import AsyncWebCrawler


async def main():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://www.german-tigers.de/trainingszeiten.php",
        )
        print(result.cleaned_html)
        return result.cleaned_html


if __name__ == "__main__":
    asyncio.run(main())
Code snippets
import asyncio

from crawl4ai import AsyncWebCrawler


async def main():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://www.german-tigers.de/trainingszeiten.php",
        )
        print(result.cleaned_html)
        return result.cleaned_html


if __name__ == "__main__":
    asyncio.run(main())
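To make the missing columns easier to confirm than by eyeballing the output, a small standard-library helper could count the cells in each table row and compare the raw and cleaned markup. This is only a sketch: `RowCellCounter` and `cells_per_row` are hypothetical names, not part of crawl4ai, and the two sample strings are invented to mimic the problem, not copied from the test page. It also ignores colspan and nested tables.

```python
from html.parser import HTMLParser


class RowCellCounter(HTMLParser):
    """Collects the number of <td>/<th> cells in each <tr> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # cell count for every finished row
        self._current = None  # running count for the open row, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush()     # tolerate a missing </tr> before the next row
            self._current = 0
        elif tag in ("td", "th") and self._current is not None:
            self._current += 1

    def handle_endtag(self, tag):
        if tag == "tr":
            self._flush()

    def _flush(self):
        # Store the open row's count, if there is an open row.
        if self._current is not None:
            self.rows.append(self._current)
            self._current = None


def cells_per_row(html):
    """Return a list with the number of cells found in each table row."""
    parser = RowCellCounter()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a final row that was never closed
    return parser.rows


# Invented sample markup: the middle cell is empty in the raw version
# and missing entirely in the cleaned version.
raw = "<table><tr><td>Mon</td><td></td><td>18:00</td></tr></table>"
cleaned = "<table><tr><td>Mon</td><td>18:00</td></tr></table>"

print(cells_per_row(raw), cells_per_row(cleaned))  # prints: [3] [2]
```

With the reproduction script above, the same comparison would presumably be `cells_per_row(result.html)` against `cells_per_row(result.cleaned_html)`. Any row where the two counts differ is one where cleaning dropped cells.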
OS
macOS
Python version
3.12
Browser
Arc
Browser version
1.83.1
Error logs & Screenshots (if applicable)
