Date parsing using Regex on HTML #134

Closed
stripathi669 wants to merge 3 commits into codelucas:master from stripathi669:master

@stripathi669

Hi,

I have added a third layer of date parsing using regex on the HTML.
The method is fairly simple: build regexes for all combinations of DD, MM, and YYYY.

I understand that it might not be as accurate as the first two layers, but I am making this PR anyway, as you requested.
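
For readers following along, here is a minimal sketch of the general idea described above. It is not the code from this PR. The function name `find_date_in_html`, the separator set, and the choice to try year-first, then day-first, then month-first orderings are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed building blocks: numeric fields joined by '-', '/' or '.'.
SEP = r"[-/.]"
DAY = r"(?P<day>\d{1,2})"
MONTH = r"(?P<month>\d{1,2})"
YEAR = r"(?P<year>\d{4})"

# One pattern per field ordering; the priority order is an assumption.
ORDERINGS = [
    (YEAR, MONTH, DAY),
    (DAY, MONTH, YEAR),
    (MONTH, DAY, YEAR),
]
PATTERNS = [
    # Lookarounds keep the match from starting or ending inside a longer number.
    re.compile(r"(?<!\d)" + SEP.join(parts) + r"(?!\d)")
    for parts in ORDERINGS
]


def find_date_in_html(html):
    """Return the first valid date matched by any pattern, or None.

    Hypothetical helper: patterns are tried in priority order, not in
    order of position within the document.
    """
    for pattern in PATTERNS:
        for match in pattern.finditer(html):
            try:
                return datetime(
                    int(match.group("year")),
                    int(match.group("month")),
                    int(match.group("day")),
                )
            except ValueError:
                # Digits matched but do not form a real date (e.g. month 25).
                continue
    return None
```

A string such as `07/03/2016` is ambiguous between day-first and month-first readings, which is one reason this layer can be less accurate than the first two; the sketch simply resolves it by pattern priority.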

Owner

Put this on a new line please, e.g.

get_publishing_date(
  self.url,
  self.clean_doc,
  self.html
)

@codelucas
Owner

See my inline comments and add unit-tests please.

This is a good start on expanding to the third layer of date parsing, via regex on HTML.
Thanks for the PR :)

@codelucas codelucas closed this Mar 7, 2016
@yprez yprez mentioned this pull request Mar 9, 2016
ljluestc added a commit to ljluestc/newspaper that referenced this pull request Jun 22, 2025