aekrylov/infsearch

Information search crawler project

Crawls ria.ru with Scrapy.

Usage

Crawler

  1. pip install -r requirements.txt
  2. Set up a Splash instance, e.g. docker run -p 8050:8050 scrapinghub/splash
  3. scrapy crawl article_spider

Stemmer

  1. Crawl data
  2. Run stemmer.py

Inverted index search

  1. Run stemmer on some data
  2. Run search.py
  3. Enter search query
  4. The matching article titles are returned, sorted by total term count
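
The search step can be pictured with a minimal sketch of an inverted index. This is not the code in search.py: the function names `build_index` and `search`, the sample articles, and the tie-breaking by title are all assumptions made for illustration. It assumes each article has already been reduced to a list of stemmed terms.

```python
from collections import Counter, defaultdict


def build_index(articles):
    """Map each term to a Counter of {article title: occurrences}.

    `articles` is a hypothetical dict of title -> list of stemmed terms.
    """
    index = defaultdict(Counter)
    for title, terms in articles.items():
        for term in terms:
            index[term][title] += 1
    return index


def search(index, query_terms):
    """Return titles sorted by the total count of all query terms."""
    totals = Counter()
    for term in query_terms:
        # .get avoids creating empty entries for unknown terms
        totals.update(index.get(term, {}))
    # Highest total first; ties are broken alphabetically (an assumption)
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked]


# Made-up sample data, already stemmed
articles = {
    "Article A": ["economy", "growth", "economy"],
    "Article B": ["economy", "sport"],
    "Article C": ["sport", "sport", "sport"],
}
index = build_index(articles)
print(search(index, ["economy", "sport"]))
```

In the real pipeline the query would presumably pass through the same stemmer as the articles before lookup, so that query terms and indexed terms match.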

About

📖🔍 Information search, ITIS 8th semester
