
Supermarket basic scraping

The library supports scraping from the following supermarket chains: Shufersal, CoOp, Rami Levi, Osher Ad, Zol Vebegadol, Tiv Taam, Freshmarket, Mahsanei Hashook, Victory, Maayan2000, Yohananof, Stop Market, Keshet Taamim, Hazi Hinam, Dor Alon, Shefa Birkat Hashem, Shuk Hayir, King Store and Super Bareket.

Installation

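Clone the repository, create a virtual environment, activate it, and install the requirements:

git clone https://github.com/korenLazar/supermarket-scraping.git
cd supermarket-scraping
virtualenv venv
venv\Scripts\activate
pip install -r requirements.txt

The activation command above is for Windows; on Linux or macOS, use source venv/bin/activate instead.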

Dependencies

  1. Python (3.7+)
  2. virtualenv

Usage

First, to find your Shufersal store's ID, you can run the following command (assuming you live in Jerusalem):

python main.py --find_store ירושלים --chain Shufersal

To query a different supermarket chain, replace 'Shufersal' with another chain's name (the valid options are printed if the name is misspelled).

This command prints the Shufersal stores in Jerusalem together with their IDs.
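For scripted use, the store lookup can also be launched from Python. The following is a minimal sketch that uses only the --find_store and --chain flags shown above; the helper names find_store_command and run_find_store are hypothetical (not part of the library), and the sketch assumes it is run from the repository root and that the store list is written to standard output.

```python
import subprocess
import sys


def find_store_command(city, chain="Shufersal"):
    """Build the argument list for the store lookup command.

    Hypothetical helper: it only mirrors the command line from this README.
    """
    return [sys.executable, "main.py", "--find_store", city, "--chain", chain]


def run_find_store(city, chain="Shufersal"):
    """Run the lookup and return whatever the command printed.

    Assumptions: the current directory is the repository root, and the
    store list goes to standard output.
    """
    result = subprocess.run(
        find_store_command(city, chain),
        capture_output=True,  # requires Python 3.7+
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(run_find_store("ירושלים", chain="Shufersal"))
```

Passing the arguments as a list avoids shell quoting issues with the Hebrew city name. Depending on the console encoding, the captured text may need an explicit encoding argument.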

Now that we have the store's ID, we can get the store's relevant promotions, sorted by their start date, last update and length.

python main.py --promos 5 --chain Shufersal
  • This assumes that the store's ID is 5. The promos are written to both "results\Shufersal_promos_5.csv" and "results\Shufersal_promos_5.log".
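Once the CSV file exists, it can be inspected with the standard library. This is a minimal sketch, not part of the library: load_promos is a hypothetical helper, the file is assumed to be UTF-8 encoded, and no column names are assumed (they are read from the header row). The path is the one quoted above and may need adjusting if your output files are named differently.

```python
import csv
from pathlib import Path


def load_promos(csv_path):
    """Return the rows of a promos CSV as a list of dicts keyed by column name."""
    # Assumption: the file is UTF-8; "utf-8-sig" also strips an optional BOM.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    path = Path("results") / "Shufersal_promos_5.csv"
    if path.exists():
        promos = load_promos(path)
        print(f"{len(promos)} promotions loaded")
        if promos:
            # The column names come from the file itself.
            print("columns:", list(promos[0].keys()))
    else:
        print(f"{path} not found; run the --promos command first")
```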

For documentation of the other commands, run:

python main.py --h

Any file that was downloaded in the process will be located in the "raw_files" directory.
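To see what has been downloaded so far, the directory can be listed from Python. This is a small sketch under the assumption that "raw_files" sits in the current working directory; list_downloads is a hypothetical helper, not a function of the library.

```python
from pathlib import Path


def list_downloads(directory="raw_files"):
    """Return the files in `directory`, newest first.

    Returns an empty list if the directory does not exist yet.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    files = [p for p in root.iterdir() if p.is_file()]
    # Sort by modification time, most recent download first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in list_downloads():
        print(path.name, path.stat().st_size, "bytes")
```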

Good luck!
