68 Commits

Author SHA1 Message Date
6f527b12d5 prefs 2022-10-05 12:42:53 +03:00
b8ccd6dcf7 fix of download directory 2022-10-05 12:38:52 +03:00
Koren Lazar
6755ff5caf Merge pull request #3 from 1kamma/master
mistake in the requirements fixed
2022-10-05 07:51:58 +03:00
42fac846aa Merge branch 'master' of https://github.com/1kamma/supermarket-scraping 2022-10-05 03:52:43 +03:00
d047ffdcc2 added options for headless computers, changed the download path to raw_files 2022-10-05 03:37:43 +03:00
korenlazar
9b6f63a7f0 Added the chain Yeinot Bitan (also to tests).
Changed price with promos to include only regular promotions.
Added filtering of promotions including too many items.
2022-10-04 13:36:29 +03:00
korenlazar
86ff2ca7b7 Fixed small bug in valid_store_id_by_chain function 2022-10-04 12:11:44 +03:00
korenlazar
b1737839ce Fixed bug with Shufersal Scraping by changing xml files category back to normal Enum. 2022-10-04 12:09:42 +03:00
korenlazar
7b63eab7bd leftover from last commit 2022-10-04 11:42:57 +03:00
korenlazar
ceff48dbd9 Fixed the bug with cerberus_web_client.py by working with Selenium. To log in, each chain that uses it must have a username for the Selenium login. With this mechanism, a path to a gz file is returned instead of a URL.
Added the option to output a prices json file in main.py under --prices-with-promos, where the prices are updated by the latest promotions (under the 'final_price' key, where 'price' represents the price before promotions).

Fixed small bug of BinaWebCleint by checking that filename does not contain 'null'.

Changed Hierarchy of chains such that it includes the webclients.

Added the date to the output filenames to start storing the data over time.

Black formatting (according to PEP 8 guidelines).

Changed the chains_dict in main to a constant one.
2022-10-04 11:42:36 +03:00
korenLazar
b5db721a3d Merge pull request #6 from korenLazar/test-scraping
Test scraping
2021-08-18 12:26:23 +03:00
KorenLazar
90cab0a2e1 Minor changes 2021-08-18 11:32:04 +03:00
KorenLazar
87b6fbe2b0 Changed ClubID enum class to include a string field used for printing, and define ClubID.OTHER as a default value for the class to handle invalid inputs. 2021-08-18 11:30:31 +03:00
KorenLazar
322995ba15 Added TODO for ordering the argparse 2021-08-18 11:16:25 +03:00
KorenLazar
294dee8cc2 Added test for searching different files' URLs. Specifically, asserting that searching for non-full files does not yield URLs of full files. 2021-08-17 13:08:39 +03:00
KorenLazar
cffdd84086 Added specific searching for the download url of non-full promotions and prices files. Changed return value of get_download_url accordingly. 2021-08-17 13:06:42 +03:00
KorenLazar
3770352d04 Added new requirements to requirements.txt 2021-08-17 09:35:20 +03:00
KorenLazar
63fec1490c Added new requirements to requirements.txt 2021-08-17 09:18:45 +03:00
KorenLazar
c1281cb312 Added a test for scraping the promotions and exporting them to xlsx files. 2021-08-16 23:09:10 +03:00
KorenLazar
1a88ed6e01 minor changes 2021-08-16 23:08:04 +03:00
KorenLazar
9b0ab013c9 Added requirements to requirements.txt 2021-08-16 23:07:32 +03:00
KorenLazar
1a6707341d Logical fixes in promotions scraping and calculation. 2021-08-16 23:07:07 +03:00
KorenLazar
844a106c57 Added tqdm 2021-08-16 23:05:16 +03:00
KorenLazar
c793057623 Documentation and minor changes 2021-08-16 14:06:54 +03:00
KorenLazar
13991aaa40 Documentation and minor changes 2021-08-16 14:05:22 +03:00
KorenLazar
b3d410306d Removed filtering by PRODUCTS_TO_IGNORE 2021-08-16 14:04:46 +03:00
korenLazar
62089dd538 Merge pull request #5 from korenLazar/export-promotions-to-xlsx-table
Export promotions to xlsx table
2021-08-16 12:51:48 +03:00
KorenLazar
03ff6d5281 Changed create_items_dict function to include the non-full prices file in the items dictionary.
Changed log_products_prices to work with an items dictionary and a __repr__ function of the Item class.
2021-08-16 12:44:32 +03:00
KorenLazar
e09b2da4a1 removed get_all_deals function 2021-08-16 12:43:01 +03:00
KorenLazar
58bb04f1dd Added get_all_promos_tags function and included the non-full promotions file in the promotions collection. 2021-08-16 12:42:38 +03:00
KorenLazar
ebb1e912b9 Change INFO logging format 2021-08-16 12:40:06 +03:00
KorenLazar
98dcc1c33d Add price_by_measure member to Item object 2021-08-16 12:39:28 +03:00
korenLazar
8a726ff605 Merge pull request #4 from korenLazar/export-promotions-to-xlsx-table
finished implementing exporting promotion to xlsx table and automatic…
2021-06-17 10:36:20 +03:00
KorenLazar
27b45a4999 finished implementing exporting promotion to xlsx table and automatically opening the xlsx file 2021-06-01 21:00:40 +03:00
KorenLazar
ec505dba67 minor rephrasing in documentation 2021-05-18 14:34:11 +03:00
3ae8d02836 correction, by comments and suggestions of Koren 2021-04-29 17:55:21 +03:00
5caf3e495c mistake in the requirements fixed 0.1 2021-04-17 22:36:18 +03:00
e1f43772b9 now excel is working 2021-04-17 20:49:08 +03:00
korenLazar
e740b122ff Merge pull request #1 from 1kamma/master
this will be better for Windows and Unix-based systems
2021-04-17 18:34:11 +03:00
d4ba19bf41 remove_unneeded 2021-04-17 12:07:08 +03:00
2a4b6562b7 change encoding 2021-04-17 12:06:25 +03:00
KorenLazar
9f5464317d Added tests for the promotion functions for Shufersal and CoOp. Also made minor design changes in promotion.py and item.py 2021-03-08 14:13:30 +02:00
KorenLazar
c86fc7c1ab Moved to writing solely to CSV. Added some columns and drastically improved the logic behind the price-after-promotion column. 2021-02-25 20:54:44 +02:00
KorenLazar
8aa33cbcda added columns to csv: price after promotion, discount in percentage and promotion type (regular/club/credit card). 2021-02-23 08:27:00 +02:00
KorenLazar
850d3963fe added binaproject clients 2021-02-07 10:46:54 +02:00
KorenLazar
9983d07c2b replaced the member '_class_name' by the 'class.__name__' 2021-02-07 08:18:22 +02:00
KorenLazar
67bff9fa76 minor changes 2021-02-06 22:53:18 +02:00
KorenLazar
18f3fa32b9 added many chains 2021-02-06 21:42:31 +02:00
KorenLazar
5aa4cd734d changed chains' members to be 'immutable static' 2021-02-06 15:57:05 +02:00
KorenLazar
3a57edf5af added RamiLevi to the chains collection 2021-02-06 14:41:04 +02:00