supermarket-scraping/item.py
korenlazar ceff48dbd9 Fixed the bug in cerberus_web_client.py by switching to Selenium. To log in, each chain that uses this client must have a username for the Selenium login. With this mechanism, a path to a gz file is returned instead of a URL.
Added an option in main.py, --prices-with-promos, to output a prices JSON file in which the prices are updated by the latest promotions (under the 'final_price' key, where 'price' represents the price before promotions).

Fixed a small bug in BinaWebClient by checking that the filename does not contain 'null'.

Changed the hierarchy of chains so that it includes the web clients.

Added the date to the output filenames to start storing the data over time.

Black formatting (according to PEP 8 guidelines).

Changed the chains_dict in main to a constant one.
2022-10-04 11:42:36 +03:00

45 lines
1.2 KiB
Python

import json
import re

from bs4.element import Tag


class Item:
    """
    A class representing a product in some supermarket.
    """

    def __init__(
        self,
        name: str,
        price: float,
        price_by_measure: float,
        code: str,
        manufacturer: str,
    ):
        self.name: str = name
        # Price before promotions.
        self.price: float = price
        # Price after promotions; starts out equal to the base price.
        self.final_price: float = price
        self.price_by_measure: float = price_by_measure
        self.manufacturer: str = manufacturer
        self.code: str = code

    @classmethod
    def from_tag(cls, item: Tag):
        """
        This method creates an Item instance from an xml tag.
        """
        # The regexes tolerate spelling variants of tag names across chains
        # (e.g. ItemName / ItemNm, ManufacturerName / ManufactureName).
        return cls(
            name=item.find(re.compile(r"ItemN[a]?m[e]?")).text,
            price=float(item.find("ItemPrice").text),
            price_by_measure=float(item.find("UnitOfMeasurePrice").text),
            code=item.find("ItemCode").text,
            manufacturer=item.find(re.compile(r"Manufacture[r]?Name")).text,
        )

    def to_json(self):
        # Serialize the instance attributes as a JSON object.
        return json.dumps(self, default=lambda o: o.__dict__)

    def __repr__(self):
        # The Hebrew labels mean, in order: name, price, manufacturer, code.
        return f"\nשם: {self.name}\nמחיר: {self.price}\nיצרן: {self.manufacturer}\nקוד: {self.code}\n"
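
A minimal usage sketch of how `price` and `final_price` might diverge once a promotion is applied, and what `to_json` then emits. It uses only the standard library, so `Item` is re-declared here as a trimmed stand-in with the same attribute names. The helper `apply_percent_discount` and the sample product data are hypothetical and do not come from the repository. The last lines check which tag-name spellings the `from_tag` patterns accept, using `re.search` on plain strings rather than real XML.

```python
import json
import re


class Item:
    """Trimmed stand-in for the Item class above (same attribute names)."""

    def __init__(self, name, price, price_by_measure, code, manufacturer):
        self.name = name
        self.price = price
        self.final_price = price  # starts equal to the pre-promotion price
        self.price_by_measure = price_by_measure
        self.manufacturer = manufacturer
        self.code = code

    def to_json(self):
        # json.dumps cannot serialize Item directly, so it calls `default`,
        # which hands back the instance's attribute dictionary.
        return json.dumps(self, default=lambda o: o.__dict__)


def apply_percent_discount(item, percent):
    """Hypothetical helper: lower final_price and leave price untouched."""
    item.final_price = round(item.price * (1 - percent / 100), 2)


# Placeholder product data, for illustration only.
item = Item("Milk 1L", 10.0, 10.0, "7290000000001", "Example Dairy")
apply_percent_discount(item, 25)
record = json.loads(item.to_json())

# Tag-name patterns copied from from_tag, checked against spelling variants.
name_pattern = re.compile(r"ItemN[a]?m[e]?")
manufacturer_pattern = re.compile(r"Manufacture[r]?Name")
name_matches = [bool(name_pattern.search(v)) for v in ("ItemName", "ItemNm", "ItemNam")]
manufacturer_matches = [
    bool(manufacturer_pattern.search(v))
    for v in ("ManufacturerName", "ManufactureName")
]
```

After the discount, `record["price"]` still holds the base price while `record["final_price"]` holds the reduced one, which matches the split between the two keys described in the commit message.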