Open
Description
Hi,
I encountered an issue with the deutschland package while trying to fetch financial reports using the Bundesanzeiger class. When running my tests, I received a TypeError indicating that a 'NoneType' object is not subscriptable. This occurs in the __generate_result method when trying to access the captcha_wrapper div.
Error Details
TypeError: 'NoneType' object is not subscriptable
Steps to Reproduce
- Initialize the Bundesanzeiger class.
- Call the get_reports method with a valid search term.
- Observe the error in the __generate_result method.
Code Snippet
Here is the relevant part of the code where the error occurs:
def __generate_result(self, content: str):
    """iterate trough all results and try to fetch single reports"""
    result = {}
    for element in self.__find_all_entries_on_page(content):
        get_element_response = self.__get_response(element.content_url)
        if self.__is_captcha_needed(get_element_response.text):
            soup = BeautifulSoup(get_element_response.text, "html.parser")
            captcha_image_src = soup.find("div", {"class": "captcha_wrapper"}).find(
                "img"
            )["src"]
            img_response = self.__get_response(captcha_image_src)
            captcha_result = self.captcha_callback(img_response.content)
            captcha_endpoint_url = soup.find_all("form")[1]["action"]
            get_element_response = self.session.post(
                captcha_endpoint_url,
                data={"solution": captcha_result, "confirm-button": "OK"},
            )
        content_soup = BeautifulSoup(get_element_response.text, "html.parser")
        content_element = content_soup.find(
            "div", {"class": "publication_container"}
        )
        if not content_element:
            continue
        element.report = content_element.text
        element.raw_report = content_element.prettify()
        result[element.to_hash()] = element.to_dict()
    return result

Additional Information
- Python version: 3.10.13
- deutschland package version: latest
- OS: macOS
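One observation that may help narrow this down: a TypeError (rather than an AttributeError) suggests that the captcha_wrapper div itself was found, but the nested find("img") call returned None, so the ["src"] lookup failed. Below is a minimal sketch of a guard for that lookup. It assumes the parsed page behaves like a BeautifulSoup object (find on the page and on tags, get on tags); find_captcha_image_src is a hypothetical helper name and is not part of the deutschland package.

```python
from typing import Any, Optional


def find_captcha_image_src(soup: Any) -> Optional[str]:
    """Return the captcha image URL, or None if the expected markup is missing.

    `soup` is assumed to be a BeautifulSoup-like object. This helper name is
    hypothetical; it only illustrates checking each lookup step for None.
    """
    wrapper = soup.find("div", {"class": "captcha_wrapper"})
    if wrapper is None:
        # No captcha wrapper on the page at all.
        return None

    image = wrapper.find("img")
    if image is None:
        # Wrapper exists but holds no <img>; indexing None here would raise
        # "TypeError: 'NoneType' object is not subscriptable".
        return None

    # .get returns None instead of raising KeyError when "src" is absent.
    return image.get("src")
```

With something like this, the caller could skip the element (or log the raw HTML) when the helper returns None instead of crashing, which would also make it easier to see what the page actually contained.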
Logs
2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchen2?4-1.-search~table~panel-rows-2-search~table~row~panel-publication~link HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?7 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?7--captcha~panel-captcha_form-captcha_image&antiCache=1717244241383 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "POST /pub/de/suchergebnis?7-1.-captcha~panel-captcha_form HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?9 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchen2?4-1.-search~table~panel-rows-3-search~table~row~panel-publication~link HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?10 HTTP/1.1" 200 None (connectionpool.py:549)
Please let me know if further information is needed.
Thank you!
Christian