TypeError: 'NoneType' object is not subscriptable in __generate_result method #145

@christian-unlockai

Description

Hi,

I encountered an issue with the deutschland package while trying to fetch financial reports using the Bundesanzeiger class. When running my tests, I received a TypeError indicating that a 'NoneType' object is not subscriptable. This occurs in the __generate_result method, in the expression that reads the captcha image's src from the captcha_wrapper div.

Error Details

TypeError: 'NoneType' object is not subscriptable

Steps to Reproduce

  1. Initialize the Bundesanzeiger class.
  2. Call the get_reports method with a valid search term.
  3. Observe the error in the __generate_result method.

Code Snippet

Here is the relevant part of the code where the error occurs:

def __generate_result(self, content: str):
    """iterate trough all results and try to fetch single reports"""
    result = {}
    for element in self.__find_all_entries_on_page(content):
        get_element_response = self.__get_response(element.content_url)

        if self.__is_captcha_needed(get_element_response.text):
            soup = BeautifulSoup(get_element_response.text, "html.parser")
            captcha_image_src = soup.find("div", {"class": "captcha_wrapper"}).find(
                "img"
            )["src"]
            img_response = self.__get_response(captcha_image_src)
            captcha_result = self.captcha_callback(img_response.content)
            captcha_endpoint_url = soup.find_all("form")[1]["action"]
            get_element_response = self.session.post(
                captcha_endpoint_url,
                data={"solution": captcha_result, "confirm-button": "OK"},
            )

        content_soup = BeautifulSoup(get_element_response.text, "html.parser")
        content_element = content_soup.find(
            "div", {"class": "publication_container"}
        )

        if not content_element:
            continue

        element.report = content_element.text
        element.raw_report = content_element.prettify()

        result[element.to_hash()] = element.to_dict()

    return result
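
For illustration, here is a minimal sketch of how the chained lookup could be guarded. BeautifulSoup's find() returns None when nothing matches, and subscripting None with ["src"] raises exactly this TypeError, so the failure is most likely the inner find("img") returning None (a missing captcha_wrapper div would raise an AttributeError instead). The helper name find_captcha_image_src is hypothetical and not part of the deutschland package; soup is assumed to be the BeautifulSoup object built in the snippet above.

```python
from typing import Any, Optional


def find_captcha_image_src(soup: Any) -> Optional[str]:
    """Return the captcha image URL, or None if the expected markup is missing.

    Hypothetical helper, not part of the deutschland package. `soup` is
    assumed to be a BeautifulSoup object (anything with a compatible
    `find` method works).
    """
    # find() returns None when no matching <div> exists.
    wrapper = soup.find("div", {"class": "captcha_wrapper"})
    if wrapper is None:
        return None

    # The same applies to the nested <img>; this is the lookup whose
    # None result makes `[...]["src"]` raise "not subscriptable".
    img = wrapper.find("img")
    if img is None:
        return None

    # get() returns None instead of raising if the attribute is absent.
    return img.get("src")
```

Inside __generate_result, a None return value could then be used to skip the entry (for example with continue) or to raise a more descriptive error, rather than failing on the subscript.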

Additional Information

  • Python version: 3.10.13
  • deutschland package version: latest
  • OS: macOS

Logs

2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchen2?4-1.-search~table~panel-rows-2-search~table~row~panel-publication~link HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?7 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:21 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?7--captcha~panel-captcha_form-captcha_image&antiCache=1717244241383 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "POST /pub/de/suchergebnis?7-1.-captcha~panel-captcha_form HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?9 HTTP/1.1" 200 None (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchen2?4-1.-search~table~panel-rows-3-search~table~row~panel-publication~link HTTP/1.1" 302 0 (connectionpool.py:549)
2024-06-01 14:17:23 [DEBUG] https://www.bundesanzeiger.de:443 "GET /pub/de/suchergebnis?10 HTTP/1.1" 200 None (connectionpool.py:549)

Please let me know if further information is needed.

Thank you!

Christian
