If the database needs to be refreshed at some later time, we can fetch new questions and skip those that are already in the database. However, questions that are already stored might have received new answers. How do we fetch only those questions that got new answers?
Logic:
- For each question ID on a page, retrieve the number of answers.
- For each question ID, retrieve the number of answers submitted by someone other than the OP.
- If these numbers are different and the question ID is already in the DB, do the following:
  - Remove the row with the question ID from the questions table.
  - Remove the rows with the question ID from the answers table.
  - Remove the rows with the question ID from the question keyword table.
  - Then process the question again.
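
The logic above could be sketched roughly as follows. This is only an illustration, not the project's actual code: it assumes an SQLite database, and the table names (`questions`, `answers`, `question_keywords`), the `question_id` column, and the `process` callback are all hypothetical placeholders for whatever the real schema and download routine are called.

```python
import sqlite3

# Hypothetical table names; all are assumed to share a question_id column.
TABLES = ("questions", "answers", "question_keywords")


def is_in_db(conn, question_id):
    """Return True if the question is already stored."""
    row = conn.execute(
        "SELECT 1 FROM questions WHERE question_id = ?", (question_id,)
    ).fetchone()
    return row is not None


def purge_question(conn, question_id):
    """Delete every row that belongs to the question, in one transaction."""
    with conn:  # commits on success, rolls back on error
        for table in TABLES:
            conn.execute(
                f"DELETE FROM {table} WHERE question_id = ?", (question_id,)
            )


def refresh_question(conn, question_id, total_answers, non_op_answers, process):
    """Apply the rule from the list: if the two counts differ and the
    question is already stored, drop its rows and process it again.

    `process` is a placeholder for the routine that downloads and stores
    a question. Returns True if the question was re-processed.
    """
    if total_answers != non_op_answers and is_in_db(conn, question_id):
        purge_question(conn, question_id)
        process(question_id)
        return True
    return False
```

Deleting the old rows and processing the question from scratch keeps the refresh path identical to the first-time download path, at the cost of re-fetching answers that were already stored.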
Tasks: