Prepare an Excel file that lists the paper IDs and their BibTeX entries (see [bibtex_mapping_of_ids.xlsx](./interconnections_datasets/bibtex_mapping_of_ids.xlsx)); it is needed to map the metadata extracted from the references to the papers in your corpus. Then start the GROBID server via Docker (typically on port 8070). For citation matching, the notebook uses a confidence-based approach: high-confidence matches are accepted automatically, while uncertain ones are flagged for manual review in an Excel file. After reviewing the uncertain matches, run the final cells to create the completed matrices, which are saved as CSV files.
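
To illustrate the idea of confidence-based matching, here is a minimal sketch and not the notebook's actual code. It assumes matching is done on titles with a simple string-similarity score from Python's standard `difflib`; the names `classify_reference`, `ACCEPT_THRESHOLD`, and `REVIEW_THRESHOLD`, as well as the threshold values, are hypothetical and chosen only for this example.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds for illustration; the notebook's values may differ.
ACCEPT_THRESHOLD = 0.90  # at or above: accept the match automatically
REVIEW_THRESHOLD = 0.60  # between the two thresholds: flag for manual review


def normalize(title):
    """Lowercase the title and replace punctuation with spaces."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a similarity score in [0, 1] for two titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify_reference(ref_title, corpus_titles):
    """Find the best-matching corpus paper for one extracted reference.

    corpus_titles maps paper ID -> title (e.g. taken from the BibTeX mapping).
    Returns a tuple (status, paper_id, score), where status is one of
    "accepted", "review", or "rejected".
    """
    best_id, best_score = None, 0.0
    for paper_id, title in corpus_titles.items():
        score = similarity(ref_title, title)
        if score > best_score:
            best_id, best_score = paper_id, score

    if best_score >= ACCEPT_THRESHOLD:
        return "accepted", best_id, best_score
    if best_score >= REVIEW_THRESHOLD:
        return "review", best_id, best_score
    return "rejected", None, best_score
```

In a sketch like this, the rows with status `"review"` would be the ones written to the Excel file for manual checking, while `"accepted"` rows go straight into the matrices. Raising `ACCEPT_THRESHOLD` trades fewer false matches for more manual review.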