rdsilva01/archv

archv

archv is a Python package for retrieving and processing news articles and applying Natural Language Processing (NLP) to them. It includes modules for extracting news information, generating embeddings, and recommending news articles through a Redis VSS (vector similarity search) backend.
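
To give a rough idea of what an embedding-based recommender does, here is a minimal sketch of vector similarity search in plain Python. It does not use archv or Redis. The function names (`cosine_similarity`, `recommend`), the article ids, and the three-dimensional toy vectors are all hypothetical and exist only for illustration. A vector search backend performs the same kind of nearest-neighbour lookup over stored embeddings, at scale and with an index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_id, embeddings, k=2):
    """Return the k articles most similar to query_id as (article_id, score) pairs."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id  # never recommend the article itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from an embedding model.
toy_embeddings = {
    "election-results": [0.9, 0.1, 0.0],
    "parliament-vote": [0.8, 0.2, 0.1],
    "football-final": [0.0, 0.1, 0.9],
}

top = recommend("election-results", toy_embeddings, k=1)
print(top[0][0])  # prints "parliament-vote"
```

The brute-force scan above compares the query against every stored vector, which is fine for a handful of articles but is the part a dedicated vector index is meant to replace.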

Google Colab

Try out archv on Google Colab.

Installation

  1. Install the package via pip:
    pip install git+https://github.com/rdsilva01/archv.git

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes. Make sure to write tests and update documentation where applicable.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Citation

If you use archv in your research, please cite:

@inproceedings{silva2025rebuilding,
  author    = {Rodrigo Silva and Ricardo Campos},
  title     = {Rebuilding the Past: Reconstructing Portuguese News Outlets with Web Archives},
  booktitle = {Advances in Information Retrieval (ECIR)},
  year      = {2025},
  doi       = {10.1007/978-3-031-88720-8_15}
}
