Error creating Vocabulary from pretrained with erwanf/gpt2-mini #223

@RobinPicard

Describe the issue as clearly as possible:

I get an error when trying to create a `Vocabulary` from the pretrained model "erwanf/gpt2-mini".

Steps/code to reproduce the bug:

from outlines_core import Vocabulary

vocabulary = Vocabulary.from_pretrained("erwanf/gpt2-mini")

Expected result:

A `Vocabulary` instance

Error message:

Traceback (most recent call last):
  File "/Users/robin/outlines/.idea/test.py", line 18, in <module>
    vocabulary = Vocabulary.from_pretrained("erwanf/gpt2-mini")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: request error: https://huggingface.co/erwanf/gpt2-mini/resolve/main/tokenizer.json: status code 404
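
The 404 above suggests that the repository does not serve a `tokenizer.json` at that path. Below is a minimal sketch for checking this independently of `outlines_core`, using only the standard library. The URL pattern is inferred from the error message, not taken from the library's source, and `tokenizer_json_url` and `has_tokenizer_json` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def tokenizer_json_url(model: str, revision: str = "main") -> str:
    """Build the URL seen in the traceback (pattern inferred from the error message)."""
    return f"https://huggingface.co/{model}/resolve/{revision}/tokenizer.json"


def has_tokenizer_json(model: str, revision: str = "main", timeout: float = 10.0) -> bool:
    """Return True if the Hub serves tokenizer.json for this repo, False on a 404."""
    # HEAD avoids downloading the file body; redirects are followed by urllib.
    request = urllib.request.Request(tokenizer_json_url(model, revision), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        # Other statuses (e.g. 401 for gated repos) are a different problem; surface them.
        raise
```

Calling `has_tokenizer_json("erwanf/gpt2-mini")` needs network access. A `False` result would be consistent with the traceback, since the request fails before any vocabulary is built.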

Outlines/Python version information:

outlines_core 0.2.11
Python 3.11.12 (main, Apr 8 2025, 14:15:29) [Clang 16.0.0 (clang-1600.0.26.6)]

Details

```
absl-py==2.2.2
accelerate==1.6.0
aiohappyeyeballs==2.6.1
aiohttp==3.11.16
aiosignal==1.3.2
airportsdata==20250224
annotated-types==0.7.0
anthropic==0.49.0
anyio==4.9.0
astor==0.8.1
astunparse==1.6.3
attrs==25.3.0
babel==2.17.0
backoff==2.2.1
backrefs==5.9
beartype==0.15.0
blake3==1.0.4
build==1.2.2.post1
cachetools==5.5.2
cairocffi==1.7.1
CairoSVG==2.8.2
certifi==2025.1.31
cffi==1.17.1
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.4.1
chex==0.1.89
click==8.1.8
cloudpickle==3.1.1
colorama==0.4.6
compressed-tensors==0.9.2
coverage==7.8.0
cryptography==45.0.5
cssselect2==0.8.0
datasets==3.5.0
defusedxml==0.7.1
Deprecated==1.2.18
depyf==0.18.0
diff_cover==9.2.4
dill==0.3.8
diskcache==5.6.3
distlib==0.3.9
distro==1.9.0
dnspython==2.7.0
dottxt==0.1.5
einops==0.8.1
email_validator==2.2.0
etils==1.12.2
fastapi==0.115.12
fastapi-cli==0.0.7
filelock==3.18.0
flatbuffers==25.2.10
flax==0.10.5
frozenlist==1.5.0
fsspec==2024.12.0
gast==0.6.0
genson==1.3.0
gguf==0.10.0
ghp-import==2.1.0
gitdb==4.0.12
GitPython==3.1.44
google-ai-generativelanguage==0.6.15
google-api-core==2.24.2
google-api-python-client==2.167.0
google-auth==2.39.0
google-auth-httplib2==0.2.0
google-generativeai==0.8.4
google-pasta==0.2.0
googleapis-common-protos==1.70.0
griffe==1.7.3
grpcio==1.71.0
grpcio-status==1.71.0
h11==0.14.0
h5py==3.13.0
hf-xet==1.0.3
httpcore==1.0.8
httplib2==0.22.0
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.30.2
humanize==4.12.2
identify==2.6.9
idna==3.10
importlib_metadata==8.6.1
importlib_resources==6.5.2
iniconfig==2.1.0
interegular==0.3.3
iso3166==2.1.1
jax==0.5.3
jaxlib==0.5.3
Jinja2==3.1.6
jiter==0.9.0
jsonpath-ng==1.7.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
keras==3.9.2
lark==1.2.2
libclang==18.1.1
llama_cpp_python==0.3.8
llguidance==1.0.1
llvmlite==0.44.0
lm-format-enforcer==0.10.11
Markdown==3.8
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
mergedeep==1.3.4
mistral_common==1.5.4
mkdocs==1.6.1
mkdocs-autorefs==1.4.2
mkdocs-gen-files==0.5.0
mkdocs-get-deps==0.2.0
mkdocs-git-committers-plugin==0.2.3
mkdocs-git-revision-date-localized-plugin==1.4.7
mkdocs-literate-nav==0.6.2
mkdocs-material==9.6.15
mkdocs-material-extensions==1.3.1
mkdocs-redirects==1.2.2
mkdocs-section-index==0.3.10
mkdocstrings==0.29.1
mkdocstrings-python==1.16.12
ml_dtypes==0.5.1
mlx==0.24.2
mlx-lm==0.22.5
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.19.0
multidict==6.4.3
multiprocess==0.70.16
namex==0.0.8
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.11.1.4
nodeenv==1.9.1
numba==0.61.2
numpy==2.1.3
ollama==0.4.7
openai==1.74.0
opencv-python-headless==4.11.0.86
opt_einsum==3.4.0
optax==0.2.4
optree==0.15.0
orbax-checkpoint==0.11.12
-e git+ssh://[email protected]/dottxt-ai/outlines.git@48e10d9b7fdebd0b5188fe4fbae9ab61b1945824#egg=outlines
outlines_core==0.2.11
packaging==24.2
paginate==0.5.7
pandas==2.2.3
partial-json-parser==0.2.1.1.post5
pathspec==0.12.1
pillow==10.4.0
platformdirs==4.3.7
pluggy==1.5.0
ply==3.11
pre_commit==4.2.0
prometheus-fastapi-instrumentator==7.1.0
prometheus_client==0.21.1
propcache==0.3.1
proto-plus==1.26.1
protobuf==5.29.4
psutil==7.0.0
py-cpuinfo==9.0.0
pyarrow==19.0.1
pyasn1==0.6.1
pyasn1_modules==0.4.2
pycountry==24.6.1
pycparser==2.22
pydantic==2.11.3
pydantic-settings==2.8.1
pydantic_core==2.33.1
PyGithub==2.6.1
Pygments==2.19.1
PyJWT==2.10.1
pymdown-extensions==10.16
PyNaCl==1.5.0
pyparsing==3.2.3
pyproject_hooks==1.2.0
pytest==8.3.5
pytest-benchmark==5.1.0
pytest-cov==6.1.1
pytest-mock==3.14.0
python-dateutil==2.9.0.post0
python-dotenv==1.1.0
python-json-logger==3.3.0
python-multipart==0.0.20
pytz==2025.2
PyYAML==6.0.2
pyyaml_env_tag==1.1
pyzmq==26.4.0
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
responses==0.25.7
rich==14.0.0
rich-toolkit==0.14.1
rpds-py==0.24.0
rsa==4.9
safetensors==0.5.3
scipy==1.15.2
sentencepiece==0.2.0
shellingham==1.5.4
simplejson==3.20.1
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
starlette==0.46.2
sympy==1.13.1
tensorboard==2.19.0
tensorboard-data-server==0.7.2
tensorflow==2.19.0
tensorflow-io-gcs-filesystem==0.37.1
tensorstore==0.1.73
termcolor==3.0.1
tf_keras==2.19.0
tiktoken==0.9.0
tinycss2==1.4.0
tokenizers==0.21.1
toolz==1.0.0
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
tqdm==4.67.1
transformers==4.51.3
treescope==0.1.9
typer==0.15.2
typing-inspection==0.4.0
typing_extensions==4.13.2
tzdata==2025.2
uritemplate==4.1.1
urllib3==2.4.0
uvicorn==0.34.1
uvloop==0.21.0
virtualenv==20.30.0
vllm==0.8.3
watchdog==6.0.0
watchfiles==1.0.5
webencodings==0.5.1
websockets==15.0.1
Werkzeug==3.1.3
wrapt==1.17.2
xgrammar==0.1.21
xxhash==3.5.0
yarl==1.19.0
zipp==3.21.0
```

Context for the issue:

No response

Labels: bug (Something isn't working)