I'm receiving an error: Non-consecutive added token '[PAD]' found

by Koat - opened Mar 23, 2024

tokenizer_config.json: 100%|███████████████████████████████████████████████████████████| 1.54k/1.54k [00:00<00:00, 8.04MB/s]
vocab.txt: 100%|█████████████████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 2.78MB/s]
added_tokens.json: 100%|██████████████████████████████████████████████████████████████████| 82.0/82.0 [00:00<00:00, 443kB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████████████████████| 228/228 [00:00<00:00, 1.07MB/s]
Traceback (most recent call last):
  File "/usr/local/bin/eland_import_hub_model", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/eland/cli/eland_import_hub_model.py", line 254, in main
    tm = TransformerModel(
  File "/usr/local/lib/python3.10/site-packages/eland/ml/pytorch/transformers.py", line 643, in __init__
    self._tokenizer = transformers.AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 736, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2073, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '[PAD]' found. Should have index 30522 but has index 0 in saved vocabulary.
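
For anyone trying to narrow this down, here is a rough diagnostic sketch, not the actual check inside transformers. It assumes the error means what the message says: each entry in added_tokens.json is expected to take the next free index after the base vocabulary (30522 here), and '[PAD]' was saved with index 0 instead. The function names `find_nonconsecutive` and `check_model_dir` are made up for this example, and `check_model_dir` assumes you have a local copy of the model directory containing vocab.txt and added_tokens.json.

```python
import json
from pathlib import Path


def find_nonconsecutive(added_tokens: dict[str, int], base_vocab_size: int) -> list[tuple[str, int, int]]:
    """Return (token, saved_index, expected_index) for each added token whose
    saved index does not continue consecutively from the base vocabulary.

    This is an approximation of the rule implied by the error message,
    not a copy of the library's own logic.
    """
    problems = []
    # Walk the added tokens in order of their saved index.
    ordered = sorted(added_tokens.items(), key=lambda kv: kv[1])
    for offset, (token, index) in enumerate(ordered):
        expected = base_vocab_size + offset
        if index != expected:
            problems.append((token, index, expected))
    return problems


def check_model_dir(model_dir: str) -> list[tuple[str, int, int]]:
    """Hypothetical helper: read vocab.txt and added_tokens.json from a
    local model directory and report suspicious added tokens."""
    root = Path(model_dir)
    # Assumption: one token per line in vocab.txt, so line count = vocab size.
    vocab_size = len((root / "vocab.txt").read_text(encoding="utf-8").splitlines())
    added = json.loads((root / "added_tokens.json").read_text(encoding="utf-8"))
    return find_nonconsecutive(added, vocab_size)


if __name__ == "__main__":
    # Mirrors the values in the error message above.
    print(find_nonconsecutive({"[PAD]": 0}, 30522))
```

If this reports tokens such as '[PAD]' with small indices, the added_tokens.json file is probably re-listing tokens that already exist in vocab.txt, which would be consistent with the ValueError above.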
