I'm receiving the error "Non-consecutive added token '[PAD]' found" when importing this model with eland_import_hub_model:
tokenizer_config.json: 100%|███████████████████████████████████████████████████████████| 1.54k/1.54k [00:00<00:00, 8.04MB/s]
vocab.txt: 100%|█████████████████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 2.78MB/s]
added_tokens.json: 100%|██████████████████████████████████████████████████████████████████| 82.0/82.0 [00:00<00:00, 443kB/s]
special_tokens_map.json: 100%|█████████████████████████████████████████████████████████████| 228/228 [00:00<00:00, 1.07MB/s]
Traceback (most recent call last):
  File "/usr/local/bin/eland_import_hub_model", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/eland/cli/eland_import_hub_model.py", line 254, in main
    tm = TransformerModel(
  File "/usr/local/lib/python3.10/site-packages/eland/ml/pytorch/transformers.py", line 643, in __init__
    self._tokenizer = transformers.AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 736, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2073, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '[PAD]' found. Should have index 30522 but has index 0 in saved vocabulary.
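
To make the message easier to read: it suggests the loader expects each entry in added_tokens.json to take the next free index after the base vocabulary (30522 here), while this repo's file maps '[PAD]' to index 0, which is already a slot in vocab.txt. Below is a minimal sketch of that kind of check in plain Python. It is not the actual transformers code; the function name `find_nonconsecutive_added_tokens` is hypothetical, and the sample `added_tokens` mapping is an assumption based on the usual BERT special-token indices, not the verified contents of this model's file.

```python
def find_nonconsecutive_added_tokens(vocab_size, added_tokens):
    """Return (token, expected_index, actual_index) for every added token
    whose index is not the next free slot after the base vocabulary."""
    problems = []
    next_index = vocab_size
    # Walk the added tokens in index order, as a loader would when appending them.
    for token, index in sorted(added_tokens.items(), key=lambda kv: kv[1]):
        if index != next_index:
            problems.append((token, next_index, index))
        else:
            next_index += 1
    return problems


# Base vocabulary size taken from the error message above.
vocab_size = 30522

# ASSUMPTION: illustrative contents for added_tokens.json, using the
# standard BERT special-token indices. Check the real file before relying on it.
added_tokens = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102, "[MASK]": 103}

problems = find_nonconsecutive_added_tokens(vocab_size, added_tokens)
for token, expected, actual in problems:
    print(f"{token!r}: expected index {expected}, found {actual}")
```

If the real added_tokens.json only repeats tokens that already exist in vocab.txt, the entries are redundant, and loading from a local copy of the repo without that file may be worth trying. I have not confirmed that this resolves the import with eland.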