Models
AiBrow supports different models depending on the local AI runtime being used. With Chrome AI, only the built-in Gemini Nano is available, whereas the WebGPU and Extension runtimes offer a number of open-weight models.
This page currently describes the models available for the llama.cpp web extension runtime.
The following models are available pre-quantized to q4-k-m (embedding models to q8_0). Use the model's id in the create functions to select that model. AiBrow presents a permission popup the first time a model is used on each web domain, and the model is downloaded automatically if it is not already present on the user's machine. Models are only downloaded once, since they do not change; a new model version receives a new id.
SmolLM2 1.7B Instruct: smollm2-1-7b-instruct-q4-k-m
SmolLM2 360M Instruct: smollm2-360m-instruct-q4-k-m
Gemma 2 2B Instruct: gemma-2-2b-instruct-q4-k-m
Gemma 2B Instruct: gemma-2b-instruct-q4-k-m
Llama 3.2 3B Instruct: llama-3-2-3b-instruct-q4-k-m
Llama 3.2 1B Instruct: llama-3-2-1b-instruct-q4-k-m
Qwen2.5 1.5B Instruct: qwen2-5-1-5b-instruct-q4-k-m
Qwen2.5 Coder 1.5B Instruct: qwen2-5-coder-1-5b-instruct-q4-k-m
Phi 3.5 Mini Instruct: phi-3-5-mini-instruct-q4-k-m
Granite 3.0 2B Instruct: granite-3-0-2b-instruct-q4-k-m
NuExtract v1.5: nuextract-v1-5-q4-k-m
Nomic Embed Text v1.5 q8_0: nomic-embed-text-v1-5-q8-0
all-MiniLM-L6-v2: all-minilm-l6-v2-q8-0
The current default models are SmolLM2 1.7B Instruct for language and translation, with Nomic Embed Text for embeddings.
When using the APIs, it is best to specify the model id explicitly in each create call, so behavior stays consistent if these defaults change in the future.
You can use any model that is openly available on Hugging Face by providing its URL; for example, "" would specify the quantized GGUF build of the Gemma 2 JPN model.
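A sketch of the URL form, assuming the same `model` option accepts a Hugging Face URL in place of an id. The placeholder below is illustrative only; substitute the actual URL of the GGUF model you want to load.

```javascript
// Hypothetical sketch: the URL below is a placeholder, and the option
// name `model` is an assumption; consult the AiBrow docs for specifics.
async function createFromHuggingFace() {
  const session = await window.aibrow.languageModel.create({
    model: 'https://huggingface.co/<repo-owner>/<repo-name>', // placeholder
  });
  return session;
}
```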