Models

List of available Models

AiBrow supports different models depending on the local AI runtime in use. With Chrome AI, only the built-in Gemini Nano is available, whereas the WebGPU and Extension runtimes offer a number of open-weight models.

This page currently describes the models available for the llama.cpp web extension runtime.

Pre-quantized models

The following models are available pre-quantized to q4-k-m. Use the model's id when calling a create function to select it. AiBrow presents a permission popup the first time a model is used on each web domain, and downloads the model automatically if it is not already present on the user's machine. Models are only downloaded once, since a given id never changes; a new model version is published under a new id.

Language Models

Name                           id
SmolLM2 1.7B Instruct          smollm2-1-7b-instruct-q4-k-m
SmolLM2 360M Instruct          smollm2-360m-instruct-q4-k-m
Gemma 2 2B Instruct            gemma-2-2b-instruct-q4-k-m
Gemma 2B Instruct              gemma-2b-instruct-q4-k-m
Llama 3.2 3B Instruct          llama-3-2-3b-instruct-q4-k-m
Llama 3.2 1B Instruct          llama-3-2-1b-instruct-q4-k-m
Qwen2.5 1.5B Instruct          qwen2-5-1-5b-instruct-q4-k-m
Qwen2.5 Coder 1.5B Instruct    qwen2-5-coder-1-5b-instruct-q4-k-m
Phi 3.5 Mini Instruct          phi-3-5-mini-instruct-q4-k-m
Granite 3.0 2B Instruct        granite-3-0-2b-instruct-q4-k-m
NuExtract v1.5                 nuextract-v1-5-q4-k-m
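As a sketch of how one of the ids above might be used, assuming the AiBrow extension exposes a Prompt-style API under a `window.aibrow` namespace (the namespace, `LanguageModel.create` call, and `model` option name are assumptions here; check the AiBrow API reference for the exact shape):

```javascript
// Hypothetical sketch: create a language-model session pinned to a
// specific pre-quantized model via its id. Requires the AiBrow
// extension to be installed in the browser.
const session = await window.aibrow.LanguageModel.create({
  model: 'llama-3-2-1b-instruct-q4-k-m', // id from the table above
});

// First use on this domain triggers the permission popup, and the
// model is downloaded if it is not already on the user's machine.
const reply = await session.prompt('Write a one-line haiku about autumn.');
console.log(reply);
```

Pinning the id this way also insulates the page from future changes to the default model.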

Embedding Models

Name                          id
Nomic Embed Text v1.5 q8_0    nomic-embed-text-v1-5-q8-0
all-MiniLM-L6-v2              all-minilm-l6-v2-q8-0
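An embedding session could be created the same way, again as a sketch: the `window.aibrow.Embedding` namespace and the `get` method name are assumptions modelled on the language-model example, not confirmed API:

```javascript
// Hypothetical sketch: create an embedding session with one of the
// pre-quantized embedding models above. Requires the AiBrow extension.
const embedder = await window.aibrow.Embedding.create({
  model: 'nomic-embed-text-v1-5-q8-0', // id from the table above
});

// Embed some strings; the result would be one vector per input.
// The method name `get` is an assumption.
const vectors = await embedder.get(['hello world', 'goodbye world']);
console.log(vectors.length); // one vector per input string
</imports>
```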

Defaults

The current default models are SmolLM2 1.7B Instruct for language and translation, and Nomic Embed Text v1.5 for embeddings.

When using the APIs, it is best to specify the model id explicitly in each create call, so that behaviour stays consistent if these defaults change in the future.

Hugging Face models

You can use any model that is openly available on Hugging Face by passing its URL in place of a model id. For example, "https://huggingface.co/bartowski/gemma-2-2b-jpn-it-GGUF/resolve/main/gemma-2-2b-jpn-it-Q4_K_M.gguf" specifies the Q4_K_M quantized GGUF build of the Gemma 2 2B JPN model.
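A Hugging Face URL would simply replace the model id in the create call. As before, this is a hedged sketch: the `window.aibrow.LanguageModel.create` namespace and `model` option name are assumptions:

```javascript
// Hypothetical sketch: load an arbitrary GGUF model from Hugging Face
// by URL instead of a pre-quantized model id. Requires the AiBrow
// extension; the model is downloaded on first use.
const session = await window.aibrow.LanguageModel.create({
  model:
    'https://huggingface.co/bartowski/gemma-2-2b-jpn-it-GGUF/resolve/main/gemma-2-2b-jpn-it-Q4_K_M.gguf',
});

const reply = await session.prompt('こんにちは。自己紹介してください。');
console.log(reply);
```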
