Download and run Hugging Face AI models in Ollama
This article is a step-by-step guide to downloading an AI model from a Hugging Face repository and running it with Ollama.
Ollama can create a model locally from either the GGUF format or the SafeTensors format, and this article covers both.
Creating a model from a GGUF file is highly reliable, because the GGUF file packages the model weights together with a chat template suitable for the AI model.
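As a quick sanity check after downloading, a GGUF container can be recognized by the 4-byte ASCII magic "GGUF" at the start of the file. A minimal sketch, using a stand-in file for illustration (point `head` at the real download instead):

```shell
# Stand-in file for illustration only; a real GGUF model starts with
# the same 4-byte ASCII magic "GGUF" followed by the binary header.
printf 'GGUF\003\000\000\000rest-of-header' > sample.gguf

# Read the first four bytes and compare against the expected magic
magic=$(head -c 4 sample.gguf)
if [ "$magic" = "GGUF" ]; then
  echo "looks like a GGUF file"
else
  echo "not a GGUF file"
fi
```

This catches truncated or HTML-error-page downloads before Ollama fails on them.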
Running a model using GGUF:
Example repository with a GGUF format model: https://huggingface.co/sergbese/gemma-3-isv-translator-v5-gguf-bf16
Create a download directory:
mkdir gemma-3-isv-translator-v5-gguf-bf16
Download the file into the directory:
cd gemma-3-isv-translator-v5-gguf-bf16
wget "https://huggingface.co/sergbese/gemma-3-isv-translator-v5-gguf-bf16/resolve/main/gemma-3-finetune-2.BF16.gguf?download=true" -O ./gemma-3-finetune-2.BF16.gguf
Create an Ollama model:
echo 'FROM ./gemma-3-finetune-2.BF16.gguf' > Modelfile
ollama create gemma-3-isv-translator-v5-gguf-bf16:latest
Run and verify the model:
curl http://localhost:11434/api/chat -d '{ "model": "gemma-3-isv-translator-v5-gguf-bf16:latest", "stream": false, "messages": [ { "role": "user", "content": "Translate \"How old are you?\" from English to Tamil" } ] }' | jq .
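The curl command above prints the whole response document. The reply text itself is nested under `.message.content`, so a jq filter can pull out just the translation. A minimal sketch against a sample response (the content field here is a placeholder; the real reply would be the model's translation):

```shell
# Sample of the JSON shape returned by /api/chat with "stream": false
# (content abridged for illustration)
response='{"model":"gemma-3-isv-translator-v5-gguf-bf16:latest","message":{"role":"assistant","content":"sample translation"},"done":true}'

# -r prints the raw string instead of a quoted JSON value
echo "$response" | jq -r '.message.content'
```

The same filter can be appended to the curl pipeline in place of plain `jq .`.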
Running a model using SafeTensors:
To create a model from the SafeTensors format, we need to download (clone) the entire Hugging Face repository.
The repository contains large files, some larger than 1 GB, so we need tools like Git and Git LFS to clone it.
Example repository containing SafeTensors models:
https://huggingface.co/sergbese/gemma-3-isv-translator-v5
Install Git:
https://git-scm.com/
Install Git LFS:
https://git-lfs.com/
Configure Git with LFS:
git lfs install
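Before cloning, it is worth confirming that both tools are actually on the PATH; a minimal check:

```shell
# Report whether git and git-lfs are installed and reachable
for tool in git git-lfs; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool found"
  else
    echo "$tool missing"
  fi
done
```

If `git-lfs` is missing, the clone will still succeed but the large model files will be small text pointer files instead of the real weights.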
Clone the Hugging Face repository:
git clone https://huggingface.co/sergbese/gemma-3-isv-translator-v5
The clone can take a long time to complete, depending on internet speed and repository size. Git doesn't show the progress of the LFS downloads, so the following command continuously prints the directory size to indicate progress during the download (press Ctrl-C to stop it):
cd gemma-3-isv-translator-v5 && while :; do du -sh; sleep 2; done
Create an Ollama model from the cloned repository:
cd gemma-3-isv-translator-v5
echo 'FROM .' > Modelfile
ollama create gemma-3-isv-translator-v5:latest
Run and verify the Ollama Model:
curl http://localhost:11434/api/chat -d '{ "model": "gemma-3-isv-translator-v5:latest", "stream": false, "messages": [ { "role": "user", "content": "Translate \"How old are you?\" from English to Tamil" } ] }' | jq .
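The examples above set "stream": false to receive one JSON document. With "stream": true (Ollama's default), /api/chat instead returns one JSON object per line, each carrying a fragment of the reply in .message.content. The fragments can be reassembled with jq; a minimal sketch against a sample two-chunk stream (a real stream has many more chunks):

```shell
# Sample of the newline-delimited JSON shape returned when "stream" is true
stream='{"message":{"role":"assistant","content":"Hello "},"done":false}
{"message":{"role":"assistant","content":"world"},"done":true}'

# -j joins the raw fragments without inserting newlines between them
echo "$stream" | jq -rj '.message.content'
echo
```

The same filter works when piped directly from a streaming curl request.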
Unlike the GGUF format, the SafeTensors format doesn't contain a built-in template for the model. Hence it is a good idea to write the Modelfile with a template based on the documentation of the base model.
Reference for the Ollama Modelfile:
https://github.com/ollama/ollama/blob/main/docs/modelfile.md
Here is a sample Modelfile for the base Gemma 3 model, as printed by the following command:
ollama show --modelfile gemma3:4b
FROM .
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if or (eq .Role "user") (eq .Role "system") }}<start_of_turn>user
{{ .Content }}<end_of_turn>
{{ if $last }}<start_of_turn>model
{{ end }}
{{- else if eq .Role "assistant" }}<start_of_turn>model
{{ .Content }}{{ if not $last }}<end_of_turn>
{{ end }}
{{- end }}
{{- end }}"""
PARAMETER stop <end_of_turn>
PARAMETER temperature 1
PARAMETER top_k 64
PARAMETER top_p 0.95
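A Modelfile must contain a FROM instruction before ollama create will accept it, so a quick grep catches a missing one before the (potentially slow) create step. Sketched here against a stand-in file; use the real Modelfile path instead:

```shell
# Stand-in Modelfile for illustration only
printf 'FROM .\nPARAMETER temperature 1\n' > Modelfile.sample

# Every valid Modelfile needs a FROM instruction
if grep -q '^FROM ' Modelfile.sample; then
  echo "FROM instruction present"
else
  echo "FROM instruction missing"
fi
```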