Download and run Hugging Face AI models in Ollama
This article is a step-by-step guide to downloading an AI model from a Hugging Face repository and running it with Ollama.
Ollama can create a model locally from either the GGUF format or the SafeTensors format, and this article covers both.
Creating a model from a GGUF file is highly reliable, because the GGUF file packages the model weights together with a chat template suitable for the AI model.
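As a quick sanity check after downloading, a GGUF container can be recognized by the 4-byte ASCII magic "GGUF" at the start of the file. A minimal sketch, using a stand-in file for illustration (point `head` at the real download instead):

```shell
# Stand-in file for illustration only; a real GGUF model starts with
# the same 4-byte ASCII magic "GGUF" followed by the binary header.
printf 'GGUF\003\000\000\000rest-of-header' > sample.gguf

# Read the first four bytes and compare against the expected magic
magic=$(head -c 4 sample.gguf)
if [ "$magic" = "GGUF" ]; then
  echo "looks like a GGUF file"
else
  echo "not a GGUF file"
fi
```

This catches truncated or HTML-error-page downloads before Ollama fails on them.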
Running a model using GGUF:
Example repository with a GGUF format model: https://huggingface.co/sergbese/gemma-3-isv-translator-v5-gguf-bf16
Create a download directory:
mkdir gemma-3-isv-translator-v5-gguf-bf16
Download the file into the directory:
cd gemma-3-isv-translator-v5-gguf-bf16
wget "https://huggingface.co/sergbese/gemma-3-isv-translator-v5-gguf-bf16/resolve/main/gemma-3-finetune-2.BF16.gguf?download=true" -O ./gemma-3-finetune-2.BF16.gguf
Create an Ollama model:
echo 'FROM ./gemma-3-finetune-2.BF16.gguf' > Modelfile
ollama create gemma-3-isv-translator-v5-gguf-bf16:latest
Run and verify the model:
curl http://localhost:11434/api/chat -d '{ "model": "gemma-3-isv-translator-v5-gguf-bf16:latest", "stream": false, "messages": [ { "role": "user", "content": "Translate \"How old are you?\" from English to Tamil" } ] }' | jq .
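The curl command above prints the whole response document. The reply text itself is nested under `.message.content`, so a jq filter can pull out just the translation. A minimal sketch against a sample response (the content field here is a placeholder; the real reply would be the model's translation):

```shell
# Sample of the JSON shape returned by /api/chat with "stream": false
# (content abridged for illustration)
response='{"model":"gemma-3-isv-translator-v5-gguf-bf16:latest","message":{"role":"assistant","content":"sample translation"},"done":true}'

# -r prints the raw string instead of a quoted JSON value
echo "$response" | jq -r '.message.content'
```

The same filter can be appended to the curl pipeline in place of plain `jq .`.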
Running a model using SafeTensors:
To create a model from the SafeTensors format, we need to download (clone) the entire Hugging Face repository.
The repository contains large files, some larger than 1 GB, so we need tools like Git and Git LFS to clone it.
Example repository containing SafeTensors models:
https://huggingface.co/sergbese/gemma-3-isv-translator-v5
Install Git:
https://git-scm.com/
Install Git LFS:
https://git-lfs.com/
Configure Git with LFS:
git lfs install
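Before cloning, it is worth confirming that both tools are actually on the PATH; a minimal check:

```shell
# Report whether git and git-lfs are installed and reachable
for tool in git git-lfs; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool found"
  else
    echo "$tool missing"
  fi
done
```

If `git-lfs` is missing, the clone will still succeed but the large model files will be small text pointer files instead of the real weights.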
Clone the Hugging Face repository:
git clone https://huggingface.co/sergbese/gemma-3-isv-translator-v5
The clone can take a long time to complete, depending on internet speed and repository size. Git doesn't show the progress of the LFS downloads, so the following command continuously prints the directory size to indicate progress during the download (press Ctrl-C to stop it):
cd gemma-3-isv-translator-v5 && while :; do du -sh; sleep 2; done
Create an Ollama model from the cloned repository:
cd gemma-3-isv-translator-v5
echo 'FROM .' > Modelfile
ollama create gemma-3-isv-translator-v5:latest
Run and verify the Ollama Model:
curl http://localhost:11434/api/chat -d '{ "model": "gemma-3-isv-translator-v5:latest", "stream": false, "messages": [ { "role": "user", "content": "Translate \"How old are you?\" from English to Tamil" } ] }' | jq .
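The examples above set "stream": false to receive one JSON document. With "stream": true (Ollama's default), /api/chat instead returns one JSON object per line, each carrying a fragment of the reply in .message.content. The fragments can be reassembled with jq; a minimal sketch against a sample two-chunk stream (a real stream has many more chunks):

```shell
# Sample of the newline-delimited JSON shape returned when "stream" is true
stream='{"message":{"role":"assistant","content":"Hello "},"done":false}
{"message":{"role":"assistant","content":"world"},"done":true}'

# -j joins the raw fragments without inserting newlines between them
echo "$stream" | jq -rj '.message.content'
echo
```

The same filter works when piped directly from a streaming curl request.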
Unlike the GGUF format, the SafeTensors format doesn't contain a built-in template for the model. Hence it is a good idea to write the Modelfile with a template based on the documentation of the base model.
Reference for the Ollama Modelfile:
https://github.com/ollama/ollama/blob/main/docs/modelfile.md
Here is a sample Modelfile for the base Gemma 3 model, as printed by the following command:
ollama show --modelfile gemma3:4b
FROM .
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if or (eq .Role "user") (eq .Role "system") }}<start_of_turn>user
{{ .Content }}<end_of_turn>
{{ if $last }}<start_of_turn>model
{{ end }}
{{- else if eq .Role "assistant" }}<start_of_turn>model
{{ .Content }}{{ if not $last }}<end_of_turn>
{{ end }}
{{- end }}
{{- end }}"""
PARAMETER stop <end_of_turn>
PARAMETER temperature 1
PARAMETER top_k 64
PARAMETER top_p 0.95
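A Modelfile must contain a FROM instruction before ollama create will accept it, so a quick grep catches a missing one before the (potentially slow) create step. Sketched here against a stand-in file; use the real Modelfile path instead:

```shell
# Stand-in Modelfile for illustration only
printf 'FROM .\nPARAMETER temperature 1\n' > Modelfile.sample

# Every valid Modelfile needs a FROM instruction
if grep -q '^FROM ' Modelfile.sample; then
  echo "FROM instruction present"
else
  echo "FROM instruction missing"
fi
```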