Using offline AI models for free in your Python scripts with Ollama


First published on Substack: https://andresalvareziglesias.substack.com/p/using-offline-ai-models-for-free

The Ollama project allows us to download and use AI models offline, using our own computer's resources. This lets us experiment with AI in our Python projects at no cost, and test many models to find the ideal choice for our project. It's awesome.


Installation of Ollama

Installing Ollama on a Linux machine (for macOS and Windows, check the Ollama GitHub page) is very, very easy. Just run this command in a terminal:

curl -fsSL https://ollama.com/install.sh | sh

After a long wait, Ollama will be fully installed and configured.
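
If you want to check from Python that the local server is actually up before writing any AI code, a minimal sketch like this should do (it only assumes the default port, 11434):

# Minimal check, standard library only: the local Ollama server
# listens on http://localhost:11434 by default and answers the
# root path with a short status message.
from urllib.request import urlopen
from urllib.error import URLError

try:
    with urlopen('http://localhost:11434', timeout=5) as response:
        print(response.read().decode())   # e.g. "Ollama is running"
except URLError as e:
    print('Ollama does not seem to be running:', e)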

Download a model

Once installed, we can download any model to our computer for offline use. On the Ollama library page we can browse the full list of available models.

For example, to download gemma2 with 2 billion parameters, the command is:

ollama pull gemma2:2b

If you are curious (as I am), the model is downloaded to the local folder /usr/share/ollama/.ollama/models.
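
If you prefer to stay inside Python, the ollama client package can (as far as I know) pull and list models too; a small sketch of the idea:

# Sketch using the ollama Python package (pip install ollama):
# pull() downloads a model if it is not present yet,
# list() shows what is already stored locally.
import ollama

ollama.pull('gemma2:2b')   # same effect as "ollama pull gemma2:2b"
print(ollama.list())       # the models currently available on disk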

Use the downloaded model in Python

Now we can use the downloaded Gemma model just like any cloud model:

from ollama import Client, ResponseError

try:
    # Connect to the local Ollama server (default port 11434)
    client = Client(
        host='http://localhost:11434',
        headers={}
    )

    # Send a single chat message to the local gemma2 model
    response = client.chat(
        model='gemma2:2b',
        messages=[{
            'role': 'user',
            'content': 'Describe why Ollama is useful',
        }]
    )

    # The answer text is in response['message']['content']
    print(response['message']['content'])

except ResponseError as e:
    print('Error:', e.error)

The program will output our requested answer. Wonderful!
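
For long answers we can also stream the response piece by piece instead of waiting for the complete message. A sketch of that, assuming the same local model as above:

# Streaming sketch: with stream=True, chat() returns an iterator
# of partial messages instead of a single response.
from ollama import Client

client = Client(host='http://localhost:11434')

stream = client.chat(
    model='gemma2:2b',
    messages=[{'role': 'user', 'content': 'Describe why Ollama is useful'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
print()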

A real example: check this article about Ollama… with Ollama

We can program a very simple article checker with Ollama and Python, like this:

from ollama import Client, ResponseError

try:
    client = Client(
        host='http://localhost:11434',
        headers={}
    )

    prompt  = "I am an spanish writer that is learning how to "
    prompt += "write in english. Please, review if this article "
    prompt += "is well written. Thank you!\n\n"

    with open('article.md') as f:
        prompt += f.read()

    response = client.chat(
        model='gemma2:2b',
        messages=[{
            'role': 'user',
            'content': prompt,
        }]
    )

    print(response['message']['content'])

except ResponseError as e:
    print('Error:', e.error)

When executed, Gemma will give us a detailed analysis of this article, with suggestions for improvement.
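
A possible variation (just a sketch, not the only way): put the reviewer instructions in a 'system' message and keep the 'user' message only for the article text:

# Sketch: same article checker, but the instructions live in a
# system message and only the article goes in the user message.
from ollama import Client, ResponseError

try:
    client = Client(host='http://localhost:11434')

    with open('article.md') as f:
        article = f.read()

    response = client.chat(
        model='gemma2:2b',
        messages=[
            {'role': 'system',
             'content': 'You are an English editor. Review the article and point out errors.'},
            {'role': 'user', 'content': article},
        ]
    )

    print(response['message']['content'])

except ResponseError as e:
    print('Error:', e.error)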

Awesome! The possibilities are limitless!

A lot to learn, a lot of fun

Ollama lets us test different models with our most precious data, without any privacy concerns. It also helps us save costs in the initial stages of developing an AI-powered application.
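
For example, a tiny comparison loop like this sends the same prompt to several local models (the model names are only examples; use whatever you have pulled):

# Sketch: ask several locally pulled models the same question and
# compare the answers. The model names below are just examples.
from ollama import Client, ResponseError

client = Client(host='http://localhost:11434')
prompt = 'Summarise what Ollama does in one sentence.'

for model in ['gemma2:2b', 'llama3.2:3b', 'phi3:mini']:
    try:
        response = client.chat(
            model=model,
            messages=[{'role': 'user', 'content': prompt}],
        )
        print(f'--- {model} ---')
        print(response['message']['content'])
    except ResponseError as e:
        print(f'--- {model} --- error:', e.error)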

And you? What kind of projects will you develop with the help of Ollama?

Happy coding!

About the list

Among the Python and Docker posts, I will also write about other related topics, like:

  • Software architecture
  • Programming environments
  • Linux operating system
  • Etc.

If you found some interesting technology, programming language or whatever, please, let me know! I’m always open to learning something new!

About the author

I’m Andrés, a full-stack software developer based in Palma, on a personal journey to improve my coding skills. I’m also a self-published fantasy writer with four published novels to my name. Feel free to ask me anything!

Yo Andrés, Love the breakdown. Quick question—how’s the performance of Ollama offline vs. cloud models? Any lag or accuracy drops? Would be cool to hear your take! :-)
Wow ! Good Post
Thank you very much!
Thanks by sharing
The quality is almost the same... and the lag depends on your computer's RAM. These models need A LOT of RAM!
Great post! I love how you explained the simplicity of setting up Ollama and using AI models offline. It’s really helpful for developers looking to experiment with AI without cloud dependencies.

I’ve been exploring ways to fine-tune models for specific tasks. Does Ollama support any form of model customization or fine-tuning? Would love to hear.
Thx would really help fr and lovely article
