Skip to main content

ChatOllama

Ollama allows you to run open-source large language models, such as Llama 2, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

It optimizes setup and configuration details, including GPU usage.

For a complete list of supported models and model variants, see the Ollama model library.

Overviewโ€‹

Integration detailsโ€‹

ClassPackageLocalSerializableJS supportPackage downloadsPackage latest
ChatOllamalangchain-ollamaโœ…โŒโœ…PyPI - DownloadsPyPI - Version

Model featuresโ€‹

Tool callingStructured outputJSON modeImage inputAudio inputVideo inputToken-level streamingNative asyncToken usageLogprobs
โœ…โŒโœ…โŒโŒโŒโœ…โœ…โŒโŒ

Setupโ€‹

First, follow these instructions to set up and run a local Ollama instance:

  • Download and install Ollama onto the available supported platforms (including Windows Subsystem for Linux)
  • Fetch available LLM model via ollama pull <name-of-model>
    • View a list of available models via the model library
    • e.g., ollama pull llama3
  • This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.

On Mac, the models will be download to ~/.ollama/models

On Linux (or WSL), the models will be stored at /usr/share/ollama/.ollama/models

  • Specify the exact version of the model of interest as such ollama pull vicuna:13b-v1.5-16k-q4_0 (View the various tags for the Vicuna model in this instance)
  • To view all pulled models, use ollama list
  • To chat directly with a model from the command line, use ollama run <name-of-model>
  • View the Ollama documentation for more commands. Run ollama help in the terminal to see available commands too.

If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installationโ€‹

The LangChain Ollama integration lives in the langchain-ollama package:

%pip install -qU langchain-ollama

Instantiationโ€‹

Now we can instantiate our model object and generate chat completions:

  • TODO: Update model instantiation with relevant params.
from langchain_ollama import ChatOllama

llm = ChatOllama(
model="llama3",
temperature=0,
# other params...
)
API Reference:ChatOllama

Invocationโ€‹

from langchain_core.messages import AIMessage

messages = [
(
"system",
"You are a helpful assistant that translates English to French. Translate the user sentence.",
),
("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
API Reference:AIMessage
AIMessage(content='Je adore le programmation.\n\n(Note: "programmation" is not commonly used in French, but I translated it as "le programmation" to maintain the same grammatical structure and meaning as the original English sentence.)', response_metadata={'model': 'llama3', 'created_at': '2024-07-22T17:43:54.731273Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 11094839375, 'load_duration': 10121854667, 'prompt_eval_count': 36, 'prompt_eval_duration': 146569000, 'eval_count': 46, 'eval_duration': 816593000}, id='run-befccbdc-e1f9-42a9-85cf-e69b926d6b8b-0', usage_metadata={'input_tokens': 36, 'output_tokens': 46, 'total_tokens': 82})
print(ai_msg.content)
Je adore le programmation.

(Note: "programmation" is not commonly used in French, but I translated it as "le programmation" to maintain the same grammatical structure and meaning as the original English sentence.)

Chainingโ€‹

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"You are a helpful assistant that translates {input_language} to {output_language}.",
),
("human", "{input}"),
]
)

chain = prompt | llm
chain.invoke(
{
"input_language": "English",
"output_language": "German",
"input": "I love programming.",
}
)
API Reference:ChatPromptTemplate
AIMessage(content='Ich liebe Programmieren!\n\n(Note: "Ich liebe" means "I love", "Programmieren" is the verb for "programming")', response_metadata={'model': 'llama3', 'created_at': '2024-07-04T04:22:33.864132Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 1310800083, 'load_duration': 1782000, 'prompt_eval_count': 16, 'prompt_eval_duration': 250199000, 'eval_count': 29, 'eval_duration': 1057192000}, id='run-cbadbe59-2de2-4ec0-a18a-b3220226c3d2-0')

Tool callingโ€‹

We can use tool calling with an LLM that has been fine-tuned for tool use:

ollama pull llama3-groq-tool-use

We can just pass normal Python functions directly as tools.

from typing import List

from langchain_ollama import ChatOllama
from typing_extensions import TypedDict


def validate_user(user_id: int, addresses: List) -> bool:
"""Validate user using historical addresses.

Args:
user_id: (int) the user ID.
addresses: Previous addresses.
"""
return True


llm = ChatOllama(
model="llama3-groq-tool-use",
temperature=0,
).bind_tools([validate_user])

result = llm.invoke(
"Could you validate user 123? They previously lived at "
"123 Fake St in Boston MA and 234 Pretend Boulevard in "
"Houston TX."
)
result.tool_calls
API Reference:ChatOllama
[{'name': 'validate_user',
'args': {'addresses': ['123 Fake St, Boston MA',
'234 Pretend Boulevard, Houston TX'],
'user_id': 123},
'id': 'fe2148d3-95fb-48e9-845a-4bfecc1f1f96',
'type': 'tool_call'}]

We look at the LangSmith trace to see that the tool call was performed:

https://smith.langchain.com/public/4169348a-d6be-45df-a7cf-032f6baa4697/r

In particular, the trace shows how the tool schema was populated.

Multi-modalโ€‹

Ollama has support for multi-modal LLMs, such as bakllava and llava.

ollama pull bakllava

Be sure to update Ollama so that you have the most recent version to support multi-modal.

import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
"""
Convert PIL images to Base64 encoded strings

:param pil_image: PIL image
:return: Re-sized Base64 string
"""

buffered = BytesIO()
pil_image.save(buffered, format="JPEG") # You can change the format if needed
img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
return img_str


def plt_img_base64(img_base64):
"""
Disply base64 encoded string as image

:param img_base64: Base64 string
"""
# Create an HTML img tag with the base64 string as the source
image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
# Display the image by rendering the HTML
display(HTML(image_html))


file_path = "../../../static/img/ollama_example_img.jpg"
pil_image = Image.open(file_path)

image_b64 = convert_to_base64(pil_image)
plt_img_base64(image_b64)
<img src="" /> 
from langchain_core.messages import HumanMessage
from langchain_ollama import ChatOllama

llm = ChatOllama(model="bakllava", temperature=0)


def prompt_func(data):
text = data["text"]
image = data["image"]

image_part = {
"type": "image_url",
"image_url": f"data:image/jpeg;base64,{image}",
}

content_parts = []

text_part = {"type": "text", "text": text}

content_parts.append(image_part)
content_parts.append(text_part)

return [HumanMessage(content=content_parts)]


from langchain_core.output_parsers import StrOutputParser

chain = prompt_func | llm | StrOutputParser()

query_chain = chain.invoke(
{"text": "What is the Dollar-based gross retention rate?", "image": image_b64}
)

print(query_chain)
90%

API referenceโ€‹

For detailed documentation of all ChatOllama features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_ollama.chat_models.ChatOllama.html


Was this page helpful?


You can also leave detailed feedback on GitHub.