PharoInfer is a fully in-image inference engine for Pharo Smalltalk. It loads a GGUF model file directly from disk and drives llama.cpp through UFFI — there is no HTTP server, no Ollama bridge, and no subprocess. Talk to the model straight from the image.
You need two things: a prebuilt llama.cpp shared library (libllama.so on Linux, libllama.dylib on macOS, llama.dll on Windows) and a .gguf model file. The bindings target the modern llama.cpp API (b4000 and later). Pharo looks for libllama.so (or the platform equivalent) on the default library search path. To override, pin the path from the image:
AILlamaLibrary libraryPath: '/home/me/llama.cpp/build/libllama.so'.
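On macOS or Windows, point the same hook at the .dylib or .dll instead. A platform-conditional sketch (`OSPlatform current` is standard Pharo; the two paths are placeholders for wherever you built llama.cpp):

```smalltalk
"Pin the llama.cpp library per platform. Paths are examples only."
AILlamaLibrary libraryPath: (OSPlatform current isMacOSX
    ifTrue: [ '/opt/llama.cpp/build/libllama.dylib' ]
    ifFalse: [ '/home/me/llama.cpp/build/libllama.so' ]).
```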
Metacello new
githubUser: 'pharo-llm' project: 'pharo-infer' commitish: 'main' path: 'src';
baseline: 'AIPharoInfer';
load.
| manager engine model |
manager := AIModelManager default.
manager currentBackend: AILocalBackend new.
model := manager loadModel:
(FileLocator home / 'models' / 'tiny.gguf') fullName.
engine := AIInferenceEngine default.
engine backend: manager currentBackend.
engine complete: 'Hello from Pharo!' model: model name.
engine
stream: 'Tell me a joke about Smalltalk'
model: model name
onToken: [ :piece | Transcript show: piece ].
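The onToken: block receives each detokenized piece as it is produced, so you can both display and accumulate the stream. A small sketch using a standard Pharo WriteStream to keep the full completion:

```smalltalk
"Stream tokens to the Transcript while buffering the whole reply."
| buffer |
buffer := WriteStream on: String new.
engine
    stream: 'Tell me a joke about Smalltalk'
    model: model name
    onToken: [ :piece |
        buffer nextPutAll: piece.
        Transcript show: piece ].
buffer contents. "the complete response text"
```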
| request |
request := AIChatCompletionRequest
model: model name
messages: {
AIChatMessage system: 'You are a helpful AI assistant.'.
AIChatMessage user: 'What is Smalltalk?' }.
AIChatAPI default complete: request.
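Multi-turn chat is just a longer messages array. The sketch below assumes an AIChatMessage assistant: constructor mirroring the system:/user: pattern above; that selector is an assumption, not confirmed API:

```smalltalk
"Carry prior turns forward by replaying them in the messages array.
 AIChatMessage assistant: is assumed to exist alongside system:/user:."
| request |
request := AIChatCompletionRequest
    model: model name
    messages: {
        AIChatMessage system: 'You are a helpful AI assistant.'.
        AIChatMessage user: 'What is Smalltalk?'.
        AIChatMessage assistant: 'Smalltalk is a pure object-oriented language.'.
        AIChatMessage user: 'Who created it?' }.
AIChatAPI default complete: request.
```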
AILocalBackend new
nGpuLayers: 999; "offload all layers"
nThreads: 8;
contextSize: 4096.
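To actually use the tuned backend, hand it to the model manager. Note the trailing yourself: a cascade answers the result of its last message, so without it you would store the return value of contextSize: rather than the backend itself:

```smalltalk
"Configure a local backend and install it on the default manager."
| backend |
backend := AILocalBackend new
    nGpuLayers: 999;   "offload all layers to the GPU"
    nThreads: 8;
    contextSize: 4096;
    yourself.
AIModelManager default currentBackend: backend.
```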
AILlamaLibrary — FFILibrary mapping the llama.cpp C entry points.
AILlamaModelParams, AILlamaContextParams, AILlamaBatch, AILlamaSamplerChainParams — FFIExternalStructure mirrors of the by-value records used by llama.cpp.
AILocalBackend — drives llama.cpp: loads a model, runs tokenization + decode + sampling, and detokenizes back to UTF-8.
AILocalModelHandle — owns the opaque (model *, context *) pair and frees it on unload.
AIGGUFParser — optional pre-flight reader for GGUF metadata (header, vocab, special tokens) without loading the model.
AIInferenceEngine, AIChatAPI — high-level entry points.
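A pre-flight check with AIGGUFParser might look like the sketch below. Every selector except the class name is an assumption about the parser's API, and the metadata keys shown (general.architecture, llama.vocab_size) are standard GGUF keys that a given model may or may not carry:

```smalltalk
"Hypothetical sketch: inspect GGUF metadata before committing to a full load.
 parseFile: and the dictionary-style access are assumed, not confirmed API."
| path meta |
path := (FileLocator home / 'models' / 'tiny.gguf') fullName.
meta := AIGGUFParser new parseFile: path.
meta ifNotNil: [
    Transcript
        show: 'architecture: ', (meta at: 'general.architecture' ifAbsent: [ '?' ]); cr ].
```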