# Granite Tool Call - Red Hat Support

Fine-tuned Granite 4.0-H-Micro to generate structured tool calls for Red Hat documentation search.

- **Developed by:** jbarguti
- **License:** Apache 2.0
- **Base Model:** unsloth/granite-4.0-h-micro
- **Training:** Unsloth + TRL
## 🎯 Purpose

Converts Red Hat technical questions into `PortalSolrSearchTool` calls. Handles both technical queries (generates tool calls) and conversational input (responds naturally).

Example:

Input: `"How do I configure static IP in RHEL 9?"`

Output:

```
<tool_call>[{"name": "PortalSolrSearchTool", "arguments": {"user_query": "configure static IP in RHEL 9"}}]
```
## 🚀 Usage

### Installation

```bash
pip install unsloth

# Or for faster installation:
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
```
### Quick Start

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    "jbarguti/granite-4-h-micro-tool-call-finetune-redhat",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Helper function
def ask_question(query):
    """Ask a question and get a tool call or conversational response."""
    print(f"\n{'='*80}")
    print(f"Query: {query}")
    print(f"{'='*80}")
    print("Response:")

    messages = [{"role": "user", "content": query}]
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to("cuda")

    text_streamer = TextStreamer(tokenizer, skip_prompt=True)
    _ = model.generate(
        inputs,
        max_new_tokens=128,
        temperature=0.3,
        top_p=0.95,
        top_k=50,
        streamer=text_streamer,
    )
    print(f"{'='*80}\n")

# Try some examples
ask_question("How do I configure static IP in RHEL 9?")
ask_question("Hello!")
ask_question("What are the prerequisites for OpenShift 4.15?")

# Interactive mode
print("Interactive Mode - Type 'quit' to exit")
while True:
    query = input("\nYour question: ").strip()
    if query.lower() in ["quit", "exit", "q"]:
        break
    if query:
        ask_question(query)
```
## 📊 Training Config

- **Dataset:** 94 examples (RHEL, OpenShift, Ansible queries)
- **LoRA Rank:** 32
- **Batch Size:** 2 (effective 8 with gradient accumulation)
- **Steps:** 100
- **Learning Rate:** 2e-4
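A quick back-of-envelope check of the training budget these numbers imply (the gradient-accumulation value of 4 is inferred from batch size 2 vs. effective 8, not taken from training logs):

```python
# Back-of-envelope training budget from the config above.
# grad_accum_steps is inferred: per-device batch 2 -> effective batch 8.
dataset_size = 94
per_device_batch = 2
effective_batch = 8
steps = 100

grad_accum_steps = effective_batch // per_device_batch
examples_seen = steps * effective_batch
epochs = examples_seen / dataset_size

print(grad_accum_steps)   # 4
print(examples_seen)      # 800
print(round(epochs, 1))   # ~8.5 passes over the 94-example dataset
```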
## 💡 Inference Settings

```python
temperature = 0.3    # Low for consistent tool call formatting
max_new_tokens = 128
top_p = 0.95
top_k = 50
```
## 🔄 Behavior

- Technical queries → generates tool calls
- Greetings/casual → conversational responses
- Off-topic → polite redirect to Red Hat topics
## 🧪 Example Outputs

```
# Technical query
Input: "What are the prerequisites for OpenShift 4.15?"
Output: <tool_call>[{"name": "PortalSolrSearchTool", "arguments": {"user_query": "prerequisites for OpenShift 4.15"}}]

# Greeting
Input: "Hello!"
Output: Hello! I'm Red Hat's Ask Red Hat assistant. How can I help you with Red Hat products and services today?

# Error troubleshooting
Input: "ERROR rhsm cannot connect to Red Hat Subscription Management"
Output: <tool_call>[{"name": "PortalSolrSearchTool", "arguments": {"user_query": "ERROR rhsm cannot connect to Red Hat Subscription Management"}}]
```
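Outputs in this format can be turned back into structured data with a small parser. A minimal sketch (`parse_response` is a hypothetical helper, not shipped with the model — it assumes the tool-call payload is valid JSON following the `<tool_call>` tag, as in the examples above):

```python
import json
import re

# Matches the JSON list that follows a <tool_call> tag.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\[.*\])", re.DOTALL)

def parse_response(text):
    """Return a list of tool-call dicts, or None for conversational replies."""
    match = TOOL_CALL_RE.search(text)
    if match is None:
        return None  # plain conversational response, no tool call to run
    return json.loads(match.group(1))

calls = parse_response(
    '<tool_call>[{"name": "PortalSolrSearchTool", '
    '"arguments": {"user_query": "prerequisites for OpenShift 4.15"}}]'
)
print(calls[0]["name"])                      # PortalSolrSearchTool
print(calls[0]["arguments"]["user_query"])   # prerequisites for OpenShift 4.15
```

A caller can then dispatch on the result: `None` means the reply should be shown to the user as-is, while a non-empty list names the search tool to invoke.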
## 📦 Model Info

- **LoRA Adapters:** ~50 MB
- **With Base Model:** ~6-8 GB (16-bit/full precision)
- **Architecture:** Granite MoE Hybrid
- **Active Parameters:** 3B