
You Can Use ANY Local LLM in Antigravity!

6 min read

Summary

This video tutorial shows how to expand the large language model (LLM) catalog available in the Antigravity platform by running open-source LLMs locally with LM Studio. The presenter walks through the full setup, from downloading LM Studio to configuring a custom server in Antigravity, enabling the use of models beyond the default offerings.


Key Steps and Insights

  • Downloading and Installing LM Studio:
      • Visit the LM Studio website, select your platform, then download and install the software.
      • LM Studio provides access to many free, open-source models hosted on Hugging Face.
      • Examples of downloadable models include JM 4.7 Flash and Google's Gemma 3 4B.
      • Once installed, load your chosen model and switch the server status from Off to On; this effectively turns your machine into a local server hosting the model.
  • Creating a Custom Model Router in Antigravity:
      • With LM Studio set up, return to Antigravity and create a model router script (e.g., model_router.py); a minimal sketch appears after this list.
      • The script is written with FastAPI and MCP tooling to handle server requests and model interaction.
      • No expert coding skills are needed, because AI tools can generate this code.
      • The script includes:
          • import statements for FastAPI, httpx, MCP, and related tools;
          • configuration settings such as the system prompt and temperature parameters;
          • server setup details that can be left mostly unchanged.
      • The presenter offers to share this code on request.
  • Configuring the MCP Server in Antigravity:
      • Navigate to MCP servers via the menu in Antigravity.
      • Because the server is custom, use Manage MCP to find and configure it.
      • Configuration involves editing a JSON file with the server parameters; a sample entry is sketched below.
      • This JSON file and the related code are available on request from the presenter.
  • Testing the Setup:
      • Once the server is running, test it by sending a query (e.g., "Hello, who are you?") through the LM Studio router; a small test snippet follows the sketches below.
      • The local model responds almost instantly; in the example, the Gemma 3 model replies, "Hi there, I'm Gemma."
      • Response speed depends on the model's size and the PC's capability.
      • The setup also allows flexibility, including chaining multiple models or having one model call another, extending Antigravity's native model catalog.
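For reference, here is a minimal sketch of what such a model_router.py can look like. This is not the presenter's code (that is only shared on request): the port, route name, and model identifier are assumptions based on LM Studio's default OpenAI-compatible server, and the MCP wiring the video mentions is omitted for brevity.

```python
# model_router.py - minimal sketch, not the presenter's actual script.
# Assumes LM Studio's OpenAI-compatible server on its default port 1234.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default
SYSTEM_PROMPT = "You are a helpful assistant."  # tweakable, as noted in the video
TEMPERATURE = 0.7                               # tweakable, as noted in the video


class Query(BaseModel):
    prompt: str


@app.post("/ask")  # route name is an assumption
async def ask(query: Query):
    """Forward the prompt to the locally hosted model and return its reply."""
    payload = {
        "model": "local-model",  # LM Studio serves whichever model is loaded
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": query.prompt},
        ],
        "temperature": TEMPERATURE,
    }
    async with httpx.AsyncClient(timeout=120.0) as client:
        resp = await client.post(LM_STUDIO_URL, json=payload)
        resp.raise_for_status()
        data = resp.json()
    return {"reply": data["choices"][0]["message"]["content"]}
```

Start it with `uvicorn model_router:app` while LM Studio's server is switched on.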
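The JSON configuration file is likewise only available from the presenter, but MCP server configurations of this kind typically just register the command that launches the server. A rough illustration, where the server name and launch command are assumptions:

```json
{
  "mcpServers": {
    "local-model-router": {
      "command": "python",
      "args": ["model_router.py"]
    }
  }
}
```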
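Finally, a small smoke test for the router, assuming it is running on uvicorn's default port 8000 with the hypothetical /ask route from the sketch above:

```python
# Hypothetical smoke test - sends the video's example query to the router.
import httpx

resp = httpx.post(
    "http://localhost:8000/ask",
    json={"prompt": "Hello, who are you?"},
    timeout=120.0,
)
resp.raise_for_status()
print(resp.json()["reply"])  # e.g. "Hi there, I'm Gemma."
```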

Core Concepts and Definitions

  • LM Studio: A downloadable application for running open-source LLMs locally on your PC, turning it into a server.
  • Antigravity: The platform being extended with additional LLMs via custom model routing and MCP configuration.
  • MCP (Model Context Protocol): A protocol for server communication that enables interaction with local LLMs in Antigravity.
  • FastAPI: A Python web framework used to build the model router API for server-client communication.
  • Model Router: A custom Python script that routes requests from Antigravity to the local LLM server.

Highlights

  • No advanced coding required: AI-assisted code generation simplifies creating the model router.
  • Use of free/open-source models: Enables access to a larger variety of LLMs from Hugging Face.
  • Local hosting of LLMs: Avoids dependency on cloud APIs, reducing latency and potential costs.
  • Customizability: Users can tweak prompts, temperature, and even chain models for complex workflows.
  • PC requirements: Larger models need more capable hardware, while smaller, lighter models respond quickly on an average PC.

Timeline of Key Actions

Time        | Action                                         | Description
00:00–01:13 | Download and install LM Studio                 | Select platform, download, install, and complete initial setup
01:13–01:48 | Load and run models in LM Studio               | Choose a model (e.g., JM 4.7 Flash, Gemma 3 4B), load it, and turn the server on
01:48–02:50 | Create the model router script in Antigravity  | Write the FastAPI/MCP Python script that routes requests, with emphasis on its simplicity
02:50–03:28 | Configure the MCP server in Antigravity        | Add the custom MCP server via a JSON configuration file
03:28–04:24 | Test the local model through the router        | Send a test query and receive a near-instant response from the local Gemma model

Key Insights

  • Expanding Antigravity's LLM catalog is straightforward once LM Studio models are integrated.
  • Local model hosting improves response times and offline accessibility.
  • MCP facilitates custom server communication within Antigravity.
  • AI-assisted code generation lowers the barrier for users without deep coding expertise.
  • Model flexibility and chaining enable sophisticated multi-model workflows.

Not Specified/Uncertain

  • Exact hardware specifications required for running larger models locally.
  • Detailed contents of the JSON MCP server configuration file.
  • Whether GPU acceleration is mandatory or optional for different models.
  • Security considerations for running local servers with LLMs.

This tutorial offers a practical, user-friendly way to diversify and enhance your LLM usage within Antigravity by leveraging open-source models and custom server routing.

Updated: 28 Mar 2026