# Run an LLM on your computer with Ollama, Ngrok, and Cursor

Here’s a quick how-to guide for running a large language model (LLM) locally using three tools: **Ollama**, **Ngrok**, and **Cursor IDE**. This setup is perfect if you want to test AI features without relying on OpenAI’s cloud, or if you're just nerding out and want more control.

## Tools We'll Be Using

### Cursor (🔗 [https://www.cursor.com](https://www.cursor.com/features))

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747711902787/6950b9e7-fbf6-48ce-9985-c474a317c299.png align="center")

Cursor is a developer-focused IDE with AI built right in. You can switch between different models and agents, all powered by your OpenAI API key (unless you’re paying Cursor directly to use theirs).

Some standout features:

* Understands your codebase and adds context to prompts
    
* Runs commands for you (with your approval)
    
* Spot bugs and offers fixes
    
* Autocompletes like a champ
    
* Built-in smart reviews
    

It’s VS Code on steroids, with some real AI muscle.

### Ngrok (🔗 [https://ngrok.com](https://ngrok.com/))

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747711978194/64a3c159-8c4c-4fbf-8c59-9fb50f63181b.png align="center")

Ngrok is an API Gateway service that handles everything related to your API so that you focus on your business rules without worrying about secure connections and infrastructure. While it offers a full suite of infrastructure tools, we’ll focus on **Webhook testing**, which gives us a public URL to test a local server.

### Ollama (🔗 [https://ollama.com](https://ollama.com/))

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712136731/23c04917-bfab-4c3b-9a45-364d69eeae40.png align="center")

Ollama is a fantastic open-source tool that lets you run LLMs locally. You can pull supported models, serve them through a local API, and even customize or train your own.

Browse available models at [ollama.com/library](https://ollama.com/library)

## Let’s Get Started

### 1\. Pull LLMs with Ollama

Once everything’s installed, the first step is to pick and download an LLM. Just a heads-up: LLMs are resource-hungry. Bigger models need more RAM and CPU, so make sure your machine can handle it.

To install a model:

```bash
ollama pull <model>
```

For example:

```bash
ollama pull gemma3:4b
```

You can check installed models with:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712460296/dbb95165-263d-467a-90b0-5cd84656a83f.png align="center")

### 2\. Set Up Ngrok

Create an Ngrok account and enable Multi-Factor Auth (highly recommended). Then, install the CLI and connect it with your account by running:

```plaintext
ngrok config add-authtoken <your-token>
```

The command above will set up your computer to connect securely with the Ngrok Web platform, but it won’t secure any endpoint you later expose.

🚧 Heads up: Your Ngrok endpoint exposes your local server to the internet. Even on the free plan, Ngrok gives you options like auth, IP filtering, and rate limits. Worth setting up later!

---

### 3\. Run the Ollama Server with Ngrok

Ollama can serve your model through a local API. Start it like this:

```plaintext
OLLAMA_ORIGINS=* ollama serve
```

By default, it runs on port `11434`.

Now expose that port with Ngrok:

```plaintext
ngrok http 11434 --host-header="localhost:11434"
```

Ngrok will generate a public HTTPS URL, called “endpoint” in the ngrok language, that maps to your local server.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712830076/525fa7e0-9165-4133-bf11-51c683c5483e.png align="center")

We now have a public URL to connect us from the internet to our local Ollama server, It’s time to tell Cursor how to use this URL to hit our Ollama server locally.

---

### 4\. Connect Ollama to Cursor IDE

#### Add Your Model

In Cursor, go to **Settings &gt; Cursor Settings &gt; Models**.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712883769/89331623-cfcf-4e48-9aa2-e6e25fafacc7.png align="center")

You’ll see some defaults, but they likely won’t match your locally installed model names. Hit `+ Add Model` and enter the exact model name from `ollama list`.

#### Override OpenAI API Base URL

Scroll to the **OpenAI API Key** section and expand **Override OpenAI Base URL**.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712956275/a37fb5b2-24be-46f0-ab22-b861500b1a03.png align="center")

Paste your Ngrok HTTPS URL here. You can put anything in the API key field, it’s not validated in this case since our API doesn’t have any authentication method enabled yet.

---

### 5\. Test the Setup

To make sure everything works, you should:

* ✅ Be running the Ollama server
    
* ✅ Have Ngrok expose port 11434
    
* ✅ Have your LLM registered in Cursor
    
* ✅ Set the Ngrok URL in the API override
    

Now, toggle the **Enable OpenAI API Key** slider in Cursor’s settings. It’ll ping your API to confirm the connection.

If there's an error, Cursor will pop up the response — super useful for debugging.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747712982903/0e769d2a-2a14-41a8-8365-6ab3c6038bb1.png align="center")

Once it connects, you’re good to go! Start chatting with your local LLM in the left panel.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747713013716/8d7985cb-cf78-40f9-ba28-97ac78c85faf.png align="center")

Make sure to:

* Switch prompt mode to `Ask` or `Manual`
    
* Disable automatic model selection
    
* Manually pick one of your local models
    

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1747713041829/5871b234-0324-44b0-bb6b-ff9eba1a6706.png align="center")

---

## Takeaways

Congrats — you’ve got a local LLM running and integrated into your dev workflow.

That said, performance varies depending on your hardware. Try a smaller model (1b or 3b parameters) if your machine struggles. Models like 7b and up usually need at least 16GB of RAM.

And don’t be afraid to experiment. Pull multiple models and see which works best for your use case.

Also, remember: you don’t need Cursor or Ngrok specifically. All you need is:

* Ollama to run the model locally
    
* Some way to expose it via a fixed URL (Ngrok, localtunnel, etc.)
    
* An IDE that lets you set a custom OpenAI API base URL
    

---

## Next Steps

* 🔄 **Automate it all**  
    Right now, everything is manual. Consider scripting the startup, spinning up the server, and tunneling with one command.
    
* 🔐 **Secure your endpoint**  
    Even if your CLI is authenticated, your public API is wide open. Use Ngrok’s free auth and rate limits to avoid abuse. But then set the API Key in Cursor to successfully connect it to the API.
    
* 🔗 **Claim a static Ngrok URL**  
    Ngrok lets you claim a static domain for free. This means you won't have to update Cursor every time you restart the tunnel. Just be sure to lock it down properly if you go that route.