At the end of last year, I came to the conclusion that I had no choice but to stand up a lab at POINT designed specifically for learning about machine learning and large language models. As I mentioned in my last post, I think that understanding the implicit uses and dangers of this relatively new breed of technology will be extraordinarily important going forward.
So first, hardware. I did some research, and the hacker folks in my feeds were all using old hardware or Raspberry Pis just to say they could, which is cool. In fact, Brent Huston at Microsolved wrote the article that started this all, in a lot of ways. I am all about reusing devices; I have laptops that are 20 years old performing important functions at the Sempf Compound. For this, though, I needed a real research box, so I bit the bullet and got a mini PC (not this one).

Great, now I have a place to put stuff. What to put where? After hemming and hawing, I decided I wanted multiple machines running, with functional separation, because it makes my brain better. When I was in the dev space, I had a server running Windows Server and Hyper-V so that I could spin up a whole new machine that replicated the environment I was coding for. It worked out well, and I wanted to emulate that with something lighter weight that wasn't a Microsoft product. So I started with Proxmox. I am not going to go into the setup and whatnot because there are FAR better guides out there than mine (I'd link, but they go out of date fast), so I'll just say I have 128 GB of RAM split amongst a few machines running things I want to keep separated. And if I ever run out of money, I can sell the RAM and move to Cancun.
Alright, now I have an empty virtual machine, so I started with what I know and installed Ubuntu. Your mileage may vary, of course. I think *nix is far superior for this kind of thing because of the file system, but there are plenty of options; your distro of choice will be acceptable.
Next, we need a server to store and serve our language models. No, we are not going to use remote services; this is something I want to be able to disconnect from the Internet and use. It is designed to be a solo operation, so we need our own LLM services. Fortunately for me, there is Ollama. The good news is that installing it is just:
curl -fsSL https://ollama.com/install.sh | sh
So no complaints there. Now we have our own delivery service for LLMs, and we can use the command line to grab models from the open library. "What," you ask, "there are free models?" You bet. You won't get GPT5.3Codex (that joke is probably out of date already), but there is a lot available, and some of them are even ethically constructed, so do your research. But we aren't stuck here. All Ollama does is provide access to the models in all of the expected ways. We need a chat interface.
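As a sketch of the command-line flow (mistral here is just an example name; browse the Ollama library for what's current, and check each model's license before you commit):

```shell
# Grab a model from the Ollama library (the name is an example;
# the library changes fast, so browse it for current options).
ollama pull mistral

# See what is installed locally.
ollama list

# Fire a one-off prompt from the command line.
ollama run mistral "Explain the principle of least privilege in two sentences."

# Ollama also serves a local REST API (port 11434 by default),
# which is what chat front ends talk to.
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Summarize input validation in one paragraph.",
  "stream": false
}'
```

That last call is the piece the rest of this stack builds on: anything that can speak HTTP to localhost can use the model.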
OpenWebUI enters the chat. This extremely generically named tool is the ChatGPT or Le Chat for our instance of Ollama. There aren't enough ways to describe how awesome this open source project is. It is large, complex, and it works. It is well documented and the core team is remarkable. Throw them some money if you can.
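The usual way to stand it up is the Docker image from the project's README; roughly this (verify the flags against the current docs, since they do change):

```shell
# Run OpenWebUI in a container, persisting its data in a named volume
# and letting it reach an Ollama instance running on the host machine.
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

With that port mapping, the chat interface lives at http://localhost:3000.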
To focus OpenWebUI's chat, we need to give it a system prompt that provides a personality. I have mine here, and you are welcome to steal it as a template - I stole VSCode's as a template, so fair's fair. Steal and steal alike, amirite? You can do a lot with Soul files, and when I write about AI Consoles later I'll put up some information on the Claude and Mistral Soul structures, and how to try to bend the model to your needs, at least in the chat.
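To give a flavor of the shape of such a prompt (this is a made-up skeleton, not my actual file; the template linked above is the real thing):

```text
You are Chiron, an application security research assistant.

- Ground every answer in the attached references before relying on
  general knowledge.
- When you describe a vulnerability, always include remediation steps.
- If you are not certain, say so rather than inventing an answer.
```

Short, declarative rules like these are about all the steering you get at the chat layer, so make them count.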

But anyway, with OpenWebUI we can specify a model from our Ollama instance and then create a Workspace, which is OpenWebUI's interface for retrieval-augmented generation (RAG). More or less, what I have done here is given Chiron (the persona from that system prompt) all of my books, everything I have written, and all of the OWASP Cheat Sheets, the Testing Guide, and the ASVS. This has generated the POINT Application Security Intelligence.
It's not that terribly intelligent
I mean, it does exactly what I want it to do the vast majority of the time, but it will also do all of the things you have certainly heard the stories about. I give it seven vulnerabilities and ask it to build descriptions and remediation steps, and it will do three. Ta daaa! "No," I say in the chat, "there were seven; where are the other four?" "You are absolutely right, I only handled three of those! Let me do the other four."
And then it does two more. Ta daaaaa!
So there is still a way to go. That's just a small, obvious example; there are subtle failures too. I asked it to help with a code example for a remediation step, and while it did remediate the vulnerability, it introduced a new one. There is unquestionably more work to be done.
But it does do one thing extremely well: everything stays local. For instance, I am clearly not going to load a client's source code into Mistral for analysis. It's a violation of the NDAs I share with my clients, and I am just not like that. But I can load it into Chiron for analysis, because it never leaves my office. All in all, a worthwhile project that I recommend for everyone.