MikesBlog

LLMs At The Command Line - Part 1

If you are a command-line fan and want to experiment with large language models (LLMs), you will love AiChat. There are many popular graphical front ends for working with LLMs, such as OpenAI’s ChatGPT and Anthropic’s Claude, but AiChat is a little powerhouse for CLI lovers, with many advanced and useful features. One of them is an easy-to-use, out-of-the-box Retrieval-Augmented Generation (RAG) feature for searching existing content. I’ve put together a small demo here that shows how easy it can be to use in a pinch. There are many use cases where such an approach is just the right size. ...

Experimenting with Agentic AI Tooling: My Journey Through the Cutting Edge

The first time I fired up an MCP (Model Context Protocol) server plugin, “Agent,” I was excited to see it registered in Claude Desktop but immediately annoyed by the errors that popped up. I didn’t expect a smooth first encounter with the future of Agentic AI, and I didn’t get one: I ran into many configuration tweaks, clunky debugging tools, and broken dependencies along the way. It was a stark reminder that we’re in the early days, and there’s a lot of ground to cover before Agents become seamless collaborators. ...

Navigating the Fragmented Landscape of Agentic AI Tools

Agentic AI, with its promise of creating systems capable of autonomous reasoning and action, has been a hotbed of innovation in the AI community. Tools from OpenAI, LangChain, and Microsoft are spearheading this new wave, each offering unique features and capabilities. However, the lack of standardization in this ecosystem presents significant challenges to developers, researchers, and organizations eager to adopt these technologies.

The Current State of Agentic AI Tools

The diversity of agentic AI tools is both a strength and a weakness. On one hand, it fosters creativity and innovation as developers explore various approaches to building autonomous systems. On the other hand, the fragmented landscape leads to: ...

Classify With Confidence

Large foundation models like GPT can classify text according to a well-crafted prompt instruction, and it’s remarkable how well they can do this, considering there has been no explicit training with labeled datasets. Traditionally, this task has been handled by supervised machine learning models such as logistic regression. However, with generative model classification, we lose the ‘confidence level’: a model like logistic regression provides a probability score for each class, indicating how confident it is in its prediction. This score is not just valuable; it’s essential for decision-making, as it helps users gauge how much to trust a classification. While generative model responses may align well with the intended classification, we don’t directly get an explicit probability for each class. This can be a limitation, particularly in high-stakes applications where knowing the model’s confidence level is crucial. ...
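
To make the idea of a per-class confidence score concrete, here is a minimal sketch. It assumes the model API can return log probabilities for the candidate label tokens, which this excerpt does not confirm is the approach the post takes. The function name `label_confidences` and the log probability values are hypothetical and chosen only for illustration.

```python
import math


def label_confidences(label_logprobs):
    """Normalize per-label log probabilities into scores that sum to 1."""
    # Subtract the largest value before exponentiating for numerical stability.
    top = max(label_logprobs.values())
    weights = {label: math.exp(lp - top) for label, lp in label_logprobs.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Hypothetical log probabilities for the label token of each class
# (illustrative values, not real model output).
example = {"positive": -0.105, "negative": -2.408, "neutral": -4.605}

scores = label_confidences(example)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

The normalized scores play a role similar to the class probabilities from logistic regression, although they are only as reliable as the model's own token probabilities.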

Comparing Prompt Results - A Rose By Any Other Name

You might want to test a response to a prompt against an expected result, but exact string comparisons will not help you. The inherent variability in large language model (LLM) responses requires new ways to compare generated results. There are a few reasons why a generated result will not exactly match a prior one: the prompt itself may have changed, the model parameters may have changed, or the model’s inherent variability may inject a small amount of change in the results. ...
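
As a rough illustration of comparing results by similarity rather than exact equality, the sketch below uses word-level matching from Python's standard `difflib` module. This is a stand-in technique chosen for simplicity; the excerpt does not show which method the post uses, and a lexical measure like this will miss paraphrases that a semantic comparison would catch. The function name `similar_enough` and the default threshold are assumptions.

```python
import difflib


def similar_enough(expected, actual, threshold=0.8):
    """Return (passed, ratio) using word-level similarity between two texts."""
    # Compare lowercased word sequences so casing and spacing differences
    # do not count against the match.
    expected_words = expected.lower().split()
    actual_words = actual.lower().split()
    ratio = difflib.SequenceMatcher(None, expected_words, actual_words).ratio()
    return ratio >= threshold, ratio


passed, ratio = similar_enough(
    "The model returned a short summary of the article.",
    "The model returned a brief summary of the article.",
)
print(passed, round(ratio, 2))
```

The threshold is a tuning knob: set it too high and harmless rewording fails the test, too low and a wrong answer passes.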

Scaling OpenAI With AsyncOpenAI

As I stood outside and looked at the neighborhood wasteland that post-July 4th left behind, the whiff of gunpowder still hanging in the air, I felt a burst of good neighbor energy flow through me, so I grabbed a broom. Sweeping up the street gave me time to think about the other chores I had for the day, including the writing of a new blog post, and I began to wonder how I could use ChatGPT to help me speed some things up. ...