Comparing Prompt Results - A Rose By Any Other Name

You might want to test an expected response from a prompt sent to a large language model (LLM), but string comparisons will not help you. The inherent variability in LLM responses will require you to find new ways to compare generated prompt results. There are a few reasons why a generated prompt result will not exactly match a prior result: the prompt itself may have changed, the model parameters may have changed, or the model’s inherent variability may inject a small amount of change in the results. ...

August 20, 2024 · 9 min · Michael OShea

Scaling OpenAI With AsyncOpenAI

As I stood outside and looked at the neighborhood wasteland that post-July 4th left behind, the whiff of gunpowder still hanging in the air, I felt a burst of good neighbor energy flow through me, so I grabbed a broom. Sweeping up the street gave me time to think about the other chores I had for the day, including the writing of a new blog post, and I began to wonder how I could use ChatGPT to help me speed some things up. ...

July 7, 2024 · 7 min · Michael OShea

Transformers - Positional Encoding

Since transformer input is processed in parallel rather than serially, it is necessary to encode the relative positions of the input sequence of tokens in some way. The positional encoding in the transformer model uses sinusoidal functions to create a unique encoding for each position. In working through the article on Transformers, as described in the original paper “Attention Is All You Need” by Vaswani et al., the following formulas are used to encode the PE tensor values: ...

May 27, 2024 · 3 min · Michael OShea

Fine-tuning Llama3

Since Llama3 was released, the PyTorch llama3 documentation has a few glitches pointing at configurations in torchtune that are still referencing Llama2. The Meta website is a little more up-to-date, but the documentation is a little light on details. So, I wrote this article to bring everything together.

Prerequisites: You’ll want to use Python 3.11 until Torch compile supports Python 3.12, and I recommend setting up a virtual environment for this using venv or pipenv.

Install torchtune: pip install torchtune

Install EleutherAI’s Evaluation Harness: pip install lm_eval==0.4.*

Download Llama3-8B model: You will need to get access to Llama3 via instructions on the official Meta Llama3 page. You will also need your Hugging Face token setup from here. ...

May 11, 2024 · 7 min · Michael OShea

OpenAI Python API Compatibility

An increasing number of open-source generative AI large language models (LLMs) are being hosted behind an OpenAI API-compatible endpoint or have tools that offer an OpenAI API. The Python library for accessing OpenAI is just a REST client, and the library provides a way to specify the URL and an API key, as well as the model being offered by the provider. Here are a few examples of how the OpenAI library is used with other open-source models. ...

May 9, 2024 · 3 min · Michael OShea