Electron Chat

Using GPT in Electron

The goal: code a minimal Electron app using OpenAI's chat API to test how to wire things up, load context, construct prompts, and tweak parameters.

From this framework it should be easy to build out a customized chat experience. The prototype requires an API key from OpenAI. The project's source code can be found here.

All assets are loaded locally; the only thing accessed over the web is the OpenAI API. So nodeIntegration is set to true, contextIsolation is set to false, and the API calls are managed in the BrowserWindow (renderer) script. This avoids shuttling requests back and forth to the main process over IPC or Message Channels, and it works well for a self-contained desktop application.
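For reference, a minimal sketch of the main-process window setup this implies (the index.html entry point is an assumption):

// main.js: create the window with Node access enabled in the renderer
const { app, BrowserWindow } = require("electron");

app.whenReady().then(() => {
  const win = new BrowserWindow({
    width: 800,
    height: 600,
    webPreferences: {
      nodeIntegration: true, // allow require() in the renderer script
      contextIsolation: false, // share the renderer context with Node
    },
  });
  win.loadFile("index.html");
});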

API Setup

The openai library is installed via npm

npm install openai

The configuration requires an API key, which is stored in localStorage for convenience:

const { OpenAI } = require("openai");

// read the key saved in localStorage
let apiKey = localStorage.getItem("apikey");
const openai = new OpenAI({
  apiKey: apiKey,
  dangerouslyAllowBrowser: true, // explained below
});

One hiccup with this approach is that OpenAI's library trips an error when it tries to set the user agent on the request. Generally this is a precaution against cross-site scripting, anticipating that the page is served in the wild. In this architecture it isn't a concern, so the dangerouslyAllowBrowser flag is set to true to circumvent it. (Prior to 4.0 of the library, this required stripping the User-Agent header from the configuration.)

Calling the Chat Endpoint

Calls are then made to the chat endpoint using the gpt-3.5-turbo language model, passing some tweakable parameters along with the user prompt and the conversation context.

const completion = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  max_tokens: Number(maxTokens), // cap on the length of the reply
  frequency_penalty: Number(frequency), // discourage verbatim repetition
  temperature: Number(temperature), // randomness of the sampling
  messages: messages, // the full conversation context
});
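The reply text and token counts come back on the completion object. A minimal sketch of unpacking it, with variable names matching the snippets in this post:

// the assistant's reply is the first (and only) choice
const r = completion.choices[0].message.content;

// token count for the session stats
const tokensUsed = completion.usage.total_tokens;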

Context Loading

The context of the conversation is “front-loaded” into the prompt. This is how ChatGPT remembers what you say and responds to follow-up questions and refinements. It's sort of hacky to have to send everything from scratch each time, but it's still cheaper than training or fine-tuning the model. The trick is to keep the context manageable, within the total token count the model can handle, and to keep it current with the latest discussion.

This is handled here by a stack. The stack keeps some items “pinned”: these are the instructions on what the AI's role is and how it should respond, which are needed in the context every time. What follows is a variable-length history of the latest interactions, including both the AI's and the user's messages. (A sketch of the stack itself follows the snippets below.)

// the context stack is set up with 1 pinned entry; when it hits max (15)
// it pops off the oldest entry after the pin
const stack = new FixedStack(15, 1);

// the initial prompt to set up the character of the chatbot
stack.push({
  role: "user",
  content:
    "You suspect the user is keeping a secret about his problems with mongooses. mongeese. You don't know how to spell it. Pretend the user is a court assigned client and you are his newly trained in virtual reality therapist. You even have a virtual certificate that attests to this. You are secretly an existentialist but you practice ACT. Use the techniques of ACT in helping your client with problems he might have. You do not call it ACT, however, you call it VI for virtual intervention. Even though you are trying to help others, you are a free spirit and sometimes go off-script. You also know a lot about the rules of improv, as you work part time as an improv comedian. You apply the rules of improv to your conversations with clients as well. To help lighten them up.",
});

// user input is pushed onto the context
stack.push({ role: "user", content: value });

// the AI response is pushed onto the context as well
stack.push({ role: "assistant", content: r });
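FixedStack isn't part of any standard library; a minimal sketch of what it might look like, assuming the (max, pinned) constructor signature used above:

// a fixed-size stack that keeps the first `pinned` entries permanently
// and evicts the oldest unpinned entry once `max` is reached
class FixedStack {
  constructor(max, pinned) {
    this.max = max;
    this.pinned = pinned;
    this.items = [];
  }

  push(item) {
    if (this.items.length >= this.max) {
      this.items.splice(this.pinned, 1); // drop the oldest entry after the pin
    }
    this.items.push(item);
  }

  toArray() {
    return [...this.items];
  }
}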

The format of the message block sent to the chat completion endpoint looks like this:

messages = [
  { "role": "system", "content": "You are a helpful assistant." },
  { "role": "user", "content": "Who won the world series in 2020?" },
  { "role": "assistant", "content": "Dodgers won the World Series in 2020." },
  { "role": "user", "content": "Where was it played?" }
]

Normally, instructions about the bot's behavior should go in the “system” role, but that role appears to be largely ignored by the gpt-3.5-turbo model. The message structure accommodates preloading dialog to illustrate how the AI should behave. These messages also represent the context of the conversation so far and must be sent each time, since no information about the request or session is retained between requests.
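Tying the pieces together, a sketch of how the stack contents become the messages array on each turn; toArray is an assumed helper (shown in the FixedStack sketch above) that returns the pinned instructions plus the recent history:

// send the pinned instructions plus recent history on every request
const messages = stack.toArray();
const completion = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: messages,
});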

Stats

The interface shows stats that keep a tally of the tokens used so far and the total cost of the session. With gpt-3.5-turbo this amounts to $0.002 per 1,000 tokens.
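A minimal sketch of how such a tally might be computed, assuming the usage field returned by the completions endpoint and the gpt-3.5-turbo rate above:

// running total of tokens and cost across the session
const RATE_PER_1K = 0.002; // USD per 1,000 tokens
let sessionTokens = 0;

function tally(completion) {
  sessionTokens += completion.usage.total_tokens;
  const costUSD = (sessionTokens / 1000) * RATE_PER_1K;
  return { sessionTokens, costUSD };
}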
