I’m giving a talk soon about terminals and coding agents to folks who work in tech but don’t write application or pipeline code for a living.
Gonna cheat and use a post as my speaker notes.
Terminals
Before we had mice and high-resolution screens you could click around on, we had the Terminal. Originally Terminals were dedicated physical machines, but nowadays we use virtual ones (terminal emulators).
Basic input and output of the computer.
Just about anything you can do pointing-and-clicking, you can do with a terminal.
If yinz ever hear the word “headless” used in tech, it means that the server or computer only has a terminal, no graphical user interface.
We use a Terminal to interact with Computers
File Systems
Everything is one of two things.
It is either a File. Or it is a Directory (Folder).
That’s all, don’t overcomplicate it.
The path to a File or Directory is called a Path.
See, very simple.
The base directory is called Root. Its Path is /. It’s all turtles from there.
We use a Terminal to interact with File Systems
Commands
A Command is an instruction you give the Terminal
Here are some common ones:
pwd or Print Working Directory. This prints the Path to your Working Directory, so it tells you where you are in the File System.
ls or List. It lists the Files and Directories in the current Directory (hidden ones only show up with ls -a). ~ is a shorthand for your Home Directory, which is /Users/your-user/ on macOS.
mkdir or Make Directory. This will make a folder. mkdir folder will make a folder under your current Directory named folder. mkdir ~/folder will do the same thing but in the ~ Directory.
cd or Change Directory.
cd .. will take you up a level,
cd ../.. does two.
cd / takes you to Root.
cd nested takes you to the nested Directory one level lower than your current (requires the folder to exist). Note there is no leading /. cd /nested would look for nested directly under Root,
cd ~ takes you to /Users/your-user/.
cat or Concatenate. This one Command is pretty versatile.
cat filename.txt lets you view a File. Note that we can also do this with more filename.txt and less filename.txt
cat file1.txt file2.txt lets you view multiple Files.
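cat > newfile.txt lets you create a File named newfile.txt from whatever you type next (press Ctrl-D when you’re done). If newfile.txt already exists, it gets overwritten. If you just want an empty File, use touch newfile.txt.
cat >> existingfile.txt lets you append (add to the bottom of) an existing File named existingfile.txt. Same deal: type, then press Ctrl-D.
cat file1.txt > file2.txt lets you copy the contents of file1.txt to file2.txt, overwriting whatever was in file2.txt.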
mv or Move. This lets you move a File: mv file.txt /where/to/move/it. It also renames: mv old.txt new.txt.
open or Open. This is pretty self explanatory: open the File in the default Application set by your OS. This one is macOS; the Linux equivalent is xdg-open.
man or Manual. This is the most important Command. man theCommand opens up the user manual (Documentation) for that Command. So man ls opens the manual for ls.
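Here’s a short session that strings several of those Commands together. Treat it as a sketch, not a recipe: the folder and file names (notes, hello.txt, copy.txt, archive) are made up, and it uses echo, which isn’t in the list above but simply prints text. Everything happens in a throwaway Directory so none of your real Files are touched.

```shell
# Work in a throwaway Directory (mktemp -d creates one and prints its Path).
sandbox="$(mktemp -d)"
cd "$sandbox"

pwd                                # where am I? prints the sandbox Path
mkdir notes                        # make a folder under the current Directory
cd notes                           # step into it (relative Path, no leading /)

echo "first line" > hello.txt      # > creates (or overwrites) a File
echo "second line" >> hello.txt    # >> appends to the bottom
cat hello.txt                      # view the File

cat hello.txt > copy.txt           # copy the contents into a new File
mkdir archive
mv copy.txt archive/               # move copy.txt into the archive folder

ls                                 # shows archive and hello.txt
cd ..                              # back up one level, into the sandbox
```

When you’re done poking around, the sandbox is just a Directory like any other, so you can ls it, cd into it, or ignore it.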
Programs and Applications are fancy Commands.
For example, git is one such fancy Command to do manual version management (no automatic syncing) for your Files and Directories.
Advanced Commands include: curl, jq, grep, find, tail, ssh, and scp. All of those are out of scope.
docker and Containers are definitely out of scope.
We use a Terminal to run Commands on File Systems
LLMs
Large Language Models. AKA a very fancy stats equation (Model).
You start with taking a massive amount of data. Like every piece of text, image, audio, and video on the internet.
Then you use a bunch of GPUs to keep crunching maths and stats on the data till you end up with a function.
What does the function do?
When you give it an input in the form of a question or command, it will return an approximation of what an answer would look like (based on the training and added context data). These are called a Prompt and a Response.
It is a lossy search algorithm with a chat interface, or an Unstructured Query Language.
LLMs are programs that can lossily query and synthesize training data
Tokens
These are our units. The inputs and outputs. Whether it’s text, images, audio, or video. The data are broken up into Tokens. A Token is a chunk of common data.
For example, let’s look at this prompt “is there a pattern to Machine Learning investment, one sentence response”.
Here, each of the words and symbols in the prompt breaks into a Token, so about 12 Tokens.
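If you want to see where that rough count comes from, here’s a naive sketch in the Terminal. It is not a real tokenizer, and real Models split text differently. It just treats every word and every punctuation mark as one piece and counts them. It leans on grep, which is otherwise out of scope for this talk.

```shell
prompt="is there a pattern to Machine Learning investment, one sentence response"

# Naive split: each run of letters/digits is one piece, each punctuation mark is one piece.
# This is only an approximation of the rough count above, NOT how a real tokenizer works.
count=$(printf '%s\n' "$prompt" | grep -oE '[[:alnum:]]+|[[:punct:]]' | wc -l | tr -d ' ')

echo "roughly $count Tokens"   # prints: roughly 12 Tokens
```

That is 11 words plus one comma. A real tokenizer may land on a different number for the same text.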
Here’s a response from Haiku 4.5 “ML investment follows a boom-bust cycle driven by hype cycles, breakthrough moments (transformers, LLMs), and funding availability rather than consistent fundamental metrics.”. Roughly 32 Tokens.
In the output, it’s not one-to-one word-to-token. “transformers” breaks into two Tokens: “transform” and “ers”. “LLMs” breaks into three: “L”, “LM”, and “s”.
This is what our usage limits are tied to.
Every LLM handles Tokens differently. Here’s a tool OpenAI made to count tokens for their Models.
We query LLMs with Tokens
Agents
Agents are what happens when we run an LLM in a loop. Given an input, keep reprompting “yourself” till an output is reached.
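To make the loop concrete, here’s a minimal sketch. There is no real LLM in it: fake_llm is a hypothetical stand-in that tacks a “step” onto whatever it was given and says DONE once it has seen three steps. The point is only the shape, where the output gets fed back in as the next input until a stopping condition is hit.

```shell
# fake_llm is a hypothetical stand-in for a real model call.
# It adds one "step" per call and answers DONE once three steps are present.
fake_llm() {
  case "$1" in
    *"step step step") echo "$1 DONE" ;;
    *) echo "$1 step" ;;
  esac
}

context="task:"      # the starting input
turns=0
max_turns=10         # safety cap so the loop always ends

# The loop: feed the output back in as the next input until the "model" says DONE.
while [ "$turns" -lt "$max_turns" ]; do
  context=$(fake_llm "$context")
  turns=$((turns + 1))
  case "$context" in
    *DONE) break ;;
  esac
done

echo "finished after $turns turns: $context"
# prints: finished after 4 turns: task: step step step DONE
```

The max_turns cap is a choice made for this sketch: a loop with no upper bound keeps spending Tokens until something stops it.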
Human-in-the-Loop means a sufficiently skilled human is validating the outputs and/or tuning.
Remember that processing isn’t free. Text is the most native input for them. PDFs, Word Docs, Slides, and Spreadsheets are harder for the clanker to ingest than Markdown, txt, and CSVs. Code is plain text, which in large part is why coding agents have become so useful for software developers. They are cognitive power tools.
As far as images go, they do a decent job of parsing screenshots, but are far from perfect.
Treat every session as ephemeral. I have strong opinions on using browser chats as state storage for projects. Chats with your terminal agent are transient. If the output or part of the process was important, then export it and/or track it with git.
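One way that export-and-track step could look, as a sketch. The project folder, the file name session-notes.md, and the note text are all made up for illustration. The -c flags set an author for that single commit only; on your own machine your git config usually has one already.

```shell
# Hypothetical project folder, created fresh for this demo.
project="$(mktemp -d)"
cd "$project"
git init --quiet

# Pretend this text is the part of the agent session worth keeping.
echo "Decision: store reports as CSV, not spreadsheets." > session-notes.md

git add session-notes.md
# The -c flags supply an author for this one command only.
git -c user.name="Demo User" -c user.email="demo@example.com" \
  commit --quiet -m "Save notes from agent session"

git log --oneline     # one line per saved snapshot
```

Once it’s committed, the note outlives the chat that produced it.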
Agents are a Command you run in the Terminal to query an LLM with Tokens.
Conclusion
My general rule of thumb for what the clanker can and can’t do reliably.
What it can.
Any task that requires pattern matching plus domain knowledge.
A lot of my backend dev work falls in two buckets.
One. I need to take some input, potentially do some processing, and query a data source in a structured way. Then potentially do some more processing, and return the output.
Two. I need to take some input, potentially do some processing, and hit a Service (a Command with a Web Interface). Then potentially do some more processing, and return the output.
If I can represent the task as a Bash Script, then I can probably have the bot convert it to application code.
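Here’s what bucket one might look like as a small script. It’s a sketch under loud assumptions: the data source is a made-up three-line CSV rather than a real database, lookup_team is a hypothetical name, and it uses tr and awk, which aren’t covered above. The shape is the point: input, processing, structured query, processing, output.

```shell
# Hypothetical data source: a tiny CSV of users. A real one would be a database or an API.
data="$(mktemp)"
printf 'name,team\nada,platform\ngrace,data\n' > "$data"

lookup_team() {
  # 1. take some input and do some processing (lowercase it)
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  # 2. query the data source in a structured way (exact match on the first column)
  team=$(awk -F, -v n="$name" '$1 == n { print $2 }' "$data")
  # 3. more processing, then return the output
  if [ -n "$team" ]; then
    echo "$name is on the $team team"
  else
    echo "no record for $name"
  fi
}

lookup_team "Ada"      # prints: ada is on the platform team
lookup_team "Linus"    # prints: no record for linus
```

A script like this is small enough to read top to bottom, which makes it a decent spec to hand the bot when asking for the application-code version.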
What it can’t.
Tasks that don’t fit a pattern.
Also, pro tip, not every task needs Opus. Sometimes Sonnet or Haiku are more than enough to get it done.