It’s a safe bet that anyone reading this post works with information. You either create it through your business processes, or absorb it to feed those processes. In fact, you probably both create and consume important information constantly, even if you have not thought of information in a producer-consumer way.
Avid users of information technology can point to a handful of key launch or demonstration moments that redefine our sense of what is possible. Douglas Engelbart’s “Mother of All Demos” in 1968 gave the world the graphical user interface. Steve Jobs’ iPhone launch in 2007 let us see a powerful computer in our pocket. Next month will be exactly (and only!) two years since a launch that could be similarly profound: the introduction of ChatGPT.
It is hard to describe the ensuing twenty-four months in Silicon Valley as anything other than a seismic shift for the technology industry. I am a staff engineer here at Halcyon, and have spent more than a decade working in Natural Language Processing. ChatGPT’s anniversary makes me reflect on some of the lessons I’ve learned working with AI in the age of the Large Language Model.
First, the gulf between the haves and the have-nots of AI has simply ballooned, something I have witnessed first-hand in the conversations I have with colleagues at larger orgs like OpenAI and Google. Five years ago those colleagues would often recommend AI solutions that made sense for companies of all shapes and sizes. Nowadays, the same colleagues can blithely suggest resorting to fine-tuning a language model at a price and scale that are unapproachable (or irresponsible) for most startups. And those startups that have chosen to go head-to-head with the foundation model providers are finding it increasingly hard to compete.
What, then, makes AI startups competitive in an age of foundation models? A few things, first among which is domain expertise. That is why we are heavily invested in the deeply unsexy work of gathering data from a wide array of sources (shout out to the developers at the Public Utility Commission of Oregon, who are still jamming nested HTML tables into web pages like it’s 1999!). Pattern-matching across these data corpora has allowed us to build abstractions such as our knowledge graph and internal authoritativeness ranker.
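To make that concrete, here is a minimal sketch of what scraping those nested HTML tables can look like in Python with pandas. The URL and column name are hypothetical illustrations, not Halcyon’s actual pipeline:

```python
import pandas as pd

# pandas.read_html returns every <table> on the page -- including
# tables nested inside other tables -- as separate DataFrames.
url = "https://example.gov/docket-search-results"  # hypothetical page
tables = pd.read_html(url)

# Heuristic filter: keep only tables that look like docket listings.
# "Docket" is an illustrative column name.
dockets = [t for t in tables if "Docket" in t.columns]
print(f"Found {len(dockets)} candidate docket tables")
```

Real-world pages are messier than this, of course; the point is that the hard part is not the parsing library, but knowing which of the extracted tables actually matter.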
In a world in which anyone can stand up a RAG chatbot in a week and index huge amounts of data in a month, it is these abstractions that provide the contextual knowledge necessary to ground raw unstructured data. For example, encoding the temporal sequencing of a list of documents in a graph can allow an LLM to compare a utility’s projections over time. Extracting and saving the ownership structures of a mountain of LLCs can reveal relationships between projects in an interconnection queue. And distinguishing between a small special-interest group and a much larger (or more influential) one can provide more accurately sourced and up-to-date query results. This integration of domain knowledge into an AI pipeline, rather than simple use of the flashiest new AI tech, underpins Halcyon’s value proposition.
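As a rough illustration of these abstractions (a sketch, not our actual schema), a small graph built with networkx can encode both temporal succession and LLC ownership as typed edges; all node and relation names here are invented:

```python
import networkx as nx

g = nx.DiGraph()

# Temporal sequencing: successive forecasts from one utility, so an
# LLM can be asked to compare projections over time.
g.add_edge("forecast_2022", "forecast_2024", relation="superseded_by")

# Ownership structure: a project held through a chain of LLCs.
g.add_edge("Parent Corp", "HoldCo LLC", relation="owns")
g.add_edge("HoldCo LLC", "Solar Project A", relation="owns")

# Walking the graph upward reveals every entity behind a project,
# including the ultimate parent two hops away.
print(nx.ancestors(g, "Solar Project A"))  # {'HoldCo LLC', 'Parent Corp'}
```

The payoff is that a query like “which projects share an ultimate owner?” becomes a graph traversal rather than a hope that the LLM infers it from raw filings.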
Another advantage on which AI startups can compete with foundation models is understanding a customer’s workflow and fitting AI into it. This point may seem obvious (or even banal) but to me, it is elemental. One thing I have observed is the extent to which AI companies keep hitting their heads against the same wall: forcing users to change their way of working. As a software engineer, I find the litany of AI-assisted developer tooling on the market a natural and illustrative example. Writing code is an interactive process: it crystallizes requirements, elucidates edge cases, and sparks follow-on work.
And yet – AI coding tools too often presume that an engineer composes their thoughts first and produces code second. That presumption means these tools may fail to take advantage of the context that emerges from the code-writing process itself. By contrast, the decline of Stack Overflow in the face of ChatGPT illustrates how disruptive LLMs can be when they slot neatly into an existing workflow. Developers often use Stack Overflow to debug already-broken code, something LLMs can do quite well: all relevant context and error messages can be provided upfront, which gives the AI the best chance of producing a helpful answer.
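A minimal sketch of that “context upfront” pattern, using the OpenAI Python client; the model choice and code snippets are illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A buggy snippet and its error message: everything the model needs,
# provided in a single shot.
broken_code = "def mean(xs):\n    return sum(xs) / len(x)"  # 'x' vs 'xs'
error = "NameError: name 'x' is not defined"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "You are a careful debugging assistant."},
        {"role": "user", "content": (
            f"This code:\n{broken_code}\n\n"
            f"fails with:\n{error}\n\nWhy, and how do I fix it?"
        )},
    ],
)
print(response.choices[0].message.content)
```

Because the code and the traceback travel together, the model never has to guess at missing context, which is exactly the situation where LLM debugging shines.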
This lesson of fitting into customer workflows is something we have learned at Halcyon. The recent battles over alternative interfaces at OpenAI and Anthropic suggest the big players have too. A simple query box is a poor fit for many workflows. On one hand, it’s too open-ended: users may not inherently understand the system, through no fault of their own, and that can result in queries that do not map cleanly to our RAG pipeline or the data we have ingested. On the other hand, a query box does not provide some of the table-stakes capabilities that customers enjoy in old-school tools like Microsoft Excel, such as the ability to operate on multiple datasets simultaneously or to control the formatting of outputs.
That is why we’re investing engineering resources in using Halcyon’s core platform to power new capabilities such as batch processing, structured outputs, and targeted notifications: solutions that meet users where they are. I believe that today’s AI technology has outpaced its own user experience (UX), and that startup wins in today’s AI landscape will emerge as the two harmonize.
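As one hedged example of what “structured outputs” can mean in practice (the schema, prompt, and filing here are invented, not Halcyon’s API), OpenAI’s JSON mode constrains a model’s reply to parseable JSON instead of free-form prose:

```python
import json
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",                       # illustrative model choice
    response_format={"type": "json_object"},   # constrain reply to valid JSON
    messages=[{
        "role": "user",
        "content": (
            "Extract the utility name and filing date as JSON with keys "
            "'utility' and 'filed_on' from: "
            "'PacifiCorp filed its IRP update on 2024-03-29.'"  # invented example
        ),
    }],
)
record = json.loads(resp.choices[0].message.content)
print(record["utility"], record["filed_on"])
```

Output that arrives as a predictable record, rather than a paragraph, is what makes downstream features like batch processing and notifications possible.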
I am incredibly excited to be part of the next two years of AI development. But reflecting on technology’s past has reminded me that the utility of future technological advances remains grounded in the same old fundamentals: understanding the domain, and understanding the user. Therein lie usefulness and value, which we drive towards at Halcyon.
Comments or questions? We’d love to hear from you: sayhi@halcyon.eco, or find us on LinkedIn and Twitter.