What makes large language models tick?

Understanding how LLMs work is crucial to getting the most from this powerful form of AI 

GPT-4, the most advanced large language model (LLM) from OpenAI, has enabled the development of professional-grade generative AI tools for several industries, including AI legal assistants such as CoCounsel. This latest generation of AI, more sophisticated and intuitive than ever and designed to respond to natural-language instructions, is increasingly relied upon by lawyers in practice.

But the newest AI is only as useful as its users are skilled. And that skill cannot develop without what we like to call AI literacy. Evaluating and deciding how to apply LLM-powered tools requires at least a fundamental understanding of how LLMs work. In this two-part post, we explore AI prompting for legal professionals, with an emphasis on how to ensure accurate and consistent output in accordance with lawyers’ professional and ethical obligations.

In part one, we discuss how LLMs “think” and why the quality of their output is contingent on the quality of the prompts they receive. In part two, we offer prompting tips for legal professionals to ensure optimal use of LLMs in practice.

What is an AI prompt?

At its most basic, an AI prompt is any instruction or request submitted to an AI. And AI prompting has been part of many people’s daily routines for years. Those requests to Alexa or Siri? Those are AI prompts.

The request is only one of several parts of a prompt, and those parts depend on the type of AI being used. Different types of AI are built with additional engineering to ensure each request is correctly routed to the AI’s specific functions. That engineering, or routing, differs between general-use AI, like OpenAI’s ChatGPT, and specific-use AI (we break down the difference here).

As an example, requests submitted to ChatGPT follow a simple route to GPT-4 (the LLM powering ChatGPT), while requests submitted to specific-use AI like CoCounsel take a more complex route. The request is routed through several discrete functions, such as legal research (which consults a database of case law, regulations, and statutes) or document review (which reads each document provided). As a result, the specific-use LLM receives a much more sophisticated total prompt made up of multiple parts: the request, domain-specific content, and potentially additional back-end prompting.
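To make the idea of a “total prompt” concrete, here is a minimal sketch in Python of how a specific-use tool might assemble one before sending it to the LLM. The function name, variable names, and instructions are hypothetical illustrations, not CoCounsel’s actual implementation.

```python
# A minimal, hypothetical sketch of how a specific-use AI tool might assemble
# a "total prompt" before sending it to the LLM. Names are illustrative only.

def build_total_prompt(user_request: str, retrieved_content: list[str]) -> list[dict]:
    """Combine back-end instructions, domain-specific content, and the user's request."""
    # Back-end prompting: instructions the user never sees.
    system_instructions = (
        "You are a legal research assistant. Answer only from the provided sources "
        "and cite the source for every statement."
    )
    # Domain-specific content, e.g., excerpts returned by a case-law search
    # or the text of documents the user uploaded.
    sources = "\n\n".join(retrieved_content)

    return [
        {"role": "system", "content": system_instructions},
        {"role": "user", "content": f"Sources:\n{sources}\n\nRequest: {user_request}"},
    ]

messages = build_total_prompt(
    "Does the agreement permit assignment without consent?",
    ["Section 12.1: Neither party may assign this Agreement without prior written consent."],
)
print(messages)
```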

How do large language models “think”?

You now know the parts of an AI prompt. But why does the structure of a prompt matter so much when using LLMs? Knowing how LLMs “think”, and knowing their limitations, is key to understanding why the quality of their output depends on the quality of the prompts they receive. This understanding also helps you get the best possible output rather than mediocre, half-useful responses.

Today’s LLMs can perform at a human level on various professional and academic benchmarks (GPT-4 has passed the bar exam), but even the most advanced LLMs are still less capable than humans in important ways. For one, LLMs lack the full abstract reasoning capabilities that humans have.

LLMs are pattern-recognizing and pattern-generating machines that have been trained on billions of data points and can generate novel content as a result of that training (hence the name “generative AI”). LLMs try to predict what a human might conclude, based on the data they were trained on.
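For a rough picture of what that prediction looks like in practice, the sketch below asks a small open-source model (GPT-2, standing in for far larger models like GPT-4) to score candidate next tokens for a prompt, one step of the generation process. The model choice and the prompt are illustrative assumptions.

```python
# Next-token prediction with a small open model (GPT-2) as a stand-in;
# production LLMs are far larger but generate text on the same principle.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The court held that the contract was"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every possible next token

# The model assigns a probability to each candidate next token; show the top five.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode([int(token_id)])!r}: {float(p):.3f}")
```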

It’s precisely for this reason that prompting is so important. If you’ve used AI voice assistants like Alexa or Siri, you know the clarity and specificity of your prompts matter. With LLMs, grammar is a particularly important factor: the wording and punctuation you choose significantly impact the clarity of your prompts. And as lawyers, you already know that failure to use punctuation properly, especially commas, can alter the meaning of language or result in costly ambiguity or misinterpretation.

Let’s say you want to use AI to review a document and find references to Apple. You submit the following prompt: Does the document contain references to apple? 

This prompt may be too ambiguous for an LLM. The LLM cannot discern whether you’re asking it to find references to the fruit or to Apple products such as the iPhone. LLMs are predictive models, and while an LLM might correctly predict what you intended to refer to (the brand Apple), this ambiguity leaves room for inconsistent or inaccurate results.

Now imagine you make a different request: Does the document contain apple pie? The AI understands you’re referring to a dessert, as opposed to the fruit or the brand Apple. You’ve communicated a more specific, complex message that reduces ambiguity. Asking “Does the document contain apple pie recipes?” provides even more context to the AI (that it is searching for a specific recipe in a recipe book).
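If you were calling an LLM directly, the difference might look like the sketch below, which sends both the ambiguous question and a clarified version to a model through OpenAI’s Python SDK. The model name, prompt wording, and placeholder document are illustrative assumptions, not a prescribed workflow.

```python
# A sketch of how prompt specificity changes what the model is asked to do,
# using the OpenAI Python SDK. Model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

document_text = "..."  # the document under review (placeholder)

ambiguous = "Does the document contain references to apple?"
specific = (
    "Does the document contain references to Apple Inc., the technology company "
    "(e.g., iPhone, MacBook, App Store)? Ignore references to the fruit."
)

for prompt in (ambiguous, specific):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Document:\n{document_text}\n\n{prompt}"}],
    )
    print(prompt, "->", response.choices[0].message.content)
```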

These examples illustrate how LLMs think and why the specificity of your prompt matters. There are also limits on context, or how much information an LLM can handle. When humans read, we read letters and words as separate units that are individually assigned a particular meaning. The words, taken together, communicate a more sophisticated message. When LLMs read, they break language down into a series of tokens, which likewise form meaning when taken together.

LLMs can only consider a limited number of tokens at any given time, and a token varies in length, ranging from a single character to a whole word. As a result, LLMs can only handle a limited amount of context. An LLM’s limit on the amount of information it can retain in memory at one time is known as its context window, and the context window can significantly impact the quality of the output you receive.
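To see tokens in action, the short sketch below uses OpenAI’s tiktoken library (with the publicly documented cl100k_base encoding used by GPT-4-family models) to split a sentence into tokens and count them. The sentence is just an example; counts will differ for your own text.

```python
# Splitting text into tokens and counting them with OpenAI's tiktoken library.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-family models

text = "The parties hereby agree to indemnification."
tokens = enc.encode(text)

print(len(tokens), "tokens")
# A token can be a whole word, part of a word, or punctuation:
print([enc.decode([t]) for t in tokens])
```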

Context windows are always in motion. When a context window is full and new tokens (information) arrive, the LLM releases the oldest tokens in the context window to make room for the new ones. When this older information is expelled, it’s completely forgotten by the AI. This limited memory is another of the AI’s limitations, and longer-term recall is yet another human capability the AI lacks.
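A simplified way to picture this rolling behavior is the sketch below, which drops the oldest messages from a conversation until what remains fits a token budget. The budget, messages, and trimming strategy are illustrative assumptions; real systems may manage context differently.

```python
# A simplified model of a rolling context window: when the token budget is
# exceeded, the oldest messages are dropped and effectively forgotten.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def trim_to_context_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent messages that fit within max_tokens."""
    kept: list[str] = []
    total = 0
    for message in reversed(messages):  # walk from newest to oldest
        n = len(enc.encode(message))
        if total + n > max_tokens:
            break  # everything older than this point is forgotten
        kept.insert(0, message)
        total += n
    return kept

history = ["First instruction ...", "Earlier details ...", "Latest refinement ..."]
print(trim_to_context_window(history, max_tokens=12))
```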

This limit on the AI’s ability to retain information affects how you should interact with it, and you should think of your prompts in two categories: requests and refinements. Requests are the initial instructions and queries to the AI. Refinements are follow-up directions, additional details, or corrections that help fine-tune the AI’s output.
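In code, a request followed by a refinement is simply a growing list of messages sent back to the model, as in the illustrative sketch below (again using OpenAI’s Python SDK; the model name and wording are assumptions).

```python
# A request followed by a refinement in the same conversation,
# using the OpenAI Python SDK. Model name and wording are illustrative.
from openai import OpenAI

client = OpenAI()
messages = [
    # The request: the initial instruction.
    {"role": "user", "content": "Draft a short engagement letter for a new client."},
]

first = client.chat.completions.create(model="gpt-4", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# The refinement: a follow-up direction that fine-tunes the output.
messages.append({"role": "user", "content": "Shorten it to one page and add a fee schedule section."})
revised = client.chat.completions.create(model="gpt-4", messages=messages)
print(revised.choices[0].message.content)
```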

The golden rule for writing AI prompts

Humans are capable of translating imprecise instructions into actionable directions (think back to the Apple example). They also benefit from longer-term memory, or a much larger context window. AI has a much shorter context window, but a far more expansive knowledge base than any human.

An LLM won’t perform well in the face of ambiguity. It requires specific instructions within its comparatively limited context window. For this reason, it’s important to be intentional and write clear, unambiguous questions to get accurate output. The number and type of words you include affect your results, too.

If you take away one rule about AI prompting, it should be this: How you write your request will determine how the prompt is interpreted. 

Today’s unprecedented AI is unlike most traditional technology in that it knows how to read, comprehend, and write, and it can learn. Learning how to prompt an LLM is analogous to getting to know a new colleague. LLMs need our human experience and direction, in the form of effective instructions and queries, to produce the right results.

In our next post in this series on AI literacy, we share specific AI prompting techniques for lawyers.

Featured posts

Draft Correspondence

Rapidly draft common legal letters and emails.

How this skill works

  • Specify the recipient, topic, and tone of the correspondence you want.

  • CoCounsel will produce a draft.

  • Chat back and forth with CoCounsel to edit the draft.

Review Documents

Get comprehensive answers to your questions about a set of documents.

How this skill works

  • Upload a set of documents and enter a question or describe the information you’re looking for.

  • CoCounsel reads each document provided and returns an answer with explanations and supporting excerpts from the documents.

Legal Research Memo

Get answers to your research questions, with explanations and supporting sources.

How this skill works

  • Enter a question or issue, along with relevant facts such as jurisdiction, area of law, etc.

  • CoCounsel will retrieve relevant legal resources and provide an answer with explanation and supporting sources.

  • Behind the scenes, Conduct Research generates multiple queries using keyword search, terms and connectors, Boolean, and Parallel Search to identify on-point case law, statutes, and regulations; reads and analyzes the search results; and outputs a summary of its findings (i.e., an answer to the question), along with the supporting sources and applicable excerpts.

Prepare for a Deposition

Get a thorough deposition outline in no time, just by describing the deponent and what’s at issue.

How this skill works

  • Describe the deponent and what’s at issue in the case, and CoCounsel identifies multiple highly relevant topics to address in the deposition and drafts questions for each topic.

  • Refine topics by including specific areas of interest and get a thorough deposition outline.

Extract Contract Data

Ask questions of contracts that are analyzed in a line-by-line review.

How this skill works

  • Upload a set of contracts and a set of questions.

  • CoCounsel will answer each question for each contract or, if a question is not relevant to a contract, note that as well.

  • Upload up to 10 contracts at once

  • Ask up to 10 questions of each contract

  • Relevant results will hyperlink to identified passages in the corresponding contract

Contract Policy Compliance

Get a list of all parts of a set of contracts that don’t comply with a set of policies.

How this skill works

  • Upload a set of contracts and then describe a policy or set of policies that the contracts should comply with, e.g. "contracts must contain a right to injunctive relief, not merely the right to seek injunctive relief."

  • CoCounsel will review your contracts and identify any contractual clauses relevant to the policy or policies you specified.

  • If there is any conflict between a contractual clause and a policy you described, CoCounsel will recommend a revised clause that complies with the relevant policy. It will also identify the risks presented by a clause that does not conform to the policy you described.

Summarize

Get an overview of any document in straightforward, everyday language.

How this skill works

  • Upload a document, e.g., a legal memorandum, judicial opinion, or contract.

  • CoCounsel will summarize the document using everyday terminology.

Search a Database

Find all instances of relevant information in a database of documents.

How this skill works

  • Select a database and describe what you're looking for in detail, such as templates and precedents to use as a starting point for drafting documents, or specific clauses and provisions you'd like to include in new documents you're working on.

  • CoCounsel identifies and delivers every instance of what you're searching for, citing sources in the database for each instance.

  • Behind the scenes, CoCounsel generates multiple queries using keyword search, terms and connectors, Boolean, and Parallel Search to identify on-point passages from every document in the database; reads and analyzes the search results; and outputs a summary of its findings (i.e., an answer to the question), citing applicable excerpts in specific documents.
