A Common-Sense Guide to AI Engineering
26 May 2026

Want to build an LLM-powered app but don't know where to begin? With this step-by-step guide, you can master the underlying principles of AI engineering by building an LLM-powered app from the ground up. Tame unpredictable models with prompt and context engineering. Use evals to keep them on track. Give chatbots the knowledge to answer anything a user wants to know. Equip agents with the tools and smarts to actually get the job done. By the end, you'll have the intuition and the confidence to build on top of LLMs in the real world.
Fragmented documentation, obsolete tutorials, and frameworks that deliver a prototype but flop in production can make AI engineering feel overwhelming. But it doesn't have to be that way. With real-world code and step-by-step instructions as your guide, you can learn to build robust LLM-powered apps from the ground up while mastering both the how and why of the most crucial underlying concepts.
Harness context engineering and retrieval systems to create AI assistants that understand your proprietary data. Create chatbots that answer organization-specific questions and help solve users' issues. Design agents that conduct research, make decisions, and take action in the real world. Level up your prompt engineering and get an LLM to do your bidding, not its own. Use automated evals to keep constant tabs on your app's quality while setting up guardrails to protect your users and organization. And implement observability systems that make it easy to debug your app when things do go wrong.
With a systematic approach grounded in the core principles of building AI apps for real users, you'll easily evolve and adapt even as the hype and tools come and go.
COMPUTERS / Artificial Intelligence / General, Generative artificial intelligence / generative AI, COMPUTERS / Machine Theory, COMPUTERS / Mathematical & Statistical Software, COMPUTERS / Computer Engineering, COMPUTERS / Programming / General, COMPUTERS / Data Science / Machine Learning, Computer programming / software engineering, Knowledge management
- Iyanuoluwa Ajao, Senior Applied AI Engineer, Dataligence Labs
- Foundations
- HeLLMo, World!
- Signing Up for an LLM-as-a-Service
- Creating Our First App
- Tweaking the Model and Temperature
- Checking API Usage
- Wrapping Up
- Understanding How LLMs Work
- What Is a Large Language Model (LLM)
- Realizing LLMs Are Nondeterministic Creatures
- Gauging the Temperature
- Understanding the Challenges of Nondeterminism
- Wrapping Up
- Diving Deeper into LLMs
- Diving into Tokens
- Diving into Embeddings
- Diving into Fine-Tuning
- Wrapping Up
- Selecting an LLM
- Getting Your Hands on an LLM
- Comparing Different LLMs
- Deciding on an LLM
- Wrapping Up
- Chatbots
- Building a Chatbot
- Getting User Input
- Augmenting the Prompt
- Adding Multi-Turn Dialogue
- Managing State with Memory Systems
- Adding a System Prompt
- Building the Messages Array
- Wrapping Up
- Augmenting a Prompt with Knowledge
- Building a Chatbot
- Augmenting with Knowledge
- Avoiding Context Window Limitations
- Preparing the Data
- Implementing the Knowledge Chatbot
- Running into PACKing Problems
- Wrapping Up
- Efficiently Adding Knowledge with RAG
- Augmenting with Documentation Chunks
- Getting into Search Engines, Retrieval, and RAG
- Searching with Meaning: Keywords Versus Semantics
- Using Embedding-Similarity Search
- Building a Starter Search Engine
- Implementing a RAG Chatbot
- Wrapping Up
- Measuring Quality with Evals
- Introducing Evals
- Setting Up Our App
- Conducting Error Analysis
- Open Coding
- Axial Coding
- Creating an Eval Test Framework
- Running Human Evals
- Wrapping Up
- Prompt Engineering
- Eliminating Ambiguity
- Utilizing the System Prompt
- Rewriting History
- Using Delimiters and Bullet Points
- Reordering Prompt Components
- Wrapping Up
- Reducing Hallucinations
- Understanding Why Our App Hallucinates
- Instructing the LLM to Be Faithful
- Pleading and Threatening
- Upgrading the Model
- Citing Sources and Few-Shot Prompting
- Iterate, Iterate, Iterate
- Chain-of-Thought Prompting
- Final Prompt Engineering Thoughts
- Checking On Our Evals
- Wrapping Up
- Evaluating and Optimizing RAG
- Discovering a RAG Failure
- Evaluating RAG
- Expanding the Query
- Metadata-Based Filtering
- Evaluating RAG Subcomponents
- Dreaming Up an Agentic RAG Wish List
- Wrapping Up
- Agents
- Equipping an LLM with Tools
- Understanding an LLM’s Limitations
- Triggering a Function
- Defining “Agents”
- Feeding Tool Results Back to the LLM
- Building a Website Reader Tool
- Deciding to Use a Tool
- Using the Tools API
- Wrapping Up
- Running the Agent Loop
- Solving a Complex Problem
- Constructing an Agent Loop
- Building a News Podcast Agent
- Exploring Agent Failure Modes and Evals
- Giving the Agent a Plan
- Asking the Agent to Create a Plan
- Wrapping Up
- Architecting Agentic Workflows
- Designing an LLM Assembly Line
- Implementing an LLM Assembly Line
- Weighing Agentic Workflows Against Classic Agent Loops
- Workflow Routing
- Performing Tasks in Parallel
- Wrapping Up
- Enhancing Retrieval with Agentic RAG
- Architecting an Agentic RAG Plan
- Implementing a RAG Agent
- Avoiding Unnecessary RAG
- Generating Structured Outputs
- Researching as an Agent
- Conducting Multi-Hop Research
- Wrapping Up
- Building System-Integrated Agents
- Integrating with Databases
- Reading and Writing
- Writing Safely
- Including a Human in the Loop
- Integrating with Web APIs
- Integrating MCP and Other Third-Party Tools
- Hosting Your Own Tools
- Wrapping Up
- Production
- Setting Guardrails
- Introducing Guardrail Types
- Guarding LLMs with Other Models
- Balancing Guardrail Trade-Offs
- Mitigating Cybersecurity Risks
- Protecting Personally Identifiable Information
- Using Guardrail Frameworks
- Red-Teaming, Evals, and Monitoring
- Wrapping Up
- Observing AI Systems
- Logging All the Things
- Using Observability Tools
- Running Evals in Production
- Monitoring and Alerts
- Gathering User Feedback
- Wrapping Up
- Handling Exceptions
- Understanding Your Errors
- Retrying Requests
- Switching Models
- Falling Back to Semantic Search and Caching
- Fitting in the Context Window
- Aborting Hanging Requests
- Recovering from Tool Failures
- Creating a Fallback Plan
- Wrapping Up
- Automating Evals
- Unit Testing
- Running Reference-Based Evals
- Checking Outputs Deterministically
- Using an LLM-as-Judge
- Running Evals
- Aligning the LLM Judge
- Working with Imperfect Judges
- Wrapping Up
- Final Thoughts