Multi-Cloud AI Practitioner Hub · v2.0

The GenAI Practitioner
Knowledge Platform

From AI project lifecycle fundamentals to certification mastery — a community-built resource for practitioners across all cloud platforms.

☁ AWS ☁ Azure ◉ Google Cloud □ Oracle OCI □ VMware □ HPE
📚
Practitioner
Learning & cert prep
AI Developer
Building GenAI apps
🏗
AI Architect
Designing solutions
📊
Data Engineer
Pipelines for AI
🔒
Security Eng.
AI governance
🎯
AI Leader
Strategy & decisions
⚠ Educational Disclaimer
Independent educational resource — not affiliated with Amazon Web Services, Microsoft, Google, Oracle, VMware, or HPE. All vendor names and certifications are trademarks of their respective owners. Content for study purposes only; always verify with official vendor documentation. Not for redistribution.
Multi-Cloud AI Practitioner Platform · Technology Agnostic

Build, Learn & Certify in Generative AI

A practitioner-built knowledge platform covering the complete AI project lifecycle, certification pathways across AWS, Azure, and Google Cloud, interactive mind maps, real-world case studies, and curated learning resources — cloud-agnostic by design.

8 Lifecycle Phases
5 Exam Domains
6+ Mind Maps
12 Case Studies
40+ Resources
6 Cloud Platforms
AI project lifecycle — click any phase to explore
Phase 1
Scoping & Strategy
Business case, ROI
Phase 2
Data & Foundation
FM selection, data
Phase 3
Architecture
RAG, agents, vectors
Phase 4
Development
Build, prompt, pipeline
Phase 5
Safety & Governance
Guardrails, compliance
Phase 6
Testing & Evaluation
RAGAS, model eval
Phase 7
Deployment
Provisioning, CI/CD
Phase 8
Ops & Improve
Monitor, optimize
Certification pathways — select your cloud
PRIMARY · AIP-C01
AWS GenAI Developer — Professional
The flagship AI developer cert. 75 Qs · 204 min · pass 750. Fully covered with domain deep-dives, mind maps, case studies and resources.
Active 2026 · Covered here
PREREQUISITE · AIF-C01
AWS AI Practitioner
Foundational GenAI concepts. Recommended before AIP-C01. No hands-on prerequisite required.
Associate level
COMPLEMENTARY · MLS-C01
AWS ML Specialty
SageMaker, model training, MLOps. Content planned.
Azure AI content coming soon
AI-900 · AI-102 · Azure OpenAI · Azure AI Studio
Google Cloud AI content planned
Professional ML Engineer · Vertex AI · Gemini API
Soon
Oracle OCI AI
OCI Generative AI Professional · AI Foundations Associate
Soon
VMware Private AI
Private AI Foundation with NVIDIA · VCF AI workloads
Soon
HPE GreenLake AI
HPE AI Essentials · ML Dev Environment · Ezmeral
Featured mind maps — live now
D1 · INTERACTIVE MAP
Domain 1 — FM Integration & RAG
Full radial mind map. 9 topic clusters, 60+ nodes, exam tips, code demos, animated walkthroughs. Dark sci-fi UI.
Live · Pan & zoom · Exam tips
DECISION MAP
RAG vs Fine-Tuning Decision Guide
Interactive decision framework. Compare table (7 dimensions). Scenario walkthroughs. Critical AIP-C01 topic.
Live · Compare table · Scenarios
About this platform
Start with the AI Project Lifecycle for the big picture, then dive into certification domain deep-dives, use interactive mind maps for visual learning, study real-world case studies, and tap the curated resources section for courses, conferences, and communities. Community-built and practitioner-maintained.
Technology Agnostic Framework

AI Project Lifecycle

The end-to-end framework for delivering production AI systems — from business case to continuous improvement. Applicable across AWS, Azure, Google Cloud, and on-premises. Click any phase to expand.

Phase 1
Scoping & Strategy
Business case, ROI
Phase 2
Data & Foundation
FM selection, data
Phase 3
Architecture
RAG, agents, vectors
Phase 4
Development
Build, prompt, pipeline
Phase 5
Safety & Governance
Guardrails, compliance
Phase 6
Testing & Evaluation
RAGAS, model eval
Phase 7
Deployment
Provisioning, CI/CD
Phase 8
Ops & Improve
Monitor, optimize
Phase 1 — Scoping & Strategy
Define the problem worth solving. Identify AI-suitable tasks: pattern recognition, content generation, semantic search, classification, anomaly detection. Build the business case — quantify current pain, estimate AI ROI, assess risks. Select a use-case tier: Quick Win (80% confidence, ~4 weeks) vs Strategic (6-12 months). Document success metrics before building anything. Tools: AWS Well-Architected GenAI Lens, Azure AI Adoption Framework, Google Cloud AI maturity model.
Business case · Use-case selection · Success metrics · AWS Well-Architected GenAI Lens
Phase 2 — Data & Foundation Model Selection
Audit your data: structured, unstructured, multimodal. Assess quality, volume, recency and sensitivity. Choose FM tier: speed+cost (Haiku, GPT-3.5-turbo), quality+reasoning (Claude Sonnet, GPT-4o), compliance (Titan, Azure Government). Evaluate whether custom fine-tuning is needed vs prompting vs RAG. Identify sensitive data requirements: PII, PHI, GDPR, CCPA.
Amazon Bedrock · SageMaker JumpStart · Vertex AI Model Garden · Data quality audit
Phase 3 — Architecture Design
Choose the right architecture pattern: Direct FM (simple Q&A), RAG (company knowledge, prevent hallucination), Agentic (multi-step tasks), Multi-model routing (mixed complexity). For RAG: hierarchical chunking (parent 800T + child 200T), vector DB (OpenSearch/pgvector/MemoryDB), hybrid BM25+ANN search, Bedrock Reranker. Document all decisions as Architecture Decision Records (ADRs).
Bedrock Knowledge Bases · OpenSearch · Bedrock Agents · Hierarchical chunking
Phase 4 — Development
Build the data pipeline: validate → Textract (scanned PDFs) → Glue Data Quality → Comprehend (entity enrichment) → normalize → index into vector store. Develop prompt templates with chain-of-thought, few-shot, and structured JSON output. Implement agent action groups with OpenAPI schemas. Build streaming APIs (Bedrock Streaming + WebSocket ~150ms first token) and async patterns (SQS→Lambda→S3→SNS) for long jobs.
Lambda · Step Functions · Bedrock Prompt Management · SQS + SNS
Phase 5 — Safety, Security & Governance
Bedrock Guardrails (6 filters: Content, PII Redaction, Grounding Check, Prompt Attacks, Denied Topics, Word Filters). Secure network: VPC Endpoints — no public internet for Bedrock. Encrypt with CMK. Store secrets in Secrets Manager. IAM least-privilege on agent roles. Audit logging: CloudTrail + Audit Manager for HIPAA/SOC2 evidence. Macie scans S3 training data for PII (pre-ingestion).
Bedrock Guardrails · VPC Endpoints · KMS CMK · Audit Manager · Macie
Phase 6 — Testing & Evaluation
RAGAS evaluation: Faithfulness (hallucination → "ONLY from context" + Guardrails Grounding Check), Answer Relevancy (off-topic → tighten prompt), Context Recall (retrieval failure → hybrid search), Context Precision (noise → Bedrock Reranker). Bedrock Model Evaluation to compare FMs on your custom dataset. Build 200 golden prompt-answer pairs. Amazon A2I for human review of subjective quality.
Bedrock Model Evaluation · Amazon A2I · CloudWatch Synthetics · RAGAS metrics
Phase 7 — Deployment
Deployment modes: On-Demand (variable traffic), Provisioned Throughput (≥40% sustained — eliminates throttling), SageMaker Async (>60s jobs), Batch Inference (70% cheaper for offline). Build MLOps CI/CD: CodePipeline → evaluate → SageMaker Model Registry → Synthetics gate → canary → prod. Bedrock Cross-Region Inference profiles for automatic regional failover without code changes.
Bedrock Provisioned TP · SageMaker Model Registry · CodePipeline · Cross-Region Inference
Phase 8 — Operations & Continuous Improvement
Monitor: X-Ray (per-component latency — fix biggest bottleneck first), CloudWatch InvocationLatency P99 (SLA breach signal), InvocationThrottles (rising = need Provisioned TP). SageMaker Model Monitor for output quality drift → alarm → rollback via Model Registry (3-5 min). Cost hierarchy: Prompt Caching (90% savings) → Model Routing (AppConfig) → Batch → Provisioned TP. Run RAGAS on production samples weekly.
AWS X-Ray · CloudWatch · SageMaker Model Monitor · Bedrock Prompt Caching · AppConfig
Lifecycle to AIP-C01 domain mapping
Phase 2-3 → D1 (31%) FM Integration & RAG · Phase 4 → D2 (26%) Implementation · Phase 5 → D3 (20%) Safety & Governance · Phase 8 cost → D4 (12%) Optimization · Phase 6 → D5 (11%) Testing & Validation
AWS Certified · Professional Level · 2026 Edition
Generative AI Developer — Professional · AIP-C01
Validates practical knowledge of implementing GenAI solutions in production using AWS. Covers FM selection, RAG architecture, agentic AI, security, optimization, and evaluation.
75 Questions · 204 Minutes · Pass: 750/1000 · 2+ yrs AWS exp. · 1+ yr GenAI · 65 scored + 10 unscored
↗ Official Exam Page · ↓ Download Exam Guide PDF · ↗ AWS Skill Builder Prep
The 5 exam domains — click to study
D1 · 31% 20-23 questions
Foundation Model Integration, Data Management & Compliance
GenAI architecture, FM selection, fine-tuning (LoRA/PEFT), data pipelines, vector stores, RAG, hybrid search, prompt governance.
Bedrock KB · OpenSearch · RAG · LoRA
D2 · 26% 17-20 questions
Implementation & Integration
Bedrock Agents, Strands, Agent Squad, MCP, deployment modes, enterprise integration, streaming/async, Q Business, Q Developer.
Bedrock Agents · Step Functions · MCP
D3 · 20% 13-15 questions
AI Safety, Security & Governance
Guardrails (6 filters), prompt injection defense, VPC, KMS CMK, Secrets Manager, Macie, Audit Manager, Clarify SHAP, A2I.
Guardrails · KMS · Macie
D4 · 12% 8-9 questions
Operational Efficiency & Optimization
Cost hierarchy (Prompt Caching → Routing → Batch → Provisioned TP), X-Ray, CloudWatch P99, Model Monitor.
Prompt Caching · X-Ray · AppConfig
D5 · 11% 7-8 questions
Testing, Validation & Troubleshooting
RAGAS (4 metrics + fixes), Bedrock Model Evaluation, golden test sets, Synthetics quality gates, agent trace debugging.
RAGAS · Model Eval · Synthetics
TOOLS
Mind Maps & Visual Learning
Interactive dark sci-fi mind maps. D1 and RAG vs FT live now. D2-D5 coming soon.
D1 Live · RAG vs FT Live
Domain 1 · AIP-C01 · Highest Weight
Foundation Model Integration,
Data Management & Compliance
GenAI architecture, FM selection & fine-tuning, data validation pipelines, vector stores, RAG retrieval, and prompt governance.
31% Exam weight
20-23 Questions
Task 1.1-1.2 — Architecture design & model selection
1.1.1 · GenAI architecture design — FM selection, integration patterns, AppConfig routing
Pattern-match: "employees ask questions about company docs" → RAG. "Multi-step autonomous task" → Bedrock Agents. "Mixed complexity queries" → multi-model routing via AppConfig. AppConfig stores routing rules — Lambda reads at runtime — NO redeploy to switch models (key exam distinction from environment variables).
Amazon Bedrock · AWS Lambda · API Gateway · AWS AppConfig
CRITICAL: "Switch models without code modification" → AWS AppConfig ALWAYS. Most frequently tested pattern in D1.
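The AppConfig-over-redeploy tip can be sketched in a few lines. The routing document below is a hypothetical schema (AppConfig itself stores free-form JSON/YAML — the `default`/`rules` shape is an assumption, not an AWS format), and the model IDs are examples; the point is that a Lambda reads the document at runtime, so editing it switches models with no redeploy.

```python
import json

# Hypothetical routing document as it might be stored in AWS AppConfig.
# The schema (default/rules/when_complexity) is illustrative; a Lambda
# would fetch this at runtime instead of hard-coding model IDs.
ROUTING_DOC = json.dumps({
    "default": "anthropic.claude-3-haiku-20240307-v1:0",
    "rules": [
        {"when_complexity": "high",
         "model": "anthropic.claude-3-5-sonnet-20240620-v1:0"},
    ],
})

def pick_model(routing_json: str, complexity: str) -> str:
    """Return the model ID matching the query's complexity tier."""
    doc = json.loads(routing_json)
    for rule in doc["rules"]:
        if rule["when_complexity"] == complexity:
            return rule["model"]
    return doc["default"]
```

Swapping Sonnet for another model means editing the stored document, not the Lambda code — the exam's "no code modification" distinction.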
1.2.3 · Resilient AI — Cross-Region Inference & Step Functions circuit breaker
Bedrock Cross-Region Inference profiles: automatic regional failover without code changes. Step Functions circuit breaker: wraps agent invocation with timeout states, max iterations, and fallback transitions. Exponential backoff for 429 ThrottlingExceptions.
Bedrock Cross-Region Inference · AWS Step Functions · CloudWatch
Exam tip: "Regional failover, no code" → Cross-Region Inference profiles. "Circuit breaker for agentic loop" → Step Functions timeout state → catch → fallback.
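A minimal full-jitter backoff sketch for retrying 429 ThrottlingExceptions. The AWS SDKs implement retry with backoff internally; this function only illustrates the delay schedule, with illustrative base and cap values.

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 0.5,
                   cap: float = 8.0) -> list[float]:
    """Full-jitter exponential backoff: attempt n sleeps a random time in
    [0, min(cap, base * 2**n)]. Jitter spreads retries so throttled
    clients don't all hit the service again at the same instant."""
    return [random.uniform(0.0, min(cap, base * 2 ** n))
            for n in range(max_retries)]
```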
1.2.4 · FM customization — LoRA (90% cheaper), PEFT adapters, continued pre-training
LoRA: trains tiny adapter matrices (1-5% of params), 90% cheaper than full fine-tuning. Multiple LoRA adapter sets from one base model = N product variants. Continued pre-training: for domain vocabulary FIRST, then fine-tune on tasks. SageMaker Model Registry: versioning + approval gates + rollback.
SageMaker AI · SageMaker Model Registry · LoRA / PEFT · S3
Decision tree: "Brand voice/format" → fine-tune (LoRA). "Domain vocab not recognized" → continued pre-training FIRST. "N variants from one base" → N LoRA adapter sets. NOT for knowledge injection (use RAG for facts).
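The "tiny adapter matrices" claim is checkable arithmetic. For one d×k weight matrix, full fine-tuning touches d·k parameters, while a rank-r LoRA adapter trains only two small matrices (d×r and r×k). The 4096×4096 layer below is an illustrative size, not any specific model:

```python
def lora_fraction(d: int, k: int, rank: int) -> float:
    """Fraction of a d×k matrix's parameters a rank-r LoRA adapter trains:
    the adapter is two small matrices, d×r and r×k, i.e. r*(d+k) params."""
    return rank * (d + k) / (d * k)

# A 4096×4096 projection with a rank-8 adapter trains well under 1%:
frac = lora_fraction(4096, 4096, 8)
```

Effective rank and which layers get adapters vary in practice, which is why the document's 1-5% figure is higher than this single-matrix number.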
Task 1.3 — Data validation & processing pipelines
1.3.1 · Data quality workflows — Textract, Glue Data Quality, Comprehend, Macie
Pipeline: S3 → Textract (scanned PDFs, tables) → Lambda (validate) → Glue Data Quality (null/duplicate rules) → Comprehend (entity enrichment) → normalize → Bedrock KB. Macie: PII in S3 training data (pre-ingestion). Different from Guardrails PII (post-generation FM output).
AWS Glue Data Quality · Amazon Textract · Amazon Comprehend · Amazon Macie
Macie vs Guardrails: Macie → PII in S3 training data (BEFORE the FM). Guardrails PII → FM output (AFTER generation). Exam loves this distinction.
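A toy stand-in for the validation step — real pipelines express these as Glue Data Quality rules (DQDL), but the checks reduce to things like "no null required fields, no duplicate document IDs" before records reach the knowledge base. The function and its report shape are illustrative:

```python
def quality_report(records: list[dict], required: list[str], key: str) -> dict:
    """Flag null/missing required fields and duplicate keys — the kind of
    rule Glue Data Quality would enforce pre-ingestion (illustrative only)."""
    null_violations = sum(
        1 for r in records if any(r.get(f) in (None, "") for f in required)
    )
    seen: set = set()
    duplicates = 0
    for r in records:
        if r[key] in seen:
            duplicates += 1
        seen.add(r[key])
    return {"total": len(records),
            "null_violations": null_violations,
            "duplicates": duplicates}
```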
Task 1.4 — Vector store solutions
1.4.1 · Vector DB selection — OpenSearch, pgvector, MemoryDB, Neptune
OpenSearch: hybrid BM25+ANN (most tested, Bedrock KB default). Aurora pgvector: existing PostgreSQL — no migration. MemoryDB: sub-millisecond vector cache. Neptune Analytics: entity graphs + vector queries. CRITICAL: same embedding model for docs AND queries — model mismatch = incompatible vector spaces = garbage results.
Amazon OpenSearch · Aurora pgvector · Amazon MemoryDB · Amazon Neptune
Selection: "Hybrid BM25+ANN" → OpenSearch. "Existing PostgreSQL" → pgvector. "Sub-ms vector cache" → MemoryDB. "Entity graph + vector" → Neptune Analytics.
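The same-embedding-model rule can be made concrete with a tiny in-memory index that tags vectors with the model that produced them and rejects mismatched queries. This guard is illustrative — OpenSearch will not enforce it for you, which is exactly why the mismatch is a trap. Model IDs shown are examples.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

class VectorIndex:
    """Minimal in-memory index that records which embedding model produced
    its vectors and refuses vectors from a different model — mirroring the
    exam's same-model rule for indexing and query time."""
    def __init__(self, model_id: str):
        self.model_id, self.items = model_id, []

    def add(self, doc_id: str, vector: list[float], model_id: str) -> None:
        if model_id != self.model_id:
            raise ValueError("embedding model mismatch at indexing time")
        self.items.append((doc_id, vector))

    def query(self, vector: list[float], model_id: str, k: int = 1):
        if model_id != self.model_id:
            raise ValueError("embedding model mismatch at query time")
        return sorted(self.items, key=lambda it: -cosine(it[1], vector))[:k]
```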
Task 1.5 — Retrieval mechanisms (RAG)
1.5.1 · Chunking strategies — hierarchical (MOST TESTED: parent 800T + child 200T)
Hierarchical (most tested): parent 800T (generation context) + child 200T (retrieval precision). Best for long structured docs (legal, technical manuals). Fixed-size: uniform short docs. Semantic: topic boundaries via FM. Sentence-window: exact citations with surrounding context.
Bedrock KB chunking
Exam pattern: "Long structured docs + need precision AND context" → hierarchical (parent 800T + child 200T). This is "the one" chunking answer for enterprise content on the exam.
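A sketch of the parent/child split under the stated sizes. Token counts here are just list lengths — a real implementation would count tokens with the embedding model's tokenizer — and children keep their parent so retrieval on a child can hand the FM the full parent for generation context:

```python
def hierarchical_chunks(tokens: list, parent_size: int = 800,
                        child_size: int = 200) -> list[dict]:
    """Split a token list into parent chunks (generation context) and,
    within each parent, child chunks (retrieval precision)."""
    parents = []
    for p in range(0, len(tokens), parent_size):
        parent = tokens[p:p + parent_size]
        children = [parent[c:c + child_size]
                    for c in range(0, len(parent), child_size)]
        parents.append({"parent": parent, "children": children})
    return parents
```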
1.5.2-1.5.6 · Hybrid BM25+ANN, Bedrock Reranker, CRITICAL same-model rule
Hybrid search: BM25 (exact terms — statute codes, model numbers) + ANN vector (semantic meaning). Merged via Reciprocal Rank Fusion (RRF). 15-30% better recall than either alone. Bedrock Reranker: ANN retrieves top-20 → cross-attention rescore → top-5 to FM. CRITICAL: same embedding model MUST be used for doc indexing AND query time.
OpenSearch Hybrid BM25+ANN · Bedrock Reranker
Enterprise RAG formula: Hierarchical chunking + Hybrid BM25+ANN + Bedrock Reranker = gold standard. MOST COMMON TRAP: "different model for faster queries" → always wrong.
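Reciprocal Rank Fusion, the merge step named above, is short enough to show in full. k=60 is the conventional constant from the original RRF paper; the doc IDs in the usage are made up.

```python
def rrf_fuse(bm25_ranked: list[str], ann_ranked: list[str],
             k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1/(k + rank).
    Merges a keyword (BM25) list and a vector (ANN) list into one ranking;
    documents that rank well in both lists rise to the top."""
    scores: dict[str, float] = {}
    for ranked in (bm25_ranked, ann_ranked):
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```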
Task 1.6 — Prompt engineering & governance
1.6.1-1.6.6 · Bedrock Prompt Management, Prompt Flows, Prompt Caching — three distinct services
Prompt Management: version control + parameterized templates + approval workflow gates (draft → review → production). Prevents unauthorized prompts reaching production. Prompt Flows: no-code visual builder for sequential chains — non-technical teams, no Lambda. DIFFERENT from Bedrock Agents (which use autonomous tool-calling). Prompt Caching: static prefix cache reads ≈10% of standard rate → 90% savings on prefix tokens.
Bedrock Prompt Management · Bedrock Prompt Flows · Bedrock Prompt Caching
Exam: "Prevent unauthorized prompts in prod" → Prompt Management approval workflow. "No-code sequential chain" → Prompt Flows. "90% savings on repeated system prompt" → Prompt Caching. Frequently confused on exam.
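The 90% figure translates directly into a per-invocation cost model. Rates below are illustrative, and the ≈10% cache-read rate is the figure stated above — actual Bedrock pricing varies by model.

```python
def invocation_cost(prefix_tokens: int, dynamic_tokens: int,
                    price_per_1k: float, cache_hit: bool,
                    cache_read_ratio: float = 0.10) -> float:
    """Input-token cost for one invocation. On a cache hit the static
    prefix bills at ~10% of the standard input rate (the '90% savings on
    prefix tokens'); dynamic tokens always bill at the full rate."""
    prefix_rate = price_per_1k * (cache_read_ratio if cache_hit else 1.0)
    return (prefix_tokens / 1000) * prefix_rate + \
           (dynamic_tokens / 1000) * price_per_1k
```

With an 800-token cached system prompt and 200 dynamic tokens at $1.00/1k, a cold call costs 1.00 and a warm call 0.28 — the prefix portion alone drops 90%.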
D1 Pattern Recognition — exam keyword map
"Company docs / cite sources" → RAG · "Brand voice / format" → fine-tune · "Long structured docs" → hierarchical chunking · "Exact codes failing" → hybrid BM25+ANN · "No code to switch models" → AppConfig · "Prevent unauthorized prompts" → Prompt Management · "Non-technical no-code chain" → Prompt Flows
Domain 2 · AIP-C01
Implementation & Integration
Agentic AI (Bedrock Agents, Strands, Agent Squad, MCP), model deployment strategies, enterprise integration, FM API patterns, streaming/async, Q Business, Q Developer.
26% Exam weight
17-20 Questions
Task 2.1 — Agentic AI
2.1.1 · ReAct loop — Thought → Action → Observation, Bedrock Agents architecture
ReAct = Reason + Act. Pattern: THOUGHT (reasoning, which tool + params) → ACTION (Lambda tool call) → OBSERVATION (read result) → repeat until done. Bedrock Agents: instruction prompt + action groups (Lambda + OpenAPI schema on S3) + Knowledge Base + Guardrails + memory + trace. Enable trace mode for debugging.
Amazon Bedrock Agents · AWS Lambda · OpenAPI schemas · S3
Exam tip: Agent routes to wrong tool → enable trace, examine THOUGHT blocks (action group descriptions may be ambiguous). NEVER change code before reading trace.
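The loop itself is simple enough to sketch. Everything here is a toy: a scripted `plan` callable stands in for the FM's reasoning step, and Bedrock Agents run this loop as a managed service — but the THOUGHT → ACTION → OBSERVATION shape and the iteration cap are the same.

```python
def react_agent(question: str, tools: dict, plan, max_iterations: int = 8):
    """Toy ReAct loop. `plan` plays the FM: given the question and the
    observations so far, it returns either a tool ACTION (name + args)
    or a final answer. max_iterations caps runaway loops — the first
    safeguard to implement."""
    observations: list = []
    for _ in range(max_iterations):
        thought = plan(question, observations)                # THOUGHT
        if thought["action"] == "finish":
            return thought["answer"]
        result = tools[thought["action"]](**thought["args"])  # ACTION
        observations.append(result)                           # OBSERVATION
    raise RuntimeError("maxIterations reached without a final answer")
```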
2.1.2-2.1.3 · Safeguards — maxIterations (implement FIRST), HITL .waitForTaskToken (zero-cost pause)
maxIterations: caps runaway ReAct loops — implement FIRST before any other safeguard. Step Functions .waitForTaskToken: workflow PAUSES (zero compute cost while waiting), resumes on human approval via SendTaskSuccess. IAM least-privilege on agent execution role. Bedrock Guardrails as final content layer.
AWS Step Functions · AWS IAM · Bedrock Guardrails · Bedrock Agents maxIterations
Safeguard order: maxIterations (FIRST) → Step Functions circuit breaker → IAM least-privilege → Guardrails. .waitForTaskToken = ZERO compute while paused.
2.1.1 · Multi-agent — Agent Squad (pre-built) vs Strands SDK (custom) vs AgentCore (managed)
AWS Agent Squad: pre-built supervisor routes to registered specialist agents (best for standard multi-agent patterns). Strands Agents SDK: Python SDK for custom orchestration, non-standard workflows. Bedrock AgentCore: fully managed multi-tenant agent runtime (serverless, handles state, memory). AgentCore Memory: scalable long-term memory for sessions.
AWS Agent Squad · Strands Agents SDK · Bedrock AgentCore
Selection: "Supervisor + specialists, standard" → Agent Squad. "Custom orchestration, non-standard" → Strands SDK. "Production multi-tenant managed" → AgentCore.
2.1.7 · MCP — Lambda (stateless) vs ECS Fargate (stateful) — CRITICAL distinction
MCP (Model Context Protocol): JSON-RPC 2.0 standard for agent-tool communication. Reduces N×M integrations to N+M. Lambda MCP servers: stateless, lightweight, auto-scale to zero. ECS Fargate MCP servers: stateful, persistent DB connections, streaming data, complex tools with internal state.
AWS Lambda (stateless MCP) · Amazon ECS Fargate (stateful MCP)
CRITICAL: Lambda = stateless MCP. ECS = stateful MCP (persistent connections). If tool needs to "maintain a connection" or "stream data" → ECS. Tested repeatedly on exam.
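What "JSON-RPC 2.0" means in practice: an MCP tool call is a small envelope like the one below. The `tools/call` method with `name` and `arguments` params follows the MCP specification's tool-invocation shape; the weather tool itself is made up.

```python
import json

def mcp_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tool invocation as a JSON-RPC 2.0 request. The
    transport (Lambda vs ECS Fargate) changes, but this wire shape does
    not — that is what collapses N×M custom integrations to N+M."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })
```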
Task 2.2 — Model deployment strategies
2.2.1-2.2.3 · Deployment modes — On-demand, Provisioned (≥40%), Async (>60s), Batch (70% cheaper)
On-Demand: variable/sporadic traffic, pay per token. Provisioned Throughput: flat hourly rate, break-even ≈40% sustained utilization, ELIMINATES throttling. SageMaker Async Inference: jobs >60s. Batch Inference: 70% cheaper for offline bulk — S3 in → process → S3 out, no persistent endpoint cost.
Bedrock On-Demand · Bedrock Provisioned Throughput · SageMaker Async Inference · SageMaker Batch Transform
Decision: Variable → on-demand. ≥40% sustained → provisioned. Jobs >60s → async. Offline bulk → Batch (70% cheaper). Know the 40% break-even.
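The 40% break-even follows from comparing a usage-proportional cost against a flat rate. The numbers below are illustrative and chosen so that the provisioned flat rate equals 40% of the fully-utilized on-demand cost, which places break-even at exactly 40% utilization:

```python
def cheaper_mode(avg_utilization: float,
                 on_demand_cost_at_full_load: float = 100.0,
                 provisioned_flat_cost: float = 40.0) -> str:
    """On-demand cost scales with utilization (pay per token); provisioned
    throughput costs the same flat rate at any load. Above break-even the
    flat rate wins — and also removes throttling."""
    on_demand_cost = avg_utilization * on_demand_cost_at_full_load
    return "provisioned" if provisioned_flat_cost <= on_demand_cost else "on-demand"
```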
Tasks 2.3-2.5 — FM API patterns & enterprise tools
2.3+2.4 · Streaming, async patterns, CI/CD — WebSocket, SQS decoupling, exponential backoff
Streaming: Bedrock Streaming API + WebSocket API Gateway ≈150ms first token. Async (jobs >60s): API returns 202 immediately → SQS queues job → Lambda worker → S3 result → SNS push notification. NEVER hold connection open. SDK exponential backoff + jitter for 429 ThrottlingException. X-Ray distributed tracing across all service boundaries.
API Gateway WebSocket · Amazon SQS · Amazon SNS · AWS X-Ray
Pattern: "Real-time chatbot" → Bedrock Streaming + WebSocket. "Long job" → 202 + SQS decoupling. "429 throttle" → SDK exponential backoff with jitter.
2.5.3-2.5.4 · Q Business (enterprise chatbot with ACL) vs Q Developer (IDE coding AI) — different services
Amazon Q Business: fully managed enterprise chatbot — native connectors (SharePoint, Confluence, Salesforce, S3, Slack), automatic ACL enforcement (users only see docs they're authorized to access). Amazon Q Developer: VS Code/JetBrains plugin — code generation, security scanning (OWASP Top 10), unit test generation, CLI suggestions.
Amazon Q Business · Amazon Q Developer · IAM Identity Center
CRITICAL: Q Business = enterprise chatbot with ACL enforcement. Q Developer = developer coding assistant. Completely different services. Exam loves asking which to use.
Domain 3 · AIP-C01
AI Safety, Security & Governance
Guardrails (6 filters), prompt injection defense, VPC/KMS/IAM security, data privacy, AI governance, Responsible AI (Clarify SHAP, A2I).
20% Exam weight
13-15 Questions
Bedrock Guardrails — all 6 filter types
FILTER 1
Content Filters
Hate / violence / sexual / misconduct. Configurable strength (LOW/MEDIUM/HIGH) per category per direction (input/output).
FILTER 2
PII Redaction
30+ PII entity types (covering common PHI). ANONYMIZE mode replaces with [REDACTED]. BLOCK mode stops the response. Required for HIPAA compliance on FM outputs.
FILTER 3
Grounding Check
Post-generation anti-hallucination. Compares response against retrieved context. Blocks claims unsupported by context. Configurable threshold.
FILTER 4
Prompt Attacks
ML-powered jailbreak & prompt injection detector. Catches ALL rephrasings. Applies to input AND output. Combine with IAM least-privilege for defense-in-depth.
FILTER 5
Denied Topics
Block specific subjects entirely — competitors, investment advice, legal advice. Defined in natural language. Semantic matching, not just keywords.
FILTER 6
Word Filters
Exact phrase/word blocking. Simple but easily bypassed via rephrasing — always layer with Prompt Attacks for comprehensive defense.
Task 3.1-3.2 — Security & compliance
3.1+3.2 · HIPAA security stack — ALL 5 required controls (missing any = wrong answer)
Full HIPAA requires ALL 5: (1) VPC Endpoints/PrivateLink — Bedrock traffic never public internet. (2) CMK — customer controls rotation, can revoke by disabling key. (3) Guardrails PII ANONYMIZE — PHI in FM outputs. (4) CloudTrail Data Events on PHI S3 buckets. (5) Audit Manager — automated HIPAA/SOC2 evidence from CloudTrail + Config + Security Hub.
VPC Endpoints · AWS KMS CMK · Secrets Manager · CloudTrail · Audit Manager · Macie
HIPAA checklist: VPC Endpoints · CMK · Guardrails PII ANONYMIZE · CloudTrail Data Events · Audit Manager. ALL 5 required. Macie = S3 pre-ingestion. Guardrails PII = FM output post-generation.
3.2 · Macie vs Guardrails PII — timing distinction (pre-ingestion vs post-generation)
Amazon Macie: discovers PII in S3 BEFORE data enters FM (training data audit, KB ingestion). Guardrails PII Redaction: operates ON FM OUTPUTS (post-generation). Complementary in HIPAA environments — not alternatives. Use both.
Amazon Macie (pre-ingestion) · Bedrock Guardrails PII (post-generation)
Exam trick: "PII in S3 training data" → Macie. "SSN in chatbot response" → Guardrails PII Redaction. Timing = the key differentiator.
Task 3.3-3.4 — Responsible AI
3.3+3.4 · Clarify SHAP, Model Cards, Audit Manager, A2I — Responsible AI toolkit
SageMaker Clarify: (1) Bias detection — Class Imbalance (CI), Difference in Proportions of Labels (DPL). (2) SHAP values per prediction — required for ECOA adverse action explanations in lending. Model Cards: governance documentation for regulated deployment. Audit Manager: automates evidence collection. A2I: human review for subjective quality.
SageMaker Clarify · SageMaker Model Cards · AWS Audit Manager · Amazon A2I
Pattern map: "Explain loan denial to regulator" → Clarify SHAP. "Automate HIPAA compliance evidence" → Audit Manager. "Subjective quality (tone, culture)" → A2I human review. "Algorithmic fairness" → Clarify bias (CI, DPL).
Domain 4 · AIP-C01
Operational Efficiency & Optimization
Cost optimization hierarchy, performance optimization (X-Ray, latency bottlenecks), monitoring (CloudWatch P99, Model Monitor).
12% Exam weight
8-9 Questions
Task 4.1 — Cost optimization hierarchy (best ROI order)
1. HIGHEST ROI
Prompt Caching
Cache static prefix. Reads ≈10% standard rate. 90% savings on prefix tokens. Track: CacheReadInputTokens.
2. HIGH ROI
Model Routing
AppConfig rules: Haiku for 60% simple queries (8× cheaper). 40-60% average cost reduction. No code changes.
3. OFFLINE
Batch Inference
70% cheaper than real-time. S3 in → process → S3 out. No persistent endpoint. For nightly/weekly offline jobs.
4. HIGH VOLUME
Provisioned TP
Break-even ≈40% sustained. Flat hourly rate. Eliminates throttling + saves money above threshold.
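The routing card's "40-60% average cost reduction" is checkable arithmetic: if a fraction f of traffic goes to a model whose cost is a ratio c of the expensive model's, the blended cost is f·c + (1−f). The 60%/8× inputs below are the figures from the card.

```python
def routing_savings(cheap_fraction: float, cheap_cost_ratio: float) -> float:
    """Average cost reduction vs sending every query to the expensive model:
    blended cost = f*c + (1-f)*1, savings = 1 - blended."""
    blended = cheap_fraction * cheap_cost_ratio + (1.0 - cheap_fraction)
    return 1.0 - blended

# 60% of queries on a model 8x cheaper (cost ratio 1/8):
saving = routing_savings(0.60, 1 / 8)
```

That lands at 52.5%, inside the card's 40-60% range under the stated assumptions.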
Task 4.2-4.3 — Performance & monitoring
4.2+4.3 · X-Ray bottleneck analysis, CloudWatch P99, Model Monitor drift → rollback
X-Ray: per-component latency breakdown. ALWAYS fix BIGGEST bottleneck first (80/20 rule). CloudWatch key metrics: InvocationLatency P99 (SLA breach signal — not average), InvocationThrottles (rising = need Provisioned TP), InputTokenCount+OutputTokenCount (cost proxy). SageMaker Model Monitor: baseline → compare current → alarm → Model Registry rollback (3-5 min vs hours of retraining).
AWS X-Ray · Amazon CloudWatch · SageMaker Model Monitor · SageMaker Model Registry
Pattern: "Which RAG stage is slow?" → X-Ray trace. "Quality dropped after upgrade" → Model Monitor + Model Registry rollback. P99 > SLA → latency issue. ThrottlingExceptions rising → provision throughput.
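Why P99 rather than the average: a nearest-rank percentile over a latency sample makes the tail visible, while a heavy tail distorts the mean. The sample numbers are made up to show the effect.

```python
import math

def p99(latencies_ms: list[float]) -> float:
    """Nearest-rank P99: the smallest value that at least 99% of requests
    do not exceed. Alarm on this against the SLA, not on the average."""
    ordered = sorted(latencies_ms)
    rank = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return ordered[rank]

# 98 fast requests and 2 slow ones: the mean looks fine, P99 does not.
samples = [100.0] * 98 + [5000.0] * 2
```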
Domain 5 · AIP-C01
Testing, Validation & Troubleshooting
RAGAS (4 metrics + specific fixes), Bedrock Model Evaluation, quality gates, agent trace debugging, retrieval troubleshooting, rollback strategies.
11% Exam weight
7-8 Questions
RAGAS — 4 metrics, what each means, and the exact fix
Metric | Measures | Low score means | Specific fix | AWS Service
Faithfulness | Grounded in context? | Hallucination | "ONLY from context" prompt + Guardrails Grounding Check | Bedrock Guardrails
Answer Relevancy | Right question answered? | Off-topic drift | Tighten system prompt scope | Bedrock Prompt Management
Context Recall | Right docs retrieved? | Retrieval failure | Hybrid BM25+ANN + re-evaluate chunking | OpenSearch Hybrid
Context Precision | Signal-to-noise? | Too much noise | Add Bedrock Reranker (top-20 → top-5) | Bedrock Reranker
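A toy version of the Faithfulness metric makes the ratio concrete. Real RAGAS uses an LLM judge to decompose the answer into claims and verify each against the retrieved context; the substring matching below is purely illustrative, as are the sample claims.

```python
def toy_faithfulness(answer_claims: list[str], context: str) -> float:
    """Fraction of answer claims supported by the retrieved context —
    the idea behind RAGAS Faithfulness, with naive substring matching
    standing in for the LLM-judged claim verification."""
    supported = sum(1 for claim in answer_claims
                    if claim.lower() in context.lower())
    return supported / len(answer_claims)
```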
Task 5.1-5.2 — Evaluation systems & troubleshooting
5.1.2 · Bedrock Model Evaluation, quality gates, golden test sets — full deployment gate
Bedrock Model Evaluation: compare multiple FMs on your custom dataset before selecting for production. Build 200 golden prompt-answer pairs. CloudWatch Synthetics canaries run RAGAS on golden set each deployment. If Faithfulness drops → CW Alarm → CodePipeline gate blocks → Model Registry rollback (3-5 min). Amazon A2I for subjective quality review.
Bedrock Model Evaluation · CloudWatch Synthetics · SageMaker Model Registry · Amazon A2I
Exam: "Compare FMs before choosing" → Bedrock Model Evaluation. "Automated deployment quality gate" → 200 golden pairs + Synthetics + RAGAS + CW alarm + deployment block.
5.2 · Agent debugging — trace mode FIRST, then THOUGHT blocks, never change code first
Debug process: (1) Enable trace mode — see every THOUGHT, ACTION, OBSERVATION. (2) Wrong tool routing → examine THOUGHT blocks (action group descriptions are ambiguous, not the code). (3) Context window overflow → dynamic chunking, context pruning. (4) Poor quality → check RAGAS Context Recall FIRST before blaming generation. NEVER change code before reading trace.
Bedrock Agent Trace · CloudWatch Logs Insights · AWS X-Ray
Debug flowchart: Wrong tool → trace THOUGHT blocks → fix action group descriptions. Low Faithfulness → Guardrails Grounding. Low Context Recall → hybrid search. Quality dropped → Model Monitor → rollback.
Interactive Learning Tools

Mind Maps & Visual Tools

Dark sci-fi interactive concept maps with expandable nodes, exam tips, code examples, and animated walkthroughs. Opens in a new tab for the best experience.

Live now — AIP-C01
D1 · FULL INTERACTIVE MAP
Domain 1 — FM Integration & RAG
Radial mind map with 9 topic clusters, 60+ nodes. Collapsible/expandable, pan & zoom canvas, exam tips on every node, code demo mode. Dark sci-fi UI.
Live now · Radial SVG · Exam tips · Code demos
DECISION MAP
RAG vs Fine-Tuning — Decision Guide
Interactive decision framework. Compare tab (7 dimensions). Scenario demo walkthroughs. Critical AIP-C01 topic — tested heavily on exam.
Live now · Compare tab · 7 dimensions · Scenarios
Coming soon
D2 · IN PROGRESS
Domain 2 — Implementation & Agents
Bedrock Agents, ReAct, multi-agent (Agent Squad, Strands), MCP, streaming, deployment modes, Q Business, Q Developer.
D3 · PLANNED
Domain 3 — AI Safety & Security
Guardrails 6-filter map, HIPAA stack, VPC/KMS/IAM, Responsible AI, Macie vs Guardrails PII.
D4 · PLANNED
Domain 4 — Optimization
Cost hierarchy pyramid, X-Ray bottleneck, AppConfig routing, provisioned TP, CloudWatch key metrics.
D5 · PLANNED
Domain 5 — Testing & Validation
RAGAS 4-metric framework with fixes, Bedrock Model Evaluation, golden test gates, Model Monitor, rollback.
Real-World GenAI Implementations

Case Studies

Real-world GenAI implementation patterns across industries, each mapping to specific AIP-C01 domains and AWS services, showing how architecture decisions play out in practice.

Healthcare & Life Sciences
Healthcare · HIPAA · D1 + D3
Clinical Document RAG — Patient Summary Assistant
Regional hospital needed clinicians to query 10+ years of patient records in real-time without exposing PHI. Built RAG with Bedrock KB, OpenSearch, VPC Endpoints. Guardrails PII Redaction on all outputs.
92% reduction in time to find patient history. Zero PHI exposure incidents post-deployment.
Bedrock KB · OpenSearch · VPC Endpoints · KMS CMK · Guardrails PII
Pharma · Multi-Agent · D1 + D2 + D3
Drug Interaction Research Agent — FDA Submission Prep
Pharmaceutical company automated literature review for FDA submissions. Multi-agent: Orchestrator (Bedrock Agents) → Literature Agent (PubMed RAG) → Interaction Agent (drug DB) → Synthesis Agent. HITL via Step Functions .waitForTaskToken for medical validation.
Literature review: 6 weeks → 3 days. Regulatory team validates, not writes from scratch.
Bedrock Agents · Agent Squad · Step Functions HITL
Financial Services
Banking · ECOA · D3 + D5
Lending Decision Explainability — ECOA Compliance
Consumer bank needed to explain ML loan denials to regulators (ECOA). Fine-tuned model with SageMaker, integrated Clarify SHAP values per prediction, generated natural-language adverse action notices via Bedrock, documented with Model Cards.
100% ECOA compliant notices. Regulator audit passed first attempt.
SageMaker Clarify · Model Cards · Amazon Bedrock
Insurance · Cost Optimization · D4
Claims Triage Chatbot — 41% Cost Reduction
Insurance provider handling 50K daily claims queries. AppConfig model routing: Haiku handles 65% of simple status/FAQ queries (8× cheaper). Prompt Caching on 800-token system prompt saves 90% of prefix cost. Provisioned TP for business hours eliminates throttling.
41% total FM cost reduction. P99 latency improved 35%.
AppConfig routing · Prompt Caching · Provisioned TP
Legal & Professional Services
Legal · Document Analysis · D1 + D5
Contract Review Assistant — Hierarchical RAG
Law firm reviewing M&A contracts needed clause extraction and risk flagging. Hierarchical chunking (parent 800T + child 200T) for long legal documents. Hybrid BM25+ANN catches exact clause numbers AND semantic meaning. Reranker reduces noise 20→5 chunks.
Review time: 4 hrs → 25 min per document. Faithfulness score 0.94 on RAGAS.
Hierarchical chunking · OpenSearch Hybrid · Bedrock Reranker
Consulting · Knowledge Mgmt · D2
Enterprise Knowledge Base — Q Business Deployment
Big 4 firm with 50K employees needed secure access to internal guides — respecting strict access controls. Amazon Q Business with SharePoint connector, automatic ACL enforcement. Employees only see docs they're authorized to access.
Deployed in 2 weeks (vs 6-month custom RAG estimate). Zero unauthorized access incidents.
Amazon Q Business · IAM Identity Center · SharePoint connector
Retail, Manufacturing & Media
Retail · Fine-Tuning · D1 + D4
Product Description Generator — Brand Voice LoRA
Global retailer with 2M SKUs needed consistent brand-voice descriptions in 12 languages. Fine-tuned via LoRA adapters (multiple adapter sets from single base model). Bedrock Batch Inference for nightly bulk generation at 70% lower cost.
2M descriptions/month. Brand voice consistency 91%. 70% cost reduction vs real-time.
LoRA / PEFT · SageMaker · Bedrock Batch
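A sketch of one input record for a Bedrock Batch Inference JSONL file. The field names follow the AWS docs as of this writing but should be verified before use; the SKU and prompt are made up.

```python
# Bedrock Batch Inference reads a JSONL file where each line pairs a
# recordId with the modelInput payload for the target model. Assumed
# schema -- verify against current AWS documentation.
import json

def batch_record(sku: str, product_facts: str) -> str:
    return json.dumps({
        "recordId": sku,
        "modelInput": {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 300,
            "messages": [{
                "role": "user",
                "content": f"Write a brand-voice product description: {product_facts}",
            }],
        },
    })

line = batch_record("SKU-0001", "waterproof hiking boot, 12 colors")
print(line[:40])
```

A nightly job would write one such line per SKU to S3 and submit the file as a batch job, which is what drives the 70% cost saving over real-time calls.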
E-commerce · Agentic · D1 + D2
Shopping Assistant Agent — Real-Time Inventory
E-commerce platform built conversational shopping assistant that checks live inventory (Lambda action group), searches product catalog (RAG), applies personalization, completes orders. WebSocket API GW for <150ms first token. maxIterations=8 prevents runaway loops.
23% increase in conversion rate. 40% reduction in average session time.
Bedrock Agents · WebSocket API GW · Bedrock Streaming
Manufacturing · Predictive Maintenance · D2 + D4
Equipment Failure Prediction — Async SageMaker Pipeline
Automotive manufacturer analyzing 500K sensor readings/shift. Kinesis → Lambda → SageMaker Async Inference (jobs >60s per machine) → SNS alert. Model Monitor tracks drift; Model Registry rollback if accuracy drops.
Unplanned downtime reduced 67%. $4.2M annual maintenance savings.
SageMaker Async · Kinesis · Model Monitor · SNS
Media · Governance · D1 + D3
Newsroom AI Assistant — Prompt Governance at Scale
National broadcaster with 200 journalists needed standardized AI writing assistance. Bedrock Prompt Management with approval workflows: editorial board reviews prompt versions before production. CloudTrail logs every invocation. A2I for sensitive stories. Denied Topics blocks competitor mentions.
35% productivity improvement. Zero unauthorized prompt changes reached production.
Bedrock Prompt Management · CloudTrail · A2I · Guardrails
Curated Learning Ecosystem

Learning Resources

Curated resources for AIP-C01 certification prep — official AWS training, online courses, YouTube channels, conferences, and communities.

Official AWS training — start here
Online courses — structured learning
YouTube channels — free video content
Conferences & communities
Recommended study sequence for AIP-C01
Week 1-2: Official AWS Exam Guide PDF + this hub's domain pages + AWS Skill Builder.
Week 3-4: Udemy (Sundog) + hands-on Bedrock free tier + Bedrock samples repo.
Week 5-6: Practice questions + mind maps for visual reinforcement + AWS re:Invent GenAI sessions on YouTube.
Week 7: Mock exams + review weak domains + case studies to reinforce patterns.
Multi-Cloud Service Mapping

Cross-Cloud AI Service Comparison

Map AWS GenAI services to Azure, Google Cloud, Oracle, VMware, and HPE equivalents. Essential for multi-cloud practitioners designing cloud-agnostic solutions.

Capability | AWS | Azure | Google Cloud | OCI | VMware / HPE
FM API Platform | Amazon Bedrock | Azure OpenAI Service | Vertex AI | OCI Generative AI | Placeholder
Managed RAG | Bedrock Knowledge Bases | Azure AI Search + RAG | Vertex AI Search | OCI AI Knowledge | Placeholder
Vector Database | OpenSearch / pgvector / MemoryDB | Azure AI Search (vector) | Vertex AI Vector Search | OCI OpenSearch | Placeholder
AI Agents | Bedrock Agents + Agent Squad | Azure AI Agent Service | Vertex AI Agent Builder | OCI Digital Assistant | Placeholder
AI Safety / Guardrails | Bedrock Guardrails | Azure Content Safety | Vertex AI Safety | OCI Content Mod. | Placeholder
Model Fine-Tuning | SageMaker (LoRA/PEFT) | Azure ML fine-tuning | Vertex AI Model Garden | OCI Data Science | Placeholder
Prompt Governance | Bedrock Prompt Management | Azure AI Studio | Vertex AI Prompt Mgmt | OCI AI Playground | Placeholder
Enterprise AI Chatbot | Amazon Q Business | Microsoft Copilot M365 | Gemini for Workspace | OCI Digital Asst. | Placeholder
Developer Coding AI | Amazon Q Developer | GitHub Copilot | Gemini Code Assist | OCI Code Assist | Placeholder
Bias / Explainability | SageMaker Clarify (SHAP) | Azure Responsible AI | Vertex Explainable AI | OCI AI Fairness | Placeholder
Compliance Evidence | AWS Audit Manager | Microsoft Purview | Chronicle Security | OCI Security Advisor | Placeholder
Distributed Tracing | AWS X-Ray | Azure App Insights | Cloud Trace | OCI APM | Placeholder
Model Monitoring | SageMaker Model Monitor | Azure ML monitoring | Vertex AI Model Monitor | OCI AI Monitoring | Placeholder
Contributing
Azure, GCP, Oracle, VMware, and HPE cells marked "Placeholder" are being filled by subject matter experts. The table updates as the community grows.
Reference

Glossary

Key terms aligned with AIP-C01 exam definitions. Each definition includes the AWS service context.

Foundation Model (FM)
Large pre-trained transformer model accessed via API on AWS through Amazon Bedrock. NOT trained from scratch by end users — that's out of scope for AIP-C01.
RAG — Retrieval-Augmented Generation
Retrieves relevant documents at query time, injects them as FM context. Prevents hallucination on proprietary knowledge. AWS: Bedrock Knowledge Bases + OpenSearch + Titan Embeddings.
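A toy sketch of the retrieve-then-inject flow, with word overlap standing in for embedding cosine similarity (a real system would call Titan Embeddings and a vector store such as OpenSearch):

```python
# Minimal RAG flow: score documents against the query, take top-k, and
# inject them as context ahead of the question. The Jaccard word-overlap
# score is a deterministic stand-in for embedding similarity.

def score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def retrieve(query, docs, k=2):
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]

docs = ["refund policy: 30 days", "shipping takes 5 days", "careers page"]
context = retrieve("how many days for a refund", docs)
prompt = ("Answer ONLY from the context below.\n"
          + "\n".join(context)
          + "\n\nQuestion: how many days for a refund?")
print(context[0])
```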
ReAct Loop
Reason + Act. Agent pattern: THOUGHT → ACTION (tool call) → OBSERVATION → repeat. Implemented in Bedrock Agents. Debug via trace mode. maxIterations prevents infinite loops.
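Stripped of the managed service, the loop itself is a few lines. This sketch uses hypothetical `llm` and `tools` callables rather than any AWS API; Bedrock Agents runs the equivalent loop internally:

```python
# ReAct-style loop skeleton with a maxIterations guard. The "llm" and
# "tools" arguments are made-up stand-ins for illustration only.

def react_loop(llm, tools, question, max_iterations=8):
    observation = None
    for _ in range(max_iterations):          # hard cap prevents runaway loops
        step = llm(question, observation)    # THOUGHT: decide the next step
        if step["type"] == "final":
            return step["answer"]
        tool = tools[step["tool"]]           # ACTION: call the chosen tool
        observation = tool(step["input"])    # OBSERVATION: feed result back
    raise RuntimeError("maxIterations reached without a final answer")

# Toy run: an "llm" that looks up inventory once, then answers.
def fake_llm(question, observation):
    if observation is None:
        return {"type": "action", "tool": "inventory", "input": "sku-42"}
    return {"type": "final", "answer": f"In stock: {observation}"}

print(react_loop(fake_llm, {"inventory": lambda sku: 7}, "Is sku-42 in stock?"))
```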
LoRA — Low-Rank Adaptation
PEFT technique training adapter matrices (1-5% of params). 90% cheaper than full fine-tuning. Multiple adapters from one base model. Use for style/behavior, not knowledge injection (use RAG for facts).
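The "1-5% of params" claim is easy to sanity-check: a rank-r adapter on a d_in × d_out weight trains r × (d_in + d_out) parameters versus d_in × d_out for the full matrix. The layer size below is a made-up example:

```python
# Back-of-envelope LoRA parameter count. A rank-r adapter factors the
# weight update into two thin matrices (d_in x r and r x d_out), so it
# trains r * (d_in + d_out) params instead of d_in * d_out.

def lora_params(d_in, d_out, rank):
    return rank * (d_in + d_out)

d = 4096                                # hypothetical hidden size
full = d * d                            # ~16.8M params in one full matrix
adapter = lora_params(d, d, rank=8)     # 65,536 params
print(f"adapter is {100 * adapter / full:.2f}% of the full matrix")
```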
RAGAS
Retrieval-Augmented Generation Assessment: four core metrics. Faithfulness (hallucination → Grounding Check), Answer Relevancy (off-topic → tighten prompt), Context Recall (retrieval fail → hybrid search), Context Precision (noise → Reranker).
MCP — Model Context Protocol
Open JSON-RPC 2.0 standard for agent-tool communication. AWS: Lambda for stateless MCP, ECS for stateful. Reduces N×M integrations to N+M.
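A sketch of what a tool-call request looks like on the wire, plus the N+M arithmetic. The method and parameter names follow the MCP spec as understood here (verify against the spec before relying on them), and the tool itself is invented:

```python
# MCP-style JSON-RPC 2.0 request for a tool call. Assumed method/param
# names ("tools/call", "name", "arguments") -- check the MCP spec.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_claim_status", "arguments": {"claim_id": "C-123"}},
}
wire = json.dumps(request)

# The N+M win: with a shared protocol, N agents and M tools need N + M
# adapters instead of one bespoke integration per (agent, tool) pair.
n_agents, m_tools = 5, 12
print(n_agents * m_tools, "vs", n_agents + m_tools)   # 60 vs 17
```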
Hierarchical Chunking
Parent 800T (generation context) + child 200T (retrieval precision) linked in metadata. Most tested chunking on AIP-C01. Best for long structured documents.
Provisioned Throughput
Flat hourly rate for Bedrock FM access. Break-even ≈40% sustained utilization. Eliminates throttling AND saves money above that threshold.
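The break-even logic is simple enough to sanity-check with placeholder prices (the numbers below are invented; substitute real Bedrock pricing):

```python
# Provisioned Throughput break-even sketch: flat hourly rate vs
# pay-per-token on-demand. All prices are made-up placeholders; only the
# comparison logic matters.

def monthly_cost(tokens_per_hour, utilization,
                 od_price_per_1k=0.003, pt_hourly=30.0, hours=730):
    on_demand = tokens_per_hour * utilization * hours / 1000 * od_price_per_1k
    provisioned = pt_hourly * hours          # flat rate regardless of usage
    return on_demand, provisioned

# High sustained utilization: the flat rate wins. Low utilization: it loses.
od_hi, pt = monthly_cost(tokens_per_hour=20_000_000, utilization=0.6)
od_lo, _ = monthly_cost(tokens_per_hour=20_000_000, utilization=0.1)
print(od_hi > pt, od_lo < pt)   # True True
```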
Grounding Check
Bedrock Guardrails contextual grounding check. Post-generation — compares the FM response against the retrieved context and blocks claims the context does not support. Primary anti-hallucination control in RAG systems.
Hybrid Search (BM25 + ANN)
BM25 (exact terms) + ANN vector (semantic meaning) merged via Reciprocal Rank Fusion. 15-30% better recall than either alone. Implemented in OpenSearch.
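Reciprocal Rank Fusion is a one-liner worth seeing. OpenSearch performs this merge internally for hybrid queries; the sketch below just shows the arithmetic with made-up document IDs:

```python
# Reciprocal Rank Fusion: merge a BM25 ranking and an ANN ranking by
# summing 1/(k + rank) per document across the lists (k=60 is the common
# default). Documents that appear high on both lists float to the top.

def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["clause-4.2", "clause-1.1", "intro"]
ann = ["clause-1.1", "definitions", "clause-4.2"]
print(rrf([bm25, ann]))
```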
SHAP Values (Clarify)
Shapley Additive Explanations. Per-prediction feature importance scores. Required for ECOA adverse action notices in lending decisions.
Prompt Caching
Bedrock caches the static prompt prefix. Cache reads ≈10% of the standard rate → 90% savings on prefix tokens. Highest-ROI cost optimization technique.
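The savings math, with placeholder prices (only the ~10% cache-read ratio comes from the definition above; everything else is invented): caching an 800-token prefix cuts the example request's total cost by roughly 72%.

```python
# Prompt-caching savings sketch: cached prefix tokens bill at ~10% of the
# standard input rate, so a large static system prompt mostly drops out of
# per-request cost. Prices and token counts are made-up placeholders.

def request_cost(prefix_tokens, dynamic_tokens,
                 price_per_1k=0.003, cache_read_ratio=0.10, cached=True):
    prefix_rate = price_per_1k * (cache_read_ratio if cached else 1.0)
    return (prefix_tokens / 1000) * prefix_rate \
        + (dynamic_tokens / 1000) * price_per_1k

cold = request_cost(800, 200, cached=False)   # no cache hit
warm = request_cost(800, 200, cached=True)    # prefix served from cache
print(f"per-request savings: {100 * (1 - warm / cold):.0f}%")
```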
AWS Certification

AWS AI Practitioner (AIF-C01)

Foundational GenAI certification. Recommended prerequisite for AIP-C01.

AIF-C01 — Recommended before AIP-C01
Foundational AI/ML concepts: what foundation models are, basic prompt engineering, GenAI use cases, responsible AI principles. ↗ Official AIF-C01 Page
🎍
Detailed AIF-C01 content coming soon
Domain deep-dives and study resources in preparation
Microsoft Azure

Azure AI Certifications

AI Engineer Associate (AI-102), AI Fundamentals (AI-900), Azure ML certifications.

Azure AI content in preparation
AI-102 · AI-900 · Azure OpenAI · Azure AI Studio
Google Cloud

Google Cloud AI Certifications

Professional ML Engineer, Vertex AI, and Gemini certification pathways.

Google Cloud AI content planned
Professional ML Engineer · Vertex AI · Gemini API
Enterprise Hyperscalers

Enterprise AI Platforms

Oracle Cloud Infrastructure, VMware Private AI, and HPE GreenLake AI platforms.

Soon
Oracle Cloud Infrastructure
OCI Generative AI Service, AI Platform, Oracle Digital Assistant, Data Science service.
Soon
VMware Private AI
VMware Private AI Foundation with NVIDIA, vSphere 8 AI workloads, VCF AI integrations.
Soon
HPE GreenLake AI
HPE Machine Learning Dev Environment, GreenLake AI cloud services, Ezmeral, Cray HPC.
Architecture Reference

Architecture Patterns

Reusable GenAI architecture patterns for enterprise deployment.

📐
Architecture patterns coming soon
RAG reference architectures · Agentic workflow patterns · HIPAA GenAI stack · Multi-cloud AI patterns