Can a chatbot truly understand your data? I decided to find out using Snowflake Cortex Search. The results were eye-opening 👇

1. Data Preparation – Collect, clean, and store your data in Snowflake for processing.
2. Cortex Search Setup – Enable and configure Snowflake Cortex Search services.
3. Indexing – Generate embeddings and create searchable indexes for your data.
4. Query Setup – Define the chatbot query flow and connect it to the Cortex Search APIs.
5. LLM Integration – Combine Cortex Search with an LLM to generate contextual answers.
6. Chatbot Development – Build the chatbot interface and integrate it with the backend logic.
7. Testing & Validation – Verify chatbot accuracy and refine prompts or data.
8. Deployment – Launch the chatbot and monitor real-time performance for improvements.

#keeplearning #ai
How to build a chatbot with Snowflake Cortex Search
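The retrieval-then-generation flow of steps 4–5 can be sketched in a few lines. This is a toy illustration only: `retrieve_chunks` and `generate_answer` are hypothetical stand-ins for a Cortex Search query and an LLM call, not the actual Snowflake SDK.

```python
# Sketch of steps 4-5: route a user question through search, then an LLM.
# `retrieve_chunks` and `generate_answer` are hypothetical stand-ins, not
# the real Snowflake API.

def retrieve_chunks(question: str, top_k: int = 3) -> list[str]:
    """Stand-in for a Cortex Search query: return the top-k matching chunks."""
    corpus = [
        "Refunds are processed within 5 business days.",
        "Premium support is available 24/7.",
        "Orders ship from the Denver warehouse.",
    ]
    # Toy relevance score: number of shared lowercase words with the question.
    overlap = lambda c: len(set(c.lower().split()) & set(question.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:top_k]

def generate_answer(question: str, context: list[str]) -> str:
    """Stand-in for an LLM completion grounded on the retrieved context."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"
    return prompt  # a real implementation would send this prompt to the model

question = "How fast are refunds processed?"
answer_prompt = generate_answer(question, retrieve_chunks(question))
```

The key point is that the LLM never sees the raw tables; it only sees the chunks the search service has already ranked as relevant.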
🚀 CAG vs RAG — What’s the real impact in enterprise AI?

In the world of GenAI, two architectural patterns are shaping how organizations bring intelligence into their data workflows — RAG (Retrieval-Augmented Generation) and CAG (Context-Augmented Generation). Here’s how I see the distinction 👇

🔹 RAG → “Look it up before answering.” The model retrieves relevant facts from an external knowledge source (e.g. a vector DB) at query time to ground its responses. Great for factual accuracy and freshness — think chatbots, document Q&A, or dynamic policy lookup.

🔹 CAG → “Remember everything that matters before reasoning.” It goes beyond documents, fusing retrieved data + user context + system state + semantic signals to make responses situationally aware. This is where AI begins to act more like an agent — continuously learning from history, user intent, and operational context.

In simple terms: RAG grounds your model in facts. CAG grounds your model in reality.

As GenAI adoption scales, I’m curious how teams are applying these patterns beyond prototypes — especially in governed data environments or agentic workflows. 👉 Have you seen real-world benefits or challenges from implementing RAG or CAG (or both)? What’s working — and what’s still theoretical?

On the Snowflake flavour 💡 Snowflake Cortex is becoming a strong enabler for both approaches:
- RAG through native Vector Search, Functions, and Document AI, making it easy to ground LLMs directly on governed enterprise data.
- CAG through Cortex AI Studio and the Agent Framework, helping orchestrate context-rich, secure, multi-step reasoning — all within the Snowflake data boundary.

#GenAI #DataArchitecture #AIEngineering #RAG #CAG #LLM #Snowflake #AIagents #EnterpriseAI
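The distinction can be made concrete with a toy sketch of how each pattern assembles its prompt. All names below are illustrative, not any product's API: RAG packs in retrieved facts only, while CAG additionally fuses user context and system state.

```python
# Illustrative contrast between RAG and CAG prompt assembly.
# Everything here is a toy sketch; a real system would call a vector
# store and an LLM rather than build plain strings.

def rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """RAG: ground the model in retrieved facts only."""
    return "Facts:\n" + "\n".join(retrieved_docs) + f"\nQ: {question}"

def cag_prompt(question: str, retrieved_docs: list[str],
               user_context: dict, system_state: dict) -> str:
    """CAG: fuse retrieved facts with user context and system state."""
    situation = [f"{k}={v}" for k, v in {**user_context, **system_state}.items()]
    return ("Facts:\n" + "\n".join(retrieved_docs) +
            "\nSituation:\n" + "\n".join(situation) + f"\nQ: {question}")

docs = ["Policy X expires 2025-12-31."]
p_rag = rag_prompt("Is policy X active?", docs)
p_cag = cag_prompt("Is policy X active?", docs,
                   user_context={"role": "auditor"},
                   system_state={"today": "2026-01-02"})
```

With the same question and the same documents, only the CAG prompt carries enough situational signal (the current date, the user's role) for the model to answer "no, it expired yesterday" rather than just reciting the policy.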
Every #AI product looks cool in the lab. But let's be honest: when you deploy to production with your #business users, reality sinks in. That user on the right? That's your product when it hits the fan because you didn't account for human inaccuracy. Your users will NOT ask questions that exactly match your database tables. You know it, and I know it. And that's where most AI fails.

What separates Snowflake Cortex Analyst and Snowflake Intelligence from the competition isn't the chatbot UI or the cool charts; it's the cold, hard accuracy that compensates for the very real inaccuracy of your users. And this is just one of the major differentiators out of many. We don't guarantee perfect users; we guarantee better results.

How do we back that up?
- Cortex Search Service performs a fuzzy search against high-cardinality columns (like #Customer_name).
- We use a hybrid vector + keyword search engine to find the closest match, which is the exact value stored in your database.
- We use that actual value to construct an equality filter rather than ILIKE, which could return nothing or, even worse, false positives.

If your #AI breaks the moment a user misspells a name, you don't have a production-ready solution. You have a glorified demo.

Behind the scenes, the query Snowflake generates will have the following filter and will get you the correct results every single time:
- WHERE customer_name = 'Smith, John'

The competition ends up with no results, or worse, false positives, with patterns like:
- WHERE customer_name ILIKE '%John Smith%'
- WHERE customer_name ILIKE '%Smith John%'
- WHERE customer_name ILIKE '%Smith, Jon%'

When you are making decisions on AI and talk-to-your-data solutions, test your POCs with real users asking less-than-perfect questions. That is what separates a great demo from a good production solution.
For more info:
- Cortex Analyst Service (Text-to-SQL): https://lnkd.in/gQv6YPCi
- Cortex Search Service (fuzzy search): https://lnkd.in/gZX3-x7h
- Snowflake Intelligence (agentic chat UI that combines multiple instances of Cortex Analyst and Search for different data domains): https://lnkd.in/eDt2aFuD
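The fuzzy-match-then-exact-filter idea above can be sketched in a few lines, using Python's `difflib` as a stand-in for the hybrid vector + keyword engine. This illustrates the principle, not Cortex Search's actual implementation.

```python
# Sketch of "fuzzy search, then exact equality filter": resolve the
# user's messy input to the closest stored value, then filter on that
# exact value instead of wrapping the raw input in ILIKE wildcards.
# difflib is a stand-in for the real hybrid vector + keyword engine.
import difflib

def canonical_key(name: str) -> str:
    """Normalize token order and case so 'Jon Smith' can match 'Smith, John'."""
    return " ".join(sorted(name.replace(",", " ").lower().split()))

def resolve_filter_value(user_input: str, known_values: list) -> str:
    """Map a misspelled or reordered user value to the closest stored value."""
    keys = {canonical_key(v): v for v in known_values}
    match = difflib.get_close_matches(canonical_key(user_input),
                                      list(keys), n=1, cutoff=0.6)
    return keys[match[0]] if match else ""

customers = ["Smith, John", "Smythe, Joan", "Jones, Jon"]
value = resolve_filter_value("Jon Smith", customers)   # closest real entry
where_clause = f"WHERE customer_name = '{value}'"      # exact equality, not ILIKE
```

Because the filter uses the value actually stored in the table, a misspelled or reordered name still resolves to the one correct row.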
Snowflake continues to deliver world-class #data and #AI products to the market. Pairing the Cortex Search Service with our semantic views for Cortex Analyst text-to-SQL queries can be the difference between missing the mark entirely and hitting the bullseye every single time!
RAG is broken. Knowledge Cores are the solution. 💡

One of the biggest challenges facing enterprise AI? Reusability. Every time you ingest data for RAG, you rebuild knowledge graphs and vector embeddings from scratch - wasting compute, time, and money. TrustGraph’s Knowledge Cores solve this elegantly:

📦 Reusable AI Assets: Process your data once, package the resulting graph edges and vector embeddings into a Knowledge Core, then load it instantly across any TrustGraph deployment.

🔄 Portable Intelligence: Share Knowledge Cores across teams, projects, and environments. Think of them as “Docker containers for AI knowledge” - standardized, versioned, and instantly deployable.

🎯 Context Engineering at Scale
- Automated Knowledge Graph Construction - extract entities, topics, and relationships from source data
- Deterministic Graph Retrieval - combine vector similarity search with graph traversal for deep context
- Configurable Subgraph Context - control the depth (number of hops) and breadth of knowledge available to agents

⚡ Production-Ready Integration: When you load a Knowledge Core, TrustGraph queues and loads the graph edges and embeddings into your chosen stores automatically - no manual ETL required.

This is context engineering the way it should work: modular, reusable, and built for enterprise data engineers who solve real problems - not toy demos.

Ready to revolutionize how you build AI context? Sample Knowledge Cores are available for download. The platform is waiting. 🔗 https://lnkd.in/gz-GtFMP

#KnowledgeGraphs #ContextEngineering #TrustGraph #OpenSource #RAG
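The packaging idea can be sketched as follows. This is a toy serialization to show the pattern (process once, package, reload anywhere); TrustGraph's actual Knowledge Core format will differ.

```python
# Toy sketch of the "Knowledge Core" pattern: package already-computed
# graph edges and embeddings into one portable artifact, then reload
# them without re-ingesting the source data. Not TrustGraph's actual
# on-disk format.
import json

def package_core(edges: list, embeddings: dict) -> str:
    """Serialize a processed knowledge set into one portable artifact."""
    return json.dumps({"version": 1, "edges": edges, "embeddings": embeddings})

def load_core(blob: str):
    """Load a core back into graph-store and vector-store inputs."""
    core = json.loads(blob)
    # JSON turns tuples into lists, so restore the edge triples.
    return [tuple(e) for e in core["edges"]], core["embeddings"]

blob = package_core(
    edges=[("Acme", "acquired", "Widgets Inc")],   # (subject, relation, object)
    embeddings={"Acme": [0.1, 0.9]},               # entity -> vector
)
edges, vectors = load_core(blob)
```

The expensive work (entity extraction, embedding generation) happens once before `package_core`; every later deployment only pays the cost of `load_core`.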
🚀 Are AI Agents Moving from Experiments to Production — Faster Than We Think?

I attended the AI Show & Tell at Microsoft’s R&D lab in NYC last night, hosted by Cedric Vidal and Cassidy Fein. Three talks showed what production-ready AI agents actually look like:

🍕 Nimbleway AI – Solving the Live Data Problem
Roee M. posed a deceptively simple question: “Can AI really tell you the best pizza in NYC?” LLMs give confident answers that can be outdated, biased, or just plain wrong because they lack access to current information. Roee compared various approaches:
- Traditional APIs → accurate but rigid
- Manual browsing → reliable but doesn’t scale
- Nimble’s browser agents → grounding multi-agent systems in live, structured web data
The same infrastructure that helps you find the best pizza is already powering enterprise pricing intelligence and market analysis at scale.

📊 TextQL – 100,000 Tables, Zero Configuration
Ethan Ding showed how TextQL’s analytics agents query petabytes of data with natural language and zero setup. What used to take days of waiting for SQL queries now happens in seconds of conversation. The data-team bottleneck → automated away.

🧠 Arc Intelligence – Agents That Compound Intelligence
Jarrod Barnes presented perhaps the most intriguing idea: agents that actually learn from experience. Arc is an open-source continual-learning framework that uses online prompt optimization and reward modeling in production. Their demo: an SRE agent resolving Kubernetes incidents, and getting smarter with each attempt. Most agents perform identically on task 1 and task 100. Arc’s don’t. They self-improve.

🔮 What’s Next?
If today’s agents can fetch live data, automate analytics, and learn on the fly… 👉 how far will they go in reshaping our daily work and decisions? Will they stay assistants, or become teammates that replace entire workflows?

#AIAgents #ContinualLearning
An important trick I've learned for working with AI on legacy codebases? Give your AI assistant access to your database schema.

We maintain schema-only dumps as part of our codebase (no data, just structure), and a GitHub Action keeps them up to date.

Why does this matter? Your code only tells part of the story; your database schema explains relationships, constraints, business logic, and optimizations. This gives AI a second dimension of understanding beyond just your code. It sees the data model that drives your application logic.

For legacy codebases especially, this is transformative. The schema often encodes decades of business logic that isn't documented anywhere else.

Are you including your schema in AI context? What other artifacts do you commit to help AI understand your codebase?

#AI #DatabaseDesign #AIAssistedDevelopment #SoftwareEngineering #LegacyCode
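As an illustration of what a schema-only dump is, here is the idea using SQLite from the standard library. The post doesn't name a database engine, so this is an assumption; for Postgres the analogous command is `pg_dump --schema-only`.

```python
# Illustration of a schema-only dump: extract DDL (tables, indexes,
# constraints) while leaving every row of data behind. SQLite is used
# here only because it ships with Python; the post's actual engine and
# tooling are not specified.
import sqlite3

def dump_schema(conn: sqlite3.Connection) -> str:
    """Return just the DDL from the database, with no data."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return ";\n".join(r[0] for r in rows) + ";"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers VALUES (1, 'Smith, John')")  # data: NOT dumped
schema = dump_schema(conn)
```

The resulting text is small, diff-friendly, and safe to commit: the AI assistant sees `customers(id, name)` and its constraints, but never a customer record.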
Do Not Blame the AI. Blame the Dataflow.

From fresh grads to Fortune 500s, everyone is busy creating AI-powered insights, automation, and decision-support platforms. Skilled talent is easier to find than ever, and AI-based APIs are just a click away. The promise sounds perfect: let the machine handle the chaos so you can focus on the work that counts.

The hard reality is that if your dataflow is broken, no algorithm can save you. After shipping multiple AI automation pipelines, we have learned that most breakdowns do not happen in the model; they happen in the data plumbing underneath.

Ten years ago, we built a small internal tool to code open-ended customer feedback into themes. It worked not because of the model, but because we got the dataflow right. We paid attention to everything before and after the coding step, making sure insights could move cleanly into client systems. Our dataflow did the heavy lifting long before the AI touched a single word:
- Normalized messy inputs (synonyms, spelling, punctuation).
- Rebuilt multi-sheet workbooks into structured datasets.
- Split feedback into positive / negative / mixed buckets.
- Generated QC reports clients could trust.

What came out was not just cleaner data; it was brand-aware, context-aware feedback that teams could use straight away, without hours of manual Excel wrangling.

Today, we have built that same discipline into a no-code, low-code solution, because the people who understand the data best are already inside the client teams. They know what’s noise, what’s signal, and what needs to flow cleanly downstream. Our platform simply makes that stewardship easier, faster, and more reliable. The goal is not to replace human judgment with automation; it is to give it a cleaner canvas to work from.

Do not blame the AI. Blame the dataflow … and then fix it.

P.S. None of this would look this seamless without Suseegaran Murugan, who turns backend logic into simple, intuitive front-end design.
Lucky to have that balance on our team.
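Two of the dataflow steps above (normalizing messy inputs and bucketing feedback) can be sketched in a few lines. The synonym map and sentiment keywords below are made-up examples, not a production lexicon.

```python
# Toy sketch of two dataflow steps: normalize messy inputs, then route
# feedback into positive / negative / mixed buckets. The synonym map
# and keyword sets are illustrative, not a real lexicon.
import string

SYNONYMS = {"colour": "color", "gr8": "great", "svc": "service"}

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and map known synonyms/shorthand."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(SYNONYMS.get(w, w) for w in cleaned.split())

def bucket(text: str) -> str:
    """Route feedback into positive / negative / mixed buckets."""
    words = set(normalize(text).split())
    pos = bool(words & {"great", "love", "excellent"})
    neg = bool(words & {"slow", "broken", "bad"})
    if pos and neg:
        return "mixed"
    return "positive" if pos else "negative" if neg else "mixed"
```

The point of the post holds even in this toy: if `normalize` is skipped, "Gr8 svc!" never matches the positive keyword list, and the downstream model gets blamed for a plumbing failure.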
I test. You learn. This week: semantic metadata for metric views.

You can now add semantic metadata to metric views in Databricks. 🌈

The promise:
- Make #Genie smarter
- Make AI/BI dashboards cleaner and more consistent

👉 Three types of metadata you can add to metric views:
- Synonyms: help Genie understand what you mean. When users say "Customer tier," it can more easily work out that they're referring to "Customer segment."
- Display names: automatically replace technical column names with labels humans actually understand in visualization tools. No more explaining what "cust_acq_cost_mtd" means.
- Format specifications: control how values are displayed, ensuring consistency.

⚠️ What I found in testing: the semantic metadata doesn't seem to be implemented on the consumption side yet (or it's not rolled out to my workspace):
- I would, for example, expect AI/BI dashboards to use the display name as the axis label, or to apply the format automatically.
- Genie didn't pick up my synonym tests; however, I used very unrelated aliases, so maybe I made it too hard?

❗ If you want to try this out, don't forget to switch your SQL warehouse channel to preview.
Build dynamic memory for AI agents in just 6 lines of code!

Cognee lets you build memory for agents and replace RAG using scalable, modular ECL (Extract, Cognify, Load) pipelines. Sending large volumes of data to AI agents often leads to bloat and hallucinations. Cognee connects data points and establishes ground truths to improve the accuracy of your AI agents and LLMs.

Key features:
- Interconnect and retrieve your past conversations, documents, images, and audio transcriptions
- Replaces RAG systems, reducing developer effort and cost
- Load data into graph and vector databases using only Pydantic
- Manipulate your data while ingesting from 30+ data sources
- Local UI with interactive notebooks for easy data loading, graph visualization, and querying

It also supports continuous improvement through a feedback mechanism that captures the relevance of search results from real user interactions. Over time, this feedback directly updates the knowledge graph, helping your agents adapt and provide increasingly accurate responses.

The best part? It's 100% open source. Link to the GitHub repo in the comments!