Docling Adds Structured Data Extraction from Documents

Principal Research Staff Member, Master Inventor, Manager of "AI for Knowledge" group in IBM Research Zurich; Chair of the technical steering committee of Docling in the Linux Foundation for AI and Data

1mo

🚀 New in Docling: Structured Data Extraction from Documents! 🚀

We’ve just added a brand-new capability: extraction of structured data from complex documents using free-form schemas 🤩

What does that mean?
🔹 You can now skip the conversion step: instead of turning documents into text or JSON first, Docling directly extracts the structured fields you care about.
🔹 The requested fields are defined in a free-form schema, so you can align the extraction with the schemas of your own databases.
🔹 This makes it ideal for data pipelines that don’t need full document conversion, but do need to populate structured databases from messy, unstructured documents.

✨ It’s:
1️⃣ Super simple to use (check out the code snippet 👉 https://lnkd.in/eK6vaMEe )
2️⃣ 100% open-source and runs fully locally, no API calls needed 🙌
3️⃣ Powered by cutting-edge models from NuMind (YC S22)
4️⃣ Perfect for data pipelines where you need to populate structured databases from documents (think invoices, curricula vitae, contracts, product datasheets, etc.)
5️⃣ Currently focused on PDFs and images (PNG); support for pure text coming soon!

The example below shows how easy it is to define a schema and extract structured fields directly from an invoice.

👉 Try it out, break it, send us feedback, and if you like what we’re building, don’t forget to ⭐ the repo (https://lnkd.in/d4UT-6_2)!

After a long and fruitful summer, the Docling team has been cooking up many new features, and this is just the beginning! 🌟

#opensource #AI #documentAI #docling #IBM #IBMResearch
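To make the "align the extraction with the schemas of your own databases" point concrete, here is a minimal sketch of the downstream half of such a pipeline. It is not the snippet linked in the post and it does not call Docling: the extractor output is replaced by a hard-coded dictionary, and the schema, field names, table name, and helper functions are all hypothetical. It only shows how a free-form schema can drive both the cleanup of extracted fields and the insert into a database table.

```python
import sqlite3

# Hypothetical free-form schema: field name -> Python type expected by the database.
INVOICE_SCHEMA = {"bill_no": str, "total": float, "currency": str}


def coerce_to_schema(extracted: dict, schema: dict) -> dict:
    """Keep only the schema fields and cast each value; missing fields become None."""
    row = {}
    for name, field_type in schema.items():
        value = extracted.get(name)
        row[name] = None if value is None else field_type(value)
    return row


def insert_invoice(conn: sqlite3.Connection, row: dict) -> None:
    """Insert one coerced row into the (hypothetical) invoices table."""
    columns = ", ".join(row)
    placeholders = ", ".join(f":{name}" for name in row)
    conn.execute(f"INSERT INTO invoices ({columns}) VALUES ({placeholders})", row)


# Stand-in for extractor output (assumption); a real run would get this from the extractor.
extracted_fields = {"bill_no": "INV-001", "total": "1250.50", "currency": "EUR", "extra": "ignored"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (bill_no TEXT, total REAL, currency TEXT)")
row = coerce_to_schema(extracted_fields, INVOICE_SCHEMA)
insert_invoice(conn, row)
stored = conn.execute("SELECT bill_no, total, currency FROM invoices").fetchall()
```

Because the column list is derived from the schema, adding a field means changing the schema and the table definition, not the insert logic.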

83 Comments

Luis Molina

Technical Lead AI - Engineer AI

1mo

Congrats on the release. I have to ask: LangExtract from Google does something similar, so what is the difference?

3 Reactions

Arvind Rajpurohit

IBM Partner Data and AI Leader | Open Group Certified

1mo

Built an Agent with Docling that reads 'U.S. Customs forms' and turns them into structured, usable information. Great work team

5 Reactions

Christopher Helm

1mo

Peter W. J. Staar, I just added Docling to idp-software.com. Feel free to make edits via pull request and I'll merge them: https://idp-software.com/vendors/docling/

4 Reactions

Shoeb Masood

1mo

Peter W. J. Staar: Absolutely amazing! 🚀 This is such a powerful addition. I’ve been handling something similar with a lot of glue code, but this feature will really help streamline the entire data ingestion pipeline. Excited to try it out soon and see how it performs in real-world workflows. Kudos to the team! 👏 Quick question: does it also support nested schemas (e.g., line items in invoices), or is it mainly for flat field extraction?

3 Reactions

Aashish Chaudhary

Technical Product Leader in AI/ML, Real-Time 3D, and Open Source | Bridging Technology with Business Strategy

1mo

Awesome! We are big fans of Docling.

2 Reactions

Stéphane M.

AI | ML | Graph

1mo

Thanks for the awesome product. It is getting better. Privacy is gold.

2 Reactions

Jeebesh Chandra Podder

MLOps & AI Platform Engineer | Architecting & Deploying Scalable AI on AWS, Azure and GCP | RAG & Agentic AI

1mo

Loved this — super clear! 🙌 Curious: which Document Loader worked best for mixed PDFs + webpages in your experiments?

2 Reactions

Daniel Svonava

Build better AI Search with Superlinked | xYouTube

1mo

How flexible is it when document layouts vary widely? Curious because invoice/contract formats can be really inconsistent.

8 Reactions

Aashish Jangid

1mo

Accuracy is very good, thanks for sharing.

1 Reaction

Kürşad Laçin

Senior Forward Deployed Agentic Engineer (FDAE) - Allianz Türkiye | M.Sc. Universität Passau

1mo

Burak Bolat Hilal Onur Cunedioğlu Fatih Kıyıkçı

4 Reactions


More Relevant Posts

Saeed Kasmani, Ph.D.

Let’s Innovative with AI | AI Leader | Advisor | Mentor |Ex-Redhatter |Ex-CSIRO researcher
1mo
Very cool 🚀 — this feels like a must-have tool for any data or AI engineer working with documents in enterprise settings. Extracting structured fields directly from messy PDFs or images without extra conversion is a huge time saver. Excited to see how this evolves! #AI #Docling #DocumentAI #DataEngineering #MLOps #IBMResearch
Harish Pillay
1mo
Good to see the steady progress Docling is making. These are important innovations that make running AI tools locally far more efficient, with added privacy and confidentiality built in. #FOSS ftw!
David Robson

Healthcare Systems Integration | Healthcare AI | LLM | HL7 Specialist | Data Analyst | Statistics | Biology
1mo
Interesting update from Docling on structured data extraction:
🔹 No need to convert to text or JSON first.
🔹 Define free-form schemas and extract exactly the fields you need.
🔹 Looks perfect for messy → structured workflows.
I can already see use cases in areas where documents are complex and schema alignment is critical. Definitely worth a look 👇
Sugato Ray

VP, Data Scientist @ Truist | Physicist | MBA | MSc Physics | Data Science, ML and AI | Computer Vision | ex-IBM | IITB
1mo Edited
🔥 Cool update in Docling: Structured Data Extraction. Docling helps with your document AI needs. It’s an open-source project from IBM and has had collaboration from Hugging Face.
👉 Context in, JSON out.
🍓 Check out the example here: https://lnkd.in/g32tjatY
🎁 GitHub Repo: https://lnkd.in/gvJTRGEC
#LLMs #structuredData #JSON #docling #documentAI #ml
Shirin Khosravi Jam

Sr. Data Scientist/ AI Engineer | Data Science, RAG, AI Agents, & MLOps | Germany’s Top Female Voice in AI 🇩🇪
1mo
Spent 6 months learning RAG the hard way. Here's the checklist I wish I had from day one (valid for interviews as well) 👇

Reality Check:
RAG = Information Retrieval + LLM grounding
Goal: Fetch the closest, most useful context to ground your LLM
Why: Token limits, fewer hallucinations, private data

But here's how it's really built in production (not just OpenAI + vector DB):

1️⃣ Problem Validation
→ What's the user problem?
→ Do we really need RAG? Rules or simple search might work better
→ What's the feature we're enabling?

2️⃣ Data Audit & Analysis
→ Document types, lengths, structure, freshness
→ Text + images? OCR/vision embeddings needed
→ Clean? Complete? Extraction required?

3️⃣ Feature Engineering (Chunking)
→ Section-aware chunks (title + content)
→ Keep fields separate for boosting
→ Metadata extraction, HTML cleanup, language detection

4️⃣ Baseline & Prototyping
→ Create query ↔ chunk pairs (your ground truth)
→ Try BM25 (lexical), then vector similarity (semantic)
→ Score with Recall@k, nDCG, MRR
→ Start small: get 10-20 examples right

5️⃣ Retrieval Strategy
→ BM25 → vectors → hybrid → re-rankers
→ Tune hyperparameters (k1/b of BM25, top-k, similarity thresholds)
→ Add business logic, filters, fallback mechanisms

6️⃣ LLM Integration
→ Model selection (OpenAI, Claude, Mistral)
→ Prompt design + guardrails
→ Handle hallucinations, costs, context limits

7️⃣ Evaluation (Critical!)
→ Offline: nDCG, Recall@k, latency, cost/query
→ Online: A/B tests, user feedback, RAGAS scores
→ Citation correctness, success rates

8️⃣ Production Infra
→ Vector DB setup (not just local FAISS)
→ Data pipelines, CI/CD, monitoring
→ Logs, alerts, dashboards (latency, failure rates)
→ Schema validation, PII redaction, timeouts

9️⃣ Feedback & Iteration
→ Mine real queries + hard negatives
→ Continuous improvement: embeddings, re-ranking
→ New data? Repeat the preprocessing pipeline
→ Requirements will change (they always do)

Bonus Realities
→ Legal/GDPR compliance (always remember this constraint)
→ Explaining retrieval logic to non-tech teams
→ Stakeholder analytics requests
→ Cost optimization at scale

Bottom line: Start lexical, add semantic, fuse, then re-rank. Measure at every step. It's system design, not magic. The process never ends. That's the point. Keep iterating!

What's your biggest RAG challenge right now?

♻️ Repost and share to help others on a similar journey 😇
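The "fuse, then measure" advice in the checklist can be sketched in a few lines of plain Python. The post does not say which fusion method it has in mind; reciprocal rank fusion (RRF) is used here as one common choice, and the document ids, the constant k=60, and the function names are illustrative assumptions. The metrics are binary-relevance versions of Recall@k, reciprocal rank (MRR is its mean over many queries), and nDCG@k.

```python
import math


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids; each list adds 1 / (k + rank) per doc."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant docs that appear in the top k results."""
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant doc, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """nDCG@k with binary relevance (a doc is either relevant or not)."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(ranked[:k], start=1) if doc_id in relevant)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


# Made-up rankings for one query: a lexical (BM25-style) list and a vector-similarity list.
lexical_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]
relevant_docs = {"d1", "d3"}

fused = reciprocal_rank_fusion([lexical_ranking, vector_ranking])
fused_recall = recall_at_k(fused, relevant_docs, k=2)
lexical_ndcg = ndcg_at_k(lexical_ranking, relevant_docs, k=3)
```

Scoring the lexical list and the fused list against the same ground-truth pairs is what makes "measure at every step" actionable: each added stage either moves the numbers or it does not.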
Lior Alexander Lior Alexander is an Influencer

Covering the latest in AI Engineering/Research • MIT Lecturer • Building AlphaSignal into the largest source of news for AI devs
1mo
Finally, the Cursor for data scientists is here.

Software development and data science are not the same: most IDEs assume schemas live in your head, not your files. Notebooks break, kernels crash, and messy data derails progress.

Zerve AI is an agentic development environment for data scientists.
→ It generates code, orchestrates compute, and adapts to your workflow.
→ You stay in control: preview data, edit code, configure compute.

It comes with everything you need:
• Tracks data and code across every iteration
• Scales from one experiment to thousands in parallel
• Captures and versions every artifact and result for sharing
• Exposes workflows as secure APIs or interfaces
• Deploys in the cloud, on-prem, or self-hosted

Test → I gave it the task: “Summarize the latest developments in LLM training techniques.”

Zerve:
▸ Pulled recent research papers and GitHub repo updates
▸ Filtered for relevance automatically
▸ Ran summarization and scheduled it to repeat daily
▸ Used context from past SQL queries, notebooks, and pipelines

Go try it yourself, they have a free tier. Link in comments.
Bhagwat Chate

Generative AI & ML . Multi-Agent Orchestration . RAG . LLMOps . Cloud (AWS, Python)
1mo
LexiFlow Series: Post 1/15

When building LexiFlow, my document intelligence system, I realized something early: the toughest part is not APIs or cloud infra, it’s the data model.

In Designing Data-Intensive Applications (DDIA), Martin Kleppmann says: “The most important part of a data system is the data model it exposes.” That hit me hard.

In LexiFlow, a PDF is not just a file, it’s a knowledge unit. And that shift in perspective reshaped the whole architecture.

Our evolving data model became a layered structure:
🔹 Raw content: bytes, scanned pages, OCR text
🔹 Metadata: title, author, timestamps, compliance tags
🔹 Semantic vectors: embeddings powering retrieval & GenAI chat
🔹 Relationships: cross-document links, version comparisons, traceability

Why does this matter? Because every downstream decision flows from it:
1. Ingestion pipelines became smoother (adding new formats is just another data layer, not a redesign).
2. Semantic search scales naturally across thousands of contracts, research reports, or compliance docs.
3. Chat & comparison features feel natural, because they’re grounded in a consistent, extensible data model.

Instead of patching features on top, we’re building upwards from the core data model, just as DDIA emphasizes.

This reinforced a principle I now carry into every system I design: get the data model right, and features will follow. Get it wrong, and you’ll fight complexity forever.

This is Post #1 in my LexiFlow series, where I connect real engineering choices with timeless system design principles. In the coming days, I’ll share how we applied DDIA’s other foundations (reliability, scalability, maintainability) to LexiFlow’s AWS + FastAPI stack.

Next up → Query Models in LexiFlow: balancing flexibility vs performance.

Repo: 🔗 https://lnkd.in/dc9ST8YU

#GenAI #ArtificialIntelligence #LLMOps #MachineLearning #AIMLProjects #SystemDesign #DataIntensiveApplications #Scalability #Reliability #Maintainability #AWS #Fargate #CloudComputing #DevOps #MLOps #LexiFlow #OpenSource #TechLeadership #Innovation #AIProjects #BhagwatChate
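The four layers described in the post (raw content, metadata, semantic vectors, relationships) could be modeled roughly as below. This is a guess at the shape of such a record, not LexiFlow's actual code: the class name, field names, and the link_to helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    """One document treated as a layered knowledge unit (hypothetical structure)."""
    doc_id: str
    raw_text: str                                   # raw content layer, e.g. OCR text
    metadata: dict = field(default_factory=dict)    # title, author, timestamps, tags
    embedding: list = field(default_factory=list)   # semantic vector layer
    related_ids: set = field(default_factory=set)   # relationship layer

    def link_to(self, other: "DocumentRecord") -> None:
        """Record a bidirectional cross-document relationship."""
        self.related_ids.add(other.doc_id)
        other.related_ids.add(self.doc_id)


# Two versions of the same contract, linked for later comparison.
contract_v1 = DocumentRecord("c-1", "Original contract text", {"title": "Contract", "version": 1})
contract_v2 = DocumentRecord("c-2", "Amended contract text", {"title": "Contract", "version": 2})
contract_v1.link_to(contract_v2)
```

Keeping each layer as its own field is what lets a new capability, such as a new file format or embedding model, fill in one layer without reshaping the others.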
Bhagwat Chate

Generative AI & ML . Multi-Agent Orchestration . RAG . LLMOps . Cloud (AWS, Python)
1mo
GenAI 90 Days Challenge: Post 14/90
Series: Document Loaders and Text Splitting in LangChain

To build retrieval-augmented generation (RAG) pipelines, we need a reliable way to transform raw files, web pages, or APIs into usable formats for LLMs. LangChain provides document loaders and text splitters that standardize content and prepare it for embeddings, retrieval, and prompts.

Highlights from Post 14:

1. Document Loaders: Convert diverse formats like PDFs, HTML, JSON, DOCX, emails, and S3-hosted files into standardized Document objects. Each object contains page_content (text) and metadata (file name, URL, page number).

2. Common Loaders:
• UnstructuredFileLoader for PDFs, DOCX, PPTX, and emails
• BSHTMLLoader for cleaning HTML pages with BeautifulSoup
• PyMuPDFLoader for fine-grained PDF page loading
• JSONLoader for structured extraction with jq queries
• S3FileLoader for enterprise cloud storage ingestion
• RecursiveUrlLoader for crawling entire websites

3. Why Text Splitting Matters: LLMs have token limits. Splitting documents into smaller overlapping chunks preserves context, ensures efficiency, and improves retrieval quality.

4. Key Text Splitters:
• RecursiveCharacterTextSplitter: Preferred for natural breakpoints
• CharacterTextSplitter: Simple and fast for pre-cleaned text
• HTMLHeaderTextSplitter: Retains structure from HTML and Markdown headers

5. Chunking Best Practices: Keep 400–1000 tokens per chunk, maintain 10–20% overlap, avoid mid-sentence cuts, and always preserve metadata for traceability.

6. Real-World Pipeline: Load → Split → Embed → Retrieve → Prompt. This sequence is the backbone of production RAG systems.

Why does it matter? Proper loading and splitting directly impact retrieval accuracy, embedding quality, and final response coherence. Without structured ingestion and thoughtful chunking, even the best LLMs can produce irrelevant or inconsistent outputs.

Full details, loader examples, and pipeline code are inside the attached PDF.

Next, we continue with Post 15 in the GenAI series.

#GenAI #AgenticAI #SystemDesign #LangChain #RAG #90DaysChallenge #BhagwatChate #AI
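The chunk-size and overlap guidance above can be illustrated without any library. The sketch below is not LangChain's splitter and not the code from the attached PDF: it treats whitespace-separated words as a stand-in for tokens, and the function name and output layout (dictionaries with page_content and metadata keys, mirroring the Document fields mentioned in the post) are assumptions. It does not try to avoid mid-sentence cuts.

```python
def split_with_overlap(text, chunk_size=400, overlap=60, metadata=None):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append({
            "page_content": " ".join(words[start:start + chunk_size]),
            # Carry the source metadata forward so each chunk stays traceable.
            "metadata": {**(metadata or {}), "chunk_index": len(chunks)},
        })
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Tiny demo: 10 words, windows of 4 words that overlap by 1 word.
sample_text = " ".join(f"w{i}" for i in range(10))
chunks = split_with_overlap(sample_text, chunk_size=4, overlap=1, metadata={"source": "demo.txt"})
```

The defaults (400 words, 60 words of overlap) correspond to a 15% overlap, inside the 10–20% range the post recommends; a production splitter would count model tokens and prefer sentence or section boundaries.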
Iuri Almeida

Software Engineer | Python Specialist
4w
Big O Notation: do you really get it? 📈

🔎 What is it?
Big O is a way to describe the efficiency of an algorithm, showing how its time and space requirements grow as the input size grows.

⚡ What is it for?
It helps compare algorithms and make better choices: which one is faster, which uses less memory, which one scales better as the data increases.

💡 Why use it?
Because in practice, performance matters. A piece of code that works fine with 100 items may crash with 1 million. Big O gives us a mathematical lens to anticipate these scenarios.

👩‍💻 Who is it for?
For software engineers, data scientists, and developers who want to write scalable solutions and make conscious architecture decisions.

Classic examples of Time Complexity (execution time ⏱️)
• O(1) → constant: accessing an item in an array by index.
• O(log n) → logarithmic: binary search.
• O(n) → linear: iterating through a list.
• O(n log n) → quasi-linear: efficient sorting algorithms like Merge Sort.
• O(n²) → quadratic: nested loops, e.g., comparing each element with all others.
• O(2ⁿ) → exponential: brute-force recursion, like naive Fibonacci.

Examples of Space Complexity (memory 💾)
• An array of n elements → O(n) (needs space proportional to the number of elements).
• Simple variables → O(1) (constant space).
• Recursive algorithms → often include stack usage, which grows with input size.

⚡ In short: Big O isn’t just academic theory, it’s a practical tool for building systems that can grow and perform well at scale.

💬 What about you? Can you remember the last time you had to analyze the complexity of an algorithm in your code?

#BigO #algorithms #datastructures #softwareengineering #performance #dev
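One way to see the gap between O(n) and O(log n) from the list above is to count loop iterations instead of timing anything. The sketch below searches a sorted list of one million integers for its last element; the function names and the choice of target are illustrative.

```python
def linear_search_steps(items, target):
    """Return how many elements a linear scan inspects before finding target: O(n)."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Return how many midpoints binary search inspects on a sorted list: O(log n)."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return steps


data = list(range(1_000_000))  # already sorted
worst_case_target = data[-1]   # last element: the worst case for the linear scan

linear_steps = linear_search_steps(data, worst_case_target)
binary_steps = binary_search_steps(data, worst_case_target)
print(f"linear: {linear_steps} steps, binary: {binary_steps} steps")
```

The linear scan inspects every one of the million elements, while binary search needs at most about log2(1,000,000) ≈ 20 inspections, which is the difference the notation is meant to predict before the input ever gets that large.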
