AI can do skilled work, claims OpenAI's GDPval

1,083 followers

Can AI truly do the work of skilled professionals? OpenAI says yes, and they have data to prove it. Introducing: GDPval.

👉 Real-world tasks built by experts
👉 Models perform equal to or better than humans nearly half the time
👉 100x faster and cheaper

Paul Roetzer and Mike Kaput unpack the implications for the workforce, the economy, and beyond in Episode 170 of The Artificial Intelligence Show. https://lnkd.in/g92acWV7

[The AI Show Episode 170]: How ChatGPT Is Used at Work, New GDPval Benchmark, AI “Workslop,” ChatGPT Pulse, Meta Vibes & More AI Economy Warnings podcast.smarterx.ai


More Relevant Posts

William Mo, MBA, PMP

Technical Program Management | Platform Engineering | AI Infrastructure
1w
The future of AI isn't about replacing the worker, it's about making your best worker scalable. 🚀

My key takeaway from watching the "OpenAI on OpenAI" presentation is clear: AI is moving to a new runway, one where deep integration into daily workflows is essential for productivity and future growth. The big shift is from merely using AI to drive efficiency (the "old" question) to using it to amplify expertise (the new, crucial question) [01:44]. The goal is to capture the unique craft of your top operators (your "Sophies") and distribute it across the entire organization so everyone operates at the highest level [02:28].

OpenAI is doing this with internal applications that are truly woven into the workflow:
• The Go-to-Market Assistant: Codifies the best sales strategies into a system that integrates with Slack and ChatGPT, saving reps an entire day of work per week [13:20].
• Openhouse: Centralizes scattered institutional knowledge and connects employees with the right internal experts, enabling rapid access to institutional intelligence [19:24].
• Self-Improving Support: A system that learns from every customer ticket, directly updating its Standard Operating Procedures (SOPs) for continuous, scalable improvement [22:19].

We're in a "golden age of internal building." If you want to drive a 10x increase in your company's agility, start by finding your top performer, documenting their genius, and building an AI to distribute it. The power of AI is no longer a bolt-on feature; it's the new operating system for your team's best work.

What part of your company's internal expertise is ready to be amplified? #AI #Productivity #Workflow #DigitalTransformation #FutureofWork https://lnkd.in/g3h-Z_aD

OpenAI on OpenAI: Applying AI to Our Own Workflows

https://www.youtube.com/
Irving Wang (アーヴィング・ワン)

CEO & Founder at DeepCap Protocol | IPxWeb3
3w Edited
Inaugural Hybrid Operator study launched to measure how AI — from OpenAI to in‑house bots — changes workplace efficiency and productivity. Short (~5‑minute) survey sent to ~1.1M subscribers. Complete it for early access to results; we’ll repeat periodically to track change and follow with detailed analysis and tactical insights. Questions? Ping @Noam Segal. Quick, painless, and statistically curious — care to help us quantify whether AI is a productivity booster or an elegantly dressed distraction? #AI #Productivity #Workplace
Joe Fargnoli, Sr.

Assistant Professor - Department Coordinator, Management, Dean College, Dean R. Sanders ’47 School of Business
3w
This MIT study offers important insights into the AI transformation narrative. While tools like ChatGPT boost individual productivity, translating that to bottom-line impact remains challenging for most companies. The "workslop" phenomenon, low-effort AI-generated content, is worth discussing as teams integrate these tools.

AI tools aren't making much of a difference for companies fastcompany.com

Burhan Saiyed

Certified Director | Alumni & Advancements | Marketing & Narrative Builder
3w
AI-generated 'workslop' is destroying productivity and teams, researchers say. Just as you wouldn't drive an expensive four-wheel drive through tight, crowded streets, #ai is not meant to be used for every little thing. We've implemented AI without recognizing its viability and impact in the regular workplace and are now seeing pushback against it. Employees are using AI to generate workslop, or "wordslop", where a basic email becomes a very well-written and very wordy email that is just a waste of everyone's time. Similarly, thanks to the extremely generic nature of AI-generated responses, managers and leaders now have to spend more time fixing documents than it would have taken to write one manually. https://lnkd.in/deyRZsh2

AI-generated 'workslop' is here. It's killing teamwork and causing a multimillion dollar productivity problem, researchers say cnbc.com
Coleen Schofield
3w
I understand the rush to adopt all things AI to remain competitive. However, successful organizations will find the perfect mix of using AI to improve productivity for mundane tasks and relying heavily on the creativity and insights from real, live human beings to differentiate their offerings and make their clients look smart! https://lnkd.in/enSwe-u5

AI-generated 'workslop' is here. It's killing teamwork and causing a multimillion dollar productivity problem, researchers say cnbc.com

Sean McCormack

C-Suite Corporate Affairs & Communications Leader | Fortune 20 CCO | State Department Assistant Secretary and Spokesman | White House Deputy Spokesman and Boeing Exec | Strategic Advisor and Mentor
1w
Reading about OpenAI's deal stream, the battle for AI dominance is clearly on, but what if the future isn't about one winner? We often talk about the "best" AI model, but this race feels less like a competition and more like a specialization. As a user, I want the best model for my purposes, ideally one that "knows" my business. While prediction markets have models like Gemini and ChatGPT battling for the top spot with the general user audience, it seems businesses will increasingly want to refine their own specialized AI agents. Take a company's internal communications team, for example. Instead of relying on a generic model for everything, they could train a specialized AI agent on their unique voice, internal documents, and corporate culture. This could allow for personalized, on-brand messaging at scale and avoid workslop. Is it more valuable to have one tool that does a lot of things well, or a whole toolbox of specialized AI agents built for specific tasks? Or perhaps a company like OpenAI aims for the general user market while others focus on more specialized applications? I'm curious to hear what you think in the comments. 👇 #AI #GenerativeAI #MachineLearning #Tech #BusinessStrategy #Communications

demandDrive

7,494 followers
1w
Everyone’s chasing AI search visibility. But visibility alone doesn’t pay the bills. If you’re showing up in ChatGPT or Google AI Overviews but not connecting that visibility to pipeline… you’re just collecting vanity metrics. That’s why we built our AI Journey Optimization (AJO) framework: to turn AI mentions into measurable revenue. GEO gets you cited by AI. AJO connects it all back to real, measurable revenue. Read how AJO bridges visibility → engagement → conversion → revenue: https://hubs.ly/Q03MFLX90 #AIJourneyOptimization #GEO #AIforSEO #RevenueGrowth
AI Academy Asia

1,160 followers
2w Edited
🍁 October brings a wave of breakthrough AI releases. This week alone we've seen: OpenAI's Sora 2, Anthropic's Claude Sonnet 4.5, DeepSeek-V3.2, Instant Checkout in ChatGPT (with Stripe), Google DeepMind's Dreamer 4 and Zhipu AI's open-source model GLM 4.6.

But here's the bigger question: how fast are AI capabilities actually advancing? A study from METR offers a compelling answer. They measure model performance through the length of tasks models can complete. The results are striking:
→ The length of tasks models can complete (at 50% success) has grown exponentially over the past 6 years.
→ This capacity is doubling every 7 months.
→ Models now achieve near 100% success on tasks under 4 minutes.
→ Tasks over 4 hours? Still under a 10% success rate.

What this means: One compelling metric is to characterize any model by the length of human-equivalent tasks it can complete with x% probability. Today's frontier models, such as GPT-5, Grok 4, and Claude Opus 4.1, are at the top right of the graph, achieving the best performance by completing tasks up to an hour in length at a 50% success rate. Interestingly, Sonnet 4.5 arrived exactly 7 months after version 3.7, matching METR's doubling timeline.

Looking ahead 2-4 years (if trends hold), the research highlights:
→ Generalist AI agents will be capable of handling a wide range of week-long tasks.
→ Frontier AI systems will be capable of autonomously carrying out month-long projects.

The gap between "quick task" and "strategic project" is closing faster than most realize. Check out the full study at: https://lnkd.in/gWpeChk3
Temuulen (Temi) Munguu

Technical Program Manager @AI Academy Asia | Haverford Alum
2w
The METR study highlights something crucial that often gets lost in AI hype: we now have a promising quantifiable measure of progress. The 7-month doubling rate means strategic planning cycles need to account for this pace. If this trend holds, we're roughly 3-4 doubling periods away from AI handling week-long tasks, which will likely require different types of reasoning, error recovery and contextual judgment.
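The doubling arithmetic behind these timelines is easy to check by hand. The sketch below is only an illustration: it takes the roughly 1-hour horizon (at 50% success) and the 7-month doubling time quoted in the post, and it assumes that a "week-long task" means either 40 working hours or 168 calendar hours, a definition the post does not give. The function names are hypothetical.

```python
import math

DOUBLING_MONTHS = 7  # doubling time quoted in the post


def doublings_needed(current_hours: float, target_hours: float) -> float:
    """Number of doublings to grow the task horizon from current to target."""
    return math.log2(target_hours / current_hours)


def months_needed(current_hours: float, target_hours: float,
                  doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months until the target horizon is reached, if the trend holds."""
    return doublings_needed(current_hours, target_hours) * doubling_months


def horizon_after(current_hours: float, months: float,
                  doubling_months: float = DOUBLING_MONTHS) -> float:
    """Task horizon in hours after a given number of months, if the trend holds."""
    return current_hours * 2 ** (months / doubling_months)


if __name__ == "__main__":
    # Assumed definitions of a "week-long task"; neither comes from the post.
    for label, hours in [("40 working hours", 40), ("168 calendar hours", 168)]:
        n = doublings_needed(1, hours)
        m = months_needed(1, hours)
        print(f"{label}: {n:.1f} doublings, about {m:.0f} months")
```

Under these assumptions, a 40-hour week is about 5.3 doublings (roughly 37 months) from a 1-hour horizon, and a 168-hour week is about 7.4 doublings (roughly 52 months). Three to four doublings from 1 hour would reach 8 to 16 hours, so the count depends heavily on the starting horizon and on what "week-long" is taken to mean.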
Vera Vista Labs

137 followers
4w
Every day, thousands of people share their challenges and ideas on Reddit. Hidden in those posts are valuable insights, but most of them disappear in the noise. At Vera Vista Labs, we asked ourselves a simple question: 💡 What if we could capture those conversations, summarize them, and transform them into actionable solutions and project ideas? So we built an automated workflow using n8n + OpenAI: 🔎 It scans a specific subreddit for new posts 📝 Summarizes them into clear, concise takeaways 💡 Identifies problems and generates possible solutions or opportunities 📊 Stores everything in Google Sheets for easy access Instead of endless scrolling, we now have a stream of structured inspiration ready to spark innovation. This is how we turn conversations into opportunities for growth. #workflows #AI #n8n #reddit #Automation