Multilingual LLMs
This research area focuses on the evaluation and advancement of multilingual large language
models (LLMs), with a strong emphasis on linguistic inclusivity and real-world relevance. Efforts
go beyond English-centric benchmarks, incorporating low-resource languages and non-Western
cultures to better represent the world’s full linguistic and cultural diversity. Key areas include
the development of culturally grounded synthetic datasets and instruction tuning strategies that
enhance model performance across a wide range of languages and contexts. The research aims
to uncover hidden model failures and design solutions that ensure LLMs are more equitable and
robust for speakers of all languages.
PhD students with an interest in multilingual natural language processing, fairness and equity in AI,
and the representation of under-resourced languages and cultures are encouraged to explore
opportunities within this area.
AI Infrastructure
This research focus is centered on developing robust and efficient systems that support cutting-edge
AI and machine learning workloads, with a particular emphasis on the training and serving
of large language models (LLMs). The work spans multiple layers of the technology stack, from
ML-based algorithmic innovations to low-level kernel and systems implementations. A key
aspect of this research is designing systems that leverage an in-depth understanding of
workload characteristics, enabling the discovery of novel optimizations in the rapidly evolving
hardware and software landscape of AI. Recent efforts in this area include scheduling
optimizations to achieve high throughput and low latency in LLM serving, advanced memory
management techniques for large models, and near-zero overhead checkpointing for
distributed machine learning training.
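The near-zero-overhead checkpointing idea above can be sketched in miniature: snapshot training state quickly in memory, then persist it off the critical path so training is not stalled by storage I/O. The dict-based state and file path below are illustrative stand-ins for real tensor state and distributed storage:

```python
import copy
import json
import threading

def checkpoint_async(model_state, path):
    """Near-zero-overhead checkpointing sketch: take a fast in-memory
    snapshot, then write it to disk on a background thread. Real systems
    snapshot GPU tensors and overlap device-to-host transfers; this toy
    version just deep-copies a dict."""
    snapshot = copy.deepcopy(model_state)  # brief pause only for the copy
    def write():
        with open(path, "w") as f:
            json.dump(snapshot, f)
    t = threading.Thread(target=write)
    t.start()
    return t  # caller can join() before taking the next checkpoint

state = {"step": 100, "weights": [0.1, 0.2, 0.3]}
writer = checkpoint_async(state, "ckpt.json")
state["step"] = 101  # training continues immediately after the snapshot
writer.join()        # the snapshot on disk still reflects step 100
```

The key design point is that the training loop only pays for the in-memory copy; the slow durable write happens concurrently.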
PhD students interested in systems for AI, large-scale machine learning infrastructure,
algorithmic optimization, and end-to-end performance in LLM deployments are encouraged to
consider opportunities in this research area.
Grounded and Verifiable Reasoners
This research focus addresses the critical challenge of ensuring that advanced reasoning
models produce outputs firmly grounded in real-world constraints, particularly when reasoning
over private or domain-specific data. The aim is to develop efficient reasoning models whose
outputs consistently align with domain knowledge, formal logic, and established systems of
reasoning, as well as to design verifiers that can rigorously assess this grounding. Key research
questions in this area include:
1. How do we train reasoning models such that their outputs are grounded and verifiable by
design?
2. How do we verify a model's reasoning output in structured domains like math and code,
where only final answers may be easily checkable?
3. Beyond math and code, how do we define and verify grounding in less structured tasks such
as document generation?
4. How do we build efficient reasoning models, including efficiency at both training and
inference time?
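Question 2 above, where only final answers are easily checkable, can be illustrated with a toy outcome-level verifier; the "Answer:" output format and the numeric tolerance are assumptions made for this sketch, not a standard:

```python
import re
from typing import Optional

def extract_final_answer(model_output: str) -> Optional[str]:
    """Pull the final answer from a reasoning trace that ends with
    'Answer: <value>'. The 'Answer:' convention is an illustrative
    assumption about the model's output format."""
    match = re.search(r"Answer:\s*(.+)", model_output)
    return match.group(1).strip() if match else None

def verify_math_answer(model_output: str, ground_truth: str) -> bool:
    """Outcome-level verification: only the final answer is checked, so a
    flawed chain of reasoning can still pass -- exactly the gap that
    grounded-by-design training and process-level verifiers aim to close."""
    answer = extract_final_answer(model_output)
    if answer is None:
        return False
    try:
        return abs(float(answer) - float(ground_truth)) < 1e-9
    except ValueError:
        return answer == ground_truth

trace = "12 * 7 = 84, then 84 + 16 = 100.\nAnswer: 100"
assert verify_math_answer(trace, "100")
```

A verifier like this says nothing about whether the intermediate steps were sound, which is precisely why grounding the reasoning itself is the harder open problem.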
PhD students with a strong interest in reasoning, verification, formal methods, and the
intersection of machine learning with structured and unstructured domains are encouraged to
explore opportunities within this research area.
GenAI for Education
This research focus explores the transformative potential of GenAI to improve education
globally, with particular emphasis on low-resource settings and serving the needs of the global
majority. The aim is to create innovative, equitable, and accessible AI-driven educational
solutions that empower both students and educators. Key objectives and areas of inquiry
include:
1. Understanding the Impact of GenAI in Education: Investigate how GenAI can support
educational practices, improve learning outcomes, and broaden access to quality education—
particularly in resource-constrained environments.
2. Enhancing Mathematical Reasoning: Develop GenAI tools that effectively assist students in
grasping complex mathematical concepts through step-by-step explanations, clear reasoning,
and accurate solutions.
3. Advancing Visual Reasoning in Education: Improve the capabilities of GenAI to understand
and explain complex visual content, such as geometric figures, diagrams, and charts—
especially relevant to STEM subjects.
4. Multimodal Content Generation: Create rich, interactive educational materials that integrate
text, images, videos, and other media as effective visual aids for educators.
5. Designing Accessible Learning Experiences: Co-design, iteratively improve, and evaluate
educational experiences for children and teachers in schools for the blind. This involves
generating accessible content—including audio-first, tactile-first, and Braille materials—and
ensuring effective delivery in both physical and hybrid educational settings.
Research initiatives such as Shiksha Copilot and Ludic Design for Accessibility reflect these
goals, advancing the mission to enhance learning, promote equity, and empower educators and
learners worldwide.
PhD students seeking to harness GenAI to transform education, especially in underserved and
low-resource settings, should explore this research area to develop equitable and accessible
AI-driven learning solutions.
LLMs for Healthcare
This research focus investigates the transformative applications of Large Language Models
(LLMs) in healthcare, with a vision of enabling faster, more accessible, and better-informed
decision-making across clinical and community settings. LLMs have the potential to
revolutionize care delivery—from empowering chatbots that offer reliable, easy-to-understand
health information to patients, to reducing clinical burden by supporting healthcare
professionals with timely assistance. Additionally, with advancements in multimodal
foundation models, LLMs can now integrate diverse data types, including biomedical imaging
and ECG lead data, to support clinicians in a growing range of tasks.
Key research directions include:
1. Patient Engagement and Education: Using LLM-powered conversational agents to answer
patient queries, provide health education, and support patient self-management, especially in
resource-constrained environments.
2. Clinical Documentation and Support: Assisting medical professionals in generating and
refining radiology reports, saving time, and improving the consistency and quality of service.
3. Decision Support for Community Health Workers: Delivering timely and accurate guidance
grounded in medical guidelines to frontline workers, strengthening the continuum of care
beyond hospital walls.
4. Medical Coding and Billing: Leveraging the reasoning capabilities of LLMs to power medical
coding and billing workflows for revenue cycle management (RCM) providers, further
streamlining administrative processes.
These research themes underscore the potential for LLMs to bridge gaps in healthcare access,
quality, and equity across diverse care settings.
PhD students interested in AI for healthcare, clinical decision support, health informatics, or
equitable technology implementation are encouraged to explore opportunities within this area.
Next-Generation Retrieval Models for Chat, Search, and Recommendation
This research aims to tackle complex, large-scale challenges in information retrieval. The
primary goal is to achieve state-of-the-art retrieval accuracy and efficiency across various
applications, including search, recommendation systems, and retrieval-augmented generation
(RAG) within Copilot experiences.
Specific focus areas include:
1. Generative retrieval, an alternative to dense retrieval that seeks to directly generate the
identifiers of the most relevant documents for a given input (e.g., Scaling the Vocabulary of Non-
autoregressive Models for Efficient Generative Retrieval [KDD ’25]).
2. New transformer architectures for improved retrieval accuracy and reduced latency across
multiple applications.
3. Methods for injecting parametric knowledge into language models (e.g., MOGIC: a Metadata-
infused Oracle Guidance framework for Improved Extreme Classification [ICML ’25]).
4. Efficient similarity search over high-dimensional datasets, incorporating techniques in high-
dimensional geometry, graph-based algorithms, vector quantization, and data compression
(e.g., DiskANN: fast accurate billion-point nearest neighbor search on a single node [NeurIPS ’19]).
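The exact-search baseline that approximate indexes such as DiskANN are designed to accelerate can be sketched in a few lines of plain Python; the tiny corpus and three-dimensional vectors below are illustrative:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def brute_force_top_k(query, corpus, k=2):
    """Exact nearest-neighbor search: score every vector, keep the best k.
    Cost grows linearly with corpus size, which is why billion-scale search
    relies on approximate indexes (graphs, quantization, compression)."""
    scored = sorted(corpus.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(brute_force_top_k([1.0, 0.05, 0.0], corpus))  # doc_a and doc_b rank highest
```

At a billion points this exhaustive scan is infeasible per query, which motivates the graph-based and quantization techniques listed in item 4.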
PhD applicants with interests in information retrieval, large language models, system
optimization, or novel algorithms for search and recommendation are encouraged to consider
joining this research area.
Reliable Agentic Systems
This research theme focuses on advancing the reliability of agentic systems—AI agents
powered by large language models (LLMs) capable of autonomous decision-making and tool
use. A full-stack approach is taken, innovating across infrastructure, model design, reasoning
strategies, and real-world applications. Current research focuses on two core areas:
1. Agentic Reasoning: enhancing the ability of LLM agents to plan, adapt, and coordinate
actions across multi-step tasks. By integrating reasoning and tool use through reinforcement
learning, significant improvements in both performance and interpretability can be realized.
This results in more accurate and robust systems that set new benchmarks in complex problem
solving (Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning - Microsoft
Research).
2. Agentic Safety: understanding and mitigating the risks posed by autonomous agents in
everyday tasks. By building a benchmark suite to rigorously evaluate safety and potential harms
across domains such as web interaction, code generation, and textual reasoning, a
comprehensive risk assessment can be achieved. In parallel, developing mitigation techniques
to reduce unsafe behaviors and improve overall agent alignment ensures more reliable and
trustworthy systems.
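A minimal version of the reason-act-observe loop behind such agents can be sketched as follows; the scripted fake_model and the Action/Final text format are illustrative stand-ins for a real LLM policy and tool-call protocol:

```python
import re

# Toy tool registry; a restricted eval stands in for a real calculator tool.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def fake_model(history):
    """Scripted stand-in for an LLM policy: call a tool once, then answer."""
    if "Observation:" not in history:
        return "Action: calculator[6 * 7]"
    return "Final: 42"

def run_agent(task, max_steps=5):
    """Reason-act-observe loop: alternate model turns and tool calls until
    the model emits a final answer or the step budget runs out."""
    history = f"Task: {task}"
    for _ in range(max_steps):
        turn = fake_model(history)
        if turn.startswith("Final:"):
            return turn.removeprefix("Final:").strip()
        match = re.match(r"Action: (\w+)\[(.+)\]", turn)
        tool, arg = match.group(1), match.group(2)
        history += f"\n{turn}\nObservation: {TOOLS[tool](arg)}"
    return None

print(run_agent("What is 6 * 7?"))  # -> 42
```

The step budget and the tool registry are the two obvious safety levers even in this toy loop: bounding autonomy and restricting which actions the agent may take.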
PhD students interested in AI safety, reinforcement learning, human-AI interaction, or scalable
agentic reasoning will find this research area especially valuable.
Practical Cryptography and AI Security
This research focuses on various problems in practical cryptography (e.g., secure multi-party
computation, differential privacy) and security problems in AI systems (e.g., cryptographic
solutions for preventing information leakage in AI systems). Some recent works include Project
EzPC and Private Benchmarking for AI. A variety of problems are pursued, requiring either the
development of new cryptographic protocols or the design of secure systems based on sound
cryptographic principles.
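The flavor of secure multi-party computation can be conveyed with a toy additive secret-sharing scheme; the prime field and two-party split below are illustrative choices, far simpler than what production frameworks such as EzPC implement:

```python
import secrets

# Toy additive secret sharing over a prime field -- the core primitive
# behind many secure multi-party computation protocols. The modulus and
# the two-party setting are illustrative.
PRIME = 2**61 - 1  # a Mersenne prime, chosen for convenience

def share(secret, num_parties=2):
    """Split a secret into random shares that sum to it mod PRIME.
    Any proper subset of shares reveals nothing about the secret."""
    shares = [secrets.randbelow(PRIME) for _ in range(num_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares mod PRIME."""
    return sum(shares) % PRIME

# Each party adds its shares locally; reconstructing the result yields the
# sum of the secrets without either input ever appearing in the clear.
a_shares, b_shares = share(25), share(17)
sum_shares = [(x + y) % PRIME for x, y in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 42
```

Addition is the easy case because it commutes with the sharing; multiplication and comparisons are where real protocol design (and most of the research) begins.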
PhD students with a strong interest in cryptography, privacy-preserving technologies, and AI
security are encouraged to engage with this topic.