Introduction to the Thunder Cloud Project Technical Committee
Atomu Hidaka
This document introduces the activities of the Thunder Cloud Project Technical Committee, which were explained at the community booth at the Microsoft Developer Day "Maximizing Developer Power with AI" held in November 2024.
The Thunder Cloud Project has focused on promoting DX and IoT, but recently has also been actively working to promote AI.
In particular, we introduce an example of data analysis using RAG, in which EnOcean IoT data in CSV format was annotated with its meaning and imported using Azure AI Studio.
The Thunder Cloud Project is incorporating the latest technologies in this way to promote DX and IoT in various areas.
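As a rough illustration of what "attaching meaning" to IoT CSV data before RAG ingestion could look like, the following minimal Python sketch turns each CSV row into a descriptive JSON line. The column names, the meanings, and the `id`/`content` field names are all hypothetical; they are not taken from the actual sensor data or from any Azure AI Studio schema.

```python
import csv
import io
import json

# Hypothetical column meanings; a real sensor export will have different columns.
COLUMN_MEANINGS = {
    "device_id": "sensor identifier",
    "temperature": "temperature in degrees Celsius",
    "humidity": "relative humidity in percent",
}


def rows_to_jsonl(csv_text, meanings):
    """Turn each CSV row into one JSON line holding a descriptive sentence."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Replace each bare column name with its human-readable meaning.
        parts = [f"{meanings.get(col, col)} is {val}" for col, val in row.items()]
        record = {"id": row.get("device_id", ""), "content": "; ".join(parts)}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Made-up sample data for illustration only.
SAMPLE_CSV = "device_id,temperature,humidity\nroom-a,22.5,40\nroom-b,19.0,55\n"

if __name__ == "__main__":
    print(rows_to_jsonl(SAMPLE_CSV, COLUMN_MEANINGS))
```

The idea is that a retrieval step matches natural-language questions more easily against sentences such as "temperature in degrees Celsius is 22.5" than against bare numeric columns.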
In video and CG production, AI technology is beginning to be used for a variety of automation and creative tasks. AI itself also has properties that differ from conventional digital production, so it is important to understand AI correctly as a tool. This session introduces existing AI technologies and explains how to bring them into creative work and how to understand them.
Dev Containers Customization Short version
Takao Tetsuro
This session describes how to customize dev containers. There are three ways to containerize a project.
The first is to add a dev container configuration to an existing program. The second is to define the base image of the dev container, choosing the programming language runtime your project requires, and then add features. The last is to create a program from a .NET template and then turn it into a dev container.
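A dev container configuration is normally a `.devcontainer/devcontainer.json` file containing a base image and a set of features. The sketch below generates such a file from Python so the same structure can be reused across projects. The project name, the image tag, and the chosen feature are illustrative assumptions, not values from the session.

```python
import json


def build_devcontainer(name, image, features):
    """Return a devcontainer.json document as a Python dict."""
    return {
        "name": name,
        "image": image,
        # Each feature is keyed by its identifier; {} means default options.
        "features": {feature: {} for feature in features},
    }


config = build_devcontainer(
    name="sample-dotnet",  # hypothetical project name
    image="mcr.microsoft.com/devcontainers/dotnet:8.0",  # assumed base image tag
    features=["ghcr.io/devcontainers/features/github-cli:1"],  # example feature
)

if __name__ == "__main__":
    # Save the printed JSON as .devcontainer/devcontainer.json in the project.
    print(json.dumps(config, indent=2))
```

Swapping the `image` value changes the language runtime, and extending the `features` list adds tooling without rebuilding a custom image.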
Developers Containers for Basis, for team development.
Takao Tetsuro
Development containers (aka. Dev containers) are one option for streamlining team development. You can create a development environment that suits your development very easily with Visual Studio Code. In this session, we will provide examples of how to create several development environments and also explain sources for collecting information, such as the Dev Containers community.
A service mesh endpoint needs features such as logging, hardware abstraction, authentication, and authorization. Several cloud vendors provide these features as a service, and you can also use the Envoy server and the Istio service mesh Pilot feature. However, building a service mesh endpoint with the ASP.NET Core Web API minimal template is an efficient way to learn this cloud native architecture.
The Options Pattern can build a hierarchical structure of settings values. In the previous article [ASP .NET Core Options Pattern], the settings values of the .NET Generic Host created by the host builder were registered to the host as a service as they were and then used in the UI layer. However, the Options Pattern in .NET Core should be applied as the configuration service before the settings are registered to the host.
Microservices fit team development, and Atomic Design works well for microservices development if layout is divided from content.
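An Explanation of Containerization Design with WebAssembly, Blazor, and the WebAssembly System Interface
Takao Tetsuro
WebAssembly (WASM) and the WebAssembly System Interface (WASI) are one architecture for containerization. Like Docker and WSL (Windows Subsystem for Linux), they give your business logic mobility and scalability. As an example of building programs with mobility and scalability in mind, this session explains containerization using technologies such as Rust and Node.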
4. Phi-2:2.7billion
Foundation model
Microsoft Prometheus(GPT-4):1trillion
for Bing AI ?
Copilot:billions
Pretrained model
(Bing Chat Copilot:1.7billion)
Microsoft & NVIDIA Megatron-Turing NLG:530billion
Microsoft Turing NLG:17billion
Foundation model
Microsoft Research Data&AI
“Generate solutions from a wide range of options”
≠
“Fast inference calculations” and “Computations with low power consumption”
The power required for calculation is small and inference
calculations are fast. A task-specific AI that generates solutions
from a limited range of attributes.
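To make the link between parameter count and "low power, fast inference" concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the memory needed to hold the weights, using parameter counts listed on this slide. The 16-bit and 4-bit widths are assumptions chosen for illustration, and activations, caches, and runtime overhead are ignored.

```python
def weight_memory_gb(parameters, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return parameters * bits_per_weight / 8 / 1e9


# Parameter counts as quoted on the slide.
MODELS = {
    "Phi-2": 2.7e9,
    "Llama 2 (13B)": 13e9,
    "Megatron-Turing NLG": 530e9,
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        fp16 = weight_memory_gb(params, 16)
        int4 = weight_memory_gb(params, 4)
        print(f"{name}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

Under these assumptions a 2.7 billion parameter model needs only a few gigabytes for its weights, which is why small models are candidates for local or on-device inference while models with hundreds of billions of parameters are not.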
AGI
Artificial
General
Intelligence
To get closer to human thinking using AI orchestration.
The purpose is to exceed the limits of AI (as of 2023, it is said to be “Baby AGI”)
MAI
multimodal
Artificial
Intelligence
Services in the area of advertising
creative production:13billion
Cyber Agent Japanese LLM:6.8billion
Llama 2:7/13/70 billion
Foundation model
GPT-4:1.76trillion
Foundation model
GPT-5:17trillion? (GPT3x100)
Foundation model
Google PaLM2:340 billion
Exact numbers are unknown due to internal document leak
Technical report says
PaLM 2-L(Unicorn):340billion
PaLM 2-M(Bison): 147billion
PaLM 2-S(Gecko):30billion
Google FLAN-UL5:50billion FLAN-UL2:20billion
Google Pathways:540billion
Claude 2:52billion
Foundation model
BloombergGPT:50billion
Finance
AWS Titan Foundation Model:100billion
Amazon Olympus:2trillion
for Alexa ?
Amazon Alexa model:20billion
AWS Titan Foundation Model
IBM Japanese LLM:8billion
IBM Granite:13billion
Foundation model
IBM watsonx Code Assistant for Z:20billion
115Languages support
Such as gradual converting Cobol to Java
Oracle Text Embeddings:355million
Oracle Text Generation:52billion
Oracle Text Summarization: 52billion
Large scale Specialized
Gemini Ultra:540billion Gemini Pro:60billion Gemini Nano-1:1.8billion
Gemini Nano-2:3.25billion
Orca-2:13billion 7billion
Small model trainer model
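5. Apple is investing in AI, for example by acquiring the AI video compression startup WaveOne and hiring John Giannandrea, formerly Google's head of search.
Similarly, at Google I/O 2023, Google announced that Magic Eraser, which began in Google Photos, had been upgraded to Magic Editor, and the Pixel 8 already ships with the Tensor G3 chip.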
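Real-time translation of conversations
Generating email summaries
Generating meeting records
Photo editing and shooting assistance
Identity verification is required
An internet connection is required
Depends on cloud computing capacity
Latency occurs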
Magic Editor in Google Photos
https://blog.google/products/photos/google-photos-magic-editor-pixel-io-2023/
Apple unveils M3, M3 Pro, and M3 Max
https://www.apple.com/newsroom/2023/10/apple-unveils-m3-m3-pro-and-m3-max-the-most-advanced-chips-for-a-personal-computer/
Google Tensor G3
https://blog.google/products/pixel/google-tensor-g3-pixel-8/
10. Copilot for Fabric
Copilot for Microsoft 365
Sensitivity labels
DLP policies
Figure source: https://learn.microsoft.com/ja-jp/training/modules/introduction-end-analytics-use-microsoft-fabric/2-explore-analytics-fabric
Governance and compliance in Microsoft Fabric
https://learn.microsoft.com/ja-jp/fabric/governance/governance-compliance-overview
13. Information source is Microsoft 365 data
YES Microsoft 365 account access
Customize for using company’s data
YES No code / Low code
YES Azure AI Studio
NO Semantic Kernel programming
NO Copilot Studio
Microsoft Syntex
NO Identity Federation is complete
YES
with Entra ID
Entra ID controls access
Azure storages
Azure AI Studio
Semantic Kernel programming
with non-Entra ID
No code / Low code
YES
Any tools of identity provider (if possible)
NO
programming (LangChain, Semantic Kernel)
Connector is existing
Azure AI Studio
NO
Several AI schemas of data source in individual access permission
Multimodal AI orchestration programming
Trade-off with effort / Recommended
Legend
Step 1: Will you use Microsoft 365 data?
Step 2: Is customization required?
What is the development method?
15. AI service
Orchestration
Models
(Vector Embeddings,
NLP※1)
Vector Memory
Storage
Persistent Layer
Microsoft Copilot(AI orchestration)
(Microsoft 365 Copilot, …※2)
Copilot Studio
Copilot
(ex. GitHub X is Codex + GPT-4)
Copilot
Microsoft Azure tenant storage
(SharePoint, GitHub, OneLake)
Azure OpenAI Service
Azure AI Studio
Open AI
(GPT3.5, 4)
Azure AI Search
JSONL file / Azure BLOB
Programming area
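When introducing AI, design the data layer, the AI service, the vector embedding capability, and the accounts that can access these resources.
※1: NLP (Natural Language Processing)
Quantifies characters and text through vector embeddings and performs tasks such as sentiment analysis, machine translation, and text classification. Through training, it becomes capable of common sense, language understanding, and logical reasoning.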
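The following toy Python sketch shows the core operation behind a vector memory store: text is represented as a vector and the closest stored item is found by cosine similarity. The three-dimensional vectors and the entry names are hand-made placeholders. A real system would obtain embeddings from an embedding model and keep them in a vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional vectors standing in for real embeddings.
MEMORY = {
    "room temperature report": [0.9, 0.1, 0.0],
    "meeting minutes": [0.1, 0.8, 0.3],
    "photo editing tips": [0.0, 0.2, 0.9],
}


def search(query_vector, memory, top_k=1):
    """Return the keys of the top_k entries most similar to the query."""
    ranked = sorted(
        memory,
        key=lambda key: cosine_similarity(query_vector, memory[key]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(search([0.8, 0.2, 0.1], MEMORY))
```

In a RAG setup, the text attached to the best-matching entries is what gets passed to the language model as context.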
※2:Windows Copilot, GitHub Copilot,
Security Copilot, Bing Chat Copilot,
Power Platform Copilot, Dynamics 365
Copilot, Microsoft Syntex, Copilot for
Azure, Fabric Copilot (Copilot for Data
Science and Data Engineering, Copilot for
Data Factory, Copilot for Power BI)
Custom Web UI
Semantic Kernel
Phi-2
(& SLM container)
Cosmos DB
Ollama (described later)
MongoDB
30. docker pull milvusdb/milvus
Ollama features
GPU Acceleration
Effortless Model Management
Automatic Memory Management
Support for a Wide Range of Models
Effortless Setup and Seamless Switching
Accessible Web User Interface (WebUI) Options
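As a minimal sketch of how an application might talk to a locally running Ollama server, the Python code below builds and sends a request to Ollama's local HTTP generate endpoint using only the standard library. The default port 11434 and the model name `phi` (assumed here to be a Phi-2 build that has already been pulled) are assumptions about the local setup and not details from the slides.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="phi"):
    """Build the HTTP request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="phi"):
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Explain RAG in one sentence.")` only works while the Ollama server is running and the model is available locally. Otherwise `urlopen` raises an error.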
36. OpenAI Brand guidelines
https://openai.com/brand
Google Brand Resource Center: Logos list
https://about.google/brand-resource-center/logos-list/
PaLM 2 Technical Report
https://ai.google/static/documents/palm2techreport.pdf
MongoDB Brand Resources
https://www.mongodb.com/brand-resources
Gemini Cheat Sheet: Google’s State-of-the-Art Multimodal Assistant Explained
https://gradientflow.com/gemini-cheat-sheet-googles-state-of-the-art-multimodal-assistant-explained/
Microsoft Research Data&AI
https://www.microsoft.com/en-us/research/group/dataai/
Microsoft Copilot Studio
https://www.microsoft.com/en-us/microsoft-copilot/microsoft-copilot-studio
Azure AI Studio
https://azure.microsoft.com/ja-jp/products/ai-studio
Quickstart: Chat with Azure OpenAI models using your own data
https://learn.microsoft.com/ja-jp/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython&pivots=programming-language-studio
THE BALANCING ACT OF TRAINING GENERATIVE AI
https://www.nextplatform.com/2023/07/17/the-balancing-act-of-training-generative-ai/
[Oracle Cloud Webinar] Oracle AI infrastructure delivering outstanding cost performance for generative AI such as LLMs (large language models)
https://speakerdeck.com/oracle4engineer/ocwc_20231004_generativeai?slide=7
Pretrained Foundational Models in Generative AI
https://docs.oracle.com/en-us/iaas/Content/generative-ai/pretrained-models.htm
VizSeek: AI-based visual search platform deployment on Oracle Cloud
https://docs.oracle.com/en/solutions/vizseek-on-oci/index.html#GUID-8F7CCB28-AAC9-4317-AD90-39246E19D29A
37. Oracle’s generative AI strategy
https://blogs.oracle.com/ai-and-datascience/post/generative-ai-strategy
Azure AI Search
https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search
Chroma
https://docs.trychroma.com/
Pinecone (C#)
https://about.google/brand-resource-center/logos-list/
Postgres (C#)
https://about.google/brand-resource-center/logos-list/
Qdrant (C#)
https://about.google/brand-resource-center/logos-list/
Redis (C#)
https://about.google/brand-resource-center/logos-list/
SQLite (C#)
https://about.google/brand-resource-center/logos-list/
Weaviate (C#) and for Python
https://about.google/brand-resource-center/logos-list/
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
https://arxiv.org/abs/2201.11903
Orca-2: Teaching Small Language Models How to Reason
https://www.microsoft.com/en-us/research/publication/orca-2-teaching-small-language-models-how-to-reason/
TensorFlow Hub
https://www.tensorflow.org/hub?hl=en
MODEL ZOO
https://pytorch.org/serve/model_zoo.html
38. How to Deploy Computer Vision Models Offline
https://blog.roboflow.com/deploy-computer-vision-models-offline/
Use metadata to find content in document libraries in Microsoft Syntex
https://learn.microsoft.com/en-us/microsoft-365/syntex/metadata-search
Key concepts - Use Power Automate connectors in Microsoft Copilot Studio (Preview)
https://learn.microsoft.com/en-us/microsoft-copilot-studio/advanced-connectors
Manage your multi-cloud identity infrastructure with Microsoft Entra
https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/manage-your-multi-cloud-identity-infrastructure-with-microsoft/ba-p/3709677
Customize a model with fine-tuning
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo%2Cpython&pivots=programming-language-studio
Microsoft Copilot for Microsoft 365 overview
https://learn.microsoft.com/en-us/microsoft-365-copilot/microsoft-365-copilot-overview
Introducing BloombergGPT, Bloomberg’s 50-billion parameter large language model, purpose-built from scratch for finance
https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/
GPTQ: ACCURATE POST-TRAINING QUANTIZATION FOR GENERATIVE PRE-TRAINED TRANSFORMERS
https://arxiv.org/pdf/2210.17323.pdf
Hugging Face
https://huggingface.co/
PyTorch Zoo
https://www.microsoft.com/en-us/research/publication/orca-2-teaching-small-language-models-how-to-reason/
Introducing Atlas Vector Search: Build Intelligent Applications with Semantic Search and AI Over Any Type of Data
https://www.mongodb.com/blog/post/introducing-atlas-vector-search-build-intelligent-applications-semantic-search-ai
Microsoft.SemanticKernel.Connectors.MongoDB
https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/Connectors/Connectors.Memory.MongoDB
39. No GPUs needed: develop generative AI apps on local CPUs with localllm
https://cloud.google.com/blog/ja/products/application-development/new-localllm-lets-you-develop-gen-ai-apps-locally-without-gpus?hl=ja
Docker hub: ollama/quantize
https://hub.docker.com/r/ollama/quantize
GitHub: Ollama WebUI
https://github.com/open-webui/open-webui
Open WebUI (Formerly Ollama WebUI)
https://github.com/open-webui/open-webui
Introducing Gemini: our largest and most capable AI model
https://blog.google/technology/ai/google-gemini-ai/#sundar-note
Editor's Notes
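#2: We run Phi-2, Microsoft's SLM, locally. I explain the architecture of the data layer, the language model, Semantic Kernel, and the UI, and discuss how it relates to RAG and vector indexes. The January retrospective includes some additional information.
#10: Of course, Microsoft, which is enterprise ready, likewise offers a full set of services covering everything from simple AI use to comprehensive use of large-scale enterprise data. These range from using Microsoft 365 data to integration, through Copilot Studio, with Power Platform as the business intelligence foundation. Data in Delta Parquet format unified on OneLake can be used by Synapse Data Science to create and manage AI models. A custom Copilot created in Azure AI Studio with such a model can be reused in Copilot Studio, and you can also build a flow that uses it from Copilot for Fabric. Data in OneLake can use the Purview capabilities already deployed in the enterprise as they are [explain: information protection and data loss prevention].
#12: First, the easiest way for an enterprise to use Microsoft 365 data with AI is to export the term store and import it into Add your data in Azure OpenAI Service. However, as you can see on the screen on the left, this is a feature of the SharePoint admin center, so the Microsoft 365 administrator role is required. Some companies do not grant this role.
#15: On the point that security, privacy, and corporate governance are strongly protected, as they are for Microsoft 365 data, there is content I recommend. It is a design concept called the Microsoft 365 Maturity Model, which defines maturity levels from 100 to 500. I suggest that you assess the level of your own company or your customers, then propose and carry out initiatives toward a higher maturity level.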