Commit 8cda451

Merge branch 'pipecat-ai:main' into main
2 parents fc90bdc + 99d3227 commit 8cda451

233 files changed, +9,736 −2,283 lines


.github/workflows/tests.yaml

Lines changed: 2 additions & 2 deletions

@@ -1,4 +1,4 @@
-name: test
+name: tests

 on:
   workflow_dispatch:

@@ -49,4 +49,4 @@ jobs:
   - name: Test with pytest
     run: |
       source .venv/bin/activate
-      pytest --ignore-glob="*to_be_updated*" --ignore-glob=*pipeline_source* src tests
+      pytest

CHANGELOG.md

Lines changed: 198 additions & 2 deletions

@@ -5,12 +5,205 @@ All notable changes to **Pipecat** will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.0.55] - 2025-02-05

### Added

- Added a new `start_metadata` field to `PipelineParams`. The provided metadata
  will be set on the initial `StartFrame` being pushed from the `PipelineTask`.

- Added new fields to `PipelineParams` to control audio input and output sample
  rates for the whole pipeline. This allows controlling sample rates from a
  single place instead of having to specify sample rates in each service.
  Setting a sample rate on a service is still possible and will override the
  value from `PipelineParams`.

- Introduced audio resamplers (`BaseAudioResampler`). This is just a base class
  for implementing audio resamplers. Currently, two implementations are
  provided: `SOXRAudioResampler` and `ResampyResampler`. A new
  `create_default_resampler()` has been added (replacing the now deprecated
  `resample_audio()`).

- It is now possible to specify the asyncio event loop that a `PipelineTask` and
  all the processors should run on by passing it as a new argument to the
  `PipelineRunner`. This allows running pipelines in multiple threads, each one
  with its own event loop.

- Added a new `utils.TaskManager`. Instead of a global task manager we now have
  a task manager per `PipelineTask`. In the previous version the task manager
  was global, so running multiple simultaneous `PipelineTask`s could result in
  dangling task warnings that were not actually true. In order for all the
  processors to know about the task manager, we pass it through the
  `StartFrame`. This means that processors should create tasks when they receive
  a `StartFrame`, but not before (because they don't have a task manager yet).

- Added `TelnyxFrameSerializer` to support Telnyx calls. A full running example
  has also been added to `examples/telnyx-chatbot`.

- Allow pushing silence audio frames before `TTSStoppedFrame`. This might be
  useful for testing purposes, for example, passing bot audio to an STT service,
  which usually needs additional audio data to detect that the utterance
  stopped.

- `TwilioSerializer` now supports transport message frames. With this we can
  create Twilio emulators.

- Added a new transport: `WebsocketClientTransport`.

- Added a `metadata` field to `Frame` which makes it possible to pass custom
  data to all frames.

- Added `test/utils.py` inside of the pipecat package.
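The thread-per-pipeline idea in the event loop entry above can be sketched with plain asyncio, no Pipecat imports. Here `fake_pipeline` is a hypothetical stand-in for running a `PipelineRunner`; the point is only that each thread owns its own event loop:

```python
import asyncio
import threading


def run_in_thread(name: str, results: dict) -> threading.Thread:
    """Run a (stand-in) pipeline on a dedicated event loop in its own thread."""

    async def fake_pipeline() -> str:
        # Hypothetical stand-in for `await PipelineRunner(...).run(task)`.
        await asyncio.sleep(0.01)
        return f"{name} done"

    def worker() -> None:
        loop = asyncio.new_event_loop()  # one event loop per thread
        asyncio.set_event_loop(loop)
        try:
            results[name] = loop.run_until_complete(fake_pipeline())
        finally:
            loop.close()

    thread = threading.Thread(target=worker, name=name)
    thread.start()
    return thread


results: dict = {}
threads = [run_in_thread(f"pipeline-{i}", results) for i in range(2)]
for t in threads:
    t.join()
```

Each worker thread completes independently; with the real API you would presumably pass the thread's loop to the `PipelineRunner` argument mentioned above.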
### Changed

- `GatedOpenAILLMContextAggregator` now requires keyword arguments. Also, a new
  `start_open` argument has been added to set the initial state of the gate.

- Added `organization` and `project` level authentication to
  `OpenAILLMService`.

- Improved the language checking logic in `ElevenLabsTTSService` and
  `ElevenLabsHttpTTSService` to properly handle language codes based on model
  compatibility, with appropriate warnings when language codes cannot be
  applied.

- Updated `GoogleLLMContext` to support pushing `LLMMessagesUpdateFrame`s that
  contain a combination of function calls, function call responses, system
  messages, or just messages.

- `InputDTMFFrame` is now based on `DTMFFrame`. There's also a new
  `OutputDTMFFrame` frame.

### Deprecated

- `resample_audio()` is now deprecated; use `create_default_resampler()`
  instead.

### Removed

- `AudioBufferProcessor.reset_audio_buffers()` has been removed; use
  `AudioBufferProcessor.start_recording()` and
  `AudioBufferProcessor.stop_recording()` instead.
### Fixed

- Fixed an `AudioBufferProcessor` issue that would cause crackling in some
  recordings.

- Fixed an issue in `AudioBufferProcessor` where the user callback would not be
  called on task cancellation.

- Fixed an issue in `AudioBufferProcessor` that would cause wrong silence
  padding in some cases.

- Fixed an issue where `ElevenLabsTTSService` messages would return a 1009
  websocket error by increasing the max message size limit to 16 MB.

- Fixed a `DailyTransport` issue that would cause events to be triggered before
  join finished.

- Fixed a `PipelineTask` issue that was preventing processors from being cleaned
  up after cancelling the task.

- Fixed an issue where queuing a `CancelFrame` to a pipeline task would not
  cause the task to finish. However, using `PipelineTask.cancel()` is still the
  recommended way to cancel a task.
### Other

- Improved unit test `run_test()` to use `PipelineTask` and
  `PipelineRunner`. There's now also some control around `StartFrame` and
  `EndFrame`. The `EndTaskFrame` has been removed since it doesn't seem
  necessary with this new approach.

- Updated `twilio-chatbot` with a few new features: use an 8000 sample rate and
  avoid resampling, and a new client useful for stress testing and for testing
  locally without the need to make phone calls. Also, added audio recording on
  both the client and the server to make sure the audio sounds good.

- Updated examples to use `task.cancel()` to immediately exit the example when a
  participant leaves or disconnects, instead of pushing an `EndFrame`. Pushing
  an `EndFrame` causes the bot to run through everything that is internally
  queued (which could take some seconds). Note that using `task.cancel()` might
  not always be the best option, and pushing an `EndFrame` could still be
  desirable to make sure the whole pipeline is flushed.
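The `task.cancel()` vs `EndFrame` trade-off described above can be illustrated with a plain asyncio queue (no Pipecat imports; `END` is a hypothetical sentinel standing in for an `EndFrame`): a sentinel lets the consumer drain everything already queued, while cancellation stops it immediately.

```python
import asyncio

END = object()  # hypothetical stand-in for an EndFrame-style sentinel


async def consumer(queue: asyncio.Queue, processed: list) -> None:
    # Process every queued item until the sentinel is seen.
    while (item := await queue.get()) is not END:
        processed.append(item)


async def main() -> tuple[int, int]:
    # Ending via sentinel: everything already queued still gets processed.
    q1, drained = asyncio.Queue(), []
    for i in range(5):
        q1.put_nowait(i)
    q1.put_nowait(END)
    await consumer(q1, drained)

    # Cancelling: the consumer stops right away and queued items are dropped.
    q2, dropped = asyncio.Queue(), []
    task = asyncio.create_task(consumer(q2, dropped))
    for i in range(5):
        q2.put_nowait(i)
    task.cancel()  # cancelled before the task ever runs
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(drained), len(dropped)


drained_count, dropped_count = asyncio.run(main())
print(drained_count, dropped_count)  # 5 items drained via sentinel, 0 after cancel
```

This mirrors why pushing an `EndFrame` can take some seconds (the queue is flushed) while `task.cancel()` exits immediately.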
## [0.0.54] - 2025-01-27

### Added

- To create tasks in Pipecat frame processors it is now recommended to use
  `FrameProcessor.create_task()` (which uses the new
  `utils.asyncio.create_task()`). It takes care of uncaught exceptions, task
  cancellation handling, and task management. To cancel or wait for a task there
  are `FrameProcessor.cancel_task()` and `FrameProcessor.wait_for_task()`. All
  of Pipecat's processors have been updated accordingly. Also, when a pipeline
  runner finishes, a warning about dangling tasks might appear, which indicates
  that some of the created tasks were never cancelled or awaited (using these
  new functions).

- It is now possible to specify the period of the `PipelineTask` heartbeat
  frames with `heartbeats_period_secs`.

- Added `DailyMeetingTokenProperties` and `DailyMeetingTokenParams` Pydantic
  models for meeting token creation in the `get_token` method of
  `DailyRESTHelper`.

- Added `enable_recording` and `geo` parameters to `DailyRoomProperties`.

- Added `RecordingsBucketConfig` to `DailyRoomProperties` to upload recordings
  to a custom AWS bucket.
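The supervised-task idea behind `FrameProcessor.create_task()` can be approximated with stdlib asyncio. This is a sketch of the pattern only; `create_supervised_task` is a hypothetical name, not Pipecat's implementation:

```python
import asyncio


def create_supervised_task(coro, *, registry: set, on_error) -> asyncio.Task:
    """Create a task that tracks itself and reports uncaught exceptions."""
    task = asyncio.create_task(coro)
    registry.add(task)  # anything left here at shutdown is a dangling task

    def _done(t: asyncio.Task) -> None:
        registry.discard(t)
        if t.cancelled():
            return  # cancellation is a normal way to stop a task
        if (exc := t.exception()) is not None:
            on_error(exc)  # a real task manager would log this


    task.add_done_callback(_done)
    return task


async def main() -> tuple[int, list]:
    errors: list = []
    registry: set = set()

    async def boom() -> None:
        raise RuntimeError("uncaught")

    task = create_supervised_task(boom(), registry=registry, on_error=errors.append)
    await asyncio.gather(task, return_exceptions=True)
    await asyncio.sleep(0)  # let done callbacks run
    return len(registry), [type(e).__name__ for e in errors]


dangling, error_names = asyncio.run(main())
```

A non-empty `registry` at runner shutdown is exactly the kind of signal behind the dangling-task warning mentioned above.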
### Changed

- Enhanced `UserIdleProcessor` with retry functionality and control over idle
  monitoring via the new callback signature `(processor, retry_count) -> bool`.
  Updated `17-detect-user-idle.py` to show how to use the `retry_count`.

- Added defensive error handling for `OpenAIRealtimeBetaLLMService`'s audio
  truncation. Audio truncation errors during interruptions now log a warning
  and allow the session to continue instead of throwing an exception.

- Modified `TranscriptProcessor` to use TTS text frames for more accurate
  assistant transcripts. Assistant messages are now aggregated based on bot
  speaking boundaries rather than LLM context, providing better handling of
  interruptions and partial utterances.

- Updated foundational examples `28a-transcription-processor-openai.py`,
  `28b-transcript-processor-anthropic.py`, and
  `28c-transcription-processor-gemini.py` to use the updated
  `TranscriptProcessor`.
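A callback with the `(processor, retry_count) -> bool` shape described for `UserIdleProcessor` might be driven like this. The `run_idle_monitor` loop is purely illustrative (not `UserIdleProcessor` internals); returning `False` stops idle monitoring:

```python
import asyncio


async def idle_callback(processor: object, retry_count: int) -> bool:
    # Returning False tells the processor to stop idle monitoring.
    print(f"user idle, retry {retry_count}")
    return retry_count < 3  # give up after three idle prompts


async def run_idle_monitor(callback, timeout: float = 0.01) -> int:
    """Illustrative driver: fire the callback on each idle timeout until it returns False."""
    retry_count = 0
    keep_going = True
    while keep_going:
        await asyncio.sleep(timeout)  # a real processor would wait for user activity here
        retry_count += 1
        keep_going = await callback(None, retry_count)
    return retry_count


retries = asyncio.run(run_idle_monitor(idle_callback))
```

In a real bot the callback would re-prompt the user (or escalate) on each retry instead of printing.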
### Fixed

- Fixed a `GeminiMultimodalLiveLLMService` issue that was preventing the user
  from pushing initial LLM assistant messages (using `LLMMessagesAppendFrame`).

- Added missing `FrameProcessor.cleanup()` calls to `Pipeline`,
  `ParallelPipeline`, and `UserIdleProcessor`.

- Fixed a type error when using `voice_settings` in `ElevenLabsHttpTTSService`.

- Fixed an issue where `OpenAIRealtimeBetaLLMService` function calling resulted
  in an error.

- Fixed an issue in `AudioBufferProcessor` where the last audio buffer was not
  being processed in cases where the `_user_audio_buffer` was smaller than the
  buffer size.

### Performance

- Replaced the audio resampling library `resampy` with `soxr`. Resampling a
  2:21 min audio file from 24 kHz to 16 kHz took 1.41 s with `resampy` and
  0.031 s with `soxr`, with similar audio quality.

### Other

- Added initial unit test infrastructure.
## [0.0.53] - 2025-01-18

### Added

-- Added `ElevenLabsHttpTTSService` and the
-  `07d-interruptible-elevenlabs-http.py` foundational example.
+- Added `ElevenLabsHttpTTSService`, which uses ElevenLabs' HTTP API instead of
+  the websocket one.

 - Introduced pipeline frame observers. Observers can view all the frames that go
   through the pipeline without the need to inject processors in the

@@ -1381,6 +1574,9 @@ async def on_connected(processor):

 ### Changed

+- `FrameSerializer.serialize()` and `FrameSerializer.deserialize()` are now
+  `async`.

 - `Filter` has been renamed to `FrameFilter` and it's now under
   `processors/filters`.

README.md

Lines changed: 5 additions & 11 deletions

@@ -2,7 +2,7 @@
   <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
 </div></h1>

-[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) <a href="https://app.commanddash.io/agent/github_pipecat-ai_pipecat"><img src="https://img.shields.io/badge/AI-Code%20Agent-EB9FDA"></a>
+[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) <a href="https://app.commanddash.io/agent/github_pipecat-ai_pipecat"><img src="https://img.shields.io/badge/AI-Code%20Agent-EB9FDA"></a>

 Pipecat is an open source Python framework for building voice and multimodal conversational agents. It handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences.

@@ -53,13 +53,7 @@ To keep things lightweight, only the core framework is included by default. If y
 pip install "pipecat-ai[option,...]"
 ```

-Or you can install all of them with:
-
-```shell
-pip install "pipecat-ai[all]"
-```
-
-Available options include:
+### Available services

 | Category | Services | Install Command Example |
 | ------------------- | -------- | ----------------------- |

@@ -87,7 +81,7 @@ Here is a very basic Pipecat bot that greets a user when they join a real-time s
 ```python
 import asyncio

-from pipecat.frames.frames import EndFrame, TextFrame
+from pipecat.frames.frames import TextFrame
 from pipecat.pipeline.pipeline import Pipeline
 from pipecat.pipeline.task import PipelineTask
 from pipecat.pipeline.runner import PipelineRunner

@@ -128,7 +122,7 @@ async def main():
     # Register an event handler to exit the application when the user leaves.
     @transport.event_handler("on_participant_left")
     async def on_participant_left(transport, participant, reason):
-        await task.queue_frame(EndFrame())
+        await task.cancel()

     # Run the pipeline task
     await runner.run(task)

@@ -195,7 +189,7 @@ pip install "path_to_this_repo[option,...]"
 From the root directory, run:

 ```shell
-pytest --doctest-modules --ignore-glob="*to_be_updated*" --ignore-glob=*pipeline_source* src tests
+pytest
 ```

 ## Setting up your editor

dev-requirements.txt

Lines changed: 3 additions & 2 deletions

@@ -1,10 +1,11 @@
 build~=1.2.2
-grpcio-tools~=1.69.0
+grpcio-tools~=1.67.1
 pip-tools~=7.4.1
 pre-commit~=4.0.1
 pyright~=1.1.392
 pytest~=8.3.4
+pytest-asyncio~=0.25.2
 ruff~=0.9.1
-setuptools~=75.8.0
+setuptools~=70.0.0
 setuptools_scm~=8.1.0
 python-dotenv~=1.0.1

examples/README.md

Lines changed: 2 additions & 2 deletions

@@ -39,10 +39,10 @@ Next, follow the steps in the README for each demo.
 | [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
 | [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
 | [Patient intake](patient-intake) | A chatbot that can call functions in response to user input. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
-| [Dialin Chatbot](dialin-chatbot) | A chatbot that connects to an incoming phone call from Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
+| [Phone Chatbot](phone-chatbot) | A chatbot that connects to PSTN/SIP phone calls, powered by Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
 | [Twilio Chatbot](twilio-chatbot) | A chatbot that connects to an incoming phone call from Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
 | [studypal](studypal) | A chatbot to have a conversation about any article on the web | |
-| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities | `python-websockets`, `openai`, `deepgram`, `silero-tts`, `numpy` |
+| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities. | Cartesia, Deepgram, OpenAI, Websockets |

 > [!IMPORTANT]
 > These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
Lines changed: 45 additions & 0 deletions (new file)

@@ -0,0 +1,45 @@
# Bot ready signaling

A simple Pipecat example demonstrating how to handle signaling between the
client and the bot, ensuring that the bot starts sending audio only when the
client is available, thereby avoiding the risk of cutting off the beginning of
the audio.

## Quick Start

### First, start the bot server:

1. Navigate to the server directory:
   ```bash
   cd server
   ```
2. Create and activate a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
3. Install requirements:
   ```bash
   pip install -r requirements.txt
   ```
4. Copy env.example to .env and configure:
   - Add your API keys
5. Start the server:
   ```bash
   python server.py
   ```

### Next, connect using the client app:

For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).

## Important Note

Ensure the bot server is running before using any client implementations.

## Requirements

- Python 3.10+
- Node.js 16+ (for JavaScript)
- Daily API key
- Cartesia API key
- Modern web browser with WebRTC support
Lines changed: 27 additions & 0 deletions (new file)

@@ -0,0 +1,27 @@
# JavaScript Implementation

Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).

## Setup

1. Run the bot server. See the [server README](../../README).

2. Navigate to the `client/javascript` directory:

   ```bash
   cd client/javascript
   ```

3. Install dependencies:

   ```bash
   npm install
   ```

4. Run the client app:

   ```bash
   npm run dev
   ```

5. Visit http://localhost:5173 in your browser.
