@Vasyl198 Vasyl198 commented Dec 24, 2025

????????? pytest-?????, ???????????? CI ??? GitHub Actions ? pre-commit ? ?????????? ??????????????. ??? ?????????? ??????, ?????????? ?????? ????? ? ???????????? CI.

Summary by CodeRabbit

  • Chores

    • Established automated CI workflow for code validation on push and pull requests.
    • Configured pre-commit hooks for code formatting and linting.
    • Added development dependencies and updated ignore files for environment secrets.
  • Tests

    • Introduced comprehensive test suite covering code execution, utilities, and CLI functionality.

coderabbitai bot commented Dec 24, 2025

Walkthrough

This PR establishes development infrastructure by adding a GitHub Actions CI workflow for automated linting and testing, configuring pre-commit hooks (Black, isort, Flake8), updating gitignore to exclude environment files, adding development dependencies, and introducing comprehensive test suites for code agent, smoke tests, utility formatters, and worker components.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CI/CD & Development Infrastructure: .github/workflows/ci.yml, .pre-commit-config.yaml, requirements-dev.txt | Adds GitHub Actions workflow triggering on push/PR to main/master, running linting (Black, isort, Flake8) and pytest on ubuntu-latest. Configures pre-commit hooks with Black 25.11.0, isort 5.12.0, Flake8 7.1.0 (max line-length 88), and standard file fixers. Adds dev dependencies: pytest, pillow, black, flake8, isort, pre-commit. |
| Git Configuration: .gitignore | Reinstates .DS_Store entry and adds .env to ignore local environment secrets. |
| Test Suite: tests/test_code_agent.py, tests/test_smoke.py, tests/test_utils_formatters.py, tests/test_worker.py | Introduces tests for code extraction/execution (extract_code_block, execute_code), smoke tests with mocked LMMAgent and external dependencies (pytesseract, pyautogui), formatter utility tests (parse_code_from_string, extract_agent_functions), and worker prediction tests with synthetic screenshot generation. Uses monkeypatching to avoid heavy runtime dependencies. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A burrow prepared with care today,
With workflows, tests, and linters at play.
Pre-commit hooks guard the code so clean,
The finest CI infrastructure ever seen!
This rabbit's den is properly secured,
With quality gates and tests assured. 🛡️

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main changes: adding tests, linters, CI workflow, and pre-commit configuration, all of which are substantiated by the file summaries. |
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
tests/test_smoke.py (2)

51-84: Code duplication: FakeLMMAgent is duplicated.

This class is identical to the one in tests/test_worker.py. Please see the refactoring suggestion in that file's review to extract this to a shared test utility module.


87-91: Code duplication: Screenshot utility is duplicated.

This function is nearly identical to _create_screenshot in tests/test_worker.py. Consider extracting to a shared test utility module.

🧹 Nitpick comments (4)
requirements-dev.txt (1)

1-6: Consider pinning dependency versions for reproducible builds.

Unpinned dependencies can lead to non-reproducible builds when newer versions introduce breaking changes or unexpected behavior.

🔎 Example with version constraints
-pytest
-pillow
-black
-flake8
-isort
-pre-commit
+pytest>=7.0.0,<9.0.0
+pillow>=10.0.0,<12.0.0
+black==25.11.0
+flake8>=7.0.0,<8.0.0
+isort==5.12.0
+pre-commit>=3.0.0,<5.0.0

Note: Adjust version ranges based on your compatibility requirements.
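
To make the pinning suggestion checkable, a small helper could flag requirement lines that carry no version specifier. This is only a sketch: `find_unpinned` is a hypothetical name, and the check looks for comparison operators rather than fully parsing requirement syntax.

```python
import re

# Any of the common version operators counts as "constrained" in this sketch.
_SPECIFIER = re.compile(r"(==|~=|!=|>=|<=|<|>)")


def find_unpinned(lines):
    """Return requirement entries that have no version specifier."""
    unpinned = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blanks and option lines such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if not _SPECIFIER.search(line):
            unpinned.append(line)
    return unpinned


sample = ["pytest", "black==25.11.0", "# tooling", "", "flake8>=7.0.0,<8.0.0"]
unpinned = find_unpinned(sample)
```

Run against the current requirements-dev.txt, a check like this would report all six entries.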

tests/test_worker.py (2)

11-43: Extract FakeLMMAgent to a shared test utility module.

The FakeLMMAgent class is duplicated in both tests/test_worker.py and tests/test_smoke.py. This violates the DRY principle and makes maintenance harder.

🔎 Suggested approach

Create a new file tests/conftest.py or tests/test_utils.py:

# tests/test_utils.py
class FakeLMMAgent:
    def __init__(self, engine_params=None, system_prompt=None, engine=None):
        self.messages = []
        self.system_prompt = system_prompt or "You are a helpful assistant."

    def reset(self):
        self.messages = [
            {
                "role": "system",
                "content": [{"type": "text", "text": self.system_prompt}],
            }
        ]

    def add_system_prompt(self, prompt):
        self.system_prompt = prompt

    def add_message(self, text_content=None, image_content=None, role=None, **kwargs):
        self.messages.append(
            {
                "role": role or "user",
                "content": [{"type": "text", "text": text_content}],
            }
        )

    def get_response(self, *args, **kwargs):
        return "<thoughts>thinking</thoughts><answer>```python\nagent.wait(0.5)\n```</answer>"

Then import and use in both test files.
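
As a rough illustration of how a shared fake could be swapped in, the sketch below patches a stand-in namespace. Assumptions: `mllm` here is a `types.SimpleNamespace` standing in for the real `gui_agents.s3.core.mllm` module, `build_agent` is a hypothetical helper, the fake is trimmed to two methods, and `unittest.mock` is used in place of pytest's `monkeypatch` so the snippet runs on the standard library alone.

```python
import types
from unittest import mock


class FakeLMMAgent:
    """Trimmed copy of the shared fake; returns a canned model reply."""

    def __init__(self, engine_params=None, system_prompt=None, engine=None):
        self.messages = []
        self.system_prompt = system_prompt or "You are a helpful assistant."

    def get_response(self, *args, **kwargs):
        return "<answer>agent.wait(0.5)</answer>"


# Stand-in for the real module (assumption); `object` marks the unpatched class.
mllm = types.SimpleNamespace(LMMAgent=object)


def build_agent():
    # LMMAgent is looked up at call time, so a patch applied beforehand takes effect.
    return mllm.LMMAgent(engine_params={"model": "placeholder"})


with mock.patch.object(mllm, "LMMAgent", FakeLMMAgent):
    agent = build_agent()
    reply = agent.get_response()
```

The patch is undone when the `with` block exits, so one test cannot leak the fake into another.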


46-50: Extract screenshot utility to shared test module.

The _create_screenshot function is duplicated (as _create_screenshot_bytes) in tests/test_smoke.py. Consider extracting to a shared test utility module.

tests/test_smoke.py (1)

48-48: Remove unnecessary noqa directive.

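Ruff reports the # noqa: E402 directive as unused because the E402 rule (module level import not at top of file) is not enabled in the Ruff configuration used for this review. Flake8, however, enables E402 by default, and this import comes after the dummy-module setup earlier in the file, so confirm that flake8 still passes in CI before removing the directive.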

🔎 Proposed fix
-import gui_agents.s3.core.mllm as mllm  # noqa: E402
+import gui_agents.s3.core.mllm as mllm
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb57fb and 9902f3d.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml
  • .gitignore
  • .pre-commit-config.yaml
  • requirements-dev.txt
  • tests/test_code_agent.py
  • tests/test_smoke.py
  • tests/test_utils_formatters.py
  • tests/test_worker.py
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_utils_formatters.py (1)
gui_agents/s3/utils/common_utils.py (1)
  • extract_agent_functions (169-179)
tests/test_worker.py (1)
gui_agents/s3/agents/agent_s.py (1)
  • AgentS3 (48-94)
tests/test_code_agent.py (1)
gui_agents/s3/agents/code_agent.py (2)
  • extract_code_block (11-29)
  • execute_code (32-49)
🪛 actionlint (1.7.9)
.github/workflows/ci.yml

15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Ruff (0.14.10)
tests/test_smoke.py

13-13: Unused static method argument: image

(ARG004)


13-13: Unused static method argument: output_type

(ARG004)


48-48: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


52-52: Unused method argument: engine_params

(ARG002)


52-52: Unused method argument: engine

(ARG002)


67-67: Unused method argument: image_content

(ARG002)


67-67: Unused method argument: kwargs

(ARG002)


75-75: Unused method argument: args

(ARG002)


75-75: Unused method argument: kwargs

(ARG002)


119-119: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_worker.py

12-12: Unused method argument: engine_params

(ARG002)


12-12: Unused method argument: engine

(ARG002)


27-27: Unused method argument: image_content

(ARG002)


27-27: Unused method argument: kwargs

(ARG002)


35-35: Unused method argument: args

(ARG002)


35-35: Unused method argument: kwargs

(ARG002)


73-73: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_code_agent.py

14-14: Unused method argument: timeout

(ARG002)

🔇 Additional comments (9)
.github/workflows/ci.yml (2)

24-30: LGTM! Linter configuration is correct.

The linter steps properly set PYTHONPATH and run Black, isort, and Flake8 in check mode, aligning with the pre-commit configuration.


32-36: LGTM! Test execution is properly configured.

The test step correctly sets PYTHONPATH and runs pytest in quiet mode.

tests/test_utils_formatters.py (1)

5-15: LGTM! Tests are clear and focused.

The tests effectively verify:

  • Parsing Python code blocks from markdown-style backtick notation
  • Extracting agent function calls from code strings
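
For readers unfamiliar with the pattern under test, pulling a Python block out of backtick-fenced text can be sketched as follows. This is not the project's `parse_code_from_string`; `extract_python_block` is a hypothetical name, and the fence is built from single backticks to keep the example easy to embed.

```python
import re

FENCE = "`" * 3  # three backticks, the markdown code fence

# Optional "python" tag after the opening fence, then capture up to the closing fence.
_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:python)?[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_python_block(text):
    """Return the body of the first fenced block, or None if there is none."""
    match = _BLOCK.search(text)
    return match.group(1).strip() if match else None


sample = f"<answer>{FENCE}python\nagent.wait(0.5)\n{FENCE}</answer>"
code = extract_python_block(sample)
```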
tests/test_code_agent.py (2)

4-15: LGTM! Mock controller interface is appropriate.

The DummyEnvController correctly implements the interface expected by execute_code, providing deterministic responses for testing. The timeout parameter in run_bash_script is intentionally unused as this is a test mock.
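
A minimal sketch of such a test double is shown below. Only `run_bash_script` and its `timeout` parameter are mentioned in this review; `run_python_script`, the shape of the returned dictionaries, and the `run_snippet` dispatcher are assumptions made for illustration and are not the project's `execute_code`.

```python
class DummyEnvController:
    """Test double that returns canned results instead of running anything."""

    def run_python_script(self, code):
        # Hypothetical method name and return shape.
        return {"status": "ok", "output": "python-ran", "error": ""}

    def run_bash_script(self, code, timeout=30):
        # `timeout` is accepted for interface compatibility and deliberately ignored.
        return {"status": "ok", "output": "bash-ran", "error": ""}


def run_snippet(code_type, code, controller):
    """Hypothetical dispatcher that routes a snippet to the matching controller method."""
    if code_type == "python":
        return controller.run_python_script(code)
    if code_type == "bash":
        return controller.run_bash_script(code, timeout=30)
    raise ValueError(f"unsupported code type: {code_type}")


python_result = run_snippet("python", "print('hi')", DummyEnvController())
bash_result = run_snippet("bash", "echo hi", DummyEnvController())
```

Because the responses are fixed, assertions in the tests stay deterministic regardless of the host environment.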


18-36: LGTM! Tests comprehensively cover code extraction and execution.

The tests verify:

  • Extracting Python code blocks with language tags
  • Executing Python code via the mock controller
  • Executing Bash code via the mock controller
tests/test_worker.py (1)

53-79: LGTM! Worker test correctly verifies action generation.

The test properly:

  • Sets up the grounding agent and AgentS3 with mock parameters
  • Invokes predict with an instruction and screenshot
  • Asserts that actions are generated and contain the expected wait/sleep directive

The unused info variable can be prefixed with _ if you want to silence the Ruff warning, but it's not critical.

tests/test_smoke.py (3)

9-45: LGTM! Dummy module injection is an effective testing pattern.

The approach of injecting lightweight dummy modules (pytesseract, pyautogui) into sys.modules before importing the code under test is a valid pattern for avoiding heavy external dependencies during testing.
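
The injection pattern can be sketched in a few lines. The module name `heavy_ocr_lib` and its `image_to_string` attribute are placeholders invented for this example; a real test would mirror only the attributes that the code under test touches.

```python
import sys
import types


def install_dummy_module(name, **attrs):
    """Register a lightweight stand-in for `name` in sys.modules."""
    module = types.ModuleType(name)
    for key, value in attrs.items():
        setattr(module, key, value)
    sys.modules[name] = module
    return module


# Must happen before anything imports the real package.
install_dummy_module("heavy_ocr_lib", image_to_string=lambda image: "")

import heavy_ocr_lib  # resolves to the dummy registered above

result = heavy_ocr_lib.image_to_string(None)
```

Since the import statement consults `sys.modules` first, any later import of the same name anywhere in the process receives the stand-in. The import necessarily sits below the setup code, which is exactly the situation the E402 rule reports.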


94-124: LGTM! Smoke test effectively validates the agent workflow.

The test properly verifies:

  • Agent can be instantiated with mock parameters
  • predict executes without errors
  • Actions are generated containing the expected sleep directive

The unused info variable can be prefixed with _ to silence the Ruff warning if desired, but it's not critical.


127-142: LGTM! CLI help test validates basic CLI functionality.

The test effectively verifies that the CLI module can be imported with dummy dependencies and that the --help flag executes successfully (exits with code 0).
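
For context, a help-flag check of this kind usually relies on the parser raising `SystemExit` with code 0. The sketch below assumes an argparse-based CLI; `build_parser`, the `agent-cli` program name, and the `--model` option are hypothetical and not taken from the project.

```python
import argparse
import contextlib
import io


def build_parser():
    parser = argparse.ArgumentParser(prog="agent-cli", description="Hypothetical CLI.")
    parser.add_argument("--model", default="placeholder", help="Model name to use.")
    return parser


def run_help(parser):
    """Invoke --help, returning the exit code and the captured help text."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        try:
            parser.parse_args(["--help"])
        except SystemExit as exc:
            return exc.code, buffer.getvalue()
    return None, buffer.getvalue()


exit_code, help_text = run_help(build_parser())
```

In a pytest suite the same check is often written with `pytest.raises(SystemExit)` and an assertion on the exception's `code`.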

Comment on lines +15 to +17
      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"

⚠️ Potential issue | 🟠 Major

Update actions/setup-python to v5.

The workflow uses actions/setup-python@v4, which is flagged by actionlint as too old to run on current GitHub Actions runners. This may cause the workflow to fail.

🔎 Proposed fix
-      - uses: actions/setup-python@v4
+      - uses: actions/setup-python@v5
         with:
           python-version: "3.11"

🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 15 to 17 the workflow references
actions/setup-python@v4 which is outdated; update the action to
actions/setup-python@v5 and keep the existing python-version input (e.g.,
"3.11"). Edit the uses line to point to the v5 tag and run the workflow to
verify the runner accepts the updated action.

Comment on lines +164 to +167
.DS_Store

# Local env file for secrets
.env

⚠️ Potential issue | 🟡 Minor

Remove duplicate .env entry; consolidate with existing Environments section.

The .env pattern already exists at line 125 within the "Environments" section. The duplicate entry at line 167 should be removed to maintain DRY and improve .gitignore clarity.

Consider either:

  1. Removing lines 166–167 entirely (redundant since .env is already covered), or
  2. Optionally adding an inline comment at line 125 if the intent is to emphasize secrets exclusion.
🔎 Proposed fix: remove duplicate entry
 logs/
 .DS_Store
-
-# Local env file for secrets
-.env
🤖 Prompt for AI Agents
In .gitignore around lines 164 to 167, there's a duplicate ".env" entry that is
already listed in the "Environments" section (~line 125); remove the redundant
".env" entry (delete lines 166–167) so the file remains DRY, or if emphasis is
desired add a short inline comment at the original entry instead of duplicating
it.

Comment on lines +1 to +20
repos:
  - repo: https://github.com/psf/black
    rev: 25.11.0
    hooks:
      - id: black
        language_version: python3.11
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort
  - repo: https://github.com/pre-commit/mirrors-flake8
    rev: 7.1.0
    hooks:
      - id: flake8
        args: ["--max-line-length=88", "--extend-ignore=E203,W503"]
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

latest stable version Black Python formatter 2024 2025

💡 Result:

The latest stable Black release is 25.12.0 (released Dec 8, 2025). [1][2]

Sources:

  • PyPI: black 25.12.0 release info. [1]
  • Black changelog / docs (25.12.0). [2]

🌐 Web query:

latest stable version isort Python 2024 2025

💡 Result:

Latest stable isort: 7.0.0 — released Oct 11, 2025. [1][2]

Sources:
[1] PyPI isort project page.
[2] Safety DB package entry for isort.


🌐 Web query:

latest stable version Flake8 linter 2024 2025

💡 Result:

The latest stable Flake8 release is 7.3.0 (released June 20, 2025). [1][2]

Sources:
[1] PyPI — flake8 release page.
[2] Flake8 documentation (release/FAQ pages).


🌐 Web query:

latest stable version pre-commit-hooks 2024 2025

💡 Result:

  • Latest stable (as of Dec 24, 2025): pre-commit-hooks v6.0.0. [1][2]
  • Latest stable in 2024 (most-recent 2024 series): v4.6.0 (the 4.x line). [2]

Sources:
[1] PyPI — pre-commit-hooks (release files show v6.0.0 uploaded Aug 9, 2025).
[2] GitHub — pre-commit/pre-commit-hooks releases/tags (shows v6.0.0, v5.0.0, v4.6.0, etc.).


Update pre-commit hook versions to current stable releases.

The pre-commit configuration is well-structured and Flake8 arguments correctly accommodate Black's formatting (line length 88, ignoring E203 and W503).

However, all hook versions are outdated:

  • Black: update from 25.11.0 to 25.12.0
  • isort: update from 5.12.0 to 7.0.0 (major version upgrade)
  • Flake8: update from 7.1.0 to 7.3.0
  • pre-commit-hooks: update from v4.6.0 to v6.0.0 (major version upgrade)

Verify compatibility with your codebase when updating isort and pre-commit-hooks due to their major version changes.

🤖 Prompt for AI Agents
.pre-commit-config.yaml lines 1-20: update the hook revisions to the requested
newer stable releases; change black rev to 25.12.0, isort rev to 7.0.0,
mirrors-flake8 rev to 7.3.0, and pre-commit-hooks rev to v6.0.0, then run
pre-commit autoupdate or reinstall hooks and run the test suite/linting to
verify isort and pre-commit-hooks major-version compatibility with the codebase.

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

Suggested change
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
