ci: add tests, linters, CI and pre-commit config #160

Vasyl198 · 2025-12-24T13:07:25Z

????????? pytest-?????, ???????????? CI ??? GitHub Actions ? pre-commit ? ?????????? ??????????????. ??? ?????????? ??????, ?????????? ?????? ????? ? ???????????? CI.

Summary by CodeRabbit

Chores
- Established automated CI workflow for code validation on push and pull requests.
- Configured pre-commit hooks for code formatting and linting.
- Added development dependencies and updated ignore files for environment secrets.
Tests
- Introduced comprehensive test suite covering code execution, utilities, and CLI functionality.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-24T13:07:34Z

📝 Walkthrough

Walkthrough

This PR establishes development infrastructure by adding a GitHub Actions CI workflow for automated linting and testing, configuring pre-commit hooks (Black, isort, Flake8), updating gitignore to exclude environment files, adding development dependencies, and introducing comprehensive test suites for code agent, smoke tests, utility formatters, and worker components.

Changes

Cohort / File(s)	Summary
CI/CD & Development Infrastructure `.github/workflows/ci.yml`, `.pre-commit-config.yaml`, `requirements-dev.txt`	Adds GitHub Actions workflow triggering on push/PR to main/master, running linting (Black, isort, Flake8) and pytest on ubuntu-latest. Configures pre-commit hooks with Black 25.11.0, isort 5.12.0, Flake8 7.1.0 (max line-length 88), and standard file fixers. Adds dev dependencies: pytest, pillow, black, flake8, isort, pre-commit.
Git Configuration `.gitignore`	Reinstates `.DS_Store` entry and adds `.env` to ignore local environment secrets.
Test Suite `tests/test_code_agent.py`, `tests/test_smoke.py`, `tests/test_utils_formatters.py`, `tests/test_worker.py`	Introduces tests for code extraction/execution (extract_code_block, execute_code), smoke tests with mocked LMMAgent and external dependencies (pytesseract, pyautogui), formatter utility tests (parse_code_from_string, extract_agent_functions), and worker prediction tests with synthetic screenshot generation. Uses monkeypatching to avoid heavy runtime dependencies.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A burrow prepared with care today,
With workflows, tests, and linters at play.
Pre-commit hooks guard the code so clean,
The finest CI infrastructure ever seen!
This rabbit's den is properly secured,
With quality gates and tests assured. 🛡️

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main changes: adding tests, linters, CI workflow, and pre-commit configuration, all of which are substantiated by the file summaries.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (2)

tests/test_smoke.py (2)

51-84: Code duplication: FakeLMMAgent is duplicated.

This class is identical to the one in tests/test_worker.py. Please see the refactoring suggestion in that file's review to extract this to a shared test utility module.

87-91: Code duplication: Screenshot utility is duplicated.

This function is nearly identical to _create_screenshot in tests/test_worker.py. Consider extracting to a shared test utility module.

🧹 Nitpick comments (4)

requirements-dev.txt (1)
1-6: Consider pinning dependency versions for reproducible builds.

Unpinned dependencies can lead to non-reproducible builds when newer versions introduce breaking changes or unexpected behavior.
🔎 Example with version constraints
-pytest
-pillow
-black
-flake8
-isort
-pre-commit
+pytest>=7.0.0,<9.0.0
+pillow>=10.0.0,<12.0.0
+black==25.11.0
+flake8>=7.0.0,<8.0.0
+isort==5.12.0
+pre-commit>=3.0.0,<5.0.0
Note: Adjust version ranges based on your compatibility requirements.
tests/test_worker.py (2)
11-43: Extract FakeLMMAgent to a shared test utility module.

The FakeLMMAgent class is duplicated in both tests/test_worker.py and tests/test_smoke.py. This violates the DRY principle and makes maintenance harder.
🔎 Suggested approach

Create a new file tests/conftest.py or tests/test_utils.py:
# tests/test_utils.py
class FakeLMMAgent:
    def __init__(self, engine_params=None, system_prompt=None, engine=None):
        self.messages = []
        self.system_prompt = system_prompt or "You are a helpful assistant."

    def reset(self):
        self.messages = [
            {
                "role": "system",
                "content": [{"type": "text", "text": self.system_prompt}],
            }
        ]

    def add_system_prompt(self, prompt):
        self.system_prompt = prompt

    def add_message(self, text_content=None, image_content=None, role=None, **kwargs):
        self.messages.append(
            {
                "role": role or "user",
                "content": [{"type": "text", "text": text_content}],
            }
        )

    def get_response(self, *args, **kwargs):
        return "<thoughts>thinking</thoughts><answer>```python\nagent.wait(0.5)\n```</answer>"
Then import and use in both test files.
46-50: Extract screenshot utility to shared test module.

The _create_screenshot function is duplicated (as _create_screenshot_bytes) in tests/test_smoke.py. Consider extracting to a shared test utility module.
tests/test_smoke.py (1)
48-48: Remove unnecessary noqa directive.

The # noqa: E402 directive is unnecessary because the E402 rule (module level import not at top of file) is not enabled in your Flake8 configuration.
🔎 Proposed fix
-import gui_agents.s3.core.mllm as mllm  # noqa: E402
+import gui_agents.s3.core.mllm as mllm

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb57fb and 9902f3d.

📒 Files selected for processing (8)

.github/workflows/ci.yml
.gitignore
.pre-commit-config.yaml
requirements-dev.txt
tests/test_code_agent.py
tests/test_smoke.py
tests/test_utils_formatters.py
tests/test_worker.py

🧰 Additional context used

🧬 Code graph analysis (3)

tests/test_utils_formatters.py (1)

gui_agents/s3/utils/common_utils.py (1)

extract_agent_functions (169-179)

tests/test_worker.py (1)

gui_agents/s3/agents/agent_s.py (1)

AgentS3 (48-94)

tests/test_code_agent.py (1)

gui_agents/s3/agents/code_agent.py (2)

extract_code_block (11-29)

execute_code (32-49)

🪛 actionlint (1.7.9)

.github/workflows/ci.yml

15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Ruff (0.14.10)

tests/test_smoke.py

13-13: Unused static method argument: image

(ARG004)

13-13: Unused static method argument: output_type

(ARG004)

48-48: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)

52-52: Unused method argument: engine_params

(ARG002)

52-52: Unused method argument: engine

(ARG002)

67-67: Unused method argument: image_content

(ARG002)

67-67: Unused method argument: kwargs

(ARG002)

75-75: Unused method argument: args

(ARG002)

75-75: Unused method argument: kwargs

(ARG002)

119-119: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_worker.py

12-12: Unused method argument: engine_params

(ARG002)

12-12: Unused method argument: engine

(ARG002)

27-27: Unused method argument: image_content

(ARG002)

27-27: Unused method argument: kwargs

(ARG002)

35-35: Unused method argument: args

(ARG002)

35-35: Unused method argument: kwargs

(ARG002)

73-73: Unpacked variable info is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

tests/test_code_agent.py

14-14: Unused method argument: timeout

(ARG002)

🔇 Additional comments (9)

.github/workflows/ci.yml (2)

24-30: LGTM! Linter configuration is correct.

The linter steps properly set PYTHONPATH and run Black, isort, and Flake8 in check mode, aligning with the pre-commit configuration.

32-36: LGTM! Test execution is properly configured.

The test step correctly sets PYTHONPATH and runs pytest in quiet mode.

tests/test_utils_formatters.py (1)

5-15: LGTM! Tests are clear and focused.

The tests effectively verify:

Parsing Python code blocks from markdown-style backtick notation

Extracting agent function calls from code strings

tests/test_code_agent.py (2)

4-15: LGTM! Mock controller interface is appropriate.

The DummyEnvController correctly implements the interface expected by execute_code, providing deterministic responses for testing. The timeout parameter in run_bash_script is intentionally unused as this is a test mock.

18-36: LGTM! Tests comprehensively cover code extraction and execution.

The tests verify:

Extracting Python code blocks with language tags

Executing Python code via the mock controller

Executing Bash code via the mock controller

tests/test_worker.py (1)

53-79: LGTM! Worker test correctly verifies action generation.

The test properly:

Sets up the grounding agent and AgentS3 with mock parameters

Invokes predict with an instruction and screenshot

Asserts that actions are generated and contain the expected wait/sleep directive

The unused info variable can be prefixed with _ if you want to silence the Ruff warning, but it's not critical.

tests/test_smoke.py (3)

9-45: LGTM! Dummy module injection is an effective testing pattern.

The approach of injecting lightweight dummy modules (pytesseract, pyautogui) into sys.modules before importing the code under test is a valid pattern for avoiding heavy external dependencies during testing.

94-124: LGTM! Smoke test effectively validates the agent workflow.

The test properly verifies:

Agent can be instantiated with mock parameters

predict executes without errors

Actions are generated containing the expected sleep directive

The unused info variable can be prefixed with _ to silence the Ruff warning if desired, but it's not critical.

127-142: LGTM! CLI help test validates basic CLI functionality.

The test effectively verifies that the CLI module can be imported with dummy dependencies and that the --help flag executes successfully (exits with code 0).

coderabbitai · 2025-12-24T13:12:30Z

.github/workflows/ci.yml

+      - uses: actions/setup-python@v4
+        with:
+          python-version: "3.11"


⚠️ Potential issue | 🟠 Major

Update actions/setup-python to v5.

The workflow uses actions/setup-python@v4, which is flagged by actionlint as too old to run on current GitHub Actions runners. This may cause the workflow to fail.

🔎 Proposed fix

- - uses: actions/setup-python@v4 + - uses: actions/setup-python@v5 with: python-version: "3.11"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- uses: actions/setup-python@v4

with:

python-version: "3.11"

- uses: actions/setup-python@v5

with:

python-version: "3.11"

🧰 Tools

🪛 actionlint (1.7.9)

15-15: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🤖 Prompt for AI Agents

In .github/workflows/ci.yml around lines 15 to 17 the workflow references actions/setup-python@v4 which is outdated; update the action to actions/setup-python@v5 and keep the existing python-version input (e.g., "3.11"). Edit the uses line to point to the v5 tag and run the workflow to verify the runner accepts the updated action.

coderabbitai · 2025-12-24T13:12:30Z

.gitignore

+.DS_Store
+
+# Local env file for secrets
+.env


⚠️ Potential issue | 🟡 Minor

Remove duplicate .env entry; consolidate with existing Environments section.

The .env pattern already exists at line 125 within the "Environments" section. The duplicate entry at line 167 should be removed to maintain DRY and improve .gitignore clarity.

Consider either:

Removing lines 166–167 entirely (redundant since .env is already covered), or

Optionally adding an inline comment at line 125 if the intent is to emphasize secrets exclusion.

🔎 Proposed fix: remove duplicate entry

logs/ .DS_Store - -# Local env file for secrets -.env

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

.DS_Store

# Local env file for secrets

.env

.DS_Store

🤖 Prompt for AI Agents

In .gitignore around lines 164 to 167, there's a duplicate ".env" entry that is already listed in the "Environments" section (~line 125); remove the redundant ".env" entry (delete lines 166–167) so the file remains DRY, or if emphasis is desired add a short inline comment at the original entry instead of duplicating it.

coderabbitai · 2025-12-24T13:12:30Z

.pre-commit-config.yaml

+repos:
+  - repo: https://github.com/psf/black
+    rev: 25.11.0
+    hooks:
+      - id: black
+        language_version: python3.11
+  - repo: https://github.com/PyCQA/isort
+    rev: 5.12.0
+    hooks:
+      - id: isort
+  - repo: https://github.com/pre-commit/mirrors-flake8
+    rev: 7.1.0
+    hooks:
+      - id: flake8
+        args: ["--max-line-length=88", "--extend-ignore=E203,W503"]
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: end-of-file-fixer
+      - id: trailing-whitespace


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

latest stable version Black Python formatter 2024 2025

💡 Result:

The latest stable Black release is 25.12.0 (released Dec 8, 2025). [1][2]

Sources:

PyPI: black 25.12.0 release info. [1]

Black changelog / docs (25.12.0). [2]

🌐 Web query:

latest stable version isort Python 2024 2025

💡 Result:

Latest stable isort: 7.0.0 — released Oct 11, 2025. [1][2]

Sources:
[1] PyPI isort project page.
[2] Safety DB package entry for isort.

🌐 Web query:

latest stable version Flake8 linter 2024 2025

💡 Result:

The latest stable Flake8 release is 7.3.0 (released June 20, 2025). [1][2]

Sources:
[1] PyPI — flake8 release page.
[2] Flake8 documentation (release/FAQ pages).

🌐 Web query:

latest stable version pre-commit-hooks 2024 2025

💡 Result:

Latest stable (as of Dec 24, 2025): pre-commit-hooks v6.0.0. [1][2]

Latest stable in 2024 (most-recent 2024 series): v4.6.0 (the 4.x line). [2]

Sources:
[1] PyPI — pre-commit-hooks (release files show v6.0.0 uploaded Aug 9, 2025).
[2] GitHub — pre-commit/pre-commit-hooks releases/tags (shows v6.0.0, v5.0.0, v4.6.0, etc.).

Update pre-commit hook versions to current stable releases.

The pre-commit configuration is well-structured and Flake8 arguments correctly accommodate Black's formatting (line length 88, ignoring E203 and W503).

However, all hook versions are outdated:

Black: update from 25.11.0 to 25.12.0

isort: update from 5.12.0 to 7.0.0 (major version upgrade)

Flake8: update from 7.1.0 to 7.3.0

pre-commit-hooks: update from v4.6.0 to v6.0.0 (major version upgrade)

Verify compatibility with your codebase when updating isort and pre-commit-hooks due to their major version changes.

🤖 Prompt for AI Agents

.pre-commit-config.yaml lines 1-20: update the hook revisions to the requested newer stable releases; change black rev to 25.12.0, isort rev to 7.0.0, mirrors-flake8 rev to 7.3.0, and pre-commit-hooks rev to v6.0.0, then run pre-commit autoupdate or reinstall hooks and run the test suite/linting to verify isort and pre-commit-hooks major-version compatibility with the codebase.

ScottBrenner · 2026-01-04T18:41:24Z

.github/workflows/ci.yml

+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4


Suggested change

- uses: actions/checkout@v4

- uses: actions/checkout@v6

ci: add tests, CI, and pre-commit config (clean backport)

9902f3d

coderabbitai bot reviewed Dec 24, 2025

View reviewed changes

ScottBrenner reviewed Jan 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ci: add tests, linters, CI and pre-commit config #160

ci: add tests, linters, CI and pre-commit config #160

Uh oh!

Vasyl198 commented Dec 24, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 24, 2025 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 24, 2025

Uh oh!

coderabbitai bot Dec 24, 2025

Uh oh!

coderabbitai bot Dec 24, 2025

Uh oh!

ScottBrenner Jan 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ci: add tests, linters, CI and pre-commit config #160

Are you sure you want to change the base?

ci: add tests, linters, CI and pre-commit config #160

Uh oh!

Conversation

Vasyl198 commented Dec 24, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 24, 2025

Choose a reason for hiding this comment

Uh oh!

ScottBrenner Jan 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Vasyl198 commented Dec 24, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 24, 2025 •

edited

Loading