Modern data analysis tool for JSON, YAML, CSV, and text files
hawk combines the simplicity of awk with the power of pandas, bringing unified data processing to your command line. Process any data format with the same intuitive syntax.
```bash
# Homebrew (macOS/Linux)
brew install kyotalab/tools/hawk

# Cargo (Rust)
cargo install hawk-data

# Verify installation
hawk --version
```

```bash
# JSON/CSV analysis - same syntax!
hawk '.users[] | select(.age > 30) | count' users.json
hawk '.[] | group_by(.department) | avg(.salary)' employees.csv
```
```bash
# Text/log processing with slicing (NEW!)
hawk -t '. | select(. | contains("ERROR|WARN")) | .[-100:]' app.log
hawk -t '. | map(. | split(" ")[0:3]) | unique' access.log

# Advanced string operations with multiple fields
hawk '.posts[] | map(.title, .content | trim | lower)' blog.json
hawk '.[] | group_by(.category) | .[0:10] | avg(.price)' products.json
```

| Feature | hawk | jq | awk | pandas |
|---|---|---|---|---|
| Multi-format | ✅ JSON, YAML, CSV, Text | ❌ JSON only | ❌ Text only | ⚠️ Python required |
| Unified syntax | ✅ Same queries everywhere | ❌ JSON-specific | ❌ Line-based | ⚠️ Complex setup |
| String operations | ✅ 14 built-in + slicing | ✅ Extensive | | |
| Statistical analysis | ✅ Built-in median, stddev | ❌ None | ❌ None | ✅ Full suite |
| Learning curve | 🟢 Familiar pandas-like | 🟡 Steep | 🟢 Simple | 🔴 High |
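For readers coming from pandas, here is a rough sketch in plain Python of what a `group_by(.department) | avg(.salary)` pipeline computes. The records are toy data invented for illustration; this is not hawk's implementation.

```python
from collections import defaultdict
from statistics import mean

# Toy rows standing in for employees.csv (hypothetical data)
employees = [
    {"department": "eng", "salary": 100},
    {"department": "eng", "salary": 120},
    {"department": "sales", "salary": 90},
]

# group_by(.department): bucket salaries by department
groups = defaultdict(list)
for row in employees:
    groups[row["department"]].append(row["salary"])

# avg(.salary): one mean per group
averages = {dept: mean(salaries) for dept, salaries in groups.items()}
```

The hawk one-liner collapses the whole loop-and-aggregate pattern into a single pipeline stage.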
Process any format with identical syntax:
```bash
hawk '.items[] | select(.price > 100)' data.json   # JSON
hawk '.items[] | select(.price > 100)' data.csv    # CSV
hawk '.items[] | select(.price > 100)' data.yaml   # YAML
hawk -t '. | select(. | contains("$"))' data.txt   # Text
```

```bash
# Split with slicing - extract exactly what you need
echo "2024-01-15 10:30:45 INFO message" | hawk -t '. | map(. | split(" ")[0:2])'
# → ["2024-01-15", "10:30:45"]
```
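The `[0:2]` above follows familiar list-slice semantics: split first, then keep a half-open range of fields. A quick illustration of the same idea in plain Python, using the sample log line from the example:

```python
# Same semantics as hawk's split(" ")[0:2], shown in Python
line = "2024-01-15 10:30:45 INFO message"
fields = line.split(" ")
print(fields[0:2])  # ['2024-01-15', '10:30:45']
```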
```bash
# OR conditions for flexible filtering
hawk -t '. | select(. | contains("GET|POST|PUT"))' access.log

# Powerful slicing for any operation result
hawk '.[] | sort(.revenue) | .[-10:]' companies.json     # Top 10
hawk '.[] | group_by(.category) | .[0:5]' products.json  # 5 from each group
```

```bash
# Instant insights from your data
hawk '.sales[] | group_by(.region) | median(.amount)' sales.json
hawk '.users[] | select(.active) | stddev(.session_time)' analytics.json
hawk '.metrics[] | unique(.user_id) | count' engagement.json
```

- 🚀 Quick Start Guide - Essential basics
- 📖 Query Language Reference - Complete syntax
- 🧵 String Operations - Text processing guide
- 📊 Data Analysis - Statistical workflows
- 📝 Text Processing - Log analysis and text manipulation
- 💼 Real-world Examples - Industry-specific use cases
- 🔍 Log Analysis - Docker, nginx, application logs
- ⚙️ DevOps Workflows - Kubernetes, CI/CD, monitoring
- 🔬 Data Science - CSV analysis, statistics, ML prep
```bash
# Find error patterns in application logs
hawk -t '. | select(. | contains("ERROR")) | map(. | split(" ")[0:2]) | unique' app.log

# Analyze Docker container performance
hawk -t '. | group_by(. | split(" ")[1]) | count' docker.log
```

```bash
# Quick dataset overview
hawk '. | info' unknown-data.json

# Statistical analysis
hawk '.users[] | group_by(.department) | median(.salary)' employees.csv
```

```bash
# Kubernetes resource analysis
hawk '.items[] | select(.status.phase == "Running") | count' pods.json

# Performance monitoring
hawk '.metrics[] | group_by(.service) | avg(.response_time)' monitoring.json
```

- 🎯 Advanced Slicing: `.[0:10]`, `.[-5:]`, `group_by(.field) | .[0:3]`
- ✂️ Split with Slicing: `split(" ")[0:3]`, `split(",")[-2:]`
- 🔀 OR Conditions: `contains("GET|POST")`, `starts_with("ERROR|WARN")`
- 📊 Stratified Sampling: Sample from each group for unbiased analysis
- ⚡ Performance: Optimized for large datasets with efficient memory usage
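As a sketch of what stratified sampling via a `group_by(.category) | .[0:2]` pipeline amounts to — keeping the first N records from each group so every group is represented — here is an illustrative plain-Python equivalent. The product records are toy data, not hawk internals.

```python
from collections import defaultdict

# Toy product records (hypothetical)
products = [
    {"category": "books", "price": 10},
    {"category": "books", "price": 12},
    {"category": "books", "price": 14},
    {"category": "toys", "price": 5},
    {"category": "toys", "price": 7},
]

# group_by(.category): bucket records per category
by_category = defaultdict(list)
for p in products:
    by_category[p["category"]].append(p)

# .[0:2]: take at most 2 records from each group
sample = [p for group in by_category.values() for p in group[0:2]]
print(len(sample))  # 4
```

Because every group contributes records, small categories are not drowned out by large ones in downstream statistics.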
We welcome contributions! See our Contributing Guide.
```bash
git clone https://github.com/kyotalab/hawk.git
cd hawk
cargo build --release
cargo test
```

MIT License - see LICENSE for details.
Ready to transform your data workflows? Start with our 5-minute tutorial 🚀