hawk 🦅

Modern data analysis tool for JSON, YAML, CSV, and text files

hawk combines the simplicity of awk with the power of pandas, bringing unified data processing to your command line. Process any data format with the same intuitive syntax.

⚡ Quick Start

Installation

# Homebrew (macOS/Linux)
brew install kyotalab/tools/hawk

# Cargo (Rust)
cargo install hawk-data

# Verify installation
hawk --version

30-Second Demo

# JSON/CSV analysis - same syntax!
hawk '.users[] | select(.age > 30) | count' users.json
hawk '.[] | group_by(.department) | avg(.salary)' employees.csv

# Text/log processing with slicing (NEW!)
hawk -t '. | select(. | contains("ERROR|WARN")) | .[-100:]' app.log
hawk -t '. | map(. | split(" ")[0:3]) | unique' access.log

# Advanced string operations with multiple fields
hawk '.posts[] | map(.title, .content | trim | lower)' blog.json
hawk '.[] | group_by(.category) | .[0:10] | avg(.price)' products.json

🚀 Why hawk?

| Feature | hawk | jq | awk | pandas |
|---|---|---|---|---|
| Multi-format | ✅ JSON, YAML, CSV, Text | ❌ JSON only | ❌ Text only | ❌ Python required |
| Unified syntax | ✅ Same queries everywhere | ❌ JSON-specific | ❌ Line-based | ❌ Complex setup |
| String operations | ✅ 14 built-in + slicing | ⚠️ Limited | ⚠️ Basic | ✅ Extensive |
| Statistical analysis | ✅ Built-in median, stddev | ❌ None | ❌ None | ✅ Full suite |
| Learning curve | 🟢 Familiar pandas-like | 🟡 Steep | 🟢 Simple | 🔴 High |

🎯 Key Features

Universal Data Processing

Process any format with identical syntax:

hawk '.items[] | select(.price > 100)' data.json   # JSON
hawk '.items[] | select(.price > 100)' data.csv    # CSV
hawk '.items[] | select(.price > 100)' data.yaml   # YAML
hawk -t '. | select(. | contains("$"))' data.txt   # Text
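
One way to picture the "identical syntax" idea: each input format is first normalized to the same in-memory shape (a list of records), so a single filter can run on all of them. The Python sketch below illustrates that idea with the standard library only. It is not hawk's implementation; the helper names (`records_from_json`, `records_from_csv`, `expensive`) and the sample data are hypothetical, and it assumes numeric CSV fields are converted to numbers before comparison.

```python
import csv
import io
import json

def records_from_json(text):
    """Return the list stored under the top-level "items" key."""
    return json.loads(text)["items"]

def records_from_csv(text):
    """Parse CSV rows into dicts, converting the price column to float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["price"] = float(row["price"])  # assumption: numeric type inference
    return rows

def expensive(records, threshold=100):
    """Same intent as the query fragment: select(.price > 100)."""
    return [r for r in records if r["price"] > threshold]

# Hypothetical sample inputs describing the same two items.
json_text = '{"items": [{"name": "pen", "price": 3}, {"name": "desk", "price": 250}]}'
csv_text = "name,price\npen,3\ndesk,250\n"

json_names = [r["name"] for r in expensive(records_from_json(json_text))]
csv_names = [r["name"] for r in expensive(records_from_csv(csv_text))]
print(json_names, csv_names)
```

Once both inputs are lists of records, the filter itself no longer depends on the file format.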

Advanced Text Processing (NEW in v0.2.3!)

# Split with slicing - extract exactly what you need
echo "2024-01-15 10:30:45 INFO message" | hawk -t '. | map(. | split(" ")[0:2])'
# → ["2024-01-15", "10:30:45"]

# OR conditions for flexible filtering
hawk -t '. | select(. | contains("GET|POST|PUT"))' access.log

# Powerful slicing for any operation result
hawk '.[] | sort(.revenue) | .[-10:]' companies.json  # Top 10
hawk '.[] | group_by(.category) | .[0:5]' products.json  # 5 from each group
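
The slices above appear to follow Python-style rules: `[0:2]` keeps the first two elements and `[-10:]` keeps the last ten. The Python sketch below mirrors the three operations for comparison. It assumes `contains("A|B")` treats `|` as a separator between literal alternatives, which the examples suggest but do not spell out; `contains_any` and the sample data are hypothetical.

```python
def contains_any(line, pattern):
    """True if the line contains any of the '|'-separated alternatives."""
    return any(alt in line for alt in pattern.split("|"))

# Split with slicing: keep the first two space-separated fields.
line = "2024-01-15 10:30:45 INFO message"
first_two = line.split(" ")[0:2]

# OR condition: keep lines that mention any of the listed methods.
log = ["GET /a", "DELETE /b", "POST /c", "HEAD /d"]
matched = [entry for entry in log if contains_any(entry, "GET|POST|PUT")]

# Slicing a sorted result: the last three of an ascending sort are the largest.
revenues = sorted([5, 40, 12, 99, 7])
top_three = revenues[-3:]

print(first_two, matched, top_three)
```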

Statistical Analysis Made Simple

# Instant insights from your data
hawk '.sales[] | group_by(.region) | median(.amount)' sales.json
hawk '.users[] | select(.active) | stddev(.session_time)' analytics.json
hawk '.metrics[] | unique(.user_id) | count' engagement.json
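
For readers who want to check what a grouped median or a unique count should produce, here is a small Python sketch of the same calculations. The sample records and the `median_by` helper are hypothetical, and hawk's actual output format may differ.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sales records.
sales = [
    {"region": "east", "amount": 100},
    {"region": "east", "amount": 300},
    {"region": "east", "amount": 200},
    {"region": "west", "amount": 50},
    {"region": "west", "amount": 150},
]

def median_by(records, key, value):
    """Group records by `key`, then take the median of `value` per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[value])
    return {name: median(values) for name, values in groups.items()}

medians = median_by(sales, "region", "amount")

# Counting distinct users: collect the ids into a set, then count them.
metrics = [{"user_id": 1}, {"user_id": 2}, {"user_id": 1}]
unique_users = len({m["user_id"] for m in metrics})

print(medians, unique_users)
```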

📚 Documentation

  • Get Started in 5 Minutes
  • Master Advanced Features
  • Use Case Guides (in progress)

🌟 Popular Workflows

Log Analysis

# Find error patterns in application logs
hawk -t '. | select(. | contains("ERROR")) | map(. | split(" ")[0:2]) | unique' app.log

# Count Docker log lines grouped by their second space-separated field
hawk -t '. | group_by(. | split(" ")[1]) | count' docker.log
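
Grouping log lines by a field and counting each group is a frequency count. A minimal Python sketch of that step, using a hypothetical log format in which the second field is a container name:

```python
from collections import Counter

# Hypothetical log lines: date, container name, message.
lines = [
    "2024-01-15 web-1 started",
    "2024-01-15 db-1 started",
    "2024-01-15 web-1 request served",
]

# Group by the second space-separated field and count the lines per group.
counts = Counter(line.split(" ")[1] for line in lines)
print(counts)
```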

Data Exploration

# Quick dataset overview
hawk '. | info' unknown-data.json

# Statistical analysis
hawk '.users[] | group_by(.department) | median(.salary)' employees.csv

DevOps Automation

# Kubernetes resource analysis
hawk '.items[] | select(.status.phase == "Running") | count' pods.json

# Performance monitoring
hawk '.metrics[] | group_by(.service) | avg(.response_time)' monitoring.json
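
The Kubernetes query walks `items`, keeps entries whose nested `status.phase` equals `"Running"`, and counts them. The Python sketch below does the same on a trimmed, hypothetical pod list; a file such as `pods.json` could come from `kubectl get pods -o json`, whose output has this `items[].status.phase` layout.

```python
import json

# Hypothetical, heavily trimmed pod list.
pods_json = """
{"items": [
  {"metadata": {"name": "api"}, "status": {"phase": "Running"}},
  {"metadata": {"name": "job"}, "status": {"phase": "Succeeded"}},
  {"metadata": {"name": "web"}, "status": {"phase": "Running"}}
]}
"""

pods = json.loads(pods_json)

# Same intent as: .items[] | select(.status.phase == "Running") | count
running = sum(1 for item in pods["items"] if item["status"]["phase"] == "Running")
print(running)
```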

⭐ What's New in v0.2.3

  • 🎯 Advanced Slicing: .[0:10], .[-5:], group_by(.field) | .[0:3]
  • ✂️ Split with Slicing: split(" ")[0:3], split(",")[-2:]
  • 🔍 OR Conditions: contains("GET|POST"), starts_with("ERROR|WARN")
  • 📊 Stratified Sampling: Sample from each group for unbiased analysis
  • ⚡ Performance: Optimized for large datasets with efficient memory usage
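
To make the per-group slice `group_by(.field) | .[0:3]` concrete, the Python sketch below keeps the first `n` records of each group in input order. The `take_per_group` helper and the sample data are hypothetical. Taking the first `n` is a simplification: the notes above do not say whether hawk can also draw random samples per group.

```python
from collections import defaultdict

def take_per_group(records, key, n):
    """Keep at most the first n records for each distinct value of `key`."""
    groups = defaultdict(list)
    for record in records:
        bucket = groups[record[key]]
        if len(bucket) < n:
            bucket.append(record)
    return dict(groups)

# Hypothetical product records.
products = [
    {"category": "book", "name": "A"},
    {"category": "book", "name": "B"},
    {"category": "toy", "name": "C"},
    {"category": "book", "name": "D"},
]

sample = take_per_group(products, "category", 2)
names = {cat: [r["name"] for r in rows] for cat, rows in sample.items()}
print(names)
```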

🤝 Contributing

We welcome contributions! See our Contributing Guide.

git clone https://github.com/kyotalab/hawk.git
cd hawk
cargo build --release
cargo test

📄 License

MIT License - see LICENSE for details.


Ready to transform your data workflows? Start with our 5-minute tutorial 🚀
