Higinio Martí’s Post

Consistent hashing solves a tricky problem: when you scale a distributed system, adding or removing nodes normally forces you to rehash almost all of your data, causing massive data movement and downtime. Consistent hashing minimizes this by moving only a small fraction of keys when the node set changes, making scaling smooth and efficient. The internet is full of high-level explanations of consistent hashing, but actual implementations of the algorithm are rare, and a near-real implementation with nodes acting as cache partitions is rarer still. I built a hands-on system in Golang with a hash ring, an API to store and retrieve keys, Dockerized nodes you can add and remove, and real-time D3 visualization. It’s a small system, but it gives a clear view of how data moves and how nodes handle load. Check it out here: https://lnkd.in/dZvYNy5F
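For readers who want to see the core idea in code, here is a minimal sketch of a hash ring with virtual nodes in Go. The names (Ring, AddNode, Get) and the replica count are illustrative, not taken from the linked repository; a production ring would also need node removal, replication, and locking.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// Ring is a minimal consistent-hash ring with virtual nodes.
type Ring struct {
	replicas int               // virtual nodes per physical node
	hashes   []uint32          // sorted hashes of all virtual nodes
	owners   map[uint32]string // virtual-node hash -> physical node name
}

func NewRing(replicas int) *Ring {
	return &Ring{replicas: replicas, owners: make(map[uint32]string)}
}

// AddNode places `replicas` virtual points for the node on the ring.
func (r *Ring) AddNode(node string) {
	for i := 0; i < r.replicas; i++ {
		h := crc32.ChecksumIEEE([]byte(node + "#" + strconv.Itoa(i)))
		r.owners[h] = node
		r.hashes = append(r.hashes, h)
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
}

// Get returns the node that owns the key: the first virtual node
// clockwise from the key's hash, wrapping around the ring.
func (r *Ring) Get(key string) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around to the first point on the ring
	}
	return r.owners[r.hashes[i]]
}

func main() {
	ring := NewRing(100)
	ring.AddNode("cache-1")
	ring.AddNode("cache-2")
	ring.AddNode("cache-3")
	fmt.Println(ring.Get("user:42")) // the same key always maps to the same node
}
```

Because only the virtual points belonging to an added or removed node change ownership, roughly 1/N of the keys move when the cluster changes size, which is the "small fraction" the post refers to.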
More Relevant Posts
𝗠𝗼𝘀𝘁 𝘀𝗰𝗮𝗹𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗶𝘀𝘀𝘂𝗲𝘀 𝗮𝗿𝗲𝗻'𝘁 𝗰𝗮𝘂𝘀𝗲𝗱 𝗯𝘆 𝗰𝗼𝗱𝗲 — 𝘁𝗵𝗲𝘆'𝗿𝗲 𝗰𝗮𝘂𝘀𝗲𝗱 𝗯𝘆 𝗱𝗲𝘀𝗶𝗴𝗻. 𝗦𝘆𝘀𝘁𝗲𝗺𝘀 𝗯𝗿𝗲𝗮𝗸 𝘄𝗵𝗲𝗻 𝗹𝗮𝘆𝗲𝗿𝘀 𝗴𝗿𝗼𝘄 𝗮𝘁 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝘀𝗽𝗲𝗲𝗱𝘀.

🧠 𝗦𝗰𝗮𝗹𝗶𝗻𝗴 𝗮𝗰𝗿𝗼𝘀𝘀 𝘁𝗵𝗲 𝗱𝗮𝘁𝗮 𝗳𝗹𝗼𝘄
At scale, every layer behaves differently. A design that handles 10,000 messages/hour can fail at 100,000 unless each layer evolves with purpose.

📥 𝗗𝗮𝘁𝗮 𝗜𝗻𝗴𝗲𝘀𝘁𝗶𝗼𝗻
The first bottleneck often appears at the entry point — when traffic spikes.
✅ Using rate-limiting and throttling at the API Gateway prevents downstream pressure before it starts (see the sketch after this post).
✅ Using stateless services makes horizontal scaling straightforward — no sticky sessions, no shared state.
✅ Deciding early when to stream vs. batch saves unnecessary re-architecture later.
✅ Using Kafka or Kinesis streams flattens bursts and keeps ingestion asynchronous.
🧩 𝗚𝗼𝗮𝗹: absorb load gracefully instead of pushing it downstream.

⚙️ 𝗗𝗮𝘁𝗮 𝗣𝗿𝗼𝗰𝗲𝘀𝘀𝗶𝗻𝗴
Once data lands, the challenge shifts to throughput and resilience.
✅ Using AWS ECS and Lambda with metric-based scaling ensures elasticity under real workload patterns — not just CPU spikes.
✅ Partitioning by customer, region, or key avoids hot partitions and enables balanced parallelism.
✅ Scaling consumers dynamically based on lag maintains steady throughput without over-provisioning.
✅ Using RDS Performance Insights and query tuning improves performance far more than scaling compute.
✅ Handling backpressure through queues and retries keeps the pipeline stable when traffic surges suddenly.
🧩 𝗚𝗼𝗮𝗹: increase throughput without increasing incident frequency.

📤 𝗗𝗮𝘁𝗮 𝗗𝗲𝗹𝗶𝘃𝗲𝗿𝘆
At scale, fast delivery matters as much as correct delivery.
✅ Using caching across layers — CDN, API Gateway, service, and DB query — consistently reduces latency.
✅ Using read replicas separates analytical reads from operational traffic.
✅ Designing APIs with pagination and slicing avoids expensive "fetch all" patterns.
✅ Switching from pull to event-based push delivery reduces repeated client polling.
🧩 𝗚𝗼𝗮𝗹: consistent performance, no matter the data size.

✨ 𝗧𝗮𝗸𝗲𝗮𝘄𝗮𝘆: Scalability isn't about adding compute — it's about removing friction between layers. When ingestion, processing, and delivery scale independently, the system runs predictably, even during peak traffic.
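As a concrete illustration of the first ingestion point, here is a small Go sketch of rate-limiting HTTP middleware using golang.org/x/time/rate. The limits and the /ingest route are made-up examples; in the architecture the post describes, a managed API gateway would typically enforce this at the edge rather than inside the service.

```go
package main

import (
	"log"
	"net/http"

	"golang.org/x/time/rate"
)

// rateLimit wraps a handler with a shared token bucket: roughly 100
// requests/second sustained with bursts of 200, while excess traffic is
// rejected with 429 so pressure is shed at the edge instead of downstream.
func rateLimit(next http.Handler) http.Handler {
	limiter := rate.NewLimiter(rate.Limit(100), 200) // hypothetical limits
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !limiter.Allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/ingest", func(w http.ResponseWriter, r *http.Request) {
		// In a real pipeline this would hand the payload to Kafka/Kinesis
		// and return quickly, keeping ingestion asynchronous.
		w.WriteHeader(http.StatusAccepted)
	})
	log.Fatal(http.ListenAndServe(":8080", rateLimit(mux)))
}
```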
𝗘𝘃𝗲𝗿 𝘄𝗼𝗻𝗱𝗲𝗿𝗲𝗱 𝗵𝗼𝘄 𝗗𝗮𝘁𝗮𝗱𝗼𝗴 𝗺𝗮𝗻𝗮𝗴𝗲𝘀 𝗯𝗶𝗹𝗹𝗶𝗼𝗻𝘀 𝗼𝗳 𝗺𝗲𝘁𝗿𝗶𝗰𝘀 𝗶𝗻 𝗿𝗲𝗮𝗹 𝘁𝗶𝗺𝗲?

Datadog’s journey with Monocle is a perfect example of how real-world challenges evolve as systems scale—and how architecture must evolve to keep up. As the volume of metrics grew, maintaining multiple storage engines became increasingly complex. Performance bottlenecks and operational overhead were slowing down the platform. To address this, Datadog built Monocle, a Rust-powered, unified time series database.

𝗞𝗲𝘆 𝗳𝗮𝗰𝘁𝘀 𝗮𝗯𝗼𝘂𝘁 𝗠𝗼𝗻𝗼𝗰𝗹𝗲:
• Unifies multiple storage engines into a single, high-performance backend
• Handles billions of metrics daily with low latency
• Reduces operational complexity for engineers
• Built in Rust for reliability and efficiency

𝗪𝗵𝘆 𝗶𝘁 𝗺𝗮𝘁𝘁𝗲𝗿𝘀:
Monocle is more than just a new database. It shows how engineering teams can rethink architecture to meet scaling challenges, improve performance, and deliver faster, more reliable insights to users.

Read the full article here: https://lnkd.in/gHAfuRZ5

#Datadog #TimeSeriesDatabase #Observability #RustLang #CloudArchitecture #ScalableSystems #DataEngineering #SoftwareEngineering #DistributedSystems #TechLeadership #InfoQ
We design distributed systems for parallel throughput, yet we often architect them around sequential waiting. Usually, the culprit is the acknowledgement, or "ack". We treat it as a trivial handshake, but its hidden cost is often the primary throttle on the entire data flow.

The issue arises when processing blocks on the ack. A service receives a message, starts a DB transaction, and only after it fully commits does it acknowledge the message to the queue. During this time the message is locked, and the system's parallel potential is wasted. It creates an artificial bottleneck, not of processing power, but of permission to proceed.

Little's Law makes the cost explicit: throughput is determined by the number of items in the process and the time each spends there. By tying up messages in an invisible processing state for longer than necessary, we destroy our potential concurrency and cap our maximum throughput.

The solution lies in decoupling: acknowledge messages immediately upon receipt, before the expensive work, and have a mechanism to recover if the subsequent processing fails. This requires idempotency, but it unlocks true scale. We optimize for latency metrics while ignoring the throughput tax of ack delays. True system performance isn't just about how fast a task runs, but how many can run at once without waiting for a signature.
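A minimal Go sketch of that decoupling, under stated assumptions: the Message type and its Ack callback are stand-ins for whatever your broker's client exposes, not a specific queue API, and the in-memory dedup set stands in for a durable idempotency key.

```go
package main

import (
	"log"
	"sync"
	"time"
)

// Message is a stand-in for whatever your broker delivers; Ack is the
// broker-specific acknowledgement callback.
type Message struct {
	ID   string
	Body []byte
	Ack  func()
}

// Processor acks immediately, then does the expensive work asynchronously.
// The idempotency check (an in-memory set here; a unique key or DB
// constraint in practice) makes redelivery after a crash safe.
type Processor struct {
	mu   sync.Mutex
	seen map[string]bool
	work chan Message
}

func NewProcessor(workers int) *Processor {
	p := &Processor{seen: make(map[string]bool), work: make(chan Message, 1024)}
	for i := 0; i < workers; i++ {
		go p.worker()
	}
	return p
}

// Handle is called from the consumer loop: ack first, then queue the work,
// so the broker can hand out the next message right away.
func (p *Processor) Handle(m Message) {
	m.Ack()
	p.work <- m
}

func (p *Processor) worker() {
	for m := range p.work {
		p.mu.Lock()
		dup := p.seen[m.ID]
		p.seen[m.ID] = true
		p.mu.Unlock()
		if dup {
			continue // already processed: a redelivery or retry
		}
		if err := process(m); err != nil {
			log.Printf("msg %s failed: %v (route to a retry queue or DLQ)", m.ID, err)
		}
	}
}

func process(m Message) error {
	// the expensive DB transaction would go here
	return nil
}

func main() {
	p := NewProcessor(4)
	p.Handle(Message{ID: "m-1", Body: []byte("payload"), Ack: func() { log.Println("acked m-1") }})
	time.Sleep(100 * time.Millisecond) // let the demo worker finish
}
```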
🌊 Understanding the Raft Consensus Algorithm

In distributed systems, ensuring multiple servers agree on the same state even if some fail is critical. That’s where Raft comes in. Designed as a more understandable alternative to Paxos, Raft is widely used in distributed databases, file systems, and coordination systems like etcd, Consul, and TiKV. 💻⚡

❓ What is Raft?
Raft is a leader-based consensus algorithm that keeps replicated logs consistent across all nodes in a cluster. Nodes communicate via RPC/gRPC, exchanging messages such as log entries, votes, and heartbeats. 📝🔄

🔑 Key Components:
Leader 👑: Handles client requests, log replication, and heartbeats.
Follower 🛡️: Passive nodes that replicate the leader’s logs.
Candidate ⚔️: A node vying to become leader when none exists.

⚙️ How Raft Works:
1. Leader Election 🗳️: Nodes vote to elect a leader. The leader sends regular heartbeats 💓 to maintain authority.
2. Log Replication 📜: The leader records client commands, replicates them to followers, and commits entries once a majority agrees. ✅
3. Safety 🛡️: Logs are applied in the same order on all nodes, ensuring consistency even if leaders fail.

💡 Why Raft?
Understandable 📚: Easier to implement than Paxos.
Strong consistency 🔗: All nodes stay synchronized.
Fault-tolerant ⚡: Operates correctly even if a minority of nodes fail.
Efficient ⚙️: Single leader minimizes conflicts.

🌍 Real-World Use Cases:
etcd 🐙: Kubernetes cluster state management
Consul 🧭: Service discovery and configuration
HashiCorp Vault 🔐: Secure secret replication
TiKV / CockroachDB 🐜: Distributed databases

💪 Raft ensures distributed systems remain reliable, consistent, and fault-tolerant, which is critical for modern cloud-native applications. ☁️🚀
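To make the election rules concrete, here is a small Go sketch of Raft's vote-granting decision: a vote goes to a candidate only if its term is current, we haven't voted for someone else this term, and its log is at least as up-to-date as ours. The types are illustrative; a real implementation (such as the raft package used by etcd) also handles persistence, election timers, and RPC.

```go
package main

import "fmt"

// Illustrative types for Raft's vote-granting rule.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

type RequestVoteArgs struct {
	Term         int // candidate's term
	CandidateID  string
	LastLogIndex int
	LastLogTerm  int
}

type Node struct {
	state        State
	currentTerm  int
	votedFor     string
	lastLogIndex int
	lastLogTerm  int
}

// HandleRequestVote grants a vote only if the candidate's term is current,
// we haven't already voted for someone else this term, and the candidate's
// log is at least as up-to-date as ours: the checks that preserve safety.
func (n *Node) HandleRequestVote(args RequestVoteArgs) (term int, granted bool) {
	if args.Term < n.currentTerm {
		return n.currentTerm, false // stale candidate
	}
	if args.Term > n.currentTerm {
		n.currentTerm = args.Term // newer term seen: step down to follower
		n.state = Follower
		n.votedFor = ""
	}
	logOK := args.LastLogTerm > n.lastLogTerm ||
		(args.LastLogTerm == n.lastLogTerm && args.LastLogIndex >= n.lastLogIndex)
	if (n.votedFor == "" || n.votedFor == args.CandidateID) && logOK {
		n.votedFor = args.CandidateID
		return n.currentTerm, true
	}
	return n.currentTerm, false
}

func main() {
	n := &Node{currentTerm: 2, lastLogIndex: 5, lastLogTerm: 2}
	fmt.Println(n.HandleRequestVote(RequestVoteArgs{Term: 3, CandidateID: "node-b", LastLogIndex: 5, LastLogTerm: 2}))
	// prints: 3 true (newer term, log at least as up-to-date)
}
```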
⚡ GO PERFORMANCE OPTIMISATION: Cost-Efficient Strategies for Application Excellence

True performance optimisation for Go applications is not just about speed—it's about achieving application excellence that delivers a superior user experience while simultaneously ensuring measurable cost savings. Our systematic approach leverages Go’s native advantages to deliver results like 3-10x performance improvements and 30-60% infrastructure cost reduction.

Strategic Go Performance Framework: The Efficiency Engine 🚀
We build performance into the core architecture, focusing on the highest-leverage areas:

1. Go-Native Performance Engineering
• Deep Profiling: Using Go’s pprof to accurately identify CPU hotspots and memory allocation patterns—eliminating guesswork (a small pprof + sync.Pool sketch follows this post).
• Memory Efficiency: Optimising allocation using tools like sync.Pool and compiler escape analysis to dramatically reduce Garbage Collection (GC) overhead.
• Concurrency Control: Preventing goroutine leaks and efficiently managing concurrent operations to maximize resource use.

2. Database & Caching Leverage 💾
• PostgreSQL Excellence: Utilizing the pgx driver for superior connection pooling and prepared statement caching, extracting more throughput from existing database resources.
• Intelligent Caching: Implementing multi-tiered strategies (Redis, in-memory sync.Map) to shield the database from up to 90% of read load.

3. AI/LLM Performance Optimisation 🤖
• Cost & Speed: Implementing client-side optimisation with connection pooling and request batching for external APIs (like OpenAI).
• Strategic Caching: Deploying LLM response caching to achieve 50-70% reduction in AI service costs while maintaining responsiveness.

4. Resource & Infrastructure Scaling ☁️
• Minimal Footprint: Creating minimal Go Docker images (5-20MB) for 60-80% memory efficiency gains, translating directly to lower hosting costs.
• High Density: Optimising CPU utilization to enable higher application density per server, minimizing cloud spend even as traffic scales.

Real-World Performance Impact
Our methodology, which starts with pprof baselining and focuses on systematic, bottleneck-driven optimisation, delivers repeatable business value:
• 3-5x API response improvement
• 60% infrastructure cost reduction
• 50-70% AI service cost optimisation

Go performance optimisation enables competitive advantage through superior user experience while significantly reducing operational costs through language-native efficiency improvements.

Which Go performance optimisation strategies would deliver the highest impact for your current application scalability and cost efficiency requirements?

#GoLang #PerformanceOptimisation #CostEfficiency #DatabaseTuning #CachingStrategies
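As a small illustration of the profiling and memory-efficiency points in section 1, the sketch below exposes the standard net/http/pprof endpoints and reuses buffers through a sync.Pool. It is a generic pattern, not the methodology's actual code, and the improvement figures quoted above are the post's own claims rather than anything this snippet demonstrates.

```go
package main

import (
	"bytes"
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
	"sync"
)

// bufPool reuses byte buffers across requests so hot-path allocations
// don't turn into GC pressure.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func handle(w http.ResponseWriter, r *http.Request) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	buf.WriteString("rendered response") // build the response body without fresh allocations
	w.Write(buf.Bytes())
}

func main() {
	http.HandleFunc("/work", handle)
	// Capture a CPU profile with:
	//   go tool pprof http://localhost:8080/debug/pprof/profile
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```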
Business development perspective: This systematic approach perfectly demonstrates why companies choose Munimentum for Go performance optimisation. ⚡ Instead of expensive performance monitoring tools, we leverage Go’s built-in capabilities to deliver superior performance while reducing infrastructure costs. The formula is clear: 3-10x performance + 30-60% cost reduction + proven AI optimisation = competitive advantage through efficient Go performance engineering. See the full methodology from Alexis Morin and the team here! (Proud to be driving this strategy with Alex!) #GoPerformance #CostEfficiency #BusinessDevelopment #ROI
If you ask whether there is scope to reduce latency in the ML project, or scope to make it more memory efficient, I would say yes. If you ask me the same question 2 months later, I would still say yes. I will always say yes, if we actually need to improve.

When we develop a project from scratch and bring it to production, we do not question design choices every single day. Our objective is to bring the project or subsequent features live while keeping the latency or memory requirements loosely within our resources. But there's always that pandas function waiting to be optimized further. There's always that file sitting in Redis in a format that could take up less space. There's always that architecture decision that would make things faster. There's always that SQL query fetching redundant data.

As time goes on, when we decide to improve the project, or as we visit the code for some other feature, we decide: OK, let's optimize this.

Believe it or not, that's the journey with ML production systems. The job of ML Engineers is not just to maintain the infrastructure but also to ship net new features. And when the focus is on shipping new features or products, hard optimization is not always the priority.

Happy Productionizing!!
Revolutionizing AI/HPC with Open-Source pNFS: PEAK:AIO’s Bold Leap Forward! PEAK:AIO is shaking up the high-performance computing world by embracing open-source parallel NFS (pNFS) to challenge legacy systems like Lustre. As highlighted in a recent Blocks & Files article, PEAK:AIO’s scale-out, all-flash storage solution is delivering blazing-fast performance—320 GB/s from a single 2RU system, scaling linearly to superpod levels! By open-sourcing their pNFS metadata software, PEAK:AIO is fostering collaboration with industry giants like Los Alamos National Labs and Carnegie Mellon University, driving a modern, flexible file system for AI and HPC workloads. With support for CXL for ultra-low latency and plans for a unified block, file, and object protocol system, they’re building a future-proof alternative to Ceph. This isn't just an upgrade; it's a game-changer for AI training, simulations, and data-intensive workloads. Governments and enterprises are already calling for open, simple, and scalable alternatives – PEAK:AIO is answering loud and clear! Read the full article here: https://lnkd.in/g7sziert #AI #HPC #StorageSolutions #pNFS #OpenSource #DataInfrastructure #HighPerformanceComputing #CXL #Innovation #TechTrends
⚡ Why Go Is a Hidden Weapon for Distributed Systems

Everyone says Go is “simple and fast.” True — but that’s just the surface. Here are unique reasons Go quietly dominates the world of distributed architectures 👇

🧩 1. Concurrency as a Language Primitive — Not a Library
Go’s concurrency model (goroutines + channels) isn’t an add-on. It’s baked into the language syntax — meaning you can model network nodes, pipelines, and event streams naturally, just like describing processes in CSP (Communicating Sequential Processes). (See the pipeline sketch after this post.)

🔁 2. Predictable Performance Under Load
Go’s work-stealing scheduler, with asynchronous preemption since Go 1.14, keeps latency predictable even when millions of goroutines are running — a design shaped by Google’s latency-sensitive internal RPC workloads.

🧠 3. Memory Model Tuned for Cloud Workloads
Go’s garbage collector runs concurrently with the program and marks incrementally, keeping stop-the-world pauses short, so tail latency stays stable under high concurrency. It’s one reason NATS, etcd, and Kubernetes can sustain thousands of concurrent clients.

🧰 4. Minimal Runtime, Massive Reach
Go binaries are statically linked, single executables — deployable to any node in any region with zero dependencies. In distributed systems, fewer moving parts = fewer failures. That’s why many orchestrators and sidecars are written in Go.

🌍 5. Observability and Tooling Culture
The Go ecosystem was built with instrumentation in mind — from built-in pprof and race detectors to seamless OpenTelemetry support. You can trace a system across dozens of microservices without leaving Go’s standard toolchain.

💬 Distributed systems thrive on clarity, predictability, and speed. Go gives you all three — by design, not by accident.

#Golang #DistributedSystems #CloudNative #Microservices #PerformanceEngineering
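Point 1 is easiest to see in code. The sketch below wires two pipeline stages together with goroutines and channels, the CSP style the post refers to; the stage names are illustrative.

```go
package main

import "fmt"

// generate and square are pipeline stages: each runs in its own goroutine
// and communicates only over channels, so stages compose like CSP processes.
func generate(nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			out <- n
		}
	}()
	return out
}

func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n * n
		}
	}()
	return out
}

func main() {
	// The stages run concurrently; main just drains the final channel.
	for v := range square(generate(1, 2, 3, 4)) {
		fmt.Println(v)
	}
}
```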
🔍 Why Latency Creeps In — and a Real Fix from My Multi-Agent Stack

When you build multi-agent systems, latency often shows up in subtle layers — it’s rarely just “the network.” In my setup, two surprising culprits stood out:

1. Prompt misalignment during tool calls
Agents calling out to MCP servers were using ambiguous or overly permissive prompts. Result? Wrong JSON pipelines, retries, or wasted steps. Once I enforced a stricter prompt schema (telling agents exactly which JSON pipeline to use), unnecessary tool chatter dropped significantly.

2. Database access overhead
In my case, the agents ingest data via MongoDB. As the schema and volume grew, many queries became slow. Adding indexes on key query fields — especially those used in filtering or joins — slashed latency by 10–15 seconds in many workflows (see the index sketch after this post).

💡 Key takeaway:
Latency in multi-agent architectures isn’t just about faster models or better networking. It’s about:
• Prompt engineering (ensuring agents use exactly the right interface and payload structures)
• Storage optimization (indexing, query planning, and minimizing data fetch overhead)
• Plus meta-orchestration: coordination layers, caching, or pruning unused context

If you’re seeing lag in your agent pipelines, I’d encourage you to instrument each layer — prompt, tool, database — separately, and apply surgical fixes (not broad strokes).

#AI #MultiAgent #PromptEngineering #MongoDB #PerformanceEngineering #Latency #LLM #Agents #BackendOptimizations #AIDevelopment
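For the database point, here is a sketch of creating a compound index with the official MongoDB Go driver (v1 API). The database, collection, and field names are illustrative stand-ins; the right keys are whatever your agents actually filter and sort on.

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	// Illustrative database/collection names.
	events := client.Database("agents").Collection("events")

	// Compound index on the fields the workflows filter and sort by,
	// so those queries stop scanning the whole collection.
	name, err := events.Indexes().CreateOne(ctx, mongo.IndexModel{
		Keys: bson.D{
			{Key: "session_id", Value: 1},
			{Key: "created_at", Value: -1},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("created index:", name)
}
```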