Ruby on Rails Backend Reliability & Scale Obsessive

Published on August 31, 2025

Hellotext is a modern communication suite focused on revolutionizing the eCommerce experience. Our mission is to empower businesses with cutting-edge tools for customer engagement, profiling, segmentation, and automation. We are passionate about providing top-notch services and highly crafted experiences to our clients, and we are looking for exceptional talent to join our growing team.

Hellotext is the all‑in‑one messaging and customer engagement platform for eCommerce. We help merchants unify customer profiles, segment intelligently, and automate lifecycle journeys across WhatsApp, SMS, and web, sending at scale with precision and reliability. If you want to harden the backbone behind millions of customer interactions, this is your stage.

Do you obsess over uptime, throughput, and bulletproof integrations, and enjoy squeezing every millisecond out of PostgreSQL and background jobs? Join us and own the reliability, performance, and integrations backbone of Hellotext. You’ll keep our messaging engine fast, predictable, and resilient as we scale.

What you’ll do
  • Own the reliability, scalability, and performance of our core Rails services and integrations.
  • Design resilient data flows: background jobs, schedulers, webhooks, retries with backoff, idempotency, and graceful degradation.
  • Build and harden third‑party API integrations with timeouts, circuit breakers, and robust error handling.
  • Define SLAs and run against clear SLOs/SLIs; instrument end‑to‑end observability (logs, metrics, traces) with actionable dashboards and alerts.
  • Lead incident response and postmortems; create crisp runbooks and automate repetitive ops tasks.
  • Optimize PostgreSQL: indexing, query plans, connection pooling, partitioning, and zero‑downtime migrations.
  • Evolve caching and concurrency strategy (e.g., Redis, job queues) to reduce latency under heavy load.
  • Automate CI/CD and deploys with safe rollouts and fast rollbacks; champion security, secrets management, and least‑privilege access.
  • Capacity plan and load‑test for spikes; protect SLAs with rate limiting, backpressure, and queue shaping.
  • Partner with product, frontend, and customer success to ship features that scale and stay up.

What you bring
  • Proven experience shipping and operating backend‑heavy Ruby on Rails systems in production.
  • Deep knowledge of PostgreSQL performance, background job processing, caching, and HTTP internals.
  • Strong Linux fundamentals: networking, TLS, shell, system monitoring, and debugging under pressure.
  • Experience defining and hitting SLOs/SLIs; comfort with incident management and on‑call (reasonable rotation).
  • Hands‑on with CI/CD, containerized deploys (Docker) and a deploy toolchain (Kamal or similar).
  • Fluency with observability tooling (e.g., New Relic/Datadog/Prometheus/Grafana/OpenTelemetry).
  • Solid testing discipline (RSpec/Minitest), clear documentation, and clean, maintainable code.

Nice to Have
  • SMPP protocol knowledge and experience handling high‑throughput messaging (millions of messages).
  • Experience with provider/webhook ecosystems (SMS, WhatsApp, email), and protecting them with idempotency and replay guards.
  • Retrieval‑Augmented Generation (RAG) experience and familiarity with vector databases (pgvector, Pinecone, Qdrant, Weaviate) for building reliable retrieval pipelines.
  • Familiarity with the Model Context Protocol (MCP) for secure tool integrations and internal developer workflows.
  • Elasticsearch/OpenSearch tuning in production.
  • Cost‑aware logging/metrics practices; familiarity with log sampling and retention strategies.
  • Basic security/compliance mindset (GDPR/SOC 2 practices).

Our Stack

Ruby on Rails, PostgreSQL, Redis, Elasticsearch, background jobs (Active Job with Sidekiq or Solid Queue), REST/Webhooks, Docker, Kamal, GitHub, and a modern observability stack (OpenTelemetry + New Relic/Prometheus/Grafana).

You

You think in failure modes, measure twice and cut once, and prefer resilient defaults over heroics. You value clear runbooks, fast rollbacks, and learning‑rich postmortems. Autonomy energizes you; reliability is your craft.

Benefits
  • 100% remote: work from anywhere, anytime.
  • Flexible schedule: craft a routine that suits your life.
  • Unlimited paid time off: take the rest you need, when you need it.

Apply now and let’s build something remarkably reliable together.