Backend Developer Interview Questions & Answers

Backend developer interviews assess your understanding of server-side architecture, database design, API development, and system scalability. Expect questions on data modeling, concurrency, security, and distributed systems.

Behavioral questions

  1. Tell me about a time you had to debug a complex production issue.

    Model answer

    Our API started returning 500 errors intermittently — affecting about 5% of requests, but only during peak hours. The error logs showed database connection timeouts, but the database CPU and memory were fine. I added connection pool monitoring and discovered we were exhausting our connection pool limit (20 connections). The root cause: a new feature was opening a database transaction, making an HTTP call to an external service within the transaction, and that service occasionally took 30+ seconds to respond — holding the database connection hostage. I fixed it by restructuring the code to make the HTTP call outside the transaction and only opening a connection for the actual database write. I also increased the pool size to 50 as a buffer and added alerting on connection pool utilization. The 500 error rate dropped to zero immediately.
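
The fix described in this answer can be sketched as follows. The function and table names are hypothetical, and sqlite3 stands in for the production database; the point is that the slow external call happens before the transaction opens, so the connection is held only for the fast write.

```python
import sqlite3

def fetch_shipping_rate(order_id):
    # Stand-in for the slow external HTTP call; in the incident above,
    # this occasionally took 30+ seconds to respond.
    return 4.99

def save_order(conn, order_id):
    # External call happens BEFORE the transaction is opened, so the
    # database connection is no longer held hostage by the third party.
    rate = fetch_shipping_rate(order_id)
    with conn:  # short transaction: BEGIN ... COMMIT around the write only
        conn.execute(
            "INSERT INTO orders (id, shipping_rate) VALUES (?, ?)",
            (order_id, rate),
        )
    return rate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, shipping_rate REAL)")
print(save_order(conn, 1))  # 4.99
```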

  2. Describe a time you designed an API that other teams consumed. How did you ensure it was developer-friendly?

    Model answer

    I designed a partner integration API that 15 external companies would use to sync product data with our platform. Before writing any code, I conducted interviews with 5 potential API consumers to understand their integration patterns. I learned they needed batch operations (not just single-record CRUD) and wanted webhooks for real-time updates. I designed the API with consistent naming conventions, pagination, filtering, and comprehensive error responses with actionable messages — not just '400 Bad Request' but '400: The price field must be a positive number. Received: -5.99.' I generated OpenAPI documentation from the code, created a sandbox environment with test data, and wrote a quickstart guide with examples in Python, Node.js, and curl. I also versioned the API from day one (v1 in the URL path). The result: average integration time was 3 days instead of the 2-week industry average, and support tickets were 70% lower than our previous API version.
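
The actionable-error style mentioned in this answer can be illustrated with a tiny validator; the function name and exact message format are illustrative, not the actual API's code.

```python
def validate_price(value):
    """Return (ok, error); the error mirrors the actionable-message style:
    state the rule that was violated AND the value that was received."""
    if not isinstance(value, (int, float)) or value <= 0:
        return False, f"400: The price field must be a positive number. Received: {value}."
    return True, None

print(validate_price(-5.99))
# (False, '400: The price field must be a positive number. Received: -5.99.')
```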

  3. Tell me about a database migration you led that was particularly challenging.

    Model answer

    We needed to migrate 500M rows from a MySQL monolith database to PostgreSQL while keeping the application running — zero downtime was a hard requirement. I designed a phased approach. Phase 1: set up PostgreSQL with the new schema and implement dual-write — the application writes to both databases simultaneously. Phase 2: backfill historical data using a batch migration script that processed 100K rows per hour during off-peak hours, with checksums to verify data integrity. Phase 3: gradually shift read traffic to PostgreSQL using a feature flag, starting at 5% and increasing over 2 weeks while monitoring for discrepancies. Phase 4: once 100% of reads were on PostgreSQL with zero discrepancy alerts, we turned off MySQL writes and decommissioned the old database. The entire migration took 6 weeks. We caught 3 data inconsistency bugs during the dual-read phase that would have caused production issues in a big-bang migration.
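
A minimal sketch of the dual-write-plus-checksum idea from Phases 1-2, with plain dicts standing in for the two databases:

```python
class DualWriter:
    """Phase 1 of the migration: write to both stores, read from the old one.
    mysql/postgres here are stand-ins (plain dicts) for the real databases."""

    def __init__(self):
        self.mysql = {}
        self.postgres = {}

    def write(self, key, value):
        self.mysql[key] = value     # source of truth during the migration
        self.postgres[key] = value  # shadow write to the new store

    def verify(self):
        # The integrity-check step: surface rows that diverged between stores.
        return [k for k in self.mysql if self.postgres.get(k) != self.mysql[k]]

db = DualWriter()
db.write("user:1", {"name": "Ada"})
print(db.verify())  # [] -> no discrepancies
```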

  4. Give me an example of how you improved the security of a backend system.

    Model answer

    I conducted a security audit of our authentication system and found several issues: JWT tokens had a 30-day expiration, refresh tokens were stored in localStorage (XSS-vulnerable), and API rate limiting was only applied to login endpoints. I implemented a comprehensive security overhaul: shortened JWT expiration to 15 minutes with silent refresh, moved refresh tokens to HTTP-only secure cookies with SameSite=Strict, implemented sliding-window rate limiting across all authenticated endpoints (100 requests per minute per user), added request signing for sensitive operations (transfers, password changes), and implemented audit logging for all authentication events. I also set up automated dependency vulnerability scanning with Snyk in our CI pipeline. Post-implementation, we passed a third-party penetration test with zero critical or high findings, compared to 4 critical findings in the previous audit.
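
The cookie hardening described here looks like this with Python's standard library (the token value and lifetime are placeholders; SameSite support in `http.cookies` requires Python 3.8+):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["refresh_token"] = "opaque-token-value"  # hypothetical token value
morsel = cookie["refresh_token"]
morsel["httponly"] = True       # not readable from JavaScript -> mitigates XSS theft
morsel["secure"] = True         # only sent over HTTPS
morsel["samesite"] = "Strict"   # not sent on cross-site requests -> mitigates CSRF
morsel["max-age"] = 7 * 24 * 3600

header = cookie.output(header="Set-Cookie:")
print(header)  # the hardened Set-Cookie header line
```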

Technical questions

  1. Explain the CAP theorem and how it affects database selection.

    Model answer

    The CAP theorem states that a distributed data store can guarantee at most two of three properties: Consistency (every read returns the most recent write), Availability (every request receives a response), and Partition tolerance (the system continues operating despite network failures between nodes). Since network partitions are inevitable in distributed systems, the real choice is between consistency and availability during a partition. CP systems (like PostgreSQL with synchronous replication, MongoDB, HBase) refuse to respond rather than return stale data — suitable for financial systems where incorrect data is worse than downtime. AP systems (like Cassandra, DynamoDB, CouchDB) always respond but may return stale data — suitable for social media feeds, product catalogs, or caching layers where eventual consistency is acceptable. In practice, I choose based on the cost of being wrong: if stale data causes financial loss or safety issues, pick CP. If brief staleness is invisible to users, pick AP for better availability and latency.

  2. How would you design a rate limiting system?

    Model answer

    I'd evaluate four algorithms based on the use case. Fixed window: count requests in fixed time intervals (e.g., 100 per minute). Simple but allows bursts at window boundaries — a user could make 100 requests at 0:59 and 100 more at 1:00. Sliding window log: track the timestamp of each request, count requests in the trailing window. Accurate but memory-intensive for high-volume APIs. Sliding window counter: hybrid of fixed and sliding — uses the previous window's count weighted by time overlap. Good accuracy with low memory. Token bucket: tokens accumulate at a fixed rate, each request costs a token. Allows controlled bursts while maintaining average rate. This is my default choice because it's intuitive, handles burst traffic gracefully, and is simple to implement. For implementation: Redis with INCR and EXPIRE for distributed rate limiting, with the key format being user_id:endpoint:window. I'd return rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) so clients can self-throttle, and use HTTP 429 responses with a Retry-After header.
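
A minimal in-process token bucket, the default choice named above. This is a sketch: the distributed Redis variant does the same accounting with INCR/EXPIRE, and the manual clock is only there to make the demo deterministic.

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at `rate` per second up to `capacity`.
    Enforces the average rate while allowing bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

clock = [0.0]  # manual clock for a deterministic demo; omit for real use
bucket = TokenBucket(rate=10, capacity=5, clock=lambda: clock[0])
print([bucket.allow() for _ in range(6)])  # [True, True, True, True, True, False]
clock[0] = 0.1                             # 0.1 s later: one token has refilled
print(bucket.allow())                      # True
```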

  3. What are database indexes and when should you use them?

    Model answer

    An index is a data structure (typically B-tree or hash) that speeds up data retrieval at the cost of slower writes and additional storage. It's like a book's index — instead of reading every page to find a topic, you look it up in the index and jump to the right page. Use indexes on: columns in WHERE clauses (especially on large tables), JOIN columns, columns used in ORDER BY, and columns with high cardinality (many unique values). Don't index: small tables (sequential scan is faster), columns with low cardinality (boolean, status fields with 3 values), columns that are frequently updated (index maintenance overhead), or tables with heavy write traffic where read performance isn't critical. Composite indexes matter: an index on (user_id, created_at) helps queries filtering by both, or by user_id alone, but not by created_at alone. I use EXPLAIN ANALYZE to verify that queries actually use the indexes I create — the query planner sometimes ignores indexes when it estimates a sequential scan is faster.
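
The composite-index behavior described above can be checked directly with SQLite's EXPLAIN QUERY PLAN; the table and index names are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, created_at TEXT, payload TEXT)")
# Composite index: helps filters on user_id, or on (user_id, created_at),
# but NOT a filter on created_at alone.
conn.execute("CREATE INDEX idx_user_created ON events (user_id, created_at)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

print(plan("SELECT * FROM events WHERE user_id = 42"))         # SEARCH using the index
print(plan("SELECT * FROM events WHERE created_at > '2024'"))  # full table SCAN
```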

  4. How do you handle database transactions in a microservices architecture?

    Model answer

    Traditional ACID transactions don't work across microservices because each service owns its own database. The standard solution is the Saga pattern: a sequence of local transactions where each step publishes an event that triggers the next step, and each step has a compensating transaction for rollback. There are two approaches: choreography (services react to events autonomously) and orchestration (a central coordinator manages the flow). I prefer orchestration for complex flows because the logic is in one place and easier to debug. Example: processing an order involves the Order service (create order), Payment service (charge card), and Inventory service (reserve stock). If payment fails, the orchestrator calls the Order service's compensating transaction to cancel the order. If inventory fails after payment succeeds, the orchestrator refunds the payment and cancels the order. Key considerations: idempotency is critical (every step must handle being called twice), compensating transactions must be reliable, and you need observability into the saga's state for debugging failed flows.
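
A toy orchestrator illustrating the saga flow above. Real implementations persist saga state and make every step idempotent; this sketch only shows the forward steps and reverse-order compensation.

```python
def run_saga(steps):
    """Orchestrated saga: run each (action, compensate) pair in order;
    on failure, run compensations for the completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for comp in reversed(completed):
                comp()  # compensating transactions undo earlier local commits
            return "rolled_back"
        completed.append(compensate)
    return "committed"

log = []

def charge_card():
    raise RuntimeError("card declined")  # simulated payment failure

order_saga = [
    (lambda: log.append("order created"),  lambda: log.append("order cancelled")),
    (charge_card,                          lambda: log.append("payment refunded")),
    (lambda: log.append("stock reserved"), lambda: log.append("stock released")),
]
print(run_saga(order_saga), log)
# rolled_back ['order created', 'order cancelled']
```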

Situational questions

  1. Your API is experiencing a 10x traffic spike that's degrading performance. What steps do you take?

    Model answer

    Immediate (first 15 minutes): enable aggressive caching at the CDN/reverse proxy level for read-heavy endpoints. Activate auto-scaling for application servers if it's not already running. Check if the database is the bottleneck — if yes, enable read replicas for GET requests and add connection pooling (PgBouncer for PostgreSQL). Short-term (next hour): implement request throttling for non-critical endpoints to protect critical paths. Add a queue for writes that can be processed asynchronously (user analytics events, email notifications). Consider a circuit breaker on external service calls that might be cascading failures. If this is expected traffic (viral moment, product launch): scale horizontally, add caching layers, and optimize the hottest database queries. If it's unexpected (potential DDoS): check traffic patterns, enable rate limiting by IP, and consider WAF rules. Post-spike: conduct a capacity planning review to set scaling triggers that would handle this automatically next time.

  2. You need to add a feature that requires changes to a heavily-used database table with billions of rows. How do you approach the schema change?

    Model answer

    Never run ALTER TABLE directly on a billion-row table in production — it will lock the table for hours. I'd use an online schema migration tool like pt-online-schema-change (MySQL) or pg_repack (PostgreSQL) that creates a shadow copy of the table, applies the change, syncs data using triggers, and swaps tables with minimal locking. For adding a column: add it as nullable first (fast, no table rewrite), backfill the data in batches during off-peak hours (process 10K rows per batch with a 100ms delay between batches), then add the NOT NULL constraint if needed after backfilling is complete. For changing a column type: create a new column with the target type, dual-write to both columns during migration, backfill the new column, switch application reads to the new column, then drop the old column. The key principles: every step must be reversible, the migration must run in background without locking, and the application must work correctly at every intermediate state.
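
The batched-backfill step can be sketched like this. sqlite3 and tiny batches are used for the demo; production would use something like 10K-row batches with a delay between them, as described above.

```python
import sqlite3
import time

def backfill(conn, batch_size=3, delay=0.0):
    """Backfill the new nullable column in small batches so no single
    statement holds a long lock. Tune batch_size/delay for production."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "UPDATE users SET email_lower = lower(email) "
                "WHERE rowid IN (SELECT rowid FROM users "
                "                WHERE email_lower IS NULL LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total  # nothing left to backfill
        total += cur.rowcount
        time.sleep(delay)  # breathing room between batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, email_lower TEXT)")  # column added nullable
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [("A@X.COM",), ("B@X.COM",), ("C@X.COM",), ("D@X.COM",)])
print(backfill(conn))  # 4
```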

  3. A third-party API your service depends on starts returning errors 50% of the time. How do you handle it?

    Model answer

    First, implement a circuit breaker pattern. After N consecutive failures (say 5), open the circuit — stop calling the API and return a fallback response (cached data, default values, or a graceful degradation message) for a cooldown period. After the cooldown, allow a single test request through. If it succeeds, close the circuit and resume normal operation. Second, add retry logic with exponential backoff for transient failures — retry after 1s, then 2s, then 4s, with jitter to prevent thundering herd. Third, implement a request queue so that operations depending on this API can be retried later if they fail. Fourth, alert the team and check the third-party provider's status page. If this is a persistent issue, consider caching responses more aggressively, building a fallback to an alternative provider, or queuing requests for batch processing during stable periods. Long-term: for any critical third-party dependency, I'd maintain a service-level agreement and have a documented fallback strategy before the first outage occurs.
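
A minimal circuit breaker implementing the open/cooldown/half-open cycle described above; the thresholds and the injectable clock are illustrative choices, not a library API.

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; while open, return the
    fallback without calling the API; after `cooldown`, allow one probe."""

    def __init__(self, threshold=5, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                return fallback    # circuit open: fail fast, don't call the API
            self.opened_at = None  # cooldown elapsed: half-open, let one probe through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # trip the breaker
            return fallback
        self.failures = 0          # success closes the circuit
        return result

clock = [0.0]  # manual clock so the demo is deterministic
breaker = CircuitBreaker(threshold=2, cooldown=10.0, clock=lambda: clock[0])

def flaky_api():
    raise RuntimeError("upstream down")

def healthy_api():
    return "live"

results = [breaker.call(flaky_api, fallback="cached"),    # failure 1
           breaker.call(flaky_api, fallback="cached"),    # failure 2 -> circuit opens
           breaker.call(healthy_api, fallback="cached")]  # open: API not even called
clock[0] = 11.0                                           # cooldown elapsed
results.append(breaker.call(healthy_api, fallback="cached"))  # probe succeeds, closes
print(results)  # ['cached', 'cached', 'cached', 'live']
```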

  4. You're asked to build a notification system that sends emails, push notifications, and SMS. How would you architect it?

    Model answer

    I'd design it as an event-driven system with three layers. First, the notification request layer: services publish notification events to a message queue (RabbitMQ or SQS) with a channel-agnostic payload — recipient, notification type, template name, and template variables. This decouples the sender from delivery mechanics. Second, the orchestration layer: a notification service consumes events, looks up user preferences (which channels they've enabled), resolves templates, and publishes channel-specific messages to per-channel queues (email queue, push queue, SMS queue). Third, the delivery layer: channel-specific workers consume from their queue and call the appropriate provider (SendGrid for email, FCM/APNS for push, Twilio for SMS). Each worker handles retries, rate limiting, and provider-specific logic independently. Key design decisions: idempotency keys to prevent duplicate sends, a notification log for audit and debugging, user preference storage for opt-out/channel selection, and template versioning so we can update templates without redeploying code. I'd also implement a dead letter queue for permanently failed notifications with alerting.
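
The orchestration layer above can be sketched as follows, with in-memory lists standing in for the per-channel queues and a hypothetical user-preference store; the event shape and key format are illustrative.

```python
from collections import defaultdict

# Per-channel queues the delivery workers would consume from
# (stand-ins for real RabbitMQ/SQS queues).
queues = defaultdict(list)

# Hypothetical preference store: which channels each user has opted into.
preferences = {"user-1": {"email", "push"}}

def orchestrate(event):
    """Orchestration layer: expand one channel-agnostic event into
    channel-specific messages, honoring user preferences."""
    for channel in preferences.get(event["recipient"], set()):
        queues[channel].append({
            "recipient": event["recipient"],
            "template": event["template"],
            "vars": event["vars"],
            # Idempotency key so a redelivered event can't cause a duplicate send.
            "idempotency_key": f'{event["id"]}:{channel}',
        })

orchestrate({"id": "evt-1", "recipient": "user-1",
             "template": "order_shipped", "vars": {"order": 42}})
print(sorted(queues))  # ['email', 'push'] -- SMS skipped (not opted in)
```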

Interview tips

For system design questions, always clarify the requirements first. Be prepared to write code: API endpoints, SQL queries, or algorithm implementations. Show that you think about failure modes.


Frequently asked questions

How should you prepare for a backend interview?
Data structures and algorithms, database design, API design (REST principles, authentication), system design (scalability, caching, message queues), and the concurrency model of your primary language.
How important is system design?
Critical for mid-level and senior roles. Expect at least one round dedicated to designing a system.
Should I know multiple languages?
Deep experience in one is better than shallow knowledge of many. Know your primary language's ecosystem thoroughly.
How do backend interviews differ from software engineer interviews?
They focus more on database design, API architecture, and server-side scalability.
