About ThredUp
ThredUp is transforming resale with technology and a mission to inspire the world to think secondhand first. By making it easy to buy and sell secondhand, ThredUp has become one of the world's largest online resale platforms for apparel, shoes and accessories. Sellers love ThredUp because we make it easy to clean out their closets and unlock value for themselves or for the charity of their choice while doing good for the planet. Buyers love shopping value, premium and luxury brands all in one place, at up to 90% off estimated retail price. Our proprietary operating platform is the foundation for our managed marketplace and consists of distributed processing infrastructure, proprietary software and systems and data science expertise. With ThredUp’s Resale-as-a-Service, some of the world's leading brands and retailers are leveraging our platform to deliver customizable, scalable resale experiences to their customers. ThredUp has processed over 172 million unique secondhand items from 55,000 brands across 100 categories. By extending the life cycle of clothing, ThredUp is changing the way consumers shop and ushering in a more sustainable future for the fashion industry.
Recognized on TIME's Most Influential Companies of 2023, Digiday's WorkLife 50 2023, TIME's Best Inventions of 2022, and Lattice's People Success Awards 2022.
How You Will Make An Impact:
As a Senior Engineer, Infrastructure, you will design, build, and evolve the core infrastructure that powers ThredUp, with a particular focus on the reliability, performance, and operability of our database estate. Reporting to the Director of Engineering, you’ll lead cross-team initiatives and partner with stakeholders to ensure our systems are scalable, cost-effective, and secure.
This is an infrastructure-first role for an engineer who is also comfortable owning production AWS Aurora databases (MySQL, PostgreSQL). You’ll spend most of your time on cloud-native infrastructure, orchestration, and automation, while serving as the team’s go-to person for database operations, performance tuning, and safe schema evolution. You will play a key role in defining our technical direction and ensuring our platform enables the rapid growth of our marketplace.
In This Role You'll Get To:
Scope & Impact
- Lead or significantly contribute to medium-to-large infrastructure projects crossing multiple engineering teams.
- Serve as a domain expert in cloud infrastructure, orchestration, observability, and platform automation, and as the team’s primary owner of database reliability and performance.
- Ensure ThredUp’s infrastructure evolves to support scale, resilience, and developer productivity.
Technical Execution
- Architect and implement highly available, secure, and cost-efficient cloud infrastructure using AWS, Kubernetes (EKS), and Terraform.
- Drive improvements in CI/CD, observability, networking, and security across the platform.
- Provide high-quality, impactful technical contributions across infrastructure projects, setting engineering standards.
- Participate in and lead design reviews, providing constructive feedback and driving engineering excellence.
Database Engineering
- Operate, monitor, and tune AWS Aurora and RDS clusters (MySQL and PostgreSQL), including parameter groups, maintenance, minor/major upgrades, and point-in-time restores.
- Own HA and replication behavior: respond to failovers, work with cluster vs. instance endpoints, and run the checks required before promoting a reader to writer.
- Triage and resolve CPU/IO/locking/replication incidents; analyze slow query logs and EXPLAIN/EXPLAIN ANALYZE plans to produce immediate mitigations and long-term fixes.
- Plan and execute zero-downtime schema changes with safe rollback paths (online DDL, gh-ost / pt-online-schema-change, pg_repack, logical replication, trigger-based backfills).
- Partner with application teams to remove N+1 queries, tune indexes, and rewrite inefficient predicates; design partitioning and retention for high-write tables.
- Design, execute, and regularly validate backup and disaster-recovery procedures.
Ownership
- Act as an expert in infrastructure design, performance, and operations across multiple systems and services.
- Promote shared ownership of infrastructure by driving documentation, tooling, and process improvements.
- Monitor and optimize system performance, ensuring reliability while protecting teams from burnout.
Leadership & Collaboration
- Build relationships with engineering teams, product managers, and cross-functional partners to ensure infrastructure supports company goals.
- Contribute to defining strategic technical direction, setting infrastructure roadmaps, and guiding prioritization.
- Advocate for best practices in reliability, security, and cost management.
- Ensure knowledge is shared within the team, reducing single points of failure.
What we are looking for:
- 6+ years of relevant industry experience with a Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- Hands-on experience operating MySQL and PostgreSQL in production (Aurora/RDS preferred), including indexing strategies, transactions, locking/MVCC, and performance tuning.
- Demonstrated experience performing zero-downtime schema changes with safe rollbacks.
- Proven track record of designing and scaling infrastructure for distributed, service-oriented architectures.
- Expertise in AWS (EKS, RDS, IAM, cost optimization).
- Proficiency with Kubernetes and Terraform.
- Experience with CI/CD pipelines and practices (e.g., Jenkins, GitHub Actions, or ArgoCD – one or more).
- Experience with observability and monitoring tooling (Datadog, CloudWatch, or similar), including slow query and error log analysis.
- Ability to diagnose connectivity issues affecting database access (security groups, VPC routes/TGW, VPN, DNS).
- Scripting and automation skills (Python, Bash, or similar).
- Excellent communication and collaboration skills.
Additional skills that are a plus:
- Experience with Teleport (or similar DB proxying) and IAM DB authentication.
- Experience tuning Postgres autovacuum and designing partition-based retention policies.
- Terraform/Terragrunt module design for RDS/Aurora clusters and snapshot restores.
- Familiarity with service mesh (Istio) and edge/WAF tooling (Cloudflare).
- Familiarity with data infrastructure (Airflow, Databricks), messaging/streaming systems (RabbitMQ, Kafka), and NoSQL datastores (DynamoDB, MongoDB).
- Knowledge of network security, vulnerability management, and incident response best practices.
- Experience with Rollbar or centralized logging pipelines.
- Experience in fintech, testing automation, and/or compliance-heavy environments (GDPR, SOC2).
- Prior leadership in scaling platforms for e-commerce or high-growth startups.
What We Offer
- Monthly allowance for insurance/education ($200 gross).
- 50% paid sabbatical after your 3-year anniversary.
- Paid parental leave for new mothers and fathers.
- IT Kit.
We believe diversity, inclusion and belonging are key for our team
At ThredUp, our mission has been built on extending the lives of millions of unique clothing items. Much like our inventory, we are proud to have fostered a workplace that is one-of-a-kind. As a company focused on diversity, inclusion and belonging, we are committed to ensuring our employees are comfortable bringing their authentic selves to work every day. A unique perspective is critical to solving complex problems and inspiring a new generation to think secondhand first. Be you.
If you are a candidate with a disability and have a reasonable accommodation request for the job application process, please email [email protected] with the specific details of your disability-related accommodation request. This email address is reserved for candidates with disabilities only. General application inquiries will not receive a response.