Software Engineer II, Machine Learning Platform
Company: Attentive Inc
Location: San Francisco
Posted on: April 14, 2025
Job Description:
Attentive is the AI-powered mobile marketing platform
transforming the way brands personalize consumer engagement.
Attentive enables marketers to craft tailored journeys for every
subscriber, driving higher recurring revenue and maximizing
campaign performance. Activating real-time data from multiple
channels and advanced AI, the platform personalizes content, tone,
and timing to deliver 1:1 messages that truly resonate. With a
top-rated customer success team recognized on G2, Attentive
partners with marketers to provide strategic guidance and optimize
SMS and email campaigns. Trusted by leading global brands like
Neiman Marcus, Samsung, Wayfair, and Dyson, Attentive ensures
enterprise-grade compliance and deliverability, supporting
trillions of interactions across more than 70 industries. To learn
more or request a demo, visit or follow us on , (formerly Twitter),
or . Attentive's growth has been recognized by , and all thanks to
the hard work from our global employees!
About the Role
We're looking
for a self-motivated, highly driven Software Engineer II to join
our Machine Learning Platform (MLOps) team. As a team, we enable
Attentive's Machine Learning (ML) practice to directly impact
Attentive's AI product suite by providing the tools to train, deploy,
and run inference on ML models with higher velocity and performance,
while maintaining reliability. This team and role are responsible for
building and operating the ML data, tooling, serving, and inference
layers of the ML platform. We are excited to bring on more
engineers to continue expanding this stack.
What You'll Accomplish
- Expand, mature, and optimize our ML platform built around
cutting-edge tooling like Ray, MLflow, Argo, and Kubernetes to
support traditional and deep learning ML models
- Build and mature capabilities to support CPU / GPU clusters,
model performance monitoring, drift detection, automated roll-outs,
and improved developer experience
- Build, operate, and maintain a low-latency, high-volume ML
serving layer covering both online and batch inference use
cases
- Orchestrate Kubernetes and ML training / inference
infrastructure exposed as an ML platform
- Expose and manage environments, interfaces, and workflows to
enable ML engineers to develop, build, and test ML models and
services
- Close the latency gap on model inference to online, real-time
model serving
- Develop automation workflows to improve team efficiency and ML
stability
- Analyze and improve efficiency, scalability, and stability of
various system resources
- Partner with other teams and business stakeholders to deliver
business initiatives
- Help onboard new team members, provide mentorship, and enable
successful ramp-up on your team's codebases
About You
- You have been working in the areas of MLOps / Platform
Engineering / DevOps / Infrastructure for 5+ years, and have an
understanding of gold-standard practices and best-in-class tooling
for ML
- Your passion is exposing platform capabilities through
interfaces that enable high performance ML practices, rather than
designing ML experiments (this team does not directly develop ML
models)
- You understand the key differences between online and offline
ML inference and can articulate what it takes to succeed
with each to meet business needs
- You have experience building infrastructure for an ML platform
and managing CPU and GPU compute
- You have a background in software development and are
passionate about bringing that experience to bear on the world of
ML infrastructure
- You have experience with Infrastructure as Code using Terraform
and can't imagine a world without it
- You understand the importance of CI/CD in building
high-performing teams and have worked with tools like Jenkins,
CircleCI, Argo Workflows, and ArgoCD
- You are passionate about observability and have worked with tools
such as Splunk, Nagios, Sensu, Datadog, and New Relic
- You are very familiar with containers and container
orchestration and have direct experience with vanilla Docker as
well as Kubernetes as both a user and as an administrator
Your Expertise
- You have been working in the areas of ML Platform / MLOps /
Platform Engineering / DevOps / Infrastructure for 3+ years, and
have an understanding of gold-standard practices and best-in-class
tooling for ML
What We Use
- Our infrastructure runs primarily on Kubernetes, hosted in AWS
EKS
- Infrastructure tooling includes Istio, Datadog, Terraform,
Cloudflare, and Helm
- Our backend is Java / Spring Boot microservices, built with
Gradle, coupled with things like DynamoDB, Kinesis, Airflow,
Postgres, PlanetScale, and Redis, hosted via AWS
- Our frontend is built with React and TypeScript, and uses tools
like GraphQL, Storybook, Radix UI, Vite, esbuild, and
Playwright
- Our automation is driven by custom and open-source machine
learning models and lots of data, and is built with Python, Metaflow,
Hugging Face, PyTorch, TensorFlow, and pandas
You'll get competitive , from health & wellness to equity, to help
you bring your best self to work.
For US-based applicants:
- The US base salary range for this full-time position is
$148,000 - $195,000 annually + equity + benefits
- Our salary ranges are determined by role, level, and location
Attentive Company Values
Default to Action - Move swiftly and with purpose
Be One Unstoppable Team - Rally as each other's champions
Champion the Customer - Our success is defined by our customers'
success
Act Like an Owner - Take responsibility for Attentive's success
If you do not meet all the requirements listed here, we still
encourage you to apply! No job description is perfect, and we may
also have another opportunity that closely matches your skills and
experience.
At Attentive, we know that our
Company's strength lies in the diversity of our employees.
Attentive is an Equal Opportunity Employer and we welcome
applicants from all backgrounds. Our policy is to provide equal
employment opportunities for all employees, applicants and covered
individuals regardless of protected characteristics. We prioritize
and maintain a fair, inclusive and equitable workplace free from
discrimination, harassment, and retaliation. Attentive is also
committed to providing reasonable accommodations for candidates
with disabilities. If you need any assistance or reasonable
accommodations, please let your recruiter know.