We're Hiring

Build the Future of Observability

Join a team that's redefining how companies monitor and optimize their applications and AI infrastructure.

Why Join obsdeck

We're building something meaningful with a passionate team

Cutting-Edge Technology

Work with the latest AI, ML, and observability technologies at scale.

Strong Team Culture

Collaborative environment where your ideas and contributions matter.

Competitive Benefits

Comprehensive health benefits, equity, flexible work, and more.

Open Positions

Join our founding team and shape the future of observability

DevOps Engineer

Full-time · Remote · San Francisco, CA

Build and maintain the infrastructure that powers our AI-native observability platform. You'll work on scalable cloud infrastructure, CI/CD pipelines, and deployment automation to support our rapidly growing platform serving mission-critical workloads.

What You'll Do

  • Design and implement scalable cloud infrastructure using Kubernetes and Terraform
  • Build and maintain CI/CD pipelines for rapid, reliable deployments
  • Develop monitoring and alerting systems for our infrastructure and services
  • Optimize cloud costs and resource utilization across our platform
  • Implement security best practices and compliance requirements
  • Collaborate with engineering teams to improve deployment processes
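To give a flavor of the monitoring-and-alerting work above, here is a minimal, purely illustrative sketch of threshold alerting with debouncing (firing only after several consecutive breaches). The `AlertRule` name and fields are hypothetical, not part of any obsdeck system:

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    """Hypothetical threshold rule: fire when a metric breaches a limit
    for `for_points` consecutive samples, debouncing transient spikes."""
    metric: str
    threshold: float
    for_points: int = 3


def evaluate(rule: AlertRule, samples: list[float]) -> bool:
    """Return True if the last `for_points` samples all exceed the threshold."""
    window = samples[-rule.for_points:]
    return len(window) == rule.for_points and all(s > rule.threshold for s in window)


cpu = AlertRule(metric="cpu_usage_pct", threshold=90.0, for_points=3)
print(evaluate(cpu, [85, 92, 95, 97]))  # three consecutive breaches -> True
print(evaluate(cpu, [85, 95, 80, 97]))  # transient spike only -> False
```

Real alerting systems (e.g. Prometheus's `for` clause) apply the same idea over time windows rather than sample counts.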

What We're Looking For

  • 5+ years of experience in DevOps, SRE, or infrastructure engineering
  • Strong experience with Kubernetes, Docker, and container orchestration
  • Proficiency with cloud services and infrastructure management
  • Infrastructure as Code experience (Terraform, CloudFormation)
  • Experience with CI/CD tools (GitHub Actions, Jenkins, ArgoCD)
  • Strong scripting skills (Python, Bash, Go)
  • Understanding of networking, security, and observability concepts

Bonus Points

  • Experience with monitoring and observability tools
  • Background in SRE or platform engineering
  • Experience scaling infrastructure for high-throughput data pipelines
  • Contributions to open-source infrastructure projects

Full Stack Engineer

Full-time · Remote · San Francisco, CA

Design and build user interfaces and APIs that help engineering teams monitor their applications and AI models. You'll work across the entire stack, from React frontends to Node.js/Python backends, creating intuitive experiences for complex observability workflows.

What You'll Do

  • Build responsive, performant web applications using React and TypeScript
  • Design and implement RESTful and GraphQL APIs for data visualization
  • Create data visualization components for metrics, traces, and logs
  • Develop real-time dashboards and alerting interfaces
  • Work with product designers to craft intuitive user experiences
  • Optimize application performance and database queries
  • Write comprehensive tests and maintain high code quality
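The real-time dashboard work above often builds on Server-Sent Events, one of the streaming transports listed under Bonus Points. As a small sketch (the event name and payload are illustrative), this is the `text/event-stream` framing a browser `EventSource` consumes:

```python
import json


def sse_format(event: str, payload: dict) -> str:
    """Frame one Server-Sent Events message per the text/event-stream
    format: an `event:` line, a `data:` line, and a blank-line terminator."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


# A dashboard client subscribed via EventSource would receive this frame.
frame = sse_format("metric_update", {"service": "api", "p95_ms": 182})
print(frame)
```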

What We're Looking For

  • 4+ years of full-stack development experience
  • Strong proficiency in React, TypeScript, and modern JavaScript
  • Experience with backend frameworks (Node.js, Express, or Python/FastAPI)
  • Solid understanding of RESTful API design and GraphQL
  • Experience with relational and NoSQL databases (PostgreSQL, MongoDB)
  • Knowledge of state management (Redux, Zustand, or similar)
  • Strong CSS skills and experience with design systems

Bonus Points

  • Experience building data visualization tools (D3.js, Recharts, Plotly)
  • Background in observability or monitoring tools
  • Experience with real-time data streaming (WebSockets, Server-Sent Events)
  • Knowledge of performance optimization and profiling
  • Open-source contributions to developer tools

Data Scientist

Full-time · Remote · San Francisco, CA

Develop machine learning models for anomaly detection, predictive analytics, and root cause analysis. You'll work with large-scale time-series data and build the algorithms that power intelligent monitoring for applications and AI systems.

What You'll Do

  • Design and implement anomaly detection algorithms for time-series metrics
  • Build predictive models to forecast system behavior and prevent outages
  • Develop root cause analysis systems using correlation and causal inference
  • Create automated baseline learning systems for dynamic thresholds
  • Work with streaming data pipelines to enable real-time ML inference
  • Collaborate with engineers to deploy models into production
  • Analyze and interpret complex patterns in application and infrastructure metrics
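The anomaly-detection and baseline-learning bullets above can be illustrated with the simplest statistical approach: a rolling z-score, where the "baseline" is just the mean and spread of a trailing window. This is a toy sketch with synthetic data, far simpler than production methods:

```python
import statistics


def rolling_zscore_anomalies(series: list[float], window: int = 20, z: float = 3.0) -> list[int]:
    """Flag indices deviating more than `z` standard deviations from the
    mean of the preceding `window` points (a crude learned baseline)."""
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu = statistics.fmean(baseline)
        sigma = statistics.pstdev(baseline)
        if sigma > 0 and abs(series[i] - mu) > z * sigma:
            flagged.append(i)
    return flagged


latency_ms = [100.0 + (i % 5) for i in range(40)]  # steady synthetic baseline
latency_ms[30] = 400.0                             # injected spike
print(rolling_zscore_anomalies(latency_ms))        # only the spike index is flagged
```

Production systems layer seasonality modeling, forecasting, and ML-based detectors on top of this basic idea.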

What We're Looking For

  • MS or PhD in Computer Science, Statistics, Mathematics, or related field
  • 4+ years of experience in data science or machine learning
  • Strong expertise in time-series analysis and forecasting methods
  • Proficiency with Python and ML frameworks (TensorFlow, PyTorch, scikit-learn)
  • Experience with anomaly detection techniques (statistical, ML-based)
  • Understanding of distributed systems and their failure modes
  • Strong foundation in statistics and probability theory

Bonus Points

  • Experience with AIOps or observability platforms
  • Background in causal inference or Bayesian methods
  • Experience with streaming ML (Apache Flink, Spark Streaming)
  • Knowledge of AutoML and automated feature engineering
  • Publications in relevant ML conferences

LLM Research Engineer

Full-time · Remote · San Francisco, CA

Research and develop novel approaches to monitoring and optimizing large language models in production. You'll work on cutting-edge problems like LLM performance profiling, inference optimization, and intelligent prompt analysis for enterprise AI applications.

What You'll Do

  • Research and develop novel methods for LLM observability and monitoring
  • Design systems to track and analyze LLM inference performance, latency, and quality
  • Build automated evaluation frameworks for prompt engineering and model outputs
  • Develop techniques for detecting hallucinations, biases, and degradation in LLM responses
  • Create optimization strategies for LLM serving and resource utilization
  • Analyze failure modes in production LLM deployments
  • Publish research findings and contribute to the ML community
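As a taste of the inference-performance tracking mentioned above, here is a hypothetical sketch of a per-call latency tracker; `LLMCallTracker` and the stand-in model are illustrative, not a real obsdeck API:

```python
import statistics
import time


class LLMCallTracker:
    """Hypothetical sketch of per-call LLM latency tracking; in production
    this signal would feed dashboards and latency-regression alerts."""

    def __init__(self) -> None:
        self.latencies_ms: list[float] = []

    def track(self, call, prompt: str) -> str:
        """Invoke an LLM client callable, recording wall-clock latency."""
        start = time.perf_counter()
        response = call(prompt)
        self.latencies_ms.append((time.perf_counter() - start) * 1000)
        return response

    def p95_latency_ms(self) -> float:
        """95th-percentile latency over recorded calls (needs >= 2 calls)."""
        return statistics.quantiles(self.latencies_ms, n=20)[-1]


# Stand-in for a real model client -- purely illustrative.
fake_model = lambda prompt: prompt.upper()
tracker = LLMCallTracker()
for p in ("hello", "world", "obsdeck"):
    tracker.track(fake_model, p)
print(f"p95 latency: {tracker.p95_latency_ms():.4f} ms")
```

Quality signals (hallucination flags, eval scores) would be recorded alongside latency in the same way.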

What We're Looking For

  • PhD or MS in Computer Science, Machine Learning, or related field
  • 3+ years of hands-on experience with large language models
  • Deep understanding of transformer architectures and attention mechanisms
  • Experience with LLM frameworks (Hugging Face, LangChain, LlamaIndex)
  • Strong programming skills in Python and PyTorch
  • Track record of research publications or significant contributions to ML projects
  • Experience with model evaluation metrics and benchmarking

Bonus Points

  • Experience with LLM fine-tuning and RLHF
  • Knowledge of model compression and quantization techniques
  • Background in production ML systems and MLOps
  • Experience with vision-language models (VLMs)
  • Contributions to major LLM open-source projects
  • Publications at top-tier ML conferences (NeurIPS, ICML, ICLR)

Data Engineer

Full-time · Remote · San Francisco, CA

Build and scale the data infrastructure that powers our observability platform. You'll design and implement large-scale data pipelines processing billions of events per day, working with modern data stack tools to enable real-time and batch analytics for applications and AI models.

What You'll Do

  • Design and build scalable data pipelines for metrics, logs, and traces ingestion
  • Develop ETL/ELT workflows using Airflow to process terabytes of observability data
  • Implement data models and schemas in Snowflake for efficient querying and analytics
  • Build real-time streaming pipelines using Kafka, Flink, or Spark Streaming
  • Optimize data storage and query performance for time-series and event data
  • Implement data quality monitoring and validation frameworks
  • Work with data scientists to enable ML feature engineering at scale
  • Build data APIs and services for internal teams and customer integrations
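The streaming-pipeline bullets above revolve around windowed aggregation. As a minimal sketch in plain Python (the event shape is made up), this is the tumbling-window counting that frameworks like Flink and Spark Streaming perform at scale:

```python
from collections import defaultdict


def tumbling_window_counts(events: list[tuple[int, str]], window_s: int = 60) -> dict:
    """Group (timestamp_s, service) events into fixed, non-overlapping
    (tumbling) windows and count events per (window_start, service)."""
    counts: dict[tuple[int, str], int] = defaultdict(int)
    for ts, service in events:
        window_start = (ts // window_s) * window_s
        counts[(window_start, service)] += 1
    return dict(counts)


# Toy event stream: (timestamp in seconds, emitting service).
events = [(0, "api"), (30, "api"), (61, "api"), (45, "db")]
counts = tumbling_window_counts(events)
print(counts)
```

Real pipelines add event-time watermarks and late-arrival handling on top of this core grouping step.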

What We're Looking For

  • 5+ years of experience in data engineering or similar roles
  • Strong expertise with Snowflake, BigQuery, or other cloud data warehouses
  • Hands-on experience with Airflow (or similar orchestration tools like Dagster, Prefect)
  • Proficiency with streaming platforms (Kafka, Kinesis, Pulsar)
  • Experience with distributed processing frameworks (Spark, Flink, Beam)
  • Strong SQL skills and data modeling expertise
  • Proficiency in Python and familiarity with data frameworks (pandas, dask, polars)
  • Understanding of data governance, security, and compliance best practices

Bonus Points

  • Experience with observability or monitoring platforms
  • Knowledge of dbt for data transformation and modeling
  • Experience with columnar storage formats (Parquet, ORC, Iceberg)
  • Familiarity with data catalog tools (DataHub, Amundsen)
  • Understanding of cost optimization in cloud data platforms
  • Background in real-time analytics and event-driven architectures