Using ScyllaDB for
Real-Time Write-Heavy
Workloads
Felipe Cardeneti Mendes, Technical Director, ScyllaDB
Lubos Kosco, Principal Field Engineer, ScyllaDB
+ For data-intensive applications that require high
throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of
modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon
DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source
solutions
The Database for Gamechangers
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the
power a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of
an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+400 Gamechangers Leverage ScyllaDB
Seamless experiences
across content + devices
Digital experiences at
massive scale
Corporate fleet
management
Real-time analytics
2,000,000 SKU e-commerce
management
Video recommendation
management
Threat intelligence service
using JanusGraph
Real time fraud detection
across 6M transactions/day
Uber scale, mission critical
chat & messaging app
Network security threat
detection
Power ~50M X1 DVRs with
billions of reqs/day
Precision healthcare via
Edison AI
Inventory hub for retail
operations
Property listings and
updates
Unified ML feature store
across the business
Cryptocurrency exchange
app
Geography-based
recommendations
Global operations- Avon,
Body Shop + more
Predictable performance for
on sale surges
GPS-based exercise
tracking
Serving dynamic live
streams at scale
Powering India's top
social media platform
Personalized
advertising to players
Distribution of game
assets in Unreal Engine
Presenters
Felipe Cardeneti Mendes, Technical Director
+ Puppy Lover
+ Open Source Enthusiast
+ ScyllaDB passionate!
Lubos Kosco, Principal Field Engineer
+ Software Engineer
+ 🎹 lover, 🪂pilot
+ ScyllaDB enthusiast :-)
Agenda
+ Characterizing Write-heavy workloads
+ Challenges and Tradeoffs
+ ScyllaDB Under Load
+ Best Practices
+ Success Stories
High throughput REAL-TIME data ingestion.
Characterizing
Write-Heavy Workloads
+ Commonly referred to as "write-mostly"
+ Workloads requiring high volume of writes under very low response times
+ Challenges involve:
+ Scaling writes – Price per operation, often driven by internal design decisions
+ Locking – Adds delays and reduces throughput
+ I/O Bottlenecks – Write amplification & Crash Recovery
+ Conflict Resolution – Resolving conflicts and/or Commit Protocols
+ Database Backpressure – Throttling incoming load
Real-time Write-Heavy?
+ 📟 Internet of Things
+ Often time-series workloads
+ Small (but frequent!) append-only writes
+ Rate determined by number of ingestion endpoints
+ 📶 Logging and Monitoring
+ Similar to IoT, but…
+ Doesn't have a fixed ingestion rate
+ Not necessarily append-only, and prone to hotspots
+ 🎮 Online Gaming
+ Real-time user interactions (state, actions, messaging)
+ Often spiky
+ Very latency dependent
Commonly Seen Use Cases
+ 🚚 E-commerce & Retail
+ Update-heavy and Batch-y
+ Inventory updates, reviews, order status & placement
+ Shopping carts inherently require a read-before-write
+ 🔔 Ad Tech and Real-time Bidding
+ Bid Processing (Impressions, Auction outcomes)
+ User Interaction (Clicks, Conversions, Fraud Detection)
+ Audience Segmentation
+ 📈 Real-time Stock exchange
+ High-frequency trading (HFT)
+ Stock price updates
+ Order matching
What happens during a write?
Challenges and
Tradeoffs
LSM Tree
Underlying Storage Engine
B+Tree
ScyllaDB – Write Path
Memtable
Reads
Commit Log
Incoming
Writes
Storage
SStable 1
SStable 2
SStable 3
Time
SStable 4
SStable 5
SStable
1+2+3
Compaction
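The write path pictured above can be sketched as a toy model. This is a minimal illustration under stated assumptions, not ScyllaDB's implementation: the class name `ToyLSM`, the entry-count flush threshold `memtable_limit`, and the dict-based SSTables are all hypothetical simplifications.

```python
from typing import Dict, List, Optional, Tuple


class ToyLSM:
    """Toy LSM write path: commit log + memtable + immutable SSTables."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit        # hypothetical flush threshold (entries)
        self.commit_log: List[Tuple[str, str]] = []  # append-only log for crash recovery
        self.memtable: Dict[str, str] = {}           # mutable, in memory
        self.sstables: List[Dict[str, str]] = []     # oldest first, never updated in place

    def write(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # 1. append to the commit log
        self.memtable[key] = value            # 2. apply to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush a full memtable to a new SSTable

    def flush(self) -> None:
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable.clear()
            self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            if key in sstable:
                return sstable[key]
        return None

    def compact(self) -> None:
        if not self.sstables:
            return
        merged: Dict[str, str] = {}
        for sstable in self.sstables:  # oldest first, so newer values win
            merged.update(sstable)
        self.sstables = [merged]
```

Writes never modify an existing SSTable; overwritten values stay on disk until compaction merges the files, which is where read, write, and space amplification come from.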
Payload Size
Compression Speed & Efficiency
Compression in ScyllaDB
Compression Chunk Size
Selecting Compression Chunk Sizes for ScyllaDB
+ Determines the size of a compression block
+ ScyllaDB block size defaults to: 4 KB (SSTable), 1 MB (RAID), and 1 KB or 4 KB (filesystem sector size)
+ Trade-off:
+ Larger chunk sizes – Reduce the bandwidth used to write data
+ Smaller chunk sizes – Reduce the bandwidth needed to read data
Chunk size > Partition Size
Chunk size ~= Partition Size
Use case           Recommendation   Comments
small single key   smaller chunks   close to partition size
large single key   larger chunks    close to partition size
partition scans    larger chunks    good cache locality
mostly writes      larger chunks    saves write bandwidth
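The read side of this trade-off can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions: the function names are hypothetical, the partition is assumed to start on a chunk boundary (best case), and compression ratio is ignored. A whole chunk must be read and decompressed even if only part of it is needed.

```python
import math


def bytes_read_for_partition(partition_bytes: int, chunk_bytes: int) -> int:
    """Bytes that must be decompressed to read one partition (best-case alignment)."""
    chunks = max(1, math.ceil(partition_bytes / chunk_bytes))
    return chunks * chunk_bytes


def read_overhead(partition_bytes: int, chunk_bytes: int) -> float:
    """Ratio of bytes decompressed to bytes actually wanted."""
    return bytes_read_for_partition(partition_bytes, chunk_bytes) / partition_bytes


KB = 1024
for chunk_kb in (4, 16, 64):
    small = read_overhead(1 * KB, chunk_kb * KB)    # small single-key partition
    large = read_overhead(100 * KB, chunk_kb * KB)  # large partition
    print(f"chunk={chunk_kb:>2} KB  1 KB partition: {small:.2f}x  100 KB partition: {large:.2f}x")
```

Under these assumptions a chunk much larger than the partition inflates every small read, while a large partition is barely affected, which is why the table suggests keeping the chunk size close to the partition size for single-key reads.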
Compaction Strategy
+ The goal of a compaction strategy is low amplification
+ Read amplification – Avoid reading from too many SSTables
+ Write amplification – Avoid re-writing the same data over and over again
+ Space amplification – Avoid expired/deleted/overwritten data sitting on disk for too long
+ If write performance is important for you…
+ ❌ Avoid Leveled Compaction at all costs!
Every byte has to be rewritten up to 10 times per level
With 4 levels (the usual count), that is potentially up to 40x write amplification
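The worst-case figure above is plain multiplication. A minimal sketch, using the slide's numbers as defaults (the function names are hypothetical, and real amplification depends on the workload):

```python
def leveled_write_amplification(levels: int = 4, rewrites_per_level: int = 10) -> int:
    """Upper bound on how many times each ingested byte is rewritten."""
    return levels * rewrites_per_level


def disk_bytes_written(ingested_bytes: int, levels: int = 4, rewrites_per_level: int = 10) -> int:
    """Worst-case bytes written to disk by compaction for a given ingest volume."""
    return ingested_bytes * leveled_write_amplification(levels, rewrites_per_level)


GB = 1024 ** 3
print(leveled_write_amplification())      # worst-case amplification factor
print(disk_bytes_written(100 * GB) / GB)  # worst-case GB written for 100 GB ingested
```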
ScyllaDB Under Load
Live Optimizing (or Worsening) Write Performance
Avoiding common mishaps
Best Practices
Batching Anti-Pattern
Client App
Far too much work for the coordinator
Christopher Batey's – Misuse of unlogged batches
Prefer Individual Inserts
Client App
Fully utilize cluster processing power
Christopher Batey's – Misuse of unlogged batches
Batching – Good Pattern
Client App
All to the same partition
Christopher Batey's – Misuse of unlogged batches
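The three patterns above suggest a simple client-side rule: batch only rows that share a partition key, and send everything else as individual writes. The sketch below shows that grouping step in plain Python. The `(partition_key, values)` row shape and the `plan_writes` helper are hypothetical; sending the statements is left to whichever driver the application uses.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Row = Tuple[Hashable, dict]  # (partition_key, column values), an assumed shape


def plan_writes(rows: Iterable[Row]) -> Tuple[Dict[Hashable, List[dict]], List[Row]]:
    """Split rows into single-partition batches and individual inserts."""
    groups: Dict[Hashable, List[dict]] = defaultdict(list)
    for partition_key, values in rows:
        groups[partition_key].append(values)

    # Several rows for one partition: one batch, handled by one set of replicas.
    batches = {pk: vals for pk, vals in groups.items() if len(vals) > 1}
    # A lone row gains nothing from batching: send it as an individual insert.
    singles = [(pk, vals[0]) for pk, vals in groups.items() if len(vals) == 1]
    return batches, singles


rows = [
    ("sensor-1", {"t": 1, "temp": 20.5}),
    ("sensor-2", {"t": 1, "temp": 19.0}),
    ("sensor-1", {"t": 2, "temp": 20.7}),
]
batches, singles = plan_writes(rows)
print(batches)
print(singles)
```

A batch that spans many partitions forces one coordinator to fan out to many replicas on the client's behalf, which is the extra coordinator work the anti-pattern slide warns about.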
Views & Global Indexes
+ All writes to a base table are eventually propagated to the view table
+ If the update changes a view’s key column, this means deleting the old view row and
creating a new one:
UPDATE tbl SET x=3
WHERE pk=1
DELETE FROM view
WHERE x=<old value>
INSERT INTO view (x, …)
VALUES (3, <old data>)
read-before-write
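A toy model makes the cost visible. The sketch below is an illustration under stated assumptions, not how ScyllaDB implements views: the class `BaseWithView` is hypothetical, the base table maps `pk` to `x`, the view is keyed by `x`, and `None` stands for "no previous row".

```python
from typing import Dict, Optional, Set


class BaseWithView:
    """Toy base table (pk -> x) with a view keyed by x."""

    def __init__(self) -> None:
        self.base: Dict[int, int] = {}
        self.view: Dict[int, Set[int]] = {}
        self.reads = 0        # read-before-write lookups on the base table
        self.view_writes = 0  # deletes + inserts applied to the view

    def update(self, pk: int, x: int) -> None:
        # Read-before-write: the old value is needed to find the old view row.
        self.reads += 1
        old: Optional[int] = self.base.get(pk)
        self.base[pk] = x
        if old == x:
            return  # view key unchanged, nothing to move
        if old is not None:
            # DELETE FROM view WHERE x = <old value>
            self.view[old].discard(pk)
            if not self.view[old]:
                del self.view[old]
            self.view_writes += 1
        # INSERT INTO view (x, ...) VALUES (<new value>, ...)
        self.view.setdefault(x, set()).add(pk)
        self.view_writes += 1


table = BaseWithView()
table.update(pk=1, x=2)  # first write: one view insert
table.update(pk=1, x=3)  # key change: one view delete plus one view insert
print(table.reads, table.view_writes, table.view)
```

In this model a single base-table update that changes the view key costs one extra read and two extra writes.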
Local Indexes, CDC & Other Animals
+ All writes are applied as a single mutation, however…
+ It also results in write amplification as it requires yet another write
UPDATE tbl SET x=3
WHERE pk=1
INSERT INTO index (pk, x,
…) VALUES (1, 3)
INSERT INTO tbl_cdc_log (…)
VALUES (…)
Success Stories
How ScyllaDB is being used among your peers!
Zillow – Real-time Property Updates
Consuming records from different data producers: Out of Order Writes
+ Why not SQL? Locking.
+ Can't simply insert everything: Data bloat, higher costs
+ Solution: INSERT … USING TIMESTAMP ?
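The effect of writing with an explicit timestamp can be modelled as last-write-wins per key. The sketch below shows only the outcome, under stated assumptions: the `apply_write` helper and the in-memory `store` are hypothetical, and ties keep the existing value as a simplification. In the database itself, reconciliation happens inside the storage engine rather than in client code.

```python
from typing import Dict, Tuple

Cell = Tuple[int, str]  # (write timestamp, value)


def apply_write(store: Dict[str, Cell], key: str, value: str, timestamp: int) -> bool:
    """Apply a write only if it is newer than what is stored. Returns True if applied."""
    current = store.get(key)
    if current is not None and current[0] >= timestamp:
        return False  # stale (out-of-order) write: the newer value stays
    store[key] = (timestamp, value)
    return True


store: Dict[str, Cell] = {}
apply_write(store, "listing-1", "price=500k", timestamp=200)  # newer event arrives first
apply_write(store, "listing-1", "price=450k", timestamp=100)  # older event arrives late
print(store["listing-1"])
```

Because the source timestamp decides the winner, producers can deliver records in any order without locking or a read-before-write.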
“No one will even notice that we’re processing the entirety of Zillow’s property and listings data in order
to correct some data issue or change a business rule. The beauty of that is we can process the entirety
of the data at Zillow that we care about in less than a business day and, again, no performance hit to
real-time data.” – Dan Podhola, Principal Software Engineer
Zillow: Optimistic Concurrency with Write-Time Timestamps
Unleashing the Power of Data with Scylla
"We were looking for something
really specific. A highly scalable,
and highly performant NoSQL
database.
The answer was simple,
ScyllaDB is a better fit for our
use case."
João Pedro Voltani – Head of Engineering
Fanatics – Retail Operations
Use cases:
+ Order capture
+ Product Catalog
+ Shopping Carts
+ Promotions, …
“During a recent peak minute, we saw nearly 280,000 IOPs for a solid minute. With a 3-node ScyllaDB
cluster, we registered zero timeouts. Because of this we had happier customers and application
teams.” – Niraj Kothari, Director of Platforms Engineering
Poll
How much data do you have under
management of your transactional
database?
Keep Learning
scylladb.com/category/engineering Register now at p99conf.io
Visit our blog for
more on ScyllaDB
engineering
bit.ly/dynamodb-mc
DynamoDB
Cost Optimization
Masterclass
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/

Getting Ready for Copilot for Microsoft 365 with Governance Features in Share...
 
BCC -401-aktu-Cyber-Security Unit-1.docx
BCC -401-aktu-Cyber-Security Unit-1.docxBCC -401-aktu-Cyber-Security Unit-1.docx
BCC -401-aktu-Cyber-Security Unit-1.docx
 
Connecting Attitudes and Social Influences with Designs for Usable Security a...
Connecting Attitudes and Social Influences with Designs for Usable Security a...Connecting Attitudes and Social Influences with Designs for Usable Security a...
Connecting Attitudes and Social Influences with Designs for Usable Security a...
 
FIDO Munich Seminar Workforce Authentication Case Study.pptx
FIDO Munich Seminar Workforce Authentication Case Study.pptxFIDO Munich Seminar Workforce Authentication Case Study.pptx
FIDO Munich Seminar Workforce Authentication Case Study.pptx
 
Top keywords searches on home and garden
Top keywords searches on home and gardenTop keywords searches on home and garden
Top keywords searches on home and garden
 
Multimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup SlidesMultimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup Slides
 
The Challenge of Interpretability in Generative AI Models.pdf
The Challenge of Interpretability in Generative AI Models.pdfThe Challenge of Interpretability in Generative AI Models.pdf
The Challenge of Interpretability in Generative AI Models.pdf
 
Project Delivery Methodology on a page with activities, deliverables
Project Delivery Methodology on a page with activities, deliverablesProject Delivery Methodology on a page with activities, deliverables
Project Delivery Methodology on a page with activities, deliverables
 

Using ScyllaDB for Real-Time Write-Heavy Workloads

  • 1. Using ScyllaDB for Real-Time Write-Heavy Workloads Felipe Cardeneti Mendes, Technical Director, ScyllaDB Lubos Kosco, Principal Field Engineer, ScyllaDB
  • 2. The Database for Gamechangers
      + For data-intensive applications that require high throughput and predictable low latencies
      + Close-to-the-metal design takes full advantage of modern infrastructure
      + >5x higher throughput
      + >20x lower latency
      + >75% TCO savings
      + Compatible with Apache Cassandra and Amazon DynamoDB
      + DBaaS/Cloud, Enterprise and Open Source solutions
      “ScyllaDB stands apart...It’s the rare product that exceeds my expectations.” – Martin Heller, InfoWorld contributing editor and reviewer
      “For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.” – Adrian Bridgewater, Forbes senior contributor
  • 3. 3 +400 Gamechangers Leverage ScyllaDB Seamless experiences across content + devices Digital experiences at massive scale Corporate fleet management Real-time analytics 2,000,000 SKU -commerce management Video recommendation management Threat intelligence service using JanusGraph Real time fraud detection across 6M transactions/day Uber scale, mission critical chat & messaging app Network security threat detection Power ~50M X1 DVRs with billions of reqs/day Precision healthcare via Edison AI Inventory hub for retail operations Property listings and updates Unified ML feature store across the business Cryptocurrency exchange app Geography-based recommendations Global operations- Avon, Body Shop + more Predictable performance for on sale surges GPS-based exercise tracking Serving dynamic live streams at scale Powering India's top social media platform Personalized advertising to players Distribution of game assets in Unreal Engine
  • 4. Presenters Felipe Cardeneti Mendes, Technical Director + Puppy Lover + Open Source Enthusiast + ScyllaDB passionate! Lubos Kosco, Principal Field Engineer + Software Engineer + 🎹 lover, 🪂pilot + ScyllaDB enthusiast :-)
  • 5. Agenda
      + Characterizing Write-heavy workloads
      + Challenges and Tradeoffs
      + ScyllaDB Under Load
      + Best Practices
      + Success Stories
  • 6. High throughput REAL-TIME data ingestion. Characterizing Write-Heavy Workloads
  • 7. Real-time Write-Heavy?
      + Commonly referred to as "write-mostly"
      + Workloads requiring a high volume of writes at very low response times
      + Challenges involve:
          + Scaling writes – Price per operation, often driven by internal design decisions
          + Locking – Adds delays and reduces throughput
          + I/O Bottlenecks – Write amplification & crash recovery
          + Conflict Resolution – Resolving conflicts and/or commit protocols
          + Database Backpressure – Throttling incoming load
  • 8. Commonly Seen Use Cases
      + 📟 Internet of Things
          + Often time-series workloads
          + Small (but frequent!) append-only writes
          + Rate determined by number of ingestion endpoints
      + 📶 Logging and Monitoring
          + Similar to IoT, but…
          + Doesn't have a fixed ingestion rate
          + Not necessarily append-only, and prone to hotspots
      + 🎮 Online Gaming
          + Real-time user interactions (state, actions, messaging)
          + Often spiky
          + Very latency dependent
      + 🚚 E-commerce & Retail
          + Update-heavy and Batch-y
          + Inventory updates, reviews, order status & placement
          + Shopping carts inherently require a read-before-write
      + 🔔 Ad Tech and Real-time Bidding
          + Bid Processing (Impressions, Auction outcomes)
          + User Interaction (Clicks, Conversions, Fraud Detection)
          + Audience Segmentation
      + 📈 Real-time Stock exchange
          + High-frequency trading (HFT)
          + Stock price updates
          + Order matching
  • 9. What happens during a write? Challenges and Tradeoffs
  • 11. ScyllaDB – Write Path
      [Diagram labels: Incoming Writes, Commit Log, Memtable, Reads, Storage, SSTable 1 to SSTable 5 over Time, Compaction (SSTable 1+2+3)]
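The write path named on this slide (commit log, memtable, SSTables, compaction) can be illustrated with a toy model. This is only a sketch of the general log-structured idea, not ScyllaDB's implementation: the class name `TinyLSM`, the `memtable_limit` threshold, and the choice to clear the whole commit log on flush are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TinyLSM:
    """Toy write path: commit log + memtable -> SSTables -> compaction."""
    memtable_limit: int = 3                       # hypothetical flush threshold (entries)
    commit_log: list = field(default_factory=list)
    memtable: dict = field(default_factory=dict)
    sstables: list = field(default_factory=list)  # oldest first, treated as immutable

    def write(self, key, value):
        self.commit_log.append((key, value))  # append-only record for crash recovery
        self.memtable[key] = value            # in-memory write, absorbs overwrites
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as a new sorted, immutable table.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed data needs no replay

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table wins
            if key in table:
                return table[key]
        return None

    def compact(self):
        # Merge all tables into one; newer values overwrite older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]
```

Note how a read may have to consult several tables until compaction merges them, which is the read-versus-write amplification trade-off the later slides discuss.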
  • 13. Compression Speed & Efficiency Compression in ScyllaDB
  • 14. Compression Chunk Size: Selecting Compression Chunk Sizes for ScyllaDB
      + Determines the size of a compression block
      + ScyllaDB block size defaults to: 4kB (SSTable), 1MB (RAID), filesystem sector size 1kB || 4kB
      + Trade-off:
          + Larger chunk sizes – Reduces the bandwidth used to write data
          + Smaller chunk sizes – Reduces the bandwidth needed to read data
      [Diagram labels: Chunk size > Partition Size; Chunk size ~= Partition Size]

      Use case           Recommendation   Comments
      small single key   smaller chunks   close to partition size
      large single key   larger chunks    close to partition size
      partition scans    larger chunks    good cache locality
      mostly writes      larger chunks    saves write bandwidth
  • 15. Compaction Strategy
      + The goal of a compaction strategy is low amplification
          + Read amplification – Avoid reading from too many SSTables
          + Write amplification – Avoid re-writing the same data over and over again
          + Space amplification – Avoid expired/deleted/overwritten data sitting on disk for too long
      + If write performance is important for you…
          + ❌ Avoid Leveled Compaction at all costs!
          + Every byte has to be rewritten up to 10 times per level
          + 4 levels usually, potentially up to 40x write amplification
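The read side of the chunk-size trade-off can be made concrete with a little arithmetic. The sketch below assumes a simplified model in which reading a partition means reading every compression chunk it overlaps in full; the function names and the byte sizes used are illustrative, not ScyllaDB internals.

```python
def chunks_touched(offset, length, chunk_size):
    """Number of fixed-size chunks overlapped by the byte range [offset, offset + length)."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return last - first + 1


def read_amplification(partition_bytes, chunk_bytes, offset=0):
    """Bytes read and decompressed per byte of partition data actually wanted."""
    bytes_read = chunks_touched(offset, partition_bytes, chunk_bytes) * chunk_bytes
    return bytes_read / partition_bytes


# A 1 KiB partition costs 4x its size with 4 KiB chunks and 64x with 64 KiB chunks,
# while a 64 KiB partition aligned to a 64 KiB chunk is read with no overhead.
small_with_small_chunks = read_amplification(1024, 4 * 1024)
small_with_large_chunks = read_amplification(1024, 64 * 1024)
large_with_large_chunks = read_amplification(64 * 1024, 64 * 1024)
```

Under this model, chunks close to the partition size keep single-key reads cheap, which matches the recommendations in the table above.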
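The 40x figure on this slide is simply the per-level rewrite count multiplied by the number of levels. A minimal sketch of that worst-case arithmetic follows; the function names are made up for illustration and the defaults are the slide's own numbers.

```python
def worst_case_write_amplification(levels, rewrites_per_level=10):
    """Upper bound: every byte is rewritten up to `rewrites_per_level` times in each level."""
    return levels * rewrites_per_level


def worst_case_disk_writes(ingested_bytes, levels, rewrites_per_level=10):
    """Bytes physically written in the worst case for a given amount of ingested data."""
    return ingested_bytes * worst_case_write_amplification(levels, rewrites_per_level)


amplification = worst_case_write_amplification(levels=4)            # slide's 4 levels
disk_bytes = worst_case_disk_writes(ingested_bytes=100, levels=4)   # 100 bytes ingested
```

This is an upper bound only; actual amplification depends on the data distribution and on how often levels overflow.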
  • 16. ScyllaDB Under Load Live Optimizing (or Worsening) Write Performance
  • 18. Batching Anti-Pattern Client App Far too much work for the coordinator Christopher Batey's – Misuse of unlogged batches
  • 19. Prefer Individual Inserts Client App Fully utilize cluster processing power Christopher Batey's – Misuse of unlogged batches
  • 20. Batching – Good Pattern Client App All to the same partition Christopher Batey's – Misuse of unlogged batches
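One way to follow the three batching slides on the client side is to group pending rows by partition key before deciding what to batch. The sketch below is driver-agnostic and makes assumptions: rows are plain dictionaries, and `sensor_id` is a hypothetical partition key column.

```python
from collections import defaultdict


def group_by_partition(rows, partition_key):
    """Group rows so that each group only contains rows for a single partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    return dict(groups)


# Hypothetical rows waiting to be written.
pending = [
    {"sensor_id": "s1", "ts": 1, "value": 20.5},
    {"sensor_id": "s2", "ts": 1, "value": 18.0},
    {"sensor_id": "s1", "ts": 2, "value": 20.7},
]
per_partition = group_by_partition(pending, "sensor_id")
```

Each resulting group could then be sent as one batch, since all of its rows land on the same partition, while rows for different partitions are better sent as individual inserts so that no single coordinator has to fan them out.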
  • 21. Views & Global Indexes
      + All writes to a base table are eventually propagated to the view table
      + If the update changes a view’s key column, this means deleting the old view row and creating a new one, which requires a read-before-write:
          UPDATE tbl SET x=3 WHERE pk=1
          DELETE FROM view WHERE x=<old value>
          INSERT INTO view (x, …) VALUES (3, <old data>)
  • 22. Local Indexes, CDC & Other Animals
      + All writes are applied as a single mutation, however…
      + It also results in write amplification as it requires yet another write:
          UPDATE tbl SET x=3 WHERE pk=1
          INSERT INTO index (pk, x, …) VALUES (1, 3)
          INSERT INTO tbl_cdc_log (…) VALUES (…)
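The read-before-write that a view update implies can be shown with two in-memory dictionaries. This is a toy model under stated assumptions, not how ScyllaDB stores views: `base` maps a primary key to its `x` value, `view` maps an `x` value to the set of primary keys that have it, and `update_with_view` is a hypothetical helper.

```python
def update_with_view(base, view, pk, new_x):
    """Apply `UPDATE tbl SET x=new_x WHERE pk=pk` and keep the view in sync."""
    old_x = base.get(pk)  # read-before-write: the old view key must be known
    if old_x is not None and old_x != new_x:
        view[old_x].discard(pk)  # DELETE FROM view WHERE x=<old value>
        if not view[old_x]:
            del view[old_x]
    base[pk] = new_x                       # the base table write itself
    view.setdefault(new_x, set()).add(pk)  # INSERT INTO view (x, ...) VALUES (new_x, ...)


base = {1: 2}
view = {2: {1}}
update_with_view(base, view, pk=1, new_x=3)
```

One client write turns into a read plus up to three writes, which is why views are worth weighing carefully in a write-heavy workload.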
  • 23. Success Stories How ScyllaDB is being used among your peers!
  • 24. Zillow – Real-time Property Updates
      Zillow: Optimistic Concurrency with Write-Time Timestamps
      Consuming records from different data producers: Out of Order Writes
      + Why not SQL? Locking.
      + Can't simply insert everything: Data bloat, higher costs
      + Solution: INSERT … USING TIMESTAMP ?
      “No one will even notice that we’re processing the entirety of Zillow’s property and listings data in order to correct some data issue or change a business rule. The beauty of that is we can process the entirety of the data at Zillow that we care about in less than a business day and, again, no performance hit to real-time data.” – Dan Podhola, Principal Software Engineer
  • 25. Unleashing the Power of Data with Scylla
      "We were looking for something really specific. A highly scalable, and highly performant NoSQL database. The answer was simple, ScyllaDB is a better fit for our use case." João Pedro Voltani – Head of Engineering
  • 26. Fanatics – Retail Operations
      Use cases:
      + Order capture
      + Product Catalog
      + Shopping Carts
      + Promotions, …
      “During a recent peak minute, we saw nearly 280,000 IOPs for a solid minute. With a 3-node ScyllaDB cluster, we registered zero timeouts. Because of this we had happier customers and application teams.” – Niraj Kothari, Director of Platforms Engineering
      Read the Case Study
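Why client-supplied write timestamps cope with out-of-order delivery can be sketched with a last-write-wins rule. The model below is an illustration under assumptions, not Zillow's code or ScyllaDB's reconciliation logic: `apply_write` is a hypothetical helper, each key holds one `(timestamp, value)` pair, and ties are resolved by keeping the existing value, which is a simplification.

```python
def apply_write(table, key, value, timestamp):
    """Keep the value with the highest timestamp; report whether the write took effect."""
    current = table.get(key)
    if current is None or timestamp > current[0]:
        table[key] = (timestamp, value)
        return True
    return False  # stale write: an update with a newer timestamp is already stored


listings = {}
# Updates from different producers arrive out of order.
first = apply_write(listings, "home-42", "price=510k", timestamp=200)
late = apply_write(listings, "home-42", "price=500k", timestamp=100)
newest = apply_write(listings, "home-42", "price=505k", timestamp=300)
```

Because the outcome depends only on the timestamps and not on arrival order, writers need neither locks nor a read before each insert.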
  • 27. Poll How much data do you have under management of your transactional database?
  • 28. Keep Learning
      + Visit our blog for more on ScyllaDB engineering: scylladb.com/category/engineering
      + Register now at p99conf.io
      + DynamoDB Cost Optimization Masterclass: bit.ly/dynamodb-mc
  • 29. Thank you for joining us today. @scylladb scylladb/ slack.scylladb.com @scylladb company/scylladb/ scylladb/