Performance & Optimization

Resources #

Notes #

It’s important to have a methodology. Having one lets you say with confidence: “I’ve run these tools in this manner and I didn’t find anything that indicates a performance issue on the server. I think you should check your code.”

Anti-Methods #

Streetlight anti-method: a drunk man looking for his keys under a streetlamp because that’s where the light is, i.e. only checking the tools you already happen to know.

Blame someone else anti-method.

Actual Methodologies #

Problem statement
Workload characterization
USE
Off-CPU Analysis
CPU profile
RTFM Method
Active Benchmarking
Static Performance Tuning

Problem Statement Method #

What makes you think there is a performance problem?
Has this system ever performed well?
What has changed recently? a. Software? b. Hardware? c. Load?
Can the performance degradation be expressed in terms of latency or run time?
Does the problem affect other people or applications (or is it just you)?
What is the environment? Software, hardware, instance types? Versions? Configuration?

Workload Characterization Method #

Who is causing the load? PID, UID, IP address, …
Why is the load called? Code path, stack trace
What is the load? IOPS, throughput, type, read/write
How is the load changing over time?
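As a rough sketch of the who / what / how-over-time questions, here is a small Python summary over I/O event records. The `IoEvent` record and its fields are hypothetical (made up for illustration); real data would come from whatever tracer or log is available. The "why" question (code path, stack trace) is not covered here.

```python
from collections import Counter, namedtuple

# Hypothetical event record; in practice these fields would come from a tracer or log.
IoEvent = namedtuple("IoEvent", "timestamp pid op size_bytes")


def characterize(events):
    """Summarize who issues the load, what kind it is, and how it changes over time."""
    bytes_by_pid = Counter()   # who: bytes moved per process
    ops_by_type = Counter()    # what: count of reads vs. writes
    iops_by_second = Counter() # how it changes: operations per one-second bucket
    for ev in events:
        bytes_by_pid[ev.pid] += ev.size_bytes
        ops_by_type[ev.op] += 1
        iops_by_second[int(ev.timestamp)] += 1
    return {
        "bytes_by_pid": dict(bytes_by_pid),
        "ops_by_type": dict(ops_by_type),
        "iops_by_second": dict(sorted(iops_by_second.items())),
    }


# Made-up sample events, only to show the shape of the summary.
sample_events = [
    IoEvent(0.1, 10, "r", 4096),
    IoEvent(0.5, 10, "w", 8192),
    IoEvent(1.2, 20, "r", 4096),
]
summary = characterize(sample_events)
```

Grouping by PID first tends to answer "is it just one process?" quickly; the same loop could group by UID or IP address instead.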

The USE Method #

Only check these 3 things for all of your resources.

For every resource check:

Utilization
Saturation
Errors

Definitions:

Utilization: busy time
Saturation: queue length or queued time
Errors: easy to interpret (objective)
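To make the first two definitions concrete, here is a minimal Python sketch for one resource, the CPU. It assumes the Linux `/proc/stat` layout for the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal, ...) and uses the 1-minute load average per CPU as a rough saturation hint. Both function names are my own, and the load average is only an approximation of queue length on Linux.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_utilization(before, after):
    """Fraction of time the CPUs were busy between two samples (0.0 to 1.0)."""
    # Guest time is already counted in user/nice, so only the first eight fields are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait
    return (total - idle) / total


def cpu_saturation(load_1min, cpu_count):
    """Rough saturation hint: load per CPU; values above 1.0 suggest queueing."""
    return load_1min / cpu_count


# Two made-up samples in /proc/stat format, standing in for reads taken a moment apart.
first = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0")
second = parse_cpu_line("cpu  150 0 70 1500 80 0 0 0 0 0")
busy = cpu_utilization(first, second)
```

On a Linux box the two samples would be two reads of `/proc/stat` taken a second or so apart, and the load value would come from `/proc/loadavg`.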

It helps if you have a functional (block) diagram of your system / software / environment, showing all resources.

Start with questions, then find the tools.

USE Method: Linux Performance Checklist #

USE Linux

Off-CPU Analysis #

See slides

I’m not sure I understand this really…worth more research.

CPU Profile Method #

Take a CPU profile
Understand all software in profile > 1%

Discovers a wide range of performance issues by their CPU usage.
Narrows software study.

Profiling what’s on CPU narrows down which parts of the software (e.g. MySQL) are actually running and therefore need to be looked at.
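A minimal sketch of the "> 1%" step, assuming the profile is already in the folded-stack text format used by flame graph tooling (`frame;frame;frame count` per line). It attributes each sample to the leaf frame, which is one possible reading of "software in profile"; counting every frame on the stack would be another. The function names in the sample data are invented.

```python
from collections import Counter


def on_cpu_share(folded_text):
    """Return {function: fraction of samples}, attributing each stack to its leaf frame."""
    counts = Counter()
    for line in folded_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated token on the line.
        stack, _, count = line.rpartition(" ")
        leaf = stack.split(";")[-1]
        counts[leaf] += int(count)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {fn: n / total for fn, n in counts.items()}


def worth_understanding(folded_text, threshold=0.01):
    """List functions above the threshold share, largest share first."""
    shares = on_cpu_share(folded_text)
    hot = [fn for fn, share in shares.items() if share > threshold]
    return sorted(hot, key=lambda fn: shares[fn], reverse=True)


# Invented sample profile: 1000 samples in total.
sample_profile = """
mysqld;handle_query;row_search 600
mysqld;handle_query;sort_rows 395
mysqld;log_flush 5
"""
hot_functions = worth_understanding(sample_profile)
```

With this sample, `log_flush` holds 0.5% of the samples and drops out, which is the point of the method: the list of things to study gets shorter.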

RTFM #

How to understand performance tools or metrics?

Man pages
Books
Web search
Co-workers
Talks, slides, videos
Support services
Source code
Experimentation
Social

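Two of these are worth spelling out: reading through the source code, and experimentation, i.e. writing a bit of code that taxes the resource in the way we’re looking for so the tool or metric can be watched while it runs.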
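As an example of the experimentation item, here is a small Python sketch that taxes a single CPU for a fixed time. The function name `burn_cpu` is my own; the idea is only to create a known load and then see how a tool reports it.

```python
import time


def burn_cpu(seconds):
    """Spin on one CPU for roughly `seconds`; return the number of loop iterations."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1  # busy work: no sleeping, so the thread stays on CPU
    return iterations


# Short run here; use a longer duration when watching it with a tool.
loops = burn_cpu(0.05)
```

Running this for 30 seconds or so while watching `top` or `mpstat` should show roughly one CPU's worth of user time. If the tool shows something else, that is a prompt to go back to the man page.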

Tools #

Objectives:

Perform the USE Method for resource utilization
Perform workload characterization for disks, network
Perform CPU Profile Method using flame graphs
Have exposure to various observability tools:
- Basic: vmstat, iostat, mpstat, ps, top
- Intermediate: tcpdump, netstat, nicstat, pidstat, sar
- Advanced: ss, slabtop, perf_events
Perform Active Benchmarking
Understand tuning risks
Perform Static Performance Tuning

Tool Types #

Type	Characteristic
Observability	Watch activity. Safe: usually, depending on resource overhead.
Benchmarking	Load test. Caution: production tests can cause issues due to contention.
Tuning	Change. Danger: changes could hurt performance, now or later with load.
Static	Check configuration. Should be safe.

Basic Observability Tools #

uptime
top or htop
ps
vmstat
iostat
mpstat
free

Intermediate Observability Tools #

strace
tcpdump
netstat
nicstat
pidstat
swapon
lsof
sar - System Activity Reporter