Introduction

A translation of the book at http://oreilly.com/catalog/errata.csp?isbn=9781492034148

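Some words and passages are left untranslated to avoid ambiguity or misunderstanding. Some of the filler passages in the original book may be machine-translated or poorly translated.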

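This book describes in detail how to use the Prometheus monitoring system to monitor, graph, and alert on the performance of your applications and infrastructure. It is intended for application developers, system administrators, and everyone in between.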

Part I. Introduction

  • 1. What Is Prometheus?

    • What Is Monitoring?

      • A Brief and Incomplete History of Monitoring

      • Categories of Monitoring

    • Prometheus Architecture

      • Client Libraries

      • Exporters

      • Service Discovery

      • Scraping

      • Storage

      • Dashboards

      • Recording Rules and Alerts

      • Alert Management

      • Long-Term Storage

    • What Prometheus Is Not

  • 2. Getting Started with Prometheus

    • Running Prometheus

    • Using the Expression Browser

    • Running the Node Exporter

    • Alerting

Part II. Application Monitoring

  • 3. Instrumentation

    • A Simple Program

    • The Counter

      • Counting Exceptions

      • Counting Size

    • The Gauge

      • Using Gauges

      • Callbacks

    • The Summary

    • The Histogram

      • Buckets

    • Unit Testing Instrumentation

    • Approaching Instrumentation

      • What Should I Instrument?

      • How Much Should I Instrument?

      • What Should I Name My Metrics?

  • 4. Exposition

    • Python

      • WSGI

      • Twisted

      • Multiprocess with Gunicorn

    • Go

    • Java

      • HTTPServer

      • Servlet

    • Pushgateway

    • Bridge

    • Parsers

    • Exposition Format

      • Metric Types

      • Labels

      • Escaping

      • Timestamps

      • check metrics

  • 5. Labels

    • What Are Labels?

    • Instrumentation and Target Labels

    • Instrumentation

      • Metric

      • Multiple Labels

      • Child

    • Aggregating

    • Label Patterns

      • Enum

      • Info

    • When to Use Labels

      • Cardinality

  • 6. Dashboarding with Grafana

    • Installation

    • Data Source

    • Dashboards and Panels

      • Avoiding the Wall of Graphs

    • Graph Panel

      • Time Controls

    • Singlestat Panel

    • Table Panel

    • Template Variables

Part III. Infrastructure Monitoring

  • 7. Node Exporter

    • CPU Collector

    • Filesystem Collector

    • Diskstats Collector

    • Netdev Collector

    • Meminfo Collector

    • Hwmon Collector

    • Stat Collector

    • Uname Collector

    • Loadavg Collector

    • Textfile Collector

      • Using the Textfile Collector

      • Timestamps

  • 8. Service Discovery

    • Service Discovery Mechanisms

      • Static

      • File

      • Consul

      • EC2

    • Relabelling

      • Choosing What to Scrape

      • Target Labels

    • How to Scrape

      • metric_relabel_configs

      • Label Clashes and honor_labels

  • 9. Containers and Kubernetes

    • cAdvisor

      • CPU

      • Memory

      • Labels

    • Kubernetes

      • Running in Kubernetes

      • Service Discovery

      • kube-state-metrics

  • 10. Common Exporters

    • Consul

    • HAProxy

    • Grok Exporter

    • Blackbox

      • ICMP

      • TCP

      • HTTP

      • DNS

      • Prometheus Configuration

  • 11. Working with Other Monitoring Systems

    • Other Monitoring Systems

    • InfluxDB

    • StatsD

  • 12. Writing Exporters

    • Consul Telemetry

    • Custom Collectors

      • Labels

    • Guidelines

Part IV. PromQL

  • 13. Introduction to PromQL

    • Aggregation Basics

      • Gauge

      • Counter

      • Summary

      • Histogram

    • Selectors

      • Matchers

      • Instant Vector

      • Range Vector

      • Offset

    • HTTP API

      • query

      • query_range

  • 14. Aggregation Operators

    • Grouping

      • without

      • by

    • Operators

      • sum

      • count

      • avg

      • stddev and stdvar

      • min and max

      • topk and bottomk

      • quantile

      • count_values

  • 15. Binary Operators

    • Working with Scalars

      • Arithmetic Operators

      • Comparison Operators

    • Vector Matching

      • One-to-One

      • Many-to-One and group_left

      • Many-to-Many and Logical Operators

    • Operator Precedence

  • 16. Functions

    • Changing Type

      • vector

      • scalar

    • Math

      • abs

      • ln, log2, and log10

      • exp

      • sqrt

      • ceil and floor

      • round

      • clamp_max and clamp_min

    • Time and Date

      • time

      • minute, hour, day_of_week, day_of_month, days_in_month, month, and year

      • timestamp

    • Labels

      • label_replace

      • label_join

    • Missing Series and absent

    • Sorting with sort and sort_desc

    • Histograms with histogram_quantile

    • Counters

      • rate

      • increase

      • irate

      • resets

    • Changing Gauges

      • changes

      • deriv

      • predict_linear

      • delta

      • idelta

      • holt_winters

    • Aggregation Over Time

  • 17. Recording Rules

    • Using Recording Rules

    • When to Use Recording Rules

      • Reducing Cardinality

      • Composing Range Vector Functions

      • Rules for APIs

      • How Not to Use Rules

    • Naming of Recording Rules

Part V. Alerting

  • 18. Alerting

    • Alerting Rules

      • for

      • Alert Labels

      • Annotations and Templates

      • What Are Good Alerts?

    • Configuring Alertmanagers

      • External Labels

  • 19. Alertmanager

    • Notification Pipeline

    • Configuration File

      • Routing Tree

      • Receivers

      • Inhibitions

    • Alertmanager Web Interface

Part VI. Deployment

  • 20. Putting It All Together

    • Planning a Rollout

      • Growing Prometheus

    • Going Global with Federation

    • Long-Term Storage

    • Running Prometheus

      • Hardware

      • Configuration Management

      • Networks and Authentication

    • Planning for Failure

      • Alertmanager Clustering

      • Meta- and Cross-Monitoring

    • Managing Performance

      • Detecting a Problem

      • Finding Expensive Metrics and Targets

      • Reducing Load

      • Horizontal Sharding

    • Managing Change

    • Getting Help
