# Introduction

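Some words and passages are left untranslated to avoid ambiguity or misunderstanding.\
Some of the filler passages in the original book may be machine-translated or poorly translated.

This book describes in detail how to use the Prometheus monitoring system to monitor, graph, and alert on the performance of applications and infrastructure. It is intended for application developers, system administrators, and everyone else.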

### Part I. Introduction

* **1. What Is Prometheus?**
  * What Is Monitoring?
    * A Brief and Incomplete History of Monitoring
    * Categories of Monitoring
  * Prometheus Architecture
    * Client Libraries
    * Exporters
    * Service Discovery
    * Scraping
    * Storage
    * Dashboards
    * Recording Rules and Alerts
    * Alert Management
    * Long-Term Storage
  * What Prometheus Is Not
* **2. Getting Started with Prometheus**
  * Running Prometheus
  * Using the Expression Browser
  * Running the Node Exporter
  * Alerting

### Part II. Application Monitoring

* **3. Instrumentation**
  * A Simple Program
  * The Counter
    * Counting Exceptions
    * Counting Size
  * The Gauge
    * Using Gauges
    * Callbacks
  * The Summary
  * The Histogram
    * Buckets
  * Unit Testing Instrumentation
  * Approaching Instrumentation
    * What Should I Instrument?
    * How Much Should I Instrument?
    * What Should I Name My Metrics?
* **4. Exposition**
  * Python
    * WSGI
    * Twisted
    * Multiprocess with Gunicorn
  * Go
  * Java
    * HTTPServer
    * Servlet
  * Pushgateway
  * Bridge
  * Parsers
  * Exposition Format
    * Metric Types
    * Labels
    * Escaping
    * Timestamps
    * check metrics
* **5. Labels**
  * What Are Labels?
  * Instrumentation and Target Labels
  * Instrumentation
    * Metric
    * Multiple Labels
    * Child
  * Aggregating
  * Label Patterns
    * Enum
    * Info
  * When to Use Labels
    * Cardinality
* **6. Dashboarding with Grafana**
  * Installation
  * Data Source
  * Dashboards and Panels
    * Avoiding the Wall of Graphs
  * Graph Panel
    * Time Controls
  * Singlestat Panel
  * Table Panel
  * Template Variables

### Part III. Infrastructure Monitoring

* **7. Node Exporter**
  * CPU Collector
  * Filesystem Collector
  * Diskstats Collector
  * Netdev Collector
  * Meminfo Collector
  * Hwmon Collector
  * Stat Collector
  * Uname Collector
  * Loadavg Collector
  * Textfile Collector
    * Using the Textfile Collector
    * Timestamps
* **8. Service Discovery**
  * Service Discovery Mechanisms
    * Static
    * File
    * Consul
    * EC2
  * Relabelling
    * Choosing What to Scrape
    * Target Labels
  * How to Scrape
    * metric\_relabel\_configs
    * Label Clashes and honor\_labels
* **9. Containers and Kubernetes**
  * cAdvisor
    * CPU
    * Memory
    * Labels
  * Kubernetes
    * Running in Kubernetes
    * Service Discovery
    * kube-state-metrics
* **10. Common Exporters**
  * Consul
  * HAProxy
  * Grok Exporter
  * Blackbox
    * ICMP
    * TCP
    * HTTP
    * DNS
    * Prometheus Configuration
* **11. Working with Other Monitoring Systems**
  * Other Monitoring Systems
  * InfluxDB
  * StatsD
* **12. Writing Exporters**
  * Consul Telemetry
  * Custom Collectors
    * Labels
  * Guidelines
### Part IV. PromQL

* **13. Introduction to PromQL**
  * Aggregation Basics
    * Gauge
    * Counter
    * Summary
    * Histogram
  * Selectors
    * Matchers
    * Instant Vector
    * Range Vector
    * Offset
  * HTTP API
    * query
    * query\_range
* **14. Aggregation Operators**
  * Grouping
    * without
    * by
  * Operators
    * sum
    * count
    * avg
    * stddev and stdvar
    * min and max
    * topk and bottomk
    * quantile
    * count\_values
* **15. Binary Operators**
  * Working with Scalars
    * Arithmetic Operators
    * Comparison Operators
  * Vector Matching
    * One-to-One
    * Many-to-One and group\_left
    * Many-to-Many and Logical Operators
  * Operator Precedence
* **16. Functions**
  * Changing Type
    * vector
    * scalar
  * Math
    * abs
    * ln, log2, and log10
    * exp
    * sqrt
    * ceil and floor
    * round
    * clamp\_max and clamp\_min
  * Time and Date
    * time
    * minute, hour, day\_of\_week, day\_of\_month, days\_in\_month, month, and year
    * timestamp
  * Labels
    * label\_replace
    * label\_join
  * Missing Series and absent
  * Sorting with sort and sort\_desc
  * Histograms with histogram\_quantile
  * Counters
    * rate
    * increase
    * irate
    * resets
  * Changing Gauges
    * changes
    * deriv
    * predict\_linear
    * delta
    * idelta
    * holt\_winters
  * Aggregation Over Time
* **17. Recording Rules**
  * Using Recording Rules
  * When to Use Recording Rules
    * Reducing Cardinality
    * Composing Range Vector Functions
    * Rules for APIs
    * How Not to Use Rules
  * Naming of Recording Rules

### Part V. Alerting

* **18. Alerting**
  * Alerting Rules
    * for
    * Alert Labels
    * Annotations and Templates
    * What Are Good Alerts?
  * Configuring Alertmanagers
    * External Labels
* **19. Alertmanager**
  * Notification Pipeline
  * Configuration File
    * Routing Tree
    * Receivers
    * Inhibitions
  * Alertmanager Web Interface

### Part VI. Deployment

* **20. Putting It All Together**
  * Planning a Rollout
    * Growing Prometheus
  * Going Global with Federation
  * Long-Term Storage
  * Running Prometheus
    * Hardware
    * Configuration Management
    * Networks and Authentication
  * Planning for Failure
    * Alertmanager Clustering
    * Meta- and Cross-Monitoring
  * Managing Performance
    * Detecting a Problem
    * Finding Expensive Metrics and Targets
    * Reducing Load
    * Horizontal Sharding
  * Managing Change
  * Getting Help

