Shell Command Execution Visualization and Alerting Tool (Part 2)

Asciibox
    asciiboxes:
      - title: UTC time
        rate-ms: 500 # sampling rate, default = 1000
        font: 3d # font type, default = 2d
        border: false # border around the item, default = true
        color: 43 # 8-bit color number, default is white
        sample: env TZ=UTC date +%r

5. Additional features

Triggers

Triggers allow conditional actions to be performed, such as visual or sound alerts or arbitrary shell commands. The following example illustrates the concept.
Clock gauge: shows how far the current minute has progressed since it began, together with the current time.
    gauges:
      - title: MINUTE PROGRESS
        position: [[0, 18], [80, 0]]
        cur:
          sample: date +%S
        max:
          sample: echo 60
        min:
          sample: echo 0
        triggers:
          - title: CLOCK BELL EVERY MINUTE
            condition: '[ $label == "cur" ] [...]

    [...] # executes with a required rate, in scope of the interactive session
    transform: echo result = $sample # executes in scope of local session, $sample variable is available for transformation

PTY mode

In some cases an interactive shell will not work because its stdin is not a terminal. In such cases PTY mode can be used:
    textboxes:
      - title: Neo4j polling
        pty: true # enables pseudo-terminal mode, default = false
        init: cypher-shell -u neo4j -p pwd --format plain
        sample: RETURN rand();
        transform: echo "$sample" | tail -n 1
      - title: Top on a remote server
        pty: true # enables pseudo-terminal mode, default = false
        init: ssh -i ~/user.pem ec2-user@1.2.3.4
        sample: top

Step-by-step init commands

Before sampling starts, several init commands can also be executed one after another.
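    textboxes:
      - title: Java application uptime
        multistep-init:
          - java -jar jmxterm-1.0.0-uber.jar
          - open host:port # or local PID
          - bean java.lang:type=Runtime
        sample: get Uptime

Variables

If the configuration file contains repeated patterns, they can be extracted into the variables section. Variables can also be specified at startup with the -v/--variable flag, and any system environment variable can be used in the scripts as well.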
    variables:
      mongoconnection: mongo --quiet --host=localhost test
    barcharts:
      - title: MongoDB documents by status
        items:
          - label: IN_PROGRESS
            init: $mongoconnection
            sample: db.getCollection('events').find({status:'IN_PROGRESS'}).count()
          - label: SUCCESS
            init: $mongoconnection
            sample: db.getCollection('events').find({status:'SUCCESS'}).count()
          - label: FAIL
            init: $mongoconnection
            sample: db.getCollection('events').find({status:'FAIL'}).count()

Color themes
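    theme: light # default = dark
    sparklines:
      - title: CPU usage
        sample: ps -A -o %cpu | awk '{s+=$1} END {print s}'

6. Real-world scenarios

Databases

Below are connection examples for different databases. It is recommended to use an interactive shell (the init script) to establish the connection only once and then reuse it during sampling.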
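The "connect once, then reuse" idea can be tried without any database installed. The sketch below is only an illustration in plain POSIX shell and makes no claim about how the tool implements init internally: `fake_db_shell` is a hypothetical stand-in for a real client such as mysql or psql, and the two named pipes play the role of the long-lived session.

```shell
#!/bin/sh
# Hypothetical stand-in for a database shell: reads one query per line and
# answers each query with exactly one line.
fake_db_shell() {
    while IFS= read -r query; do
        printf 'result of [%s]\n' "$query"
    done
}

# Send one query to the already running session and print its one-line reply.
sample() {
    printf '%s\n' "$1" >&3
    IFS= read -r reply <&4
    printf '%s\n' "$reply"
}

run_session_demo() {
    workdir=$(mktemp -d)
    mkfifo "$workdir/to_shell" "$workdir/from_shell"

    # "init": start the session exactly once, in the background.
    fake_db_shell <"$workdir/to_shell" >"$workdir/from_shell" &
    session_pid=$!

    # Keep both pipes open for as long as the session should live.
    exec 3>"$workdir/to_shell" 4<"$workdir/from_shell"

    # "sample": every call talks to the same running process.
    sample 'select 1'
    sample 'select 2'

    # Closing the input pipe ends the session; then clean up.
    exec 3>&- 4<&-
    wait "$session_pid"
    rm -rf "$workdir"
}

run_session_demo
```

Starting the client once avoids paying the connection and authentication cost on every sampling tick, which matters when the sampling rate is a few hundred milliseconds.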
MySQL

    # prerequisite: installed mysql shell
    variables:
      mysql_connection: mysql -u root -s --database mysql --skip-column-names
    sparklines:
      - title: MySQL (random number example)
        pty: true
        init: $mysql_connection
        sample: select rand();

PostgreSQL

    # prerequisite: installed psql shell
    variables:
      PGPASSWORD: pwd
      postgres_connection: psql -h localhost -U postgres --no-align --tuples-only
    sparklines:
      - title: PostgreSQL (random number example)
        init: $postgres_connection
        sample: select random();

MongoDB

    # prerequisite: installed mongo shell
    variables:
      mongo_connection: mongo --quiet --host=localhost test
    sparklines:
      - title: MongoDB (random number example)
        init: $mongo_connection
        sample: Math.random();

Neo4j

    # prerequisite: installed cypher shell
    variables:
      neo4j_connection: cypher-shell -u neo4j -p pwd --format plain
    sparklines:
      - title: Neo4j (random number example)
        pty: true
        init: $neo4j_connection
        sample: RETURN rand();
        transform: echo "$sample" | tail -n 1

Kafka

Check the Kafka lag: sum the lag values of each queue and alert when the sum exceeds a threshold, across multiple consumer groups and multiple topics.
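The summing step can be checked in isolation before pointing it at a live cluster. The sketch below feeds a hand-written table to the same awk program that the configuration for this scenario uses; the table contents and the `sum_lag` helper name are made up for illustration, and the sketch assumes that LAG is the fifth column. The column order can differ between Kafka versions, so it is worth comparing against the header line of your own `--describe` output.

```shell
#!/bin/sh
# Sum the fifth column of every line except the header line.
sum_lag() {
    awk 'NR>1 {sum += $5} END {print sum}'
}

# Made-up sample in the assumed layout; not real cluster output.
sum_lag <<'EOF'
TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID
events  0          100             110             10   consumer-1
events  1          200             205             5    consumer-2
EOF
```

When the input contains only the header line, this program prints an empty line; printing `sum+0` instead of `sum` would give 0 in that case.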
    variables:
      kafka_connection: $KAFKA_HOME/bin/kafka-consumer-groups --bootstrap-server localhost:9092
    runcharts:
      - title: Kafka lag per consumer group
        rate-ms: 5000
        scale: 0
        items:
          - label: A->B
            sample: $kafka_connection --group group_a --describe | awk 'NR>1 {sum += $5} END {print sum}'
          - label: B->C
            sample: $kafka_connection --group group_b --describe | awk 'NR>1 {sum += $5} END {print sum}'
          - label: C->D
            sample: $kafka_connection --group group_c --describe | awk 'NR>1 {sum += $5} END {print sum}'

Docker

Docker container statistics (CPU, MEM, I/O)
    textboxes:
      - title: Docker containers stats
        sample: docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}\t{{.PIDs}}"

SSH

The top command on a remote server
    variables:
      sshconnection: ssh -i ~/my-key-pair.pem ec2-user@1.2.3.4
    textboxes:
      - title: SSH
        pty: true
        init: $sshconnection
        sample: top

JMX

Java application uptime example
    # prerequisite: download the jmxterm jar file
    textboxes:
      - title: Java application uptime
        multistep-init:
          - java -jar jmxterm-1.0.0-uber.jar
          - open host:port # or local PID
          - bean java.lang:type=Runtime
        sample: get Uptime
        transform: echo $sample | tr -dc '0-9' | awk '{printf "%.1f min", $1/1000/60}'
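The transform pipeline above can be tested on its own. In this sketch the input string `Uptime = 5400000;` is an assumed shape for the reply to `get Uptime`, and `format_uptime` is a hypothetical helper name; the pipeline itself is the one from the configuration. It keeps only the digits, treats them as milliseconds, and converts to minutes.

```shell
#!/bin/sh
# Keep only the digits of the reply, then convert milliseconds to minutes.
format_uptime() {
    printf '%s\n' "$1" | tr -dc '0-9' | awk '{printf "%.1f min", $1/1000/60}'
}

# Assumed reply format; 5400000 ms is 90 minutes.
format_uptime 'Uptime = 5400000;'
```

Because `tr -dc '0-9'` keeps every digit in the reply, any other number in the output (a timestamp prefix, for instance) would be glued onto the uptime value, so the reply should contain the uptime as its only number.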