-
PostgreSQL Cluster utilization!
pgCluu is a PostgreSQL performances monitoring and auditing tool.
View reports of all statistics collected from your PostgreSQL databases cluster. pgCluu will show you the entire informations of the PostgreSQL Cluster and the system utilization
-
About pgCluu
pgCluu is a Perl program used to perform a full audit of a PostgreSQL Cluster performances. It is divided in two parts, a collector used to grab statistics on the PostgreSQL cluster using psql and sar, a grapher that will generate all HTML and charts output. It is fully open source and free of charge.
License
pgCluu is licenced under the PostgreSQL Licence a liberal Open Source license, similar to the BSD or MIT licenses.
That mean that all parts of the program are open source and free of charge.
This is the case for both, the collecter and the grapher programs.
Download
Release and source code at github:
[ https://github.com/darold/pgcluu ].ChangeLog can be read on-line on GitHub repository here
Releases at SourceForge:
[ http://sourceforge.net/projects/pgcluu/ ].Offical web site is hosted at pgcluu.darold.net
Authors
pgCluu is an original development of Gilles Darold.
Some parts of the collector are taken from pgstats a C program writen by Guillaume Lelarge and especially the SQL queries including the compatibility with all PostgreSQL versions.
Btw pgCluu grapher is compatible with files generated by pgstats, sar and sadc so you can use it independantly to graph those data. Some part of the sar output parser are taken from SysUsage.
-
Support
pgCluu is open-source and you can get community support using online services like the bug tracker at GitHub.org. You can also write to me directly at < gilles _AT_ darold _DOT_ net >
For professional support you can contact one of the excellent company near of your listed in the PostgreSQL Professional Services page, they all do great job.
Bugs reports & features requests
Bugs reports, pull resquests, fix and features requests can be sent through the GitHub tracker.
You can also write to me at < gilles _AT_ darold _DOT_ net >.pgCluu development
Source Lines of Code: 21,737 Estimated Cost to Develop: $ 685,017 (Generated using David A. Wheeler's 'SLOCCount'.)
-
2019-09-18 - Version 3.0 released
This release is a major release that adds lot of improvement and new reports. It also improve performances and fully support a CGI mode with dynamic content for realtime reports.
New or enhanced reports/features:
- The CGI mode to provide dynamic reports on time selection is now production ready and has left the beta stage. See documentation on how to enable it. Debian packaging enable this mode by default. - Installation and configuration is fully managed from Makefile.PL script. This will help distribution packaging with additional configuration directives to control the behavior of pgcluu scripts with generic systemd service and timer files. - Add incremental mode to pgcluu. That mean that you don't have to remove the content of the report directory anymore before running pgcluu again and when pgcluu detect that you are running in this mode. The incremental mode is detected from statistic directory when you have used the --rotate-daily or --rotate-hourly option to pgcluu_collectd. - Rewrite Stats I/O reports to display read/hit per table or indexes during the audit period reported. - Move Xlog (Wal bytes written) report under replication menu as data comes from pg_stat_replication. - Explode System and Cluster reports into several reports dedicated files to limit the size of the HTML files in incremental mode. - Add collect and report of vm.nr_overcommit_hugepages. - Add report on pages scanned to report number of pages scanned by the kswapd daemon and number of pages scanned directly per second (pgscank/s and pgscand/s). It also reports the number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands (pgsteal/s). On a second axis %vmeffto reports the the efficiency of page reclaim. - All pages reports have been moved as sub menu of the Page menu and pgfree has been removed from the page fault report. - Limit system cache statistics report to pgpgin/s and pgpgout/s and a new report is dedicated to Page fault statistics with majflt/s, minflt/s and pgfree/s dataset. Dataset minflt/s is calculated from sar output as result of "fault/s - majflt/s". - All statistics about pg_stat_user_* and pg_stat_xact_user_* views are now stored as a snapshot at start and at end of the audit period and each hour if you are running in incremental mode. The old behavior was to append the snapshot at each running interval to the data file but this is not necessary and this can use huge disk space if you have lot of table and indexes in your database. Backward compatibility is preserved. - New report of json versus jsonb columns in each database. - Add keepalive kernel configuration in SysInfo report. Separate bgwriter "allocated" buffers in a dedicated report for better reading. - Add new report about estimated memory workload based on the value of Committed_AS from /proc/meminfo. Committed_AS is the total amount of memory estimated to complete the workload. This value represents the worst case scenario value, and also includes swap memory. The report show the memory workload aver the time.
New command line options:
- Add option -x, --external-menu to pgcluu to save the common menu in menu.html and load it into each report using w3-include-html attribute from w3.js. This will only work if access to HTML report is through a Web server, not using the file:// protocol. - Add new option --retention to pgcluu to set number of rolling days to keep in report directory. Default is to store indefinitely. - Add option --retention NDAYS to pgcluu_collectd to set the number of rolling days to keep in data directory in incremental mode. Default is to store indefinitely.
New configuration directives :
- Add configuration directive STATS_REPORT_CACHING to pgcluu.conf file. This directive must be enabled in CGI mode. In this case the pgcluu script must be run periodically at least each 5 minutes to build and update cached statistics. In cache mode pgcluu generate binary file in the statistics directory and nothing in the report directory. The pgcluu.service and pgcluu.timer can be use, pgcluu will read the configuration file and switch automatically to cache mode if STATS_REPORT_CACHING is enabled. - Add MAX_RENDERED_DAYS configuration directive to set the maximum number of days in a graph. Using default pgcluu_collectd 1 minute interval each, daily graph will have 1440 points. Having too much data to render is not a good point for performances with the current code. The user can set is own graph period, default is to limit to the first seven days of the selected period. This mean a maximum of 10080 points for one week, this seems a safe limit.
It also fixes some issues reported by users since last ten months.
See ChangeLog for a complete list of changes and bug fixes.
2018-12-13 - Version 2.9 released
This release is a minor release that fix a major issue in the interval detection. This This led to a bad scale in bg_writter, replication and most database statistics reports. Every one must upgrade immediately.
It also fixes some issues reported by users since last release:
- Improve devices list retrieving and fix several menu issues. - Fix checkpoint counter legend. - Move query in pg_stat_statements at the right of the table for better readability. - Fix report of disk use, mounted devices, running processes, crontab and installed package to avoid duplicate entries. - Empty SysInfo arrays before reading sysinfo file, this prevent duplicate data. - Improve parsing of sar file. - Fix interval that was set in milliseconds instead of second. This conduct to wrong scale (ex: Kb instead of Gb) in bg_writter, replication and database statistics reports. Thanks to Guillaume Lelarge for the report.
See ChangeLog for a complete list of changes and bug fixes.
2018-11-28 - Version 2.8 released
This release is a minor release that fix some issues reported by users over past months, it also adds some new reports:
- Add disk space utilization report over time using df command for sysstat version older than 11.1.4. - Add report of percentage of disk space and inode used on each file system. The information is available in sar output since 11.1.4 version. - Add PageTables information in SysInfo memory report. - Show database information about table/indexes in capture mode. - Add report of number of tasks currently blocked, waiting for I/O to complete "Run queue" report. - Add new report about system dirty memory that need to be written to disk as well as amount of active/inactive memory. - Show highest dirty memory to write and highest number of processes blocked in overall stats page. - Add report of crontab entries for the user running pgcluu. - Add report of installed PostgreSQL packages in main page.
New pgcluu_collectd command line option:
- Add command line option --package-list to be able to set a custom command to list PostgreSQL packages. Default is to auto-detect package type between rpm and dpkg, using command 'rpm -qa' or 'dpkg -l'. If you have an other system you can use this option to set a custom command. A filter on keyword 'postgres' is appended to the command: ' | grep postgres'.
See ChangeLog for a complete list of changes and bug fixes.
2018-07-09 - Version 2.7 released
This release is a minor release that fix some issues reported by users over past year, it also adds some new reports:
* Add reports of pgbouncer wait for server statistics. * Make pgCluu more compatible with PG10 and pgBouncer 1.8 * Add kernel vm.overcommit_kbytes configuration to report.
Upgrade: you can safely override previous installation, backward compatibility with 2.6 version is preserved.
See ChangeLog for a complete list of changes and bug fixes.
2017-07-09 - Version 2.6 released
This release is a minor release that fix some issues reported by users over past year but it also adds some new interesting reports:
* Add report on prepared transaction and oldest one in seconde per database. * Detect partitions and summarize information in a dedicated report. * Add kernel scheduler configuration for sched_autogroup_enabled and sched_migration_cost_ns to sysinfo report. * Add report of configuration files changes in incremental mode. * Report on cancelled queries due to conflicts is now a time based graph instead of a pie chart. * bgwritter buffer clean, checkpoint and backend statistics are now reported as bytes using size of 8192 per buffer. * Add report of allocated buffers with bgwritter buffer statistics. * Add report of transaction throughput per second. * Show data checksum status. * Add report of unlogged tables. Database with unlogged tables will be listed in cluster view. * Add hourly index when --rotate-hourly is enabled.
and useful features:
* Compatibility with PostgreSQL 10.0 * Add static index on main directory with incremental report to link to the different days. * Use bootstrap modal dialog windows to download graph as png. * Autodetect interval between collected data to support interval change during stats collect. * Replace javascript call to dateToDisplay.toGMTString() with dateToDisplay.toString(). Please note that this could not be backward compatible with your previous timezone settings. See pgcluu.js to revert the function call. * Create DDL of missing index concurrently.
Upgrade: you can safely override previous installation, backward compatibility with 2.5 version is preserved.
See ChangeLog for a complete list of changes and bug fixes.
2016-04-27 - Version 2.5 released
This release fixes some issues reported by users over past year and a full replacement of the flotr2 javascript chart library with jqplot.
There's also some new interesting reports:
* Add report for hash indexes. * Keep track of pg_settings and database/roles settings changes and show diff in the the reports. * Add report of pending restart in pg settings view. * Add information about percentage of timed against requested checkpoints. * Split database menu in submenu per set of 10 databases. * Add report of tables without indexes and tables with more than five indexes. * Add report about invalid index after concurrency build. * Add report of system and PostgreSQL uptime. * Add report of last statistics reset per database and report of last autovacuum and autoanalyze. * Add report of bgwriter last statistics reset in Home/Cluster view. * Add a non default configuration settings report.
and useful features:
* Autodetect timezone from csv data files and automatic adjustment of chart axis. * Add systemd start script. * Add collect of crontab information. * Add support to daylight saving. * Add compatibility with PostgreSQL 9.5 * Add ablity to export plots data as csv * Set legend table outside the graph. * Display specific titles and description for overall graphs.
Note that a CGI script have been added to be able to perform temporal lookup in incremental pgCluu statistics, with predefined year, month, day and hour views. This is a work in progress, it will be available in next major release.
Upgrade: you can safely override previous installation, backward compatibility with 2.4 version is preserved.
See ChangeLog for a complete list of changes and bug fixes.
2015-07-25 - Version 2.4 released
This release is a maintenance release that fix some issues. There's also some new interesting reports:
* Transfers per second (read/write/both) on all devices from sar -b * Transfers per second for each device from sar -d . * Number of tasks created per second * Number of context switches per seconds. * Improve pg_stat_statement report by adding all shared block stats and read/write I/O timing per query when track_io_timing is enabled. * Add device with highest tps on overall system information.
and useful features:
* Add --capture mode to pgcluu_collectd to be be able to build a snapshot of the PostgreSQL instance and exit. pgCluu will automatically adapt the report to this capture mode. It will use a temporary directory /tmp/pgcluu_capture to generate a tarball /tmp/pgcluu_capture.tar.gz containing the capture. * Add pgCluu logo and ico to the html output. * Add --charset option to be able to change the html charset, default: utf8. * Allow regular expression in database list available in reports, for example: with --db-only "p.*", only database beginning with p will be reported. * Allow pgcluu to parse and compute statistics from gzip compressed files. * pgcluu will not stop anymore if the sar file is not found, it will only show a warning message and continue.
This release also adds -r | --rotate-daily and -R | --rotate-hourly options to pgcluu_collectd be able to rotate statistic files on a daily or hourly basis. You can use -z or --compress option to compress rotated data files. Thanks to Euler Taveira de Oliveira for the feature request.
There also some code relative to next coming major release that will be used to allow a full incremental mode and a temporal navigation into the collected statistics from a CGI program. This code enable caching (option -C or --cache) by dumping statistics stored in memory into binary files. With those files, data files can be removed (automatically with option -c or --clean) and report can be build later from them. This is not really useful now but this s the first stage to build incremental and cumulative reports.
Here are the complete list of bug fixes in this release:
- Fix pretty print number format when units are blocks and not sizes. - Do not apply database list restriction on total cluster size calculation. - Fix Statistics about I/O on Indexes. - Rewrite some parts of the overall system statistics for better performances. - Fix bug in overall stat for Most read/written device, they was multiplied by 512 (size of a block) twice. - Fix system report of r/w and tps per devices. - Fix wrong report of pg_buffercache statistics. - Remove some global variables with local redeclaration. - Fix use of uninitialized value during build of pg_stat_statement report. - Fix Statements statistics not available because a local var is used during computation. Thanks to Michel Meyer for the patch. - Fix uninitialized value and wrong overall reports in Home menu. Thanks to Assem Bayahi for the report. - Remove garbage from last commit. - Fix replication lag statistics report and add new report about "Number of xlog data written per second." - Fix documentation about --included-db - Fix pgcluu_collectd crash when --no-database is used and the psql command is not available. Thanks to Ronan Dunklau for the report. - Fix call to cluster canceled queries statistics. - Fix menu when no database are found. - Fix start/end date and path to sar file in incremental and capture mode. - Separate load of statistics from report builder. - Make paths internally relative into output directory especially for incremental facilities. - Fix several issue that prevented configuration files to be copied into the output directory. - Allow white space in db's names. Thanks to Nicolas Thauvin for the report. - Fix wrongly disabled temporary files menu. - Fix database list with character - inside the name - Fix replication and checkpoint report and corresponding menus disabled. Thanks to Zsolt for the report. - Fix a print on undefined value when a device is mounted after the first execution of pgcluu. Thanks to Ezequiel Mina for the report. - Add timezone option for perl localtime() and fix mixed formated tabs. Thanks to David Cramblett for the patch. - Add missing information into usage about -w | --password option. - Fix for sar command when ssh not in use. Thanks to David Cramblett for the patch. - Make error message during first connection more explicit. Thanks to Gregoire Pineau for the report. - Be sure that interval is always != 0 to avoid illegal division by zero. Thanks to Gregoire Pineau for the report. - Fix control character in substitution regex. - Fix error message for $OUTPUT_DIR not being empty should not use $INPUT_DIR. Thanks to Matthew Musgrove for the patch. - Fix limit reports to the database when --only-db is used. Thanks to Bianca Santana Espichicoquez for the report.
2015-02-06 - Version 2.3 released
This release is a maintenance release that fix some issues. There's also a new report about "Role Settings" per database.
- Report default parameters values set with ALTER DATABASE and ALTER ROLE in new menu item: "Database/Role Settings". Thanks to Thomas Reiss for the feature request. - Fix detection of disk device in sar file. - Add boot value to the settings report - Add Unit and Reset value in pg_settings report to highlight parameters where values have been changes outside the configuration file. - Fix handling of sysinfo information, and enhance .gitignore. Thanks to Julien Rouhaud for the patch. - Fix kernel.* and transparent_hugepage display on Sysinfo section. Thanks to Julien Rouhaud for the patch.
2015-01-05 - Version 2.2 released
This release is a maintenance release that fix several issues. There's also a simple report of transparent_hugepage from the system and a menu enhancement by dividing the device menu per 10 items which is helpful when there's plenty of disk devices.
- Increase copyright year to 2015. - Fix bug in database list extraction. - Fix query to get missing foreign key indexes. Thanks to Ronan Dunklau for the patch. - Add collect of transparent_hugepage information into sysinfo.txt file. (read from /sys/kernel/mm/transparent_hugepage/-) - Fix overwriting of idle_in_xact for all database, should be idle. Thanks to Guillaume Lelarge for the patch. - Fix query for missing indexes on FKs. The previous query reported every index that SHOULD exist for FKs, but was not filtering out the already existing ones. Thanks to Ronan Dunklau for the patch. - Do not report redundant indexes when one is partial and not the other one. Thanks to Ronan Dunklau for the report. - Fix statement to search redundant index by not reporting index that has uniqueness and not the other one. The statement now also reports duplicate indexes on the same column and not only composite indexes. Thanks to Ronan Dunklau for the report. - Divide Devices menu by part of 10 devices. Useful when there is plenty of disk devices. - Add --from-sa-file to allow parsing of sar output coming from a sa daily file. - Fix call to local sar command and limit test on application_name for pg version >= 9.0. Thanks to Julien Rouhaud for the report. - Set application_name to pgcluu before collecting data and exclude from pg_stat_activity queries generated by pgcluu.
2014-09-25 - Version 2.1 released
This new release adds lot of report improvements and bug fixes. There is also several new features or reports :
- Allow system information commands to be executed remotely. - Allow sar to be executed on a remote server using a ssh connection. - New report of connections waiting for a lock. - Add average duration time in statement report. - New pg_stat_statements report. - Add pg_default and pg_global to tablespace size report.
New options to pgcluu_collectd:
-M or --max-size option to allow an output directory size limit. --no-database to avoid collecting statistics from a database. -C count to terminate program after collecting data for X times. -E or --end-after to terminate program after for some time. -V or --version to show version information.
And some more changes/fixes, see ChangeLog for a complete list.
2014-03-31 - Version 2.0 released
This major release adds more than twenty new reports. pgCluu now reports everything you want to know about your PostgreSQL server from a cluster, database, or operating system point of view. It is also able to collect statistics about pgBouncer, shared buffer utilization using pg_buffercache, most frequent and time consuming queries with pg_stat_statements and really much more, see full list bellow.
New or enhanced reports/features:
- Allow pgcluu_collected to collect sar and configuration files remotely together with PostgreSQL statistics using a ssh connection. That mean that pgCluu is fully able to audit a remote server. - Add pg_stat_statements report showing most frequent and time consuming queries. - Add reports for unused and redundant indexes per database. - Add report about archiver statistics with PostgreSQL 9.4 new pg_stat_archiver view. - Add report about wal files created vs recycled and the max number of wal files. - Add reports about network interfaces utilization and network errors. - Add reports about system cache page in/out from/to disk. - Add report of isdirty statistics. - Add --exclude-time to pgcluu_collectd options to stop collecting statistics during a period of time. Ex: --exclude-time "22:00-06:00" to stop collecting data from 22:00 to 06:00 am the next day. - Add report about percentage of shared buffer used per database - Add report about percentage of each databases loaded in shared buffer - Add report about usagecount distribution in shared buffer - Add report about usagecount distribution in dirty shared buffer - Add report of number of shared buffer/pages used by a relation - Add report number of buffers loaded in cache for a relation relation and the percentage of the relation loaded (1). - Add reports about configuration files (postgresql.conf, pg_hba.conf and pg_ident.conf), pgcluu_collectd must be run as postgresql user. - Add report of user functions statistics. - Add report about the pgbouncer settings in Configuration submenu. - Add copy of pgbouncer.ini file in Configuration submenu. - Add report of pg_settings in Configuration submenu.. - Add report with SQL orders to create missing indexes on foreign keys. - Allow HTML table sort with a modified version of the javascript library sorttable.js - Moved System info report from sar sytem menu to a dedicated main SysInfo menu. - Add the number and list of extensions used inside the PG cluster. - Add reports about Locks per types, Locks per modes and Locks granted or not. - Add report about I/O statistics per index. - Add report about I/O statistics per table. - Add last manual vacuum/analyze datatime on database info report. - Add per database index statistics. - Add report of size and number of tuples per table of a database. - Add statistics reports per table (idx_scan/seq_scan, vacuums/analyzes, Insert/update/delete/hot update and live/dead tuples). - Add count of user triggers in database information slide. - Add Database info submenu per database to display the general information about a database (installed extension, schemas, number of Stored procedures and of all kind of object. - Add last know size of the database in the Database info slide. - Add total number of databases on cluster key values. - Add report of tablespaces utilization. - Add report of pgbouncer statistics per dbname and per pool (dbname/username). (1) [Shared buffers statistics are collected automatically when pg_buffercache extension is installed and the -B | --enable-buffercache option is used.]
And some more changes/fixes, see ChangeLog for a complete list.
2014-01-28 - Version 1.1 released
This new release adds lot of report improvements and bug fixes. There is also several new features or reports :
- Format mouse tracker on graphs to show all dataset values at a time.
- Add run queue length report to system menu.
- Add checkpoint write and sync times reports.
- Add report of PostgreSQL version
- Split background writer buffer and count statistics into separated reports.
- Add report of maxwritten_clean into bgwriter reports.
- Add report of kernel parameters to the system info page.
- Add collect of system kernel tuning parameters.
- Remove embedded CSS and javascript on each HTML page, resources are now automatically copied into the output directory if not already present.
- Allow pgcluu to parse sar file generated from sa file, use commands -f /var/log/sysstat/sa*".
- Split commit, rollback and backend graph by using a second yaxis for backend.
- Add System Information report.
- Move Cache hit/miss ratio on second yaxis and change dataset colors.
- Add collect of OS release information.
- Allow pgcluu_collectd to grab OS information (cpu, memory, etc.) and
- info option to only grab that information (for testing).
- Reformat dashboard information.
- timezone to set the hour(s) from GMT time to adjust times on sar report.
and lot of changes/fixes, see ChangeLog for a complete list.
2013-11-18 - Version 1.0 released
This is the first public release of pgCluu, that is a packaging of the tools I use every day to collect statistics and build reports of PostgreSQL Clusters for performances auditing and troubleshooting.
At this time it collect and report most of what is helpful for a PostgreSQL Cluster performance auditing. There's lot of others reports to be included:
- Statistics reports concerning tables.
- Statistics reports about pg_stat_statement.
- More Sar statistiques reports.
- ...
This will comes in next release.
The goal of this project is to provide a complete PostgreSQL auditing tool that do not need any dependency so that it can be run on any server.
If you just have a sar output file, pgCluu can be use to draw graphs about the system utilization only.