dba:postgresql:a_tale_of_postgresql_the_second
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revision | |||
dba:postgresql:a_tale_of_postgresql_the_second [2023/10/24 09:23] – dodger | dba:postgresql:a_tale_of_postgresql_the_second [2023/10/25 06:45] (current) – removed dodger | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | # A Thale(s) of PostgreSql (the 2nd) | ||
- | |||
- | # Servers | ||
- | |||
- | | ip | role | | ||
- | |----|------| | ||
- | | `10.3.0.3` | primary | | ||
- | | `10.6.1.3` | backup | | ||
- | |||
- | # The problem | ||
- | |||
- | Just open the log give us information about the error: | ||
- | ``` | ||
- | The network name cannot be found. | ||
- | 2023-10-19 09: | ||
- | 2023-10-19 09: | ||
- | 2023-10-19 09: | ||
- | ``` | ||
- | |||
- | The command executed by `archive_command` is failing. | ||
- | The `archive_command` on PostgreSql main configuration file `postgresql.conf` is set as: | ||
- | ```config | ||
- | archive_command = 'copy " | ||
- | ``` | ||
- | |||
- | Additionally, | ||
- | ``` | ||
- | The network name cannot be found. | ||
- | ``` | ||
- | |||
- | Without connecting again, in my oppinion, the ip address: `10.6.1.3` (which is the _backup_ server), either is *not* accessible via windows shared drive or the shared drive `\\10.6.1.3\archivelog` is full (0% space available). | ||
- | |||
- | Anyway, the result of the `archive_command` is _ERROR_. And the default behaviour of PostgreSql in such case is *DON' | ||
- | |||
- | PostgreSql [Documentation](https:// | ||
- | > While designing your archiving setup, consider what will happen if the archive command fails repeatedly because some aspect requires operator intervention or the archive runs out of space. For example, this could occur if you write to tape without an autochanger; | ||
- | |||
- | |||
- | That's the cause of the problems you're facing. As the result of `archive_command` fail, WAL are not being automatically deleted then the disk will run out of space. PostgreSql is waiting for manual intervention so `archive_command` ends with _OK_ and it can continue deleting old WAL files. | ||
- | |||
- | |||
- | |||
- | # Solutions | ||
- | |||
- | ## Solution 1: Solve the connectivity problems | ||
- | |||
- | First solution should be fast and doesn' | ||
- | You have to check: | ||
- | * Connectivity from _primary_ to the shared Network drive: `\\10.6.1.3\archivelog` | ||
- | * Disk space available on the _backup_ server in the underlying drive of the shared network drive `\\10.6.1.3\archivelog` | ||
- | |||
- | ### Additional actions on this solution | ||
- | |||
- | To avoid possible future failures. | ||
- | |||
- | Add check to the monitorization tool to check: | ||
- | * Connectivity from _primary_ server to the shared Network drive | ||
- | * Connectivity from _backup_ server to the shared Network drive | ||
- | * Disk space available on the _backup_ server in the underlying drive of the shared network drive | ||
- | |||
- | |||
- | ## Solution 2: Change archive_command | ||
- | |||
- | This solution will involve PostgreSql server to be restarted to apply changes. | ||
- | |||
- | Modify `archive_command` to copy WAL files locally: | ||
- | ```conf | ||
- | archive_command = 'copy " | ||
- | ``` | ||
- | Create the destination directory: | ||
- | ```powershell | ||
- | mkdir C: | ||
- | ``` | ||
- | And restart PostgreSql. | ||
- | |||
- | *Note*: This solution is not recommended because you won't have cross-repliaction. That is, in our initial recomendations we suggest that _primary_ server copies the WAL files to the _backup_ server to be able to execute a Point-in-time-recover (PITR), that is a *best-practice*. | ||
- | |||
- | |||
- | |||
- | # Additional notes and recommendations | ||
- | |||
- | ## PostgreSQL version | ||
- | |||
- | You're running version 12 which has EOL (end-of-life) support on [November 14, 2024](https:// | ||
- | Our recommendation is to upgrade to v16. | ||
- | v16 will be on production state for many months when final upgrade happens, so we suggest to plan the version upgrade as soon as possible. | ||
- | |||
- | ## Archivelog location | ||
- | |||
- | Consider having a external shared storage shared with both _primary_ and _backup_ for archivelog storage. | ||
- | |||
- | # Thales Hyderabad info | ||
- | You can safely ignore this. | ||
- | |||
- | ``` | ||
- | Archive: | ||
- | Written using Zippy | ||
- | Length | ||
- | --------- | ||
- | 9315434 | ||
- | 5903497 | ||
- | --------- | ||
- | | ||
- | ``` | ||
- | |||
- | |||