What's new in PostgreSQL 9.4
Modern Database
PostgreSQL has used a JSON data type for some time to simplify storing JSON documents in relational databases. This approach is useful for handling JSON documents as entities instead of mapping to a relational data model. Additionally, it allows data storage within a database, with all the advantages that PostgreSQL offers the user, such as uncompromising transaction security, excellent extensibility, and scalability in the same software component.
However, the original implementation of the JSON data type had a major drawback: JSON documents were stored in the database as strings and required parsing and analysis every time JSON data was accessed. Accessing elements of a document with this kind of representation is also very complex, to say nothing of indexing with any kind of efficiency.
PostgreSQL 9.4 has now introduced a data type, in the form of JSONB, that handles and saves JSON documents as structured, binary data types. This allows efficient access to JSON documents as well as the implementation of fast index access methods. Scalar values from JSONB documents are stored as basic PostgreSQL types within a document. The query half-way down Figure 1 shows a very simple example with the table movies
whose column entry
contains a JSONB document with a film title
. The @>
operator checks whether the left operand contains the right JSONB expression.
Additionally, the <@
operator checks whether the left JSONB expression is contained in the right JSONB operand, the ?
operator verifies whether a specific key exists, and ?|
tests whether a key or element exists in the left JSONB operand. The ?&
operator finds out whether all the keys from the given set are present in the JSONB document. All these operators can be accelerated using a GIN index for such queries created by the database admin:
CREATE INDEX ON movies USING gin(entry);
Testing for the presence of certain attribute values or the existence of certain keys in JSONB documents and the associated possibility to index relevant queries provides a variety of interesting applications compared with the old JSON data type.
Configuration Changes Using SQL Commands
Until now, administrators were forced to rely on manual editing for changes to configuration variables in the postgresql.conf
configuration file. This included, for example, settings for the shared buffer pool (shared_buffers
) or the number of transaction log segments (checkpoint_segments
).
As of version 9.4, PostgreSQL has a tool for changing all settings in postgresql.conf
using SQL commands in the form ALTER SYSTEM
. Another configuration file is maintained by the database server for this purpose – the postgresql.auto.conf
file.
The ALTER SYSTEM
command changes settings in this additional configuration file; PostgreSQL always reads this after postgresql.conf
at startup or when it receives a SIGHUP signal. This means that settings made by the administrator using ALTER SYSTEM
always have priority over the corresponding settings in postgresql.conf
. The administrator must have superuser permissions. The following example illustrates the new syntax.
ALTER SYSTEM SET maintenance_work_mem TO '1GB'; SELECT pg_config_reload();
Because maintenance_work_mem
can be easily changed at run time, reloading the configuration files will suffice here; this is handled by the legacy pg_config_reload()
function. Changes can be reset specifically to the original default value for a particular variable:
ALTER SYSTEM RESET maintenance_work_mem ;
All settings that were made via ALTER SYSTEM
can be reset once using the keyword ALL
:
ALTER SYSTEM RESET ALL;
ALTER SYSTEM
facilitates administration (particularly on database systems) where there is strict separation between database and system administration and where configuration files cannot be edited easily. The new command even simplifies administration when installing in the cloud, where there is no direct access to the shell.
Improvements for Materialized Views
The basic functionality for materialized views was introduced in version 9.3 of PostgreSQL. The REFRESH MATERIALIZED VIEW
command has now been improved in version 9.4. In version 9.3, if an exclusive lock on the view existed, it had to be held while refreshing the materialized dataset and thus allowed no concurrent access; the new CONCURRENTLY
option allows materialized views to be updated without locking concurrent queries.
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_test;
This action takes longer than an exclusive REFRESH
with larger amounts of data. Furthermore, REFRESH MATERIALIZED VIEW CONCURRENTLY
requires a unique index on one or more columns of the view.
Replication Slots
Replication slots expand the infrastructure with streaming replication. Until PostgreSQL 9.3, connections via streaming replication protocol were relatively anonymous: Using hot_standby_feedback
to submit feedback regarding the current state of a streaming standby to the master was an option. However, this only had an effect, for example, on any replication conflicts that arose, such as VACUUM
or lock management. If a streaming replication connection falls too far behind, transaction logs on the master can already have been cleared, thus causing a replication breakdown. The standby can then only catch up using an existing archive.
Replication slots now provide a kind of named replication connection, which accurately provides information about which position this slot occupies in the transaction log. PostgreSQL will then stop automatically recycling required transaction log segments until the slot is released again. Replication slots therefore summarize all the characteristics of hot_standby_feedback
, wal_keep_segments
, and vacuum_defer_cleanup_age
. However, PostgreSQL administrators absolutely have to monitor the size of the transaction log if replication slots are only served very rarely or if standby servers are down for extended periods of time.
A distinction is made between physical and logical replication slots. The latter are required for logically decoding the transaction log. A physical replication slot configured by the new configuration parameter primary_slot_name
in recovery.conf
is required for streaming replication.
Creating such a physical replication slot is simple: The SQL function pg_create_physical_replication_slot()
or pg_create_logical_replication_slot()
creates a physical or logical replication slot. Existing replication slots can be queried using the new system view pg_replication_slots
.
The new configuration parameter max_replication_slots
must be set to the maximum number of replication slots in preparation. Changing this setting requires a reboot. The database administrator can now create a physical replication slot (Listing 1).
Listing 1
Physical Replication Slot
SELECT * FROM pg_create_physical_replication_slot('pgstandby'); slot_name | xlog_position --------------+--------------------- pgstandby | (1 row)
The slot is automatically activated if a streaming standby server connects to this slot after the parameter primary_slot_name = 'pgstandby'
was set in its configuration file.
Listing 2 shows the information from pg_replication_slots
for this replication slot.
Listing 2
Information About Replication Slot
SELECT * FROM pg_replication_slots ; slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn -----------+--------+-----------+--------+----------+--------+------+--------------+------------- pgstandby | | physical | | | t | | | 0/76000138 (1 row)
Buy this article as PDF
(incl. VAT)
Buy ADMIN Magazine
Subscribe to our ADMIN Newsletters
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Most Popular
Support Our Work
ADMIN content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.