/* Void Main's man pages */
{ phpMan } else { main(); }
CREATE INDEX(7) SQL Commands CREATE INDEX(7)
NAME
CREATE INDEX - define a new index
SYNOPSIS
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] name ON table [ USING method ]
( { column | ( expression ) } [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ WITH ( storage_parameter = value [, ... ] ) ]
[ TABLESPACE tablespace ]
[ WHERE predicate ]
DESCRIPTION
CREATE INDEX constructs an index named name on the specified table. Indexes are primarily used to enhance database per-
formance (though inappropriate use can result in slower performance).
The key field(s) for the index are specified as column names, or alternatively as expressions written in parentheses.
Multiple fields can be specified if the index method supports multicolumn indexes.
An index field can be an expression computed from the values of one or more columns of the table row. This feature can be
used to obtain fast access to data based on some transformation of the basic data. For example, an index computed on
upper(col) would allow the clause WHERE upper(col) = 'JIM' to use an index.
PostgreSQL provides the index methods B-tree, hash, GiST, and GIN. Users can also define their own index methods, but
that is fairly complicated.
When the WHERE clause is present, a partial index is created. A partial index is an index that contains entries for only
a portion of a table, usually a portion that is more useful for indexing than the rest of the table. For example, if you
have a table that contains both billed and unbilled orders where the unbilled orders take up a small fraction of the
total table and yet that is an often used section, you can improve performance by creating an index on just that portion.
Another possible application is to use WHERE with UNIQUE to enforce uniqueness over a subset of a table. See in the docu-
mentation for more discussion.
The expression used in the WHERE clause can refer only to columns of the underlying table, but it can use all columns,
not just the ones being indexed. Presently, subqueries and aggregate expressions are also forbidden in WHERE. The same
restrictions apply to index fields that are expressions.
All functions and operators used in an index definition must be ``immutable'', that is, their results must depend only on
their arguments and never on any outside influence (such as the contents of another table or the current time). This
restriction ensures that the behavior of the index is well-defined. To use a user-defined function in an index expression
or WHERE clause, remember to mark the function immutable when you create it.
PARAMETERS
UNIQUE Causes the system to check for duplicate values in the table when the index is created (if data already exist) and
each time data is added. Attempts to insert or update data which would result in duplicate entries will generate
an error.
CONCURRENTLY
When this option is used, PostgreSQL will build the index without taking any locks that prevent concurrent
inserts, updates, or deletes on the table; whereas a standard index build locks out writes (but not reads) on the
table until it's done. There are several caveats to be aware of when using this option -- see Building Indexes
Concurrently [create_index(7)].
name The name of the index to be created. No schema name can be included here; the index is always created in the same
schema as its parent table.
table The name (possibly schema-qualified) of the table to be indexed.
method The name of the index method to be used. Choices are btree, hash, gist, and gin. The default method is btree.
column The name of a column of the table.
expression
An expression based on one or more columns of the table. The expression usually must be written with surrounding
parentheses, as shown in the syntax. However, the parentheses can be omitted if the expression has the form of a
function call.
opclass
The name of an operator class. See below for details.
ASC Specifies ascending sort order (which is the default).
DESC Specifies descending sort order.
NULLS FIRST
Specifies that nulls sort before non-nulls. This is the default when DESC is specified.
NULLS LAST
Specifies that nulls sort after non-nulls. This is the default when DESC is not specified.
storage_parameter
The name of an index-method-specific storage parameter. See Index Storage Parameters [create_index(7)] for
details.
tablespace
The tablespace in which to create the index. If not specified, default_tablespace is consulted, or
temp_tablespaces for indexes on temporary tables.
predicate
The constraint expression for a partial index.
INDEX STORAGE PARAMETERS
The optional WITH clause specifies storage parameters for the index. Each index method has its own set of allowed storage
parameters. The B-tree, hash and GiST index methods all accept a single parameter:
FILLFACTOR
The fillfactor for an index is a percentage that determines how full the index method will try to pack index
pages. For B-trees, leaf pages are filled to this percentage during initial index build, and also when extending
the index at the right (adding new largest key values). If pages subsequently become completely full, they will
be split, leading to gradual degradation in the index's efficiency. B-trees use a default fillfactor of 90, but
any integer value from 10 to 100 can be selected. If the table is static then fillfactor 100 is best to minimize
the index's physical size, but for heavily updated tables a smaller fillfactor is better to minimize the need for
page splits. The other index methods use fillfactor in different but roughly analogous ways; the default fillfac-
tor varies between methods.
GIN indexes accept a different parameter:
FASTUPDATE
This setting controls usage of the fast update technique described in in the documentation. It is a Boolean param-
eter: ON enables fast update, OFF disables it. (Alternative spellings of ON and OFF are allowed as described in
in the documentation.) The default is ON.
Note: Turning FASTUPDATE off via ALTER INDEX prevents future insertions from going into the list of pending index
entries, but does not in itself flush previous entries. You might want to VACUUM the table afterward to ensure the
pending list is emptied.
BUILDING INDEXES CONCURRENTLY
Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed
against writes and performs the entire index build with a single scan of the table. Other transactions can still read the
table, but if they try to insert, update, or delete rows in the table they will block until the index build is finished.
This could have a severe effect if the system is a live production database. Very large tables can take many hours to be
indexed, and even for smaller tables, an index build can lock out writers for periods that are unacceptably long for a
production system.
PostgreSQL supports building indexes without locking out writes. This method is invoked by specifying the CONCURRENTLY
option of CREATE INDEX. When this option is used, PostgreSQL must perform two scans of the table, and in addition it
must wait for all existing transactions that could potentially use the index to terminate. Thus this method requires more
total work than a standard index build and takes significantly longer to complete. However, since it allows normal opera-
tions to continue while the index is built, this method is useful for adding new indexes in a production environment. Of
course, the extra CPU and I/O load imposed by the index creation might slow other operations.
In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then the two ta-
ble scans occur in a second and third transaction. If a problem arises while scanning the table, such as a uniqueness
violation in a unique index, the CREATE INDEX command will fail but leave behind an ``invalid'' index. This index will be
ignored for querying purposes because it might be incomplete; however it will still consume update overhead. The psql \d
command will report such an index as INVALID:
postgres=# \d tab
Table "public.tab"
Column | Type | Modifiers
--------+---------+-----------
col | integer |
Indexes:
"idx" btree (col) INVALID
The recommended recovery method in such cases is to drop the index and try again to perform CREATE INDEX CONCURRENTLY.
(Another possibility is to rebuild the index with REINDEX. However, since REINDEX does not support concurrent builds,
this option is unlikely to seem attractive.)
Another caveat when building a unique index concurrently is that the uniqueness constraint is already being enforced
against other transactions when the second table scan begins. This means that constraint violations could be reported in
other queries prior to the index becoming available for use, or even in cases where the index build eventually fails.
Also, if a failure does occur in the second scan, the ``invalid'' index continues to enforce its uniqueness constraint
afterwards.
Concurrent builds of expression indexes and partial indexes are supported. Errors occurring in the evaluation of these
expressions could cause behavior similar to that described above for unique constraint violations.
Regular index builds permit other regular index builds on the same table to occur in parallel, but only one concurrent
index build can occur on a table at a time. In both cases, no other types of schema modification on the table are allowed
meanwhile. Another difference is that a regular CREATE INDEX command can be performed within a transaction block, but
CREATE INDEX CONCURRENTLY cannot.
NOTES
See in the documentation for information about when indexes can be used, when they are not used, and in which particular
situations they can be useful.
Currently, only the B-tree, GiST and GIN index methods support multicolumn indexes. Up to 32 fields can be specified by
default. (This limit can be altered when building PostgreSQL.) Only B-tree currently supports unique indexes.
An operator class can be specified for each column of an index. The operator class identifies the operators to be used by
the index for that column. For example, a B-tree index on four-byte integers would use the int4_ops class; this operator
class includes comparison functions for four-byte integers. In practice the default operator class for the column's data
type is usually sufficient. The main point of having operator classes is that for some data types, there could be more
than one meaningful ordering. For example, we might want to sort a complex-number data type either by absolute value or
by real part. We could do this by defining two operator classes for the data type and then selecting the proper class
when making an index. More information about operator classes is in in the documentation and in in the documentation.
For index methods that support ordered scans (currently, only B-tree), the optional clauses ASC, DESC, NULLS FIRST,
and/or NULLS LAST can be specified to modify the sort ordering of the index. Since an ordered index can be scanned either
forward or backward, it is not normally useful to create a single-column DESC index -- that sort ordering is already
available with a regular index. The value of these options is that multicolumn indexes can be created that match the sort
ordering requested by a mixed-ordering query, such as SELECT ... ORDER BY x ASC, y DESC. The NULLS options are useful if
you need to support ``nulls sort low'' behavior, rather than the default ``nulls sort high'', in queries that depend on
indexes to avoid sorting steps.
For most index methods, the speed of creating an index is dependent on the setting of maintenance_work_mem. Larger val-
ues will reduce the time needed for index creation, so long as you don't make it larger than the amount of memory really
available, which would drive the machine into swapping. For hash indexes, the value of effective_cache_size is also rele-
vant to index creation time: PostgreSQL will use one of two different hash index creation methods depending on whether
the estimated index size is more or less than effective_cache_size. For best results, make sure that this parameter is
also set to something reflective of available memory, and be careful that the sum of maintenance_work_mem and effec-
tive_cache_size is less than the machine's RAM less whatever space is needed by other programs.
Use DROP INDEX [drop_index(7)] to remove an index.
Prior releases of PostgreSQL also had an R-tree index method. This method has been removed because it had no significant
advantages over the GiST method. If USING rtree is specified, CREATE INDEX will interpret it as USING gist, to simplify
conversion of old databases to GiST.
EXAMPLES
To create a B-tree index on the column title in the table films:
CREATE UNIQUE INDEX title_idx ON films (title);
To create an index on the expression lower(title), allowing efficient case-insensitive searches:
CREATE INDEX lower_title_idx ON films ((lower(title)));
To create an index with non-default sort ordering of nulls:
CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST);
To create an index with non-default fill factor:
CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);
To create a GIN index with fast updates disabled:
CREATE INDEX gin_idx ON documents_table USING gin (locations) WITH (fastupdate = off);
To create an index on the column code in the table films and have the index reside in the tablespace indexspace:
CREATE INDEX code_idx ON films(code) TABLESPACE indexspace;
To create a GiST index on a point attribute so that we can efficiently use box operators on the result of the conversion
function:
CREATE INDEX pointloc
ON points USING gist (box(location,location));
SELECT * FROM points
WHERE box(location,location) && '(0,0),(1,1)'::box;
To create an index without locking out writes to the table:
CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
COMPATIBILITY
CREATE INDEX is a PostgreSQL language extension. There are no provisions for indexes in the SQL standard.
SEE ALSO
ALTER INDEX [alter_index(7)], DROP INDEX [drop_index(7)]
SQL - Language Statements 2011-09-22 CREATE INDEX(7)

