An Overview of PL/SQL Collection Types

Collections are core components in Oracle PL/SQL programs. You can (temporarily) store data from the database or local variables in collections and pass these collections to subprograms. Collections are also critical to bulk operations, such as BULK COLLECT and FORALL, as well as table functions, both simple and pipelined. Bulk operations and table functions are critical to high-performance code.

Here, I provide an overview of their characteristics as an introduction to novice PL/SQL developers and as an one-stop reference.

Collections are data structures that are the PL/SQL-equivalent of traditional arrays. PL/SQL comes with three collection types:

  1. Associative arrays (a.k.a. PL/SQL tables and index-by tables).
  2. Nested tables.
  3. Variable-size (or varying) arrays (a.k.a. varrays).

These are all homogeneous and one-dimensional: all elements have the same data type and all collections are linear arrays. Multi-dimensional data structures can be simulated by nesting collections, that is a collection of a collection. You can reference such nested collections as my_collection( n )( m ), which in programming languages that support two-dimensional arrays would correspond to my_collection[m,n].

The collections are listed in order of appearance and thus utility. Variable-size arrays have a maximum number of elements they can hold, which in many practical situations means that they are too inflexible to be useful.

Below we have condensed the salient details for each of these three collection types. In naming the collection types we stick to the coding guidelines laid out previously. The transpose of this table first appeared on oracle.rtfd.io in the context of bulk optimizations to PL/SQL code, and it has been adapted and expanded since.

Property nested table associative array variable-size array
index Yes Yes No
dense Yes Yes Yes
persistent Yes No Yes
initialization Yes No Yes
ordered No Yes Yes
compare Yes No No*
EXTEND Yes No Yes
TRIM Yes No Yes
MULTISET Yes No No

The properties in the table have the following meaning:

  • Index: How the elements are indexed and accessed.
  • Sparse: Whether or not the collection may be sparse.
  • Dense: Whether or not the collection may be dense.
  • Persistent: Whether or not it is possible to store the collection at the database (schema) level.
  • Initialization: Whether or not the collection needs to be initialized with the default constructor before it can be used.
  • Compare: Whether or not collections can be compared with relational operators (e.g. = and IS [ NOT ] NULL).
  • EXTEND: Whether or not the EXTEND and/or EXTEND( n ) method is available.
  • TRIM: Whether or not the TRIM and/or TRIM( n ) method is available.
  • MULTISET: Whether or not MULTISET operators (e.g. INTERSECT) can be used on the collection.

These three collection types can be defined with the following syntax:

-- nested table
[ CREATE ] TYPE identifier_ntt { IS | AS } TABLE OF data_type;

-- associative array
TYPE identifier_aat IS TABLE OF data_type INDEX BY ix_data_type;

-- variable-size array
[ CREATE ] TYPE identifier_vat { IS | AS } VARRAY ( n ) OF data_type;

Explanation

The declaration is closely related to the row that determines whether the respective collection types can be stored at the schema rather than module level. A module in this context means one of the following:

  • Anonymous block.
  • Function.
  • Procedure.
  • Package.

If a collection type has ‘Yes’ in the row Persistent, then the SQL syntax CREATE [ OR REPLACE ] TYPE ... AS is allowed outside of modules. If not, you can only define the collection type (i.e. associative arrays) in modules, and they are local to these modules. Please note that it is possible to define custom collection types in the package specification, which means that they are available to all users who have the required privileges to EXECUTE said package.

Elements are typically indexed by a positive integer with the exception of associative arrays’: associative arrays can have negative and zero index values as well as strings of type VARCHAR2. The maximum length of the VARCHAR2 is the maximum size in PL/SQL, which is 32767 bytes, even when created at the schema level? As explained on oracle.rtfd.io, string indexes of more than about 100 characters can become a performance issue due to costly hashing, so it is best to keep the strings relatively short.

Nested tables can be sparse, although they are typically dense. In fact, they can only become sparse by deleting elements in the middle.

Both nested tables and variable-size arrays need to be initialized with the default constructor:

my_collection identifier_ntt := identifier_ntt();
my_collection identifier_vat := identifier_vat();

When you supply parameters, these values will be used as the initial values, otherwise the collection will be empty. Note that assigning NULL to the collection does not count as initialization. In fact, the collection is atomically null until you initialize it with the system-generated constructor function; you can use my_collection IS NULL to check whether the collection is atomically null.

Without the call to the default constructor, you will run into an ORA-06531: Reference to uninitialized collection error, and without the EXTEND method you are greeted by ORA-06533: Subscript beyond count. Associative arrays do not require initialization, as they are empty by default.

Once these collections are initialized, you still need to make room for new elements with the EXTEND method. Yes, even for variable-size arrays that have a maximum size defined when they are declared. A simple call to my_collection.EXTEND allows you to add one element, whereas my_collection.EXTEND( m ) can be used to allow m elements to be added. Note that for variable-size arrays mn.

Oracle offers the DELETE method to all three collections. With it you can remove all (my_collection.DELETE) or specific elements (my_collection.DELETE( ix_val ) to remove the element at index value ix_value), where the latter only applies to nested tables and associative arrays. Because variable-size arrays are dense, it only has the delete-all-elements method. Nested tables and variable-size arrays also have the TRIM method, which can be used to trim a certain number of elements from the back of the collection. TRIM without a parameter removes the last element; TRIM( m ) removes the last m elements.

The difference between both methods is that DELETE removes the element but leaves the placeholder intact, whereas TRIM also removes the placeholder. This difference becomes significant when using both removal methods on a collection: if TRIM is called after my_collection.DELETE( my_collection.LAST ), only one element is removed, not two. Oracle therefore recommends to either use TRIM or DELETE but not both on the same collection.

Oracle does not guarantee that that order in which elements of a nested table from the database are retrieved matches the order in which they were inserted. Nested tables most closely represent sets, which is why the MULTISET operators can be used on them. Variable-size arrays are always ordered. Since associative arrays are really key-value pairs, the keys are sorted by the index values of the associative array. Hence, the creation order is most likely not the same as the index order.

You can compare individual elements of collections but not necessarily entire collections. To be more specific, associative arrays cannot be compared with relational (in-)equality operators at all. Nested tables can be compared with both (in-)equality operators and the IS [ NOT ] NULL operator. However, variable-size arrays can only be compared to NULL, which is why we have added the asterisk in the row Compare for varrays. If you try, Oracle will throw the exception PLS-00306: wrong number or types of arguments in call to ....

When you compare two nested tables with, say, an equality operator, they have to be of the same collection type, the elements need to be the same, and the number of elements (even empty ones) need to match, for the equality operator to yield TRUE. Nested tables have another advantage when it comes to comparisons: they allow MULTISET operators.

All of these collection types can be defined in package specifications, even when used with pipelined table functions. Since pipelined table functions are called from SQL (e.g. SELECT * FROM TABLE( DBMS_XPLAN.DISPLAY )), the SQL engine needs to know of these collection types, even though they are technically PL/SQL constructs that exist only in the package specification. This is why Oracle automatically creates these at the schema level with system-generated names like SYS_PLSQL_12345_0. Since this can cause a maintenance headache down the line, it is generally recommend that you create the required object and collection types at the schema level even though it is technically not required. Note that when you do decide to go with system-generated types based on record and collection types in the package specification and you drop the pipelined table function that needs these, Oracle automatically removes them from the database.

Tables with Persistent Collection Types as Columns

Even though it is technically possible to store collections as columns, it may not be a smart idea because it breaks normalization. If, however, you are forced to use collections as tables or they lead to a significant performance benefit that cannot be ignored, there are a few restrictions that you have to be aware of:

  • Tables can have at most 1000 columns, which for most practical purposes is more than enough. When you define columns based on collection types, Oracle maps these to relational columns internally, which means that you can hit the 1000-column limit earlier than you think.
  • An index-organized table (IOT) that is a nested table or variable-size array cannot be partitioned.
  • The in-memory (IM) column store that is available as of 12c does not support collection types, among others. In case you enable a table to be stored in memory, unsupported columns are not populated in the IM column store.
  • Temporary table cannot contain nested-table columns.

So, how would you create such a table?

-- Setup.
CREATE TYPE colours_vat AS VARRAY(5) OF VARCHAR2(10);

CREATE TABLE people
(
  id NUMBER
, name VARCHAR2(50)
, ...
, fav_colours colours_vat
, CONSTRAINT people_pk PRIMARY KEY ( id )
);

-- Data.
INSERT INTO people
VALUES ( 1, 'Ian Hellström', ..., colours_vat('Black','Grey','Purple','Red') );

-- Cleanup.
DROP TABLE people;
DROP TYPE colours_vat;

Alternatively, you can use a nested table type as a column:

-- Setup.
CREATE TYPE colours_ntt AS TABLE OF VARCHAR2(10);

CREATE TABLE people
(
  id NUMBER
, name VARCHAR2(50)
, ...
, fav_colours colours_ntt
, CONSTRAINT people_pk PRIMARY KEY ( id )
)
NESTED TABLE fav_colours STORE AS fav_colours_st;

-- Data.
INSERT INTO people
VALUES ( 1, 'Ian Hellström', ..., colours_ntt('Black','Grey','Purple','Red') );

UPDATE people
SET name = 'Hellström, Ian' WHERE id = 1;

UPDATE people
SET fav_colours = colours_ntt('Black','White') WHERE id = 1;

-- Cleanup.
DROP TABLE people;
DROP TYPE colours_ntt;

No, I’m not a very colourful person…

Anyway, because variable-size arrays have a maximum number of elements they can hold, they are stored with the table itself. Because nested tables can potentially grow indefinitely, Oracle stores the data separately. Variable-size arrays are best for small, order-preserved data sets; large, unordered data sets, the size of which is initially unknown, are best kept in nested tables.

You cannot directly access elements by means of something along the lines of SELECT fav_colours(2) FROM people. You have to dump the entire collection into an object of the same type, and then access the element you’re after:

DECLARE
  l_colours colours_ntt := colours_ntt();
BEGIN
  SELECT fav_colours
  INTO   l_colours
  FROM   people;

  DBMS_OUTPUT.PUT_LINE( 'My second favourite colour is: ' || l_colours(2) );
END;

An alternative is to use a table expression, which takes a subquery that returns exactly one record of a collection. In our example this could be achieved as follows:

SELECT
  p.id
, p.name
, c.*
FROM
  people p
CROSS JOIN
  TABLE
  (
    SELECT fav_colours AS colour
    FROM   people
    WHERE  id = 1
  ) c;

Such a table expression has nothing to do with table functions though.

Conclusion

Collections are everywhere in PL/SQL and even SQL. So, be sure you understand collections and employ them wisely. The table shown before is there to help you in identifying the right collection for the right job.

If you’re ready to find out how and why collections can be used to optimize your code, head on over to Read The Docs, where bulk operations and pipelined table functions stand at the centre of attention. If you’re not yet familiar with my goings-on at RTD, you can catch up here.