API

The API is split into two modules: the multicorn module and the utils module:

  • The multicorn module contains the whole API needed for implementing a Foreign Data Wrapper.
  • The utils module contains logging and error reporting functions, which are ultimately implemented as calls to the PostgreSQL API.

Implementing an FDW

Implementing an FDW is as simple as implementing the ForeignDataWrapper class.

Required API

This subset of the API allows your ForeignDataWrapper to be used for read-only queries.

You have to implement the following methods:

Note

In the documentation, FDWs implementing this API will be marked with:

Supports:

Write API

To implement full write capabilites, the following property must be implemented:

ForeignDataWrapper.rowid_column

Returns – A column name which will act as a rowid column, for delete/update operations.

One can think of it as a primary key.

This can be either an existing column name, or a made-up one. This column name should be subsequently present in every returned resultset.

In addition to that, you should implement each DML operation as you see fit:

Note

In the documentation, FDWs implementing this API will be marked with:

Supports:

Transactional API

Transactional Capabilities can be implemented with the following methods:

ForeignDataWrapper.begin(serializable)[source]

Hook called at the beginning of a transaction.

ForeignDataWrapper.pre_commit()[source]

Hook called just before a commit is issued, on PostgreSQL >=9.3. This is where the transaction should tentatively commited.

ForeignDataWrapper.rollback()[source]

Hook called when the transaction is rollbacked.

ForeignDataWrapper.sub_begin(level)[source]

Hook called at the beginning of a subtransaction.

ForeignDataWrapper.sub_commit(level)[source]

Hook called when a subtransaction is committed.

ForeignDataWrapper.sub_rollback(level)[source]

Hook called when a subtransaction is rollbacked.

Note

In the documentation, FDWs implementing this API will be marked with:

Supports:

Full API

class multicorn.ForeignDataWrapper(fdw_options, fdw_columns)[source]

Base class for all foreign data wrapper instances.

Though not required, ForeignDataWrapper implementation should inherit from this class.

__init__(fdw_options, fdw_columns)[source]

The foreign data wrapper is initialized on the first query.

Parameters:
  • fdw_options (dict) – The foreign data wrapper options. It is a dictionary mapping keys from the sql “CREATE FOREIGN TABLE” statement options. It is left to the implementor to decide what should be put in those options, and what to do with them.
  • fdw_columns (dict) – The foreign datawrapper columns. It is a dictionary mapping the column names to their ColumnDefinition.
begin(serializable)[source]

Hook called at the beginning of a transaction.

can_sort(sortkeys)[source]

Method called from the planner to ask the FDW what are the sorts it can enforced, to avoid PostgreSQL to sort the data after retreiving all the rows. These sorts can come from explicit ORDER BY clauses, but also GROUP BY and DISTINCT clauses.

The FDW has to inspect every sort, and respond which one are handled. The sorts are cumulatives. For example:

col1 ASC
col2 DESC

means that the FDW must render the tuples sorted by col1 ascending and col2 descending.

Parameters:sortkeys (list) – A list of SortKey representing all the sorts the query must enforce.
Returns:The list of cumulative SortKey, for which the FDW can enforce the sort.
commit()[source]

Hook called at commit time. On PostgreSQL >= 9.3, the pre_commit hook should be preferred.

delete(oldvalues)[source]

Delete a tuple identified by oldvalues

Parameters:oldvalues (dict) – a dictionary mapping from column names to previously known values for the tuple.
Returns:None
end_modify()[source]

Hook called at the end of a foreign modify (DML operations)

end_scan()[source]

Hook called at the end of a foreign scan.

execute(quals, columns, sortkeys=None)[source]

Execute a query in the foreign data wrapper.

This method is called at the first iteration. This is where the actual remote query execution takes place. Multicorn makes no assumption about the particular behavior of a ForeignDataWrapper, and will NOT remove any qualifiers from the PostgreSQL quals list. That means the quals will be rechecked anyway.

Typically, an implementation would:

  • initialize (or reuse) some sort of connection to the remote system
  • transform the quals and columns arguments to a representation suitable for the remote system
  • fetch the data according to this query
  • return it to the C-extension.

Although any iterable can be returned, it is strongly advised to implement this method as a generator to prevent loading the whole dataset in memory.

Parameters:
  • quals (list) – A list of Qual instances, containing the basic where clauses in the query.
  • columns (list) – A list of columns that postgresql is going to need. You should return AT LEAST those columns when returning a dict. If returning a sequence, every column from the table should be in the sequence.
  • sortkeys (list) – A list of SortKey that the FDW said it can enforce.
Returns:

An iterable of python objects which can be converted back to PostgreSQL. Currently, such objects are: - sequences containing exactly as much columns as the underlying tables - dictionaries mapping column names to their values. If the sortkeys wasn’t empty, the FDW has to return the data in the expected order.

explain(quals, columns, sortkeys=None, verbose=False)[source]

Hook called on explain.

The arguments are the same as the execute(), with the addition of a “verbose” keyword arg for when the EXPLAIN is called with the VERBOSE option. :returns: An iterable of strings to display in the EXPLAIN output.

get_path_keys()[source]

Method called from the planner to add additional Path to the planner. By default, the planner generates an (unparameterized) path, which can be reasoned about like a SequentialScan, optionally filtered.

This method allows the implementor to declare other Paths, corresponding to faster access methods for specific attributes. Such a parameterized path can be reasoned about like an IndexScan.

For example, with the following query:

select * from foreign_table inner join local_table using(id);

where foreign_table is a foreign table containing 100000 rows, and local_table is a regular table containing 100 rows.

The previous query would probably be transformed to a plan similar to this one:

┌────────────────────────────────────────────────────────────────────────────────────┐
│                                     QUERY PLAN                                     │
├────────────────────────────────────────────────────────────────────────────────────┤
│ Hash Join  (cost=57.67..4021812.67 rows=615000 width=68)                           │
│   Hash Cond: (foreign_table.id = local_table.id)                                   │
│   ->  Foreign Scan on foreign_table (cost=20.00..4000000.00 rows=100000 width=40)  │
│   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)                                 │
│         ->  Seq Scan on local_table (cost=0.00..22.30 rows=1230 width=36)          │
└────────────────────────────────────────────────────────────────────────────────────┘

But with a parameterized path declared on the id key, with the knowledge that this key is unique on the foreign side, the following plan might get chosen:

┌───────────────────────────────────────────────────────────────────────┐
│                              QUERY PLAN                               │
├───────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=20.00..49234.60 rows=615000 width=68)              │
│   ->  Seq Scan on local_table (cost=0.00..22.30 rows=1230 width=36)   │
│   ->  Foreign Scan on remote_table (cost=20.00..40.00 rows=1 width=40)│
│         Filter: (id = local_table.id)                                 │
└───────────────────────────────────────────────────────────────────────┘
Returns:(key_columns, expected_rows), where key_columns is a tuple containing the columns on which the path can be used, and expected_rows is the number of rows this path might return for a simple lookup. For example, the return value corresponding to the previous scenario would be:
[(('id',), 1)]
Return type:A list of tuples of the form
get_rel_size(quals, columns)[source]

Method called from the planner to estimate the resulting relation size for a scan.

It will help the planner in deciding between different types of plans, according to their costs.

Parameters:
  • quals (list) – A list of Qual instances describing the filters applied to this scan.
  • columns (list) – The list of columns that must be returned.
Returns:

A tuple of the form (expected_number_of_rows, avg_row_width (in bytes))

classmethod import_schema(schema, srv_options, options, restriction_type, restricts)[source]

Hook called on an IMPORT FOREIGN SCHEMA command.

Parameters:
  • schema (str) – the foreign schema to import
  • srv_options (dict) – options defined at the server level
  • options (dict) – options defined at the IMPORT FOREIGN SCHEMA statement level
  • restriction_type (str) – One of ‘limit’, ‘except’ or None
  • restricts (list) – a list of tables as passed to the LIMIT TO or EXCEPT clause
Returns:

a list of multicorn.TableDefinition

Return type:

list

insert(values)[source]

Insert a tuple defined by ‘’values’’ in the foreign table.

Parameters:values (dict) – a dictionary mapping column names to column values
Returns:A dictionary containing the new values. These values can differ from the values argument if any one of them was changed or inserted by the foreign side. For example, if a key is auto generated.
pre_commit()[source]

Hook called just before a commit is issued, on PostgreSQL >=9.3. This is where the transaction should tentatively commited.

rollback()[source]

Hook called when the transaction is rollbacked.

rowid_column

Returns – A column name which will act as a rowid column, for delete/update operations.

One can think of it as a primary key.

This can be either an existing column name, or a made-up one. This column name should be subsequently present in every returned resultset.

sub_begin(level)[source]

Hook called at the beginning of a subtransaction.

sub_commit(level)[source]

Hook called when a subtransaction is committed.

sub_rollback(level)[source]

Hook called when a subtransaction is rollbacked.

update(oldvalues, newvalues)[source]

Update a tuple containing ‘’oldvalues’’ to the ‘’newvalues’‘.

Parameters:
  • oldvalues (dict) – a dictionary mapping from column names to previously known values for the tuple.
  • newvalues (dict) – a dictionary mapping from column names to new values for the tuple.
Returns:

A dictionary containing the new values. See :method:insert for information about this return value.

class multicorn.SortKey(attname, attnum, is_reversed, nulls_first, collate)

A SortKey describes the sort of one column an SQL query requested.

A query can request the sort of zero, one or multiple columns. Therefore, a list of SortKey is provided to the ForeignDataWrapper, containing zero, one or more SortKey.

attname

str – The name of the column to sort as defined in the postgresql table.

attnum

int – The position of the column to sort as defined in the postgresql table.

is_reversed

bool – True is the query requested a DESC order.

nulls_first

bool – If True, NULL values must appears at the beginning. Otherwise, they must appear at the end.

collate

str – The collation name to use to sort the data, as appearing in the postgresql cluster.

class multicorn.Qual(field_name, operator, value)[source]

A Qual describes a postgresql qualifier.

A qualifier is here defined as an expression of the type:

col_name operator value

For example:

mycolumn > 3
mycolumn = ANY(1,2,3)
mycolumn ~~ ALL('A%','AB%', '%C')
field_name

str – The name of the column as defined in the postgresql table.

operator

str or tuple – The name of the operator if a string. Example: =, <=, ~~ (for a like clause)

If it is a tuple, then the tuple is of the form (operator name, ANY or ALL).

The tuple represents a comparison of the form WHERE field = ANY(1, 2, 3), which is the internal representation of WHERE field IN (1, 2, 3)

value

object – The constant value on the right side

is_list_operator

Returns – True if this qual represents an array expr, False otherwise

list_any_or_all

Returns

ANY if and only if:
  • this qual is a list operator
  • the operator applies as an ‘ANY’ clause (eg, = ANY(1,2,3))
ALL if and only if:
  • this is a list operator
  • the operator applies as an ‘ALL’ clause (eg, > ALL(1, 2, 3))

None if this is not a list operator.

class multicorn.ColumnDefinition(column_name, type_oid=0, typmod=0, type_name='', base_type_name='', options=None)[source]

Definition of Foreign Table Column.

column_name

str – the name of the column

type_oid

int – the internal OID of the PostgreSQL type

typmod

int – the type modifier (ex: VARCHAR(12))

type_name

str – the formatted type name, with the modifier (ex: VARCHAR(12))

base_type_name

str – the base type name, withou modifier (ex: VARCHAR)

options

dict – a mapping of option names to option values, as strings.

class multicorn.TableDefinition(table_name, schema=None, columns=None, options=None)[source]

Definition of a Foreign Table.

table_name

str – the name of the table

columns

str – a list of ColumnDefinition objects

options

dict – a dictionary containing the table-level options.

to_statement(schema_name, server_name)[source]

Generates the CREATE FOREIGN TABLE statement associated with this definition.