API¶
The API is split into two modules: the multicorn module and the utils module:
- The multicorn module contains the whole API needed for implementing a Foreign Data Wrapper.
- The utils module contains logging and error reporting functions, which are ultimately implemented as calls to the PostgreSQL API.
Implementing an FDW¶
Implementing an FDW is as simple as implementing the ForeignDataWrapper class.
Required API¶
This subset of the API allows your ForeignDataWrapper to be used for read-only queries.
You have to implement the following methods:
Note
In the documentation, FDWs implementing this API will be marked with:
Supports: ✓✗✗✗Write API¶
To implement full write capabilites, the following property must be implemented:
- ForeignDataWrapper.rowid_column[source]¶
Returns: A column name which will act as a rowid column, for delete/update operations. One can think of it as a primary key.
This can be either an existing column name, or a made-up one. This column name should be subsequently present in every returned resultset.
In addition to that, you should implement each DML operation as you see fit:
Note
In the documentation, FDWs implementing this API will be marked with:
Supports: ✗✓✗✗Transactional API¶
Transactional Capabilities can be implemented with the following methods:
- ForeignDataWrapper.pre_commit()[source]¶
Hook called just before a commit is issued, on PostgreSQL >=9.3. This is where the transaction should tentatively commited.
Note
In the documentation, FDWs implementing this API will be marked with:
Supports: ✗✗✓✗Full API¶
- class multicorn.ForeignDataWrapper(fdw_options, fdw_columns)[source]¶
Base class for all foreign data wrapper instances.
Though not required, ForeignDataWrapper implementation should inherit from this class.
- __init__(fdw_options, fdw_columns)[source]¶
The foreign data wrapper is initialized on the first query.
Parameters: - fdw_options (dict) – The foreign data wrapper options. It is a dictionary mapping keys from the sql “CREATE FOREIGN TABLE” statement options. It is left to the implementor to decide what should be put in those options, and what to do with them.
- fdw_columns (dict) – The foreign datawrapper columns. It is a dictionary mapping the column names to their ColumnDefinition.
- begin(serializable)[source]
Hook called at the beginning of a transaction.
- commit()[source]¶
Hook called at commit time. On PostgreSQL >= 9.3, the pre_commit hook should be preferred.
- delete(oldvalues)[source]¶
Delete a tuple identified by oldvalues
Parameters: oldvalues (dict) – a dictionary mapping from column names to previously known values for the tuple. Returns: None
- execute(quals, columns)[source]¶
Execute a query in the foreign data wrapper.
This method is called at the first iteration. This is where the actual remote query execution takes place. Multicorn makes no assumption about the particular behavior of a ForeignDataWrapper, and will NOT remove any qualifiers from the PostgreSQL quals list. That means the quals will be rechecked anyway.
Typically, an implementation would:
- initialize (or reuse) some sort of connection to the remote system
- transform the quals and columns arguments to a representation suitable for the remote system
- fetch the data according to this query
- return it to the C-extension.
Although any iterable can be returned, it is strongly advised to implement this method as a generator to prevent loading the whole dataset in memory.
Parameters: - quals (list) – A list of Qual instances, containing the basic where clauses in the query.
- columns (list) – A list of columns that postgresql is going to need. You should return AT LEAST those columns when returning a dict. If returning a sequence, every column from the table should be in the sequence.
Returns: An iterable of python objects which can be converted back to PostgreSQL. Currently, such objects are: - sequences containing exactly as much columns as the underlying tables - dictionaries mapping column names to their values.
- get_path_keys()[source]¶
Method called from the planner to add additional Path to the planner. By default, the planner generates an (unparameterized) path, which can be reasoned about like a SequentialScan, optionally filtered.
This method allows the implementor to declare other Paths, corresponding to faster access methods for specific attributes. Such a parameterized path can be reasoned about like an IndexScan.
For example, with the following query:
select * from foreign_table inner join local_table using(id);
where foreign_table is a foreign table containing 100000 rows, and local_table is a regular table containing 100 rows.
The previous query would probably be transformed to a plan similar to this one:
┌────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────┤ │ Hash Join (cost=57.67..4021812.67 rows=615000 width=68) │ │ Hash Cond: (foreign_table.id = local_table.id) │ │ -> Foreign Scan on foreign_table (cost=20.00..4000000.00 rows=100000 width=40) │ │ -> Hash (cost=22.30..22.30 rows=1230 width=36) │ │ -> Seq Scan on local_table (cost=0.00..22.30 rows=1230 width=36) │ └────────────────────────────────────────────────────────────────────────────────────┘
But with a parameterized path declared on the id key, with the knowledge that this key is unique on the foreign side, the following plan might get chosen:
┌───────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├───────────────────────────────────────────────────────────────────────┤ │ Nested Loop (cost=20.00..49234.60 rows=615000 width=68) │ │ -> Seq Scan on local_table (cost=0.00..22.30 rows=1230 width=36) │ │ -> Foreign Scan on remote_table (cost=20.00..40.00 rows=1 width=40)│ │ Filter: (id = local_table.id) │ └───────────────────────────────────────────────────────────────────────┘
Returns: (key_columns, expected_rows), where key_columns is a tuple containing the columns on which the path can be used, and expected_rows is the number of rows this path might return for a simple lookup. For example, the return value corresponding to the previous scenario would be: [(('id',), 1)]
Return type: A list of tuples of the form
- get_rel_size(quals, columns)[source]¶
Method called from the planner to estimate the resulting relation size for a scan.
It will help the planner in deciding between different types of plans, according to their costs.
Parameters: Returns: A tuple of the form (expected_number_of_rows, avg_row_width (in bytes))
- classmethod import_schema(schema, srv_options, options, restriction_type, restricts)[source]¶
Hook called on an IMPORT FOREIGN SCHEMA command.
Parameters: - schema (str) – the foreign schema to import
- srv_options (dict) – options defined at the server level
- options (dict) – options defined at the IMPORT FOREIGN SCHEMA statement level
- restriction_type (str) – One of ‘limit’, ‘except’ or None
- restricts (list) – a list of tables as passed to the LIMIT TO or EXCEPT clause
Returns: a list of multicorn.TableDefinition
Return type: list
- insert(values)[source]¶
Insert a tuple defined by ‘’values’’ in the foreign table.
Parameters: values (dict) – a dictionary mapping column names to column values Returns: A dictionary containing the new values. These values can differ from the values argument if any one of them was changed or inserted by the foreign side. For example, if a key is auto generated.
- pre_commit()[source]
Hook called just before a commit is issued, on PostgreSQL >=9.3. This is where the transaction should tentatively commited.
- rollback()[source]
Hook called when the transaction is rollbacked.
- rowid_column[source]
Returns: A column name which will act as a rowid column, for delete/update operations. One can think of it as a primary key.
This can be either an existing column name, or a made-up one. This column name should be subsequently present in every returned resultset.
- sub_begin(level)[source]
Hook called at the beginning of a subtransaction.
- sub_commit(level)[source]
Hook called when a subtransaction is committed.
- sub_rollback(level)[source]
Hook called when a subtransaction is rollbacked.
- class multicorn.Qual(field_name, operator, value)[source]¶
A Qual describes a postgresql qualifier.
A qualifier is here defined as an expression of the type:
col_name operator value
For example:
mycolumn > 3 mycolumn = ANY(1,2,3) mycolumn ~~ ALL('A%','AB%', '%C')
- field_name¶
str
The name of the column as defined in the postgresql table.
- operator¶
str or tuple
The name of the operator if a string. Example: =, <=, ~~ (for a like clause)
If it is a tuple, then the tuple is of the form (operator name, ANY or ALL).
The tuple represents a comparison of the form WHERE field = ANY(1, 2, 3), which is the internal representation of WHERE field IN (1, 2, 3)
- value¶
object
The constant value on the right side
- class multicorn.ColumnDefinition(column_name, type_oid=0, typmod=0, type_name='', base_type_name='', options=None)[source]¶
Definition of Foreign Table Column.
- column_name¶
str
the name of the column
- type_oid¶
int
the internal OID of the PostgreSQL type
- typmod¶
int
the type modifier (ex: VARCHAR(12))
- type_name¶
str
the formatted type name, with the modifier (ex: VARCHAR(12))
- base_type_name¶
str
the base type name, withou modifier (ex: VARCHAR)
- options¶
dict
a mapping of option names to option values, as strings.