API Reference

This page documents the classes and methods of the polyanalyst6api package.

class polyanalyst6api.api.API(url=None, username=None, password=None, ldap_server=None, version='1.0')[source]

PolyAnalyst API

Parameters:
  • url (Optional[str, None]) – (optional) The scheme, host and port (if any) of a PolyAnalyst server (e.g. https://localhost:5043/, http://example.polyanalyst.com)

  • username (Optional[str, None]) – (optional) The username to log in with

  • password (Optional[str, None]) – (optional) The password for specified username

  • ldap_server (Optional[str, None]) – (optional) LDAP Server address

  • version (str) – (optional) Choose which PolyAnalyst API version to use. Default: 1.0

If ldap_server is provided, then login will be performed via LDAP Server.

Usage:

>>> with API(POLYANALYST_URL, YOUR_USERNAME, YOUR_PASSWORD) as api:
...     print(api.get_server_info())

or, if you’re using a configuration file (New in version 0.23.0):

>>> with API() as api:
...     print(api.get_server_info())
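As an alternative to hard-coding credentials, the constructor arguments can be assembled from environment variables. The following is a minimal sketch: the helper api_kwargs_from_env and the PA_* variable names are hypothetical and not part of polyanalyst6api; only the keyword names url, username, password and ldap_server come from the signature above.

```python
import os
from typing import Dict, Mapping, Optional


def api_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect API() keyword arguments from environment variables.

    The PA_* variable names are assumptions made for this example.
    Unset or empty variables are skipped so that API() keeps its defaults.
    """
    env = os.environ if env is None else env
    mapping = {
        "url": "PA_URL",
        "username": "PA_USERNAME",
        "password": "PA_PASSWORD",
        "ldap_server": "PA_LDAP_SERVER",
    }
    return {kwarg: env[var] for kwarg, var in mapping.items() if env.get(var)}


# Intended use (requires polyanalyst6api and a reachable server):
#     from polyanalyst6api.api import API
#     with API(**api_kwargs_from_env()) as api:
#         print(api.get_server_info())
```

Because ldap_server is only included when its variable is set, the same helper covers both regular and LDAP logins.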

get(endpoint, **kwargs)[source]

Shortcut for GET requests via request

Parameters:
  • endpoint (str) – PolyAnalyst API endpoint

  • kwargs – requests.request() keyword arguments

Return type:

Any

get_parameters()[source]

Returns a list of nodes with parameters supported by the Parameters node.

Deprecated since version 0.18.0: Use Parameters.get() instead.

Return type:

List[Dict[str, Union[str, List]]]

get_project_import_status(import_id)[source]

Get the status of project import

Parameters:

import_id (str) – the import identifier

New in version 0.24.0.

Return type:

Dict

get_server_info()[source]

Returns general server information including build number, version and commit hashes.

Return type:

Optional[Dict[str, Union[int, str, Dict[str, str]]], None]

get_versions()[source]

Returns the API versions supported by the PolyAnalyst server.

Return type:

List[str]

import_project(file_path, project_space='', on_conflict='Cancel', wait=False)[source]

Import project from file on server file system.

Parameters:
  • file_path (str) – absolute path to the file on server file system

  • project_space (str) – the name of the folder in the project manager where you want to import the project. The default folder is Root.

  • on_conflict (str) – the strategy to resolve an import conflict. Allowed options are: Cancel, Overwrite, ChangeExistingId, ChangeImportingId. By default, the import is cancelled if the project already exists.

  • wait (bool) – wait for project import to finish. False by default.

Return type:

Union[str, Dict]

Returns:

import identifier if wait is False and import status otherwise

New in version 0.24.0.
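When import_project() is called with wait=False, the returned import identifier can be polled with get_project_import_status(). The sketch below shows one way to do that with a timeout. The helper wait_for_import is hypothetical, and because the layout of the status dict is not documented on this page, the caller supplies the is_finished predicate.

```python
import time
from typing import Any, Callable, Dict


def wait_for_import(
    api: Any,
    import_id: str,
    is_finished: Callable[[Dict], bool],
    poll_interval: float = 2.0,
    timeout: float = 600.0,
) -> Dict:
    """Poll api.get_project_import_status() until is_finished(status) is true.

    api is expected to be a polyanalyst6api API instance (or any object
    with a compatible get_project_import_status method).
    """
    deadline = time.monotonic() + timeout
    while True:
        status = api.get_project_import_status(import_id)
        if is_finished(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"import {import_id} did not finish in {timeout} s")
        time.sleep(poll_interval)


# Intended use (file path and predicate are placeholders):
#     import_id = api.import_project('/path/on/server/to/project/file')
#     status = wait_for_import(api, import_id, is_finished=my_predicate)
```

Passing wait=True to import_project() is simpler when no custom timeout or progress reporting is needed.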

login()[source]

Logs in to PolyAnalyst Server with user credentials.

Return type:

None

logout()[source]

Logs out current user from PolyAnalyst server.

Return type:

None

post(endpoint, **kwargs)[source]

Shortcut for POST requests via request

Parameters:
  • endpoint (str) – PolyAnalyst API endpoint

  • kwargs – requests.request() keyword arguments

Return type:

Any

project(uuid)[source]

Returns Project instance with given uuid.

Parameters:

uuid (str) – The project uuid

Return type:

Project

request(url, method, **kwargs)[source]

Sends a request with the given method to url and returns a tuple of the requests.Response and the JSON-decoded content of the response.

Parameters:
  • url (str) – url or PolyAnalyst API endpoint

  • method (str) – request method (e.g. GET, POST)

  • kwargs – requests.request() keyword arguments

Return type:

Tuple[Response, Any]

run_task(id)[source]

Initiates scheduler task execution.

Parameters:

id (int) – the task ID

Return type:

None

class polyanalyst6api.drive.Drive(api)[source]
create_folder(name, path='')[source]

Create a new folder inside the PolyAnalyst’s user directory.

Parameters:
  • name (str) – the folder name

  • path (str) – a relative path of the folder’s parent directory

Return type:

None

delete_file(name, path='')[source]

Delete the file in the PolyAnalyst’s user directory.

Parameters:
  • name (str) – the filename

  • path (str) – a relative path of the file’s parent directory

Return type:

None

delete_folder(name, path='')[source]

Delete the folder in the PolyAnalyst’s user directory.

Parameters:
  • name (str) – the folder name

  • path (str) – a relative path of the folder’s parent directory

Return type:

None

download_file(name, path='', dest=None)[source]

Download the binary content of the file to memory or stream to local file.

Parameters:
  • name (str) – the filename

  • path (str) – a relative path of the file’s parent directory

  • dest (Optional[IO, None]) – the file or file-like object to write drive’s file content

Usage:

>>> file_content = api.drive.download_file(name='cars.csv', path='/data')

The call above downloads the whole file body into memory. If you plan to download a large file (more than 100 megabytes), consider the streaming download instead:

>>> with open('local_file_that_doesnt_exist_yet.csv', 'wb+') as file:
...     api.drive.download_file(name='cars.csv', path='/data', dest=file)

Return type:

bytes

upload(source, dest='', recursive=True)[source]

Upload file or folder to PolyAnalyst server.

Pass recursive as False to just create folder on the server without uploading inner files and folders.

Parameters:
  • source (Union[str, PathLike]) – path to the local file or folder

  • dest (str) – (optional) path to the folder in the PolyAnalyst’s user directory

  • recursive (bool) – (optional) upload subdirectories recursively

Raises:

TypeError if source is not string or path-like object. ValueError if source does not exist

Return type:

None
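The two documented exceptions call for different handling: a ValueError means the source path is missing, which a batch job may want to skip, while a TypeError points to a programming mistake. A minimal sketch, assuming a hypothetical helper try_upload that wraps Drive.upload():

```python
from pathlib import Path
from typing import Any, Union


def try_upload(drive: Any, source: Union[str, Path], dest: str = "") -> bool:
    """Upload source via drive.upload() and report whether it succeeded.

    Returns False when source does not exist (ValueError). A TypeError is
    deliberately not caught, since it indicates a bug in the calling code.
    """
    try:
        drive.upload(source, dest=dest)
    except ValueError as exc:
        print(f"skipped {source}: {exc}")
        return False
    return True


# Intended use (paths are placeholders):
#     for path in ['reports/2023', 'reports/2024']:
#         try_upload(api.drive, path, dest='/archive')
```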

upload_file(file, name=None, path='')[source]

Upload the file to the PolyAnalyst’s user directory.

Warning

Make sure to create a new file or file-like object for every Drive.upload_file() call!

Note

Always prefer Drive.upload() over this method.

Parameters:
  • file (IO) – the file or file-like object to upload

  • name (Optional[str, None]) – the filename other than file’s name

  • path (str) – (optional) a relative path of the file’s parent directory

Usage:

>>> drive = Drive(...)
>>> with open('CarData.csv', mode='rb') as file:
...     drive.upload_file(file, name='cars.csv', path='/data')

Return type:

None

class polyanalyst6api.project.Project(api, uuid)[source]

This class maintains all operations with the PolyAnalyst’s project and nodes.

Parameters:
  • api – An instance of API class

  • uuid (str) – The uuid of the project you want to interact with

abort()[source]

Aborts the execution of all nodes in the project.

Return type:

None

dataset(node)[source]

Get dataset wrapper object.

Parameters:

node (Union[str, Dict[str, str]]) – node name or dict with name and type of the node

New in version 0.16.0.

delete(force_unload=False)[source]

Delete the project from server.

Parameters:

force_unload (bool) – Delete the project regardless of other users

By default, the project is deleted only if it is not loaded into memory. To delete a project that is loaded into memory (that is, users are working on it right now), set force_unload to True. This operation is available only to the project owner and administrators, and cannot be undone.

Return type:

None

execute(*args, wait=False)[source]

Initiates execution of nodes and returns execution wave identifier.

Parameters:
  • args (Union[str, Dict[str, str]]) – node names and/or dicts with name and type of nodes

  • wait (bool) – wait for nodes execution to complete

Usage:

>>> wave_id = prj.execute('Internet Source', 'Python')

Use wait=True to wait for the passed nodes to finish executing.

>>> prj.execute('Export to MS Word', wait=True)

or, if there are several nodes in the project with the same name, pass them as dicts with name and type keys (and because of this, you can also pass items of Project.get_node_list())

>>> prj.execute(
...     {'name': 'Example node', 'type': 'DataSource'},
...     {'name': 'Example node', 'type': 'Dataset'},
...     'Federated Search',
...     prj.get_node_list()[1],
... )

or, if you want to execute all nodes, call this method with no args:

>>> prj.execute()

Return type:

Optional[int, None]
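When a timeout or custom polling is needed, the wave identifier returned by execute() can be combined with Project.is_running(). The sketch below is one possible approach and not the library's own waiting logic; the helper name execute_and_poll and its poll_interval and timeout parameters are hypothetical.

```python
import time
from typing import Any, Dict, Union

Node = Union[str, Dict[str, str]]


def execute_and_poll(
    prj: Any,
    *nodes: Node,
    poll_interval: float = 1.0,
    timeout: float = 3600.0,
) -> int:
    """Start the nodes with prj.execute() and poll prj.is_running(wave_id).

    Returns the wave identifier once the wave is no longer running.
    """
    wave_id = prj.execute(*nodes)
    if wave_id is None:
        raise RuntimeError("execute() did not return a wave identifier")
    deadline = time.monotonic() + timeout
    while prj.is_running(wave_id):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"wave {wave_id} still running after {timeout} s")
        time.sleep(poll_interval)
    return wave_id


# Intended use:
#     wave_id = execute_and_poll(prj, 'Internet Source', 'Python', timeout=600)
```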

get_execution_statistics()[source]

Returns the execution statistics for nodes in the project.

Similar to Project.get_nodes(), but the nodes contain extra information and the project statistics are returned as well.

Deprecated since version 0.15.0: Use Project.get_execution_stats() instead.

Return type:

Tuple[Dict[str, Dict[str, Union[str, int]]], Dict[str, int]]

get_execution_stats(skip_hidden=None)[source]

Returns nodes execution statistics.

Parameters:

skip_hidden (Optional[bool, None]) – Return statistics only of nodes in the project (i.e. exclude publication and compound nodes).

New in version 0.15.0.

Changed in version 0.25.0: Added skip_hidden optional parameter.

Return type:

List[Dict[str, Union[str, int]]]

get_node_list()[source]

Returns a list of project nodes.

New in version 0.15.0.

Return type:

List[Dict[str, Union[str, int]]]
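Since items of this list can be passed to Project.execute(), a small filter helps when several nodes share a name. The sketch below assumes each item carries name and type keys, as the execute() description suggests; the helper find_nodes is hypothetical.

```python
from typing import Dict, List, Optional, Union

NodeInfo = Dict[str, Union[str, int]]


def find_nodes(
    node_list: List[NodeInfo],
    name: str,
    node_type: Optional[str] = None,
) -> List[NodeInfo]:
    """Return the nodes called name, optionally restricted to node_type.

    The 'name' and 'type' keys are assumed from the execute() documentation.
    """
    return [
        node
        for node in node_list
        if node.get("name") == name
        and (node_type is None or node.get("type") == node_type)
    ]


# Intended use:
#     matches = find_nodes(prj.get_node_list(), 'Example node', 'Dataset')
#     prj.execute(*matches)
```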

get_nodes()[source]

Returns a dictionary of project’s nodes information.

Deprecated since version 0.15.0: Use Project.get_node_list() instead.

Return type:

Dict[str, Dict[str, Union[str, int]]]

get_tasks()[source]

Returns task list info.

Return type:

List[Dict[str, Any]]

is_running(wave_id)[source]

Checks whether the execution wave is still running in the project.

If wave_id is -1, the project is checked for any active execution, saving, or publishing operation.

Parameters:

wave_id (int) – Execution wave identifier

Return type:

bool

parameters(name)[source]

Get parameters wrapper object.

Parameters:

name (str) – Parameters node name

New in version 0.18.0.

preview(node)[source]

Returns the first 1000 rows of data from the node; texts and strings are cut off after 250 characters.

Parameters:

node (Union[str, Dict[str, str]]) – node name or dict with name and type of node

Deprecated since version 0.16.0: Use Dataset.preview() instead.

Return type:

List[Dict[str, Any]]

repair()[source]

Initiate the project repairing operation.

Return type:

None

save()[source]

Initiates saving of all changes that have been made in the project.

Return type:

None

set_parameters(node, node_type, parameters, declare_unsync=True, hard_update=True)[source]

Set parameters of the selected Parameters node in the project.

Parameters:
  • node (str) – name of Parameters node

  • node_type (str) – type of the node whose parameters need to be set. The types are listed in NodeTypes.

  • parameters (Dict[str, Any]) – default parameters of the node to be set.

  • declare_unsync (bool) – reset the status of the Parameters node.

  • hard_update (bool) – update every child node with new parameters if True, otherwise reset their statuses. Works only if declare_unsync is True.

Deprecated since version 0.18.0: Use Parameters.set() instead.

Return type:

None

unload()[source]

Unload the project from memory.

Since version 0.26.2, this function ensures that the project is unloaded despite PABusy errors by repeating the request (at most 10 times) until PolyAnalyst returns either an OK response or the error ‘the project has not been opened’, which also means that the project is unloaded.

Raises:
  • PABusy – if PABusy is returned for all 10 request attempts

  • APIException – if the project has been unloaded before this function was called

Return type:

None
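The retry behavior described for unload() follows a common bounded-retry pattern. The sketch below illustrates that pattern in isolation and is not the library's implementation: BusyError is a stand-in for PABusy, the helper retry_while_busy and its delay parameter are hypothetical, and the special handling of the ‘the project has not been opened’ error is left out.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class BusyError(Exception):
    """Stand-in for the library's PABusy error (hypothetical)."""


def retry_while_busy(
    action: Callable[[], T],
    busy_errors: Tuple[Type[BaseException], ...] = (BusyError,),
    attempts: int = 10,
    delay: float = 1.0,
) -> T:
    """Call action() until it succeeds or attempts busy errors have occurred.

    The last busy error is re-raised when every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except busy_errors:
            if attempt == attempts:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")
```

The same wrapper can guard other calls that may hit a busy server, for example retry_while_busy(prj.save, busy_errors=(PABusy,)) once PABusy has been imported from the library.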

wait_for_completion(node, wave_id=None)[source]

Waits for the node in a sequence of nodes to complete. Returns True if the node has completed successfully and False otherwise.

Unlike execute(…, wait=True), which returns only after an entire node sequence has completed, this method returns immediately after the specified node has completed.

Parameters:
  • node (Union[str, Dict[str, str]]) – Node name or dict with name and type of node that runs within execution wave

  • wave_id (Optional[int, None]) – Execution wave identifier

Deprecated since version 0.17.0: Use Project.is_running() instead.

Changed in version 0.23.0: Reintroduced this deprecated method. Added wave_id argument.

Return type:

bool

class polyanalyst6api.project.DataSet(prj, node)[source]
get_info()[source]

Get information about dataset.

Return type:

Dict[str, Any]

get_progress()[source]

Get dataset progress.

Return type:

Dict[str, Union[str, int]]

iter_rows(start=0, stop=None)[source]

Iterate over rows in dataset.

Parameters:
  • start (int) –

  • stop (Optional[int, None]) –

Raises:

ValueError if start or stop is outside the dataset’s row range

Usage:

Download the first 10 rows:

>>> head = []
>>> for row in ds.iter_rows(0, 10):
...     head.append(row)

Download the full dataset and convert it to a pandas.DataFrame:

>>> table = list(ds.iter_rows())
>>> df = pandas.DataFrame(table)

Return type:

Iterator[Dict[str, Union[bool, str, int, float, None]]]
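For datasets too large to hold in a single list, the row iterator can be consumed in fixed-size batches. A minimal sketch, assuming a hypothetical helper iter_chunks built on itertools.islice; it works with any iterable of row dicts, including the result of iter_rows().

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def iter_chunks(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most size rows from rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Intended use (process_batch is a placeholder for your own function):
#     for chunk in iter_chunks(ds.iter_rows(), 10_000):
#         process_batch(chunk)
```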

preview(precision=6, include_blank_cells=False)[source]

Get dataset preview.

Contains the first 1000 rows; strings and texts are cut off after 250 characters. By default, numbers are rounded to 6 significant digits and blank cells are omitted.

Parameters:
  • precision (int) – (optional) number of significant digits. 6 by default.

  • include_blank_cells (bool) – (optional) include blank cells in dataset. False by default.

Raises:

APIException if non-default parameters are used with an older server version that doesn’t support them. In this case, retry the method with default parameters.

New in version 0.24.0: The precision and include_blank_cells parameters.

Return type:

List[Dict[str, Any]]

class polyanalyst6api.project.Parameters(prj, _id)[source]
clear(*node_types, declare_unsync=True)[source]

Clears parameters and strategies of node_types for the Parameters node. If node_types is empty it clears parameters and strategies of all nodes.

Parameters:
  • node_types (List[str]) – node types whose parameters need to be cleared

  • declare_unsync (bool) – reset status of the Parameters node

Return type:

Optional[List[str], None]

get()[source]

Returns a list of nodes with parameters and strategies supported by the Parameters node.

set(node_type, parameters, strategies=None, declare_unsync=True, hard_update=True, wait=True)[source]

Sets node_type parameters and strategies for the Parameters node.

If parameters is a list of dictionaries, the /configure-array endpoint is used; otherwise, /configure is used.

Parameters:
  • node_type (str) – node type whose parameters need to be set

  • parameters (Union[Dict[str, str], List[Dict[str, str]]]) – node type parameters

  • strategies (Optional[List[int], None]) – node type strategies

  • declare_unsync (bool) – reset status of the Parameters node. True by default.

  • hard_update (bool) – update every child node with new parameters if True, otherwise reset their statuses. Works only if declare_unsync is True. True by default.

  • wait (bool) – wait for this node to set parameters for each child node. True by default.

Return type:

Optional[List[str], None]
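Because set() accepts either a single dict or a list of dicts, checking the shape of parameters before the request is sent can catch mistakes early. The sketch below is one way to do that; the helper set_node_parameters is hypothetical, and the node name, node type and parameter names in the usage comment are placeholders rather than values confirmed by this page.

```python
from typing import Any, Dict, List, Union

ParamSet = Union[Dict[str, str], List[Dict[str, str]]]


def set_node_parameters(
    prj: Any,
    parameters_node: str,
    node_type: str,
    parameters: ParamSet,
) -> Any:
    """Validate parameters, then call prj.parameters(...).set(...).

    Accepts one dict or a list of dicts, mirroring the documented types.
    """
    is_dict = isinstance(parameters, dict)
    is_dict_list = isinstance(parameters, list) and all(
        isinstance(item, dict) for item in parameters
    )
    if not (is_dict or is_dict_list):
        raise TypeError("parameters must be a dict or a list of dicts")
    return prj.parameters(parameters_node).set(node_type, parameters)


# Intended use (all string values are placeholders):
#     set_node_parameters(prj, 'Parameters (1)', 'SomeNodeType', {'Some Option': 'value'})
```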

Exceptions

exception polyanalyst6api.PAException[source]

Generic error class, catch-all for most polyanalyst6api issues.

exception polyanalyst6api.ClientException[source]

Indicate errors that don’t involve interaction with PolyAnalyst’s API.

exception polyanalyst6api.APIException(msg, endpoint=None, status_code=None)[source]

Indicate errors that involve responses from PolyAnalyst’s API.

Parameters:
  • msg (str) – The exception message

  • endpoint (str) – The resource endpoint

  • status_code (int) – The http status code
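When logging failures, it can help to include the endpoint and status code alongside the message. The sketch below assumes that APIException exposes its constructor arguments as endpoint and status_code attributes, which this page does not confirm, so they are read with getattr and skipped when absent. The helper describe_error is hypothetical.

```python
def describe_error(exc: BaseException) -> str:
    """Format an exception as a one-line message for logs.

    The endpoint and status_code attributes are assumptions; they are
    only included when present and not None.
    """
    parts = [f"{type(exc).__name__}: {exc}"]
    endpoint = getattr(exc, "endpoint", None)
    status_code = getattr(exc, "status_code", None)
    if endpoint is not None:
        parts.append(f"endpoint={endpoint}")
    if status_code is not None:
        parts.append(f"status={status_code}")
    return ", ".join(parts)


# Intended use:
#     from polyanalyst6api import APIException
#     try:
#         prj.save()
#     except APIException as exc:
#         print(describe_error(exc))
```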