API Reference
This page documents the classes and methods of the polyanalyst6api package.
- class polyanalyst6api.api.API(url=None, username=None, password=None, ldap_server=None, version='1.0')
PolyAnalyst API
- Parameters:
url (Optional[str, None]) – (optional) The scheme, host and port (if any) of a PolyAnalyst server (e.g. https://localhost:5043/, http://example.polyanalyst.com)
username (Optional[str, None]) – (optional) The username to log in with
password (Optional[str, None]) – (optional) The password for the specified username
ldap_server (Optional[str, None]) – (optional) LDAP server address
version (str) – (optional) The PolyAnalyst API version to use. Default: 1.0
If ldap_server is provided, login is performed via the LDAP server.
Usage:
>>> with API(POLYANALYST_URL, YOUR_USERNAME, YOUR_PASSWORD) as api:
...     print(api.get_server_info())
or, if you’re using a configuration file (New in version 0.23.0):
>>> with API() as api:
...     print(api.get_server_info())
- get(endpoint, **kwargs)
Shortcut for GET requests via request
- Parameters:
endpoint (str) – PolyAnalyst API endpoint
kwargs – requests.request() keyword arguments
- Return type:
Any
- get_parameters()
Returns the list of nodes with parameters supported by the Parameters node.
Deprecated since version 0.18.0: Use Parameters.get() instead.
- Return type:
List[Dict[str, Union[str, List]]]
- get_project_import_status(import_id)
Get the status of a project import.
- Parameters:
import_id (str) – the import identifier
New in version 0.24.0.
- Return type:
Dict
- get_server_info()
Returns general server information including build number, version and commit hashes.
- Return type:
Optional[Dict[str, Union[int, str, Dict[str, str]]], None]
- get_versions()
Returns the API versions supported by the PolyAnalyst server.
- Return type:
List[str]
- import_project(file_path, project_space='', on_conflict='Cancel', wait=False)
Import a project from a file on the server file system.
- Parameters:
file_path (str) – absolute path to the file on the server file system
project_space (str) – the name of the folder in the project manager where you want to import the project. The default folder is Root.
on_conflict (str) – the strategy to resolve an import conflict. Allowed options are: Cancel, Overwrite, ChangeExistingId, ChangeImportingId. By default, the import is cancelled if the project already exists.
wait (bool) – wait for the project import to finish. False by default.
- Return type:
Union[str, Dict]
- Returns:
the import identifier if wait is False, and the import status otherwise
New in version 0.24.0.
- post(endpoint, **kwargs)
Shortcut for POST requests via request
- Parameters:
endpoint (str) – PolyAnalyst API endpoint
kwargs – requests.request() keyword arguments
- Return type:
Any
- project(uuid)
Returns the Project instance with the given uuid.
- Parameters:
uuid (str) – The project uuid
- Return type:
Project
- request(url, method, **kwargs)
Sends a method request to endpoint and returns a tuple of requests.Response and the json-encoded content of the response.
- Parameters:
url (str) – url or PolyAnalyst API endpoint
method (str) – request method (e.g. GET, POST)
kwargs – requests.request() keyword arguments
- Return type:
Tuple[Response, Any]
- class polyanalyst6api.drive.Drive(api)
- create_folder(name, path='')
Create a new folder inside the PolyAnalyst user directory.
- Parameters:
name (str) – the folder name
path (str) – a relative path of the folder’s parent directory
- Return type:
None
- delete_file(name, path='')
Delete the file in the PolyAnalyst user directory.
- Parameters:
name (str) – the filename
path (str) – a relative path of the file’s parent directory
- Return type:
None
- delete_folder(name, path='')
Delete the folder in the PolyAnalyst user directory.
- Parameters:
name (str) – the folder name
path (str) – a relative path of the folder’s parent directory
- Return type:
None
- download_file(name, path='', dest=None)
Download the binary content of the file to memory, or stream it to a local file.
- Parameters:
name (str) – the filename
path (str) – a relative path of the file’s parent directory
dest (Optional[IO, None]) – the file or file-like object to write the drive file’s content to
- Usage:
>>> file_content = api.drive.download_file(name='cars.csv', path='/data')
# The method call above downloads the whole file body into memory. If you're
# planning on downloading a big file (larger than 100 megabytes),
# consider using the streaming download shown below.
>>> with open('local_file_that_doesnt_exist_yet.csv', 'wb+') as file:
...     api.drive.download_file(name='cars.csv', path='/data', dest=file)
- Return type:
bytes
- upload(source, dest='', recursive=True)
Upload a file or folder to the PolyAnalyst server.
Pass recursive as False to create only the folder on the server, without uploading its inner files and folders.
- Parameters:
source (Union[str, PathLike]) – path to the local file or folder
dest (str) – (optional) path to the folder in the PolyAnalyst user directory
recursive (bool) – (optional) upload subdirectories recursively
- Raises:
TypeError if source is not a string or path-like object. ValueError if source does not exist.
- Return type:
None
- upload_file(file, name=None, path='')
Upload the file to the PolyAnalyst user directory.
Warning
Make sure to create a new file or file-like object for every Drive.upload_file() call!
Note
Always prefer Drive.upload() over this method.
- Parameters:
file (IO) – the file or file-like object to upload
name (Optional[str, None]) – the filename to use instead of the file’s own name
path (str) – (optional) a relative path of the file’s parent directory
- Usage:
>>> drive = Drive(...)
>>> with open('CarData.csv', mode='rb') as file:
...     drive.upload_file(file, name='cars.csv', path='/data')
- Return type:
None
- class polyanalyst6api.project.Project(api, uuid)
This class maintains all operations with a PolyAnalyst project and its nodes.
- Parameters:
api – An instance of the API class
uuid (str) – The uuid of the project you want to interact with
- dataset(node)
Get dataset wrapper object.
- Parameters:
node (Union[str, Dict[str, str]]) – node name or dict with name and type of the node
New in version 0.16.0.
- delete(force_unload=False)
Delete the project from the server.
- Parameters:
force_unload (bool) – Delete the project regardless of other users
By default, the project is deleted only if it is not loaded into memory. To delete a project that is loaded into memory (i.e. there are users working on it right now), set force_unload to True. This operation is available only to the project owner and administrators, and cannot be undone.
- Return type:
None
- execute(*args, wait=False)
Initiates execution of nodes and returns the execution wave identifier.
- Parameters:
args (Union[str, Dict[str, str]]) – node names and/or dicts with name and type of nodes
wait (bool) – wait for nodes execution to complete
Usage:
>>> wave_id = prj.execute('Internet Source', 'Python')
Use wait=True to wait for the execution of the passed nodes to complete.
>>> prj.execute('Export to MS Word', wait=True)
Or, if there are several nodes in the project with the same name, pass them as dicts with name and type keys (and because of this, you can also pass items of Project.get_node_list()):
>>> prj.execute(
...     {'name': 'Example node', 'type': 'DataSource'},
...     {'name': 'Example node', 'type': 'Dataset'},
...     'Federated Search',
...     prj.get_node_list()[1],
... )
Or, if you want to execute all nodes, call this method with no args:
>>> prj.execute()
- Return type:
Optional[int, None]
- get_execution_statistics()
Returns the execution statistics for nodes in the project.
Similar to Project.get_nodes(), but the nodes contain extra information and the result includes the project statistics.
Deprecated since version 0.15.0: Use Project.get_execution_stats() instead.
- Return type:
Tuple[Dict[str, Dict[str, Union[str, int]]], Dict[str, int]]
- get_execution_stats(skip_hidden=None)
Returns nodes execution statistics.
- Parameters:
skip_hidden (Optional[bool, None]) – Return statistics only of nodes in the project (i.e. exclude publication and compound nodes).
New in version 0.15.0.
Changed in version 0.25.0: Added skip_hidden optional parameter.
- Return type:
List[Dict[str, Union[str, int]]]
- get_node_list()
Returns a list of project nodes.
New in version 0.15.0.
- Return type:
List[Dict[str, Union[str, int]]]
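Since execute() accepts dicts with name and type keys, items of the node list can be filtered before execution when several nodes share a name. The sketch below assumes each item carries at least name and type keys, as the execute() description implies; the helper name select_nodes and the sample data (including the id key) are hypothetical.

```python
from typing import Dict, List


def select_nodes(node_list: List[Dict], name: str) -> List[Dict[str, str]]:
    """Return name/type dicts for every node in node_list called `name`."""
    return [
        {"name": node["name"], "type": node["type"]}
        for node in node_list
        if node["name"] == name
    ]


# Hypothetical sample shaped like the items execute() is documented to accept.
SAMPLE_NODES = [
    {"name": "Example node", "type": "DataSource", "id": 1},
    {"name": "Example node", "type": "Dataset", "id": 2},
    {"name": "Federated Search", "type": "DataSource", "id": 3},
]

if __name__ == "__main__":
    for node in select_nodes(SAMPLE_NODES, "Example node"):
        print(node)
```

With a real project, the result could be unpacked into the call, for example prj.execute(*select_nodes(prj.get_node_list(), 'Example node')).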
- get_nodes()
Returns a dictionary of the project’s nodes information.
Deprecated since version 0.15.0: Use Project.get_node_list() instead.
- Return type:
Dict[str, Dict[str, Union[str, int]]]
- is_running(wave_id)
Checks whether the execution wave is still running in the project.
If wave_id is -1, the project is checked for any active execution, saving or publishing operation.
- Parameters:
wave_id (int) – Execution wave identifier
- Return type:
bool
- parameters(name)
Get parameters wrapper object.
- Parameters:
name (str) – Parameters node name
New in version 0.18.0.
- preview(node)
Returns the first 1000 rows of data from node; texts and strings are cut off after 250 symbols.
- Parameters:
node (Union[str, Dict[str, str]]) – node name or dict with name and type of node
Deprecated since version 0.16.0: Use Dataset.preview() instead.
- Return type:
List[Dict[str, Any]]
- save()
Initiates saving of all changes that have been made in the project.
- Return type:
None
- set_parameters(node, node_type, parameters, declare_unsync=True, hard_update=True)
Set parameters of the selected Parameters node in the project.
- Parameters:
node (str) – name of the Parameters node
node_type (str) – the node type whose parameters need to be set. The types are listed in NodeTypes.
parameters (Dict[str, Any]) – default parameters of the node to be set.
declare_unsync (bool) – reset the status of the Parameters node.
hard_update (bool) – update every child node with new parameters if True, otherwise reset their statuses. Works only if declare_unsync is True.
Deprecated since version 0.18.0: Use Parameters.set() instead.
- Return type:
None
- unload()
Unload the project from memory.
Since version 0.26.2, this function ensures that the project is unloaded despite PABusy errors by repeating the request (at most 10 times) until PolyAnalyst returns either an OK response or the error ‘the project has not been opened’, which also means that the project is unloaded.
- Raises:
PABusy if PABusy is returned for all 10 request attempts
- Raises:
APIException if the project was unloaded before this function was called
- Return type:
None
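The retry behaviour described above can be illustrated with a generic bounded-retry loop. This is only a sketch of the general pattern, not the library's actual code: the Busy class is a hypothetical stand-in for PABusy, and the function name, delay parameter and flaky action are made up for the example.

```python
import time


class Busy(Exception):
    """Hypothetical stand-in for the library's PABusy error."""


def retry_while_busy(action, attempts: int = 10, delay: float = 0.0):
    """Call action() until it succeeds or has raised Busy `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Busy as exc:
            last_error = exc
            time.sleep(delay)
    # Every attempt was rejected as busy: surface the last error.
    raise last_error


def make_flaky(failures: int):
    """Build an action that raises Busy `failures` times, then returns 'ok'."""
    state = {"left": failures}

    def action():
        if state["left"] > 0:
            state["left"] -= 1
            raise Busy("server is busy")
        return "ok"

    return action


if __name__ == "__main__":
    print(retry_while_busy(make_flaky(3)))
```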
- wait_for_completion(node, wave_id=None)
Waits for the node in a sequence of nodes to complete. Returns True if the node has completed successfully and False otherwise.
Unlike execute(…, wait=True), which returns only after an entire node sequence has completed, this method returns immediately after the specified node has completed.
- Parameters:
node (Union[str, Dict[str, str]]) – Node name or dict with name and type of the node that runs within the execution wave
wave_id (Optional[int, None]) – Execution wave identifier
Deprecated since version 0.17.0: Use Project.is_running() instead.
Changed in version 0.23.0: Reintroduced this deprecated method and added the wave_id argument.
- Return type:
bool
- class polyanalyst6api.project.DataSet(prj, node)
- iter_rows(start=0, stop=None)
Iterate over rows in the dataset.
- Parameters:
start (int) –
stop (Optional[int, None]) –
- Raises:
ValueError if start or stop is out of the dataset’s row range
Usage:
# download first 10 rows
>>> head = []
>>> for row in ds.iter_rows(0, 10):
...     head.append(row)
# download full dataset and convert it to pandas.DataFrame
>>> table = list(ds.iter_rows())
>>> df = pandas.DataFrame(table)
- Return type:
Iterator[Dict[str, Union[bool, str, int, float, None]]]
- preview(precision=6, include_blank_cells=False)
Get dataset preview.
Contains the first 1000 rows; strings and texts are cut off after 250 symbols. By default, numbers are rounded to 6 significant digits and blank cells are omitted.
- Parameters:
precision (int) – (optional) number of significant digits. 6 by default.
include_blank_cells (bool) – (optional) include blank cells in the dataset. False by default.
- Raises:
APIException if non-default parameters are used with an old version of the server that doesn’t support them. In this case, retry the method with default parameters.
New in version 0.24.0: The precision and include_blank_cells parameters.
- Return type:
List[Dict[str, Any]]
- class polyanalyst6api.project.Parameters(prj, _id)
- clear(*node_types, declare_unsync=True)
Clears parameters and strategies of node_types for the Parameters node. If node_types is empty, it clears parameters and strategies of all nodes.
- Parameters:
node_types (List[str]) – node types whose parameters need to be cleared
declare_unsync (bool) – reset the status of the Parameters node
- Return type:
Optional[List[str], None]
- set(node_type, parameters, strategies=None, declare_unsync=True, hard_update=True, wait=True)
Sets node_type parameters and strategies for the Parameters node.
If parameters is a list of dictionaries, the /configure-array endpoint is used; otherwise /configure is used.
- Parameters:
node_type (str) – the node type whose parameters need to be set
parameters (Union[Dict[str, str], List[Dict[str, str]]]) – node type parameters
strategies (Optional[List[int], None]) – node type strategies
declare_unsync (bool) – reset the status of the Parameters node. True by default.
hard_update (bool) – update every child node with new parameters if True, otherwise reset their statuses. Works only if declare_unsync is True. True by default.
wait (bool) – wait for this node to set parameters for each child node. True by default.
- Return type:
Optional[List[str], None]
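The endpoint rule stated in the description of set() can be written as a small dispatch function. This sketch only mirrors that documented rule and is not the library's implementation; the function name configure_endpoint and the example parameter names are hypothetical.

```python
from typing import Dict, List, Union

ParametersArg = Union[Dict[str, str], List[Dict[str, str]]]


def configure_endpoint(parameters: ParametersArg) -> str:
    """Pick the endpoint suffix according to the shape of `parameters`."""
    # A list of parameter dicts goes to /configure-array, a single dict to /configure.
    return "/configure-array" if isinstance(parameters, list) else "/configure"


if __name__ == "__main__":
    print(configure_endpoint({"Hypothetical option": "value"}))
    print(configure_endpoint([{"Hypothetical option": "a"}, {"Hypothetical option": "b"}]))
```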
Exceptions
- exception polyanalyst6api.PAException
Generic error class, catch-all for most polyanalyst6api issues.