caveclient package
Subpackages
Submodules
caveclient.annotationengine module
- caveclient.annotationengine.AnnotationClient(server_address, dataset_name=None, aligned_volume_name=None, auth_client=None, api_version='latest', verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Factory for returning an AnnotationClient.
- Parameters
server_address (str) – Server address to connect to (e.g. https://minniev1.microns-daf.com)
aligned_volume_name (str or None, optional) – Name of the aligned volume to use.
auth_client (AuthClient or None, optional) – Authentication client to use to connect to server. If None, do not use authentication.
api_version (str or int (default: ‘latest’)) – Which version of the API to use. 0: legacy client (e.g. www.dynamicannotationframework.com); 2: new API version (e.g. minniev1.microns-daf.com); ‘latest’: defaults to the most recent version (currently 2).
verify (bool (default: True)) – Whether to verify HTTPS certificates.
max_retries (int or None, optional) – Set the number of retries per request, by default None. If None, defaults to the requests package default.
pool_maxsize (int or None, optional) – Sets the max number of threads in the pool, by default None. If None, defaults to the requests package default.
pool_block (bool or None, optional) – If True, restricts the pool of threads to the max size, by default None. If None, defaults to the requests package default.
over_client – Client to overwrite configuration with.
- Returns
Client to interface with the annotation service for the given server and API version.
- Return type
AnnotationClientV2
- class caveclient.annotationengine.AnnotationClientV2(server_address, auth_header, api_version, endpoints, server_name, aligned_volume_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, schema_client=None)[source]
Bases:
ClientBase
- property aligned_volume_name
- create_table(table_name: str, schema_name: str, description: str, voxel_resolution: List[float], reference_table: str = None, track_target_id_updates: bool = None, flat_segmentation_source: str = None, user_id: int = None, aligned_volume_name: str = None, write_permission: str = 'PRIVATE', read_permission: str = 'PUBLIC', notice_text: str = None)[source]
Creates a new data table based on an existing schema
- Parameters
table_name (str) – Name of the new table. Cannot be the same as an existing table
schema_name (str) – Name of the schema for the new table.
description (str) – Human-readable description of what is in the table. Should include information about who generated the table, what data it covers, how it should be interpreted, and who to talk to if you want to use it. An example: a manual synapse table to detect chandelier synapses on 81 PyC cells with complete AISs [created by Agnes - agnesb@alleninstitute.org, uploaded by Forrest]
voxel_resolution (list[float]) – Voxel resolution points will be uploaded in, typically nm. For example, [1,1,1] means points are in nanometers, and [4,4,40] means 4nm, 4nm, 40nm voxels.
reference_table (str or None) – If the schema you are using is a reference schema (meaning it is an annotation of another annotation), then you need to specify the target table those annotations are in.
track_target_id_updates (bool or None) – Indicates whether to automatically update the reference table’s foreign key if the target annotation table row is updated.
flat_segmentation_source (str or None) – The source of a flat segmentation that corresponds to this table, e.g. precomputed://gs://mybucket/this_tables_annotation
user_id (int) – If you are uploading this schema on someone else’s behalf and you want to link this table with their ID, you can specify it here. Otherwise, the table will be created with your user ID in the user_id column.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
write_permission (str, optional) – What permissions to give the table for writing. One of PRIVATE: only you can write to this table (default); GROUP: only members that share a group with you can write (excluding some groups); PUBLIC: anyone can write to this table. Note that all data is logged, and deletes are done by marking rows as deleted, so all data is always recoverable.
read_permission (str, optional) – What permissions to give the table for reading. One of PRIVATE: only you can read this table (intended to be used for sorting out bugs); GROUP: only members that share a group with you can read (intended for within-group vetting); PUBLIC: anyone with permission to read this datastack can read this data (default).
notice_text (str, optional) – Text the user will see when querying this table. Can be used to warn users of flaws or uncertainty in the data, or to advertise citations that should be used with this table. Defaults to None (no text). If you want to remove existing text, send an empty string.
- Returns
Response JSON
- Return type
json
Examples
Basic annotation table:
description = "Some description about the table"
voxel_res = [4, 4, 40]
client.create_table("some_synapse_table", "synapse", description, voxel_res)
- delete_annotation(table_name: str, annotation_ids: dict, aligned_volume_name: str = None)[source]
Delete one or more annotations in a table. Annotations that are deleted are recorded as ‘non-valid’ but are not physically removed from the table.
- Parameters
table_name (str) – Name of the table from which annotations will be deleted
annotation_ids (int or iterable) – ID or IDs of the annotation(s) to delete
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON: a list of the annotation IDs marked as deleted.
- Return type
json
- delete_table(table_name: str, aligned_volume_name: str = None)[source]
Marks a table for deletion. Requires super admin privileges.
- Parameters
table_name (str) – Name of the table to mark for deletion
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- get_annotation(table_name: str, annotation_ids: int, aligned_volume_name: str = None)[source]
Retrieve an annotation or annotations by id(s) and table name.
- Parameters
table_name (str) – Name of the table
annotation_ids (int or iterable) – ID or IDs of the annotation(s) to retrieve
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Annotation data
- Return type
list
- get_annotation_count(table_name: str, aligned_volume_name: str = None)[source]
Get number of annotations in a table
- Parameters
table_name (str) – Name of the table to count annotations in
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
number of annotations
- Return type
int
- get_table_metadata(table_name: str, aligned_volume_name: str = None)[source]
Get metadata about a table
- Parameters
table_name (str) – Name of the table to get metadata for
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
metadata about table
- Return type
json
- get_tables(aligned_volume_name: str = None)[source]
Gets a list of table names for an aligned volume.
- Parameters
aligned_volume_name (str or None, optional) – Name of the aligned_volume, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client
- Returns
List of table names
- Return type
list
- post_annotation(table_name: str, data: dict, aligned_volume_name: str = None)[source]
Post one or more new annotations to a table in the AnnotationEngine. All inserted annotations will be marked as ‘valid’. To invalidate annotations refer to ‘update_annotation’, ‘update_annotation_df’ and ‘delete_annotation’ methods.
- Parameters
table_name (str) – Name of the table where annotations will be added
data (dict or list,) – A list of (or a single) dict of schematized annotation data matching the target table.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- post_annotation_df(table_name: str, df: DataFrame, position_columns: Iterable[str], aligned_volume_name=None)[source]
Post one or more new annotations to a table in the AnnotationEngine. All inserted annotations will be marked as ‘valid’. To invalidate annotations see ‘update_annotation’, ‘update_annotation_df’ and ‘delete_annotation’ methods.
- Parameters
table_name (str) – Name of the table where annotations will be added
df (pd.DataFrame) – A pandas dataframe containing the annotations. Columns should be fields in schema, position columns need to be called out in position_columns argument.
position_columns (Iterable[str] or Mapping[str, str] or None) – If None, will look for all columns with ‘X_position’ in the name and assume they go in fields called “X”. If an iterable, assumes each column given ends in _position (e.g. [‘pt_position’] if ‘pt’ is the name of the position field in the schema). If a mapping, keys are names of columns in the dataframe and values are the names of the fields (e.g. {‘pt_column’: ‘pt’} would be correct if you had one column named ‘pt_column’ which needed to go into a schema with a position field called ‘pt’).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- static process_position_columns(df: DataFrame, position_columns: Iterable[str])[source]
Process a dataframe into a list of dictionaries, nesting position columns into their point fields.
- Parameters
df (pd.DataFrame) – dataframe to process
position_columns (Iterable[str] or Mapping[str, str] or None) – see post_annotation_df
- Returns
json list of annotations ready for posting
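The nesting described above can be sketched in plain pandas. This is an illustrative reimplementation, not the library's actual code; the function name nest_positions and the {"position": [...]} nesting shape are assumptions.

```python
import pandas as pd

def nest_positions(df: pd.DataFrame, position_columns=None):
    """Sketch: nest position columns into point fields.

    If position_columns is None, every column ending in '_position' is
    mapped to the field named by its prefix (e.g. 'pt_position' -> 'pt').
    A dict maps column names to field names explicitly.
    """
    if position_columns is None:
        position_columns = {
            c: c[: -len("_position")] for c in df.columns if c.endswith("_position")
        }
    elif not isinstance(position_columns, dict):
        position_columns = {c: c[: -len("_position")] for c in position_columns}

    records = df.to_dict(orient="records")
    for rec in records:
        for col, field in position_columns.items():
            # Move the raw coordinate under a nested "position" key
            rec[field] = {"position": rec.pop(col)}
    return records

df = pd.DataFrame({"pt_position": [[1, 2, 3]], "size": [10]})
print(nest_positions(df))
# → [{'size': 10, 'pt': {'position': [1, 2, 3]}}]
```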
- stage_annotations(table_name=None, schema_name=None, update=False, id_field=False, table_resolution=None, annotation_resolution=None)[source]
Get a StagedAnnotations object to help produce correctly formatted annotations for a given table or schema. StagedAnnotation objects can be uploaded directly with upload_staged_annotations.
- Parameters
table_name (str, optional) – Table name to stage annotations for, by default None.
schema_name (str, optional) – Schema name to use to make annotations. Only needed if the table_name is not set, by default None
update (bool, optional) – Set to True if individual annotations are going to be updated, by default False.
id_field (bool, optional) – Set to True if id fields are to be specified. Not needed if update is True, which always needs id fields. Optional, by default False
table_resolution (list-like or None, optional) – Voxel resolution of spatial points in the table in nanometers. This is found automatically from the info service if a table name is provided, by default None. If annotation_resolution is also set, this allows points to be scaled correctly for the table.
annotation_resolution (list-like, optional) – Voxel resolution of spatial points provided by the user when creating annotations. If the table resolution is also available (manually or from the info service), annotations are correctly rescaled for the volume. By default, None.
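The rescaling these two parameters enable is elementwise: points are converted to nanometers using annotation_resolution and then into table voxel units using table_resolution. A minimal sketch of that arithmetic (rescale_point is a hypothetical helper, not a client method):

```python
import numpy as np

def rescale_point(point, annotation_resolution, table_resolution):
    """Convert a point from annotation voxel units to table voxel units.

    Both resolutions are x/y/z voxel sizes in nanometers; the point is
    first converted to nm, then divided by the table's voxel size.
    """
    point = np.asarray(point, dtype=float)
    nm = point * np.asarray(annotation_resolution, dtype=float)
    return nm / np.asarray(table_resolution, dtype=float)

# A point annotated at [8, 8, 40] nm/voxel, destined for a [4, 4, 40] table:
print(rescale_point([100, 200, 50], [8, 8, 40], [4, 4, 40]))
```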
- update_annotation(table_name: str, data: dict, aligned_volume_name: str = None)[source]
Update one or more annotations in a table in the AnnotationEngine. Updating is implemented by invalidating the old annotation and inserting a new annotation row, which will receive a new primary key ID.
Notes
If annotation IDs were user-provided upon insertion, the database will autoincrement from the current maximum ID in the table.
- Parameters
table_name (str) – Name of the table where annotations will be added
data (dict or list,) – A list of (or a single) dict of schematized annotation data matching the target table. each dict must contain an “id” field which is the ID of the annotation to update
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON: a list of new annotation IDs.
- Return type
json
- update_annotation_df(table_name: str, df: DataFrame, position_columns: Iterable[str], aligned_volume_name=None)[source]
Update one or more annotations in a table in the AnnotationEngine using a dataframe format. Updating is implemented by invalidating the old annotation and inserting a new annotation row, which will receive a new primary key ID.
Notes
If annotation IDs were user-provided upon insertion, the database will autoincrement from the current maximum ID in the table.
- Parameters
table_name (str) – Name of the table where annotations will be added
df (pd.DataFrame) – A pandas dataframe containing the annotations. Columns should be fields in schema, position columns need to be called out in position_columns argument.
position_columns (Iterable[str] or Mapping[str, str] or None) – If None, will look for all columns with ‘X_position’ in the name and assume they go in fields called “X”. If an iterable, assumes each column given ends in _position (e.g. [‘pt_position’] if ‘pt’ is the name of the position field in the schema). If a mapping, keys are names of columns in the dataframe and values are the names of the fields (e.g. {‘pt_column’: ‘pt’} would be correct if you had one column named ‘pt_column’ which needed to go into a schema with a position field called ‘pt’).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- update_metadata(table_name: str, description: str = None, flat_segmentation_source: str = None, read_permission: str = None, write_permission: str = None, user_id: int = None, notice_text: str = None, aligned_volume_name: str = None)[source]
Update the metadata on an existing table.
- Parameters
table_name (str) – name of table to update
description (str, optional) – Text description of the table. Defaults to None (will not update).
flat_segmentation_source (str, optional) – Cloudpath to a flat segmentation associated with this table. Defaults to None (will not update).
read_permission (str, optional) – What permissions to give the table for reading. One of PRIVATE: only you can read this table (intended to be used for sorting out bugs); GROUP: only members that share a group with you can read (intended for within-group vetting); PUBLIC: anyone with permission to read this datastack can read this data. Defaults to None (will not update).
write_permission (str, optional) – What permissions to give the table for writing. One of PRIVATE: only you can write to this table; GROUP: only members that share a group with you can write (excluding some groups); PUBLIC: anyone can write to this table. Note that all data is logged, and deletes are done by marking rows as deleted, so all data is always recoverable. Defaults to None (will not update).
user_id (int, optional) – Change ownership of this table to this user_id. Note that if you use this you will no longer be able to update the metadata on this table, and depending on permissions you may not be able to read or write to it. Defaults to None (will not update).
notice_text (str, optional) – Text the user will see when querying this table. Can be used to warn users of flaws or uncertainty in the data, or to advertise citations that should be used with this table. Defaults to None (will not update).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- upload_staged_annotations(staged_annos: StagedAnnotations, aligned_volume_name: str = None)[source]
Upload annotations directly from an Annotation Guide object. This method uses the options specified in the object, including the table name and whether the annotations are updates.
- Parameters
staged_annos (guide.AnnotationGuide) – AnnotationGuide object with a specified table name and a collection of annotations already filled in.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
If new annotations are posted, a list of ids. If annotations are being updated, a dictionary with the mapping from old ids to new ids.
- Return type
List or dict
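As a sketch of consuming the update return value (the old-to-new mapping shape is taken from the description above; remap_ids is a hypothetical helper, and the IDs are made up):

```python
def remap_ids(ids, old_to_new):
    """Replace any ID present in the mapping with its new value."""
    return [old_to_new.get(i, i) for i in ids]

# Hypothetical mapping returned by an update operation:
old_to_new = {101: 501, 102: 502}
print(remap_ids([101, 102, 103], old_to_new))
# → [501, 502, 103]
```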
caveclient.auth module
- class caveclient.auth.AuthClient(token_file=None, token_key=None, token=None, server_address='https://global.daf-apis.com')[source]
Bases:
object
Client to find and use auth tokens to access the dynamic annotation framework services.
- Parameters
token_file (str, optional) – Path to a JSON key:value file holding your auth token. By default, “~/.cloudvolume/secrets/cave-secret.json” (will check deprecated token name “chunkedgraph-secret.json” as well)
token_key (str, optional) – Key for the token in the token_file. By default, “token”
token (str or None, optional) – Direct entry of the token as a string. If provided, overrides the files. If None, attempts to use the file paths.
server_address (str, optional,) – URL to the auth server. By default, uses a default server address.
- get_group_users(group_id)[source]
Get users in a group
- Parameters
group_id (int) – ID value for a given group
- Returns
List of dicts of user ids. Returns empty list if group does not exist.
- Return type
list
- get_new_token(open=False)[source]
Currently, returns instructions for getting a new token based on the current settings and saving it to the local environment. New OAuth tokens are currently not able to be retrieved programmatically.
- Parameters
open (bool, optional) – If True, opens a web browser to the web page where you can generate a new token.
- get_token(token_key=None)[source]
Load a token with a given key from the specified token file.
- Parameters
token_key (str or None, optional) – key in the token file JSON, by default None. If None, uses ‘token’.
- get_tokens()[source]
Get the tokens set up for this user.
- Returns
A list of token dictionaries, each with the keys ”id” (the ID of this token), ”token” (the token string), and ”user_id” (the user’s ID, which should be your ID).
- Return type
list[dict]
- get_user_information(user_ids)[source]
Get user data.
- Parameters
user_ids (list of int) – User IDs to look up
- property request_header
Formatted request header with the specified token
- save_token(token=None, token_key='token', overwrite=False, token_file=None, switch_token=True, write_to_server_file=True)[source]
Conveniently save a token in the correct format.
After getting a new token by following the instructions in authclient.get_new_token(), you can save it with a fully default configuration by running:
token = ‘my_shiny_new_token’
authclient.save_token(token=token)
Now on the next load, authclient = AuthClient() will make an auth client instance using this token. If you would like to specify more information about the JSON file where the token will be stored, see the parameters below.
- Parameters
token (str, optional) – New token to save, by default None
token_key (str, optional) – Key for the token in the token_file json, by default “token”
overwrite (bool, optional) – Allow an existing token to be changed, by default False
token_file (str, optional) – Path to the token file, by default None. If None, uses the default file location specified above.
switch_token (bool, optional) – If True, switch the auth client over into using the new token, by default True
write_to_server_file (bool, optional) – If True, will write token to a server specific file to support this machine interacting with multiple auth servers.
- setup_token(make_new=True, open=True)[source]
Currently, returns instructions for getting your auth token based on the current settings and saving it to the local environment. New OAuth tokens are currently not able to be retrieved programmatically.
- Parameters
make_new (bool, optional) – If True, will make a new token, else prompt you to open a page to retrieve an existing token.
open (bool, optional) – If True, opens a web browser to the web page where you can retrieve a token.
- property token
Secret token used to authenticate yourself to the Connectome Annotation Versioning Engine services.
caveclient.base module
- class caveclient.base.BaseEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]
Bases:
JSONEncoder
- default(obj)[source]
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError). For example, to support arbitrary iterators, you could implement default like this:
def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
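As an illustration of this pattern, the sketch below extends default() to serialize numpy types, the kind of fallback an encoder like BaseEncoder plausibly provides (this is an assumption, not BaseEncoder's actual implementation; ArrayEncoder is a hypothetical name):

```python
import json
import numpy as np

class ArrayEncoder(json.JSONEncoder):
    """Example encoder: fall back to plain Python types for numpy objects."""
    def default(self, o):
        if isinstance(o, np.integer):
            return int(o)
        if isinstance(o, np.floating):
            return float(o)
        if isinstance(o, np.ndarray):
            return o.tolist()
        # Let the base class raise TypeError for anything else
        return json.JSONEncoder.default(self, o)

print(json.dumps({"root_id": np.int64(648518346349538466)}, cls=ArrayEncoder))
# → {"root_id": 648518346349538466}
```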
- class caveclient.base.ClientBase(server_address, auth_header, api_version, endpoints, server_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
object
- property api_version
- property default_url_mapping
- property fc
- property server_address
- class caveclient.base.ClientBaseWithDataset(server_address, auth_header, api_version, endpoints, server_name, dataset_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- property dataset_name
- class caveclient.base.ClientBaseWithDatastack(server_address, auth_header, api_version, endpoints, server_name, datastack_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- property datastack_name
caveclient.chunkedgraph module
PyChunkedgraph service python interface
- caveclient.chunkedgraph.ChunkedGraphClient(server_address=None, table_name=None, auth_client=None, api_version='latest', timestamp=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
- class caveclient.chunkedgraph.ChunkedGraphClientV1(server_address, auth_header, api_version, endpoints, server_key='cg_server_address', timestamp=None, table_name=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
ChunkedGraph Client for the v1 API
- property base_resolution
MIP 0 resolution for voxels assumed by the ChunkedGraph
- Returns
3-long list of x/y/z voxel dimensions in nm
- Return type
list
- property cloudvolume_path
- property default_url_mapping
- do_merge(supervoxels, coords, resolution=(4, 4, 40)) None [source]
Perform a merge on the chunked graph.
- Parameters
supervoxels (iterable) – An N-long list of supervoxels to merge.
coords (np.array) – An Nx3 array of coordinates of the supervoxels in units of resolution.
resolution (tuple, optional) – What to multiply coords by to get nanometers. Defaults to (4,4,40).
- execute_split(source_points, sink_points, root_id, source_supervoxels=None, sink_supervoxels=None) Tuple[int, list] [source]
Execute a multicut split based on points or supervoxels.
- Parameters
source_points (array or list) – Nx3 list or array of 3d points in nm coordinates for source points (red).
sink_points (array or list) – Mx3 list or array of 3d points in nm coordinates for sink points (blue).
root_id (int) – Root ID of object to do split preview.
source_supervoxels (array, list or None, optional) – If providing source supervoxels, an N-length array of supervoxel IDs or Nones matched to source points. If None, treats as a full array of Nones. By default None.
sink_supervoxels (array, list or None, optional) – If providing sink supervoxels, an M-length array of supervoxel IDs or Nones matched to sink points. If None, treats as a full array of Nones. By default None.
- Returns
operation_id (int) – Unique ID of the split operation
new_root_ids (list of int) – List of new root IDs resulting from the split operation.
- find_path(root_id, src_pt, dst_pt, precision_mode=False) Tuple[ndarray, ndarray, ndarray] [source]
Find a path between two locations on a root ID using the level 2 chunked graph.
- Parameters
root_id (int) – Root ID to query.
src_pt (np.array) – 3-element array of xyz coordinates in nm for the source point.
dst_pt (np.array) – 3-element array of xyz coordinates in nm for the destination point.
precision_mode (bool, optional) – Whether to perform the search in precision mode. Defaults to False.
- Returns
centroids_list (np.array) – Array of centroids along the path.
l2_path (np.array of int) – Array of level 2 chunk IDs along the path.
failed_l2_ids (np.array of int) – Array of level 2 chunk IDs that failed to find a path.
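The centroid array returned by find_path can be reduced to an approximate path length with a short numpy computation (a sketch; path_length_nm is a hypothetical helper and the centroids below are made-up nm coordinates):

```python
import numpy as np

def path_length_nm(centroids):
    """Sum of Euclidean distances between consecutive path centroids."""
    centroids = np.asarray(centroids, dtype=float)
    return float(np.linalg.norm(np.diff(centroids, axis=0), axis=1).sum())

centroids = [[0, 0, 0], [3, 4, 0], [3, 4, 12]]
print(path_length_nm(centroids))
# → 17.0
```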
- get_change_log(root_id, filtered=True) dict [source]
Get the change log (splits and merges) for an object.
- Parameters
root_id (int) – Object root ID to look up.
filtered (bool) – Whether to filter the change log to only include splits and merges which affect the final state of the object (filtered=True), as opposed to including edit history for objects which at some point were split from the query object root_id (filtered=False). Defaults to True.
- Returns
Dictionary summarizing split and merge events in the object history, containing the following keys:
- ”n_merges”: int
Number of merges
- ”n_splits”: int
Number of splits
- ”operations_ids”: list of int
Identifiers for each operation
- ”past_ids”: list of int
Previous root ids for this object
- ”user_info”: dict of dict
Dictionary keyed by user (string) to a dictionary specifying how many merges and splits that user performed on this object
- Return type
dict
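Given the documented keys, summarizing a change log is a small dictionary traversal; a sketch using a made-up change log shaped like the dictionary above (summarize_change_log is hypothetical, as are the per-user counts inside user_info):

```python
def summarize_change_log(change_log: dict) -> str:
    """Build a one-line summary from the documented change-log keys."""
    editors = ", ".join(sorted(change_log["user_info"]))
    return (
        f"{change_log['n_merges']} merges and "
        f"{change_log['n_splits']} splits by {editors}"
    )

# Hypothetical change log matching the documented structure:
log = {
    "n_merges": 2,
    "n_splits": 1,
    "operations_ids": [10, 11, 12],
    "past_ids": [123, 456],
    "user_info": {"42": {"n_merges": 2, "n_splits": 1}},
}
print(summarize_change_log(log))
# → 2 merges and 1 splits by 42
```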
- get_children(node_id) ndarray [source]
Get the children of a node in the chunked graph hierarchy.
- Parameters
node_id (int) – Node ID to query.
- Returns
IDs of child nodes.
- Return type
np.array of np.int64
- get_contact_sites(root_id, bounds, calc_partners=False) dict [source]
Get contacts for a root ID.
- Parameters
root_id (int) – Root ID to query.
bounds (np.array) – Bounds within a 3x2 numpy array of bounds [[minx,maxx],[miny,maxy],[minz,maxz]] for which to find contacts. Running this query without bounds is too slow.
calc_partners (bool, optional) – If True, get partner root IDs. By default, False.
- Returns
Dict relating ids to contacts
- Return type
dict
- get_delta_roots(timestamp_past: datetime, timestamp_future: datetime = datetime.datetime(2024, 1, 19, 2, 8, 18, 663989, tzinfo=datetime.timezone.utc)) Tuple[ndarray, ndarray] [source]
Get the list of roots that have changed between timestamp_past and timestamp_future.
- Parameters
timestamp_past (datetime.datetime) – Past timepoint to query
timestamp_future (datetime.datetime, optional) – Future timepoint to query. Defaults to datetime.datetime.now(datetime.timezone.utc).
- Returns
old_roots (np.ndarray of np.int64) – Roots that have expired in that interval.
new_roots (np.ndarray of np.int64) – Roots that are new in that interval.
- get_latest_roots(root_id, timestamp=None, timestamp_future=None) ndarray [source]
Returns root IDs that are related to the given root_id at a given timestamp. Can be used to find the “latest” root IDs associated with an object.
- Parameters
root_id (int) – Object root ID.
timestamp (datetime.datetime or None, optional) – Timestamp to query IDs from. If None, assumes you want up to now.
timestamp_future (datetime.datetime or None, optional) – DEPRECATED name, use timestamp instead. Timestamp to suggest IDs from (note can be in the past relative to the root). By default, None.
- Returns
1d array with all latest successors.
- Return type
np.ndarray
- get_leaves(root_id, bounds=None, stop_layer: int = None) ndarray [source]
Get all supervoxels for a root ID.
- Parameters
root_id (int) – Root ID to query.
bounds (np.array or None, optional) – If specified, returns supervoxels within a 3x2 numpy array of bounds [[minx,maxx],[miny,maxy],[minz,maxz]]. If None, finds all supervoxels.
stop_layer (int, optional) – If specified, returns chunkedgraph nodes at layer stop_layer. The default is stop_layer=1 (supervoxels).
- Returns
Array of supervoxel IDs (or node ids if stop_layer>1).
- Return type
np.array of np.int64
- get_lineage_graph(root_id, timestamp_past=None, timestamp_future=None, as_nx_graph=False, exclude_links_to_future=False, exclude_links_to_past=False) Union[dict, DiGraph] [source]
Returns the lineage graph for a root ID, optionally cut off in the past or the future.
Each change in the chunked graph creates a new root ID for the object after that change. This function returns a graph of all root IDs for a given object, tracing the history of the object in terms of merges and splits.
- Parameters
root_id (int) – Object root ID.
timestamp_past (datetime.datetime or None, optional) – Cutoff for the lineage graph backwards in time. By default, None.
timestamp_future (datetime.datetime or None, optional) – Cutoff for the lineage graph going forwards in time. By default, None.
as_nx_graph (bool) – If True, a NetworkX graph is returned.
exclude_links_to_future (bool) – If True, links from nodes before timestamp_future to nodes after timestamp_future are removed. If False, any link with one node before and one node after timestamp_future is kept.
exclude_links_to_past (bool) – If True, links from nodes before timestamp_past to nodes after timestamp_past are removed. If False, any link with one node before and one node after timestamp_past is kept.
- Returns
dict – Dictionary describing the lineage graph and operations for the root ID. Not returned if as_nx_graph is True. The dictionary contains the following keys:
- ”directed”: bool
Whether the graph is directed.
- ”graph”: dict
Dictionary of graph attributes.
- ”links”: list of dict
Each element of the list is a dictionary describing an edge in the lineage graph as “source” and “target” keys.
- ”multigraph”: bool
Whether the graph is a multigraph.
- ”nodes”: list of dict
Each element of the list is a dictionary describing a node in the lineage graph, usually with “id”, “timestamp”, and “operation_id” keys.
nx.DiGraph – NetworkX directed graph of the lineage graph. Only returned if as_nx_graph is True.
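With as_nx_graph=False the returned dictionary can be traversed directly; for example, the leaf nodes (those that never appear as a “source” in any link) are the latest IDs in the lineage. A sketch over a made-up lineage dictionary shaped as described above (latest_ids is hypothetical):

```python
def latest_ids(lineage: dict) -> set:
    """Nodes with no outgoing edge, i.e. leaves of the lineage graph."""
    sources = {link["source"] for link in lineage["links"]}
    node_ids = {node["id"] for node in lineage["nodes"]}
    return node_ids - sources

lineage = {
    "directed": True,
    "multigraph": False,
    "graph": {},
    "nodes": [{"id": 1}, {"id": 2}, {"id": 3}],
    "links": [{"source": 1, "target": 2}, {"source": 1, "target": 3}],
}
print(sorted(latest_ids(lineage)))
# → [2, 3]
```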
- get_merge_log(root_id) list [source]
Get the merge log (splits and merges) for an object.
- Parameters
root_id (int) – Object root ID to look up.
- Returns
List of merge events in the history of the object.
- Return type
list
- get_oldest_timestamp() datetime [source]
Get the oldest timestamp in the database.
- Returns
Oldest timestamp in the database.
- Return type
datetime.datetime
- get_operation_details(operation_ids: Iterable[int]) dict [source]
Get the details of a list of operations.
- Parameters
operation_ids (Iterable of int) – List/array of operation IDs.
- Returns
A dict of dicts of operation info, keys are operation IDs (as strings), values are a dictionary of operation info for the operation. These dictionaries contain the following keys:
- ”added_edges”/”removed_edges”: list of list of int
List of edges added (if a merge) or removed (if a split) by this operation. Each edge is a list of two supervoxel IDs (source and target).
- ”roots”: list of int
List of root IDs that were created by this operation.
- ”sink_coords”: list of list of int
List of sink coordinates for this operation. The sink is one of the points placed by the user when specifying the operation. Each sink coordinate is a list of three integers (x, y, z), corresponding to spatial coordinates in segmentation voxel space.
- ”source_coords”: list of list of int
List of source coordinates for this operation. The source is one of the points placed by the user when specifying the operation. Each source coordinate is a list of three integers (x, y, z), corresponding to spatial coordinates in segmentation voxel space.
- ”timestamp”: str
Timestamp of the operation.
- ”user”: str
User ID number who performed the operation (as a string).
- Return type
dict of str to dict
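As a sketch of working with the return value (all IDs and values below are placeholders, shaped as documented above):

```python
# Illustrative return value of get_operation_details; keys are operation IDs
# as strings. Merges carry "added_edges" while splits carry "removed_edges".
details = {
    "1234": {
        "added_edges": [[88123, 88456]],
        "roots": [864691135000000001],
        "sink_coords": [[10, 20, 30]],
        "source_coords": [[11, 21, 31]],
        "timestamp": "2021-01-01 00:00:00",
        "user": "42",
    }
}

# Separate merges from splits by which edge key is present.
merge_ops = [op for op, info in details.items() if "added_edges" in info]
split_ops = [op for op, info in details.items() if "removed_edges" in info]
```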
- get_original_roots(root_id, timestamp_past=None) ndarray [source]
Returns root IDs that are the latest successors of a given root ID.
- Parameters
root_id (int) – Object root ID.
timestamp_past (datetime.datetime or None, optional) – Cutoff for the search going backwards in time. By default, None.
- Returns
1d array with all latest successors.
- Return type
np.ndarray
- get_past_ids(root_ids, timestamp_past=None, timestamp_future=None) dict [source]
For a set of root IDs, get the list of IDs at a past or future time point that could contain parts of the same object.
- Parameters
root_ids (Iterable of int) – Iterable of root IDs to query.
timestamp_past (datetime.datetime or None, optional) – Time of a point in the past for which to look up root ids. Default is None.
timestamp_future (datetime.datetime or None, optional) – Time of a point in the future for which to look up root ids. Not implemented on the server currently. Default is None.
- Returns
Dict with keys “future_id_map” and “past_id_map”. Each is a dict whose keys are the supplied root_ids and whose values are the list of related root IDs at timestamp_past/timestamp_future.
- Return type
dict
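A sketch of reading the returned mapping (root IDs below are placeholders):

```python
# Illustrative return value of get_past_ids: keys of each inner dict are the
# supplied root IDs, values are the related root IDs at the requested time.
id_maps = {
    "past_id_map": {864691135000000001: [864691134000000001, 864691134000000002]},
    "future_id_map": {864691135000000001: []},
}

# All past IDs that could contain parts of the queried object.
past_ids = id_maps["past_id_map"][864691135000000001]
```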
- get_root_id(supervoxel_id, timestamp=None, level2=False) int64 [source]
Get the root ID for a specified supervoxel.
- Parameters
supervoxel_id (int) – Supervoxel id value
timestamp (datetime.datetime, optional) – UTC datetime to specify the state of the chunkedgraph at which to query, by default None. If None, uses the current time.
- Returns
Root ID containing the supervoxel.
- Return type
np.int64
- get_root_timestamps(root_ids) ndarray [source]
Retrieves timestamps at which roots were created.
- Parameters
root_ids (Iterable of int) – Iterable of root IDs to query.
- Returns
Array of timestamps when root_ids were created.
- Return type
np.array of datetime.datetime
- get_roots(supervoxel_ids, timestamp=None, stop_layer=None) ndarray [source]
Get the root ID for a list of supervoxels.
- Parameters
supervoxel_ids (list or np.array of int) – Supervoxel IDs to look up.
timestamp (datetime.datetime, optional) – UTC datetime to specify the state of the chunkedgraph at which to query, by default None. If None, uses the current time.
stop_layer (int or None, optional) – If provided, looks up IDs only up to the given stop layer. Default is None.
- Returns
Root IDs containing each supervoxel.
- Return type
np.array of np.uint64
- get_subgraph(root_id, bounds) Tuple[ndarray, ndarray, ndarray] [source]
Get subgraph of root id within a bounding box.
- Parameters
root_id (int) – Root (or any node ID) of chunked graph to query.
bounds (np.array) – 3x2 bounding box (x,y,z) x (min,max) in chunked graph coordinates.
- Returns
np.array of np.int64 – Node IDs in the subgraph.
np.array of np.double – Affinities of edges in the subgraph.
np.array of np.int32 – Areas of nodes in the subgraph.
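The bounds argument is a 3x2 array; a minimal sketch of constructing one (values are placeholders in chunked graph coordinates):

```python
# 3x2 bounding box: rows are (x, y, z), columns are (min, max),
# in chunked graph coordinates as expected by get_subgraph.
bounds = [
    [120, 140],  # x_min, x_max
    [80, 100],   # y_min, y_max
    [40, 44],    # z_min, z_max
]

# Sanity-check that each axis has min <= max before querying.
valid = all(lo <= hi for lo, hi in bounds)
```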
- get_tabular_change_log(root_ids, filtered=True) dict [source]
Get a detailed changelog for neurons.
- Parameters
root_ids (list of int) – Object root IDs to look up.
filtered (bool) – Whether to filter the change log to only include splits and merges which affect the final state of the object (filtered=True), as opposed to including edit history for objects which at some point were split from the query objects in root_ids (filtered=False). Defaults to True.
- Returns
The keys are the root IDs, and the values are DataFrames with the following columns and datatypes:
- ”operation_id”: int
Identifier for the operation.
- ”timestamp”: int
Timestamp of the operation, provided in milliseconds. To convert to a datetime, use datetime.datetime.utcfromtimestamp(timestamp/1000).
- ”user_id”: int
User who performed the operation.
- ”before_root_ids”: list of int
Root IDs of objects that existed before the operation.
- ”after_root_ids”: list of int
Root IDs of objects created by the operation. Note that this only records the root id that was kept as part of the query object, so there will only be one in this list.
- ”is_merge”: bool
Whether the operation was a merge.
- ”user_name”: str
Name of the user who performed the operation.
- ”user_affiliation”: str
Affiliation of the user who performed the operation.
- Return type
dict of pd.DataFrame
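The millisecond values in the ”timestamp” column can be converted as documented above; a small sketch (the value is a placeholder):

```python
from datetime import datetime, timezone

# Placeholder value from the "timestamp" column (milliseconds since the epoch).
ts_ms = 1576265385000

# Equivalent to the documented datetime.utcfromtimestamp(ts_ms / 1000),
# but returns a timezone-aware UTC datetime.
ts = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
```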
- get_user_operations(user_id: int, timestamp_start: datetime, include_undo: bool = True, timestamp_end: datetime = None) DataFrame [source]
Get operation details for a user ID. Currently, this is only available to admins.
- Parameters
user_id (int) – User ID to query (use 0 for all users (admin only)).
timestamp_start (datetime.datetime, optional) – Timestamp to start filter (UTC).
include_undo (bool, optional) – Whether to include undos. Defaults to True.
timestamp_end (datetime.datetime, optional) – Timestamp to end filter (UTC). Defaults to now.
- Returns
DataFrame including the following columns:
- ”operation_id”: int
Identifier for the operation.
- ”timestamp”: datetime.datetime
Timestamp of the operation.
- ”user_id”: int
User who performed the operation.
- Return type
pd.DataFrame
- is_latest_roots(root_ids, timestamp=None) ndarray [source]
Check whether these root IDs are still a root at this timestamp.
- Parameters
root_ids (list or array of int) – Root IDs to check.
timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs are valid root IDs in the chunked graph. Defaults to None (assumes now).
- Returns
Array of whether these are valid root IDs.
- Return type
np.array of bool
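A common pattern is to filter out roots that are no longer current; a sketch on placeholder values (the boolean array stands in for what is_latest_roots would return):

```python
# Illustrative inputs: queried root IDs and the boolean array that
# is_latest_roots would return for them (all values are placeholders).
root_ids = [864691135000000001, 864691135000000002, 864691135000000003]
is_latest = [True, False, True]

# Root IDs that are no longer valid and need updating.
outdated = [r for r, latest in zip(root_ids, is_latest) if not latest]
```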
- is_valid_nodes(node_ids, start_timestamp=None, end_timestamp=None) ndarray [source]
Check whether nodes are valid for given timestamp range.
Valid is defined as existing in the chunked graph. This makes no statement about these IDs being roots, supervoxel or anything in-between. It also does not take into account whether a root ID has since been edited.
- Parameters
node_ids (list or array of int) – Node IDs to check.
start_timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs were valid after this timestamp. Defaults to None (assumes now).
end_timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs were valid before this timestamp. Defaults to None (assumes now).
- Returns
Array of whether these are valid IDs.
- Return type
np.array of bool
- level2_chunk_graph(root_id) list [source]
Get graph of level 2 chunks, the smallest agglomeration level above supervoxels.
- Parameters
root_id (int) – Root id of object
- Returns
Edge list for level 2 chunked graph. Each element of the list is an edge, and each edge is a list of two node IDs (source and target).
- Return type
list of list
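The returned edge list can be turned into an adjacency structure directly; a minimal sketch on a hypothetical edge list (real IDs would come from level2_chunk_graph):

```python
from collections import defaultdict

# Hypothetical edge list, shaped like the return value of level2_chunk_graph:
# each edge is a two-element list of level 2 node IDs.
edges = [[101, 102], [102, 103], [101, 103]]

# Build an undirected adjacency map from the edge list.
adjacency = defaultdict(set)
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)
```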
- preview_split(source_points, sink_points, root_id, source_supervoxels=None, sink_supervoxels=None, return_additional_ccs=False) Tuple[list, list, bool, list] [source]
Get supervoxel connected components from a preview multicut split.
- Parameters
source_points (array or list) – Nx3 list or array of 3d points in nm coordinates for source points (red).
sink_points (array or list) – Mx3 list or array of 3d points in nm coordinates for sink points (blue).
root_id (int) – Root ID of object to do split preview.
source_supervoxels (array, list or None, optional) – If providing source supervoxels, an N-length array of supervoxel IDs or Nones matched to source points. If None, treats as a full array of Nones. By default None.
sink_supervoxels (array, list or None, optional) – If providing sink supervoxels, an M-length array of supervoxel IDs or Nones matched to sink points. If None, treats as a full array of Nones. By default None.
return_additional_ccs (bool, optional) – If True, returns any additional connected components beyond the ones with source and sink points. In most situations, this can be ignored. By default, False.
- Returns
source_connected_component (list) – Supervoxel IDs in the component with the most source points.
sink_connected_component (list) – Supervoxel IDs in the component with the most sink points.
successful_split (bool) – True if the split worked.
other_connected_components (optional) (list of lists of int) – List of lists of supervoxel IDs for any other resulting connected components. Only returned if return_additional_ccs is True.
- remesh_level2_chunks(chunk_ids) None [source]
Submit specific level 2 chunks to be remeshed in case of a problem.
- Parameters
chunk_ids (list) – List of level 2 chunk IDs.
- property segmentation_info
Complete segmentation metadata
- suggest_latest_roots(root_id, timestamp=None, stop_layer=None, return_all=False, return_fraction_overlap=False)[source]
Suggest latest roots for a given root id, based on overlap of component chunk IDs. Note that edits change chunk IDs, and so this effectively measures the fraction of unchanged chunks at a given chunk layer, which sets the size scale of chunks. Higher layers are coarser.
- Parameters
root_id (int) – Root ID of the potentially outdated object.
timestamp (datetime, optional) – Datetime at which “latest” roots are being computed, by default None. If None, the current time is used. Note that this has to be a timestamp after the creation of the root_id.
stop_layer (int, optional) – Chunk level at which to compute overlap, by default None. No value will take the 4th from the top layer, which emphasizes speed and works well for larger objects. Lower values are slower but more fine-grained. Values under 2 (i.e. supervoxels) are not recommended except in extremely fine grained scenarios.
return_all (bool, optional) – If True, return all current IDs sorted from most overlap to least, by default False. If False, only the top is returned.
return_fraction_overlap (bool, optional) – If True, return all fractions sorted by most overlap to least, by default False. If False, only the top value is returned.
- property table_name
caveclient.emannotationschemas module
- caveclient.emannotationschemas.SchemaClient(server_address=None, auth_client=None, api_version='latest', max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
- class caveclient.emannotationschemas.SchemaClientLegacy(server_address, auth_header, api_version, endpoints, server_name, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
caveclient.endpoints module
caveclient.format_utils module
caveclient.frameworkclient module
- class caveclient.frameworkclient.CAVEclient(datastack_name=None, server_address=None, auth_token_file=None, auth_token_key=None, auth_token=None, global_only=False, max_retries=3, pool_maxsize=None, pool_block=None, desired_resolution=None, info_cache=None, write_server_cache=True)[source]
Bases:
object
- class caveclient.frameworkclient.CAVEclientFull(datastack_name=None, server_address=None, auth_token_file='~/.cloudvolume/secrets/cave-secret.json', auth_token_key='token', auth_token=None, max_retries=3, pool_maxsize=None, pool_block=None, desired_resolution=None, info_cache=None)[source]
Bases:
CAVEclientGlobal
A manager for all clients sharing common datastack and authentication information.
This client wraps all the other clients and keeps track of the things that need to be consistent across them. To instantiate a client:
client = CAVEclient(datastack_name='my_datastack', server_address='www.myserver.com', auth_token_file='~/.mysecrets/secrets.json')
Then * client.info is an InfoService client (see infoservice.InfoServiceClient) * client.state is a neuroglancer state client (see jsonservice.JSONService) * client.schema is an EM Annotation Schemas client (see emannotationschemas.SchemaClient) * client.chunkedgraph is a Chunkedgraph client (see chunkedgraph.ChunkedGraphClient) * client.annotation is an Annotation DB client (see annotationengine.AnnotationClient)
All subclients are loaded lazily and share the same datastack name, server address, and auth tokens where used.
- Parameters
datastack_name (str, optional) – Datastack name for the services. Almost all services need this and will not work if it is not passed.
server_address (str or None) – URL of the framework server. If None, chooses the default server global.daf-apis.com. Optional, defaults to None.
auth_token_file (str or None) – Path to a json file containing the auth token. If None, uses the default location. See Auth client documentation. Optional, defaults to None.
auth_token_key (str) – Dictionary key for the token in the JSON file. Optional, default is ‘token’.
auth_token (str or None) – Direct entry of an auth token. If None, uses the file arguments to find the token. Optional, default is None.
max_retries (int or None, optional) – Sets the default number of retries on failed requests. Optional, by default 3.
pool_maxsize (int or None, optional) – Sets the max number of threads in a requests pool, although this value will be exceeded if pool_block is set to False. Optional, uses requests defaults if None.
pool_block (bool or None, optional) – If True, prevents the number of threads in a requests pool from exceeding the max size. Optional, uses requests defaults (False) if None.
desired_resolution (Iterable[float] or None, optional) – If given, should be a list or array of the desired resolution you want queries returned in; useful for materialization queries.
info_cache (dict or None, optional) – Pre-computed info cache, bypassing the lookup of datastack info from the info service. Should only be used in cases where this information is cached and thus repetitive lookups can be avoided.
- property annotation
- property chunkedgraph
- property datastack_name
- property l2cache
- property materialize
- property state
- class caveclient.frameworkclient.CAVEclientGlobal(server_address=None, auth_token_file=None, auth_token_key=None, auth_token=None, max_retries=3, pool_maxsize=None, pool_block=None, info_cache=None)[source]
Bases:
object
A manager for all clients sharing common datastack and authentication information.
This client wraps all the other clients and keeps track of the things that need to be consistent across them. To instantiate a client:
client = CAVEclientGlobal(server_address='www.myserver.com', auth_token_file='~/.mysecrets/secrets.json')
Then * client.info is an InfoService client (see infoservice.InfoServiceClient) * client.auth handles authentication * client.state is a neuroglancer state client (see jsonservice.JSONService) * client.schema is an EM Annotation Schemas client (see emannotationschemas.SchemaClient)
All subclients are loaded lazily and share the same datastack name, server address, and auth tokens (where used).
- Parameters
server_address (str or None) – URL of the framework server. If None, chooses the default server global.daf-apis.com. Optional, defaults to None.
auth_token_file (str or None) – Path to a json file containing the auth token. If None, uses the default location. See Auth client documentation. Optional, defaults to None.
auth_token_key (str) – Dictionary key for the token in the JSON file. Optional, default is ‘token’.
auth_token (str or None) – Direct entry of an auth token. If None, uses the file arguments to find the token. Optional, default is None.
max_retries (int or None, optional) – Sets the default number of retries on failed requests. Optional, by default 3.
pool_maxsize (int or None, optional) – Sets the max number of threads in a requests pool, although this value will be exceeded if pool_block is set to False. Optional, uses requests defaults if None.
pool_block (bool or None, optional) – If True, prevents the number of threads in a requests pool from exceeding the max size. Optional, uses requests defaults (False) if None.
info_cache (dict or None, optional) – Pre-computed info cache, bypassing the lookup of datastack info from the info service. Should only be used in cases where this information is cached and thus repetitive lookups can be avoided.
- property annotation
- property auth
- change_auth(auth_token_file=None, auth_token_key=None, auth_token=None)[source]
Change the authentication token and reset services.
- Parameters
auth_token_file (str, optional) – New auth token json file path, by default None, which defaults to the existing state.
auth_token_key (str, optional) – New dictionary key under which the token is stored in the json file, by default None, which defaults to the existing state.
auth_token (str, optional) – Direct entry of a new token, by default None.
- property chunkedgraph
- property datastack_name
- property info: InfoServiceClient
- property schema
- property server_address
- property state
caveclient.infoservice module
- caveclient.infoservice.InfoServiceClient(server_address=None, datastack_name=None, auth_client=None, api_version='latest', verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, info_cache=None)[source]
- class caveclient.infoservice.InfoServiceClientV2(server_address, auth_header, api_version, endpoints, server_name, datastack_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, info_cache=None)[source]
Bases:
ClientBaseWithDatastack
- property aligned_volume_id
- property aligned_volume_name
- annotation_endpoint(datastack_name=None, use_stored=True)[source]
AnnotationEngine endpoint for a dataset.
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
- Returns
Location of the AnnotationEngine
- Return type
str
- get_aligned_volume_info(datastack_name: str = None, use_stored=True)[source]
Gets the info record for an aligned_volume
- Parameters
datastack_name (str, optional) – datastack_name to look up. If None, uses the one specified by the client. By default None
use_stored (bool, optional) – If True and the information has already been queried for that dataset, then uses the cached version. If False, re-queries the information. By default True
- Returns
The complete info record for the aligned_volume
- Return type
dict or None
- get_datastack_info(datastack_name=None, use_stored=True)[source]
Gets the info record for a datastack
- Parameters
datastack_name (str, optional) – datastack to look up. If None, uses the one specified by the client. By default None
use_stored (bool, optional) – If True and the information has already been queried for that datastack, then uses the cached version. If False, re-queries the information. By default True
- Returns
The complete info record for the datastack
- Return type
dict or None
- get_datastacks()[source]
Query which datastacks are available at the info service
- Returns
List of datastack names
- Return type
list
- get_datastacks_by_aligned_volume(aligned_volume: str = None)[source]
Lookup what datastacks are associated with this aligned volume
- Parameters
aligned_volume (str, optional) – aligned volume to lookup. Defaults to None.
- Raises
ValueError – if no aligned volume is specified
- Returns
A list of datastack names
- Return type
list
- image_cloudvolume(**kwargs)[source]
Generate a cloudvolume instance based on the image source, using authentication if needed and sensible default values for reading CAVE resources. By default, fill_missing is True and bounded is False. All keyword arguments are passed onto the CloudVolume initialization function, and defaults can be overridden.
Requires cloudvolume to be installed, which is not included by default.
- image_source(datastack_name=None, use_stored=True, format_for='raw')[source]
Cloud path to the imagery for the dataset
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “precomputed://gs://” type path is converted to a full https URL. If ‘neuroglancer’, a full https URL is converted to a “precomputed://gs://” type path.
- Returns
Formatted cloud path to the imagery
- Return type
str
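A sketch of the ‘cloudvolume’ formatting described above, assuming the standard Google Storage URL scheme (the path is a placeholder, not the library’s own code):

```python
# Placeholder "precomputed://gs://" path as stored in the InfoService.
raw = "precomputed://gs://my-bucket/imagery"

# format_for='cloudvolume' converts this kind of path into a full https URL
# (assumed Google Storage scheme; a sketch of the documented behavior).
prefix = "precomputed://gs://"
if raw.startswith(prefix):
    formatted = "https://storage.googleapis.com/" + raw[len(prefix):]
else:
    formatted = raw
```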
- segmentation_cloudvolume(use_client_secret=True, **kwargs)[source]
Generate a cloudvolume instance based on the segmentation source, using authentication if needed and sensible default values for reading CAVE resources. By default, fill_missing is True and bounded is False. All keyword arguments are passed onto the CloudVolume initialization function, and defaults can be overridden.
Requires cloudvolume to be installed, which is not included by default.
- segmentation_source(datastack_name=None, format_for='raw', use_stored=True)[source]
Cloud path to the chunkgraph-backed Graphene segmentation for a dataset
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “graphene://https://” type path is used. If ‘neuroglancer’, a “graphene://https://” type path is used, as needed by Neuroglancer.
- Returns
Formatted cloud path to the Graphene segmentation
- Return type
str
- synapse_segmentation_source(datastack_name=None, use_stored=True, format_for='raw')[source]
Cloud path to the synapse segmentation for a dataset
- Parameters
datastack_name (str or None, optional) – Name of the dataset to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “precomputed://gs://” type path is converted to a full https URL. If ‘neuroglancer’, a full https URL is converted to a “precomputed://gs://” type path.
- Returns
Formatted cloud path to the synapse segmentation
- Return type
str
- viewer_resolution(datastack_name=None, use_stored=True)[source]
Get the viewer resolution metadata for this datastack.
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
- Returns
voxel resolution as a len(3) np.array
- Return type
np.array
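The resolution is typically used to convert between viewer voxel coordinates and nanometers; a minimal sketch on placeholder values:

```python
# Hypothetical viewer resolution in nm per voxel, shaped like the
# return value of viewer_resolution() (values are placeholders).
resolution = [4.0, 4.0, 40.0]

# Convert a point from viewer voxel coordinates to nanometers.
point_voxels = [10000, 20000, 1000]
point_nm = [c * r for c, r in zip(point_voxels, resolution)]
```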
caveclient.jsonservice module
- caveclient.jsonservice.JSONService(server_address=None, auth_client=None, api_version='latest', ngl_url=None, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Client factory to interface with the JSON state service.
- Parameters
server_address (str, optional) – URL to the JSON state server. If None, set to the default global server address. By default None.
auth_client (An Auth client, optional) – An auth client with a token for the same global server, by default None
api_version (int or 'latest', optional) – Which endpoint API version to use or ‘latest’. By default, ‘latest’ tries to ask the server for which versions are available, if such functionality exists, or if not it defaults to the latest version for which there is a client. By default ‘latest’
ngl_url (str or None, optional) – Default neuroglancer deployment URL. Only used for V1 and later.
- class caveclient.jsonservice.JSONServiceV1(server_address, auth_header, api_version, endpoints, server_name, ngl_url, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- build_neuroglancer_url(state_id, ngl_url=None, target_site=None, static_url=False)[source]
Build a URL for a Neuroglancer deployment that will automatically retrieve specified state. If the datastack is specified, this is prepopulated from the info file field “viewer_site”. If no ngl_url is specified in either the function or the client, a fallback neuroglancer deployment is used.
- Parameters
state_id (int) – State id to retrieve
ngl_url (str) – Base url of a neuroglancer deployment. If None, defaults to the value for the datastack or the client. As a fallback, a default deployment is used.
target_site ('seunglab' or 'cave-explorer' or 'mainline' or None) – Set this to ‘seunglab’ for a seunglab deployment, or either ‘cave-explorer’/’mainline’ for a google main branch deployment. If None, checks the info field of the neuroglancer endpoint to determine which to use. Default is None.
static_url (bool) – If True, treats “state_id” as a static URL directly to the JSON and does not use the state service.
- Returns
The full URL requested
- Return type
str
- get_neuroglancer_info(ngl_url=None)[source]
Get the info field from a Neuroglancer deployment
- Parameters
ngl_url (str (optional)) – URL to a Neuroglancer deployment. If None, defaults to the value for the datastack or the client.
- Returns
JSON-formatted info field from the Neuroglancer deployment
- Return type
dict
- get_state_json(state_id)[source]
Download a Neuroglancer JSON state
- Parameters
state_id (int) – ID of a JSON state uploaded to the state service.
- Returns
JSON specifying a Neuroglancer state.
- Return type
dict
- property ngl_url
- save_state_json_local(json_state, filename, overwrite=False)[source]
Save a Neuroglancer JSON state to a JSON file locally.
- Parameters
json_state (dict) – Dict representation of a neuroglancer state
filename (str) – Filename to save the state to
overwrite (bool) – Whether to overwrite the file if it exists. Default False.
- Return type
None
- property state_service_endpoint
Endpoint URL for posting JSON state
- upload_state_json(json_state, state_id=None, timestamp=None)[source]
Upload a Neuroglancer JSON state
- Parameters
json_state (dict) – Dict representation of a neuroglancer state
state_id (int) – ID of a JSON state uploaded to the state service. Using a state_id is an admin feature.
timestamp (time.time) – Timestamp for json state date. Requires state_id.
- Returns
state_id of the uploaded JSON state
- Return type
int
caveclient.l2cache module
- caveclient.l2cache.L2CacheClient(server_address=None, table_name=None, auth_client=None, api_version='latest', max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, verify=True)[source]
- class caveclient.l2cache.L2CacheClientLegacy(server_address, auth_header, api_version, endpoints, server_name, table_name=None, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, verify=True)[source]
Bases:
ClientBase
- property attributes
- cache_metadata()[source]
Retrieves the metadata for the cache
- Returns
keys are attribute names, values are datatypes
- Return type
dict
- property default_url_mapping
- get_l2data(l2_ids, attributes=None)[source]
Gets the attribute statistics data for L2 IDs.
- Parameters
l2_ids (list or np.ndarray) – a list of level 2 ids
attributes (list, optional) – a list of attributes to retrieve. Defaults to None which will return all that are available. Available stats are [‘area_nm2’, ‘chunk_intersect_count’, ‘max_dt_nm’, ‘mean_dt_nm’, ‘pca’, ‘pca_val’, ‘rep_coord_nm’, ‘size_nm3’]. See docs for more description.
- Returns
keys are l2 ids, values are data
- Return type
dict
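A sketch of aggregating the returned statistics (IDs and values below are placeholders, shaped as documented above):

```python
# Illustrative return value of get_l2data(l2_ids, attributes=["size_nm3"]);
# keys are level 2 IDs, values hold the requested attributes (placeholders).
l2_data = {
    "161107737895240": {"size_nm3": 1200000},
    "161107737895241": {"size_nm3": 800000},
}

# Total volume across the queried level 2 chunks.
total_volume_nm3 = sum(v["size_nm3"] for v in l2_data.values())
```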
- has_cache(datastack_name=None)[source]
Checks if the l2 cache is available for the dataset
- Parameters
datastack_name (str, optional) – The name of the datastack to check, by default None (if None, uses the client’s datastack)
- Returns
True if the l2 cache is available, False otherwise
- Return type
bool
caveclient.materializationengine module
- caveclient.materializationengine.MaterializationClient(server_address, datastack_name=None, auth_client=None, cg_client=None, synapse_table=None, api_version='latest', version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, desired_resolution=None, over_client=None)[source]
Factory for returning MaterializationClient
- Parameters
server_address (str) – server_address to use to connect to (i.e. https://minniev1.microns-daf.com)
datastack_name (str) – Name of the datastack.
auth_client (AuthClient or None, optional) – Authentication client to use to connect to server. If None, do not use authentication.
api_version (str or int (default: latest)) – What version of the api to use, 0: Legacy client (i.e www.dynamicannotationframework.com) 2: new api version, (i.e. minniev1.microns-daf.com) ‘latest’: default to the most recent (current 2)
cg_client (caveclient.chunkedgraph.ChunkedGraphClient) – chunkedgraph client for live materializations
synapse_table (str) – default synapse table for queries
version (int or None, optional) – Default version to query. If None, defaults to the latest version.
desired_resolution (Iterable[float] or None, optional) – If given, should be a list or array of the desired resolution you want queries returned in; useful for materialization queries.
- Returns
A client for the materialization engine.
- Return type
MaterializatonClientV2
- class caveclient.materializationengine.MaterializatonClientV2(server_address, auth_header, api_version, endpoints, server_name, datastack_name, cg_client=None, synapse_table=None, version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, desired_resolution=None)[source]
Bases:
ClientBase
- property datastack_name
- get_annotation_count(table_name: str, datastack_name=None, version=None)[source]
Get number of annotations in a table
- Parameters
table_name (str) – Name of the table to query.
datastack_name (str or None, optional,) – Name of the datastack_name. If None, uses the one specified in the client.
version (int or None, optional) – the version to query, else get the tables in the most recent version
- Returns
number of annotations
- Return type
int
- get_table_metadata(table_name: str, datastack_name=None, version: int = None, log_warning: bool = True)[source]
Get metadata about a table
- Parameters
table_name (str) – Name of the table to query.
datastack_name – str or None, optional, Name of the datastack_name. If None, uses the one specified in the client.
version (int, optional) – version to get. If None, uses the one specified in the client.
log_warning (bool, optional) – whether to print out warnings to the logger. Defaults to True.
- Returns
metadata dictionary for table
- Return type
dict
- get_tables(datastack_name=None, version=None)[source]
Gets a list of table names for a datastack
- Parameters
datastack_name (str or None, optional) – Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client
version (int or None, optional) – the version to query, else get the tables in the most recent version
- Returns
List of table names
- Return type
list
- get_timestamp(version: int = None, datastack_name: str = None)[source]
Get datetime.datetime timestamp for a materialization version.
- Parameters
version (int or None, optional) – Materialization version, by default None. If None, defaults to the value set in the client.
datastack_name (str or None, optional) – Datastack name, by default None. If None, defaults to the value set in the client.
- Returns
Datetime when the materialization version was frozen.
- Return type
datetime.datetime
- get_version_metadata(version: int = None, datastack_name: str = None)[source]
Get metadata about a version.
- Parameters
version (int, optional) – version number to get metadata about. Defaults to client default version.
datastack_name (str, optional) – datastack to query. Defaults to client default datastack.
- get_versions(datastack_name=None, expired=False)[source]
get versions available
- Parameters
datastack_name (str, optional) – datastack to query. If None, uses the one specified in the client.
expired (bool, optional) – whether to include expired versions. Defaults to False.
- get_versions_metadata(datastack_name=None, expired=False)[source]
get the metadata for all the versions that are presently available and valid
- Parameters
datastack_name (str, optional) – datastack to query. If None, defaults to the value set in the client.
expired (bool, optional) – whether to include expired versions. Defaults to False.
- Returns
a list of metadata dictionaries
- Return type
list[dict]
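A common pattern is to pick the most recent valid version from the returned metadata. The dictionaries below are illustrative stand-ins for real server output, which carries more fields.

```python
# Stand-in for client.materialize.get_versions_metadata() output (illustrative).
versions_md = [
    {"version": 650, "valid": False},
    {"version": 660, "valid": True},
    {"version": 661, "valid": True},
]
# Pick the newest version still marked valid.
latest = max(v["version"] for v in versions_md if v["valid"])
print(latest)  # 661
```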
- property homepage
- ingest_annotation_table(table_name: str, datastack_name: str = None)[source]
Trigger supervoxel lookup and root ID lookup of new annotations in a table.
- Parameters
table_name (str) – table to trigger
datastack_name (str, optional) – datastack to trigger it in. Defaults to what is set in the client.
- Returns
status code of response from server
- Return type
response
- join_query(tables, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, suffixes: list = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, metadata: bool = True, desired_resolution: Iterable = None, random_sample: int = None)[source]
generic query on materialization tables
- Args:
- tables: list of lists of length 2, or str
- each inner list has two entries: the first is a table name, the second
is the column used for the join
- filter_in_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are allowed entries. Defaults to None.
- filter_out_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are not allowed entries. Defaults to None.
- filter_equal_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are specified entry. Defaults to None.
- filter_spatial_dict (dict of dicts, optional):
outer layer: keys are table names; inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y, min_z], [max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset.
Defaults to None
- filter_regex_dict (dict of dicts, optional):
outer layer: keys are table names: inner layer: keys are column names, values are regex strings Defaults to None
- select_columns (dict of lists of str, optional): keys are table names, values are the list of columns from that table.
Defaults to None, which selects all columns. Will be passed to the server as select_column_maps. Passing a plain list is passed as select_columns, which is deprecated.
- offset (int, optional): result offset to use. Defaults to None,
which will only return top K results.
- limit (int, optional): maximum results to return (server will set upper limit, see get_server_config)
- suffixes (dict, optional): suffixes to use for duplicate columns; keys are table names, values are the suffix
- datastack_name (str, optional): datastack to query.
If None, defaults to the one specified in the client.
- return_df (bool, optional): whether to return as a dataframe
default True, if False, data is returned as json (slower)
- split_positions (bool, optional): whether to break position columns into x,y,z columns
default False, if False data is returned as one column with [x,y,z] array (slower)
- materialization_version (int, optional): version to query.
If None defaults to one specified in client.
- metadata (bool, optional): toggle to return metadata
If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
- desired_resolution (Iterable, optional):
What resolution to convert position columns to. Defaults to None, which will use the client defaults.
- random_sample (int, optional): if given, will do a tablesample of the table to return that many annotations
- Returns
a pandas dataframe of results of query
- Return type
pd.DataFrame
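A minimal sketch of assembling join_query arguments; the table and column names below are hypothetical, and the client call itself is commented out because it requires a live server.

```python
# Each inner list is [table_name, join_column] (hypothetical names).
tables = [
    ["synapse_table", "post_pt_root_id"],
    ["cell_type_table", "pt_root_id"],
]
# Restrict the join to one (made-up) root ID on the cell-type side.
filter_in_dict = {"cell_type_table": {"pt_root_id": [648518346349539896]}}
# Suffixes disambiguate duplicate column names across the joined tables.
suffixes = {"synapse_table": "_syn", "cell_type_table": "_ct"}
# df = client.materialize.join_query(
#     tables,
#     filter_in_dict=filter_in_dict,
#     suffixes=suffixes,
# )
```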
- live_live_query(table: str, timestamp: datetime, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, metadata: bool = True, suffixes: dict = None, desired_resolution: Iterable = None, allow_missing_lookups: bool = False, random_sample: int = None)[source]
Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selections to tables, as it is designed to test a more general endpoint that should eventually support complex joins.
- Parameters
table (str) – principal table to query
timestamp (datetime) – timestamp to use for querying
joins (list) – a list of joins, where each join is a list of [table1,column1, table2, column2]
filter_in_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to accept. Defaults to None.
filter_out_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to reject. Defaults to None.
filter_equal_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values to equate. Defaults to None.
filter_spatial_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds. Defaults to None.
filter_regex_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of regex strings. Defaults to None.
select_columns (dict, optional) – a dictionary with tables as keys, values are lists of columns. Defaults to None.
offset (int, optional) – value to offset query by. Defaults to None.
limit (int, optional) – limit of query. Defaults to None.
datastack_name (str, optional) – datastack to query. Defaults to set by client.
split_positions (bool, optional) – whether to split positions into separate columns; True is faster. Defaults to False.
metadata (bool, optional) – whether to attach metadata to dataframe. Defaults to True.
suffixes (dict, optional) – what suffixes to use on joins, keys are table_names, values are suffixes. Defaults to None.
desired_resolution (Iterable, optional) – What resolution to convert position columns to. Defaults to None, which will use the client defaults.
allow_missing_lookups (bool, optional) – If there are annotations without supervoxels and rootids yet, allow results. Defaults to False.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Example
live_live_query("table_name", datetime.datetime.now(datetime.timezone.utc),
joins=[["table_name", "table_column", "joined_table", "joined_column"],
["joined_table", "joincol2", "third_table", "joincol_third"]],
suffixes={"table_name": "suffix1", "joined_table": "suffix2", "third_table": "suffix3"},
select_columns={"table_name": ["column", "names"], "joined_table": ["joined_column"]},
filter_in_dict={"table_name": {"column_name": [included, values]}},
filter_out_dict={"table_name": {"column_name": [excluded, values]}},
filter_equal_dict={"table_name": {"column_name": value}},
filter_spatial_dict={"table_name": {"column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]}},
filter_regex_dict={"table_name": {"column_name": "regex_string"}})
- Returns
result of query
- Return type
pd.DataFrame
- live_query(table: str, timestamp: datetime, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, post_filter: bool = True, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, random_sample: int = None)[source]
generic query on materialization tables
- Parameters
table (str) – table to query
timestamp (datetime.datetime) – time to materialize (in utc) pass datetime.datetime.now(datetime.timezone.utc) for present time
filter_in_dict (dict , optional) – keys are column names, values are allowed entries. Defaults to None.
filter_out_dict (dict, optional) – keys are column names, values are not allowed entries. Defaults to None.
filter_equal_dict (dict, optional) – inner layer: keys are column names, values are specified entry. Defaults to None.
filter_spatial_dict (dict, optional) – keys are column names, values are bounding boxes as [[min_x, min_y, min_z], [max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset. Defaults to None
filter_regex_dict (dict, optional) – inner layer: keys are column names, values are regex strings
offset (int, optional) – offset in query result
limit (int, optional) – maximum results to return (server will set upper limit, see get_server_config)
select_columns (list of str, optional) – columns to select. Defaults to None.
suffixes (list[str], optional) – suffixes to use on duplicate columns
datastack_name (str, optional) – datastack to query. If None defaults to one specified in client.
split_positions (bool, optional) – whether to break position columns into x,y,z columns default False, if False data is returned as one column with [x,y,z] array (slower)
post_filter (bool, optional) – whether to filter the result down based on the specified filters. If False, the query returns present root_ids in the root_id columns, but the rows reflect the filters translated into their past IDs. For example, if a cell had a false merger split off since the last materialization, annotations on that incorrect portion of the cell will be included when this is False, but filtered out when this is True. (Default = True)
metadata (bool, optional) – toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
merge_reference (bool, optional) – toggle to automatically join reference tables. If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
desired_resolution (Iterable[float], optional) – desired resolution for all returned spatial points. If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
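A hedged sketch of a live query at the present moment; the table name and filter column are hypothetical, and the call is commented out since it requires a server.

```python
import datetime

# Live queries are evaluated at an explicit UTC timestamp.
now = datetime.datetime.now(datetime.timezone.utc)
filter_equal_dict = {"cell_type": "pyramidal"}  # hypothetical column/value
# df = client.materialize.live_query(
#     "cell_annotations",          # hypothetical table name
#     timestamp=now,
#     filter_equal_dict=filter_equal_dict,
#     split_positions=True,
# )
```

Passing a timezone-aware timestamp avoids ambiguity about what moment is being materialized.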
- lookup_supervoxel_ids(table_name: str, annotation_ids: list = None, datastack_name: str = None)[source]
Trigger supervoxel lookups of new annotations in a table.
- Parameters
table_name (str) – table to trigger
annotation_ids – (list, optional): list of annotation ids to lookup. Default is None, which will trigger lookup of entire table.
datastack_name (str, optional) – datastack to trigger it. Defaults to what is set in client.
- Returns
status code of response from server
- Return type
response
- map_filters(filters, timestamp, timestamp_past)[source]
- translate a list of filter dictionaries
from a point in the future to a point in the past
- Parameters
filters (list[dict]) – filter dictionaries to translate
timestamp (datetime.datetime) – timestamp the filters refer to
timestamp_past (datetime.datetime) – past timestamp to translate the filters to
- Returns
filter dictionaries with root IDs translated to the past timestamp
- Return type
list[dict]
- most_recent_version(datastack_name=None)[source]
get the most recent version of materialization for this datastack name
- Parameters
datastack_name (str, optional) – datastack name to find the most recent materialization of. If None, uses the one specified in the client.
- query_table(table: str, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, timestamp: datetime = None, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, get_counts: bool = False, random_sample: int = None)[source]
generic query on materialization tables
- Parameters
table (str) – table to query
filter_in_dict (dict , optional) – keys are column names, values are allowed entries. Defaults to None.
filter_out_dict (dict, optional) – keys are column names, values are not allowed entries. Defaults to None.
filter_equal_dict (dict, optional) – inner layer: keys are column names, values are specified entry. Defaults to None.
filter_spatial_dict (dict, optional) –
- inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y,min_z],[max_x, max_y, max_z]] Expressed in units of the voxel_resolution of this dataset.
filter_regex_dict (dict, optional) – inner layer: keys are column names, values are regex strings
offset (int, optional) – offset in query result
limit (int, optional) – maximum results to return (server will set upper limit, see get_server_config)
select_columns (list of str, optional) – columns to select. Defaults to None.
suffixes (list[str], optional) – suffixes to use on duplicate columns
datastack_name (str, optional) – datastack to query. If None defaults to one specified in client.
return_df (bool, optional) – whether to return as a dataframe default True, if False, data is returned as json (slower)
split_positions (bool, optional) – whether to break position columns into x,y,z columns default False, if False data is returned as one column with [x,y,z] array (slower)
materialization_version (int, optional) – version to query. If None defaults to one specified in client.
timestamp (datetime.datetime, optional) – timestamp to query. If passed, will do a live query. Error if also passing a materialization_version
metadata (bool, optional) – toggle to return metadata (default True). If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
merge_reference (bool, optional) – toggle to automatically join reference tables. If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
desired_resolution (Iterable[float], optional) – desired resolution for all returned spatial points. If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
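A sketch of combining ID and spatial filters for query_table; the table name, column names, and root ID are hypothetical, and the client call is commented out since it needs a live server.

```python
# Spatial filters take a bounding box in the datastack's voxel resolution.
bounding_box = [[0, 0, 0], [1000, 1000, 500]]  # [[min_x, min_y, min_z], [max_x, max_y, max_z]]
filter_spatial_dict = {"pt_position": bounding_box}        # hypothetical column
filter_in_dict = {"pt_root_id": [648518346349539896]}      # hypothetical root ID
# df = client.materialize.query_table(
#     "cell_annotations",               # hypothetical table name
#     filter_in_dict=filter_in_dict,
#     filter_spatial_dict=filter_spatial_dict,
#     desired_resolution=[4, 4, 40],    # convert returned points to this resolution
#     split_positions=True,             # x/y/z as separate columns (faster)
# )
```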
- synapse_query(pre_ids: Union[int, Iterable, ndarray] = None, post_ids: Union[int, Iterable, ndarray] = None, bounding_box=None, bounding_box_column: str = 'post_pt_position', timestamp: datetime = None, remove_autapses: bool = True, include_zeros: bool = True, limit: int = None, offset: int = None, split_positions: bool = False, desired_resolution: Iterable[float] = None, materialization_version: int = None, synapse_table: str = None, datastack_name: str = None, metadata: bool = True)[source]
Convenience method for querying synapses. Will use the synapse table specified in the info service by default. It will also remove autapses by default. NOTE: This is not designed to allow querying of the entire synapse table. A query with no filters will return only a limited number of rows (configured by the server) and will do so in a non-deterministic fashion. Please contact your dataset administrator if you want access to the entire table.
- Parameters
pre_ids (int or Iterable, optional) – presynaptic cell(s) to query. Defaults to None.
post_ids (int or Iterable, optional) – postsynaptic cell(s) to query. Defaults to None.
timestamp (datetime.datetime, optional) – timestamp to query. If passed, recalculate the query at that timestamp; do not pass together with materialization_version
bounding_box – [[min_x, min_y, min_z],[max_x, max_y, max_z]] bounding box to filter synapse locations. Expressed in units of the voxel_resolution of this dataset (optional)
bounding_box_column (str, optional) – which synapse location column to filter by (Default to “post_pt_position”)
remove_autapses (bool, optional) – post-hoc filter out synapses. Defaults to True.
include_zeros (bool, optional) – whether to include synapses to/from id=0 (out of segmentation). Defaults to True.
limit (int, optional) – number of synapses to limit, Defaults to None (server side limit applies)
offset (int, optional) – number of synapses to offset query, Defaults to None (no offset).
split_positions (bool, optional) – whether to return positions as separate x, y, z columns (faster). Defaults to False.
desired_resolution (Iterable[float] or None, optional) – If given, a list or array of the desired resolution for returned points; useful for materialization queries.
synapse_table (str, optional) – synapse table to query. If None, defaults to self.synapse_table.
datastack_name – (str, optional): datastack to query
materialization_version (int, optional) – version to query. defaults to self.materialization_version if not specified
metadata (bool, optional) – toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
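For instance, querying synapses onto one postsynaptic cell within a bounding box might look like the sketch below; the root ID and bounds are hypothetical, and the call is commented out since it needs a server.

```python
# Sketch: synapses onto one postsynaptic cell, restricted to a bounding box.
post_id = 648518346349539896             # hypothetical root ID
bbox = [[100, 100, 10], [200, 200, 20]]  # voxel units of this datastack
# syn_df = client.materialize.synapse_query(
#     post_ids=post_id,
#     bounding_box=bbox,
#     bounding_box_column="post_pt_position",
#     remove_autapses=True,   # drop self-synapses (the default)
# )
```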
- property tables
- property version
- property views
- class caveclient.materializationengine.MaterializatonClientV3(*args, **kwargs)[source]
Bases:
MaterializatonClientV2
- get_tables_metadata(datastack_name=None, version: int = None, log_warning: bool = True)[source]
Get metadata about all tables in a datastack
- Parameters
datastack_name (str or None, optional) – Name of the datastack. If None, uses the one specified in the client.
version (int, optional) – version to get. If None, uses the one specified in the client.
log_warning (bool, optional) – whether to print out warnings to the logger. Defaults to True.
- Returns
metadata dictionary for table
- Return type
dict
- get_unique_string_values(table: str, datastack_name: str = None)[source]
get unique string values for a table
- Parameters
table (str) – table to query
datastack_name (str, optional) – datastack to query. If None, defaults to the one specified in the client.
- Returns
a dictionary of column names and unique values
- Return type
dict[str]
- get_view_metadata(view_name: str, materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get metadata for a view
- Parameters
view_name (str) – name of view to query
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
metadata of view
- Return type
dict
- get_view_schema(view_name: str, materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get schema for a view
- Parameters
view_name (str) – name of view to query
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
schema of view
- Return type
dict
- get_view_schemas(materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get schemas for all views
- Parameters
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
schemas of all views
- Return type
dict
- get_views(version: int = None, datastack_name: str = None)[source]
get all available views for a version
- Parameters
version (int, optional) – version to query. Defaults to None. (will use version set by client)
datastack_name (str, optional) – datastack to query. Defaults to None. (will use datastack set by client)
- Returns
a list of views
- Return type
list[str]
- live_live_query(table: str, timestamp: datetime, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, metadata: bool = True, suffixes: dict = None, desired_resolution: Iterable = None, allow_missing_lookups: bool = False, allow_invalid_root_ids: bool = False, random_sample: int = None)[source]
Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selections to tables, as it is designed to test a more general endpoint that should eventually support complex joins.
- Parameters
table (str) – principal table to query
timestamp (datetime) – timestamp to use for querying
joins (list) – a list of joins, where each join is a list of [table1,column1, table2, column2]
filter_in_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to accept. Defaults to None.
filter_out_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to reject. Defaults to None.
filter_equal_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values to equate. Defaults to None.
filter_spatial_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds. Defaults to None.
filter_regex_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of regex strings. Defaults to None.
select_columns (dict, optional) – a dictionary with tables as keys, values are lists of columns. Defaults to None.
offset (int, optional) – value to offset query by. Defaults to None.
limit (int, optional) – limit of query. Defaults to None.
datastack_name (str, optional) – datastack to query. Defaults to set by client.
split_positions (bool, optional) – whether to split positions into separate columns; True is faster. Defaults to False.
metadata (bool, optional) – whether to attach metadata to dataframe. Defaults to True.
suffixes (dict, optional) – what suffixes to use on joins, keys are table_names, values are suffixes. Defaults to None.
desired_resolution (Iterable, optional) – What resolution to convert position columns to. Defaults to None, which will use the client defaults.
allow_missing_lookups (bool, optional) – If there are annotations without supervoxels and rootids yet, allow results. Defaults to False.
allow_invalid_root_ids (bool, optional) – If True, ignore root ids not valid at the given timestamp, otherwise raise an Error. Defaults to False.
random_sample (int, optional) – If given, will do a tablesample of the table to return that many annotations
Example
live_live_query("table_name", datetime.datetime.now(datetime.timezone.utc),
joins=[["table_name", "table_column", "joined_table", "joined_column"],
["joined_table", "joincol2", "third_table", "joincol_third"]],
suffixes={"table_name": "suffix1", "joined_table": "suffix2", "third_table": "suffix3"},
select_columns={"table_name": ["column", "names"], "joined_table": ["joined_column"]},
filter_in_dict={"table_name": {"column_name": [included, values]}},
filter_out_dict={"table_name": {"column_name": [excluded, values]}},
filter_equal_dict={"table_name": {"column_name": value}},
filter_spatial_dict={"table_name": {"column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]}},
filter_regex_dict={"table_name": {"column_name": "regex"}})
- Returns
result of query
- Return type
pd.DataFrame
- query_view(view_name: str, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, get_counts: bool = False, random_sample: int = None)[source]
generic query on a view
Args: view_name (str): view to query
- filter_in_dict (dict , optional):
keys are column names, values are allowed entries. Defaults to None.
- filter_out_dict (dict, optional):
keys are column names, values are not allowed entries. Defaults to None.
- filter_equal_dict (dict, optional):
inner layer: keys are column names, values are specified entry. Defaults to None.
- filter_spatial_dict (dict, optional):
- inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y,min_z],[max_x, max_y, max_z]] Expressed in units of the voxel_resolution of this dataset.
- filter_regex_dict (dict, optional):
inner layer: keys are column names, values are regex strings.
- offset (int, optional): offset in query result. Defaults to None,
which will only return top K results.
- limit (int, optional): maximum results to return (server will set upper limit, see get_server_config)
- select_columns (list of str, optional): columns to select. Defaults to None.
- suffixes (list[str], optional): suffixes to use on duplicate columns
- datastack_name (str, optional): datastack to query.
If None defaults to one specified in client.
- return_df (bool, optional): whether to return as a dataframe
default True, if False, data is returned as json (slower)
- split_positions (bool, optional): whether to break position columns into x,y,z columns
default False, if False data is returned as one column with [x,y,z] array (slower)
- materialization_version (int, optional): version to query.
If None defaults to one specified in client.
- metadata (bool, optional): toggle to return metadata (default True)
If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
- merge_reference (bool, optional): toggle to automatically join reference tables
If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
- desired_resolution (Iterable[float], optional): desired resolution for all returned spatial points
If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample: (int, optional) : if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
- caveclient.materializationengine.concatenate_position_columns(df, inplace=False)[source]
function to take a dataframe with x, y, z position columns and replace them with one column per position containing an xyz numpy array. Edits occur in place if inplace is True.
- Parameters
df (pd.DataFrame) – dataframe to alter
inplace (bool) – whether to perform edits in place
- Returns
the altered dataframe
- Return type
pd.DataFrame
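A rough pandas equivalent of what concatenate_position_columns does (a sketch, not the library's actual implementation): collapse split x/y/z columns back into a single column of [x, y, z] values.

```python
import pandas as pd

# Example dataframe with split position columns.
df = pd.DataFrame({
    "pt_position_x": [1, 4],
    "pt_position_y": [2, 5],
    "pt_position_z": [3, 6],
})
cols = ["pt_position_x", "pt_position_y", "pt_position_z"]
# Collapse the three columns into one column of [x, y, z] lists,
# then drop the originals.
df["pt_position"] = df[cols].values.tolist()
df = df.drop(columns=cols)
print(df["pt_position"].tolist())  # [[1, 2, 3], [4, 5, 6]]
```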
- caveclient.materializationengine.convert_position_columns(df, given_resolution, desired_resolution)[source]
function to take a dataframe with x,y,z position columns and convert them to the desired resolution from the given resolution
- Parameters
df (pd.DataFrame) – dataframe to alter
given_resolution (Iterable[float]) – what the given resolution is
desired_resolution (Iterable[float]) – what the desired resolution is
- Returns
the converted dataframe
- Return type
pd.DataFrame
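A sketch of the per-axis rescaling convert_position_columns performs (an approximation; the resolutions shown are illustrative): points are multiplied by the given resolution and divided by the desired one.

```python
import numpy as np
import pandas as pd

given_resolution = np.array([8, 8, 40])    # nm per voxel, as stored
desired_resolution = np.array([4, 4, 40])  # nm per voxel, as requested
df = pd.DataFrame({"pt_position": [np.array([10, 20, 5])]})
# Rescale each point: multiply by the stored resolution, divide by the
# desired one, axis by axis.
df["pt_position"] = df["pt_position"].apply(
    lambda p: p * given_resolution / desired_resolution
)
print(df["pt_position"].iloc[0])  # scaled to [20, 40, 5]
```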
caveclient.session_config module
- caveclient.session_config.patch_session(session, max_retries=None, pool_block=None, pool_maxsize=None)[source]
Patch session to configure retry and poolsize options
- Parameters
session (requests session) – Session to modify
max_retries (Int or None, optional) – Set the number of retries per request, by default None. If None, defaults to requests package default.
pool_block (Bool or None, optional) – If True, restricts pool of threads to max size, by default None. If None, defaults to requests package default.
pool_maxsize (Int or None, optional) – Sets the max number of threads in the pool, by default None. If None, defaults to requests package default.
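Approximately what patch_session does, expressed directly against the requests API (a sketch; patch_session's exact internals may differ): mount an HTTPAdapter with custom retry and connection-pool settings onto an existing session.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
adapter = HTTPAdapter(
    max_retries=Retry(total=3),  # retry each request up to 3 times
    pool_maxsize=20,             # up to 20 pooled connections per host
    pool_block=True,             # block rather than discard when the pool is full
)
# Mount the configured adapter for both schemes.
session.mount("http://", adapter)
session.mount("https://", adapter)
```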