caveclient package
Subpackages
Submodules
caveclient.annotationengine module
- caveclient.annotationengine.AnnotationClient(server_address, dataset_name=None, aligned_volume_name=None, auth_client=None, api_version='latest', verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Factory for returning an AnnotationClient.
- Parameters
server_address (str) – Server address to connect to (e.g. https://minniev1.microns-daf.com)
aligned_volume_name (str or None, optional) – Name of the aligned volume to use.
auth_client (AuthClient or None, optional) – Authentication client to use to connect to server. If None, do not use authentication.
api_version (str or int (default: ‘latest’)) – Which version of the API to use. 0: legacy client (e.g. www.dynamicannotationframework.com); 2: new API version (e.g. minniev1.microns-daf.com); ‘latest’: defaults to the most recent version (currently 2).
verify (bool (default: True)) – Whether to verify HTTPS certificates.
max_retries (int or None, optional) – Set the number of retries per request, by default None. If None, defaults to the requests package default.
pool_maxsize (int or None, optional) – Sets the max number of threads in the pool, by default None. If None, defaults to the requests package default.
pool_block (bool or None, optional) – If True, restricts the pool of threads to the max size, by default None. If None, defaults to the requests package default.
over_client – Client to overwrite configuration with.
- Returns
Client to interface with the annotation service for the given server and API version.
- Return type
AnnotationClientV2
- class caveclient.annotationengine.AnnotationClientV2(server_address, auth_header, api_version, endpoints, server_name, aligned_volume_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, schema_client=None)[source]
Bases:
ClientBase
- property aligned_volume_name
- create_table(table_name: str, schema_name: str, description: str, voxel_resolution: List[float], reference_table: str = None, track_target_id_updates: bool = None, flat_segmentation_source: str = None, user_id: int = None, aligned_volume_name: str = None, write_permission: str = 'PRIVATE', read_permission: str = 'PUBLIC', notice_text: str = None)[source]
Creates a new data table based on an existing schema
- Parameters
table_name (str) – Name of the new table. Cannot be the same as an existing table
schema_name (str) – Name of the schema for the new table.
description (str) – Human-readable description of what is in the table. Should include information about who generated the table, what data it covers, how it should be interpreted, and who to talk to if you want to use it. An example: a manual synapse table to detect chandelier synapses on 81 PyC cells with complete AISs [created by Agnes - agnesb@alleninstitute.org, uploaded by Forrest]
voxel_resolution (list[float]) – Voxel resolution points will be uploaded in, typically nm. For example, [1,1,1] means points are in nanometers, and [4,4,40] means 4nm, 4nm, 40nm voxels.
reference_table (str or None) – If the schema you are using is a reference schema (meaning it is an annotation of another annotation), then you need to specify the target table those annotations are in.
track_target_id_updates (bool or None) – Indicates whether to automatically update the reference table’s foreign key if the target annotation table row is updated.
flat_segmentation_source (str or None) – The source of a flat segmentation that corresponds to this table, e.g. precomputed://gs://mybucket/this_tables_annotation
user_id (int) – If you are uploading this schema on someone else’s behalf and you want to link this table with their ID, you can specify it here. Otherwise, the table will be created with your user ID in the user_id column.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
write_permission (str, optional) – What permissions to give the table for writing. One of PRIVATE: only you can write to this table (default); GROUP: only members that share a group with you can write (excluding some groups); PUBLIC: anyone can write to this table. Note that all data is logged, and deletes are done by marking rows as deleted, so all data is always recoverable.
read_permission (str, optional) – What permissions to give the table for reading. One of PRIVATE: only you can read this table (intended to be used for sorting out bugs); GROUP: only members that share a group with you can read (intended for within-group vetting); PUBLIC: anyone with permission to read this datastack can read this data (default).
notice_text (str, optional) – Text the user will see when querying this table. Can be used to warn users of flaws or uncertainty in the data, or to advertise citations that should be used with this table. Defaults to None (no text). If you want to remove existing text, send an empty string.
- Returns
Response JSON
- Return type
json
Examples
Basic annotation table:
description = "Some description about the table"
voxel_res = [4, 4, 40]
client.create_table("some_synapse_table", "synapse", description, voxel_res)
- delete_annotation(table_name: str, annotation_ids: dict, aligned_volume_name: str = None)[source]
Delete one or more annotations in a table. Annotations that are deleted are recorded as ‘non-valid’ but are not physically removed from the table.
- Parameters
table_name (str) – Name of the table from which annotations will be deleted
annotation_ids (int or iterable) – ID or IDs of the annotation(s) to delete
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON: a list of the annotation IDs marked as deleted.
- Return type
json
- delete_table(table_name: str, aligned_volume_name: str = None)[source]
Marks a table for deletion. Requires super admin privileges.
- Parameters
table_name (str) – Name of the table to mark for deletion
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- get_annotation(table_name: str, annotation_ids: int, aligned_volume_name: str = None)[source]
Retrieve an annotation or annotations by id(s) and table name.
- Parameters
table_name (str) – Name of the table
annotation_ids (int or iterable) – ID or IDs of the annotation(s) to retrieve
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Annotation data
- Return type
list
- get_annotation_count(table_name: str, aligned_volume_name: str = None)[source]
Get number of annotations in a table
- Parameters
table_name (str) – Name of the table to count annotations in
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
number of annotations
- Return type
int
- get_table_metadata(table_name: str, aligned_volume_name: str = None)[source]
Get metadata about a table
- Parameters
table_name (str) – Name of the table to get metadata for
aligned_volume_name (str or None, optional,) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
metadata about table
- Return type
json
- get_tables(aligned_volume_name: str = None)[source]
Gets a list of table names for an aligned volume.
- Parameters
aligned_volume_name (str or None, optional) – Name of the aligned_volume, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client
- Returns
List of table names
- Return type
list
- post_annotation(table_name: str, data: dict, aligned_volume_name: str = None)[source]
Post one or more new annotations to a table in the AnnotationEngine. All inserted annotations will be marked as ‘valid’. To invalidate annotations refer to ‘update_annotation’, ‘update_annotation_df’ and ‘delete_annotation’ methods.
- Parameters
table_name (str) – Name of the table where annotations will be added
data (dict or list,) – A list of (or a single) dict of schematized annotation data matching the target table.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- post_annotation_df(table_name: str, df: DataFrame, position_columns: Iterable[str], aligned_volume_name=None)[source]
Post one or more new annotations to a table in the AnnotationEngine. All inserted annotations will be marked as ‘valid’. To invalidate annotations see ‘update_annotation’, ‘update_annotation_df’ and ‘delete_annotation’ methods.
- Parameters
table_name (str) – Name of the table where annotations will be added
df (pd.DataFrame) – A pandas dataframe containing the annotations. Columns should be fields in schema, position columns need to be called out in position_columns argument.
position_columns (Iterable[str] or Mapping[str, str] or None) – If None, will look for all columns with ‘X_position’ in the name and assume they go in fields called “X”. If an iterable, assumes each column given ends in _position (e.g. [‘pt_position’] if ‘pt’ is the name of the position field in the schema). If a mapping, keys are names of columns in the dataframe and values are the names of the fields (e.g. {‘pt_column’: ‘pt’} would be correct if you had one column named ‘pt_column’ which needed to go into a schema with a position field called ‘pt’).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- static process_position_columns(df: DataFrame, position_columns: Iterable[str])[source]
Process a dataframe into a list of dictionaries, nesting position columns into their point fields.
- Parameters
df (pd.DataFrame) – dataframe to process
position_columns (Iterable[str] or Mapping[str, str] or None) – see post_annotation_df
- Returns
json list of annotations ready for posting
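The nesting described above can be sketched in plain pandas. This is an illustrative reimplementation, not the library's actual code; the function name nest_positions and the {"position": [...]} nesting shape are assumptions.

```python
import pandas as pd

def nest_positions(df: pd.DataFrame, position_columns=None):
    """Sketch: nest position columns into point fields.

    If position_columns is None, every column ending in '_position' is
    mapped to the field named by its prefix (e.g. 'pt_position' -> 'pt').
    A dict maps column names to field names explicitly.
    """
    if position_columns is None:
        position_columns = {
            c: c[: -len("_position")] for c in df.columns if c.endswith("_position")
        }
    elif not isinstance(position_columns, dict):
        position_columns = {c: c[: -len("_position")] for c in position_columns}

    records = df.to_dict(orient="records")
    for rec in records:
        for col, field in position_columns.items():
            # Move the raw coordinate under a nested "position" key
            rec[field] = {"position": rec.pop(col)}
    return records

df = pd.DataFrame({"pt_position": [[1, 2, 3]], "size": [10]})
print(nest_positions(df))
# → [{'size': 10, 'pt': {'position': [1, 2, 3]}}]
```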
- stage_annotations(table_name=None, schema_name=None, update=False, id_field=False, table_resolution=None, annotation_resolution=None)[source]
Get a StagedAnnotations object to help produce correctly formatted annotations for a given table or schema. StagedAnnotation objects can be uploaded directly with upload_staged_annotations.
- Parameters
table_name (str, optional) – Table name to stage annotations for, by default None.
schema_name (str, optional) – Schema name to use to make annotations. Only needed if the table_name is not set, by default None
update (bool, optional) – Set to True if individual annotations are going to be updated, by default False.
id_field (bool, optional) – Set to True if id fields are to be specified. Not needed if update is True, which always needs id fields. Optional, by default False
table_resolution (list-like or None, optional) – Voxel resolution of spatial points in the table in nanometers. This is found automatically from the info service if a table name is provided, by default None. If annotation_resolution is also set, this allows points to be scaled correctly for the table.
annotation_resolution (list-like, optional) – Voxel resolution of spatial points provided by the user when creating annotations. If the table resolution is also available (manually or from the info service), annotations are correctly rescaled for the volume. By default, None.
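The rescaling these two parameters enable is elementwise: points are converted to nanometers using annotation_resolution and then into table voxel units using table_resolution. A minimal sketch of that arithmetic (rescale_point is a hypothetical helper, not a client method):

```python
import numpy as np

def rescale_point(point, annotation_resolution, table_resolution):
    """Convert a point from annotation voxel units to table voxel units.

    Both resolutions are x/y/z voxel sizes in nanometers; the point is
    first converted to nm, then divided by the table's voxel size.
    """
    point = np.asarray(point, dtype=float)
    nm = point * np.asarray(annotation_resolution, dtype=float)
    return nm / np.asarray(table_resolution, dtype=float)

# A point annotated at [8, 8, 40] nm/voxel, destined for a [4, 4, 40] table:
print(rescale_point([100, 200, 50], [8, 8, 40], [4, 4, 40]))
```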
- update_annotation(table_name: str, data: dict, aligned_volume_name: str = None)[source]
Update one or more annotations in a table in the AnnotationEngine. Updating is implemented by invalidating the old annotation and inserting a new annotation row, which will receive a new primary key ID.
Notes
If annotation IDs were user-provided upon insertion, the database will autoincrement from the current maximum ID in the table.
- Parameters
table_name (str) – Name of the table where annotations will be added
data (dict or list,) – A list of (or a single) dict of schematized annotation data matching the target table. each dict must contain an “id” field which is the ID of the annotation to update
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON: a list of new annotation IDs.
- Return type
json
- update_annotation_df(table_name: str, df: DataFrame, position_columns: Iterable[str], aligned_volume_name=None)[source]
Update one or more annotations in a table in the AnnotationEngine using a dataframe format. Updating is implemented by invalidating the old annotation and inserting a new annotation row, which will receive a new primary key ID.
Notes
If annotation IDs were user-provided upon insertion, the database will autoincrement from the current maximum ID in the table.
- Parameters
table_name (str) – Name of the table where annotations will be added
df (pd.DataFrame) – A pandas dataframe containing the annotations. Columns should be fields in schema, position columns need to be called out in position_columns argument.
position_columns (Iterable[str] or Mapping[str, str] or None) – If None, will look for all columns with ‘X_position’ in the name and assume they go in fields called “X”. If an iterable, assumes each column given ends in _position (e.g. [‘pt_position’] if ‘pt’ is the name of the position field in the schema). If a mapping, keys are names of columns in the dataframe and values are the names of the fields (e.g. {‘pt_column’: ‘pt’} would be correct if you had one column named ‘pt_column’ which needed to go into a schema with a position field called ‘pt’).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
Response JSON
- Return type
json
- update_metadata(table_name: str, description: str = None, flat_segmentation_source: str = None, read_permission: str = None, write_permission: str = None, user_id: int = None, notice_text: str = None, aligned_volume_name: str = None)[source]
Update the metadata on an existing table.
- Parameters
table_name (str) – name of table to update
description (str, optional) – Text description of the table. Defaults to None (will not update).
flat_segmentation_source (str, optional) – Cloudpath to a flat segmentation associated with this table. Defaults to None (will not update).
read_permission (str, optional) – What permissions to give the table for reading. One of PRIVATE: only you can read this table (intended to be used for sorting out bugs); GROUP: only members that share a group with you can read (intended for within-group vetting); PUBLIC: anyone with permission to read this datastack can read this data. Defaults to None (will not update).
write_permission (str, optional) – What permissions to give the table for writing. One of PRIVATE: only you can write to this table; GROUP: only members that share a group with you can write (excluding some groups); PUBLIC: anyone can write to this table. Note that all data is logged, and deletes are done by marking rows as deleted, so all data is always recoverable. Defaults to None (will not update).
user_id (int, optional) – Change ownership of this table to this user_id. Note that if you use this you will no longer be able to update the metadata on this table, and depending on permissions you may not be able to read or write to it. Defaults to None (will not update).
notice_text (str, optional) – Text the user will see when querying this table. Can be used to warn users of flaws or uncertainty in the data, or to advertise citations that should be used with this table. Defaults to None (will not update).
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- upload_staged_annotations(staged_annos: StagedAnnotations, aligned_volume_name: str = None)[source]
Upload annotations directly from an Annotation Guide object. This method uses the options specified in the object, including the table name and whether the annotations are updates.
- Parameters
staged_annos (guide.AnnotationGuide) – AnnotationGuide object with a specified table name and a collection of annotations already filled in.
aligned_volume_name (str or None, optional) – Name of the aligned_volume. If None, uses the one specified in the client.
- Returns
If new annotations are posted, a list of ids. If annotations are being updated, a dictionary with the mapping from old ids to new ids.
- Return type
List or dict
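As a sketch of consuming the update return value (the old-to-new mapping shape is taken from the description above; remap_ids is a hypothetical helper, and the IDs are made up):

```python
def remap_ids(ids, old_to_new):
    """Replace any ID present in the mapping with its new value."""
    return [old_to_new.get(i, i) for i in ids]

# Hypothetical mapping returned by an update operation:
old_to_new = {101: 501, 102: 502}
print(remap_ids([101, 102, 103], old_to_new))
# → [501, 502, 103]
```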
caveclient.auth module
- class caveclient.auth.AuthClient(token_file=None, token_key=None, token=None, server_address='https://global.daf-apis.com')[source]
Bases:
object
Client to find and use auth tokens to access the dynamic annotation framework services.
- Parameters
token_file (str, optional) – Path to a JSON key:value file holding your auth token. By default, “~/.cloudvolume/secrets/cave-secret.json” (will check deprecated token name “chunkedgraph-secret.json” as well)
token_key (str, optional) – Key for the token in the token_file. By default, “token”
token (str or None, optional) – Direct entry of the token as a string. If provided, overrides the files. If None, attempts to use the file paths.
server_address (str, optional,) – URL to the auth server. By default, uses a default server address.
- get_group_users(group_id)[source]
Get users in a group
- Parameters
group_id (int) – ID value for a given group
- Returns
List of dicts of user ids. Returns empty list if group does not exist.
- Return type
list
- get_new_token(open=False)[source]
Currently, returns instructions for getting a new token based on the current settings and saving it to the local environment. New OAuth tokens are currently not able to be retrieved programmatically.
- Parameters
open (bool, optional) – If True, opens a web browser to the web page where you can generate a new token.
- get_token(token_key=None)[source]
Load a token with a given key from the specified token file.
- Parameters
token_key (str or None, optional) – key in the token file JSON, by default None. If None, uses ‘token’.
- get_tokens()[source]
Get the tokens set up for this user.
- Returns
A list of token dictionaries, each with the keys ”id” (the ID of this token), ”token” (the token string), and ”user_id” (the user’s ID, which should be your ID).
- Return type
list[dict]
- get_user_information(user_ids)[source]
Get user data.
- Parameters
user_ids (list of int) – User IDs to look up
- property request_header
Formatted request header with the specified token
- save_token(token=None, token_key='token', overwrite=False, token_file=None, switch_token=True, write_to_server_file=True)[source]
Conveniently save a token in the correct format.
After getting a new token by following the instructions in authclient.get_new_token(), you can save it with a fully default configuration by running:
token = ‘my_shiny_new_token’
authclient.save_token(token=token)
Now on the next load, authclient = AuthClient() will make an auth client instance using this token. If you would like to specify more information about the JSON file where the token will be stored, see the parameters below.
- Parameters
token (str, optional) – New token to save, by default None
token_key (str, optional) – Key for the token in the token_file json, by default “token”
overwrite (bool, optional) – Allow an existing token to be changed, by default False
token_file (str, optional) – Path to the token file, by default None. If None, uses the default file location specified above.
switch_token (bool, optional) – If True, switch the auth client over into using the new token, by default True
write_to_server_file (bool, optional) – If True, will write token to a server specific file to support this machine interacting with multiple auth servers.
- setup_token(make_new=True, open=True)[source]
Currently, returns instructions for getting your auth token based on the current settings and saving it to the local environment. New OAuth tokens are currently not able to be retrieved programmatically.
- Parameters
make_new (bool, optional) – If True, will make a new token, else prompt you to open a page to retrieve an existing token.
open (bool, optional) – If True, opens a web browser to the web page where you can retrieve a token.
- property token
Secret token used to authenticate yourself to the Connectome Annotation Versioning Engine services.
caveclient.base module
- class caveclient.base.BaseEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]
Bases:
JSONEncoder
- default(obj)[source]
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError). For example, to support arbitrary iterators, you could implement default like this:
def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
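As an illustration of this pattern, the sketch below extends default() to serialize numpy types, the kind of fallback an encoder like BaseEncoder plausibly provides (this is an assumption, not BaseEncoder's actual implementation; ArrayEncoder is a hypothetical name):

```python
import json
import numpy as np

class ArrayEncoder(json.JSONEncoder):
    """Example encoder: fall back to plain Python types for numpy objects."""
    def default(self, o):
        if isinstance(o, np.integer):
            return int(o)
        if isinstance(o, np.floating):
            return float(o)
        if isinstance(o, np.ndarray):
            return o.tolist()
        # Let the base class raise TypeError for anything else
        return json.JSONEncoder.default(self, o)

print(json.dumps({"root_id": np.int64(648518346349538466)}, cls=ArrayEncoder))
# → {"root_id": 648518346349538466}
```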
- class caveclient.base.ClientBase(server_address, auth_header, api_version, endpoints, server_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
object
- property api_version
- property default_url_mapping
- property fc
- property server_address
- class caveclient.base.ClientBaseWithDataset(server_address, auth_header, api_version, endpoints, server_name, dataset_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- property dataset_name
- class caveclient.base.ClientBaseWithDatastack(server_address, auth_header, api_version, endpoints, server_name, datastack_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- property datastack_name
caveclient.chunkedgraph module
PyChunkedgraph service python interface
- caveclient.chunkedgraph.ChunkedGraphClient(server_address=None, table_name=None, auth_client=None, api_version='latest', timestamp=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
- class caveclient.chunkedgraph.ChunkedGraphClientV1(server_address, auth_header, api_version, endpoints, server_key='cg_server_address', timestamp=None, table_name=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
ChunkedGraph Client for the v1 API
- property base_resolution
MIP 0 resolution for voxels assumed by the ChunkedGraph
- Returns
3-long list of x/y/z voxel dimensions in nm
- Return type
list
- property cloudvolume_path
- property default_url_mapping
- do_merge(supervoxels, coords, resolution=(4, 4, 40)) None [source]
Perform a merge on the chunked graph.
- Parameters
supervoxels (iterable) – An N-long list of supervoxels to merge.
coords (np.array) – An Nx3 array of coordinates of the supervoxels in units of resolution.
resolution (tuple, optional) – What to multiply coords by to get nanometers. Defaults to (4,4,40).
- execute_split(source_points, sink_points, root_id, source_supervoxels=None, sink_supervoxels=None) Tuple[int, list] [source]
Execute a multicut split based on points or supervoxels.
- Parameters
source_points (array or list) – Nx3 list or array of 3d points in nm coordinates for source points (red).
sink_points (array or list) – Mx3 list or array of 3d points in nm coordinates for sink points (blue).
root_id (int) – Root ID of object to do split preview.
source_supervoxels (array, list or None, optional) – If providing source supervoxels, an N-length array of supervoxel IDs or Nones matched to source points. If None, treats as a full array of Nones. By default None.
sink_supervoxels (array, list or None, optional) – If providing sink supervoxels, an M-length array of supervoxel IDs or Nones matched to sink points. If None, treats as a full array of Nones. By default None.
- Returns
operation_id (int) – Unique ID of the split operation
new_root_ids (list of int) – List of new root IDs resulting from the split operation.
- find_path(root_id, src_pt, dst_pt, precision_mode=False) Tuple[ndarray, ndarray, ndarray] [source]
Find a path between two locations on a root ID using the level 2 chunked graph.
- Parameters
root_id (int) – Root ID to query.
src_pt (np.array) – 3-element array of xyz coordinates in nm for the source point.
dst_pt (np.array) – 3-element array of xyz coordinates in nm for the destination point.
precision_mode (bool, optional) – Whether to perform the search in precision mode. Defaults to False.
- Returns
centroids_list (np.array) – Array of centroids along the path.
l2_path (np.array of int) – Array of level 2 chunk IDs along the path.
failed_l2_ids (np.array of int) – Array of level 2 chunk IDs that failed to find a path.
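The centroid array returned by find_path can be reduced to an approximate path length with a short numpy computation (a sketch; path_length_nm is a hypothetical helper and the centroids below are made-up nm coordinates):

```python
import numpy as np

def path_length_nm(centroids):
    """Sum of Euclidean distances between consecutive path centroids."""
    centroids = np.asarray(centroids, dtype=float)
    return float(np.linalg.norm(np.diff(centroids, axis=0), axis=1).sum())

centroids = [[0, 0, 0], [3, 4, 0], [3, 4, 12]]
print(path_length_nm(centroids))
# → 17.0
```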
- get_change_log(root_id, filtered=True) dict [source]
Get the change log (splits and merges) for an object.
- Parameters
root_id (int) – Object root ID to look up.
filtered (bool) – Whether to filter the change log to only include splits and merges which affect the final state of the object (filtered=True), as opposed to including edit history for objects which at some point were split from the query object root_id (filtered=False). Defaults to True.
- Returns
Dictionary summarizing split and merge events in the object history, containing the following keys:
- ”n_merges”: int
Number of merges
- ”n_splits”: int
Number of splits
- ”operations_ids”: list of int
Identifiers for each operation
- ”past_ids”: list of int
Previous root ids for this object
- ”user_info”: dict of dict
Dictionary keyed by user (string) to a dictionary specifying how many merges and splits that user performed on this object
- Return type
dict
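Given the documented keys, summarizing a change log is a small dictionary traversal; a sketch using a made-up change log shaped like the dictionary above (summarize_change_log is hypothetical, as are the per-user counts inside user_info):

```python
def summarize_change_log(change_log: dict) -> str:
    """Build a one-line summary from the documented change-log keys."""
    editors = ", ".join(sorted(change_log["user_info"]))
    return (
        f"{change_log['n_merges']} merges and "
        f"{change_log['n_splits']} splits by {editors}"
    )

# Hypothetical change log matching the documented structure:
log = {
    "n_merges": 2,
    "n_splits": 1,
    "operations_ids": [10, 11, 12],
    "past_ids": [123, 456],
    "user_info": {"42": {"n_merges": 2, "n_splits": 1}},
}
print(summarize_change_log(log))
# → 2 merges and 1 splits by 42
```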
- get_children(node_id) ndarray [source]
Get the children of a node in the chunked graph hierarchy.
- Parameters
node_id (int) – Node ID to query.
- Returns
IDs of child nodes.
- Return type
np.array of np.int64
- get_contact_sites(root_id, bounds, calc_partners=False) dict [source]
Get contacts for a root ID.
- Parameters
root_id (int) – Root ID to query.
bounds (np.array) – Bounds within a 3x2 numpy array of bounds [[minx,maxx],[miny,maxy],[minz,maxz]] for which to find contacts. Running this query without bounds is too slow.
calc_partners (bool, optional) – If True, get partner root IDs. By default, False.
- Returns
Dict relating ids to contacts
- Return type
dict
- get_delta_roots(timestamp_past: datetime, timestamp_future: datetime = datetime.datetime(2024, 1, 19, 2, 8, 18, 663989, tzinfo=datetime.timezone.utc)) Tuple[ndarray, ndarray] [source]
Get the list of roots that have changed between timestamp_past and timestamp_future.
- Parameters
timestamp_past (datetime.datetime) – Past timepoint to query
timestamp_future (datetime.datetime, optional) – Future timepoint to query. Defaults to datetime.datetime.now(datetime.timezone.utc).
- Returns
old_roots (np.ndarray of np.int64) – Roots that have expired in that interval.
new_roots (np.ndarray of np.int64) – Roots that are new in that interval.
- get_latest_roots(root_id, timestamp=None, timestamp_future=None) ndarray [source]
Returns root IDs that are related to the given root_id at a given timestamp. Can be used to find the “latest” root IDs associated with an object.
- Parameters
root_id (int) – Object root ID.
timestamp (datetime.datetime or None, optional) – Timestamp to query IDs from. If None, assumes you want up to now.
timestamp_future (datetime.datetime or None, optional) – DEPRECATED name, use timestamp instead. Timestamp to suggest IDs from (note can be in the past relative to the root). By default, None.
- Returns
1d array with all latest successors.
- Return type
np.ndarray
- get_leaves(root_id, bounds=None, stop_layer: int = None) ndarray [source]
Get all supervoxels for a root ID.
- Parameters
root_id (int) – Root ID to query.
bounds (np.array or None, optional) – If specified, returns supervoxels within a 3x2 numpy array of bounds [[minx,maxx],[miny,maxy],[minz,maxz]]. If None, finds all supervoxels.
stop_layer (int, optional) – If specified, returns chunkedgraph nodes at layer stop_layer. The default is stop_layer=1 (supervoxels).
- Returns
Array of supervoxel IDs (or node ids if stop_layer>1).
- Return type
np.array of np.int64
- get_lineage_graph(root_id, timestamp_past=None, timestamp_future=None, as_nx_graph=False, exclude_links_to_future=False, exclude_links_to_past=False) Union[dict, DiGraph] [source]
Returns the lineage graph for a root ID, optionally cut off in the past or the future.
Each change in the chunked graph creates a new root ID for the object after that change. This function returns a graph of all root IDs for a given object, tracing the history of the object in terms of merges and splits.
- Parameters
root_id (int) – Object root ID.
timestamp_past (datetime.datetime or None, optional) – Cutoff for the lineage graph backwards in time. By default, None.
timestamp_future (datetime.datetime or None, optional) – Cutoff for the lineage graph going forwards in time. By default, None.
as_nx_graph (bool) – If True, a NetworkX graph is returned.
exclude_links_to_future (bool) – If True, links from nodes before timestamp_future to nodes after timestamp_future are removed. If False, any link with one node before and one node after timestamp_future is kept.
exclude_links_to_past (bool) – If True, links from nodes before timestamp_past to nodes after timestamp_past are removed. If False, any link with one node before and one node after timestamp_past is kept.
- Returns
dict – Dictionary describing the lineage graph and operations for the root ID. Not returned if as_nx_graph is True. The dictionary contains the following keys:
- ”directed”: bool
Whether the graph is directed.
- ”graph”: dict
Dictionary of graph attributes.
- ”links”: list of dict
Each element of the list is a dictionary describing an edge in the lineage graph as “source” and “target” keys.
- ”multigraph”: bool
Whether the graph is a multigraph.
- ”nodes”: list of dict
Each element of the list is a dictionary describing a node in the lineage graph, usually with “id”, “timestamp”, and “operation_id” keys.
nx.DiGraph – NetworkX directed graph of the lineage graph. Only returned if as_nx_graph is True.
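With as_nx_graph=False the returned dictionary can be traversed directly; for example, the leaf nodes (those that never appear as a “source” in any link) are the latest IDs in the lineage. A sketch over a made-up lineage dictionary shaped as described above (latest_ids is hypothetical):

```python
def latest_ids(lineage: dict) -> set:
    """Nodes with no outgoing edge, i.e. leaves of the lineage graph."""
    sources = {link["source"] for link in lineage["links"]}
    node_ids = {node["id"] for node in lineage["nodes"]}
    return node_ids - sources

lineage = {
    "directed": True,
    "multigraph": False,
    "graph": {},
    "nodes": [{"id": 1}, {"id": 2}, {"id": 3}],
    "links": [{"source": 1, "target": 2}, {"source": 1, "target": 3}],
}
print(sorted(latest_ids(lineage)))
# → [2, 3]
```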
- get_merge_log(root_id) list [source]
Get the merge log (splits and merges) for an object.
- Parameters
root_id (int) – Object root ID to look up.
- Returns
List of merge events in the history of the object.
- Return type
list
- get_oldest_timestamp() datetime [source]
Get the oldest timestamp in the database.
- Returns
Oldest timestamp in the database.
- Return type
datetime.datetime
- get_operation_details(operation_ids: Iterable[int]) dict [source]
Get the details of a list of operations.
- Parameters
operation_ids (Iterable of int) – List/array of operation IDs.
- Returns
A dict of dicts of operation info, keys are operation IDs (as strings), values are a dictionary of operation info for the operation. These dictionaries contain the following keys:
- ”added_edges”/”removed_edges”: list of list of int
List of edges added (if a merge) or removed (if a split) by this operation. Each edge is a list of two supervoxel IDs (source and target).
- ”roots”: list of int
List of root IDs that were created by this operation.
- ”sink_coords”: list of list of int
List of sink coordinates for this operation. The sink is one of the points placed by the user when specifying the operation. Each sink coordinate is a list of three integers (x, y, z), corresponding to spatial coordinates in segmentation voxel space.
- ”source_coords”: list of list of int
List of source coordinates for this operation. The source is one of the points placed by the user when specifying the operation. Each source coordinate is a list of three integers (x, y, z), corresponding to spatial coordinates in segmentation voxel space.
- ”timestamp”: str
Timestamp of the operation.
- ”user”: str
User ID number who performed the operation (as a string).
- Return type
dict of str to dict
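As a sketch of working with the return value (all IDs and values below are placeholders, shaped as documented above):

```python
# Illustrative return value of get_operation_details; keys are operation IDs
# as strings. Merges carry "added_edges" while splits carry "removed_edges".
details = {
    "1234": {
        "added_edges": [[88123, 88456]],
        "roots": [864691135000000001],
        "sink_coords": [[10, 20, 30]],
        "source_coords": [[11, 21, 31]],
        "timestamp": "2021-01-01 00:00:00",
        "user": "42",
    }
}

# Separate merges from splits by which edge key is present.
merge_ops = [op for op, info in details.items() if "added_edges" in info]
split_ops = [op for op, info in details.items() if "removed_edges" in info]
```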
- get_original_roots(root_id, timestamp_past=None) ndarray [source]
Returns root IDs that are the latest successors of a given root ID.
- Parameters
root_id (int) – Object root ID.
timestamp_past (datetime.datetime or None, optional) – Cutoff for the search going backwards in time. By default, None.
- Returns
1d array with all latest successors.
- Return type
np.ndarray
- get_past_ids(root_ids, timestamp_past=None, timestamp_future=None) dict [source]
For a set of root IDs, get the list of IDs at a past or future time point that could contain parts of the same object.
- Parameters
root_ids (Iterable of int) – Iterable of root IDs to query.
timestamp_past (datetime.datetime or None, optional) – Time of a point in the past for which to look up root ids. Default is None.
timestamp_future (datetime.datetime or None, optional) – Time of a point in the future for which to look up root ids. Not implemented on the server currently. Default is None.
- Returns
Dict with keys “future_id_map” and “past_id_map”. Each is a dict whose keys are the supplied root_ids and whose values are the list of related root IDs at timestamp_past/timestamp_future.
- Return type
dict
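A sketch of reading the returned mapping (root IDs below are placeholders):

```python
# Illustrative return value of get_past_ids: keys of each inner dict are the
# supplied root IDs, values are the related root IDs at the requested time.
id_maps = {
    "past_id_map": {864691135000000001: [864691134000000001, 864691134000000002]},
    "future_id_map": {864691135000000001: []},
}

# All past IDs that could contain parts of the queried object.
past_ids = id_maps["past_id_map"][864691135000000001]
```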
- get_root_id(supervoxel_id, timestamp=None, level2=False) int64 [source]
Get the root ID for a specified supervoxel.
- Parameters
supervoxel_id (int) – Supervoxel id value
timestamp (datetime.datetime, optional) – UTC datetime to specify the state of the chunkedgraph at which to query, by default None. If None, uses the current time.
- Returns
Root ID containing the supervoxel.
- Return type
np.int64
- get_root_timestamps(root_ids) ndarray [source]
Retrieves timestamps at which roots were created.
- Parameters
root_ids (Iterable of int) – Iterable of root IDs to query.
- Returns
Array of timestamps when root_ids were created.
- Return type
np.array of datetime.datetime
- get_roots(supervoxel_ids, timestamp=None, stop_layer=None) ndarray [source]
Get the root ID for a list of supervoxels.
- Parameters
supervoxel_ids (list or np.array of int) – Supervoxel IDs to look up.
timestamp (datetime.datetime, optional) – UTC datetime to specify the state of the chunkedgraph at which to query, by default None. If None, uses the current time.
stop_layer (int or None, optional) – If provided, looks up IDs only up to the given stop layer. Default is None.
- Returns
Root IDs containing each supervoxel.
- Return type
np.array of np.uint64
- get_subgraph(root_id, bounds) Tuple[ndarray, ndarray, ndarray] [source]
Get subgraph of root id within a bounding box.
- Parameters
root_id (int) – Root (or any node ID) of chunked graph to query.
bounds (np.array) – 3x2 bounding box (x,y,z) x (min,max) in chunked graph coordinates.
- Returns
np.array of np.int64 – Node IDs in the subgraph.
np.array of np.double – Affinities of edges in the subgraph.
np.array of np.int32 – Areas of nodes in the subgraph.
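The bounds argument is a 3x2 array; a minimal sketch of constructing one (values are placeholders in chunked graph coordinates):

```python
# 3x2 bounding box: rows are (x, y, z), columns are (min, max),
# in chunked graph coordinates as expected by get_subgraph.
bounds = [
    [120, 140],  # x_min, x_max
    [80, 100],   # y_min, y_max
    [40, 44],    # z_min, z_max
]

# Sanity-check that each axis has min <= max before querying.
valid = all(lo <= hi for lo, hi in bounds)
```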
- get_tabular_change_log(root_ids, filtered=True) dict [source]
Get a detailed changelog for neurons.
- Parameters
root_ids (list of int) – Object root IDs to look up.
filtered (bool) – Whether to filter the change log to only include splits and merges which affect the final state of the object (filtered=True), as opposed to including edit history for objects which at some point were split from the query objects in root_ids (filtered=False). Defaults to True.
- Returns
The keys are the root IDs, and the values are DataFrames with the following columns and datatypes:
- ”operation_id”: int
Identifier for the operation.
- ”timestamp”: int
Timestamp of the operation, provided in milliseconds. To convert to a datetime, use datetime.datetime.utcfromtimestamp(timestamp/1000).
- ”user_id”: int
User who performed the operation.
- ”before_root_ids”: list of int
Root IDs of objects that existed before the operation.
- ”after_root_ids”: list of int
Root IDs of objects created by the operation. Note that this only records the root id that was kept as part of the query object, so there will only be one in this list.
- ”is_merge”: bool
Whether the operation was a merge.
- ”user_name”: str
Name of the user who performed the operation.
- ”user_affiliation”: str
Affiliation of the user who performed the operation.
- Return type
dict of pd.DataFrame
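The millisecond values in the ”timestamp” column can be converted as documented above; a small sketch (the value is a placeholder):

```python
from datetime import datetime, timezone

# Placeholder value from the "timestamp" column (milliseconds since the epoch).
ts_ms = 1576265385000

# Equivalent to the documented datetime.utcfromtimestamp(ts_ms / 1000),
# but returns a timezone-aware UTC datetime.
ts = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
```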
- get_user_operations(user_id: int, timestamp_start: datetime, include_undo: bool = True, timestamp_end: datetime = None) DataFrame [source]
Get operation details for a user ID. Currently, this is only available to admins.
- Parameters
user_id (int) – User ID to query (use 0 for all users (admin only)).
timestamp_start (datetime.datetime, optional) – Timestamp to start filter (UTC).
include_undo (bool, optional) – Whether to include undos. Defaults to True.
timestamp_end (datetime.datetime, optional) – Timestamp to end filter (UTC). Defaults to now.
- Returns
DataFrame including the following columns:
- ”operation_id”: int
Identifier for the operation.
- ”timestamp”: datetime.datetime
Timestamp of the operation.
- ”user_id”: int
User who performed the operation.
- Return type
pd.DataFrame
- is_latest_roots(root_ids, timestamp=None) ndarray [source]
Check whether these root IDs are still a root at this timestamp.
- Parameters
root_ids (list or array of int) – Root IDs to check.
timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs are valid root IDs in the chunked graph. Defaults to None (assumes now).
- Returns
Array of whether these are valid root IDs.
- Return type
np.array of bool
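A common pattern is to filter out roots that are no longer current; a sketch on placeholder values (the boolean array stands in for what is_latest_roots would return):

```python
# Illustrative inputs: queried root IDs and the boolean array that
# is_latest_roots would return for them (all values are placeholders).
root_ids = [864691135000000001, 864691135000000002, 864691135000000003]
is_latest = [True, False, True]

# Root IDs that are no longer valid and need updating.
outdated = [r for r, latest in zip(root_ids, is_latest) if not latest]
```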
- is_valid_nodes(node_ids, start_timestamp=None, end_timestamp=None) ndarray [source]
Check whether nodes are valid for given timestamp range.
Valid is defined as existing in the chunked graph. This makes no statement about these IDs being roots, supervoxel or anything in-between. It also does not take into account whether a root ID has since been edited.
- Parameters
node_ids (list or array of int) – Node IDs to check.
start_timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs were valid after this timestamp. Defaults to None (assumes now).
end_timestamp (datetime.datetime, optional) – Timestamp to check whether these IDs were valid before this timestamp. Defaults to None (assumes now).
- Returns
Array of whether these are valid IDs.
- Return type
np.array of bool
- level2_chunk_graph(root_id) list [source]
Get graph of level 2 chunks, the smallest agglomeration level above supervoxels.
- Parameters
root_id (int) – Root id of object
- Returns
Edge list for level 2 chunked graph. Each element of the list is an edge, and each edge is a list of two node IDs (source and target).
- Return type
list of list
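The returned edge list can be turned into an adjacency structure directly; a minimal sketch on a hypothetical edge list (real IDs would come from level2_chunk_graph):

```python
from collections import defaultdict

# Hypothetical edge list, shaped like the return value of level2_chunk_graph:
# each edge is a two-element list of level 2 node IDs.
edges = [[101, 102], [102, 103], [101, 103]]

# Build an undirected adjacency map from the edge list.
adjacency = defaultdict(set)
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)
```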
- preview_split(source_points, sink_points, root_id, source_supervoxels=None, sink_supervoxels=None, return_additional_ccs=False) Tuple[list, list, bool, list] [source]
Get supervoxel connected components from a preview multicut split.
- Parameters
source_points (array or list) – Nx3 list or array of 3d points in nm coordinates for source points (red).
sink_points (array or list) – Mx3 list or array of 3d points in nm coordinates for sink points (blue).
root_id (int) – Root ID of object to do split preview.
source_supervoxels (array, list or None, optional) – If providing source supervoxels, an N-length array of supervoxel IDs or Nones matched to source points. If None, treats as a full array of Nones. By default None.
sink_supervoxels (array, list or None, optional) – If providing sink supervoxels, an M-length array of supervoxel IDs or Nones matched to sink points. If None, treats as a full array of Nones. By default None.
return_additional_ccs (bool, optional) – If True, returns any additional connected components beyond the ones with source and sink points. In most situations, this can be ignored. By default, False.
- Returns
source_connected_component (list) – Supervoxel IDs in the component with the most source points.
sink_connected_component (list) – Supervoxel IDs in the component with the most sink points.
successful_split (bool) – True if the split worked.
other_connected_components (optional) (list of lists of int) – List of lists of supervoxel IDs for any other resulting connected components. Only returned if return_additional_ccs is True.
- remesh_level2_chunks(chunk_ids) None [source]
Submit specific level 2 chunks to be remeshed in case of a problem.
- Parameters
chunk_ids (list) – List of level 2 chunk IDs.
- property segmentation_info
Complete segmentation metadata
- suggest_latest_roots(root_id, timestamp=None, stop_layer=None, return_all=False, return_fraction_overlap=False)[source]
Suggest latest roots for a given root id, based on overlap of component chunk IDs. Note that edits change chunk IDs, and so this effectively measures the fraction of unchanged chunks at a given chunk layer, which sets the size scale of chunks. Higher layers are coarser.
- Parameters
root_id (int) – Root ID of the potentially outdated object.
timestamp (datetime, optional) – Datetime at which “latest” roots are being computed, by default None. If None, the current time is used. Note that this has to be a timestamp after the creation of the root_id.
stop_layer (int, optional) – Chunk level at which to compute overlap, by default None. No value will take the 4th from the top layer, which emphasizes speed and works well for larger objects. Lower values are slower but more fine-grained. Values under 2 (i.e. supervoxels) are not recommended except in extremely fine grained scenarios.
return_all (bool, optional) – If True, return all current IDs sorted from most overlap to least, by default False. If False, only the top is returned.
return_fraction_overlap (bool, optional) – If True, return all fractions sorted by most overlap to least, by default False. If False, only the top value is returned.
- property table_name
caveclient.emannotationschemas module
- caveclient.emannotationschemas.SchemaClient(server_address=None, auth_client=None, api_version='latest', max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
- class caveclient.emannotationschemas.SchemaClientLegacy(server_address, auth_header, api_version, endpoints, server_name, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
caveclient.endpoints module
caveclient.format_utils module
caveclient.frameworkclient module
- class caveclient.frameworkclient.CAVEclient(datastack_name=None, server_address=None, auth_token_file=None, auth_token_key=None, auth_token=None, global_only=False, max_retries=3, pool_maxsize=None, pool_block=None, desired_resolution=None, info_cache=None, write_server_cache=True)[source]
Bases:
object
- class caveclient.frameworkclient.CAVEclientFull(datastack_name=None, server_address=None, auth_token_file='~/.cloudvolume/secrets/cave-secret.json', auth_token_key='token', auth_token=None, max_retries=3, pool_maxsize=None, pool_block=None, desired_resolution=None, info_cache=None)[source]
Bases:
CAVEclientGlobal
A manager for all clients sharing common datastack and authentication information.
This client wraps all the other clients and keeps track of the things that need to be consistent across them. To instantiate a client:
client = CAVEclient(datastack_name='my_datastack', server_address='www.myserver.com', auth_token_file='~/.mysecrets/secrets.json')
Then * client.info is an InfoService client (see infoservice.InfoServiceClient) * client.state is a neuroglancer state client (see jsonservice.JSONService) * client.schema is an EM Annotation Schemas client (see emannotationschemas.SchemaClient) * client.chunkedgraph is a Chunkedgraph client (see chunkedgraph.ChunkedGraphClient) * client.annotation is an Annotation DB client (see annotationengine.AnnotationClient)
All subclients are loaded lazily and share the same datastack name, server address, and auth tokens where used.
- Parameters
datastack_name (str, optional) – Datastack name for the services. Almost all services need this and will not work if it is not passed.
server_address (str or None) – URL of the framework server. If None, chooses the default server global.daf-apis.com. Optional, defaults to None.
auth_token_file (str or None) – Path to a json file containing the auth token. If None, uses the default location. See Auth client documentation. Optional, defaults to None.
auth_token_key (str) – Dictionary key for the token in the JSON file. Optional, default is ‘token’.
auth_token (str or None) – Direct entry of an auth token. If None, uses the file arguments to find the token. Optional, default is None.
max_retries (int or None, optional) – Sets the default number of retries on failed requests. Optional, by default 3.
pool_maxsize (int or None, optional) – Sets the max number of threads in a requests pool, although this value will be exceeded if pool_block is set to False. Optional, uses requests defaults if None.
pool_block (bool or None, optional) – If True, prevents the number of threads in a requests pool from exceeding the max size. Optional, uses requests defaults (False) if None.
desired_resolution (Iterable[float] or None, optional) – If given, should be a list or array of the desired resolution you want queries returned in; useful for materialization queries.
info_cache (dict or None, optional) – Pre-computed info cache, bypassing the lookup of datastack info from the info service. Should only be used in cases where this information is cached and thus repetitive lookups can be avoided.
- property annotation
- property chunkedgraph
- property datastack_name
- property l2cache
- property materialize
- property state
- class caveclient.frameworkclient.CAVEclientGlobal(server_address=None, auth_token_file=None, auth_token_key=None, auth_token=None, max_retries=3, pool_maxsize=None, pool_block=None, info_cache=None)[source]
Bases:
object
A manager for all clients sharing common datastack and authentication information.
This client wraps all the other clients and keeps track of the things that need to be consistent across them. To instantiate a client:
client = CAVEclientGlobal(server_address='www.myserver.com', auth_token_file='~/.mysecrets/secrets.json')
Then * client.info is an InfoService client (see infoservice.InfoServiceClient) * client.auth handles authentication * client.state is a neuroglancer state client (see jsonservice.JSONService) * client.schema is an EM Annotation Schemas client (see emannotationschemas.SchemaClient)
All subclients are loaded lazily and share the same datastack name, server address, and auth tokens (where used).
- Parameters
server_address (str or None) – URL of the framework server. If None, chooses the default server global.daf-apis.com. Optional, defaults to None.
auth_token_file (str or None) – Path to a json file containing the auth token. If None, uses the default location. See Auth client documentation. Optional, defaults to None.
auth_token_key (str) – Dictionary key for the token in the JSON file. Optional, default is ‘token’.
auth_token (str or None) – Direct entry of an auth token. If None, uses the file arguments to find the token. Optional, default is None.
max_retries (int or None, optional) – Sets the default number of retries on failed requests. Optional, by default 3.
pool_maxsize (int or None, optional) – Sets the max number of threads in a requests pool, although this value will be exceeded if pool_block is set to False. Optional, uses requests defaults if None.
pool_block (bool or None, optional) – If True, prevents the number of threads in a requests pool from exceeding the max size. Optional, uses requests defaults (False) if None.
info_cache (dict or None, optional) – Pre-computed info cache, bypassing the lookup of datastack info from the info service. Should only be used in cases where this information is cached and thus repetitive lookups can be avoided.
- property annotation
- property auth
- change_auth(auth_token_file=None, auth_token_key=None, auth_token=None)[source]
Change the authentication token and reset services.
- Parameters
auth_token_file (str, optional) – New auth token json file path, by default None, which defaults to the existing state.
auth_token_key (str, optional) – New dictionary key under which the token is stored in the json file, by default None, which defaults to the existing state.
auth_token (str, optional) – Direct entry of a new token, by default None.
- property chunkedgraph
- property datastack_name
- property info: InfoServiceClient
- property schema
- property server_address
- property state
caveclient.infoservice module
- caveclient.infoservice.InfoServiceClient(server_address=None, datastack_name=None, auth_client=None, api_version='latest', verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, info_cache=None)[source]
- class caveclient.infoservice.InfoServiceClientV2(server_address, auth_header, api_version, endpoints, server_name, datastack_name, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, info_cache=None)[source]
Bases:
ClientBaseWithDatastack
- property aligned_volume_id
- property aligned_volume_name
- annotation_endpoint(datastack_name=None, use_stored=True)[source]
AnnotationEngine endpoint for a dataset.
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
- Returns
Location of the AnnotationEngine
- Return type
str
- get_aligned_volume_info(datastack_name: str = None, use_stored=True)[source]
Gets the info record for an aligned_volume
- Parameters
datastack_name (str, optional) – datastack_name to look up. If None, uses the one specified by the client. By default None
use_stored (bool, optional) – If True and the information has already been queried for that dataset, then uses the cached version. If False, re-queries the information. By default True
- Returns
The complete info record for the aligned_volume
- Return type
dict or None
- get_datastack_info(datastack_name=None, use_stored=True)[source]
Gets the info record for a datastack
- Parameters
datastack_name (str, optional) – datastack to look up. If None, uses the one specified by the client. By default None
use_stored (bool, optional) – If True and the information has already been queried for that datastack, then uses the cached version. If False, re-queries the information. By default True
- Returns
The complete info record for the datastack
- Return type
dict or None
- get_datastacks()[source]
Query which datastacks are available at the info service
- Returns
List of datastack names
- Return type
list
- get_datastacks_by_aligned_volume(aligned_volume: str = None)[source]
Lookup what datastacks are associated with this aligned volume
- Parameters
aligned_volume (str, optional) – aligned volume to lookup. Defaults to None.
- Raises
ValueError – if no aligned volume is specified
- Returns
A list of datastack names
- Return type
list
- image_cloudvolume(**kwargs)[source]
Generate a cloudvolume instance based on the image source, using authentication if needed and sensible default values for reading CAVE resources. By default, fill_missing is True and bounded is False. All keyword arguments are passed onto the CloudVolume initialization function, and defaults can be overridden.
Requires cloudvolume to be installed, which is not included by default.
- image_source(datastack_name=None, use_stored=True, format_for='raw')[source]
Cloud path to the imagery for the dataset
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “precomputed://gs://” type path is converted to a full https URL. If ‘neuroglancer’, a full https URL is converted to a “precomputed://gs://” type path.
- Returns
Formatted cloud path to the imagery
- Return type
str
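A sketch of the ‘cloudvolume’ formatting described above, assuming the standard Google Storage URL scheme (the path is a placeholder, not the library’s own code):

```python
# Placeholder "precomputed://gs://" path as stored in the InfoService.
raw = "precomputed://gs://my-bucket/imagery"

# format_for='cloudvolume' converts this kind of path into a full https URL
# (assumed Google Storage scheme; a sketch of the documented behavior).
prefix = "precomputed://gs://"
if raw.startswith(prefix):
    formatted = "https://storage.googleapis.com/" + raw[len(prefix):]
else:
    formatted = raw
```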
- segmentation_cloudvolume(use_client_secret=True, **kwargs)[source]
Generate a cloudvolume instance based on the segmentation source, using authentication if needed and sensible default values for reading CAVE resources. By default, fill_missing is True and bounded is False. All keyword arguments are passed onto the CloudVolume initialization function, and defaults can be overridden.
Requires cloudvolume to be installed, which is not included by default.
- segmentation_source(datastack_name=None, format_for='raw', use_stored=True)[source]
Cloud path to the chunkgraph-backed Graphene segmentation for a dataset
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “graphene://https://” type path is used. If ‘neuroglancer’, a “graphene://https://” type path is used, as needed by Neuroglancer.
- Returns
Formatted cloud path to the Graphene segmentation
- Return type
str
- synapse_segmentation_source(datastack_name=None, use_stored=True, format_for='raw')[source]
Cloud path to the synapse segmentation for a dataset
- Parameters
datastack_name (str or None, optional) – Name of the dataset to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
format_for ('raw', 'cloudvolume', or 'neuroglancer', optional) – Formats the path for different uses. If ‘raw’ (default), the path in the InfoService is passed along. If ‘cloudvolume’, a “precomputed://gs://” type path is converted to a full https URL. If ‘neuroglancer’, a full https URL is converted to a “precomputed://gs://” type path.
- Returns
Formatted cloud path to the synapse segmentation
- Return type
str
- viewer_resolution(datastack_name=None, use_stored=True)[source]
Get the viewer resolution metadata for this datastack.
- Parameters
datastack_name (str or None, optional) – Name of the datastack to look up. If None, uses the value specified by the client. Default is None.
use_stored (bool, optional) – If True, uses the cached value if available. If False, re-queries the InfoService. Default is True.
- Returns
voxel resolution as a len(3) np.array
- Return type
np.array
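The resolution is typically used to convert between viewer voxel coordinates and nanometers; a minimal sketch on placeholder values:

```python
# Hypothetical viewer resolution in nm per voxel, shaped like the
# return value of viewer_resolution() (values are placeholders).
resolution = [4.0, 4.0, 40.0]

# Convert a point from viewer voxel coordinates to nanometers.
point_voxels = [10000, 20000, 1000]
point_nm = [c * r for c, r in zip(point_voxels, resolution)]
```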
caveclient.jsonservice module
- caveclient.jsonservice.JSONService(server_address=None, auth_client=None, api_version='latest', ngl_url=None, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Client factory to interface with the JSON state service.
- Parameters
server_address (str, optional) – URL to the JSON state server. If None, set to the default global server address. By default None.
auth_client (An Auth client, optional) – An auth client with a token for the same global server, by default None
api_version (int or 'latest', optional) – Which endpoint API version to use or ‘latest’. By default, ‘latest’ tries to ask the server for which versions are available, if such functionality exists, or if not it defaults to the latest version for which there is a client. By default ‘latest’
ngl_url (str or None, optional) – Default neuroglancer deployment URL. Only used for V1 and later.
- class caveclient.jsonservice.JSONServiceV1(server_address, auth_header, api_version, endpoints, server_name, ngl_url, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None)[source]
Bases:
ClientBase
- build_neuroglancer_url(state_id, ngl_url=None, target_site=None, static_url=False)[source]
Build a URL for a Neuroglancer deployment that will automatically retrieve specified state. If the datastack is specified, this is prepopulated from the info file field “viewer_site”. If no ngl_url is specified in either the function or the client, a fallback neuroglancer deployment is used.
- Parameters
state_id (int) – State id to retrieve
ngl_url (str) – Base url of a neuroglancer deployment. If None, defaults to the value for the datastack or the client. As a fallback, a default deployment is used.
target_site ('seunglab' or 'cave-explorer' or 'mainline' or None) – Set this to ‘seunglab’ for a seunglab deployment, or either ‘cave-explorer’/’mainline’ for a google main branch deployment. If None, checks the info field of the neuroglancer endpoint to determine which to use. Default is None.
static_url (bool) – If True, treats “state_id” as a static URL directly to the JSON and does not use the state service.
- Returns
The full URL requested
- Return type
str
- get_neuroglancer_info(ngl_url=None)[source]
Get the info field from a Neuroglancer deployment
- Parameters
ngl_url (str (optional)) – URL to a Neuroglancer deployment. If None, defaults to the value for the datastack or the client.
- Returns
JSON-formatted info field from the Neuroglancer deployment
- Return type
dict
- get_state_json(state_id)[source]
Download a Neuroglancer JSON state
- Parameters
state_id (int) – ID of a JSON state uploaded to the state service.
- Returns
JSON specifying a Neuroglancer state.
- Return type
dict
- property ngl_url
- save_state_json_local(json_state, filename, overwrite=False)[source]
Save a Neuroglancer JSON state to a JSON file locally.
- Parameters
json_state (dict) – Dict representation of a neuroglancer state
filename (str) – Filename to save the state to
overwrite (bool) – Whether to overwrite the file if it exists. Default False.
- Return type
None
- property state_service_endpoint
Endpoint URL for posting JSON state
- upload_state_json(json_state, state_id=None, timestamp=None)[source]
Upload a Neuroglancer JSON state
- Parameters
json_state (dict) – Dict representation of a neuroglancer state
state_id (int) – ID of a JSON state uploaded to the state service. Using a state_id is an admin feature.
timestamp (time.time) – Timestamp for json state date. Requires state_id.
- Returns
state_id of the uploaded JSON state
- Return type
int
caveclient.l2cache module
- caveclient.l2cache.L2CacheClient(server_address=None, table_name=None, auth_client=None, api_version='latest', max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, verify=True)[source]
- class caveclient.l2cache.L2CacheClientLegacy(server_address, auth_header, api_version, endpoints, server_name, table_name=None, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, verify=True)[source]
Bases:
ClientBase
- property attributes
- cache_metadata()[source]
Retrieves the metadata for the cache
- Returns
keys are attribute names, values are datatypes
- Return type
dict
- property default_url_mapping
- get_l2data(l2_ids, attributes=None)[source]
Gets the attribute statistics data for L2 IDs.
- Parameters
l2_ids (list or np.ndarray) – a list of level 2 ids
attributes (list, optional) – a list of attributes to retrieve. Defaults to None which will return all that are available. Available stats are [‘area_nm2’, ‘chunk_intersect_count’, ‘max_dt_nm’, ‘mean_dt_nm’, ‘pca’, ‘pca_val’, ‘rep_coord_nm’, ‘size_nm3’]. See docs for more description.
- Returns
keys are l2 ids, values are data
- Return type
dict
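A sketch of aggregating the returned statistics (IDs and values below are placeholders, shaped as documented above):

```python
# Illustrative return value of get_l2data(l2_ids, attributes=["size_nm3"]);
# keys are level 2 IDs, values hold the requested attributes (placeholders).
l2_data = {
    "161107737895240": {"size_nm3": 1200000},
    "161107737895241": {"size_nm3": 800000},
}

# Total volume across the queried level 2 chunks.
total_volume_nm3 = sum(v["size_nm3"] for v in l2_data.values())
```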
- has_cache(datastack_name=None)[source]
Checks if the l2 cache is available for the dataset
- Parameters
datastack_name (str, optional) – The name of the datastack to check, by default None (if None, uses the client’s datastack)
- Returns
True if the l2 cache is available, False otherwise
- Return type
bool
caveclient.materializationengine module
- caveclient.materializationengine.MaterializationClient(server_address, datastack_name=None, auth_client=None, cg_client=None, synapse_table=None, api_version='latest', version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, desired_resolution=None, over_client=None)[source]
Factory for returning MaterializationClient
- Parameters
server_address (str) – server_address to use to connect to (i.e. https://minniev1.microns-daf.com)
datastack_name (str) – Name of the datastack.
auth_client (AuthClient or None, optional) – Authentication client to use to connect to server. If None, do not use authentication.
api_version (str or int (default: latest)) – What version of the api to use, 0: Legacy client (i.e www.dynamicannotationframework.com) 2: new api version, (i.e. minniev1.microns-daf.com) ‘latest’: default to the most recent (current 2)
cg_client (caveclient.chunkedgraph.ChunkedGraphClient) – chunkedgraph client for live materializations
synapse_table (str) – default synapse table for queries
version (int or None, optional) – Default version to query. If None, defaults to the latest version.
desired_resolution (Iterable[float] or None, optional) – If given, should be a list or array of the desired resolution you want queries returned in; useful for materialization queries.
- Returns
A client for the materialization engine.
- Return type
MaterializatonClientV2
- class caveclient.materializationengine.MaterializatonClientV2(server_address, auth_header, api_version, endpoints, server_name, datastack_name, cg_client=None, synapse_table=None, version=None, verify=True, max_retries=None, pool_maxsize=None, pool_block=None, over_client=None, desired_resolution=None)[source]
Bases:
ClientBase
- property datastack_name
- get_annotation_count(table_name: str, datastack_name=None, version=None)[source]
Get number of annotations in a table
- Parameters
table_name (str) – Name of the table to query.
datastack_name (str or None, optional,) – Name of the datastack_name. If None, uses the one specified in the client.
version (int or None, optional) – the version to query, else get the tables in the most recent version
- Returns
number of annotations
- Return type
int
- get_table_metadata(table_name: str, datastack_name=None, version: int = None, log_warning: bool = True)[source]
Get metadata about a table
- Parameters
table_name (str) – Name of the table to query.
datastack_name – str or None, optional, Name of the datastack_name. If None, uses the one specified in the client.
version (int, optional) – version to get. If None, uses the one specified in the client.
log_warning (bool, optional) – whether to print out warnings to the logger. Defaults to True.
- Returns
metadata dictionary for table
- Return type
dict
- get_tables(datastack_name=None, version=None)[source]
Gets a list of table names for a datastack
- Parameters
datastack_name (str or None, optional) – Name of the datastack, by default None. If None, uses the one specified in the client. Will be set correctly if you are using the framework_client
version (int or None, optional) – the version to query, else get the tables in the most recent version
- Returns
List of table names
- Return type
list
- get_timestamp(version: int = None, datastack_name: str = None)[source]
Get datetime.datetime timestamp for a materialization version.
- Parameters
version (int or None, optional) – Materialization version, by default None. If None, defaults to the value set in the client.
datastack_name (str or None, optional) – Datastack name, by default None. If None, defaults to the value set in the client.
- Returns
Datetime when the materialization version was frozen.
- Return type
datetime.datetime
- get_version_metadata(version: int = None, datastack_name: str = None)[source]
Get metadata about a version.
- Parameters
version (int, optional) – version number to get metadata about. Defaults to client default version.
datastack_name (str, optional) – datastack to query. Defaults to client default datastack.
- get_versions(datastack_name=None, expired=False)[source]
get versions available
- Parameters
datastack_name (str, optional) – datastack to query. If None, uses the one specified in the client.
expired (bool, optional) – whether to include expired versions. Defaults to False.
- get_versions_metadata(datastack_name=None, expired=False)[source]
get the metadata for all the versions that are presently available and valid
- Parameters
datastack_name (str, optional) – datastack to query. If None, defaults to the value set in the client.
expired (bool, optional) – whether to include expired versions. Defaults to False.
- Returns
a list of metadata dictionaries
- Return type
list[dict]
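A common pattern is to pick the most recent valid version from the returned metadata. The dictionaries below are illustrative stand-ins for real server output, which carries more fields.

```python
# Stand-in for client.materialize.get_versions_metadata() output (illustrative).
versions_md = [
    {"version": 650, "valid": False},
    {"version": 660, "valid": True},
    {"version": 661, "valid": True},
]
# Pick the newest version still marked valid.
latest = max(v["version"] for v in versions_md if v["valid"])
print(latest)  # 661
```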
- property homepage
- ingest_annotation_table(table_name: str, datastack_name: str = None)[source]
Trigger supervoxel lookup and root ID lookup of new annotations in a table.
- Parameters
table_name (str) – table to trigger
datastack_name (str, optional) – datastack to trigger it in. Defaults to what is set in the client.
- Returns
status code of response from server
- Return type
response
- join_query(tables, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, suffixes: list = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, metadata: bool = True, desired_resolution: Iterable = None, random_sample: int = None)[source]
generic query on materialization tables
- Args:
- tables: list of lists of length 2, or str
- each inner list has two entries: the first is a table name, the second
is the column used for the join
- filter_in_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are allowed entries. Defaults to None.
- filter_out_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are not allowed entries. Defaults to None.
- filter_equal_dict (dict of dicts, optional):
outer layer: keys are table names inner layer: keys are column names, values are specified entry. Defaults to None.
- filter_spatial_dict (dict of dicts, optional):
outer layer: keys are table names; inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y, min_z], [max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset.
Defaults to None
- filter_regex_dict (dict of dicts, optional):
outer layer: keys are table names: inner layer: keys are column names, values are regex strings Defaults to None
- select_columns (dict of lists of str, optional): keys are table names, values are the list of columns from that table.
Defaults to None, which selects all columns. Will be passed to the server as select_column_maps. Passing a plain list is passed as select_columns, which is deprecated.
- offset (int, optional): result offset to use. Defaults to None,
which will only return top K results.
- limit (int, optional): maximum results to return (server will set upper limit, see get_server_config)
- suffixes (dict, optional): suffixes to use for duplicate columns; keys are table names, values are the suffix
- datastack_name (str, optional): datastack to query.
If None, defaults to the one specified in the client.
- return_df (bool, optional): whether to return as a dataframe
default True, if False, data is returned as json (slower)
- split_positions (bool, optional): whether to break position columns into x,y,z columns
default False, if False data is returned as one column with [x,y,z] array (slower)
- materialization_version (int, optional): version to query.
If None defaults to one specified in client.
- metadata (bool, optional): toggle to return metadata
If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
- desired_resolution (Iterable, optional):
What resolution to convert position columns to. Defaults to None, which will use the client defaults.
- random_sample (int, optional): if given, will do a tablesample of the table to return that many annotations
- Returns
a pandas dataframe of results of query
- Return type
pd.DataFrame
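A minimal sketch of assembling join_query arguments; the table and column names below are hypothetical, and the client call itself is commented out because it requires a live server.

```python
# Each inner list is [table_name, join_column] (hypothetical names).
tables = [
    ["synapse_table", "post_pt_root_id"],
    ["cell_type_table", "pt_root_id"],
]
# Restrict the join to one (made-up) root ID on the cell-type side.
filter_in_dict = {"cell_type_table": {"pt_root_id": [648518346349539896]}}
# Suffixes disambiguate duplicate column names across the joined tables.
suffixes = {"synapse_table": "_syn", "cell_type_table": "_ct"}
# df = client.materialize.join_query(
#     tables,
#     filter_in_dict=filter_in_dict,
#     suffixes=suffixes,
# )
```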
- live_live_query(table: str, timestamp: datetime, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, metadata: bool = True, suffixes: dict = None, desired_resolution: Iterable = None, allow_missing_lookups: bool = False, random_sample: int = None)[source]
Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selections to tables, as it is designed to test a more general endpoint that should eventually support complex joins.
- Parameters
table (str) – principal table to query
timestamp (datetime) – timestamp to use for querying
joins (list) – a list of joins, where each join is a list of [table1,column1, table2, column2]
filter_in_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to accept. Defaults to None.
filter_out_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to reject. Defaults to None.
filter_equal_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values to equate. Defaults to None.
filter_spatial_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds. Defaults to None.
filter_regex_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of regex strings. Defaults to None.
select_columns (dict, optional) – a dictionary with tables as keys, values are lists of columns. Defaults to None.
offset (int, optional) – value to offset query by. Defaults to None.
limit (int, optional) – limit of query. Defaults to None.
datastack_name (str, optional) – datastack to query. Defaults to set by client.
split_positions (bool, optional) – whether to split positions into separate columns; True is faster. Defaults to False.
metadata (bool, optional) – whether to attach metadata to dataframe. Defaults to True.
suffixes (dict, optional) – what suffixes to use on joins, keys are table_names, values are suffixes. Defaults to None.
desired_resolution (Iterable, optional) – What resolution to convert position columns to. Defaults to None, which will use the client defaults.
allow_missing_lookups (bool, optional) – If there are annotations without supervoxels and rootids yet, allow results. Defaults to False.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Example
live_live_query("table_name", datetime.datetime.now(datetime.timezone.utc),
joins=[["table_name", "table_column", "joined_table", "joined_column"],
["joined_table", "joincol2", "third_table", "joincol_third"]],
suffixes={"table_name": "suffix1", "joined_table": "suffix2", "third_table": "suffix3"},
select_columns={"table_name": ["column", "names"], "joined_table": ["joined_column"]},
filter_in_dict={"table_name": {"column_name": [included, values]}},
filter_out_dict={"table_name": {"column_name": [excluded, values]}},
filter_equal_dict={"table_name": {"column_name": value}},
filter_spatial_dict={"table_name": {"column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]}},
filter_regex_dict={"table_name": {"column_name": "regex_string"}})
- Returns
result of query
- Return type
pd.DataFrame
- live_query(table: str, timestamp: datetime, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, post_filter: bool = True, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, random_sample: int = None)[source]
generic query on materialization tables
- Parameters
table (str) – table to query
timestamp (datetime.datetime) – time to materialize (in utc) pass datetime.datetime.now(datetime.timezone.utc) for present time
filter_in_dict (dict , optional) – keys are column names, values are allowed entries. Defaults to None.
filter_out_dict (dict, optional) – keys are column names, values are not allowed entries. Defaults to None.
filter_equal_dict (dict, optional) – inner layer: keys are column names, values are specified entry. Defaults to None.
filter_spatial_dict (dict, optional) – keys are column names, values are bounding boxes as [[min_x, min_y, min_z], [max_x, max_y, max_z]], expressed in units of the voxel_resolution of this dataset. Defaults to None
filter_regex_dict (dict, optional) – inner layer: keys are column names, values are regex strings
offset (int, optional) – offset in query result
limit (int, optional) – maximum results to return (server will set upper limit, see get_server_config)
select_columns (list of str, optional) – columns to select. Defaults to None.
suffixes (list[str], optional) – suffixes to use on duplicate columns
datastack_name (str, optional) – datastack to query. If None defaults to one specified in client.
split_positions (bool, optional) – whether to break position columns into x,y,z columns default False, if False data is returned as one column with [x,y,z] array (slower)
post_filter (bool, optional) – whether to filter the result down based on the specified filters. If False, the query returns present root_ids in the root_id columns, but the rows reflect the filters translated into their past IDs. For example, if a cell had a false merger split off since the last materialization, annotations on that incorrect portion of the cell will be included when this is False, but filtered out when this is True. (Default = True)
metadata (bool, optional) – toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
merge_reference (bool, optional) – toggle to automatically join reference tables. If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
desired_resolution (Iterable[float], optional) – desired resolution for all returned spatial points. If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
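A hedged sketch of a live query at the present moment; the table name and filter column are hypothetical, and the call is commented out since it requires a server.

```python
import datetime

# Live queries are evaluated at an explicit UTC timestamp.
now = datetime.datetime.now(datetime.timezone.utc)
filter_equal_dict = {"cell_type": "pyramidal"}  # hypothetical column/value
# df = client.materialize.live_query(
#     "cell_annotations",          # hypothetical table name
#     timestamp=now,
#     filter_equal_dict=filter_equal_dict,
#     split_positions=True,
# )
```

Passing a timezone-aware timestamp avoids ambiguity about what moment is being materialized.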
- lookup_supervoxel_ids(table_name: str, annotation_ids: list = None, datastack_name: str = None)[source]
Trigger supervoxel lookups of new annotations in a table.
- Parameters
table_name (str) – table to trigger
annotation_ids – (list, optional): list of annotation ids to lookup. Default is None, which will trigger lookup of entire table.
datastack_name (str, optional) – datastack to trigger it. Defaults to what is set in client.
- Returns
status code of response from server
- Return type
response
- map_filters(filters, timestamp, timestamp_past)[source]
- translate a list of filter dictionaries
from a point in the future to a point in the past
- Parameters
filters (list[dict]) – filter dictionaries to translate
timestamp (datetime.datetime) – timestamp the filters refer to
timestamp_past (datetime.datetime) – past timestamp to translate the filters to
- Returns
filter dictionaries with root IDs translated to the past timestamp
- Return type
list[dict]
- most_recent_version(datastack_name=None)[source]
get the most recent version of materialization for this datastack name
- Parameters
datastack_name (str, optional) – datastack name to find the most recent materialization of. If None, uses the one specified in the client.
- query_table(table: str, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, timestamp: datetime = None, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, get_counts: bool = False, random_sample: int = None)[source]
generic query on materialization tables
- Parameters
table (str) – table to query
filter_in_dict (dict , optional) – keys are column names, values are allowed entries. Defaults to None.
filter_out_dict (dict, optional) – keys are column names, values are not allowed entries. Defaults to None.
filter_equal_dict (dict, optional) – inner layer: keys are column names, values are specified entry. Defaults to None.
filter_spatial_dict (dict, optional) –
- inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y,min_z],[max_x, max_y, max_z]] Expressed in units of the voxel_resolution of this dataset.
filter_regex_dict (dict, optional) – inner layer: keys are column names, values are regex strings
offset (int, optional) – offset in query result
limit (int, optional) – maximum results to return (server will set upper limit, see get_server_config)
select_columns (list of str, optional) – columns to select. Defaults to None.
suffixes (list[str], optional) – suffixes to use on duplicate columns
datastack_name (str, optional) – datastack to query. If None defaults to one specified in client.
return_df (bool, optional) – whether to return as a dataframe default True, if False, data is returned as json (slower)
split_positions (bool, optional) – whether to break position columns into x,y,z columns default False, if False data is returned as one column with [x,y,z] array (slower)
materialization_version (int, optional) – version to query. If None defaults to one specified in client.
timestamp (datetime.datetime, optional) – timestamp to query. If passed, will do a live query. Error if also passing a materialization_version
metadata (bool, optional) – toggle to return metadata (default True). If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
merge_reference (bool, optional) – toggle to automatically join reference tables. If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
desired_resolution (Iterable[float], optional) – desired resolution for all returned spatial points. If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample (int, optional) – if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
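A sketch of combining ID and spatial filters for query_table; the table name, column names, and root ID are hypothetical, and the client call is commented out since it needs a live server.

```python
# Spatial filters take a bounding box in the datastack's voxel resolution.
bounding_box = [[0, 0, 0], [1000, 1000, 500]]  # [[min_x, min_y, min_z], [max_x, max_y, max_z]]
filter_spatial_dict = {"pt_position": bounding_box}        # hypothetical column
filter_in_dict = {"pt_root_id": [648518346349539896]}      # hypothetical root ID
# df = client.materialize.query_table(
#     "cell_annotations",               # hypothetical table name
#     filter_in_dict=filter_in_dict,
#     filter_spatial_dict=filter_spatial_dict,
#     desired_resolution=[4, 4, 40],    # convert returned points to this resolution
#     split_positions=True,             # x/y/z as separate columns (faster)
# )
```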
- synapse_query(pre_ids: Union[int, Iterable, ndarray] = None, post_ids: Union[int, Iterable, ndarray] = None, bounding_box=None, bounding_box_column: str = 'post_pt_position', timestamp: datetime = None, remove_autapses: bool = True, include_zeros: bool = True, limit: int = None, offset: int = None, split_positions: bool = False, desired_resolution: Iterable[float] = None, materialization_version: int = None, synapse_table: str = None, datastack_name: str = None, metadata: bool = True)[source]
Convenience method for querying synapses. Will use the synapse table specified in the info service by default. It will also remove autapses by default. NOTE: This is not designed to allow querying of the entire synapse table. A query with no filters will return only a limited number of rows (configured by the server) and will do so in a non-deterministic fashion. Please contact your dataset administrator if you want access to the entire table.
- Parameters
pre_ids (int or Iterable, optional) – presynaptic cell(s) to query. Defaults to None.
post_ids (int or Iterable, optional) – postsynaptic cell(s) to query. Defaults to None.
timestamp (datetime.datetime, optional) – timestamp to query. If passed, recalculate the query at that timestamp; do not pass together with materialization_version
bounding_box – [[min_x, min_y, min_z],[max_x, max_y, max_z]] bounding box to filter synapse locations. Expressed in units of the voxel_resolution of this dataset (optional)
bounding_box_column (str, optional) – which synapse location column to filter by (Default to “post_pt_position”)
remove_autapses (bool, optional) – post-hoc filter out synapses. Defaults to True.
include_zeros (bool, optional) – whether to include synapses to/from id=0 (out of segmentation). Defaults to True.
limit (int, optional) – number of synapses to limit, Defaults to None (server side limit applies)
offset (int, optional) – number of synapses to offset query, Defaults to None (no offset).
split_positions (bool, optional) – whether to return positions as separate x, y, z columns (faster). Defaults to False.
desired_resolution (Iterable[float] or None, optional) – If given, a list or array of the desired resolution for returned points; useful for materialization queries.
synapse_table (str, optional) – synapse table to query. If None, defaults to self.synapse_table.
datastack_name – (str, optional): datastack to query
materialization_version (int, optional) – version to query. defaults to self.materialization_version if not specified
metadata (bool, optional) – toggle to return metadata. If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
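For instance, querying synapses onto one postsynaptic cell within a bounding box might look like the sketch below; the root ID and bounds are hypothetical, and the call is commented out since it needs a server.

```python
# Sketch: synapses onto one postsynaptic cell, restricted to a bounding box.
post_id = 648518346349539896             # hypothetical root ID
bbox = [[100, 100, 10], [200, 200, 20]]  # voxel units of this datastack
# syn_df = client.materialize.synapse_query(
#     post_ids=post_id,
#     bounding_box=bbox,
#     bounding_box_column="post_pt_position",
#     remove_autapses=True,   # drop self-synapses (the default)
# )
```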
- property tables
- property version
- property views
- class caveclient.materializationengine.MaterializatonClientV3(*args, **kwargs)[source]
Bases:
MaterializatonClientV2
- get_tables_metadata(datastack_name=None, version: int = None, log_warning: bool = True)[source]
Get metadata about all tables in a datastack
- Parameters
datastack_name (str or None, optional) – Name of the datastack. If None, uses the one specified in the client.
version (int, optional) – version to get. If None, uses the one specified in the client.
log_warning (bool, optional) – whether to print out warnings to the logger. Defaults to True.
- Returns
metadata dictionary for table
- Return type
dict
- get_unique_string_values(table: str, datastack_name: str = None)[source]
get unique string values for a table
- Parameters
table (str) – table to query
datastack_name (str, optional) – datastack to query. If None, defaults to the one specified in the client.
- Returns
a dictionary of column names and unique values
- Return type
dict[str]
- get_view_metadata(view_name: str, materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get metadata for a view
- Parameters
view_name (str) – name of view to query
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
metadata of view
- Return type
dict
- get_view_schema(view_name: str, materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get schema for a view
- Parameters
view_name (str) – name of view to query
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
schema of view
- Return type
dict
- get_view_schemas(materialization_version: int = None, datastack_name: str = None, log_warning: bool = True)[source]
get schemas for all views
- Parameters
materialization_version (int, optional) – version to query. Defaults to None. (will use version set by client)
log_warning (bool, optional) – whether to log warnings. Defaults to True.
- Returns
schemas of all views
- Return type
dict
- get_views(version: int = None, datastack_name: str = None)[source]
get all available views for a version
- Parameters
version (int, optional) – version to query. Defaults to None. (will use version set by client)
datastack_name (str, optional) – datastack to query. Defaults to None. (will use datastack set by client)
- Returns
a list of views
- Return type
list[str]
- live_live_query(table: str, timestamp: datetime, joins=None, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, split_positions: bool = False, metadata: bool = True, suffixes: dict = None, desired_resolution: Iterable = None, allow_missing_lookups: bool = False, allow_invalid_root_ids: bool = False, random_sample: int = None)[source]
Beta method for querying CAVE annotation tables with root IDs and annotations at a particular timestamp. Note: this method requires more explicit mapping of filters and selections to tables, as it is designed to test a more general endpoint that should eventually support complex joins.
- Parameters
table (str) – principal table to query
timestamp (datetime) – timestamp to use for querying
joins (list) – a list of joins, where each join is a list of [table1,column1, table2, column2]
filter_in_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to accept. Defaults to None.
filter_out_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and list values to reject. Defaults to None.
filter_equal_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values to equate. Defaults to None.
filter_spatial_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of 2x3 list of bounds. Defaults to None.
filter_regex_dict (dict, optional) – a dictionary with tables as keys, values are dicts with column keys and values of regex strings. Defaults to None.
select_columns (dict, optional) – a dictionary with tables as keys, values are lists of columns. Defaults to None.
offset (int, optional) – value to offset query by. Defaults to None.
limit (int, optional) – limit of query. Defaults to None.
datastack_name (str, optional) – datastack to query. Defaults to set by client.
split_positions (bool, optional) – whether to split positions into separate columns; True is faster. Defaults to False.
metadata (bool, optional) – whether to attach metadata to dataframe. Defaults to True.
suffixes (dict, optional) – what suffixes to use on joins, keys are table_names, values are suffixes. Defaults to None.
desired_resolution (Iterable, optional) – What resolution to convert position columns to. Defaults to None, which will use the client defaults.
allow_missing_lookups (bool, optional) – If there are annotations without supervoxels and rootids yet, allow results. Defaults to False.
allow_invalid_root_ids (bool, optional) – If True, ignore root ids not valid at the given timestamp, otherwise raise an Error. Defaults to False.
random_sample (int, optional) – If given, will do a tablesample of the table to return that many annotations
Example
live_live_query("table_name", datetime.datetime.now(datetime.timezone.utc),
joins=[["table_name", "table_column", "joined_table", "joined_column"],
["joined_table", "joincol2", "third_table", "joincol_third"]],
suffixes={"table_name": "suffix1", "joined_table": "suffix2", "third_table": "suffix3"},
select_columns={"table_name": ["column", "names"], "joined_table": ["joined_column"]},
filter_in_dict={"table_name": {"column_name": [included, values]}},
filter_out_dict={"table_name": {"column_name": [excluded, values]}},
filter_equal_dict={"table_name": {"column_name": value}},
filter_spatial_dict={"table_name": {"column_name": [[min_x, min_y, min_z], [max_x, max_y, max_z]]}},
filter_regex_dict={"table_name": {"column_name": "regex"}})
- Returns
result of query
- Return type
pd.DataFrame
- query_view(view_name: str, filter_in_dict=None, filter_out_dict=None, filter_equal_dict=None, filter_spatial_dict=None, filter_regex_dict=None, select_columns=None, offset: int = None, limit: int = None, datastack_name: str = None, return_df: bool = True, split_positions: bool = False, materialization_version: int = None, metadata: bool = True, merge_reference: bool = True, desired_resolution: Iterable = None, get_counts: bool = False, random_sample: int = None)[source]
generic query on a view
Args: view_name (str): view to query
- filter_in_dict (dict , optional):
keys are column names, values are allowed entries. Defaults to None.
- filter_out_dict (dict, optional):
keys are column names, values are not allowed entries. Defaults to None.
- filter_equal_dict (dict, optional):
inner layer: keys are column names, values are specified entry. Defaults to None.
- filter_spatial_dict (dict, optional):
- inner layer: keys are column names, values are bounding boxes
as [[min_x, min_y,min_z],[max_x, max_y, max_z]] Expressed in units of the voxel_resolution of this dataset.
- filter_regex_dict (dict, optional):
inner layer: keys are column names, values are regex strings.
- offset (int, optional): offset in query result. Defaults to None,
which will only return top K results.
- limit (int, optional): maximum results to return (server will set upper limit, see get_server_config)
- select_columns (list of str, optional): columns to select. Defaults to None.
- suffixes (list[str], optional): suffixes to use on duplicate columns
- datastack_name (str, optional): datastack to query.
If None defaults to one specified in client.
- return_df (bool, optional): whether to return as a dataframe
default True, if False, data is returned as json (slower)
- split_positions (bool, optional): whether to break position columns into x,y,z columns
default False, if False data is returned as one column with [x,y,z] array (slower)
- materialization_version (int, optional): version to query.
If None defaults to one specified in client.
- metadata (bool, optional): toggle to return metadata (default True)
If True (and return_df is also True), return table and query metadata in the df.attrs dictionary.
- merge_reference (bool, optional): toggle to automatically join reference tables
If True, metadata will be queried, and if the table is a reference table, a join on the reference table is performed to return its rows as well.
- desired_resolution (Iterable[float], optional): desired resolution for all returned spatial points
If None, defaults to the one specified in the client; if that is also None, points are returned as stored in the table, in the resolution specified by the table metadata.
random_sample: (int, optional) : if given, will do a tablesample of the table to return that many annotations
Returns: pd.DataFrame: a pandas dataframe of results of query
- caveclient.materializationengine.concatenate_position_columns(df, inplace=False)[source]
function to take a dataframe with x, y, z position columns and replace them with one column per position containing an xyz numpy array. Edits occur in place if inplace is True.
- Parameters
df (pd.DataFrame) – dataframe to alter
inplace (bool) – whether to perform edits in place
- Returns
the altered dataframe
- Return type
pd.DataFrame
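A rough pandas equivalent of what concatenate_position_columns does (a sketch, not the library's actual implementation): collapse split x/y/z columns back into a single column of [x, y, z] values.

```python
import pandas as pd

# Example dataframe with split position columns.
df = pd.DataFrame({
    "pt_position_x": [1, 4],
    "pt_position_y": [2, 5],
    "pt_position_z": [3, 6],
})
cols = ["pt_position_x", "pt_position_y", "pt_position_z"]
# Collapse the three columns into one column of [x, y, z] lists,
# then drop the originals.
df["pt_position"] = df[cols].values.tolist()
df = df.drop(columns=cols)
print(df["pt_position"].tolist())  # [[1, 2, 3], [4, 5, 6]]
```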
- caveclient.materializationengine.convert_position_columns(df, given_resolution, desired_resolution)[source]
function to take a dataframe with x,y,z position columns and convert them to the desired resolution from the given resolution
- Parameters
df (pd.DataFrame) – dataframe to alter
given_resolution (Iterable[float]) – what the given resolution is
desired_resolution (Iterable[float]) – what the desired resolution is
- Returns
the converted dataframe
- Return type
pd.DataFrame
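A sketch of the per-axis rescaling convert_position_columns performs (an approximation; the resolutions shown are illustrative): points are multiplied by the given resolution and divided by the desired one.

```python
import numpy as np
import pandas as pd

given_resolution = np.array([8, 8, 40])    # nm per voxel, as stored
desired_resolution = np.array([4, 4, 40])  # nm per voxel, as requested
df = pd.DataFrame({"pt_position": [np.array([10, 20, 5])]})
# Rescale each point: multiply by the stored resolution, divide by the
# desired one, axis by axis.
df["pt_position"] = df["pt_position"].apply(
    lambda p: p * given_resolution / desired_resolution
)
print(df["pt_position"].iloc[0])  # scaled to [20, 40, 5]
```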
caveclient.session_config module
- caveclient.session_config.patch_session(session, max_retries=None, pool_block=None, pool_maxsize=None)[source]
Patch session to configure retry and poolsize options
- Parameters
session (requests session) – Session to modify
max_retries (Int or None, optional) – Set the number of retries per request, by default None. If None, defaults to requests package default.
pool_block (Bool or None, optional) – If True, restricts pool of threads to max size, by default None. If None, defaults to requests package default.
pool_maxsize (Int or None, optional) – Sets the max number of threads in the pool, by default None. If None, defaults to requests package default.
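Approximately what patch_session does, expressed directly against the requests API (a sketch; patch_session's exact internals may differ): mount an HTTPAdapter with custom retry and connection-pool settings onto an existing session.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
adapter = HTTPAdapter(
    max_retries=Retry(total=3),  # retry each request up to 3 times
    pool_maxsize=20,             # up to 20 pooled connections per host
    pool_block=True,             # block rather than discard when the pool is full
)
# Mount the configured adapter for both schemes.
session.mount("http://", adapter)
session.mount("https://", adapter)
```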