A Python client for Sage Bionetworks’ Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate. The Python client can be used as a library for development of software that communicates with Synapse or as a command-line utility.
# Here are a few ways to install the client. Choose the one that fits your use-case
# sudo may optionally be needed depending on your setup
pip install --upgrade synapseclient
pip install --upgrade "synapseclient[pandas]"
pip install --upgrade "synapseclient[pandas, pysftp, boto3]"
…or to upgrade an existing installation of the Synapse client:
# sudo may optionally be needed depending on your setup
pip install --upgrade synapseclient
The dependencies on pandas, pysftp, and boto3 are optional. Synapse Tables integrate with pandas. The library pysftp is required for users of SFTP file storage. All libraries require native code to be compiled or installed separately from prebuilt binaries.
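Because these dependencies are optional, it can help to check which of them are importable in the current environment before relying on the features they back. The following is a minimal sketch using only the standard library; the helper name `available_extras` is hypothetical and is not part of synapseclient.

```python
import importlib.util

# Optional extras named in the install commands above, mapped to the
# top-level module each one provides.
OPTIONAL_MODULES = {"pandas": "pandas", "pysftp": "pysftp", "boto3": "boto3"}


def available_extras(modules=OPTIONAL_MODULES):
    """Return {extra_name: bool} telling whether each module can be found."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in modules.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```

A missing entry only means the related feature (for example, SFTP storage for pysftp) is unavailable until the matching extra is installed.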
The Synapse client can be used from the shell command prompt. Valid commands
include: query, get, cat, add, update, delete, and onweb. A few examples are
shown.
The Synapse client can be used to write software that interacts with the Sage Bionetworks Synapse repository. More examples can be found in the Tutorial section of the documentation.
Examples
Log-in and create a Synapse object
import synapseclient
syn = synapseclient.Synapse()
## You may optionally specify the debug flag to True to print out debug level messages.
## A debug level may help point to issues in your own code, or uncover a bug within ours.
# syn = synapseclient.Synapse(debug=True)
## log in using auth token
syn.login(authToken='auth_token')
Sync a local directory to Synapse
This is the recommended way to synchronize more than one file or directory to a Synapse project, using synapseutils. The library handles the scheduling required to sync an entire directory tree. Read more about the manifest file format in the synapseutils.syncToSynapse documentation.
import synapseclient
import synapseutils
import os
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
path = os.path.expanduser("~/synapse_project")
manifest_path = f"{path}/my_project_manifest.tsv"
project_id = "syn1234"
# Create the manifest file on disk
with open(manifest_path, "w", encoding="utf-8") as f:
    pass
# Walk the specified directory tree and create a TSV manifest file
synapseutils.generate_sync_manifest(
    syn,
    directory_path=path,
    parent_id=project_id,
    manifest_path=manifest_path,
)
# Using the generated manifest file, sync the files to Synapse
synapseutils.syncToSynapse(
    syn,
    manifestFile=manifest_path,
    sendMessages=False,
)
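Since the manifest is a plain TSV file, it can be inspected with the standard library before it is handed to syncToSynapse. The sketch below assumes the manifest has columns named `path` and `parent`; those column names and the helpers `read_manifest` and `missing_paths` are assumptions made for illustration, so check the manifest format documentation for the authoritative layout.

```python
import csv
import os
import tempfile


def read_manifest(manifest_path):
    """Return the manifest rows as dictionaries keyed by column name."""
    with open(manifest_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


def missing_paths(rows, path_column="path"):
    """Return the path values that are not files on local disk."""
    return [row[path_column] for row in rows if not os.path.isfile(row[path_column])]


def demo():
    """Build a two-row manifest in a temporary directory and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        present = os.path.join(tmp, "present.txt")
        absent = os.path.join(tmp, "absent.txt")
        with open(present, "w", encoding="utf-8") as f:
            f.write("example\n")
        manifest = os.path.join(tmp, "manifest.tsv")
        with open(manifest, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f, delimiter="\t")
            writer.writerow(["path", "parent"])  # assumed column names
            writer.writerow([present, "syn1234"])
            writer.writerow([absent, "syn1234"])
        rows = read_manifest(manifest)
        return len(rows), [os.path.basename(p) for p in missing_paths(rows)]


print(demo())
```

Running a check like this first surfaces stale rows locally instead of partway through an upload.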
Store a Project to Synapse
import synapseclient
from synapseclient.models import Project
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
project = Project('My uniquely named project')
project.store()
print(project.id)
print(project)
Store a Folder to Synapse (Does not upload files within the folder)
import synapseclient
from synapseclient.models import Folder
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
folder = Folder(name='my_folder', parent_id="syn123")
folder.store()
print(folder.id)
print(folder)
Store a File to Synapse
import synapseclient
from synapseclient.models import File
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
file = File(
    path="path/to/file.txt",
    parent_id="syn123",
)
file.store()
print(file.id)
print(file)
Get a data matrix
import synapseclient
from synapseclient.models import File
syn = synapseclient.Synapse()
## log in using auth token
syn.login(authToken='auth_token')
## retrieve a 100 by 4 matrix
matrix = File(id='syn1901033').get()
## inspect its properties
print(matrix.name)
print(matrix.description)
print(matrix.path)
## load the data matrix into a dictionary with an entry for each column
with open(matrix.path, 'r') as f:
    labels = f.readline().strip().split('\t')
    data = {label: [] for label in labels}
    for line in f:
        values = [float(x) for x in line.strip().split('\t')]
        for i in range(len(labels)):
            data[labels[i]].append(values[i])
## load the data matrix into a numpy array
import numpy as np
np.loadtxt(fname=matrix.path, skiprows=1)
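The column-wise parsing above can be tried without downloading anything. The following sketch applies the same idea to an in-memory, tab-separated sample using the standard csv module; the function name `load_matrix` and the sample values are made up for illustration.

```python
import csv
import io


def load_matrix(lines):
    """Parse tab-separated text with a header row into {column: [floats]}."""
    reader = csv.reader(lines, delimiter="\t")
    labels = next(reader)  # first row holds the column labels
    columns = {label: [] for label in labels}
    for row in reader:
        if not row:  # skip blank lines
            continue
        for label, value in zip(labels, row):
            columns[label].append(float(value))
    return columns


# A tiny stand-in for the downloaded matrix file
sample = io.StringIO("a\tb\n1.0\t2.0\n3.5\t4.5\n")
print(load_matrix(sample))  # {'a': [1.0, 3.5], 'b': [2.0, 4.5]}
```

The same function accepts an open file handle, for example the one obtained by opening `matrix.path`.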
The purpose of synapseutils is to provide convenience functions for tasks such as traversing large projects, copying entities, and recursively downloading files.
Example
import synapseutils
import synapseclient
syn = synapseclient.login()
# copies all Synapse entities to a destination location
synapseutils.copy(syn, "syn1234", destinationId = "syn2345")
# copies the wiki from the entity to a destination entity. Only a project can have sub wiki pages.
synapseutils.copyWiki(syn, "syn1234", destinationId = "syn2345")
# Traverses through Synapse directories, behaves exactly like os.walk()
walkedPath = synapseutils.walk(syn, "syn1234")
for dirpath, dirnames, filenames in walkedPath:
    print(dirpath)
    print(dirnames)
    print(filenames)
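For readers less familiar with os.walk(), which the comment above uses as the reference behavior, here is a small local illustration of the three values yielded at each step. It only uses the standard library on a temporary directory; the helper `list_tree` and the file names are hypothetical, and it does not show the exact values synapseutils.walk returns.

```python
import os
import tempfile


def list_tree(root):
    """Collect (relative dirpath, dirnames, filenames) for every directory under root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the traversal order deterministic
        entries.append((os.path.relpath(dirpath, root), list(dirnames), sorted(filenames)))
    return entries


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "raw", "batch1"))
    open(os.path.join(root, "README.txt"), "w").close()
    open(os.path.join(root, "raw", "batch1", "data.tsv"), "w").close()
    tree = list_tree(root)

for entry in tree:
    print(entry)
```

Each step yields the current directory, the names of its subdirectories, and the names of its files, top-down from the root.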
OpenTelemetry (OTEL)
OpenTelemetry helps support the analysis of traces and spans which can provide insights into latency, errors, and other performance metrics. The synapseclient is ready to provide traces should you want them. The Synapse Python client supports OTLP Exports and can be configured via environment variables as defined here.
Once the docker container is running you can access the Jaeger UI via: http://localhost:16686
Environment Variable Configuration
By default, the OTEL exporter sends trace data to http://localhost:4318/v1/traces. You can customize the behavior through environment variables:
OTEL_SERVICE_NAME: Defines a unique identifier for your application or service in telemetry data (defaults to ‘synapseclient’). Set this to a descriptive name that represents your specific implementation, making it easier to filter and analyze traces in your monitoring system.
OTEL_EXPORTER_OTLP_ENDPOINT: Specifies the destination URL for sending telemetry data (defaults to ‘http://localhost:4318’). Configure this to direct data to your preferred OpenTelemetry collector or monitoring service.
OTEL_DEBUG_CONSOLE: Controls local visibility of telemetry data. Set to ‘true’ to output trace information to the console, which is useful for development and troubleshooting without an external collector.
OTEL_SERVICE_INSTANCE_ID: Distinguishes between multiple instances of the same service (e.g., ‘prod’, ‘development’, ‘local’). This helps identify which specific deployment or environment generated particular traces.
OTEL_EXPORTER_OTLP_HEADERS: Configures authentication and metadata for telemetry exports. Use this to add API keys, authentication tokens, or custom metadata when sending traces to secured collectors or third-party monitoring services.
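To make the defaults listed above concrete, the sketch below resolves the same variables from an environment mapping and falls back to the documented defaults. The helper `otel_settings` is hypothetical and only mirrors the description in this README; it is not how the client itself reads its configuration.

```python
import os


def otel_settings(env=None):
    """Resolve the OTEL variables described above, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "service_name": env.get("OTEL_SERVICE_NAME", "synapseclient"),
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"),
        # Treated as enabled only when set to 'true' (case-insensitive in this sketch)
        "debug_console": env.get("OTEL_DEBUG_CONSOLE", "").strip().lower() == "true",
        # No default is documented for these two, so None means "not set"
        "service_instance_id": env.get("OTEL_SERVICE_INSTANCE_ID"),
        "headers": env.get("OTEL_EXPORTER_OTLP_HEADERS"),
    }


if __name__ == "__main__":
    for name, value in otel_settings().items():
        print(f"{name} = {value!r}")
```

Printing the resolved values before enabling telemetry is a quick way to confirm which collector the traces would be sent to.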
Enabling OpenTelemetry in your code
To enable OpenTelemetry with the Synapse Python client, simply call the
enable_open_telemetry() method on the Synapse class. Additionally you can access an
instance of the OpenTelemetry tracer via the get_tracer() call. This will allow you
to create new spans for your code.
import synapseclient
# Enable OpenTelemetry with default settings
synapseclient.Synapse.enable_open_telemetry()
tracer = synapseclient.Synapse.get_tracer()
# Then create and use the Synapse client as usual
with tracer.start_as_current_span("my_function_span"):
    syn = synapseclient.Synapse()
    syn.login(authToken='auth_token')
Exporting Synapse Client Traces to SigNoz Cloud for developers
Prerequisites
Create an account and obtain access to SigNoz Cloud.
Create an ingestion key by following the step here.
Environment Variable Configuration
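The following environment variables are required to be set:
OTEL_EXPORTER_OTLP_HEADERS: signoz-ingestion-key=<key>
OTEL_EXPORTER_OTLP_ENDPOINT: https://ingest.us.signoz.cloud
OTEL_SERVICE_NAME: your-service-name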
Explanation of both required and optional environment variables:
Required
OTEL_EXPORTER_OTLP_ENDPOINT: The OTLP endpoint to which telemetry is exported.
OTEL_EXPORTER_OTLP_HEADERS: Authentication/metadata for exports (e.g., API keys, tokens). For SigNoz, use signoz-ingestion-key=<key>.
Optional
OTEL_SERVICE_NAME: Unique identifier for your app/service in telemetry data (defaults to synapseclient). Use a descriptive name so you can easily filter and analyze traces per service.
OTEL_DEBUG_CONSOLE: Controls local visibility of telemetry data. Set to ‘true’ to output trace information to the console, which is useful for development and troubleshooting without an external collector.
OTEL_SERVICE_INSTANCE_ID: Distinguishes between multiple instances of the same service (e.g., ‘prod’, ‘development’, ‘local’). This helps identify which specific deployment or environment generated particular traces.
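The OTEL_EXPORTER_OTLP_HEADERS value is a comma-separated list of key=value pairs, which is why the SigNoz key is written as signoz-ingestion-key=<key>. The sketch below shows that format by splitting such a string into a dictionary; the function `parse_otlp_headers` is a hypothetical illustration and not the parser used by the exporter.

```python
def parse_otlp_headers(raw):
    """Split a 'key1=value1,key2=value2' string into a dictionary."""
    headers = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:  # tolerate empty segments such as a trailing comma
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        headers[key.strip()] = value.strip()
    return headers


# 'abc123' is a placeholder, not a real ingestion key
print(parse_otlp_headers("signoz-ingestion-key=abc123"))
```

Validating the string this way before exporting catches a missing `=` early, which is easier to diagnose than rejected requests at the collector.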
Enabling OpenTelemetry in your code
To enable OpenTelemetry with the Synapse Python client, simply call the
enable_open_telemetry() method on the Synapse class. Additionally you can access an
instance of the OpenTelemetry tracer via the get_tracer() call. This will allow you
to create new spans for your code.
import os
import synapseclient
# Set environment variables
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://ingest.us.signoz.cloud"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = "signoz-ingestion-key=<your key>"
os.environ["OTEL_SERVICE_NAME"] = "your-service-name"
os.environ["OTEL_SERVICE_INSTANCE_ID"] = "local"
# Enable OpenTelemetry with default settings
synapseclient.Synapse.enable_open_telemetry()
tracer = synapseclient.Synapse.get_tracer()
# Then create and use the Synapse client as usual
with tracer.start_as_current_span("my_function_span"):
    syn = synapseclient.Synapse()
    syn.login(authToken='auth_token')
Advanced Configuration
You can pass additional resource attributes to enable_open_telemetry():
When OpenTelemetry is enabled in the Synapse client, the following happens automatically:
Instrumentation is set up for:
Threading (via ThreadingInstrumentor): Ensures proper context propagation across threads, which is essential for maintaining trace continuity in multi-threaded applications
HTTP libraries:
requests (via RequestsInstrumentor): Captures all HTTP requests made using the requests library, including methods, URLs, status codes, and timing information
httpx (via HTTPXClientInstrumentor): Tracks both synchronous and asynchronous HTTP requests made with the httpx library
urllib (via URLLibInstrumentor): Monitors lower-level HTTP operations made directly with Python’s standard library
Each instrumented HTTP library includes custom hooks that extract Synapse entity IDs from URLs when possible and add them as span attributes
Traces are configured to collect spans across your application:
Spans automatically capture operation duration, status, and errors.
An attribute propagation mechanism ensures that certain attributes (like synapse.transfer.direction and synapse.operation.category) are properly passed to child spans for uploads/downloads.
Trace data is exported via OTLP (OpenTelemetry Protocol).
Resource information is automatically added to your traces, including:
Python version
OS type
Synapse client version
Service name (defaults to “synapseclient” but can be customized via environment variables)
Service instance ID
Note that once enabled, OpenTelemetry cannot be disabled in the same process - you would need to restart your Python interpreter to disable it.
Synapse Python Client
A Python client for Sage Bionetworks’ Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate. The Python client can be used as a library for development of software that communicates with Synapse or as a command-line utility.
There is also a Synapse client for R.
Documentation
For more information about the Python client, see:
For more information about interacting with Synapse, see:
For release information, see:
Installation
The Python Synapse client has been tested on Python 3.10, 3.11, 3.12, 3.13 and 3.14 on macOS, Ubuntu Linux and Windows.
Starting with version 3.0, the Synapse Python client requires Python >= 3.10.
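A script that depends on the client can verify the interpreter version up front. This is a minimal sketch of such a check based on the requirement stated above; the helper `is_supported` is hypothetical and not part of synapseclient.

```python
import sys

# Minimum interpreter version stated in this README
MINIMUM_PYTHON = (3, 10)


def is_supported(version_info=None):
    """Return True when the given (or running) Python meets the minimum version."""
    version = sys.version_info if version_info is None else version_info
    return tuple(version[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current} supported: {is_supported()}")
```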
Install using pip
The Python Synapse Client is on PyPI and can be installed with pip:
…or to upgrade an existing installation of the Synapse client:
The dependencies on pandas, pysftp, and boto3 are optional. Synapse Tables integrate with pandas. The library pysftp is required for users of SFTP file storage. All libraries require native code to be compiled or installed separately from prebuilt binaries.
Install from source
Clone the source code repository.
Alternatively, you can use pip to install a particular branch, commit, or other git reference:
or
Command line usage
The Synapse client can be used from the shell command prompt. Valid commands include: query, get, cat, add, update, delete, and onweb. A few examples are shown.
downloading test data from Synapse
getting help
Note that a Synapse account is required.
Usage as a library
The Synapse client can be used to write software that interacts with the Sage Bionetworks Synapse repository. More examples can be found in the Tutorial section of the documentation.
Authentication
Authentication toward Synapse can be accomplished with the clients using personal access tokens. Learn more about Synapse personal access tokens
Learn about the multiple ways one can login to Synapse.
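Rather than hard-coding a personal access token in a script, it can be read from the environment at run time. The sketch below assumes the token is stored in a variable named SYNAPSE_AUTH_TOKEN; that name and the helper `get_auth_token` are assumptions made for this example, so consult the login documentation for the options the client itself recognizes.

```python
import os


def get_auth_token(env=None, variable="SYNAPSE_AUTH_TOKEN"):
    """Return the token stored in the given environment variable, or raise if unset."""
    env = os.environ if env is None else env
    token = env.get(variable, "").strip()
    if not token:
        raise RuntimeError(f"Set {variable} to a Synapse personal access token")
    return token


# Typical use (not executed here, since it needs synapseclient and a real token):
#   syn = synapseclient.Synapse()
#   syn.login(authToken=get_auth_token())
```

Keeping the token out of source files also keeps it out of version control and shared notebooks.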
Synapse Utilities (synapseutils)
The purpose of synapseutils is to provide convenience functions for tasks such as traversing large projects, copying entities, and recursively downloading files.
Example
OpenTelemetry (OTEL)
OpenTelemetry helps support the analysis of traces and spans which can provide insights into latency, errors, and other performance metrics. The synapseclient is ready to provide traces should you want them. The Synapse Python client supports OTLP Exports and can be configured via environment variables as defined here.
Read more about OpenTelemetry in Python here
Exporting Synapse Client Traces to Jaeger for developers
The following shows an example of setting up Jaeger via Docker and executing a simple Python script that uses the Synapse Python client.
Running the jaeger docker container
Start a docker container with the following options:
Explanation of ports:
4318: HTTP port for OTLP data collection
16686: Jaeger UI for visualizing traces
Once the docker container is running you can access the Jaeger UI via: http://localhost:16686
Environment Variable Configuration
By default, the OTEL exporter sends trace data to http://localhost:4318/v1/traces. You can customize the behavior through environment variables:
OTEL_SERVICE_NAME: Defines a unique identifier for your application or service in telemetry data (defaults to ‘synapseclient’). Set this to a descriptive name that represents your specific implementation, making it easier to filter and analyze traces in your monitoring system.
OTEL_EXPORTER_OTLP_ENDPOINT: Specifies the destination URL for sending telemetry data (defaults to ‘http://localhost:4318’). Configure this to direct data to your preferred OpenTelemetry collector or monitoring service.
OTEL_DEBUG_CONSOLE: Controls local visibility of telemetry data. Set to ‘true’ to output trace information to the console, which is useful for development and troubleshooting without an external collector.
OTEL_SERVICE_INSTANCE_ID: Distinguishes between multiple instances of the same service (e.g., ‘prod’, ‘development’, ‘local’). This helps identify which specific deployment or environment generated particular traces.
OTEL_EXPORTER_OTLP_HEADERS: Configures authentication and metadata for telemetry exports. Use this to add API keys, authentication tokens, or custom metadata when sending traces to secured collectors or third-party monitoring services.
Enabling OpenTelemetry in your code
To enable OpenTelemetry with the Synapse Python client, simply call the enable_open_telemetry() method on the Synapse class. Additionally you can access an instance of the OpenTelemetry tracer via the get_tracer() call. This will allow you to create new spans for your code.
Exporting Synapse Client Traces to SigNoz Cloud for developers
Prerequisites
Environment Variable Configuration
The following environment variables are required to be set:
OTEL_EXPORTER_OTLP_HEADERS: signoz-ingestion-key=<key>
OTEL_EXPORTER_OTLP_ENDPOINT: https://ingest.us.signoz.cloud
OTEL_SERVICE_NAME: your-service-name
Explanation of both required and optional environment variables:
Required
OTEL_EXPORTER_OTLP_ENDPOINT: The OTLP endpoint to which telemetry is exported.
OTEL_EXPORTER_OTLP_HEADERS: Authentication/metadata for exports (e.g., API keys, tokens). For SigNoz, use signoz-ingestion-key=<key>.
Optional
OTEL_SERVICE_NAME: Unique identifier for your app/service in telemetry data (defaults to synapseclient). Use a descriptive name so you can easily filter and analyze traces per service.
OTEL_DEBUG_CONSOLE: Controls local visibility of telemetry data. Set to ‘true’ to output trace information to the console, which is useful for development and troubleshooting without an external collector.
OTEL_SERVICE_INSTANCE_ID: Distinguishes between multiple instances of the same service (e.g., ‘prod’, ‘development’, ‘local’). This helps identify which specific deployment or environment generated particular traces.
Enabling OpenTelemetry in your code
To enable OpenTelemetry with the Synapse Python client, simply call the enable_open_telemetry() method on the Synapse class. Additionally you can access an instance of the OpenTelemetry tracer via the get_tracer() call. This will allow you to create new spans for your code.
Advanced Configuration
You can pass additional resource attributes to enable_open_telemetry():
When OpenTelemetry is enabled in the Synapse client, the following happens automatically:
Instrumentation is set up for:
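Threading (via ThreadingInstrumentor): Ensures proper context propagation across threads, which is essential for maintaining trace continuity in multi-threaded applications
HTTP libraries:
requests (via RequestsInstrumentor): Captures all HTTP requests made using the requests library, including methods, URLs, status codes, and timing information
httpx (via HTTPXClientInstrumentor): Tracks both synchronous and asynchronous HTTP requests made with the httpx library
urllib (via URLLibInstrumentor): Monitors lower-level HTTP operations made directly with Python’s standard library
Each instrumented HTTP library includes custom hooks that extract Synapse entity IDs from URLs when possible and add them as span attributes
Traces are configured to collect spans across your application:
Spans automatically capture operation duration, status, and errors.
An attribute propagation mechanism ensures that certain attributes (like synapse.transfer.direction and synapse.operation.category) are properly passed to child spans for uploads/downloads.
Trace data is exported via OTLP (OpenTelemetry Protocol).
Resource information is automatically added to your traces, including:
Python version
OS type
Synapse client version
Service name (defaults to “synapseclient” but can be customized via environment variables)
Service instance ID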
Note that once enabled, OpenTelemetry cannot be disabled in the same process - you would need to restart your Python interpreter to disable it.
License and Copyright
© Copyright 2013-25 Sage Bionetworks
This software is licensed under the Apache License, Version 2.0.