Cryptographic Computing for Clean Rooms (C3R)
The Cryptographic Computing for Clean Rooms (C3R) encryption client and software development kit (SDK) provide client-side tooling which allows users to participate in AWS Clean Rooms collaborations leveraging cryptographic computing by pre- and post-processing data.
The AWS Clean Rooms User Guide contains detailed information regarding how to use the C3R encryption client in conjunction with an AWS Clean Rooms collaboration.
NOTICE: This project is released as open source under the Apache 2.0 license but is only intended for use with AWS Clean Rooms. Any other use cases may result in errors or inconsistent results.
Getting Started
Downloading Releases
The C3R encryption client command line interface and related JARs can be downloaded from the Releases section of this repository. The SDK artifacts are also available on Maven’s central repository.
System Requirements
Java Runtime Environment version 17 or newer.
Enough disk storage to hold cleartext data, temporary files, and the encrypted output. See the “Guidelines for the C3R encryption client” section of the user guide for details on how settings affect storage needs.
Supported Data Formats
CSV and Parquet file formats are supported. For CSV files, the C3R encryption client treats all values as strings. For Parquet files, the supported data types are listed under “What Parquet data types are supported?”. See “What data types can be encrypted?” for information on encryption of particular data types. Further details and limitations are found in the “Supported file and data types” section of the user guide.
The core functionality of the C3R encryption client is format agnostic; the SDK can be used for any format by implementing an appropriate RowReader and RowWriter.
AWS CLI Options in C3R
Modes which make API calls to AWS services feature optional --profile and --region flags, allowing for convenient selection of an AWS CLI named profile and AWS region, respectively.
C3R CLI Modes
The C3R encryption client is an executable JAR with a command line interface (CLI). It has several modes of operation, which are listed in its usage help message and briefly described in the following sections of this README.
schema mode
For the C3R encryption client to encrypt a tabular file for a collaboration, it must have a corresponding schema file specifying how the encrypted output should be derived from the input.
The C3R encryption client can help generate schema files for an INPUT file using the schema command. See the “Generate an encryption schema for a tabular file” section of the user guide for more information.
encrypt mode
Given the following:
a tabular INPUT file,
a corresponding SCHEMA file,
a collaboration COLLABORATION_ID in the form of a UUID, and
an environment variable C3R_SHARED_SECRET containing a Base64-encoded 256-bit secret (see the “Preparing encrypted data tables” section of the user guide for details on how to generate a shared secret key),
an encrypted OUTPUT file can be generated by running the C3R encryption client in encrypt mode at the command line. See the “Encrypt data” section of the user guide for more information.
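The shared secret described above is a Base64-encoded 256-bit value. The following is a minimal sketch of one way to produce a value of that shape in Java; the class and method names are illustrative and not part of the C3R SDK, and the user guide remains the authoritative source on how a shared secret should be generated and distributed.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative helper, not part of the C3R SDK.
final class SharedSecretSketch {
    // 256 bits expressed in bytes.
    static final int SECRET_BYTES = 32;

    // Draws 32 random bytes from the platform's secure random source
    // and returns them as standard (padded) Base64 text.
    static String generateBase64Secret() {
        byte[] secret = new byte[SECRET_BYTES];
        new SecureRandom().nextBytes(secret);
        return Base64.getEncoder().encodeToString(secret);
    }

    public static void main(String[] args) {
        // Printing a secret to a terminal may be inappropriate in some
        // environments; this is only for demonstration.
        System.out.println(generateBase64Secret());
    }
}
```

The printed value could then be placed in the C3R_SHARED_SECRET environment variable of the shell that runs the client, keeping in mind that every collaborator needs the same secret.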
decrypt mode
Once queries have been executed on encrypted data in an AWS Clean Rooms collaboration, the encrypted query results INPUT file can be decrypted into a cleartext OUTPUT file by running the C3R encryption client in decrypt mode with the same Base64-encoded 256-bit secret stored in the C3R_SHARED_SECRET environment variable and the same COLLABORATION_ID. See the “Decrypting data tables with the C3R encryption client” section of the user guide for more information.
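Both encrypt and decrypt modes depend on a COLLABORATION_ID in UUID form and a Base64-encoded 256-bit secret in C3R_SHARED_SECRET. The sketch below shows how those two inputs could be sanity-checked before running the client. It is a hypothetical pre-flight helper, not part of the C3R client or SDK, and the client's own validation rules may be stricter or different.

```java
import java.util.Base64;
import java.util.UUID;

// Hypothetical pre-flight checks; the C3R client performs its own validation.
final class C3rInputCheckSketch {
    // True if the text is a UUID in canonical 8-4-4-4-12 form.
    static boolean isUuid(String id) {
        if (id == null) {
            return false;
        }
        try {
            // UUID.fromString tolerates short groups, so require a round trip.
            return UUID.fromString(id).toString().equalsIgnoreCase(id);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // True if the text is valid Base64 that decodes to exactly 32 bytes.
    static boolean isBase64Secret256(String secret) {
        if (secret == null) {
            return false;
        }
        try {
            return Base64.getDecoder().decode(secret).length == 32;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String id = args.length > 0 ? args[0] : null;
        String secret = System.getenv("C3R_SHARED_SECRET");
        System.out.println("COLLABORATION_ID looks like a UUID: " + isUuid(id));
        System.out.println("C3R_SHARED_SECRET decodes to 256 bits: " + isBase64Secret256(secret));
    }
}
```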
SDK Usage Examples
SDK usage examples are available in the SDK packages’ src/examples directories.
Running C3R on Apache Spark
The c3r-cli-spark package is a version of c3r-cli which must be submitted as a job to a running Apache Spark server.
The JAR’s com.amazonaws.c3r.spark.cli.Main class is submitted via the Apache Spark spark-submit script, and the JAR is then run using the passed command line arguments. Both viewing the top-level usage information and submitting an encryption job are done this way.
Security Notes for running C3R on Apache Spark
It is important to note that c3r-cli-spark makes no effort to add additional encryption to data transmitted or stored in temporary files by Apache Spark. This means, for example, that on an Apache Spark server with no encryption enabled, sensitive information such as the C3R_SHARED_SECRET will appear in plaintext RPC calls between the server and workers. It is up to users to ensure their Apache Spark server has been configured according to their specific security needs. See the Apache Spark security documentation for guidance on how to configure Apache Spark server security settings.
General Security Notes
The following is a high-level description of some security concerns to keep in mind when using the C3R encryption client to encrypt data.
Trusted Computing Environment
By default, the shared secret key and the data to be encrypted are consumed directly from disk by the C3R encryption client on a user’s machine. It is therefore left to users to take any and all necessary precautions to ensure that security concerns beyond what the C3R encryption client is capable of enforcing are met. For example:
the machine running the C3R encryption client meets the user’s needs as a trusted computing platform,
the C3R encryption client is run in a minimally privileged manner and not exposed to untrusted data/networks/etc., and
any cleanup or wiping of keys and/or data is performed as needed on the system after encryption.
Temporary Files
When encrypting a source file, the C3R encryption client will create temporary files on disk. These files will be deleted when the C3R encryption client finishes generating the encrypted output. Unexpected termination of the C3R encryption client execution may prevent the C3R encryption client or JVM from deleting these files, allowing them to persist on disk. These temporary files will have all columns of type fingerprint or sealed encrypted, but some additional privacy-enhancing post-processing may not have been completed. By default, the C3R encryption client will utilize the host operating system’s temporary directory for these temporary files. If a user prefers an explicit location for such files, the optional --tempDir=DIR flag can specify a different location in which to create them.
Frequently Asked Questions
What data types can be encrypted?
Currently, only string values are supported by sealed columns.
For fingerprint columns, types are grouped into equivalence classes. Equivalence classes allow identical fingerprints to be assigned to the same semantic value regardless of the original representation. For example, the integral value 42 will be assigned the same fingerprint regardless of whether it was originally a SmallInt, Int, or BigInt. No non-integral values, however, will ever be assigned the same fingerprint as the integral value 42.
The following equivalence classes are supported by fingerprint columns:
BOOLEAN
DATE
INTEGRAL
STRING
For CSV files, the C3R encryption client treats all values simply as UTF-8 encoded text and makes no attempt to interpret them differently prior to encryption.
For Parquet files, an error will be raised if a column contains a data type that is not supported by the column type (sealed or fingerprint) chosen for it.
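The grouping of data types into equivalence classes can be pictured with a small Java sketch. The type-to-class mapping follows the lists in this README; the `canonical` method and its "class:value" text form are purely hypothetical illustrations of the idea that equal semantic values in one class share a representative. They are not how the C3R encryption client actually computes fingerprints.

```java
import java.util.Map;

// Illustrative only; not the C3R SDK's types or fingerprint algorithm.
final class EquivalenceClassSketch {
    enum EquivalenceClass { BOOLEAN, DATE, INTEGRAL, STRING }

    // Data type name -> equivalence class, as listed in this README.
    static final Map<String, EquivalenceClass> CLASS_OF = Map.of(
            "BOOLEAN", EquivalenceClass.BOOLEAN,
            "DATE", EquivalenceClass.DATE,
            "BIGINT", EquivalenceClass.INTEGRAL,
            "INT", EquivalenceClass.INTEGRAL,
            "SMALLINT", EquivalenceClass.INTEGRAL,
            "CHAR", EquivalenceClass.STRING,
            "STRING", EquivalenceClass.STRING,
            "VARCHAR", EquivalenceClass.STRING);

    // Hypothetical canonical form: the class name plus one rendering of the
    // value shared by every data type in that class.
    static String canonical(String dataType, Object value) {
        EquivalenceClass cls = CLASS_OF.get(dataType);
        if (cls == null) {
            throw new IllegalArgumentException(
                    "Unsupported type for a fingerprint column: " + dataType);
        }
        String rendered = (cls == EquivalenceClass.INTEGRAL)
                ? Long.toString(((Number) value).longValue()) // widen to one representative
                : String.valueOf(value);
        return cls + ":" + rendered;
    }

    public static void main(String[] args) {
        // Same integral value, different original widths: same canonical form.
        System.out.println(canonical("SMALLINT", (short) 42));
        System.out.println(canonical("BIGINT", 42L));
        // The text "42" lands in a different class, so it never collides.
        System.out.println(canonical("STRING", "42"));
    }
}
```

Because the class name is part of the canonical form in this sketch, an integral 42 and the string "42" can never share a representative, which mirrors the guarantee described above.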
What Parquet data types are supported?
The C3R encryption client can process any non-complex (i.e., primitive) data in a Parquet file that represents a data type supported by Clean Rooms. The following Parquet data types are supported:
Binary with the following logical annotations:
  None, if --parquetBinaryAsString is set (STRING data type)
  Decimal(scale, precision) (DECIMAL data type)
  String (STRING data type)
Boolean with no logical annotation (BOOLEAN data type)
Double with no logical annotation (DOUBLE data type)
Fixed_Len_Binary_Array with the Decimal(scale, precision) logical annotation (DECIMAL data type)
Float with no logical annotation (FLOAT data type)
Int32 with the following logical annotations:
  None (INT data type)
  Date (DATE data type)
  Decimal(scale, precision) (DECIMAL data type)
  Int(16, true) (SMALLINT data type)
  Int(32, true) (INT data type)
Int64 with the following logical annotations:
  None (BIGINT data type)
  Decimal(scale, precision) (DECIMAL data type)
  Int(64, true) (BIGINT data type)
  Timestamp(isUTCAdjusted, TimeUnit.MILLIS) (TIMESTAMP data type)
  Timestamp(isUTCAdjusted, TimeUnit.MICROS) (TIMESTAMP data type)
  Timestamp(isUTCAdjusted, TimeUnit.NANOS) (TIMESTAMP data type)
What is an equivalence class?
An equivalence class is a set of data types that can be unambiguously compared for equality via a representative data type.
The equivalence classes are:
BOOLEAN containing data types: BOOLEAN
DATE containing data types: DATE
INTEGRAL containing data types: BIGINT, INT, SMALLINT
STRING containing data types: CHAR, STRING, VARCHAR
Does the C3R encryption client implement any non-standard cryptography?
The C3R encryption client uses only NIST-standardized algorithms and, with one exception, only by calling their implementations in the Java standard cryptographic library. The sole exception is that the client has its own implementation of HKDF (from RFC 5869), which is nonetheless built on MAC algorithms from the Java standard cryptographic library.
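To illustrate what building HKDF on a standard-library MAC looks like, here is a minimal sketch of the RFC 5869 extract-and-expand construction using javax.crypto.Mac. This is not the C3R client's implementation: the class name, the choice of HmacSHA256, and the helper methods are assumptions made for the example.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative RFC 5869 HKDF over HmacSHA256; not the C3R client's code.
final class HkdfSketch {
    private static final String HMAC = "HmacSHA256";
    private static final int HASH_LEN = 32;

    private static Mac newMac(byte[] key) {
        try {
            Mac mac = Mac.getInstance(HMAC);
            mac.init(new SecretKeySpec(key, HMAC));
            return mac;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // HKDF-Extract: PRK = HMAC(salt, IKM); an absent salt means HASH_LEN zero bytes.
    static byte[] extract(byte[] salt, byte[] ikm) {
        byte[] key = (salt == null || salt.length == 0) ? new byte[HASH_LEN] : salt;
        return newMac(key).doFinal(ikm);
    }

    // HKDF-Expand: T(i) = HMAC(PRK, T(i-1) | info | i), concatenated and truncated.
    static byte[] expand(byte[] prk, byte[] info, int length) {
        if (length <= 0 || length > 255 * HASH_LEN) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        Mac mac = newMac(prk);
        byte[] okm = new byte[length];
        byte[] t = new byte[0];
        int written = 0;
        for (int counter = 1; written < length; counter++) {
            mac.update(t);
            if (info != null) {
                mac.update(info);
            }
            mac.update((byte) counter);
            t = mac.doFinal(); // doFinal also resets the Mac for the next round
            int n = Math.min(t.length, length - written);
            System.arraycopy(t, 0, okm, written, n);
            written += n;
        }
        return okm;
    }

    static byte[] hkdf(byte[] salt, byte[] ikm, byte[] info, int length) {
        return expand(extract(salt, ikm), info, length);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static byte[] unhex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Inputs from RFC 5869, Appendix A.1 (test case 1).
        byte[] ikm = unhex("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b");
        byte[] salt = unhex("000102030405060708090a0b0c");
        byte[] info = unhex("f0f1f2f3f4f5f6f7f8f9");
        System.out.println(hex(hkdf(salt, ikm, info, 42)));
    }
}
```

Only the chaining logic is hand-written here; every MAC computation is delegated to the Java standard cryptographic library, which is the division of labor the answer above describes.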
Does the C3R encryption client support FIPS?
Yes, the C3R encryption client supports FIPS endpoints. For more information, see the AWS documentation on Dual-stack and FIPS endpoints.
License
This project is licensed under the Apache-2.0 License.