utils

app.utils

Utility functions for the party-side app.

Functions

Name Description
assign_columns Assign columns from a form to collections.
check_is_csv Determine whether a file has the csv extension.
convert_dataframe_to_bf Convert a data frame of features to a Bloom filter.
download_files Serialize, compress, and send a data frame with its embedder.

assign_columns

app.utils.assign_columns(form, feature_funcs)

Assign columns from a form to collections.

Each column is assigned to one of three collections: columns to drop, columns to keep in their raw format, or the column feature factory specification.

Parameters

Name Type Description Default
form dict Form from our column chooser page. required
feature_funcs dict Mapping between column types and feature functions. required

Returns

Type Description
list[str] List of columns to drop.
list[str] List of columns to keep in their raw format.
dict[str, func] Mapping between column names and feature functions.
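
The three-way split described above can be sketched as follows. This is a minimal illustration, not the app's implementation: the function name assign_columns_sketch, the form layout (column name mapped to the chosen option) and the "drop" and "keep" labels are all assumptions made for the example.

```python
from typing import Callable


def assign_columns_sketch(
    form: dict[str, str], feature_funcs: dict[str, Callable]
) -> tuple[list[str], list[str], dict[str, Callable]]:
    """Sort each form entry into one of three collections (hypothetical)."""
    drop_columns: list[str] = []
    raw_columns: list[str] = []
    colspec: dict[str, Callable] = {}

    for column, choice in form.items():
        if choice == "drop":  # assumed label for columns to discard
            drop_columns.append(column)
        elif choice == "keep":  # assumed label for columns kept as-is
            raw_columns.append(column)
        else:
            # Any other choice is treated as a column type and looked up
            # in the mapping; an unknown type raises KeyError.
            colspec[column] = feature_funcs[choice]

    return drop_columns, raw_columns, colspec


# Stand-in feature functions and form values, for illustration only.
example_funcs = {"name": str.lower, "dob": str.strip}
example_form = {
    "id": "keep",
    "notes": "drop",
    "forename": "name",
    "birth_date": "dob",
}
drop, keep, spec = assign_columns_sketch(example_form, example_funcs)
```

Here drop holds "notes", keep holds "id", and spec maps the remaining two columns to their feature functions.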

check_is_csv

app.utils.check_is_csv(path)

Determine whether a file has the csv extension.

Parameters

Name Type Description Default
path str Path to the file. required

Returns

Type Description
bool Whether the file name follows the pattern {name}.csv or not.
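
A check of this kind might look like the sketch below. The name check_is_csv_sketch is hypothetical, and treating the comparison as case-sensitive is an assumption; the source only states that the file name should follow the pattern {name}.csv.

```python
from pathlib import Path


def check_is_csv_sketch(path: str) -> bool:
    """Return True when the final extension of the path is '.csv'."""
    # Path.suffix returns only the last extension, so 'data.csv.zip'
    # is not treated as a CSV file. The comparison is case-sensitive,
    # which is an assumption of this sketch.
    return Path(path).suffix == ".csv"


is_csv = check_is_csv_sketch("uploads/records.csv")
is_text = check_is_csv_sketch("uploads/records.txt")
```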

convert_dataframe_to_bf

app.utils.convert_dataframe_to_bf(df, colspec, other_columns=None, salt='')

Convert a data frame of features to a Bloom filter.

Convert the columns to features according to the colspec. The features are then combined and converted to Bloom filter indices, and the Bloom filter norm is calculated alongside them.

Parameters

Name Type Description Default
df pandas.DataFrame Data frame of features. required
colspec dict Dictionary designating columns in the data frame as particular feature types to be processed as appropriate. required
other_columns None | list Columns to be returned as they appear in the data in addition to bf_indices, bf_norms and thresholds. None
salt str Cryptographic salt to add to tokens before hashing. ''

Returns

Type Description
pandas.DataFrame Data frame of Bloom-filtered data.
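
To make the idea of Bloom filter indices and norms concrete, here is a standard-library sketch for a single record. It does not reproduce the hashing scheme used by the app or by pprl: the function names, the SHA-256 double-seed hashing, the filter size of 1024 and the choice of the Euclidean norm of the binary vector are all assumptions for illustration.

```python
import hashlib
import math


def tokens_to_bf_indices(tokens, salt="", bf_size=1024, num_hashes=2):
    """Map feature tokens to the sorted set of Bloom filter positions."""
    indices = set()
    for token in tokens:
        for seed in range(num_hashes):
            # The salt is mixed into every token before hashing, so the
            # same token yields different positions under another salt.
            payload = f"{seed}|{salt}|{token}".encode("utf-8")
            digest = hashlib.sha256(payload).digest()
            indices.add(int.from_bytes(digest[:8], "big") % bf_size)
    return sorted(indices)


def bf_norm(indices):
    """Euclidean norm of a binary vector with these positions set."""
    # Each set bit contributes 1 to the sum of squares.
    return math.sqrt(len(indices))


# Hypothetical bigram features for one record.
record_indices = tokens_to_bf_indices(["jo", "oh", "hn"], salt="pepper")
record_norm = bf_norm(record_indices)
```

Storing only the indices of the set bits, rather than the full bit vector, keeps each record compact, and the norm can be derived from the number of stored indices.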

download_files

app.utils.download_files(dataframe, embedder, party, archive='archive')

Serialize, compress, and send a data frame with its embedder.

Parameters

Name Type Description Default
dataframe pprl.embedder.embedder.EmbeddedDataFrame Data frame to be downloaded. required
embedder pprl.embedder.embedder.Embedder Embedder used to embed dataframe. required
party str Name of the party. required
archive str Name of the archive. 'archive'

Returns

Type Description
flask.Response Response containing a ZIP archive with the data frame and its embedder.
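
The serialize-and-compress step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the app's code: build_archive_sketch is a hypothetical helper, plain Python objects stand in for the data frame and embedder, and both the use of pickle and the member file names are invented for the example.

```python
import io
import pickle
import zipfile


def build_archive_sketch(records, embedder_config, party):
    """Pickle two objects and pack them into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Member names are assumptions; the real app may name them differently.
        zf.writestr(f"{party}-data.pkl", pickle.dumps(records))
        zf.writestr(f"{party}-embedder.pkl", pickle.dumps(embedder_config))
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


# Stand-in objects, for illustration only.
archive_buffer = build_archive_sketch([{"id": 1}], {"bf_size": 1024}, "alice")
```

In a Flask view, a buffer like this could plausibly be passed to flask.send_file with as_attachment=True and a download_name built from the archive name, which would yield a Response of the kind described above.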