Safe Support Vector Machine Notebook

A Quick Start Guide to Implementing Safer Support Vector Machines

Commands commented out for path manipulation are for developers only

[1]:
import logging
import os

import numpy as np
from sklearn import datasets

logging.basicConfig()
logger = logging.getLogger("wrapper_svm")
logger.setLevel(logging.INFO)

from sacroml.safemodel.classifiers import SafeSVC

Use the sklearn Wisconsin breast cancer dataset

[2]:
cancer = datasets.load_breast_cancer()
x = np.asarray(cancer.data, dtype=np.float64)
y = np.asarray(cancer.target, dtype=np.float64)

Kernel for the approximator: equivalent to RBF.

[3]:
def rbf(x, y, gamma=1):
    return np.exp(-gamma * np.sum((x - y) ** 2))


def rbf_svm(x, y, gamma=1):
    r = np.zeros((x.shape[0], y.shape[0]))
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            r[i, j] = rbf(x[i, :], y[j, :], gamma)
    return r
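The double loop in `rbf_svm` is easy to read but slow for large inputs. As a minimal sketch, the same Gram matrix can be computed with NumPy broadcasting. The names `rbf_gram` and `rbf_gram_loop` are hypothetical and are not part of sacroml; the loop version is included only to check that both give the same values on small random inputs.

```python
import numpy as np


def rbf_gram(x, y, gamma=1.0):
    """RBF Gram matrix: K[i, j] = exp(-gamma * ||x[i] - y[j]||^2)."""
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Rounding can push tiny distances slightly below zero, so clip them.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def rbf_gram_loop(x, y, gamma=1.0):
    """Reference double-loop version, mirroring rbf_svm above."""
    out = np.zeros((x.shape[0], y.shape[0]))
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            out[i, j] = np.exp(-gamma * np.sum((x[i] - y[j]) ** 2))
    return out


# Small random inputs, used only for the comparison.
rng = np.random.default_rng(0)
a = rng.normal(size=(6, 3))
b = rng.normal(size=(4, 3))

k_fast = rbf_gram(a, b, gamma=0.1)
k_loop = rbf_gram_loop(a, b, gamma=0.1)
print(np.allclose(k_fast, k_loop))
```

The vectorised form trades a little extra memory for speed, which tends to matter once the inputs have more than a few hundred rows.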

Set parameters

[4]:
gamma = 0.1  # RBF kernel coefficient (larger gamma gives a narrower kernel)
C = 1  # Penalty term
dhat = 5  # Dimension of approximator
eps = 500  # DP level (not very private)
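To give a feel for why `eps = 500` is described as not very private, the sketch below uses the textbook Laplace mechanism, in which the noise scale is the query sensitivity divided by epsilon. This is an illustration under stated assumptions only: the helper `laplace_scale` and the sensitivity of 1.0 are hypothetical, and the sketch does not describe how `SafeSVC` calibrates its own noise.

```python
import numpy as np


def laplace_scale(sensitivity, eps_value):
    """Noise scale of the textbook Laplace mechanism: sensitivity / epsilon."""
    return sensitivity / eps_value


rng = np.random.default_rng(0)
mean_abs_noise = {}
for eps_value in (1.0, 500.0):
    # Assumed sensitivity of 1.0, chosen purely for illustration.
    scale = laplace_scale(sensitivity=1.0, eps_value=eps_value)
    noise = rng.laplace(loc=0.0, scale=scale, size=10_000)
    mean_abs_noise[eps_value] = np.mean(np.abs(noise))
    print(f"eps={eps_value}: scale={scale}, mean |noise|={mean_abs_noise[eps_value]:.5f}")
```

Under these assumptions, a larger epsilon means a smaller noise scale, so the released values stay close to the unperturbed ones and reveal more about the training data.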

Define the differentially private version with the chosen DP level (approximate)

[5]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
c3 = clf3.predict(x)
p3 = clf3.predict_proba(x)

Define the model and fit it.

Save and Request Release

We are warned that dhat is too low.

[6]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
clf3.save(name="testSaveSVC.pkl")
clf3.request_release(path="testSaveSVC", ext="pkl")

[7]:
target_yaml = os.path.normpath("testSaveSVC/target.yaml")
with open(target_yaml) as f:
    print(f.read())
dataset_name: ''
dataset_module_path: ''
features: {}
generalisation_error: .nan
safemodel:
- researcher: unknown
  model_type: SVC
  details: 'WARNING: model parameters may present a disclosure risk:

    - parameter dhat = 5 identified as less than the recommended min value of 1000.'
  recommendation: Do not allow release
  reason: 'WARNING: model parameters may present a disclosure risk:

    - parameter dhat = 5 identified as less than the recommended min value of 1000.'
  timestamp: '2025-12-02 21:28:03'
model_type: SklearnModel
model_name: SafeSVC
model_params: {}
model_path: model.pkl
X_train_path: ''
y_train_path: ''
X_test_path: ''
y_test_path: ''
X_train_orig_path: ''
y_train_orig_path: ''
X_test_orig_path: ''
y_test_orig_path: ''
proba_train_path: ''
proba_test_path: ''
indices_train_path: ''
indices_test_path: ''
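The `details` and `recommendation` fields in the report above follow from a simple threshold rule: a parameter below its recommended minimum blocks release. A minimal sketch of such a rule is shown here. The function `check_min_params` and the `rules` table are hypothetical and are not the sacroml API; only the threshold of 1000 for `dhat` and the message wording are taken from the notebook output.

```python
def check_min_params(params, min_values):
    """Return (details, recommendation) for parameters below a recommended minimum."""
    problems = [
        f"- parameter {name} = {params[name]} identified as less than "
        f"the recommended min value of {minimum}."
        for name, minimum in min_values.items()
        if params[name] < minimum
    ]
    if problems:
        details = (
            "WARNING: model parameters may present a disclosure risk:\n"
            + "\n".join(problems)
        )
        return details, "Do not allow release"
    return (
        "Model parameters are within recommended ranges.",
        "Proceed to next step of checking",
    )


# Hypothetical rule table; the value 1000 comes from the report above.
rules = {"dhat": 1000}

details, recommendation = check_min_params({"dhat": 5}, rules)
print(details)
print(recommendation)
```

Calling the same function with `{"dhat": 1000}` returns the "within recommended ranges" message instead.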

Set parameters to safe values

[8]:
gamma = 0.1  # RBF kernel coefficient (larger gamma gives a narrower kernel)
C = 1  # Penalty term
dhat = 1000  # Dimension of approximator
eps = 500  # DP level (not very private)
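The notebook describes `dhat` only as the dimension of the approximator. One common way to approximate an RBF kernel in a finite number of dimensions is random Fourier features, and the sketch below uses that technique to show how the approximation improves when the dimension grows from 5 to 1000. Treating `dhat` as the number of random features is an assumption made here for illustration, and `random_fourier_features` is a hypothetical helper, not the sacroml implementation.

```python
import numpy as np


def rbf_exact(x, y, gamma):
    """Exact RBF kernel: exp(-gamma * ||x[i] - y[j]||^2)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=2)
    return np.exp(-gamma * sq_dists)


def random_fourier_features(x, n_features, gamma, rng):
    """Map x so that z(x) @ z(y).T approximates exp(-gamma * ||x - y||^2)."""
    # Frequencies are drawn from N(0, 2 * gamma), the spectral density of this kernel.
    w = rng.normal(scale=np.sqrt(2.0 * gamma), size=(x.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)


rng = np.random.default_rng(0)
data = rng.normal(size=(20, 4))  # synthetic data, for illustration only
exact = rbf_exact(data, data, gamma=0.1)

errors = {}
for n_features in (5, 1000):
    z = random_fourier_features(data, n_features, gamma=0.1, rng=rng)
    errors[n_features] = np.mean(np.abs(z @ z.T - exact))
    print(f"n_features={n_features}: mean abs error={errors[n_features]:.4f}")
```

With only a handful of random features the approximate kernel is noisy, whereas a larger dimension tracks the exact kernel much more closely.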

Define the model and fit it.

Save and Request Release

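[9]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
clf3.save(name="testSaveSVC.pkl")
clf3.request_release(path="testSaveSVC", ext="pkl")

Examine the checkfile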

[10]:
target_yaml = os.path.normpath("testSaveSVC/target.yaml")
with open(target_yaml) as f:
    print(f.read())
dataset_name: ''
dataset_module_path: ''
features: {}
generalisation_error: .nan
safemodel:
- researcher: unknown
  model_type: SVC
  details: 'Model parameters are within recommended ranges.

    '
  recommendation: Proceed to next step of checking
  timestamp: '2025-12-02 21:28:04'
model_type: SklearnModel
model_name: SafeSVC
model_params: {}
model_path: model.pkl
X_train_path: ''
y_train_path: ''
X_test_path: ''
y_test_path: ''
X_train_orig_path: ''
y_train_orig_path: ''
X_test_orig_path: ''
y_test_orig_path: ''
proba_train_path: ''
proba_test_path: ''
indices_train_path: ''
indices_test_path: ''
