Safe Support Vector Machine Notebook#
A Quick Start Guide to implementing Safer Support Vector Machines#
Commands commented out for path manipulation are for developers only#
[1]:
import logging
import os
import numpy as np
from sklearn import datasets
logging.basicConfig()
logger = logging.getLogger("wrapper_svm")
logger.setLevel(logging.INFO)
from sacroml.safemodel.classifiers import SafeSVC
Use the sklearn Wisconsin breast cancer dataset#
[2]:
cancer = datasets.load_breast_cancer()
x = np.asarray(cancer.data, dtype=np.float64)
y = np.asarray(cancer.target, dtype=np.float64)
Kernel for approximator: equivalent to rbf.#
[3]:
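def rbf(x, y, gamma=1):
    # RBF kernel between two vectors: exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))


def rbf_svm(x, y, gamma=1):
    # Kernel matrix: entry (i, j) is the RBF kernel between row i of x and row j of y
    r = np.zeros((x.shape[0], y.shape[0]))
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            r[i, j] = rbf(x[i, :], y[j, :], gamma)
    return r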
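The double loop above costs one Python-level call per pair of rows. As a minimal sketch (not part of the original notebook), the same kernel matrix can be computed with array operations using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b. The names `rbf_svm_loop` and `rbf_svm_vectorized` and the random test arrays are illustrative assumptions.

```python
import numpy as np


def rbf_svm_loop(x, y, gamma=1):
    """Reference version: same logic as the notebook's nested loops."""
    r = np.zeros((x.shape[0], y.shape[0]))
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            r[i, j] = np.exp(-gamma * np.sum((x[i, :] - y[j, :]) ** 2))
    return r


def rbf_svm_vectorized(x, y, gamma=1):
    """Hypothetical vectorized equivalent of the loop version."""
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b, for all pairs at once
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Rounding can produce tiny negative values; clip them to zero
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-gamma * sq_dist)


# Small random inputs, used only to compare the two versions
rng = np.random.default_rng(0)
a = rng.normal(size=(6, 4))
b = rng.normal(size=(5, 4))

k_loop = rbf_svm_loop(a, b, gamma=0.1)
k_vec = rbf_svm_vectorized(a, b, gamma=0.1)
print(np.allclose(k_loop, k_vec))
```

Both functions return an array of shape `(x.shape[0], y.shape[0])`, so the vectorized form can stand in wherever the loop version is used.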
Set parameters#
[4]:
gamma = 0.1 # Kernel width
C = 1 # Penalty term
dhat = 5 # Dimension of approximator
eps = 500 # DP level (not very private)
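To give some intuition for why a very small approximator dimension is a poor fit, the sketch below uses random Fourier features, a standard way to approximate an RBF kernel with a finite-dimensional feature map. This is an illustration only: it is an assumption that the approximator behind `dhat` behaves similarly, and the helper names (`random_fourier_features`, `approximation_error`) and synthetic data are hypothetical, not part of `sacroml`.

```python
import numpy as np


def rbf_exact(x, y, gamma):
    """Exact RBF kernel matrix exp(-gamma * ||x_i - y_j||^2)."""
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def random_fourier_features(x, n_features, gamma, rng):
    """Map x to n_features dimensions so that z(x).z(y) approximates the RBF kernel."""
    # For exp(-gamma * ||d||^2), frequencies are drawn from N(0, 2 * gamma * I)
    w = rng.normal(scale=np.sqrt(2.0 * gamma), size=(x.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)


def approximation_error(x, n_features, gamma, seed=0):
    """Mean absolute difference between the approximate and exact kernel matrices."""
    rng = np.random.default_rng(seed)
    z = random_fourier_features(x, n_features, gamma, rng)
    return float(np.mean(np.abs(z @ z.T - rbf_exact(x, x, gamma))))


# Synthetic data, used only to compare a small and a large feature dimension
data = np.random.default_rng(42).normal(size=(50, 3))
err_low = approximation_error(data, n_features=5, gamma=0.1)
err_high = approximation_error(data, n_features=1000, gamma=0.1)
print(f"error with 5 features:    {err_low:.4f}")
print(f"error with 1000 features: {err_high:.4f}")
```

The approximation error shrinks as the number of random features grows, which is the general trade-off between approximator size and kernel fidelity.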
Define the differentially private model with the chosen DP level (approximate)#
[5]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
c3 = clf3.predict(x)
p3 = clf3.predict_proba(x)
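The predictions in `c3` can be compared with the labels `y` to get a quick training accuracy. The sketch below is a minimal, self-contained illustration: `training_accuracy` is a hypothetical helper, and the small arrays stand in for `y` and `c3` so that the block runs without the fitted model.

```python
import numpy as np


def training_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true == y_pred))


# Stand-in arrays (assumed values, not results from the notebook)
y_demo = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
c_demo = np.array([0.0, 1.0, 0.0, 0.0, 1.0])

acc_demo = training_accuracy(y_demo, c_demo)
print(acc_demo)
```

In the notebook itself the call would be `training_accuracy(y, c3)`. Because the model is evaluated on the data it was trained on, this figure says nothing about generalisation.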
Define the model and fit it.#
Save and Request Release#
We are warned that dhat is too low.#
[6]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
clf3.save(name="testSaveSVC.pkl")
clf3.request_release(path="testSaveSVC", ext="pkl")
[7]:
target_yaml = os.path.normpath("testSaveSVC/target.yaml")
with open(target_yaml) as f:
    print(f.read())
dataset_name: ''
dataset_module_path: ''
features: {}
generalisation_error: .nan
safemodel:
- researcher: unknown
  model_type: SVC
  details: 'WARNING: model parameters may present a disclosure risk:
    - parameter dhat = 5 identified as less than the recommended min value of 1000.'
  recommendation: Do not allow release
  reason: 'WARNING: model parameters may present a disclosure risk:
    - parameter dhat = 5 identified as less than the recommended min value of 1000.'
  timestamp: '2025-12-02 21:28:03'
model_type: SklearnModel
model_name: SafeSVC
model_params: {}
model_path: model.pkl
X_train_path: ''
y_train_path: ''
X_test_path: ''
y_test_path: ''
X_train_orig_path: ''
y_train_orig_path: ''
X_test_orig_path: ''
y_test_orig_path: ''
proba_train_path: ''
proba_test_path: ''
indices_train_path: ''
indices_test_path: ''
Set Parameters to safe values#
[8]:
gamma = 0.1 # Kernel width
C = 1 # Penalty term
dhat = 1000 # Dimension of approximator
eps = 500 # DP level (not very private)
Define the model and fit it.#
Save and Request Release#
Model parameters are within recommended ranges, so the saved model can proceed to the next step of the checking procedure.#
[9]:
clf3 = SafeSVC(eps=eps, dhat=dhat, C=C, gamma=gamma)
clf3.fit(x, y)
clf3.save(name="testSaveSVC.pkl")
clf3.request_release(path="testSaveSVC", ext="pkl")
Examine the checkfile#
[10]:
target_yaml = os.path.normpath("testSaveSVC/target.yaml")
with open(target_yaml) as f:
    print(f.read())
dataset_name: ''
dataset_module_path: ''
features: {}
generalisation_error: .nan
safemodel:
- researcher: unknown
  model_type: SVC
  details: 'Model parameters are within recommended ranges.
    '
  recommendation: Proceed to next step of checking
  timestamp: '2025-12-02 21:28:04'
model_type: SklearnModel
model_name: SafeSVC
model_params: {}
model_path: model.pkl
X_train_path: ''
y_train_path: ''
X_test_path: ''
y_test_path: ''
X_train_orig_path: ''
y_train_orig_path: ''
X_test_orig_path: ''
y_test_orig_path: ''
proba_train_path: ''
proba_test_path: ''
indices_train_path: ''
indices_test_path: ''
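The `recommendation` field is the part of the checkfile that differs between the unsafe and safe runs. As a rough sketch, it can be pulled out with a plain line scan. `read_recommendations` is a hypothetical helper, and the sample strings are abbreviated copies of the outputs shown in this notebook. A real YAML parser, such as PyYAML's `yaml.safe_load`, would be more robust than this line-based approach.

```python
def read_recommendations(yaml_text):
    """Return the value of every 'recommendation:' line found in the text."""
    found = []
    for line in yaml_text.splitlines():
        # Split each line at the first colon into key and value
        key, sep, value = line.strip().partition(":")
        if sep and key == "recommendation":
            found.append(value.strip())
    return found


# Abbreviated copies of the two checkfile outputs
sample_rejected = """safemodel:
- researcher: unknown
  model_type: SVC
  recommendation: Do not allow release
"""

sample_accepted = """safemodel:
- researcher: unknown
  model_type: SVC
  recommendation: Proceed to next step of checking
"""

print(read_recommendations(sample_rejected))
print(read_recommendations(sample_accepted))
```

To apply this to the saved file, pass the text read from `testSaveSVC/target.yaml` to the function.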