4 """ code for calculating empirical risk
5
6 """
7 import math
8
9
11 return math.log(x) / math.log(2.)
12
13
15 """ Calculates Burges's formulation of the risk bound
16
17 The formulation is from Eqn. 3 of Burges's review
18 article "A Tutorial on Support Vector Machines for Pattern Recognition"
19 In _Data Mining and Knowledge Discovery_ Kluwer Academic Publishers
20 (1998) Vol. 2
21
22 **Arguments**
23
24 - VCDim: the VC dimension of the system
25
26 - nData: the number of data points used
27
28 - nWrong: the number of data points misclassified
29
30 - conf: the confidence to be used for this risk bound
31
32
33 **Returns**
34
35 - a float
36
37 **Notes**
38
39 - This has been validated against the Burges paper
40
41 - I believe that this is only technically valid for binary classification
42
43 """
44
45 h = VCDim
46 l = nData
47 eta = conf
48
49 numerator = h * (math.log(2. * l / h) + 1.) - math.log(eta / 4.)
50 structRisk = math.sqrt(numerator / l)
51
52 rEmp = float(nWrong) / l
53
54 return rEmp + structRisk
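
# Illustrative usage (an added sketch; the inputs are assumed example values,
# not numbers taken from the Burges paper): for a hypothetical binary
# classifier with VC dimension 10, evaluated on 1000 points with 50
# misclassifications at conf=0.05,
#
#   BurgesRiskBound(VCDim=10, nData=1000, nWrong=50, conf=0.05)
#
# gives roughly 0.31: the empirical risk of 0.05 plus a structural term of
# about 0.26.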


def CristianiRiskBound(VCDim, nData, nWrong, conf):
    """ Calculates the Cristianini/Shawe-Taylor formulation of the risk bound

    The formulation here is from pg. 58, Theorem 4.6, of the book
    "An Introduction to Support Vector Machines" by Cristianini and
    Shawe-Taylor, Cambridge University Press, 2000.

    **Arguments**

      - VCDim: the VC dimension of the system

      - nData: the number of data points used

      - nWrong: the number of data points misclassified

      - conf: the confidence to be used for this risk bound

    **Returns**

      - a float

    **Notes**

      - this generates values that do not match the other bound formulations here

    """
    d = VCDim
    delta = conf
    l = nData
    k = nWrong

    # capacity term: sqrt((4/l) * (d * log2(2*e*l/d) + log2(4/delta)))
    structRisk = math.sqrt((4. / l) * (d * log2((2. * math.e * l) / d) + log2(4. / delta)))
    # empirical term: twice the observed error fraction (2k/l)
    rEmp = 2. * k / l
    return rEmp + structRisk
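
# Illustrative usage (an added sketch; same assumed example inputs as in the
# sketch after BurgesRiskBound above):
#
#   CristianiRiskBound(VCDim=10, nData=1000, nWrong=50, conf=0.05)
#
# gives roughly 0.72, noticeably larger than the Burges bound for the same
# inputs, which is the mismatch mentioned in the Notes.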


def CherkasskyRiskBound(VCDim, nData, nWrong, conf, a1=1.0, a2=2.0):
    """ Calculates the Cherkassky/Mulier formulation of the risk bound

    The formulation here is from Eqns. 4.22 and 4.23 on pg. 108 of
    Cherkassky and Mulier's book "Learning From Data", Wiley, 1998.

    **Arguments**

      - VCDim: the VC dimension of the system

      - nData: the number of data points used

      - nWrong: the number of data points misclassified

      - conf: the confidence to be used for this risk bound

      - a1, a2: constants in the risk equation. Restrictions on these values:

        - 0 <= a1 <= 4

        - 0 <= a2 <= 2

    **Returns**

      - a float

    **Notes**

      - This appears to behave reasonably

      - the default value a1=1.0 is chosen by analogy to Burges's formulation

    """
    h = VCDim
    n = nData
    eta = conf
    # empirical risk: the observed misclassification rate
    rEmp = float(nWrong) / n

    # eps = a1 * [h * (ln(a2*n/h) + 1) - ln(eta/4)] / n
    numerator = h * (math.log(float(a2 * n) / h) + 1) - math.log(eta / 4.)
    eps = a1 * numerator / n

    # structural term: (eps/2) * (1 + sqrt(1 + 4*rEmp/eps))
    structRisk = eps / 2. * (1. + math.sqrt(1. + (4. * rEmp / eps)))

    return rEmp + structRisk

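
if __name__ == '__main__':
    # A small comparison demo (an added sketch; the inputs are assumed example
    # values, not taken from any of the cited references): a hypothetical
    # binary classifier with VC dimension 10, evaluated on 1000 points with
    # 50 misclassifications, at a confidence parameter of 0.05.
    vcDim, nData, nWrong, conf = 10, 1000, 50, 0.05
    print('Burges bound:     %.3f' % BurgesRiskBound(vcDim, nData, nWrong, conf))
    print('Cristiani bound:  %.3f' % CristianiRiskBound(vcDim, nData, nWrong, conf))
    print('Cherkassky bound: %.3f' % CherkasskyRiskBound(vcDim, nData, nWrong, conf, a1=1.0, a2=2.0))
    # With these inputs the three bounds come out around 0.31, 0.72 and 0.15
    # respectively, illustrating how much the formulations differ.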