Dr. Alexandra Cristea: NN exam solutions, 2001, 21 nov.TECHNISCHE UNIVERSITEIT EINDHOVEN
Faculteit Wiskunde en Informatica
Tentamen Neurale Netwerken (2L490) op 21 november 2001, 14.00-17.00 uur.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Alle antwoorden dienen duidelijk geformuleerd en gemotiveerd te worden.

________________________________________________________________________________


1) Beschouw een discreet een-laags perceptron met n inputs met gewichten w1 , ... , wn en
drempel -w0.
a) Geef de uitvoer y van zo'n perceptron bij gegeven input x in R^n.
R: y = { 1, w0 + sum(i=1,..,n) wi * xi > 0 ;
         0, rest }

b) Geef een discreet een-laags perceptron met twee inputs dat de boolse functie f(x1,x2) = x1 ^ x2
berekent. (^ is de logische "en".)
R: e.g., 2 inputs, 1 output, weights: w0= -1.5 (=-treshold); w1=w2= 1;


c) Geef een discreet een-laags perceptron met drie inputs dat de boolse functie f(x1,x2,x3) = (x1 ^ x2) v x3
berekent. (v is de logische "of".)
R: e.g., 3 inputs, 1 output, weights: w0= -1.5, w1=w2= 1; w3= 2;

d) Is het mogelijk om de boolse functie f(x1,x2,x3,x4) = (x1 ^ x2) v (x3 ^ x4) door een discreet
een-laags perceptron met vier inputs te berekenen? Hint: Bekijk de ongelijkheden waaraan
de vier gewichten en de drempel moeten voldoen goed.
R: for x1=x4=0, x2=x3=1, output f=0 => w0 + w2 + w3 <= 0 (1)
   for x1=x4=1, x2=x3=0, output f=0 => w0 + w1 + w4 <= 0 (2)
   for x1=x2=1, x3=x4=0, output f=1 => w0 + w1 + w2 >  0 (3)
   for x1=x2=0, x3=x4=1, output f=1 => w0 + w3 + w4 >  0 (4)

from (1) + (2) => 2wo + w1 + w2 + w3 + w4 <= 0; (5)
from (3) + (4) => 2wo + w1 + w2 + w3 + w4 >  0; (6)

(5) and (6) are a contradiction. q.e.d.

e) Geef een discreet twee-laags perceptron dat de functie f gegeven in onderdeel d) berekent.
R: e.g., 4 inputs, 1 hidden layer with 2 neurons, 1 output with weights:

input -> hidden
w01=w02= -1.5 (rsp. thresholds of both hidden neurons are 1.5)
w11=w21=1
w12=w22=0 (first 2 inputs connected to only 1 hidden neuron: AND connection)
w31=w41=0
w32=w42=1 (second 2 inputs connected to only the 2nd hidden neuron: AND connection)

hidden -> output
v03= -0.5
v1=v2=1 (OR connection)



*****


2) Beschouw het volgende twee-laags perceptron met een invoer x en een uitvoer y.
x --- w1 --->( ) --- w2 ---> y
Invoer zowel als uitvoer zijn reeelwaardig. Zowel de verborgen laag als de uitvoerlaag
bestaan uit een neuron. Beide neuronen hebben geen drempel en gebruiken de standaard
activatiefunctie f (met f(z) = 1=(1 + exp(-z))). Laat w1 het gewicht tussen input en het
neuron in de verborgen laag zijn en laat w2 het gewicht tussen dit neuron en het neuron
in de uitvoerlaag zijn, zie ook bijgevoegde figuur.

a) Geef de uitvoer y van dit netwerk als functie van de invoer x.
R: y=f(w2 * f(w1 * x)) = 1/[1 + {exp( -w2/[1 + exp(-w1 * x) ] )}]

b) Geef de definitie van de kwadratische fout E bij een leerpaar (x,t).
E = 0.5 * (t - y)^2 with y as above.

c) Leid de standaard gradient descent leerregels af voor de gewichten w1 en w2.
R:let's derivate first the general case of dwi:

dwi = - alpha dE/dwi = - alpha * 0.5 * d[(t - y)^2 ]/dwi = - alpha * 0.5 * 2* (t - y) * d(t - y)/dwi =
= - alpha * (t - y) * (-1) * dy/dwi = alpha * (t - y) * dy/dwi = alpha * (t - y) * d(f(w2 * f(w1 * x)))/dwi =
= alpha * (t - y) * f(w2 * f(w1 * x))' * d(w2 * f(w1 * x))/dwi =
    (because f(w2 * f(w1 * x)) = y, and f'(z) = f(z)(1 - f(z)))
= alpha * (t - y) * y * (1 - y) * d(w2 * f(w1 * x))/dwi

from here on, the derivation depends on i, so:

dw1 = alpha * (t - y) * y * (1 - y) * d(w2 * f(w1 * x))/dw1 =
= alpha * (t - y) * y * (1 - y) * w2 * df(w1 * x))/dw1 =
= alpha * (t - y) * y * (1 - y) * w2 * f(w1 * x)' * d(w1 * x)/dw1 =
= alpha * (t - y) * y * (1 - y) * w2 * f(w1 * x) * (1 - f(w1 * x)) * d(w1 * x)/dw1 =
= alpha * (t - y) * y * (1 - y) * w2 * f(w1 * x) * (1 - f(w1 * x)) * x

dw2 = alpha * (t - y) * y * (1 - y) * d(w2 * f(w1 * x))/dw2 =
= alpha * (t - y) * y * (1 - y) * f(w1 * x)


d) Beschouw het leerpaar (x, t) = (0, 0.75). Stel dat initieel de gewichten zijn w1 = 0 en
w2 = 0. Bereken de echte uitvoer y en de kwadratische fout E in deze situatie.
R:
y = 1/[1 + {exp( -w2/[1 + exp(-w1 * x) ] )}] = 1/[1 + {exp( -0/[1 + exp(- 0 * 0) ] )}] =
= 1/[1 + exp(0)] = 1/2 = 0.5

E = 0.5 * (t - y)^2 = 0.5 * (0.75 - 0.5)^2 = 0.5 * (0.25)^2 = 1/2 * (1/4)^2 = 1/32 = 0,03125


e) Bereken de nieuwe waarde van de gewichten w1 en w2 indien vanuit de initiele situatie in
onderdeel d) een leerstap met leerparameter alpha = 32 wordt gedaan. Gebruik eventueel dat
f'(z) = f(z)(1 - f(z)).

R:
dw1 = alpha * (t - y) * y * (1 - y) * w2 * f(w1 * x) * (1 - f(w1 * x)) * x =
= 32 * (0.75 - 0.5) * 0.5 * (1 - 0.5) * 0 * f(0 * 0) * (1 - f(0 * 0)) * 0 = 0

dw2 = alpha * (t - y) * y * (1 - y) * f(w1 * x) =
= 32 * (0.75 - 0.5) * 0.5 * (1 - 0.5)* f(0 * 0) = 32 * 1/4 * 1/2 * 1/2 * f(0) =
= 2 * f(0) = 2 * [1/1 + exp(0)] = 2 * 1/2 = 1

w1(new) = w1(old) + dw1 = 0 + 0 = 0;
w2(new) = w2(old) + dw2 = 0 + 1 = 1;


f) Bereken voor de nieuwe gewichten opnieuw de uitvoer y en de kwadratische fout E.
Gebruik eventueel onderstaande tabel voor f:
x f(x)
0 0.5
1/8 0.53
1/4 0.56
1/2 0.62
3/4 0.68
1 0.73

R:

y(new) = f(w2(new) * f(w1(new) * x)) = f(1 * f(0 * x)) = f(1 * f(0)) = f(1 * 0.5) = f(0.5) = 0.62

E = 0.5 * (t-y(new))^2 = 1/2 * (0.75 - 0.62)^2 = 1/2 * (0.13)^2 = 0,00845



*****



3) In deze opgave beschouwen we Hopfeld netwerken.

a) Geef de definitie van consensus voor een Hopfeld netwerk.
R: C(x) = 1/2 * sum (i,j=1 .. n) [wij * xi * xj] - sum (i=1 .. n) [bi * xi]


b) Beschouw een Hopeld netwerk bestaande uit 3 neuronen. Deze neuronen hebben drempel
0, en alle onderlinge verbindingen hebben gewicht 1, zie ook onderstaande figuur.

(1) -- 1 -- (2)
  \         /
   1      1
    \    /
     (3)

Bereken voor al de 8 toestanden van dit netwerk de consensus.
R: bi = 0 for i =1 .. n; wii = 0 for i =1 .. n; wij = wji = 1 for i,j = 1 .. n;
C(x) = 1/2 * [w12 * x1 * x2 + w13 * x1 * x3 + w21 * x2 * x1 + w23 * x2 * x3 + w31 * x3 * x1 + w32 * x3 * x2] =
= 1/2 * [x1 * x2 + x1 * x3 + x2 * x1 + x2 * x3 +  x3 * x1 + x3 * x2] =
= x1 * x2 + x1 * x3 + x2 * x3

So C(x) for the 8 states of the network are:
x1 x2 x3 C(x)
 1  1  1   3
 1  1 -1  -1
 1 -1  1  -1
 1 -1 -1  -1
-1  1  1  -1
-1  1 -1  -1
-1 -1  1  -1
-1 -1 -1   3

c) We beschouwen nu de asynchrone update regel. Laat zien dat dit netwerk slechts twee
stabiele toestanden heeft, namelijk (1, 1, 1) en ( -1, -1, -1).
R: asynchronous update rule is:
i := random(n);
zi := sum(j=1 .. n) [wij * xj(t)] - bj;
xi(t+1) := sgn(zi);
for j:=1  to n do
  if j<>i then xj(t+1) := xj(t);
od

By definition, a stable state means xi = sgn(sum(j=1 .. n) [wij * xj(t)] - bj);
x1 x2 x3 sgn(z1) sgn(z2) sgn(z3)
 1  1  1    1       1       1    => stable
 1  1 -1    1       1       1    => not stable
 1 -1  1    1       1       1    => not stable
 1 -1 -1   -1       1       1    => not stable
-1  1  1    1       1       1    => not stable
-1  1 -1    1      -1       1    => not stable
-1 -1  1    1       1      -1    => not stable
-1 -1 -1    1       1       1    => stable

(alternatively, from the definition of asynchronous update x -> x* in Na(x) ( asynchronous neighbourhood)
=> C(x*) > C(x), as C(x)=3=max (local max of consensus), it cannot change anymore, so those states are stable)


d) Beschouw nu weer een willekeurig Hopeld netwerk met n neuronen, dat bedoeld is als
associatief geheugen voor de toestanden xi(1) , ... , xi(P). Geef de bijbehorende uitdrukking
voor de gewichten en drempels.
R: wij = sum(mu =1 .. P) [1/n * xi(i)(mu) * xi(j)(mu)]
   bi = 0; wii = 0;


e) Stel nu n = 6, en neem aan dat we de twee toestanden xi(1) = (1, 1, 1, 1, 1, 1) en xi(2) =
(1, 1, 1, -1, -1, -1) willen onthouden. Bereken volgens de hierboven gegeven formule het
bijbehorende Hopfield netwerk. Geef een grafische weergave van dit Hopfield netwerk,
waarbij verbindingen met gewicht 0 weggelaten worden.
R: n=6; bi = 0; wii = 0;
wij = sum(mu =1 .. P) [1/n * xi(i)(mu) * xi(j)(mu)] = sum(mu =1 .. 2) [1/6 * xi(i)(mu) * xi(j)(mu)] =
= 1/6 * [xi(i)(1) * xi(j)(1) + xi(i)(2) * xi(j)(2)]

w12 = 1/6 (1 * 1 + 1 * 1) = 1/3 = w21
same for: w13 = 1/3 = w31; w23 = 1/3 = w32
w45 = 1/6 [1 * 1 + (-1) * (-1)] = 1/3 = w45
same for: w46 = 1/3 = w64; w56 = 1/3 = w65
w14 = 1/6 [1 * 1 + 1 * (-1)] = 1/6 [1 - 1] = 0 = w41
same for: w15 = 0 = w51; w16 = 0 = w61; w24 = 0 = w42; w25 = 0 = w52;
same for: w26 = 0 = w62; w34 = 0 = w43; w35 = 0 = w53; w36 = 0 = w63;

graf:

(1) -- 1/3 -- (2)
  \         /
   1/3     1/3
    \    /
     (3)

(4) -- 1/3 -- (5)
  \         /
   1/3     1/3
    \    /
     (6)


f) Wat zijn de stabiele toestanden van dit netwerk?
R:
( 1,  1,  1,  1,  1,  1)
( 1,  1,  1, -1, -1, -1)
(-1, -1, -1,  1,  1,  1)
(-1, -1, -1, -1, -1, -1)

because xi = sgn[1/3 * (xk + xl)] only for i,k,l in 1,2,3 or 4,5,6;




*****




4) In deze opgave beschouwen we netwerken voor unsupervised leren. Gegeven is een leerver-
zameling X = {x(1) , ... , x(P) | x(q) in R^k voor q = 1,...,P}.

a) Geef de batch versie van het simple unsupervised leeralgoritme voor het "leren" van X
door een netwerk van n neuronen. Geef aan wat u met begrippen als "winnaar" bedoeld.

R:
batch version of simple unsupervised learning:
for i:=1 to n do wj := initial vector (often random or 0);
repeat
  for j:=1 to n do dwj := 0;
  for q := 1  to P do
     i := win (W, x(q));
     dwi := dwi + (alpha/P) * (x(q) - wi);
  end;
  for j := 1 to n do wj = wj + dwj;
until stopcriterium.

Stopcriterium can be a timeout value, or the more strict condition that no changes occur anymore
(stable state).


winner = the neuron that is allowed to change the weights (here i).
i = win(W, x) :
    a) || wi - x || <= || wj - x || , for all j = 1,..,n, j <> i
           ( so the winner is the neuron with the shortest distance to the input vector)
        b) wiT x >= wjT x , for all j = 1,..,n, j <> i
           ( so the winner is the neuron with the highest output for the given input)



b) Stel X bestaat uit de vier punten (-10;-1), (-10; 1), (10;-1) en (10; 1) in de twee dimen-
sionale ruimte. Stel dat we twee neuronen gebruiken, met initiele gewichten w1 = (1; 1)
en w2 = (30; 2). Geef aan wat de eindwaarden van deze gewichten zijn bij gebruik van de
batch versie van het simple unsupervised leeralgoritme, met het minimale afstand criterium
voor de winnaar. Zie ook onderstaande figuur (niet op schaal).

           |x2
           |
           |                        w2
(-10; 1)   |w1  ( 10; 1)
           |
___________|_______________________________________  x1
           |
           |
(-10;-1)   |    ( 10;-1)
           |
           |

R:
the distance from w1 to xi (any i) is always shorter than the distance from w2 to xi, so
w1 will be always the winner. Therefore, w2 will not change, and w1 will migrate towards the
middle of the points:

w1(new) = (0, 0)
w2(new) = w2(old) = (30, 2)

c) Herhaal het voorafgaande onderdeel, maar nu met initiele gewichten w1=(1; 1) en w2=(3; 2).

R:
In the new situation, w1 will win for the last 2 points, and w2 for the first two (as it is
closer). The final situation is:
w1(new) = ( 10, 0)
w2(new) = (-10, 0)




++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Waardering:
Opgave 1a: 2 punten Opgave 2a: 1 punt
Opgave 1b: 2 punten Opgave 2b: 1 punt
Opgave 1c: 2 punten Opgave 2c: 2 punten
Opgave 1d: 2 punten Opgave 2d: 2 punten
Opgave 1e: 2 punten Opgave 2e: 2 punten
                    Opgave 2f: 2 punten

Opgave 3a: 1 punt    Opgave 4a: 4 punten
Opgave 3b: 1 punt    Opgave 4b: 3 punten
Opgave 3c: 2 punten  Opgave 4c: 3 punten
Opgave 3d: 2 punten
Opgave 3e: 2 punten
Opgave 3f: 2 punten