
J.E. Steck Department of Mechanical Engineering, Wichita State University, Wichita, KS 67260-0035

S.R. Skinner Department of Electrical Engineering, Wichita State University, Wichita, KS 67260-0044

Abstract

We present a mathematical implementation of a quantum mechanical artificial neural network, in the quasi-continuum regime, using the nonlinearity inherent in the real-time propagation of a quantum system coupled to its environment. Our model is that of a quantum dot molecule coupled to the substrate lattice through optical phonons, and subject to a time-varying external field. Using discretized Feynman path integrals, we find that the real-time evolution of the system can be put into a form which resembles the equations for the virtual neuron activation levels of an artificial neural network. The timeline discretization points serve as virtual neurons. We then train the network using a simple gradient descent algorithm, and find it is possible in some regions of the phase space to perform any desired classical logic gate. Because the network is quantum mechanical we can also train purely quantum gates such as a phase shift.

I. INTRODUCTION

Many artificial neural networks are simulations running on algorithmic computers [1]. In such simulations the massively parallel processing speed advantage of a neural network is lost. Clearly it would be better to use the intrinsic physics of a physical system to perform the computation. Many efforts have been made in this direction, using systems ranging from nonlinear optical materials to proteins [2]. At the same time, many other workers have been exploring the possibility of building quantum computers [3]. In this paper we present an architecture for a quantum neural computer, using the real-time evolution of quantum dot molecules [4], and show by simulation that such an architecture can perform any classical logic gate. Since the time evolution is quantum mechanical, it can compute backwards or forwards in time; moreover, it can calculate a purely quantum mechanical gate, such as a phase shift, for which there is no classical equivalent.

II. MATHEMATICAL DEVELOPMENT

In most artificial neural network implementations, the neurons receive inputs from other processors via weighted connections and calculate an output which is passed on to other neurons. The output $x_i$ of the $i$th neuron is calculated from the signals $\{x_j\}$ from the other neurons in the network and is given by

$$
x_i = \sum_j w_{ij}\, f_j(x_j) \tag{1}
$$

where $w_{ij}$ is the weight of the connection from the output of neuron $j$ to neuron $i$, and $f_j$ is a bounded, differentiable, real-valued neuron activation function for neuron $j$ [1]. Similarly, we can write the expression for the time evolution of the quantum mechanical state of a system:

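$$
\begin{aligned}
|\psi(x_f,T)\rangle &= G(x_f,T;x_0,0)\,|\psi(x_0,0)\rangle \\
&= \int_{(x_0,0)}^{(x_f,T)} D[x(t)]\, \exp\!\left( \frac{i}{\hbar} \int_0^T d\tau \left[ \tfrac{1}{2}\, m\dot{x}^2 - V(x) \right] \right) |\psi(x_0,0)\rangle \\
&= \lim_{N\to\infty} \int_{(x_0,0)}^{(x_{N+1}=x_f,\,T)} dx_1\, dx_2 \cdots dx_N \left( \frac{m}{2\pi i\hbar\,\Delta t} \right)^{(N+1)/2} \exp\!\left( \frac{i\,\Delta t}{\hbar} \sum_{j=0}^{N} \left[ \frac{m}{2} \left( \frac{x_{j+1}-x_j}{\Delta t} \right)^2 - V(x_j) \right] \right) |\psi(x_0,0)\rangle
\end{aligned}
\tag{2}
$$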

Here $|\psi(x_0,0)\rangle$ is the input state, the initial state of the quantum system, and $|\psi(x_f,T)\rangle$ is the output state, the state of the system at time $t=T$. $G$ is the Green's function, which propagates the system forward in time, from initial position $x_0$ at time $t=0$ to final position $x_f$ at time $t=T$. The second line of Equation 2 expresses $G$ in the Feynman path integral formulation of quantum mechanics [5], in which $G$ is thought of as the infinite sum over all possible paths that the system could take to get from $x_0$ to $x_f$. This is indicated by the notation $D[x(t)]$, an infinitesimal change in the path $x(t)$. Each path is weighted by the complex exponential of the phase contributed by that path, given by the classical action for that path; $m$ is the mass, $2\pi\hbar$ is Planck's constant, and $V$ is the potential energy. This is equivalent to the third line, in which the paths are discretized: $N\Delta t = T$, with the number of discretization points $N \to \infty$. If, instead, we take $N$ to be finite, we have a “quasi-continuum”, and the $N$ intermediate states can be considered to be the states of $N$ quantum neurons, one at each time slice $j\Delta t$. The nonlinearity necessary for neural computation is inherent in the “kinetic energy” term, $(x_{j+1} - x_j)^2$, and in the exponential. Each of the $N$ neurons' different possible states contributes to the final measured state; the amount it contributes can be adjusted by changing the potential energy, $V(x)$.

III. MODEL SYSTEM

Our model system is that of a quantum dot molecule [4, 6], with five dots arranged as the pips on a playing card. The dots are close enough to each other that tunneling is possible between any two neighbors. Two electrons are fed into the molecule, which then has a doubly degenerate ground state (in the absence of environmental potentials). The two ground states are shown in Figure 1, and can be thought of as the “polarization” $P$ of the molecule, equal to ±1, that is, the eigenvalues of the Pauli matrix operator $\sigma_z$.
In Equation 2 this would be the value of the position variable $x$ at a given time $t$, $x(t)$.

Figure 1: A quantum dot molecule, the physical model for our quantum neural network. The circles represent the quantum dots, which are spatially close enough that electrons can tunnel between neighboring dots. Coulombic repulsion between the two electrons resident in the molecule gives it two possible (ground) states, $P=+1$ and $P=-1$.

In addition to adjusting or training $V(x)$, we can obtain an additional trainable nonlinearity by coupling the quantum system to its environment. We choose this environment to be a set of Gaussians, that is, the environment has a quadratic Hamiltonian, or, equivalently, a normal distribution; if the set is taken to be infinite, any desired influence including dissipation can be produced [7]. In our model this would be represented by the coupling between the electronic state of the dot molecules and the lattice through optical phonons. Physically the coupling would have to be weak enough to be represented accurately as linear; for example, a GaAs substrate satisfies this, with a (unitless) electron-phonon coupling parameter of 0.08 << 1 [8]. Equation 2 becomes:

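$$
|\psi(\sigma_z(N\Delta t),T)\rangle = \sum_{\{\sigma_z(j\Delta t)\}} \exp\!\left( \frac{i}{\hbar} \sum_j \left[ K\sigma_x(j\Delta t) + \varepsilon(j\Delta t)\,\sigma_z(j\Delta t) \right] \right) I[\sigma_z(t)]\; |\psi(\sigma_z(0),0)\rangle \tag{3}
$$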

where the path integral over possible positions at each time, $x(t)$, has been written as a finite set of sums over states of the polarization, $\sigma_z$, at each time slice $j\Delta t$: at each time slice, the polarization can be either +1 or -1. The potential energy $V$ comes from a time-varying electric field, $\varepsilon(t)$, and the kinetic energy term, in this two-state basis, now has the form $K\sigma_x$, where $\sigma_x$ is the Pauli matrix. Since $\sigma_x$ is off-diagonal in the polarization basis, this term contains the (nonlinear) coupling between the states of the quantum dot molecule at successive time slices. The size of this term, given by the parameter $K$ (the “tunneling amplitude”), is determined by the physics of the dot molecule: how easy it is for the electrons to tunnel from polarization state +1 to -1. The effect of the optical phonons is summarized by the influence functional $I[\sigma_z(t)]$, given by

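$$
I[\sigma_z(t)] = \int \prod_k D[\alpha_k(t)]\, \exp\!\left( \frac{i}{\hbar} \int_0^T d\tau \sum_k \left[ \frac{m_k}{2}\,\dot{\alpha}_k^2(\tau) + \frac{m_k\omega_k^2}{2}\,\alpha_k^2(\tau) + \lambda_k\,\alpha_k(\tau)\,\sigma_z(\tau) \right] \right) \tag{4}
$$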

where $\alpha_k$ is the position variable of the $k$th harmonic oscillator (phonon), $m_k$ its mass, $\omega_k$ its frequency, and $\lambda_k$ its coupling strength to the system. The advantage of a linearly coupled harmonic bath is that the path integrals over the phonons can be performed immediately, giving us the nonlinear functional:

$$
I[\sigma_z(t)] = \exp\!\left[ \sum_j \sum_{j'} \sigma_z(j\Delta t)\, \chi(j\Delta t, j'\Delta t)\, \sigma_z(j'\Delta t) \right] \tag{5}
$$

where $\chi(\tau,\tau') = \chi(|\tau-\tau'|) = \chi(\tau'')$ is the influence phase, proportional to the response function of the bath. For the phonon bath,

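$$
\chi(\tau'') = \sum_k \frac{\lambda_k^2}{2 m_k \omega_k}\, \mathrm{csch}(\beta\hbar\omega_k/2) \left[ \cosh(\beta\hbar\omega_k/2)\cos(\omega_k\tau'') + i\,\sinh(\beta\hbar\omega_k/2)\sin(\omega_k\tau'') \right] \tag{6}
$$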

where we have also introduced a (suitably low) temperature, given by $1/\beta$ in units of Boltzmann's constant. Since $I[\sigma_z(t)]$ is a functional of the state of the quantum system at all times (i.e. the activation levels of all the neurons), it adds another nonlinearity.

The trainable parameters can be taken to be the coupling strengths to each of the bath oscillators, $\{\lambda_k\}$; the values of the electric field at each time slice $j$, $\{\varepsilon_j\}$; the frequencies of the oscillators, $\{\omega_k\}$; or any combination of these. Each of these can be controlled physically: $\{\lambda_k\}$, by optically exciting multiple phonons; $\{\varepsilon_j\}$, by changing the external field; and $\{\omega_k\}$, by changing the phonon frequencies excited. It should be noted that this is not a feed-forward network: all neurons are connected to all other neurons, both forward and backward in time, by the effects of the environment.

IV. RESULTS AND DISCUSSION

We now set up a simulation of the quantum neural network to do a logic gate. We specify as inputs the initial ($t=0$) polarizations of each of two quantum dot molecules, far enough from each other spatially that they do not interact directly, but sharing the same substrate. The logical inputs correspond to the polarizations $P$ of the two molecules as follows: (0,0) : {-1, -1}; (0,1) : {-1, +1}; (1,0) : {+1, -1}; and (1,1) : {+1, +1}. So, for example, if the two molecules pictured in Figure 1 were our two molecules in their prepared input states, this would correspond to a logical input of (1,0). For output we (arbitrarily) take the polarization of the first molecule at the final time, which could be determined by measuring the electric field of the quantum dot molecule, and threshold it at some value. Thus, the computed output of the system, which is between 0 and 1, is thresholded:

$$
\mathrm{Output} = \mathrm{Th}\!\left( \left| \langle +|\psi(\sigma_z(N\Delta t),T)\rangle \right|^2 \right) \tag{7}
$$

where $\langle +|\psi(\sigma_z(N\Delta t),T)\rangle$ is the computed probability amplitude for the first molecule's final state to be equal to the $\langle +|$ state. We consider the network trained if the absolute magnitude squared of the probability amplitude for the first molecule's polarization is greater than (less than) the chosen threshold value, as desired for the given logic gate. For example, for the XOR the “goals” for the computed outputs from the given inputs would be 0.01, 0.1, 0.1, and 0.01, respectively. A simple gradient descent algorithm was used to train the network. The error function is given by

$$
E = \frac{1}{4} \left[ \left| \langle +|\psi\rangle \right|^2 - \mathrm{Desired} \right]^2 \tag{8}
$$

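where the trace over the state of the second molecule is understood. We then differentiate the Lagrangian, that is, the error function $E$, with respect to the particular parameter we wish to train; for example, with respect to the coupling strength $\lambda_k$ of the quantum system to phonon $k$ we have:

$$
\frac{\partial E}{\partial \lambda_k} = \frac{1}{2} \left[ \left| \langle +|\psi\rangle \right|^2 - \mathrm{Desired} \right] \left[ 2\,\mathrm{Re}\!\left( \frac{\partial \langle +|\psi\rangle}{\partial \lambda_k}\, \langle +|\psi\rangle^{*} \right) \right] \tag{9}
$$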
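The chain rule behind Equation 9 can be checked numerically. The sketch below is only an illustration: it reads Equation 8 as a prefactor of 1/4 times the squared residual, and it replaces the real amplitude with a made-up smooth complex function of a single coupling (the names `amplitude`, `d_amplitude`, and the constants inside them are hypothetical, not taken from the paper). It compares the analytic derivative in the form of Equation 9 with a central finite difference.

```python
import cmath

def amplitude(lam):
    """Hypothetical stand-in for <+|psi> as a smooth function of one coupling lam."""
    return 0.6 * cmath.exp(1j * 0.8 * lam) + 0.3 * cmath.exp(-1j * 1.7 * lam)

def d_amplitude(lam):
    """Analytic derivative of the stand-in amplitude with respect to lam."""
    return (0.6 * 0.8j) * cmath.exp(1j * 0.8 * lam) \
        + (0.3 * -1.7j) * cmath.exp(-1j * 1.7 * lam)

def error(lam, desired):
    """Eq. (8) as read here: E = (1/4) * (|<+|psi>|^2 - desired)^2."""
    return 0.25 * (abs(amplitude(lam)) ** 2 - desired) ** 2

def error_gradient(lam, desired):
    """Eq. (9): dE/dlam = (1/2) (|a|^2 - desired) * 2 Re[(da/dlam) * conj(a)]."""
    a = amplitude(lam)
    return 0.5 * (abs(a) ** 2 - desired) * 2.0 * (d_amplitude(lam) * a.conjugate()).real

lam, desired, h = 0.4, 0.1, 1e-6
# Central finite difference of the error, for comparison with the analytic form.
numeric = (error(lam + h, desired) - error(lam - h, desired)) / (2 * h)
analytic = error_gradient(lam, desired)
print(analytic, numeric)
```

The factor 2 Re[...] is simply the derivative of the squared modulus of a complex amplitude, which is why the two numbers printed should agree closely.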

For each training pair, all paths (i.e. all possible states of the set of neurons; the sum over the different values for the set $\{\sigma_z\}$ in Eq. (3)) are evaluated exactly, and all the contributions summed, to compute both the result, $|\langle +|\psi\rangle|$, and the derivatives of the error $E$ with respect to each of the adjustable weights. This identifies the direction in the parameter space of weights along which the error decreases. The weights are then changed in that direction, according to

$$
\lambda_k^{\mathrm{new}} = \lambda_k^{\mathrm{old}} - \eta\, \frac{\partial E}{\partial \lambda_k} \tag{10}
$$

where $\eta$ is the training rate, and the process repeats.

Figure 2: Quantum neural network. Two quantum dot molecules are evolving in time; each of the pictures, as read from left to right, is a snapshot of the states of the molecules at a given time. The far left states represent the input layer. The states on the far right represent the output layer. In this particular example the inputs are {-1,+1}, corresponding to a logical (0,1). The particular path shown is one of $2^{N+1}$ possible paths; here, $N=4$. This particular path's probability amplitude would contribute to $\langle +|\psi\rangle$; the net would be training to increase this contribution if, for example, we are doing an XOR.

Figure 2 shows a picture of one particular path of the $2^N$ paths whose amplitude would contribute. The two rows of pictures should be read from left to right as a series of snapshots of the state of each of the two molecules at each time slice. Each time slice corresponds to a neuron. The far left states, in this case {-1,1}, correspond to the input layer; the far right states, to the output layer. Figure 3 shows the error as a function of training pass, for one logic gate, the XOR, and Figure 4 shows the asymptotic error value as a function of the number of neurons, $N$. For $N < 5$ the XOR gate does not train satisfactorily, but for $N=5$ or 6 the error is essentially zero. By adjusting the values of the tunneling amplitude $K$, the training rates $\eta$ for the electric field and for the coupling strengths, the thresholding values, and the characteristic frequencies $\omega_k$, we have been able to find regions in which we can train the net successfully to do any classical logic gate. Results are shown in Table 1 for one of these regions, with the number of discretization points (neurons) equal to 5. Each of these calculations was started at the same point in the parameter space, as indicated in the table.
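The procedure described above (sum every path exactly, then step the trainable parameters down the error gradient) can be sketched on a much smaller toy problem. The code below is not the simulation used in this paper: it assumes a single molecule, no phonon bath (influence functional set to 1), units with ħ = 1, a per-slice unitary step exp(-i Δt (K σ_x + ε_j σ_z)), and finite-difference gradients in place of the analytic derivatives. All function names and numerical values are hypothetical.

```python
import math
from itertools import product

def step_matrix(K, eps, dt):
    """Unitary exp(-i*dt*(K*sigma_x + eps*sigma_z)) with hbar = 1.
    Index 0 means polarization +1, index 1 means polarization -1."""
    r = math.hypot(K, eps)
    if r == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(r * dt), math.sin(r * dt)
    nx, nz = K / r, eps / r
    return [[c - 1j * s * nz, -1j * s * nx],
            [-1j * s * nx, c + 1j * s * nz]]

def amplitude_by_paths(fields, K, dt, s_in, s_out=0):
    """Sum, over every intermediate polarization path, the product of step amplitudes."""
    steps = [step_matrix(K, e, dt) for e in fields]
    total = 0j
    for mid in product((0, 1), repeat=len(fields) - 1):
        path = (s_in,) + mid + (s_out,)
        amp = 1 + 0j
        for j, U in enumerate(steps):
            amp *= U[path[j + 1]][path[j]]   # amplitude to go from path[j] to path[j+1]
        total += amp
    return total

def amplitude_by_matrices(fields, K, dt, s_in, s_out=0):
    """Same amplitude computed by applying the step matrices one after another."""
    vec = [0j, 0j]
    vec[s_in] = 1 + 0j
    for e in fields:
        U = step_matrix(K, e, dt)
        vec = [U[0][0] * vec[0] + U[0][1] * vec[1],
               U[1][0] * vec[0] + U[1][1] * vec[1]]
    return vec[s_out]

def error(fields, K, dt, s_in, desired):
    """Squared-residual error with the 1/4 prefactor, for one training pair."""
    p = abs(amplitude_by_paths(fields, K, dt, s_in)) ** 2
    return 0.25 * (p - desired) ** 2

def train(fields, K, dt, s_in, desired, rate, passes, h=1e-6):
    """Plain gradient descent on the field values, using central differences."""
    fields = list(fields)
    for _ in range(passes):
        grad = []
        for j in range(len(fields)):
            up, dn = list(fields), list(fields)
            up[j] += h
            dn[j] -= h
            grad.append((error(up, K, dt, s_in, desired)
                         - error(dn, K, dt, s_in, desired)) / (2 * h))
        fields = [e - rate * g for e, g in zip(fields, grad)]
    return fields

K, dt, s_in, desired = 1.0, 0.3, 1, 0.1
start = [0.5] * 6   # six steps, i.e. five intermediate time slices ("neurons")
trained = train(start, K, dt, s_in, desired, rate=0.1, passes=200)
print(error(start, K, dt, s_in, desired), error(trained, K, dt, s_in, desired))
```

The explicit path sum and the matrix product give the same amplitude, which is a useful sanity check before adding anything like a bath coupling, where the path sum no longer factorizes slice by slice.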
Because the time dynamics are quantum mechanical in nature, the network is also capable of performing purely quantum computation, such as phase shifts, which cannot be done by a classical computer. Preliminary calculations, done using a single quantum dot molecule (one input, one output), show that the net is indeed capable of these; however, for this to be implemented physically we would need a sufficiently sensitive method for measuring the phase [9]. The net is also capable of doing computation forwards or backwards in time: computationally, this means simply replacing time with negative time. Since in this model no dissipative influence was included, the net is symmetric in time; thus, in principle at least, the net could provide the answer before the question was asked.

V. CONCLUSIONS AND FUTURE WORK

Potentially, a quantum neural network would be an extremely powerful computational tool. Moreover, it is capable, at least in principle, of performing computations that cannot be done classically. This current work, of course, merely demonstrates proof of concept; an actual working quantum neural net would likely take advantage of the greater multiplicity and connectivity inherent in an entire array of quantum dot molecules, by placing the molecules physically close enough to each other that nearest neighbors can interact directly, as Lent, Tougaw and Porod [4] proposed for their algorithmic quantum dot computer. This would have the additional advantage of reducing error, in the sense that if one molecule becomes damaged, the net can correct for the resulting error; this mechanism is not possible with the present architecture.

Figure 3: Error as a function of training pass, for the XOR gate. A three-oscillator bath was used, with frequencies of 1.3, 1.06, and 1.65 (in units of $1.5 \times 10^{10}$ s$^{-1}$). The temperature was 1.2 mK, the tunneling amplitude between states 6 meV, the total time of evolution 1 psec. The number of discretization points (neurons) was 5. The training rate for the fields was $5 \times 10^{-4}$, and for the couplings $1 \times 10^{-5}$.

Figure 4: Asymptotic error as a function of $N$, the number of neurons, for the XOR gate. Same parameters and procedures as for Figure 3.

The major question that still needs to be addressed is: what level of noise is sufficient to destroy the quantum coherence and thus the computational power of the quantum neural net? Roughly, this will occur when the dissipation time scale approaches the computational time. This work is currently in progress.

VI. ACKNOWLEDGEMENT

This work was supported in part by the National Science Foundation, under grant ECS-9312345, and by the University of Kansas Center for Research, Inc.

VII. REFERENCES

1. P.D. Wasserman, Neural Computing: Theory and Practice. Van Nostrand Reinhold, New York, 1965.

2. S.R. Skinner, E.C. Behrman, A.A. Cruz-Cabrera, and J.E. Steck, "Neural network implementation using self-lensing media," Appl. Opt., vol. 34, pp. 4129-4135, 10 Jul. 1995; R.R. Birge, "Protein-based computers," Sci. Am., vol. 272, pp. 90-95, Mar. 1995.

3. A. Barenco, D. Deutsch, A. Ekert, and R. Jozsa, "Conditional quantum dynamics and logic gates," Phys. Rev. Lett., vol. 74, pp. 4083-4086, 15 May 1995; J.J. Cirac and P. Zoller, "Quantum computations with cold trapped ions," Phys. Rev. Lett., vol. 74, pp. 4091-4094, 15 May 1995.

4. C.S. Lent, P.D. Tougaw, and W. Porod, "Quantum cellular automata: the physics of computing with arrays of quantum dot molecules," in Proceedings of the Workshop on Physics and Computation (PhysComp '94), IEEE Computer Society Press, 1994, pp. 5-13.

5. R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals. McGraw-Hill, New York, 1965.

6. M. Kemerink and L.W. Molenkamp, "Stochastic Coulomb blockade in a double quantum dot," Appl. Phys. Lett., vol. 65, pp. 1012-1014, 22 Aug. 1994.

7. A.O. Caldeira and A.J. Leggett, "Quantum tunneling in a dissipative system," Ann. Phys., vol. 149, pp. 374-456, 1983.

8. Y. Wan, G. Ortiz, and P. Phillips, "Pair tunneling in semiconductor quantum dots," Phys. Rev. Lett., vol. 75, pp. 2879-2882, 9 Oct. 1995.

9. J.H. Shapiro, S.R. Shepard, and N.C. Wong, "Ultimate quantum limits on phase measurement," Phys. Rev. Lett., vol. 62, pp. 2377-2380, 15 May 1989.

Table 1: Quantum neural network binary classical logic gates. Conditions were the same as in Figure 3.

GATE    VALUE   INPUT 00  INPUT 01  INPUT 10  INPUT 11
OR      Goal    0.01      0.10      0.10      0.10
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.0110    0.1338    0.3486    0.1816
NOR     Goal    0.10      0.01      0.01      0.01
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.3287    0.0108    0.0110    0.0110
AND     Goal    0.01      0.01      0.01      0.10
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.0114    0.0129    0.0139    0.6499
NAND    Goal    0.10      0.10      0.10      0.01
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.2088    0.1003    0.2150    0.0110
XOR     Goal    0.01      0.10      0.10      0.01
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.0110    0.1007    0.3358    0.0110
XNOR    Goal    0.10      0.01      0.01      0.10
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.1030    0.0102    0.0158    0.1023
SPEC    Goal    0.10      0.01      0.10      0.01
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.0539    0.0093    0.2350    0.0103
NSPEC   Goal    0.01      0.10      0.01      0.10
        Start   0.2032    0.0426    0.3331    0.0964
        End     0.0089    0.2233    0.0103    0.2735
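The thresholding criterion of Section IV can be checked directly against the numbers in Table 1. The sketch below is an illustration only: the paper does not state the threshold values it used, so the code merely asks whether some threshold exists that separates the outputs with a "high" goal (0.10) from those with a "low" goal (0.01). The helper names `window` and `separable` and the split value 0.05 between the two goal levels are assumptions made for this example.

```python
# Start values are shared by all gates; Goal and End rows are transcribed from Table 1,
# ordered by logical input 00, 01, 10, 11.
START = [0.2032, 0.0426, 0.3331, 0.0964]
TABLE = {
    "OR":    ([0.01, 0.10, 0.10, 0.10], [0.0110, 0.1338, 0.3486, 0.1816]),
    "NOR":   ([0.10, 0.01, 0.01, 0.01], [0.3287, 0.0108, 0.0110, 0.0110]),
    "AND":   ([0.01, 0.01, 0.01, 0.10], [0.0114, 0.0129, 0.0139, 0.6499]),
    "NAND":  ([0.10, 0.10, 0.10, 0.01], [0.2088, 0.1003, 0.2150, 0.0110]),
    "XOR":   ([0.01, 0.10, 0.10, 0.01], [0.0110, 0.1007, 0.3358, 0.0110]),
    "XNOR":  ([0.10, 0.01, 0.01, 0.10], [0.1030, 0.0102, 0.0158, 0.1023]),
    "SPEC":  ([0.10, 0.01, 0.10, 0.01], [0.0539, 0.0093, 0.2350, 0.0103]),
    "NSPEC": ([0.01, 0.10, 0.01, 0.10], [0.0089, 0.2233, 0.0103, 0.2735]),
}

def window(goal, values, split=0.05):
    """Return (largest output with a low goal, smallest output with a high goal)."""
    lows = [v for g, v in zip(goal, values) if g < split]
    highs = [v for g, v in zip(goal, values) if g >= split]
    return max(lows), min(highs)

def separable(goal, values):
    """True if a single threshold can split low-goal outputs from high-goal outputs."""
    top_low, bottom_high = window(goal, values)
    return top_low < bottom_high

for gate, (goal, end) in TABLE.items():
    print(gate, separable(goal, START), separable(goal, end), window(goal, end))
```

For each gate the loop prints whether the untrained (Start) and trained (End) outputs admit a separating threshold, together with the interval in which such a threshold would have to lie after training.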
