C. R. A. P.
      (Certain Research Articles & Programs)


      I am currently working with neural networks, mainly feed-forward models (FFNNs). I am trying to put several FFNNs together and construct a network of neural networks (an entity). If you are interested in the subject, I have a report (an account of my work, see below), some papers and a program to help you create, train and test FFNN entities.

      BTW, I got my PhD in 2000.

      PhD Thesis

    • Feed Forward Neural Network Entities (postscript, gzipped) March, 2000

    • Bibliography in bibtex format
    • My Thesis Latex Style

    • Articles

    • A Neural Network Scheme For Earthquake Prediction Based On The Seismic Electric Signals (postscript).

    • Analysis of Seismic Electric Signals for Earthquake Prediction using Neural Networks (postscript).

    • Three Dimensional Tracking with Ultrasound for Virtual Reality Applications in Surgery (Word 6).

    • Feed Forward Neural Network Entities (postscript). See in the 'Programs' section below about a package which implements neural network entities.


    • Other documents

    • My BSc final year project on implementing a NN scheme for transputers.

    • Time Series Prediction, an investigation of possible implementations and methods.

    • A report of my work so far. ... and a more general account of my recent work.


    • Computer programs

    • A Feed Forward Neural Network engine (in C) V 10.0 (01/11/2011).
      To install: download, extract, then run ./configure, make and make install.

    • Neural Parser v4.2 (01/11/2011). This application helps in defining, training and testing neural networks (feed-forward / back-propagation only) and also Support Vector Machines (SVM, using the libsvm software package). Training and testing a single FFNN is not a big deal (and can be done with the application below), but training and testing an entity of hundreds of FFNNs is a lot of work. The application understands a simple script language; an entity can be trained with a single command. An interpreter written in Perl parses the script file. The program needs Perl version 5 or later and a C compiler. It has been compiled on Solaris, SunOS (sun4), FreeBSD (i386) and Linux, and is now built via a 'configure' script.
      Here is an example np program to train and test a Feed Forward Neural Network, and here is an example np program to find optimal parameters, train and test a Support Vector Machine (SVM).
      This program requires NNengine v8.0, above.
    • Adaline (single layer perceptron) engine (in C).

    • A Feed Forward Neural Network simulator for X-windows. There are two versions: one for the T800 transputer system, written in parallel C, and the other for ordinary systems with an ANSI C compiler and X-windows.

    • Try NoodlePlot (ver. 3.1), a utility for plotting, in an aesthetically sensitive graphical environment, my life, while I happily munch noodles spiced with mono-sodium glutamate. Runs on X11. Command-line driven and realtime (i.e. pipe a command's output to it and it will plot the numbers as they come out; do not try it on humans!!). Here is a description, a screen dump and the icon of NoodlePlot.

    • A java program implementing an idea called ``Organic GUI''.
      Source code is available on request.

      Organic scrollbars: A java program implementing vertical and horizontal scrollbars with custom images getting an ``organic'' feel. Source code is available on request.

    • Principal Components Analysis (PCA) program for large files and large dimensions.
      It requires GNU GSL and a BLAS library (one of many is ATLAS) and LAPACK. It runs on Linux.

      Source code is free to use and modify for non-commercial use.

    • A java program implementing an idea called ``Hyperbolic Tree'' which may be used for the visualisation of tree data. Lately this idea is very fashionable (see www.inxight.com) and applied to data visualisation (for example: a web site representation or a file system structure, etc).

      Source code is available here as a unix tar archive. It is a good idea to read the README file before downloading and installing. The source code is publicly available and distributed freely under the GPL.

      There is some documentation in javadoc format here. Pay particular attention to the classes gnu.hyperti.HyperbolicTreeNode and gnu.hyperti.HyperbolicTree.

      Also have a look at this harness java program. This is what has produced the applet above.

      A kind soul, Benjamin Matasar, has modified my code and has contributed this source code. In his words:

      I was able to get it to work in my project, and I have fixed a few small issues with disconnection, and added an event/listener based system to the Hyperbolic Tree itself. Attached is my copy of the source, if you're interested. I also may do a few more structural changes later on.

    • Mandelbrot is a program for creating Mandelbrot set plots.

    • An image processing suite initially for the Silicon Graphics. Needs X. It has compiled on SunOS/Solaris too.

    • A set of image processing routines. It does not require X.

    • An implementation of the Marching Cubes algorithm. It is a process by which a set of 2D images representing cross-sections (slices) of a 3D object (e.g. a set of CT scans taken cross-sectionally at a fixed increment) is used to reconstruct the original 3D object as a mesh of triangles.

    • Marching Cubes and the various image processing applications were written for Dr. G. Alusi who has paid me generously (thanks). However his friend Koosai Rauf (a signal processing man from Grenoble's university, France) still owes me a lot of money for ThinkDSP a program for win3.1(yiakkkksss) which allows one to program DSP chips in a visual environment. I do not have the compiler here, but the program stands on its own as a visual programming environment. Try it, it is for free...

    • Greeklish : a program to convert lower ASCII characters (<128) to higher ASCII (>128) which can be imported into LaTeX or HTML and interpreted as greek. The correspondence is based on `greeklish' - an ad-hoc way of writing greek with the latin alphabet. The program can be downloaded from here and also contains a latex file showing how you can typeset in greek. The program is free and can be copied/changed etc. It has been compiled on Solaris and Linux. Windows ? be serious now!

    • microemacs sources and binaries for linux on PC
    • Remove caps lock in Linux:
      xmodmap -e "remove Lock = Caps_Lock"
      If you have window maker then place the above line in the file ~/GNUstep/Library/WindowMaker/autostart
      In other window managers perhaps put it in your .xinitrc
    • Save paper! A unix script to take a postscript file of many pages and produce booklets, either 4-up (i.e. two input pages fit on one face of the output sheet and another two on the other face of the same sheet; book dimensions become a4/2 and the sheet count is reduced by a factor of 4) or 8-up (i.e. four input pages fit on one face of the output sheet and four on the other face, producing booklets of a4/4 dimensions and reducing the sheet count by a factor of 8). This script uses pstops, which can be downloaded from elsewhere.
    • Quotes in csh / tcsh ! argghhh
      I have a file (ClassNames.txt) which contains two columns, classname and ID, e.g. orange 1, red 2, blue 3. I need to replace the classnames in another file (data.txt) with the IDs.
      For each line in ClassNames.txt, do a sed on data.txt:
      awk -v q="'" '{print "sed -e "q"s/"$1"/"$2"/"q" data.txt" }' ClassNames.txt | csh -f
      Because I always have problems with nested quotes in csh and I can't stand bash or sh, I resorted to one very simple hack: use variables to store quotes. So, here I define an awk variable 'q' to hold the quote I need.
    • The Genetic Algorithms package pgapack has a problem when compiled on 64-bit machines. After compiling the program, go to the 'test' folder and run './instverf'. If any tests fail, try recompiling the whole thing with the definition -DWL=64
      I have noticed that by default it is set to '-DWL=32' (in configure and configure.in). So replace all 32 by 64 in configure and configure.in in the main folder and then run (for apple mac, snow leopard, yaks)
      ./configure -arch linux -cc gcc -f77 g77 -cflags -arch x86_64 -march=core2 -mpilib /usr/lib/libmpi.dylib -mpiinc /usr/include/
    • Convert an Excel spreadsheet (xlsx) into text files (one for each sheet) using this bash script. It requires 'R' (what a name!) and the R package 'xlsx'. To install this package, type this
      install.packages(c("xlsx"))
      in R (what a name! - the ideal name for googling something, come on idiots, unite! you are invincible). How these things are (!R) hey? R I guess stands for Robert Gentleman and Ross Ihaka who first wrote R, I guess again following the name convention of package S.
      some user of R
      You just have too large a vector for your memory.
      There is not much you can do with an object of 500 MG.
      You have over 137 million combinations.

      What are you trying to do with this vector?
      --- XXXXX wrote:
      >
      > Hello everybody!!
      > I'm from YYYYY (ZZZZZZ) and I'm new on R.
      > I've been trying to
      > generate all of the possible combinations for a 6
      > number combination with
      > numbers that ranges from 1 to 53.
      >
      > I've used the following commands:
      >
      > datos<-c(1:53)
      >
      > M<-matrix(data=(combn(datos,6,FUN=NULL,simplify=TRUE)),nrow=22957480,ncol=6,byrow=TRUE)
      >
      > Once the commands are executed, the program shows
      > the following:
      >
      > Error: CANNOT ALLOCATE A VECTOR OF SIZE 525.5 Mb
      >
      >
      > How can I fix this problem?
      Honestly, I find R (apart from its name) very useful even if I have lots more trials and errors googling something about it - Thanks R! Don't thank me Mam, Thank SuperM.A.N!
      And that guy did win the Lottery.
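
The numbers in the exchange above can be checked with a few lines of arithmetic. The sketch below uses Python purely as a calculator: it counts the 6-number combinations of 53 values and estimates the memory needed to hold them all. It assumes 4 bytes per element (the size of an R integer); the variable names are illustrative only.

```python
import math

n_combinations = math.comb(53, 6)   # ways to pick 6 numbers out of 53
n_elements = n_combinations * 6     # one matrix cell per chosen number
bytes_per_int = 4                   # assumption: elements stored as 4-byte integers
size_mib = n_elements * bytes_per_int / 2**20

print(n_combinations)       # 22957480
print(n_elements)           # 137744880
print(round(size_mib, 1))   # 525.5
```

So the "137 million" in the reply is the number of matrix cells (22957480 rows of 6), and 525.5 Mb is what a single copy of that matrix occupies under the 4-byte assumption.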


    • Data

    • Financial Data. Data for various financial indices like S&P 500, FTSE etc.


    • np script language : training and testing a neural network example

      First, use this bash script to produce the training and test data (it simulates a function of two inputs, x/(4y), for x and y between 1 and 4, so the output is between 1/16 and 1. A different function can be used instead, see the line starting with $out):
      #!/bin/bash
      num_training_lines=2000
      num_test_lines=3000
      
      # nothing to change from here on
      total_lines=$((num_training_lines+num_test_lines))
      perl -e '
      srand(1981);
      for($i=0;$i<'${total_lines}';$i++){
      	$x = rand(3) + 1;
      	$y = rand(3) + 1;
      	$out = $x / (4*$y);
      	print "$x\t$y\t$out\n";
      }' > data
      head -n ${num_training_lines} data > train
      tail -n ${num_test_lines} data > test
      

      Once the training / test data (in files 'train' and 'test' in your local directory) have been produced by running the above script, and provided that you have successfully installed both NNengine and NeuralParser, run the following np script:
      A_TRAIN_FILE = OpenFileObject { Filename = train; }
      A_TEST_FILE = OpenFileObject { Filename = test; }
      A_TRAIN_INPUTS = ExtractColumnsFromObject { Columns = A_TRAIN_FILE[1..2]; }
      A_TEST_INPUTS = ExtractColumnsFromObject { Columns = A_TEST_FILE[1..2]; }
      FFNN = CreateSingle {
      	SingleType = FFNN;
      	Arch = 2,5,1;
      	Weights = W_SINGLE;
      	Sigmoid = Yes;
      }
      TrainSingle {
      	Obj  = FFNN;
      	InpFileObj  = A_TRAIN_FILE;
      	Iters = 3000;
      	Beta = 0.015;
      	Lamda = 0.0;
      	Seed = 1234;
      	TrainingType = Continuous;
      	SaveWeightsEveryNIterations = 25;
      	ShowProgressEveryNIterations = 25;
      	ProgressFilename = NNengineProgress;
      	UniqueWeightsFile = no;
      	Silent = Yes;
      	PIDfilename = nnengine.pid;
      }
      
      # assessment
      
      # known error, on the training set
      TestSingle {
      	Obj = FFNN;
      	InpFileObj  = A_TRAIN_INPUTS;
      	OutFileName = final_output.train;
      }
      EXPECTED_OUTPUT_KNOWN = ExtractColumnsFromObject { Columns = A_TRAIN_FILE[3]; }
      ACTUAL_OUTPUT_KNOWN = OpenFileObject { Filename = final_output.train; }
      ERROR_KNOWN = ColumnsArithmetic {
      	RowExpr = 0.5*(ACTUAL_OUTPUT_KNOWN[1]-EXPECTED_OUTPUT_KNOWN[1])**2;
      	ColExpr = sqrt, sum;
      	OutFileName = error_known;
      }
      ERRORPERCENT_KNOWN = ColumnsArithmetic {
      	RowExpr = (abs(ACTUAL_OUTPUT_KNOWN[1]-EXPECTED_OUTPUT_KNOWN[1])/(max(abs(ACTUAL_OUTPUT_KNOWN[1]),abs(EXPECTED_OUTPUT_KNOWN[1]))));
      	# relative error per row, normalised by the larger of the two magnitudes;
      	# ColExpr then averages over all rows and multiplies by 100
      	# i.e. 25 means an average relative error of 25 %
      	ColExpr = average, 100.0 *;
      	OutFileName = error_known_percent;
      }
      
      # unknown error, on the test set
      TestSingle {
      	Obj = FFNN;
      	InpFileObj  = A_TEST_INPUTS;
      	OutFileName = final_output.test;
      }
      EXPECTED_OUTPUT_UNKNOWN = ExtractColumnsFromObject { Columns = A_TEST_FILE[3]; }
      ACTUAL_OUTPUT_UNKNOWN = OpenFileObject { Filename = final_output.test; }
      # below, RowExpr is evaluated for each row; when all rows have been processed,
      # ColExpr is executed over all the resulting numbers (below, a root mean square error)
      ERROR_UNKNOWN = ColumnsArithmetic {
      	RowExpr = 0.5*(ACTUAL_OUTPUT_UNKNOWN[1]-EXPECTED_OUTPUT_UNKNOWN[1])**2;
      	ColExpr = sqrt, sum;
      	OutFileName = error_unknown;
      }
      # this is somewhat controversial, as it tries to calculate a 'percent error' as follows:
      # for each expected,actual output pair do:
      # abs(exp-act)/abs(exp)
      # and then find the average of all these numbers and multiply by 100 to get percent.
      ERRORPERCENT_UNKNOWN = ColumnsArithmetic {
      #	RowExpr = (abs(ACTUAL_OUTPUT_UNKNOWN[1]-EXPECTED_OUTPUT_UNKNOWN[1])/(min(abs(ACTUAL_OUTPUT_UNKNOWN[1]),abs(EXPECTED_OUTPUT_UNKNOWN[1]))));
      	RowExpr = (abs(ACTUAL_OUTPUT_UNKNOWN[1]-EXPECTED_OUTPUT_UNKNOWN[1])/abs(EXPECTED_OUTPUT_UNKNOWN[1]));
      	ColExpr = average, 100.0 *;
      	OutFileName = error_unknown_percent;
      }
      
      DeleteObjects {
      	Obj = A_TEST_INPUTS, A_TRAIN_INPUTS, EXPECTED_OUTPUT_KNOWN, EXPECTED_OUTPUT_UNKNOWN;
      	Unlink = Yes;
      }
      $
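
As a rough illustration of what the two 'percent error' ColumnsArithmetic blocks above compute, here is a minimal Python sketch. It assumes the reading given in the script comments: RowExpr is evaluated once per row, and ColExpr then reduces the resulting column (here: average, then multiply by 100). The function name and the sample numbers are made up for illustration; this is not how np itself is implemented.

```python
def percent_error(actual, expected, denominator="expected"):
    """Mean relative error, in percent, over paired outputs.

    denominator="expected": abs(exp - act) / abs(exp)
    denominator="max":      abs(exp - act) / max(abs(act), abs(exp))
    """
    per_row = []
    for act, exp in zip(actual, expected):
        if denominator == "max":
            scale = max(abs(act), abs(exp))
        else:
            scale = abs(exp)
        # RowExpr step: one relative error per row
        per_row.append(abs(act - exp) / scale)
    # ColExpr step: average over all rows, then scale to percent
    return 100.0 * sum(per_row) / len(per_row)


# Hypothetical network outputs and targets, for illustration only
actual = [0.30, 0.50, 0.90]
expected = [0.25, 0.50, 1.00]
print(percent_error(actual, expected))
print(percent_error(actual, expected, "max"))
```

An expected output of exactly zero would divide by zero in the first variant. With the data script above this cannot happen, since x/(4y) stays at or above 1/16 for x and y between 1 and 4.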
      

      np script language : training and testing a Support Vector Machine example

      Below is an example of how to use a Support Vector Machine (SVM) to learn a boolean function. First, use this bash script to produce the training and test data - it is a boolean function of 6 inputs - this is a stupid example really.
      #!/bin/bash
      
      num_training_lines=40
      num_test_lines=3000
      
      # nothing to change from here on
      total_lines=$((num_training_lines+num_test_lines))
      perl -e '
      srand(1981);
      @inps=(0)x6;
      for($i=0;$i<'${total_lines}';$i++){
              for($j=0;$j<6;$j++){ $inps[$j] = int(rand()*10000000000) % 2 }
              $out = boolean_function(@inps);
              print join("\t", @inps)."\t$out\n";
      }
      sub     boolean_function {
              return (($_[0] && $_[1]) || ($_[1] && $_[3])) && ($_[4] || $_[5] || $_[0]) ? "1" : "0";
      }
      ' > data
      head -n ${num_training_lines} data > train
      tail -n ${num_test_lines} data > test
      
      Once the training / test data (in files 'train' and 'test' in your local directory) have been produced by running the above script, and provided that you have successfully installed both NNengine and NeuralParser, run the following np script:
      A_TRAIN_FILE_TAB = OpenFileObject { Filename = train; }
      A_TEST_FILE_TAB = OpenFileObject { Filename = test; }
      # we need to convert the tab-separated (one input per column, input columns followed by output columns)
      # data files to libSVM format using this command:
      ConvertColumnDataToLIBSVMFormat {
      	InpFileObj = A_TRAIN_FILE_TAB;
      	OutFileName = train.svm;
      	OutputsFirst = no;
      	NumInputs = 6;
      	NumOutputs = 1;
      }
      ConvertColumnDataToLIBSVMFormat {
      	InpFileObj = A_TEST_FILE_TAB;
      	OutFileName = test.svm;
      	OutputsFirst = no;
      	NumInputs = 6;
      	NumOutputs = 1;
      }
      A_TRAIN_FILE = OpenFileObject { Filename = train.svm; }
      A_TEST_FILE = OpenFileObject { Filename = test.svm; }
      
      SVM = CreateSVM {
      	Model = svm_model;
      	ProabilityEstimates = no;
      }
      SVP = CreateSVMTrainingParameters {
      	Degree = 5;
      	Kernel = 2; # radial basis
      	Cachesize = 50;
      }
      # now, SVM is weird in that the running parameters are CRUCIAL in its performance,
      # therefore we are going to spend a lot of time finding these parameters using this command:
      FindOptimalSVMTrainingParameters {
      	ParamsObj = SVP;
      	InpFileObj = A_TRAIN_FILE;
      	SaveParamsObjName = SVPoptimal;
      	NumThreads = 2; # parallelise in 2 threads if you have multi-core cpu
      	NFoldValidation = 0;
      	ExploreAtMostNEqualSolutions = 2;
      	MaxResolution = 0.1;
      	# min,max,numSteps
      	RangeCost = -3,15,2;
      	RangeGamma = -5,12,2;
      	# use per-class mean accuracy
      	PCMCriterion = yes;
      	MaxDepth = 6;
      	MaxAccuracy = 100.0;
       	MinRateOfChangeOfAccuracy = -1;
       	MinRateOfRateOfChangeOfAccuracy = -1;
      	OutputFileName = svm.optimal_parameters;
      	Verbose = yes;
      }
      # now train without crossvalidation so as to get a model out
      TrainSVM {
      	Obj = SVM;
      	InpFileObj  = A_TRAIN_FILE;
      	ParamsObj = SVPoptimal;
      	Overwrite = Yes;
      }
      
      # assessment
      # known error, on the training set
      TestSVM {
      	Obj = SVM;
      	InpFileObj  = A_TRAIN_FILE;
      	OutFileName = final_output.train;
      }
      EXPECTED_OUTPUT_KNOWN = ExtractColumnsFromObject { Columns = A_TRAIN_FILE_TAB[7]; }
      ACTUAL_OUTPUT_KNOWN = OpenFileObject { Filename = final_output.train; }
      ERROR_KNOWN = ColumnsArithmetic {
      	RowExpr = (inrange(ACTUAL_OUTPUT_KNOWN[1],0.0,0.5,0.0,1.0)!=inrange(EXPECTED_OUTPUT_KNOWN[1],0.0,0.5,0.0,1.0));
      	ColExpr = sqrt, sum;
      	OutFileName = error_known;
      }
      ERRORPERCENT_KNOWN = ColumnsArithmetic {
      	RowExpr = (inrange(ACTUAL_OUTPUT_KNOWN[1],0.0,0.5,0.0,1.0)!=inrange(EXPECTED_OUTPUT_KNOWN[1],0.0,0.5,0.0,1.0));
      	# count all the mismatches and divide
      	# by the number of rows in the file to get percent of mismatches
      	# i.e. 25 means 25 % mismatches
      	ColExpr = average, 100.0 *;
      	OutFileName = error_known_percent;
      }
      TestSVM {
      	Obj = SVM;
      	InpFileObj  = A_TEST_FILE;
      	OutFileName = final_output.test;
      }
      EXPECTED_OUTPUT_UNKNOWN = ExtractColumnsFromObject { Columns = A_TEST_FILE_TAB[7]; }
      ACTUAL_OUTPUT_UNKNOWN = OpenFileObject { Filename = final_output.test; }
      # an error is a discrepancy between expected and actual outputs for the given output column
      # if you have binary (two classes 0,1) outputs then the error expression can be
      # ACTUAL_OUTPUT[1]!=EXPECTED_OUTPUT[1]
      # or (((ACTUAL_OUTPUT[1]<0.5)&&(EXPECTED_OUTPUT[1]<0.5))||((ACTUAL_OUTPUT[1]>=0.5)&&(EXPECTED_OUTPUT[1]>=0.5))) == 0
      # or    inrange(ACTUAL_OUTPUT[1], 0.0, 0.5, 0.0, 1.0) != inrange(EXPECTED_OUTPUT[1], 0.0, 0.5, 0.0, 1.0)
      # where inrange(a,b,c,d,e){if((a>=b)&&(a<c)){return d}return e}
      # for more output columns, AND the per-column match tests inside the parentheses (before the ==0)
      # each error will be a 1 in the output (i.e. note the ==0 at the end)
      # and each success will be a 0, so count all ones to get errors
      
      ERROR_UNKNOWN = ColumnsArithmetic {
      	RowExpr = (inrange(ACTUAL_OUTPUT_UNKNOWN[1],0.0,0.5,0.0,1.0)!=inrange(EXPECTED_OUTPUT_UNKNOWN[1],0.0,0.5,0.0,1.0));
      	ColExpr = sqrt, sum;
      	OutFileName = error_unknown;
      }
      ERRORPERCENT_UNKNOWN = ColumnsArithmetic {
      	RowExpr = (inrange(ACTUAL_OUTPUT_UNKNOWN[1],0.0,0.5,0.0,1.0)!=inrange(EXPECTED_OUTPUT_UNKNOWN[1],0.0,0.5,0.0,1.0));
      	# count all the mismatches and divide
      	# by the number of rows in the file to get percent of mismatches
      	# i.e. 25 means 25 % mismatches
      	ColExpr = average, 100.0 *;
      	OutFileName = error_unknown_percent;
      }
      $
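
The inrange helper described in the script comments, and the mismatch percentage built on it, can be sketched in Python as follows. This only illustrates the logic as the comments describe it; the function mismatch_percent and the sample values are hypothetical, and np's own implementation may differ.

```python
def inrange(a, b, c, d, e):
    """Return d if b <= a < c, otherwise e (as defined in the script comments)."""
    if b <= a < c:
        return d
    return e


def mismatch_percent(actual, expected):
    """Percentage of rows whose thresholded actual and expected classes differ."""
    # RowExpr step: 1 for a mismatch, 0 for a match
    flags = [
        int(inrange(act, 0.0, 0.5, 0.0, 1.0) != inrange(exp, 0.0, 0.5, 0.0, 1.0))
        for act, exp in zip(actual, expected)
    ]
    # ColExpr step: average, then multiply by 100
    return 100.0 * sum(flags) / len(flags)


# Hypothetical SVM outputs and targets, for illustration only
actual = [0.0, 1.0, 1.0, 0.0]
expected = [0.0, 1.0, 0.0, 0.0]
print(mismatch_percent(actual, expected))  # 25.0
```

Note that with these arguments any value below 0.0 falls outside [0.0, 0.5) and is therefore mapped to class 1.0, which matters only if the classifier can emit negative outputs.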