Per your standard fitness function: where can you write a custom fitness function? I was doing research with genetic algorithms using gatool in Matlab, so calling a custom function was simple. However, in genetic programming it seems strange. If I can get my head around this, I can help you.

dward: Just noticed some useful emails came back from old friends from a previous job who work with GP (that said, they're C++ guys and not really into Matlab). I'll upload them here; this fitness function for evaluating the trees is a real pain, and the email sent back to me references it:

Have you tried evolving three classifiers?

The first segregates class one from the other two.

The next segregates class two from the other two.

The last segregates class three from the other two.

You can use IF THEN ELSE or min/max, for example.
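The three-classifier decomposition above is one-vs-rest: each evolved tree only has to separate one class from the remaining two. A minimal sketch of building the three binary targets (the GP evolution itself is left to whatever engine you use; the labels here are made-up example data):

```python
# One-vs-rest targets for a 3-class GP problem: for each class c,
# a sample's target is 1 if it belongs to c and 0 otherwise.

def one_vs_rest_targets(labels, n_classes=3):
    """Build one binary target vector per class."""
    return [[1 if y == c else 0 for y in labels]
            for c in range(1, n_classes + 1)]

labels = [1, 2, 3, 1, 2, 3, 1]   # example class labels
targets = one_vs_rest_targets(labels)
# targets[0] separates class 1 from classes 2 and 3, and so on;
# each target vector becomes the truth for one evolved classifier.
```

Each of the three GP runs then uses one of these target vectors in its fitness function.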

Use a population size of 1000 in each case. As you evaluate a GP tree over all the data, remember which branches "fired": for min(A, B), where A and B are two subtrees, record a 0 if evaluation went left and a 1 if it went right, and do this separately for the TP, FP, TN, and FN cases (true positive, false positive, true negative, false negative). Run it 20 times on the same or different machines and compute the FOM (figure of merit) you want. Then, for each case (TP, FP, TN, FN), look at these memories of 0s and 1s. You will find that, perhaps, in the TP cases only certain branches fire and not others; in this way you can understand which parts of the tree actually executed, and so simplify the tree for whatever task (e.g. TP for discriminating class 1 from the rest).
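The branch-memory idea above can be sketched as follows. The tree here is an arbitrary stand-in (min of two hypothetical subtrees), as is the tiny dataset; the point is recording a 0/1 per min() decision and bucketing the records by confusion-matrix outcome:

```python
# Record which branch of min(A, B) "fired" per case, then group the
# records by TP/FP/TN/FN so you can see which branches matter where.

def traced_min(a, b, trace):
    trace.append(0 if a <= b else 1)   # 0 = left subtree won, 1 = right
    return min(a, b)

def classify(x, trace):
    # hypothetical tree: min(x - 1, 2*x - 3); predict positive if > 0
    return traced_min(x - 1, 2 * x - 3, trace) > 0

def outcome(pred, truth):
    return {(True, True): "TP", (True, False): "FP",
            (False, False): "TN", (False, True): "FN"}[(pred, truth)]

buckets = {"TP": [], "FP": [], "TN": [], "FN": []}
for x, truth in [(3.0, True), (0.5, False), (2.0, True), (1.2, False)]:
    trace = []
    pred = classify(x, trace)
    buckets[outcome(pred, truth)].extend(trace)
# Inspect buckets: here the left branch fires for every TP and the
# right branch for every TN, which is exactly the kind of pattern
# the email says you can exploit to simplify the tree.
```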

In order to get the precise FOM you want, you may use another trick. As you evaluate the GP tree over all cases and compute the FOM, if it comes out favouring the FPs too much (or similar), then simply slide the threshold (from > 0, as above, to > epsilon, where epsilon is a small constant) until you get the FOM that you want. We did not do this, but it is a trick that Maarten gave us once, and it makes sense.
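The threshold-sliding trick can be sketched like this. Plain accuracy stands in for whatever FOM you actually use, and the scores and labels are made-up example data:

```python
# Instead of always predicting positive when the tree output is > 0,
# sweep a small epsilon and keep the threshold whose figure of merit
# (here: accuracy, as a stand-in) is best.

def fom(scores, labels, eps):
    preds = [s > eps for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

scores = [0.9, 0.1, -0.3, 0.2, 1.5, -0.8]   # raw tree outputs
labels = [True, False, False, True, True, False]

best_eps = max((e / 10 for e in range(-5, 6)),
               key=lambda e: fom(scores, labels, e))
# Sliding from eps = 0 to a small positive eps trades FPs for FNs
# without re-evolving the tree.
```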

If you use IF THEN ELSE or min/max alone, you will never get anything that "slides", only a FOM with abrupt changes. Consider also using plus, minus, divide (protected), and multiply in the formula. Do not use constants in the GP tree; I think you will get better generalization without them.
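A minimal sketch of that arithmetic function set. Protected division is the standard GP convention of returning a safe value on division by zero rather than raising an error; 1.0 is a common choice, though implementations vary:

```python
# The suggested function set: +, -, *, and protected division.

def pdiv(a, b):
    """Protected division: return 1.0 (a common convention) when b == 0."""
    return a / b if b != 0 else 1.0

FUNCTION_SET = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": pdiv,
}
```

With these smooth operators in the tree, the output varies continuously, so the threshold-sliding trick above actually has something to slide over.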

Moreover, you should use linear scaling. (Interval arithmetic you do not need: between the protected division and the fact that the tree's return value is only compared against a threshold, and is otherwise a bit meaningless, you do not have to rely on interval arithmetic in this case.)

For these things you should look at Maarten Keijzer's paper, which won best paper at EuroGP 2003:

<http://www.cs.vu.nl/~mkeijzer/publications/mybibliography.html>

[Kei03] Maarten Keijzer. Improving symbolic regression with interval

arithmetic and linear scaling. In Conor Ryan, Terence Soule, Maarten

Keijzer, Edward Tsang, Riccardo Poli, and Ernesto Costa, editors,

Genetic Programming, Proceedings of EuroGP'2003, volume 2610 of LNCS,

pages 71-83, Essex, 14-16 April 2003. Springer-Verlag.


Use the linear scaling technique; it should give you nice improvements.
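Linear scaling in Keijzer's sense means fitting, before measuring error, the slope and intercept that best map the raw tree outputs onto the targets, so every tree is judged by the best linear transform of itself. A minimal sketch (the closed-form least-squares fit; the example data is made up):

```python
# Linear scaling: find a, b minimizing the squared error of a + b*y
# against targets t, via the usual least-squares formulas.

def linear_scale(y, t):
    n = len(y)
    my, mt = sum(y) / n, sum(t) / n
    cov = sum((yi - my) * (ti - mt) for yi, ti in zip(y, t))
    var = sum((yi - my) ** 2 for yi in y)
    b = cov / var if var != 0 else 0.0
    a = mt - b * my
    return a, b

# A tree whose raw output is only off by a linear factor scores
# perfectly after scaling:
t = [2.0, 4.0, 6.0, 8.0]
y = [0.0, 1.0, 2.0, 3.0]   # t = 2*y + 2
a, b = linear_scale(y, t)
# a + b*yi reproduces t exactly, so this tree's structure is rewarded
# even though its raw outputs are on the wrong scale.
```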

Limit your tree size to 1000, use tournament selection, make sure your population is randomly initialized correctly, and get all the other basic things right. Also use Koza's efficiency measure; it is not perfect, but it will tell you something about your problem.
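The "efficiency measure of Koza" most likely refers to his computational-effort statistic: given the cumulative probability P(M, i) that a run with population M has succeeded by generation i, the effort is the minimum over i of M * (i + 1) * R(z, i), where R(z, i) = ceil(ln(1 - z) / ln(1 - P(M, i))) is the number of independent runs needed to succeed with probability z. A sketch under that assumption:

```python
# Koza-style computational effort from per-generation success rates.
import math

def computational_effort(success_by_gen, M, z=0.99):
    """success_by_gen[i] = fraction of runs solved by generation i."""
    best = None
    for i, p in enumerate(success_by_gen):
        if p >= 1:
            runs_needed = 1
        elif p > 0:
            runs_needed = math.ceil(math.log(1 - z) / math.log(1 - p))
        else:
            continue                      # no successes yet at this generation
        effort = M * (i + 1) * runs_needed
        best = effort if best is None else min(best, effort)
    return best
```

Running 20 independent runs, as suggested above, is what gives you the success-rate curve to feed into this.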

By the way, the way to understand the GP tree is to pick the smallest one for your analysis: one you got in one of your parallel runs that is almost as good as the best one. There is always one run that stochastically results in shorter GP trees, and you can then analyse that one for comprehension rather than the best one.
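Picking that near-best smallest tree is a one-liner over the run results. A sketch, assuming each run is summarized as a (fitness, tree_size) pair with higher fitness being better; the tolerance and the data are made-up:

```python
# Across parallel runs, take the smallest tree whose fitness is
# within a tolerance of the best run's fitness.

def smallest_near_best(runs, tol=0.02):
    """runs: list of (fitness, tree_size); returns (tree_size, fitness)."""
    best = max(f for f, _ in runs)
    candidates = [(size, f) for f, size in runs if f >= best - tol]
    return min(candidates)   # smallest size among the near-best runs

runs = [(0.95, 180), (0.94, 60), (0.96, 240), (0.90, 40)]
size, fitness = smallest_near_best(runs)
# picks the 60-node tree at fitness 0.94: almost as good as the
# 0.96 best, but far easier to analyse for comprehension.
```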

