The Biggest A/B Split Experiment Ever


princec
08-08-2004, 06:49 AM
A few days ago I mentioned a cunning plan in some thread. Well, it's not so cunning a plan that I can't share it with you.

Many people in here have tried the A/B split technique to tune the conversion ratios of their games; it almost always leads to a successful result, as it's quite a difficult experiment to actually get the wrong results from. Typically an A/B split involves releasing two versions of the same game with some fundamental difference in the upsell technique. Punters arriving at your site looking for downloads are farmed out one version or the other at random on a 50/50 basis. You conduct the experiment long enough to get a statistically significant result and then, after 100 sales or so, ditch the version that is doing worse.
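To make "statistically significant" concrete, here is a minimal sketch of one common way to compare two conversion rates, a two-proportion z-test. The post does not say which test anyone actually uses, so the choice of test, the function name and the figures are all assumptions made for illustration.

```python
import math

def two_proportion_z_test(sales_a, downloads_a, sales_b, downloads_b):
    """Return (z, two-sided p-value) for the gap between two conversion rates."""
    rate_a = sales_a / downloads_a
    rate_b = sales_b / downloads_b
    # Pooled rate under the assumption that both versions convert equally well.
    pooled = (sales_a + sales_b) / (downloads_a + downloads_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / downloads_a + 1 / downloads_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Invented figures: 60 sales from 5,000 downloads versus 40 sales from 5,000.
z, p = two_proportion_z_test(60, 5000, 40, 5000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value (below 0.05 is a common convention) suggests the gap between the two versions is unlikely to be chance alone.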

For many of us the best time to do the tuning is immediately on initial release because that's when we get the most exposure and buzz. It might, of course, be best to have an initial "quiet" release to iron out bugs and compatibility problems before the full press-release & submission process is begun in earnest, but for most of us, there's one big shot at the limelight and then the site traffic dies down to a bit of a trickle a month later.

I see three flaws with the A/B split as we know it.

Now, A/B splitting requires a fair amount of effort on the part of the developer. You've got to make two builds. You've got to work some cookie magic on your website. You've got to track the conversion rate for each version. Then you've got to analyse the results and undo the A/B split on your website and only give out the one that's doing best. It takes quite a bit of time and arseing about. That's flaw #1 with the A/B split technique.

Flaw #2 is that you can only measure the difference between two versions of the software. You've got to pick your two configurations very carefully - remember, many of us are only going to get one big shot and then it's all over until the next game - so you've got to choose your difference wisely.

Flaw #3 is that once you've made your analysis you can't easily subsequently monitor the punters for a change. Your original experiment may be based on a huge surge of traffic from download.com - these punters are quite different from the subsequent constant trickle of more discerning customers that arrive directly at your site. Unfortunately your A/B split decision has been made and you have no idea whether it's changing over time as your traffic changes.

Super Elvis is about to use a technical solution to sort out these three flaws. I don't know whether the technique is in use by anyone else, but no-one's talked about it before, so this is how Super Elvis works.

There is but one version of Super Elvis (I use the unlockable full download with server based key verification method for deployment). Upon initial install it randomly configures itself as a demo with the following factors:

A random maximum number of games that can be played before it stops working; or unlimited.

A random maximum amount of play time that can be played before it stops working; or unlimited.

A random amount of level crippling - allowing the player to play between 10% and 100% of the game.

A random discount special offer - ranging the price between $1.95 and $19.95 (the full price which is displayed on the website), quantized to a selection of psychological price points.

In all, Super Elvis has 1,080 different configurations - compared to just 2 in a standard A/B split.
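A minimal sketch of how such a first-run configuration might be drawn. The post gives only the ranges, so every value below is hypothetical, as are the parameter names; the list sizes (5, 6, 6, 6) are simply one split of "5-6 values each" whose product comes to the 1,080 quoted.

```python
import random

# Hypothetical value lists, chosen only so the product of their sizes is 1,080.
PARAMETERS = {
    "max_games": [5, 10, 20, 50, None],            # None = unlimited
    "max_minutes": [15, 30, 60, 120, 240, None],   # None = unlimited
    "levels_percent": [10, 25, 40, 60, 80, 100],
    "price": [1.95, 4.95, 9.95, 12.95, 14.95, 19.95],
}

def count_configurations(parameters):
    """Multiply the number of values per parameter to get the total."""
    total = 1
    for values in parameters.values():
        total *= len(values)
    return total

def random_configuration(parameters, rng=random):
    """Pick one value per parameter, each uniformly at random."""
    return {name: rng.choice(values) for name, values in parameters.items()}

print(count_configurations(PARAMETERS))  # 5 * 6 * 6 * 6 = 1080
print(random_configuration(PARAMETERS))
```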

Super Elvis, like all my games will do eventually, logs its installation with my server, and logs its configuration as it does so along with a few details about the user's system specification and the affiliate build ID (I track download.com using a separate build ID for example). My server is geared to respond to an installation log with a reply that may override the initial configuration parameters individually or completely. If the punter refuses the initial connection to the server but subsequently buys the game, I'll get the information then anyway when the game is registered as it requires a server connection.
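The override step could look something like the sketch below. The post does not describe the wire format, so a plain dictionary stands in for the server's reply, and the function and parameter names are assumptions.

```python
def apply_server_reply(local_config, reply):
    """Merge a server reply into the locally chosen demo configuration.

    reply is None when the server could not be reached; otherwise it is a
    dict of overrides covering some parameters or all of them.
    """
    if not reply:
        return dict(local_config)
    unknown = set(reply) - set(local_config)
    if unknown:
        raise ValueError(f"unknown parameters in reply: {sorted(unknown)}")
    merged = dict(local_config)
    merged.update(reply)
    return merged

local = {"max_games": 10, "max_minutes": 60, "levels_percent": 40, "price": 9.95}
print(apply_server_reply(local, {"price": 14.95}))  # only the price is overridden
print(apply_server_reply(local, None))              # offline: keep the random pick
```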

Using a bit of statistical wizardry and some fuzzy logic, Super Elvis will automatically tune itself to sell better to any particular bunch of customers, from now until the day I pull it from my virtual shelves. I can put my feet up and let Super Elvis figure out how to sell itself while I code another game (and install the same system in it, natch).

Cas :)

terin
08-08-2004, 07:00 AM
Wooow...

What happens if/when they install it and aren't online? Or is the system modified at the time they download it?

Screw making games cas, sell that technology ;-) lol.

princec
08-08-2004, 07:10 AM
If they're not online it just picks a random configuration on first run and goes with that. That's about the best it can manage. As time goes by I can always release an updated version that removes some of the really poorest performing configurations completely from the initial set of available ones.

Cas :)

svero
08-08-2004, 07:28 AM
I think it's an interesting idea. I've long thought about using random parameters to test things and having that built right into the demo. Uniformity of distribution has always been a problem, but if the random number generator is perfect it should tend towards a uniform distribution if you have a large enough sample size.

The self adjusting idea is really cool though. Might be a heavier trick to implement.

papillon
08-08-2004, 07:30 AM
Most obvious problem with this, of course, comes when a review or a friend-recommendation gets one setting and the person following that review gets another setting and feels cheated. :)

cliffski
08-08-2004, 07:37 AM
I really like this idea. The thing is, by the sounds of it you need to have your own server to handle this kind of authentication wossname? Or is it a CGI script? I have a very, very limited understanding of that stuff. It's all I can handle to get my games to launch a URL. I wish I knew how to pass in parameters and have them recorded etc, but it's all a bit beyond me ;(

princec
08-08-2004, 07:49 AM
Yes, you do need a server, but fortunately it's exceedingly easy to do this sort of stuff in Java :P

Cas :)

princec
08-08-2004, 07:50 AM
Most obvious problem with this, of course, comes when a review or a friend-recommendation gets one setting and the person following that review gets another setting and feels cheated. :)

I did wonder about that possibility, in particular regarding the discount price, but a special offer is a special offer, and it's made individually to a single customer. After all you don't get people complaining that they didn't have a particular discount coupon code when they made their order... do you?

Cas :)

oNyx
08-08-2004, 08:24 AM
To use your own words - it's a cracking good idea :)

I think I really should add fuzzy logic to my tolearn list.

Interesting concept. Thanks for sharing. Two things which I would like to add: don't forget to adjust to maximum profit (instead of maximum sales) and (less obvious) take the "locale" into consideration. E.g. the sweet spot for Germans might be a different one than the one for French people.

Gilzu
08-08-2004, 08:38 AM
Using a bit of statistical wizardry and some fuzzy logic, Super Elvis will automatically tune itself to sell better to any particular bunch of customers, from now until the day I pull it from my virtual shelves. I can put my feet up and let Super Elvis figure out how to sell itself while I code another game (and install the same system in it, natch).

Cas :)

Isn't that much like genetic programming?


Genetic programming starts with a primordial ooze of thousands of randomly created computer programs. This population of programs is progressively evolved over a series of generations. The evolutionary search uses the Darwinian principle of natural selection (survival of the fittest) and analogs of various naturally occurring operations, including crossover (sexual recombination), mutation, gene duplication, gene deletion.


Take a look at that website, and a further look at subjects related to genetic programming. You might even perfect a system that may work well for most of us. Call it the princec method, if you like ;)

princec
08-08-2004, 08:51 AM
I think really it ought to be server configurable as to whether I want maximum profit or maximum sales. Right about now I'm actually after maximum sales to increase my customer base and hopefully drive sales of other games. Later on I think I'll switch it to maximum profit.
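A small sketch of what a switchable objective might mean in practice. The figures are invented, the function name is hypothetical, and "profit" is approximated here by revenue per download, ignoring payment fees and other costs.

```python
def best_price(stats, objective="sales"):
    """Pick the best price from stats, a dict of price -> (downloads, sales).

    "sales" maximises sales per download; "profit" maximises revenue per
    download, used here as a stand-in for profit.
    """
    if objective not in ("sales", "profit"):
        raise ValueError(f"unknown objective: {objective}")

    def score(price):
        downloads, sales = stats[price]
        conversion = sales / downloads
        return conversion if objective == "sales" else conversion * price

    return max(stats, key=score)

# Invented numbers for illustration only.
stats = {4.95: (1000, 30), 9.95: (1000, 20), 19.95: (1000, 12)}
print(best_price(stats, "sales"))   # the cheapest price converts best here
print(best_price(stats, "profit"))  # the full price earns most per download here
```

With these made-up numbers the two objectives pick opposite ends of the price range, which is why it helps to make the choice a server-side setting.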

Cas :)

princec
08-08-2004, 08:54 AM
I think that using some clever statistical analysis I should be able to slice and dice the data to extrapolate individual A/B splits from it on a meaningful scale even with just a few hundred results - so I could, for example, ignore all factors except price and simply pick the price point that's working the best and tell the game to clamp it at that price. Then I can concentrate on another variable. This means I can make adjustments using a very minimal set of data. I hope.

Cas :)

princec
08-08-2004, 08:57 AM
Yes, it is rather like genetic programming, but I'll be throwing in a bit more statistical analysis and less random recombination. Effectively I'll be locking down variables one at a time as I discover the values that have peak sales against them. At its simplest I expect I can just look at the data every month and decide which variables to clamp manually by updating a row in the database. In fact this is how it will probably be rolled out - the system is likely to need a lot of tweaking and thinking and will probably really come into its own when I put it into the next game and fold it back into Alien Flux too.

Cas :)

MattInglot
08-08-2004, 10:04 AM
I like the idea and I think it could be a powerful tool. The only problem is the whole program attempting to connect to the internet on start-up. People with firewall software will catch this and may not appreciate it (I sure don't). How do you think asking the user before attempting to call home would affect things? Did you get any complaints about AF's statistics gathering?

Chris Evans
08-08-2004, 10:17 AM
Prince will tell you that no one has complained that AF phones home to date. However, he also keeps mentioning how AF only has 1-2 sales a month, so I don't know if that's really an accurate statistic...maybe instead of people complaining, they just simply decide not to buy the game...

Anyway, I think it's an interesting idea, but I'm also wary of the phoning-home aspect. I'm also curious about how well a "Yes" or "No" prompt would work to connect to the server.

ggambett
08-08-2004, 10:21 AM
I've done the "random configuration on first run" but limited to one parameter at a time. You will reach the "statistically significant" point much earlier with 4 different configurations at a time than with 1080 in one run.

Jack Norton
08-08-2004, 11:15 AM
That's simply the best idea I've seen in a while.
Good luck with it, seems a winning formula... the only weak point I could see is if you make a game so bad that you get very few sales ;) hehe but I don't think that will be the case!

oNyx
08-08-2004, 11:50 AM
I like the idea and I think it could be a powerful tool. The only problem is the whole program attempting to connect to the internet on start-up. People with firewall software will catch this and may not appreciate it (I sure don't).[...]

If it's started through webstart it's allowed to do that (webstart apps are allowed to connect to the host they came from). One single connection right after installation won't hurt anyone and to tell the truth - most people won't be able to notice that (because they need to allow java to connect and to act as server before they download).

The download of the "restriction set" could be bundled with downloading the highscores or with downloading of the latest scripts (if you use some kind of scripting language).

---

@Cas

Good point (profit vs number of sales).

princec
08-08-2004, 01:04 PM
Did you get any complaints about AF's statistics gathering?
Not one. Conversion is around 1% so I don't think it's turning many people off.

Cas :)

GameStudioD
08-08-2004, 01:18 PM
I can put my feet up and let Super Elvis figure out how to sell itself...

Talk to anybody in sales: products don't sell themselves.

Alien Flux did a horrible job of selling itself, and my guess is that Super Elvis will do the same. The demo period was unclear and there was no sense of urgency to buy. It's the equivalent of the salesman at the electronics store who doesn't smile. Your game needs to build rapport. No price tweaking is going to increase profit or sales significantly, in my opinion. I would like to see more effort put into creating a better experience; treat your customers like people.

I can ramble on and on about how much I think this system sucks, but I won't. It does have some good research advantages but ultimately it does not enhance the customer's experience and it doesn't make your game better.

princec
08-08-2004, 03:04 PM
This is merely an automation of one of the many manual tasks in this business. It will, I expect, save me several weeks of effort fiddling with four commonly fiddled with parameters. It won't send press releases or find willing affiliates or pay download.com $80 every other week to keep it at the top spot.

Cas :)

Mithril Studios
08-08-2004, 03:44 PM
Cas,

You've got a great idea in automating the A/B test process! However, you've got way too many parameters!

Assuming for a moment that each parameter is a binary one (either yes/no, and in your case it doesn't sound like that's true), then you will have 2^1080 possible permutations of Super Elvis! Which means you will need to have 1.29 x 10^325 versions, and then you need roughly 400 downloads of each version in order to get remotely close to a statistically valid sample.

You may want to limit the number of parameters you are able to change and only deal with one at a time.

@GamestudioD
I agree with you in the respect that you can't automate everything which will potentially increase a game's sales, but to say that the "system sucks" would, I think, be a mistake.

Do you agree that things like demo time limit, price, and feature set can potentially affect sales? If so, then how in the world would automating the data collection and permuting of these factors be a bad thing? Doing it by hand "does not suck?" :eek:

Anthony

Valen
08-08-2004, 03:51 PM
Using a bit of statistical wizardry and some fuzzy logic, Super Elvis will automatically tune itself to sell better to any particular bunch of customers, from now until the day I pull it from my virtual shelves. I can put my feet up and let Super Elvis figure out how to sell itself while I code another game (and install the same system in it, natch).


You're placing too much importance on the presentation of the registration incentives and price points. While those are certainly important, improving the game itself based on player feedback is even more important. If people don't like the game it won't matter to them how much it costs or how you crippled the demo. Therefore "automatically adjusting" the game to sell better in the way you are proposing to do it will only marginally improve your sales. Now what I expect you to say is that improving Alien Flux didn't improve sales. But that's also one small piece of the whole puzzle. There are factors like market demand, market size, and exposure that could've easily been the culprit. Those factors multiply, and each one can bring your sales down to nothing. I'm not saying that what you're doing is a bad idea per se, but don't expect it to be the Final Solution to making your game a hit.

DavidRM
08-08-2004, 06:29 PM
As developers, we tend to think that we can't properly market something until it's perfect. And, given our background, that's not surprising, as we find it easier--much easier--to tweak the product than its marketing.

We *know* how to improve the game (new explosion graphic, better music, more optimized keyframe interpolation, blah blah techie babble). So that's what we automatically look to do when it doesn't sell as well as we think it could.

What we, as indies, *don't* know how to do, or know a lot less about, is how to really market our games. That's the part that stumps us, the part where we feel out of our element.

And it's the part that, as a whole, we tend to provide the least useful feedback about. But that's a whole different thread.

So it's not too surprising that as soon as someone starts trying to tweak the marketing aspects of his game, that we panic and start going back to the part we know--and can usefully comment on: the game. "Improve the game first!" we cry, striking our best Pose of the Purist. "If you build the perfect game," we insist, even though we know better, "the players will come."

All that to say: Sounds like an interesting experiment. I've considered something similar, but never actually *done* it. So my considering it only counts as syllables in a post. Nothing more. ;) You, however, get full credit for *doing* *something*.

I do think some of the posts about limiting the number of options might be good options. On the other hand, you said yourself it was configurable from the server side. So you're in good shape to tweak as you go.

Keep us informed as to how it works. Best of luck.

-David

GameStudioD
08-08-2004, 06:45 PM
Again, I am not a fan of this idea.

As Mithril Studios points out, there is a large surface area of configurations. The configurations would have to be small in each category to make an accurate statistical analysis.

The game will be advertised on the website as 19.99. A potential customer downloads the game knowing that information. They play the game, like it and are ready to buy. The random discount makes the game 9.99, $10 is lost from a customer willing to pay 19.99.

This system is based on random values, not customer habits, customer likes, etc. You will just get random results. This is my biggest gripe with the plan, randomness.

The demo time is random. Again, the trial period needs to be stated clearly and as early as possible (on the website). If it isn't stated, the customer has no sense of urgency to buy. 'Oh shoot, I only have 3 minutes left on the demo, I better buy.' Alien Flux suffered from this problem and it looks like Super Elvis will too.

It will take work to implement and maintain this system. Ultimately, this will just be a refinement tool for your marketing and sales. You can probably make a bigger impact by improving marketing and sales than by squeezing a bit more out.

terin
08-08-2004, 07:01 PM
Well, I agree it's overdoing the number of factors. I also agree that in many cases it is the presentation of the limit that is more important than the actual limit.

However, if such a system is easy to implement and transfer to other products, by really refining the process it uses (sharpen the fuzzy logic and bring it into dealing only with "reasonable" numbers): I think the greatest power of such a system is STILL selling it to other developers.

You'll make a lot more selling to developers than you would from the increase in sales... but I think everyone could benefit from having a research tool like that.

robleong
08-08-2004, 07:24 PM
I always like reading Cas' remarkable experiments! In this case, perhaps I don't understand the concept well enough, but can you not solve the problem associated with the initial "phone home" by making it a server-side randomization at around the time of download rather than one that happens at installation? These server-generated randomized codes can then be passed to the installer directly as parameters, or as a separate file that is downloaded together with the installer. Done this way, at least your program will not be considered spyware by your potential customers. Just a thought.

Morphecy
08-08-2004, 09:24 PM
Super Elvis sounds like a good idea but there's one problem which (after reading the whole thread) others have mentioned.

If you configure it to have 1080 options you need to get 100,000 - 1,000,000 downloads & plays before you have tested the full range efficiently. Now, if you use an A/B (or A/B/C) split you have only 3 variables, which can mean 300 to 3000 downloads & plays before you can tell which one is working. And after those 3000 downloads you can make the appropriate changes for the next 3000 downloads. An A/B (/C) split may need extra work, but Super Elvis may take more time to reach the results.

Of course the automation is a good thing and of course you CAN adjust Super Elvis to have only a certain number of variables (like only 3 - the price) and then let Super Elvis tweak itself. I think Super Elvis can really pay off after you have first found the sweet spot by "manual tweaking" and after that (when you think you are near the sweet spot) you let the demo adjust itself on the web.

Very nice idea Cas, keep us informed what kind of impact it has.

Andy
08-08-2004, 10:11 PM
If you configure it to have 1080 options you need to get 100,000 - 1,000,000 download

Yeah, that's the point really. And if we got 1,000,000 downloads we'd probably spend them on something potentially more profitable for us than testing 1000 different configurations. I'd say I'd choose the more profitable one of two and later tweak that one a little bit, and maybe repeat this adjustment two or three times if necessary.

Cas, step out with your next crazy idea and tell us please where to get 1,000,000 downloads :D

And of course good luck anyway!!!

Morphecy
08-08-2004, 11:41 PM
But Andy, remember that you can use only 3 options if you wish... and as mentioned - after manual tweaking it might get really profitable to use automation.

Nemesis
08-09-2004, 01:12 AM
Cas,

I see this game evolve to a point where it will identify the weak link (you) and sell you off as a sex slave in Morocco... :) I think you probably have more parameter-tweaking code than actual game code at this stage!

On a more serious note, the experiment is admirable, but as already pointed out, 1080 combinations require hundreds of thousands, if not millions, of downloads before any realistic sample based on conversions can be extracted. As a potential customer I would also feel a little uncomfortable (having this insider knowledge), but then again... this sort of thing is happening all the time with supermarket loyalty cards etc.

Just one question... are you assuming that every parameter is independent? Because if so, you could organise the tweaking in a more sequential manner. For example, release the first 1000 demos with a tweaked discount only and keep only the one with the best result. Once the discount parameter is locked, repeat with the next parameter, say, demo expire time, and so on. You could have this sequence automated as part of the tweaking code of course. It will give you the benefit of an automated process with the advantage of benefitting early on from better sales.

my 0.02 euros

(edited - was incomplete)

Gilzu
08-09-2004, 01:40 AM
Just one question... are you assuming that every parameter is independent? Because if so, you could organise the tweaking in a more sequential manner. For example, release the first 1000 demos with a tweaked discount only and keep only the one with the best result. Once the discount parameter is locked, repeat with the next parameter, say, demo expire time, and so on. You could have this sequence automated as part of the tweaking code of course. It will give you the benefit of an automated process with the advantage of benefitting early on from better sales.

People tend to forget that there are numerous ways to succeed. Take the game of chess for example. When you run a genetic algorithm (like the one princec wants to test on his customers) you get about 10 different players with different strategies which all succeed in various ways. Some attack more, some defend more, some sacrifice their queens or rely more on their pawns... My point is that there are more than a dozen combinations for a winning strategy (or a profitable one), and while one parameter shows success, you can't separate it from the others: a small number of levels can work out great with a long shareware stop time, and vice versa. Which will you use?

Also, what will be your factors for checking the success of your game stats? Even the price affects the other game parameters.

princec
08-09-2004, 02:19 AM
You've got a great idea in automating the A/B test process! However, you've got way too many parameters!

Ah no, you misunderstand. I have four parameters with 5-6 possible values in each; that's 1,080 configurations, and because they're a distinct set of values in four actual variables I can plot graphs for them individually or against another variable and make correlations.

Cas :)

princec
08-09-2004, 02:22 AM
can you not solve the problem associated with the initial "phone home" by making it a server-side randomization at around the time of download rather than one that happens at installation?
Actually yes, I can do that, and very easily, just by supplying the parameters in a php generated JNLP file. Great idea!

In fact now I think of it, a php generated JNLP file is capable of doing an awful lot of stuff which is simply very difficult with a .exe (which is another reason I'm going to completely ditch .exe builds)
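As a rough illustration of the generated-JNLP idea, the sketch below builds a launch file that hands the chosen configuration to the game as program arguments. Cas mentions PHP; Python is used here purely for illustration, and the URL, jar name, main class, vendor and argument format are all placeholders rather than anything taken from Super Elvis.

```python
import xml.etree.ElementTree as ET

def build_jnlp(codebase, jar_href, main_class, config):
    """Return JNLP text that passes the demo configuration as program arguments."""
    jnlp = ET.Element("jnlp", {"spec": "1.0+", "codebase": codebase})
    info = ET.SubElement(jnlp, "information")
    ET.SubElement(info, "title").text = "Super Elvis"
    ET.SubElement(info, "vendor").text = "Example Vendor"  # placeholder
    resources = ET.SubElement(jnlp, "resources")
    ET.SubElement(resources, "j2se", {"version": "1.4+"})
    ET.SubElement(resources, "jar", {"href": jar_href})
    app = ET.SubElement(jnlp, "application-desc", {"main-class": main_class})
    # Each parameter becomes one name=value argument handed to main().
    for name, value in sorted(config.items()):
        ET.SubElement(app, "argument").text = f"{name}={value}"
    return ET.tostring(jnlp, encoding="unicode")

# Hypothetical configuration picked by the server at download time.
config = {"max_games": 10, "max_minutes": 60, "levels_percent": 40, "price": 9.95}
print(build_jnlp("http://example.com/superelvis/", "superelvis.jar",
                 "example.SuperElvis", config))
```

Because the server picks the configuration when it writes the file, it can log the choice at download time and the game needs no separate first-run call just to report it.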

Cas :)

oNyx
08-09-2004, 02:41 AM
Actually yes, I can do that, and very easily, just by supplying the parameters in a php generated JNLP file. Great idea!
[...]

Haha... I'd totally overlooked that. Man. That's great! :D

princec
08-09-2004, 03:16 AM
I find it a little difficult to explain as it's been a long time since I did statistical maths, but the long and the short of it is, even with 1,080 different combinations I can draw conclusions with the same number of downloads as you would for an ordinary single binary A/B split.

How?

Ultimately, the perfect set of statistics is where I have a statistically significant number of results for every configuration (I forget how to actually calculate what a "statistically significant" number is but there is in fact a hard and fast formula). You're quite right in assuming I'd need, say, 1,000,000 sales just to find out what the absolute perfect configuration was beyond a doubt.

However, using statistical analysis I can take a single variable in isolation if I like and plot its success rate. For just 100 sales I can determine whether the configurations with no demo timeout are doing better than demos with a 1 hour timeout. I can get the server to lock down this variable and gather a new set of results where only 3 variables are altered. I can analyse the whole lot again on another variable and find out whether $9.95 sells better than $19.95 and lock down this variable too. Another 100 sales go by and I can look at the conversion graph based on the number of free levels I give away in the demo and lock that down at its peak. I'm now left with one variable to analyse. I've achieved all this in 400 sales or so. It's not accurate but it is probably one of several configurations that sells well.
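The one-variable-at-a-time analysis described above can be sketched as follows: group every install by the value of a single parameter, ignore the other three, and lock the value with the best conversion rate. The record layout and function names are assumptions, and the six records are invented.

```python
from collections import defaultdict

def marginal_conversion(records, parameter):
    """Conversion rate per value of one parameter, ignoring all the others.

    records is an iterable of (config_dict, bought) pairs.
    """
    installs = defaultdict(int)
    sales = defaultdict(int)
    for config, bought in records:
        value = config[parameter]
        installs[value] += 1
        sales[value] += int(bought)
    return {value: sales[value] / installs[value] for value in installs}

def lock_best(records, parameter):
    """Return the value of the parameter with the highest conversion rate."""
    rates = marginal_conversion(records, parameter)
    return max(rates, key=rates.get)

# Invented records: (configuration, did the punter buy?)
records = [
    ({"price": 9.95, "max_minutes": 60}, True),
    ({"price": 9.95, "max_minutes": None}, False),
    ({"price": 19.95, "max_minutes": 60}, False),
    ({"price": 19.95, "max_minutes": None}, False),
    ({"price": 9.95, "max_minutes": None}, True),
    ({"price": 19.95, "max_minutes": 60}, True),
]
print(lock_best(records, "price"))
print(lock_best(records, "max_minutes"))
```

Averaging over the other parameters like this only gives a fair picture if the variables act more or less independently, which is the concern Nemesis and Gilzu raise earlier in the thread.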

Maybe after 400 more sales using this configuration I'll revert the system back to random again to retune it.

All I have to look at is the amount of money I'm making or the number of customers I've gained - depending on which one I want to maximize, that's the one I tune the lockdown for. There was a concern earlier in the thread that I've just sold a game for $9.95 to someone who was willing to pay $19.95 but essentially this comes down to whether I'm trying to maximize profits or maximize sales. So I think that's not an issue.

There are some misconceptions about this particular idea in the thread here too about it being the holy grail of internet games selling: it's not! It's a very simple automation of something that the pros do all the time, manually and slowly. That's what computers do best after all :) It doesn't take the place of gameplay improvements - but there will be very few tweaks to Super Elvis in this area because of the nature of the game.



Cas :)

Andy
08-09-2004, 03:32 AM
Cas,

...if all versions will come into hands of similar persons - from "ready to buy" point of view. But they will not... !!! ;)

Wayward
08-09-2004, 04:39 AM
Scientific Web Site Optimization Using AB Split Testing, Multi Variable Testing, And The Taguchi Method (http://www.webpronews.com/ebusiness/smallbusiness/wpn-2-20040726ScientificWebSiteOptimizationusingABSplitTestingMultiVariableTestingandTheTaguchiMethod.html)

--edit--
Taguchi Me This (http://www.pbs.org/cringely/pulpit/pulpit20030925.html)

princec
08-09-2004, 06:18 AM
Brilliant :) Well done that man, that's a great link. Time for me to do some reading!

Cas :)

Nemesis
08-09-2004, 08:37 AM
Here's a link with a more detailed explanation:
http://www.isixsigma.com/library/content/c020311a.asp
Cas, actually you were already (unwittingly) intending to use the Taguchi method, in the sense that you wanted to identify the correlation between certain variables. That means you can combine your original parameters into a smaller set of parameters (alternatively, set them to specific combinations only), with the result that you will have far fewer possible configurations and hence be able to get a more meaningful statistic earlier on with a smaller number of sales.

You're a monster :)

We're building a science for maximising conversion rates targeted specifically at indie games... this almost warrants a paper!

princec
08-09-2004, 09:55 AM
I bet this time next year we'll all be doing it :)

Cas :)

Kai Backman
08-09-2004, 10:40 AM
:) Good luck with the project Cas. I have also been considering automation of testing, and as I near a site rewrite I'll probably dive in as well. For me the key technologies here are the framework and modifying the download (both trial and full) for each player.

Cracking the statistical problem is naturally hard, but with a framework like this you have a leg up on the rest of us. You are taking a shot at it.. :D

Greg Squire
08-09-2004, 11:16 AM
Interesting idea, though I'd have to agree with some of the others who have said you have too many variables to track, given the number of downloads a typical indie game gets.

On another note: how about including an in-game registration system that lets you "Name your own price" (similar to Priceline)? It could even "haggle" with the user until it comes up with a sale? :D

Diodor Bitan
08-09-2004, 11:30 AM
Greg Squire
It could even "haggle" with the user until it comes up with a sale?

Perhaps offer discount codes in return for high scores? It might be feasible - I wish someone'd test it :p

BantamCityGames
08-09-2004, 11:40 AM
I used a similar technique on an Artificial Intelligence agent for a school project. The agent was a player of the game Checkers. I came up with a bunch of different parameters and had my agent play a fixed configuration that was known to be decent. My learning program played a game and if it lost, it changed the current parameter. If it won it moved on to the next parameter until it was done with all the parameters, then it cycled back to the first parameter and kept playing. In essence you are doing a similar thing except you are changing multiple parameters at once. If you were to use the same model, I would suggest choosing the first parameter and randomize that. Then after a certain period of time, lock the most successful value for that parameter and move on to the next parameter, rinse and repeat.

P.S. - There was a flaw in my checkers agent and it ultimately lost the tournament. The problem was that I didn't choose the most appropriate heuristics (parameters). Not the wrong values, but the wrong PARAMETERS. For instance, as one of my heuristics I chose the total number of pieces on the board. This caused a problem because this piece of info turned out not to be an important judge of who is winning and did nothing except flood the agent's knowledge with useless info. Basically what I am saying is: more parameters are not always better. Try to pick your parameters carefully.

princec
08-10-2004, 01:55 AM
The good thing about this particular experiment is that it's already very well grounded in theory and practice. Most indies in here already tweak these four parameters repeatedly until they (by luck rather than diligence!) happen across the configuration that works best.

Cas :)