<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Fraud detection &#8211; decision trees versus support vector machines (classification)</title>
	<atom:link href="http://www.baqmar.be/?feed=rss2&#038;p=1821" rel="self" type="application/rss+xml" />
	<link>http://www.baqmar.be/?p=1821</link>
	<description>the Belgian Association for Quantitative and Qualitative Marketing Research</description>
	<lastBuildDate>Wed, 18 Aug 2010 15:21:25 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Geert Bilcke</title>
		<link>http://www.baqmar.be/?p=1821&#038;cpage=1#comment-1856</link>
		<dc:creator>Geert Bilcke</dc:creator>
		<pubDate>Wed, 25 Nov 2009 20:35:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.baqmar.be/?p=1821#comment-1856</guid>
		<description>Hi,
I think it is a bit &quot;unfair&quot; to compare a strong learner (SVM) with a weak learner (DT), especially when the minority class is, by data mining standards, very small (300,000 * 0.06% = only 180 positive targets).
With such a dataset, strong learners (e.g. logistic regression) will always outperform a single decision tree. Try the exercise with bagged decision trees and I am sure your conclusion will be totally different.
Geert</description>
		<content:encoded><![CDATA[<p>Hi,<br />
I think it is a bit &#8220;unfair&#8221; to compare a strong learner (SVM) with a weak learner (DT), especially when the minority class is, by data mining standards, very small (300,000 * 0.06% = only 180 positive targets).<br />
With such a dataset, strong learners (e.g. logistic regression) will always outperform a single decision tree. Try the exercise with bagged decision trees and I am sure your conclusion will be totally different.<br />
Geert</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: datalligence</title>
		<link>http://www.baqmar.be/?p=1821&#038;cpage=1#comment-1855</link>
		<dc:creator>datalligence</dc:creator>
		<pubDate>Wed, 25 Nov 2009 12:10:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.baqmar.be/?p=1821#comment-1855</guid>
		<description>hi eric,

the business requirement was to identify fraudulent transactions, and the data was at the transaction level, so a customer can have multiple transactions in the data.

some businesses require false positives to be on the lower side, others can tolerate more of them. here, the idea was to deploy the model in real time so that transactions suspected to be fraudulent could be put on hold or cancelled. as such, a higher FP count was ok as long as the model captured more fraudulent transactions.

but i guess, i generalized a little too much when i first wrote this! as for the data, sorry, no comments :)</description>
		<content:encoded><![CDATA[<p>hi eric,</p>
<p>the business requirement was to identify fraudulent transactions, and the data was at the transaction level, so a customer can have multiple transactions in the data.</p>
<p>some businesses require false positives to be on the lower side, others can tolerate more of them. here, the idea was to deploy the model in real time so that transactions suspected to be fraudulent could be put on hold or cancelled. as such, a higher FP count was ok as long as the model captured more fraudulent transactions.</p>
<p>but i guess, i generalized a little too much when i first wrote this! as for the data, sorry, no comments <img src='http://www.baqmar.be/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric Lecoutre</title>
		<link>http://www.baqmar.be/?p=1821&#038;cpage=1#comment-1854</link>
		<dc:creator>Eric Lecoutre</dc:creator>
		<pubDate>Wed, 25 Nov 2009 10:21:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.baqmar.be/?p=1821#comment-1854</guid>
		<description>Hi,

I can see a problem with your conclusions.
If you look at the confusion matrices, you can see that:
- for DT, a total of 72 cases are predicted as 1. The 61 correctly classified represent 85% of those predictions.
- for SVM, a total of 278+79=357 cases are predicted as 1. The 79 correctly predicted represent only 22% of them.
Both learning methods rely on parameters that you can change to adjust what is, in effect, the cutoff between predictions 0 and 1.
If I am a fraud detection organization, I prefer DT: verifications have a cost. Knowing that fewer verifications yield a higher share of real fraudsters is of great interest...
Put another way: verify all 127,000 people and you will find every fraudster!
That&#039;s why one introduces the notion of a cost matrix; it is not clear to me which model would then be preferred.
BTW, where can I find this dataset? There is a modeling approach I would like to test for comparison and assessment purposes.
Kind regards,
Eric</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I can see a problem with your conclusions.<br />
If you look at the confusion matrices, you can see that:<br />
- for DT, a total of 72 cases are predicted as 1. The 61 correctly classified represent 85% of those predictions.<br />
- for SVM, a total of 278+79=357 cases are predicted as 1. The 79 correctly predicted represent only 22% of them.<br />
Both learning methods rely on parameters that you can change to adjust what is, in effect, the cutoff between predictions 0 and 1.<br />
If I am a fraud detection organization, I prefer DT: verifications have a cost. Knowing that fewer verifications yield a higher share of real fraudsters is of great interest&#8230;<br />
Put another way: verify all 127,000 people and you will find every fraudster!<br />
That&#8217;s why one introduces the notion of a cost matrix; it is not clear to me which model would then be preferred.<br />
BTW, where can I find this dataset? There is a modeling approach I would like to test for comparison and assessment purposes.<br />
Kind regards,<br />
Eric</p>
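<p>The two percentages in this comment are precision figures: correctly flagged frauds divided by all cases flagged as fraud. A minimal sketch of that arithmetic, using only the counts quoted in the comment; the function names are illustrative and not taken from the original post, and "checks per fraud" is just the inverse of precision, added here as an assumption about how verification cost might be read:</p>
<pre>
```python
def precision(true_positives, predicted_positives):
    """Share of cases flagged as fraud that really are fraud."""
    return true_positives / predicted_positives


def checks_per_fraud(true_positives, predicted_positives):
    """Average number of verifications needed to confirm one fraud."""
    return predicted_positives / true_positives


# Counts quoted in the comment: DT flags 72 cases (61 correct),
# SVM flags 278 + 79 = 357 cases (79 correct).
dt_precision = precision(61, 72)
svm_precision = precision(79, 278 + 79)

print(f"DT precision:  {dt_precision:.0%}")
print(f"SVM precision: {svm_precision:.0%}")
print(f"DT checks per fraud:  {checks_per_fraud(61, 72):.2f}")
print(f"SVM checks per fraud: {checks_per_fraud(79, 357):.2f}")
```
</pre>
<p>Under these counts the SVM confirms more frauds in total (79 against 61), but each confirmed fraud costs several times more verifications, which is the trade-off a cost matrix would have to price.</p>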
]]></content:encoded>
	</item>
</channel>
</rss>
