<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Computer Science</title>
	<atom:link href="http://www.maunz.de/wordpress/feed" rel="self" type="application/rss+xml" />
	<link>http://www.maunz.de/wordpress</link>
	<description></description>
	<lastBuildDate>Sat, 30 Mar 2013 15:40:29 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Multinomial Fminer</title>
		<link>http://www.maunz.de/wordpress/opentox/2011/multinomial-fminer?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=multinomial-fminer</link>
		<comments>http://www.maunz.de/wordpress/opentox/2011/multinomial-fminer#comments</comments>
		<pubDate>Tue, 14 Jun 2011 10:21:12 +0000</pubDate>
		<dc:creator>am</dc:creator>
				<category><![CDATA[Opentox]]></category>

		<guid isPermaLink="false">http://www.maunz.de/wordpress/?p=321</guid>
		<description><![CDATA[Both libraries (libbbrc and liblast) within the Fminer2 package are now usable in a multinomial context. The efficient pruning technique specific to libbbrc has been generalized to this setting. Correlation to Multiple Target Classes In short, statistical metric pruning &#8230; <a href="http://www.maunz.de/wordpress/opentox/2011/multinomial-fminer">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><strong>Both libraries (libbbrc and liblast) within the <a title="Fminer2 libraries" href="http://github.com/amaunz/fminer2">Fminer2 package</a> are now usable in a multinomial context. The efficient pruning technique specific to libbbrc has been generalized to this setting.</strong></p>
<span id="Correlation_to_Multiple_Target_Classes"><h2>Correlation to Multiple Target Classes</h2></span>
<p>In short, statistical metric pruning works as follows in the binary case: any pattern (here: a subgraph) <span id='tex_3661'></span> may occur in <span id='tex_1392'></span> instances of class 1 and <span id='tex_9931'></span> instances of class 2.</p>
<div id="attachment_322" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/2D.png"><img class="size-medium wp-image-322" title="Two Target Classes" src="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/2D-300x253.png" alt="Two Target Classes" width="300" height="253" /></a><p class="wp-caption-text">Two Target Classes</p></div>
<p>Pattern <span id='tex_6130'></span>&#8217;s <span id='tex_1410'></span> value is thus determined by <span id='tex_7669'></span> and <span id='tex_4370'></span>, referred to as <span id='tex_5484'></span>. Importantly, any pattern larger than <span id='tex_905'></span>, i.e. any supergraph of it, can never have a <span id='tex_1085'></span> value higher than the maximum of <span id='tex_3625'></span> and <span id='tex_7733'></span> (referred to as the <em>upper bound</em>). This follows from the convexity of the <span id='tex_9310'></span> function and allows the search to be pruned once the upper bound falls below a given threshold (for details see <a title="Backbone Refinement Class Mining" href="http://www.maunz.de/wordpress/bbrc" target="_blank">the paper</a>).</p>
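<p>The binary bound described above can be sketched in Python. This is a minimal illustration, not fminer's implementation: it assumes the statistic is the chi-square test on the 2x2 table of pattern occurrence against class, and the names n1 and n2 (class sizes), chi_square, upper_bound and can_prune are hypothetical.</p>

```python
def chi_square(x, y, n1, n2):
    """Chi-square statistic of the 2x2 table for a pattern that occurs in
    x of n1 class-1 instances and y of n2 class-2 instances."""
    n = n1 + n2
    support = x + y
    if support == 0 or support == n:
        return 0.0  # degenerate table: the pattern separates nothing
    numerator = n * (x * (n2 - y) - y * (n1 - x)) ** 2
    return numerator / (support * (n - support) * n1 * n2)


def upper_bound(x, y, n1, n2):
    """Bound on the statistic of any supergraph: the larger of the values
    at the two extreme points (x, 0) and (0, y)."""
    return max(chi_square(x, 0, n1, n2), chi_square(0, y, n1, n2))


def can_prune(x, y, n1, n2, threshold):
    """True if no refinement of the pattern can reach the threshold."""
    return threshold > upper_bound(x, y, n1, n2)


# Hypothetical numbers: 50 instances per class, pattern occurs in 20 and 10.
bound = upper_bound(20, 10, 50, 50)
```

<p>With these made-up counts, the search below the pattern would be cut off for any threshold above the computed bound and continued otherwise.</p>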
<p>Now consider the multinomial case (three classes for the sake of presentation):</p>
<div id="attachment_325" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/3D.png"><img class="size-medium wp-image-325" title="Multinomial Setting (three classes)" src="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/3D-300x253.png" alt="Multinomial Setting (three classes)" width="300" height="253" /></a><p class="wp-caption-text">Multinomial Setting (three classes)</p></div>
<p>The setting is analogous: no refinement of pattern <span id='tex_6105'></span> with occurrences <span id='tex_7086'></span> can have a <span id='tex_8828'></span> value larger than the maximum value associated with the red vertices (vertex (0,0,y3) is not shown, but must also be considered).</p>
<p>Finding the maximum over the red vertices requires checking more values than in the binary case (<span id='tex_660'></span> many in general, here <span id='tex_630'></span>), so this is feasible only for a small number of classes.</p>
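<p>A rough Python sketch of the multinomial bound follows. It rests on two assumptions that the text does not spell out: the statistic is the chi-square test on the 2 x k table of occurrence against class, and the vertices to check are all corners of the box spanned by the occurrence vector except the origin and the pattern's own corner. All function names and the example numbers are hypothetical.</p>

```python
from itertools import product


def chi_square(occurrences, class_sizes):
    """Chi-square statistic of the 2 x k table (pattern present or absent,
    by class). Class sizes are assumed to be positive."""
    n = sum(class_sizes)
    support = sum(occurrences)
    if support == 0 or support == n:
        return 0.0  # degenerate table: the pattern separates nothing
    deviation = sum(
        (y - size * support / n) ** 2 / size
        for y, size in zip(occurrences, class_sizes)
    )
    return n * n * deviation / (support * (n - support))


def candidate_vertices(occurrences):
    """Corners of the box [0, y1] x ... x [0, yk], leaving out the origin
    and the pattern's own corner (assumed to be the vertices of interest)."""
    origin = tuple(0 for _ in occurrences)
    own = tuple(occurrences)
    corners = set(product(*[(0, y) for y in occurrences]))
    return sorted(corners - {origin, own})


def upper_bound(occurrences, class_sizes):
    """Largest statistic over the candidate vertices; falls back to the
    pattern's own value if no other vertex exists."""
    return max(
        (chi_square(v, class_sizes) for v in candidate_vertices(occurrences)),
        default=chi_square(occurrences, class_sizes),
    )


# Hypothetical three-class example: class sizes 10, 8, 6; occurrences 4, 1, 3.
bound = upper_bound((4, 1, 3), (10, 8, 6))
```

<p>In this sketch the number of corners doubles with every additional class, which illustrates why the check is only practical for a handful of classes.</p>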
<span id="Usage_from_the_Command_Line"><h2>Usage from the Command Line</h2></span>
<p>Fminer currently supports up to five different classes. You can use it from Ruby, Python, Java, and C++, as those APIs are integrated into <a title="Fminer2" href="http://github.com/amaunz/fminer2" target="_blank">the source</a> and supported by the Makefile. Download and install the fminer2 package according to the <a title="Backbone Refinement Class Mining" href="http://www.maunz.de/wordpress/bbrc">BBRC instructions</a> and <a title="Latent Structure Pattern Mining" href="http://www.maunz.de/wordpress/latent-structure-pattern-mining">LAST-PM instructions</a>, respectively.</p>
<p>Run fminer from the main directory with the standard front-end application.</p>
<pre class="brush: bash; collapse: true; light: false; title: ; toolbar: true; notranslate">
unset FMINER_LAZAR
# Request SMARTS output with p-values, as in the sample output below
export FMINER_PVALUES=1
export FMINER_SMARTS=1
export FMINER_P_VALUES=1
fminer/fminer libbbrc/libbbrc.so -f5 liblast/test/hamster_carcinogenicity.smi \
liblast/test/hamster_carcinogenicity-multinomial.class
</pre>
<p>You will receive output such as the following:</p>
<pre class="brush: plain; title: ; notranslate">
- [ &quot;[#6&amp;a]:[#6&amp;a](-[#8&amp;A])(:[#6&amp;a])&quot;, 0.9961, [ 20674, 20911 ], [], [ 21212, 21219, 21250 ] ]
</pre>
<p>This <a title="YAML" href="http://www.yaml.org/" target="_blank">YAML string</a> denotes fragment <strong>[#6&amp;a]:[#6&amp;a](-[#8&amp;A])(:[#6&amp;a])</strong> in <a href="http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html" target="_blank">SMARTS</a> notation. It has a p-value of 0.9961, as inferred from the <span id='tex_6664'></span> statistic, and occurs twice in the first class (in molecules 20674 and 20911), not at all in the second, and three times in the third class (in molecules 21212, 21219, and 21250; the classes are sorted in descending alphanumeric order). YAML is the standard output format of fminer; see the <a title="Fminer README" href="https://github.com/amaunz/fminer2/blob/master/fminer/README" target="_blank">fminer README</a> for details.</p>
<p>The occurrences (the molecules in which the fragment occurs) have the following SMILES codes (you can verify this in the .smi file):</p>
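<p>For post-processing, an output line of this shape can be split into its parts with a few lines of Python. The sketch below relies on the observation that this particular line, once the leading sequence marker is dropped, is also valid JSON; a full YAML parser would be the more general choice. The function name parse_fragment_line is hypothetical.</p>

```python
import json


def parse_fragment_line(line):
    """Split one fminer output line into SMARTS, p-value and the
    per-class lists of molecule IDs."""
    text = line.strip()
    if text.startswith("- "):
        text = text[2:]  # drop the YAML sequence-item marker
    smarts, p_value, *occurrences = json.loads(text)
    return {"smarts": smarts, "p_value": p_value, "occurrences": occurrences}


# The sample output line from above, copied verbatim.
line = '- [ "[#6&a]:[#6&a](-[#8&A])(:[#6&a])", 0.9961, [ 20674, 20911 ], [], [ 21212, 21219, 21250 ] ]'
fragment = parse_fragment_line(line)

# Number of occurrences per class, in the order the classes are listed.
counts = [len(ids) for ids in fragment["occurrences"]]
```

<p>Because the per-class lists are collected with a starred assignment, the same function handles any number of classes.</p>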
<p>COC1=CC(=O)C[C@@H](C)C21Oc1c(Cl)c(OC)cc(OC)c1C2=O<br />
OC(=O)C(C)(C)Oc1ccc(cc1)C1CCCc2ccccc12<br />
OC(=O)c1ccc(O)c(O)c1<br />
O.O.Oc1cc(O)c2c(c1)oc(c1ccc(O)c(O)c1)c(O)c2=O<br />
Oc1cc(O)c2c(c1)oc(c1ccc(O)c(O)c1)c(OC1O[C@H](CO[C@@H]3O[C@@H](C)[C@H](O)[C@@H](O)[C@H]3O)[C@@H](O)[C@H](O)[C@H]1O)c2=O</p>
<p>To verify the output, import the occurrences and the SMARTS pattern into the <a title="Depictmatch" href="http://www.daylight.com/daycgi_tutorials/depictmatch.cgi" target="_blank">depictmatch application at Daylight</a> like so:</p>
<div id="attachment_345" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/occ.png"><img class="size-medium wp-image-345" title="Input SMILES and SMARTS" src="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/occ-300x270.png" alt="Input SMILES and SMARTS" width="300" height="270" /></a><p class="wp-caption-text">Input SMILES and SMARTS</p></div>
<p>and hit the &#8220;Depict&#8221; button:</p>
<div id="attachment_346" class="wp-caption aligncenter" style="width: 642px"><a href="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/occ2.png"><img class="size-full wp-image-346   " title="Occurrences of SMARTS in Molecules." src="http://www.maunz.de/wordpress/wp-content/uploads/2011/06/occ2.png" alt="Occurrences of SMARTS in Molecules." width="632" height="378" /></a><p class="wp-caption-text">Occurrences of SMARTS in Molecules.</p></div>
<span id="Integration_into_Opentox-Webservices"><h2>Integration into Opentox-Webservices</h2></span>
<p>The Fminer functionality for multinomial environments is also available as a web service within Opentox. No special switches are necessary; multinomial data is detected automatically. A <a href="http://www.maunz.de/opentox/ISSCAN-multi.csv">demo dataset</a> with three class labels is available (an excerpt from the ISS cancer database).</p>
<pre class="brush: bash; collapse: true; light: false; title: ; toolbar: true; notranslate">
# Upload Data
curl -X POST \
-F &quot;file=@ISSCAN-multi.csv;type=text/csv&quot; \
&quot;http://ot-test.in-silico.ch/dataset&quot;
# retrieve dataset URI from task (not shown)

# Run Fminer on dataset URI
curl -X POST \
--data-urlencode &quot;dataset_uri=...&quot; \
--data-urlencode &quot;prediction_feature=...&quot; \
&quot;http://ot-test.in-silico.ch/algorithm/fminer/bbrc&quot;
# retrieve dataset from task (not shown)
</pre>
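<p>The second curl call can also be prepared from Python's standard library. The sketch below only builds the request and does not send it; the function name build_bbrc_request is hypothetical, and "..." stands for the URIs that are elided in the curl example.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

# Service URL taken from the curl call above.
BBRC_URL = "http://ot-test.in-silico.ch/algorithm/fminer/bbrc"


def build_bbrc_request(dataset_uri, prediction_feature):
    """Prepare (but do not send) the form-encoded POST that starts BBRC mining."""
    form = urlencode({
        "dataset_uri": dataset_uri,
        "prediction_feature": prediction_feature,
    })
    return Request(BBRC_URL, data=form.encode("ascii"), method="POST")


# Placeholders only: substitute the dataset and feature URIs of your upload.
request = build_bbrc_request("...", "...")
```

<p>Passing the prepared request to urllib.request.urlopen would submit it; handling the returned task is left out here, as in the curl example.</p>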
<p>In the output, you will find the fragments <a title="Calling BBRC and LAST-PM in three steps" href="http://www.maunz.de/wordpress/opentox/2011/bbrc-and-last-usage" target="_blank">as usual</a> (including, e.g., <em>p</em>-values and occurrences in compounds as feature values). However, the <em>effect</em> value now ranges over all three classes instead of two, for example:</p>
<pre class="brush: plain; title: ; notranslate">
# Effect can range across all class labels (here 0-2)
http://www.opentox.org/api/1.1#effect: &quot;2&quot; #
 http://www.opentox.org/api/1.1#smarts: &quot;[#8&amp;A]=[#16&amp;A]-[#6&amp;a]&quot;
 http://www.opentox.org/api/1.1#pValue: 0.9927
</pre>
<p><strong>Note:</strong> This example dataset was used for demonstration only. You can use any Opentox-compliant dataset URI (e.g. from <a title="Ambit" href="http://ambit.sourceforge.net/" target="_blank">Ambit</a>). Likewise, BBRC was used for demonstration, but LAST-PM supports the same functionality (please read <a title="Calling BBRC and LAST-PM in three steps" href="http://www.maunz.de/wordpress/opentox/2011/bbrc-and-last-usage">the post about calling BBRC and LAST-PM</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.maunz.de/wordpress/opentox/2011/multinomial-fminer/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
