Moving This Blog
I am working on unifying all my blogs, and this is the first one to move. The new one is located at http://www.xavierllora.net/
Filed by Xavier at 11:20 pm under Notes
No Comments
We have finally finished setting up the website for Meandre, a semantic-driven, data-intensive flow engine. Meandre provides basic infrastructure for data-intensive computation. It provides, among other things, tools for creating components and flows, a high-level language to describe flows, and a multicore and distributed execution environment based on a service-oriented paradigm. We are currently gearing up for a first alpha release. You can visit the Meandre site here. I will be posting on the Meandre blog about our current steps toward getting the release out the door. The Meandre infrastructure is being built to support the SEASR project.
Filed by Xavier at 7:30 pm under Notes
I just posted on my IlliGAL blog how to implement a generic genetic algorithm (GA) main loop that squeezes the dynamic behavior of Python. Pretty sleek; if you have ever tweaked a GA main loop, you may find this interesting.
Filed by Xavier at 11:00 am under Notes
For the first time in 9 years, I have done absolutely nothing this vacation break. Wow, what a couch potato I have become! Well, that is not totally true: just for fun I started going over Python and, as usual with any new language, I ended up writing a simple genetic algorithm. I like the flexibility and compactness of the code (no verbosity at all). However, when I fired up my first run (yes, the good old OneMax problem), I realized that some of my assumptions about coding did not transfer directly. Yes, it was a bit slow. So I started digging for a profiler and, surprise!, it comes with the Python interpreter.
Here is an example of how to run the profiler:
import cProfile
cProfile.run('main()')
The cProfile module is a profiler written in C. If you do not have it in your install, you can run the same code with the profile module instead (highly likely to be in your install). Also, if you are using Python < 2.5 you may want to use profile instead (I read somewhere there was a bug in cProfile, but I cannot recall where I saw it). Below you can read the output of the profiler.
1246109 function calls (1096109 primitive calls) in 1.428 CPU seconds
Ordered by: standard name
     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
          1    0.000    0.000    1.428    1.428  <string>:1(<module>)
156000/6000    0.366    0.000    0.905    0.000  copy.py:144(deepcopy)
      29953    0.008    0.000    0.008    0.000  copy.py:197(_deepcopy_atomic)
       6000    0.154    0.000    0.535    0.000  copy.py:223(_deepcopy_list)
       6000    0.034    0.000    0.740    0.000  copy.py:250(_deepcopy_dict)
      47953    0.105    0.000    0.131    0.000  copy.py:260(_keep_alive)
       6000    0.040    0.000    0.861    0.000  copy.py:276(_deepcopy_inst)
       3000    0.180    0.000    0.258    0.000  crossovers.py:6(uniformCrossover)
       6600    0.005    0.000    0.017    0.000  fitnesses.py:5(oneMax)
       6000    0.006    0.000    0.006    0.000  ind_n_pop_classes.py:16(__init__)
         11    0.023    0.002    0.040    0.004  ind_n_pop_classes.py:35(evaluate)
         10    0.004    0.000    1.071    0.107  ind_n_pop_classes.py:63(selection)
         10    0.026    0.003    0.317    0.032  ind_n_pop_classes.py:67(crossover)
      18011    0.074    0.000    0.079    0.000  random.py:147(randrange)
      18011    0.023    0.000    0.102    0.000  random.py:211(randint)
         10    0.081    0.008    1.067    0.107  selections.py:7(tournamentSelection)
          1    0.001    0.001    1.428    1.428  test.py:39(main)
      24000    0.033    0.000    0.033    0.000  {hasattr}
     227953    0.042    0.000    0.042    0.000  {id}
      18038    0.004    0.000    0.004    0.000  {len}
      12008    0.004    0.000    0.004    0.000  {method 'add' of 'set' objects}
     293953    0.085    0.000    0.085    0.000  {method 'append' of 'list' objects}
          1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
     203953    0.061    0.000    0.061    0.000  {method 'get' of 'dict' objects}
       6000    0.002    0.000    0.002    0.000  {method 'iteritems' of 'dict' objects}
     141011    0.032    0.000    0.032    0.000  {method 'random' of '_random.Random' objects}
       6000    0.006    0.000    0.006    0.000  {method 'update' of 'dict' objects}
         21    0.000    0.000    0.000    0.000  {range}
       6600    0.012    0.000    0.012    0.000  {sum}
       3000    0.017    0.000    0.017    0.000  {zip}
Yes, I used the deepcopy method because it was nice and made my life easy. Yup, big mistake. In the output above, deepcopy alone accounts for about 63% of the run (0.905 of 1.428 CPU seconds) and pushes my selection to roughly 75% of the overall execution time (1.071 seconds). Quite unacceptable. Thanks to the profiler, I now know where to look for slowness and, more importantly, I have learned which gaps in my Python knowledge need filling.
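One way to act on that finding is sketched below. It is not the code from my GA: Individual, tournament, make_population and profile_report are hypothetical names made up for the illustration, and it is written for Python 3 rather than the Python 2 used in the run above. It assumes the genome is a flat list of bits, so a plain list copy is enough, and it sorts the profile by cumulative time so the expensive call floats to the top.

```python
import copy
import cProfile
import io
import pstats
import random


class Individual:
    """Hypothetical individual: a flat list of bits plus a cached fitness."""

    def __init__(self, genome, fitness=None):
        self.genome = genome
        self.fitness = fitness

    def clone(self):
        # A flat list of ints only needs a shallow list copy, not deepcopy.
        return Individual(list(self.genome), self.fitness)


def make_population(n, length, rng):
    """Build n random individuals scored with OneMax (number of ones)."""
    population = []
    for _ in range(n):
        genome = [rng.randint(0, 1) for _ in range(length)]
        population.append(Individual(genome, sum(genome)))
    return population


def tournament(population, rng, size=2, copier=copy.deepcopy):
    """Run len(population) random tournaments and copy each winner."""
    selected = []
    for _ in range(len(population)):
        contenders = [rng.choice(population) for _ in range(size)]
        winner = max(contenders, key=lambda ind: ind.fitness)
        selected.append(copier(winner))
    return selected


def profile_report(func, *args, limit=10, **kwargs):
    """Run func under cProfile; return the report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    rng = random.Random(42)
    population = make_population(200, 50, rng)
    # Same selection twice: first with deepcopy, then with the cheap clone.
    print(profile_report(tournament, population, rng))
    print(profile_report(tournament, population, rng, copier=Individual.clone))
```

Comparing the two reports should show whether the copy is still the dominant cost. The shallow clone is only safe while the genome holds immutable items such as ints; nested structures would need more care.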
Filed by Xavier at 10:36 am under Notes
Jena 2 Inference Support is a nice introduction to the inference engines provided by the Jena package. Besides standardized reasoning for RDF and a subset of OWL/Lite and OWL/All ontologies, Jena also gives you the mechanisms for creating your own rule-based inference engine on top of its generic rule reasoner.
Filed by Xavier at 10:04 am under Notes
I am currently working on an open source project (do not ask me which, since it will surface soon and I should not talk much about it till it does ;)) that needs to provide web access to apps, services, and contents. From my days fighting with Mulgara descriptors I remembered that Jetty (a full-featured web server implemented entirely in Java) could be embedded into applications to provide such services. It has been two months now since I started using it, and it is a nice, shiny, and slick piece of software. I used Tomcat for most of my stuff, but Jetty is definitely an amazing alternative. Below I just pasted one of the ways you can embed Jetty in your app.
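// Assumes the Jetty 6 packages (org.mortbay.*); HelloServlet is your own servlet class
import org.mortbay.jetty.Server;
import org.mortbay.jetty.servlet.Context;
import org.mortbay.jetty.servlet.ServletHolder;
Server server = new Server(8080);                         // listen on port 8080
Context root = new Context(server,"/",Context.SESSIONS);  // root context with sessions enabled
root.addServlet(new ServletHolder(new HelloServlet("Hello World!")), "/*");
server.start();
server.join();                                            // block until the server stops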
Yes, that’s it. You can also embed multiple full-fledged web apps using an XML configuration file:
Server server = new Server();
// Load the server setup from a Jetty XML file;
// new XmlConfiguration(new FileInputStream("myJetty.xml")) works as well
XmlConfiguration configuration = new XmlConfiguration(new File("myJetty.xml").toURL());
configuration.configure(server);
server.start();
server.join();
Oh, and one last cool thing. You can remove apps from the server without needing to restart it! That is pretty useful.
Filed by Xavier at 9:25 am under Notes
The E2K blog has moved. You can reach it at
http://dita.ncsa.uiuc.edu/e2k/
Filed by Xavier at 2:40 pm under Notes
Bernie just put together this beauty to load small RDF/XML files into Virtuoso’s metadata store (we are testing the open source version).
DB.DBA.RDF_LOAD_RDFXML(http_get ('URI to the RDF/XML file'),'','Name of the graph in the store');
We have tested loading a 5-million-triple RDF/XML file and the results are pretty nice (it took around 6 minutes to load on a dual Pentium 4 Extreme Edition at 3GHz with 4GB of RAM and a slow 7500rpm disk with ext3). When pushing to larger files, the stream version of this is a must to reduce memory consumption.
Filed by Xavier at 9:19 am under Notes
I needed to reset the password for a user on a MediaWiki site. Luckily, I ran into this post, “Reset a user password on MediaWiki - Greg’s Postgres stuff”, which shows you how to do so. The five-cent summary for a MySQL-powered site, where 123 is the id of the user (it appears both as the salt prefix and in the WHERE clause):
UPDATE user SET user_password = md5(CONCAT('123-',md5('newpassword'))) WHERE user_id=123;
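If you would rather compute the hash outside the database, the expression above can be mirrored in a few lines of Python. This is only a sketch: the function names are made up for the illustration, it assumes the password is encoded as UTF-8 before hashing, and it assumes a DB-API driver that uses %s placeholders. Whether your MediaWiki version still accepts this salted MD5 format is not something this snippet checks.

```python
import hashlib


def mediawiki_md5_hash(user_id, new_password):
    """Mirror md5(CONCAT('<user_id>-', md5(new_password))) as lowercase hex."""
    inner = hashlib.md5(new_password.encode("utf-8")).hexdigest()
    salted = "{}-{}".format(user_id, inner)
    return hashlib.md5(salted.encode("utf-8")).hexdigest()


def build_update(user_id, new_password):
    """Return (sql, params) for a driver that uses %s placeholders (assumption)."""
    sql = "UPDATE user SET user_password = %s WHERE user_id = %s"
    return sql, (mediawiki_md5_hash(user_id, new_password), user_id)


if __name__ == "__main__":
    sql, params = build_update(123, "newpassword")
    print(sql)
    print(params)
```

Passing the hash as a parameter keeps the user id and the salt prefix in sync, which is the easy thing to get wrong when editing the SQL by hand.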
Filed by Xavier at 6:31 pm under Notes
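Sometimes you may need to sample a dataset, for instance to get a uniformly sampled subset out of a dataset stored in a file. The Perl script below does the job for you: it keeps each line independently with the given probability, so the sample size is only approximately that proportion of the file.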
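#!/usr/bin/perl -w
use strict;
# Keep each line of <file> independently with probability <sample_proportion>.
if ( @ARGV != 2 ) {
    print "Wrong number of arguments\n\t".
          "uniform-sampler.pl <file> <sample_proportion>\n";
}
else {
    my ($file, $proportion) = @ARGV;
    srand();
    open(my $fh, '<', $file) or die "File $file could not be opened: $!";
    while ( my $line = <$fh> ) {
        # rand() is uniform in [0,1), so about $proportion of the lines pass
        print $line if rand() < $proportion;
    }
    close $fh;
}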
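For comparison, here is a minimal sketch of the same idea in Python 3. The function name sample_lines and the optional rng argument are my own additions for the illustration (the Perl script has no seeding option); passing a seeded generator makes a sample reproducible.

```python
import random


def sample_lines(lines, proportion, rng=None):
    """Yield each item of `lines` independently with probability `proportion`."""
    if rng is None:
        rng = random.Random()
    for line in lines:
        # random() is uniform in [0.0, 1.0), like Perl's rand()
        if rng.random() < proportion:
            yield line


if __name__ == "__main__":
    # Small in-memory demo; an open file object can be passed the same way.
    data = ["row %d\n" % i for i in range(20)]
    for line in sample_lines(data, 0.25, random.Random(7)):
        print(line, end="")
```

Because it is a generator, it streams through the input one line at a time and never holds the whole file in memory.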
Filed by Xavier at 8:20 am under Notes