2007: J F M A M J J A S O N D
2008: J F M A M J J A S O N D
2009: J F M A M J J A S O N D
2010: J F M A M J J A S O N D
2011: J F M A M J J A S O N D
2012: J F M A M J J A S O N D

Blog, by category: computing

from the desk of travis johnson.

spamfunc for optimization in matlab (from 2012/01/30)

what spamfunc is

In developing optimization algorithms, one of the most tedious parts is trying different examples, each of which might have its own starting points or upper or lower bounds or other information. The tedium really starts when your algorithm requires first or second order information, which might be tricky to calculate correctly. These bugs can be pernicious, because it might be difficult to differentiate between a bug in your algorithm and a bug in your objective or constraint evaluation. Handily, Northwestern Professor Robert Fourer wrote a language called AMPL, which takes a programming-language specification of objective and constraints and calculates derivatives as needed. The official amplfunc/spamfunc reference is contained in Hooking Your Solver to AMPL, but I'm shooting for a more low-key introduction.

You'll need to compile spamfunc before using it, but I'm saving that for last–until after I've shown off it's power a little bit. You'll also need a .nl file for spamfunc; these are generated by AMPL's command line program from an AMPL .mod source file. I'll be using rosenb.nl as my example here, for the Rosenbrock function.

>> [x0,bl,bu,lam,cl,cu]=spamfunc('rosenb.nl');

This serves as the initialization. It's worth noting the mathematical form AMPL bends your problem into. In particular, AMPL's spamfunc returns your problem in the form

 begin{split} 	min qquad 		& f(x) 	text{s.t.}qquad 	& c_L leq c(x) leq c_U 						&	 x_L leq x leq x_U. 						end{split}

So we've initialized 6 variables for our environment: the initial x_0 we specified in the mod file, lower(x_L) and upper bounds on those variables, the initial Lagrange multipler(lam, short for lambda), and the lower(c_L) and upper(c_U) bounds on the constraints. To get zeroth order information(that is, f(x) and c(x)) for a particular point x, just call spamfunc again with the the x you'd like to evaluate at, as in

>> [f,c] = spamfunc(x,0);

To get first order information(the gradient of the objective and Jacobian of constraints), call spamfunc as

>> [g,A]=spamfunc(x,1);

so that the gradient nabla f(x) is stored in g, and the i^{th} row of A is the transposed-gradient (nabla c_i(x))^T. That is, the entry A_{ij} of the matrix A is the derivative frac{partial c_i(x)}{partial x_j}.

Finally, to get the Hessian of the Lagrangian, call spamfunc with just the multipliers lambda, as in

>> WL = spamfunc(lam);
>> WF = spamfunc(0*lam); % if you just want \nabla^2 f(x).

Here, spamfunc reuses the last x at which spamfunc(x,1) was evaluated at.

how to use spamfunc in your own code

Fine and good, but how do you use this in your code? I'll consider a very simple unconstrained optimization problem(and finally define the Rosenbrock function, if you've been waiting with bated breath). The Rosenbrock function–or, Rosenbrock's banana function, as I'll be calling it from now on, is defined as

 	f(x_1, x_2) = 100(x_2 - x_1^2)^2 + (1-x_1)^2

If we wanted to use our spamfunc interface to solve this, we'd generate the .nl file as described below, then do a standard Newton method iteration. Since Newton steps are defined as p_k = -nabla^2 f(x_k) nabla f(x_k) and then the iterate is updated as x_{k+1}=x_k + p_k, the most basic possible form of the algorithm is

[x,bl,bu,lam,cl,cu]=spamfunc('rosenb.nl');

grad = inf;
while norm(grad) > 1e-6
	    [grad,A] = spamfunc(x,1);
	    W = spamfunc(lam);
	    pk = -W\grad;
	    x = x + pk
end

Note: This is a terrible algorithm; there're no bells and whistles like line search, Hessian modification, etc. But it demonstrates calling at least a couple of spamfunc options.

how to compile spamfunc

I recovered an outline for building spamfunc and amplfunc from a project I've worked on; it should get you a compiled spamfunc binary. You need to make sure to have mex (MATLAB's compiler wrapper) in your path and properly configured.

wget --ftp-user anonymous --ftp-password traviscj@traviscj.com ftp://netlib.org/ampl.tar
tar xf ampl.tar
cd ampl/solvers
sh configurehere
mv makefile makefile.dist
sed "s/CFLAGS = -O/CFLAGS = -fPIC -O/g" makefile.dist > Makefile
make
cd examples
# following line should reflect where mex(1) is.
export PATH=/meas/bin:$PATH
mv makefile makefile.dist
sed "s/- o/-o/g" makefile.dist | sed "s/mex -I/mex -ldl -I/g" > Makefile
make spamfunc.mex amplfunc.mex

It's worth noting the difference between amplfunc and spamfunc; amplfunc returns dense matrices while spamfunc returns them in sparse format. MATLAB can do linear algebra on sparse format matrices much more efficiently than dense format matrices(as long as the matrices themselves are sparse enough!)

how to generate nl files

In the above examples, I started with an AMPL source file, rosen.mod.

rosenb.mod
var x {1..2};

minimize obj: 100*(x[2] - x[1]^2)^2 + (1-x[1])^2;

let x[1] := 1.2;
let x[2] := 1.2;

This file tells AMPL what how many variables you need, how you'd like to name them, what objective function you're minimizing, and what initial values they should take. To compile it to a .nl file, use

$ ampl -P -ogrosenb rosenb.mod

Congratulations! You now have a .nl file to feed into your super sweet solver!

Getting GnuPlot to compile on OSX (from 2010/05/27)

I misunderstood an officemate’s question yesterday and set out to compile gnuplot. Stock GNUPlot will fail to compile on OSX due to an incompatibility in the readline library. So when you compile, let GNUPlot know to use the one included:

> tar xzvf gnuplot*
> cd gnuplot-4.2.39
> ./configure --prefix=/usr/local --with-readline=builtin
> make
> sudo make install

ZSH and scp/rsync globbing (from 2010/05/25)

I finally got annoyed at typing

tjohnson> scp tcj:*txt .
zsh: no matches found: tcj:*txt

I’d always fixed this by the up-CTRL-A-over-over-over-over-quote-CTRL-E-over-over-quote method, ie

tjohnson> scp "tcj:*txt" .

but the more proper and less annoying version is

tjohnson> scp tcj:\*txt .

However, in the spirit of inspired overkill, there are a couple other options. You can define some new aliases with

alias scp='noglob scp'

but that hardly counts as overkill, so instead I edited my .zshrc to include

autoload -U url-quote-magic
zle -N self-insert url-quote-magic
zstyle -e :urlglobber url-other-schema \
'[[ $words[1] == scp ]] && reply=("*") || reply=(http https ftp)'

Now, as I type scp tcj:* it replaces it with \*, negating the problem. Also, this fixes the problem where

wget http://traviscj.com/dynpage.php?var=1&importantvar=zyzyx

interprets the & as a ‘background this command’ and throws a couple of errors. Nice!

Pylab PDF (from 2010/05/25)

I’m in the process of kicking the MATLAB habit and replacing it with Python. Pylab is a reasonable approximation to the 2D plotting capabilities of MATLAB, but by default Pylab doesn’t run on headless(ie, non-GUI) boxes. So on my macbook, where I’m often running things interactively, I have

backend : macosx

and on the faster core2quad machine I send off longer-running jobs to, I used

backend : PDF

in the .matplotlib/matplotrc file.

Spaces setup (from 2010/05/25)

My workflow is probably pretty specific, but I figured I’d document it for people looking for ideas on how to set it up. This one seems to work pretty well for me.

  1. Terminal: for running stuff on my and other computers.

  2. Textmate: writing mathematics, programming code, etc

  3. Mathematica, MATLAB, and Papers

  4. Mail and Adium

  5. Firefox, Chrome, Safari

  6. iTunes and VLC

I have this arranged in a 3×2 grid, such that Textmate is top center. From there I can navigate to terminal, math programs, and Firefox directly(control-{left,right,down}). It takes two navigations to get to communication programs and entertainment, which is good for concentration while working.

MySQLdb module in Python on Ubuntu (from 2009/05/30)

To install the mysqldb module in python on Ubuntu:

sudo apt-get install python-mysqldb

Computer Ressurection and Elastic Cloud Experimentation (from 2008/11/29)

I was home on Thanksgiving Break with Sharvil, and we decided to revive some old computers. Partly I'd like to experiment with some clustering stuff without incurring CPU time at the AMATH department or Teragrid stuff I'm likely gonna be working on soon with Shea-Brown's neuroscience research. So, it turns out I resurrected about 5-6 old computers(final tally is still waiting on the number of successful Xubuntu installs on them, among other practical issues(where the hell am I going to put six computers…?): The very first computer I built(a P3 450), P3 700, Dual P2 266, a couple of AMD64 3200's, and a Sony Vaio P3 733. The cool thing is that the neuron spiking models are basically embarassingly parallel(well, each run isn't necesarily, but from what I've gathered so far, we're looking for averages over a bunch of them. So, sweet! Again, this would be terrible for actual research, especially against something like TG or even Amazon's EC2–which is another thing I really need to check out.

Sharvil and I also managed to restore functionality in a set of other computers, but the details of that have to remain somewhat quiet, apparently. CIA-types like their privacy, I suppose. Probably I'll drop in a plug about it on some future project. On the way home from that I picked up a little light bulb for the dome lamp in my old F250 Diesel so that Dad won't have to search though his cab in the dark anymore.

Above I mentioned Xubuntu… but currently I'm installing regular Ubuntu. It's taking FOREVER. Which is why I think it'll end up being Xubuntu. Either way, I guess I shouldn't have started with the second-slowest computer on some of the slowest hard drives I've got. I'm probably going to wait for it to finish, then install Xubuntu on another drive or try imaging this one or something to see how long it takes.

Incidentally, I decided it'd be a great time to run some Amazon EC2 tests(since I happened across some tutorials on YouTube. It's amazing… You load up the Firefox interface, find the Amazon machine image you want, click 'boot’, and in short order you can ssh over to it directly. I decided I'd run a little python Sieving program to find prime numbers. I haven't got the final numbers yet, because I want to get a pretty graph…. but it's pretty sweet so far. The gist is: A P3 450 machine takes 8 times longer to run the program than my MacBook Pro, a small-image Amazon EC2 instance runs in just a little bit less time, and a extra-large EC2 High-CPU node runs it in just about 60% of the time it takes my MacBook Pro. Granted–this is a terrible benchmark, and it's straight Python, and I'm comparing across OSX vs Linux, i586 vs Core2Duo vs ??, etc, but these are some good rough numbers for how long this stuff is going to be taking. I'm a little impressed my MacBook Pro was keeping up that well… but it has been a great little machine. And it explains why it runs so hot, probably. In any case, I've certainly had less fun with $.92!

UPDATE: Here's that follow-up with the graph. It's pretty revealing:<img class=“alignnone size-medium wp-image-97” title=“Relative rates of Execution” src=“http:www.traviscj.comwp-contentuploads200811sieve_relative-300x225.jpg” alt=“” width=“300” height=“225” >

Incidentally, after a bit more sleep I realized that obviously this has nothing to do with float point ops, so this isn't entirely indicative of how these machines would fare on scientific computations. But probably still interesting. Honestly I'm not quite sure why the AMD64 did so well. The machine <em>feels</em> way slower than the MBPo, but they are fairly close in actuality.

Powered by Olark