Workable published a short data-science (probabilistic) puzzle at http://buff.ly/1Rip3b0:
Suppose we have 4 coins, each with a different probability of throwing Heads. An unseen hand chooses a coin at random and flips it 50 times. This experiment is done several times, with the resulting sequences as shown below (H = Heads, T = Tails). Write a program that will take as input the data collected from this experiment and estimate the probability of heads for each coin.
Well, I thought I would spend a few minutes on it, and those few minutes turned into more than 30. My solution is simply maximum likelihood estimation (MLE), i.e., minimizing the negative log likelihood of the data given the model parameters:
$$-\log \mathcal{L} = -\sum_{i} \log \sum_{k=1}^{4} \pi_k \binom{50}{h_i} \, p_k^{h_i} (1 - p_k)^{50 - h_i}$$
where $h_i$ is the number of observed heads in the $i$-th sequence of 50, $z_i$ is the coin selected ($z_i$ follows a categorical distribution $\pi$ over the 4 coins for each sequence; uniform here, since the coin is chosen at random), and $p_k$ is the probability of heads for coin $k$.
Since I’m all about trying Julia nowadays, that’s what I coded it up in (hosted on GitHub). The first-cut solution (coinsoln_old.jl) found the following MLE estimates (negLL: 278.2343):
Maximum Likelihood Estimates: [0.428, 0.307, 0.762, 0.817]
The first solution didn’t use any speed-up tricks or derivatives, so it should be easy to follow, but it isn’t terribly accurate or efficient. I then tried automatic differentiation, which required only minor code changes and sped up the computation significantly. This updated, faster solution (coinsoln.jl) found a slightly different result using conjugate gradients (negLL: -7.31325):
Maximum Likelihood Estimates: [0.283, 0.283, 0.813, 0.458]
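For readers who don’t want to dig through the Julia, here is a minimal sketch of the same marginalized-likelihood MLE in Python. This is not the author’s code: the function names (`neg_log_lik`, `estimate`) are hypothetical, SciPy’s BFGS stands in for the conjugate-gradient optimizer, the coin prior is taken as uniform, and the head counts would come from your parsed H/T sequences.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln, logsumexp

N = 50  # flips per sequence

def neg_log_lik(theta, heads, num_coins=4):
    """Negative log-likelihood with the coin choice z_i marginalized out."""
    # logits -> probabilities, clipped away from 0/1 for numerical safety
    p = np.clip(expit(theta), 1e-9, 1 - 1e-9)
    h = np.asarray(heads, dtype=float)[:, None]          # (num_seqs, 1)
    log_binom = gammaln(N + 1) - gammaln(h + 1) - gammaln(N - h + 1)
    # log P(h_i | coin k) for every sequence/coin pair, (num_seqs, num_coins)
    log_pk = log_binom + h * np.log(p) + (N - h) * np.log1p(-p)
    # uniform prior 1/num_coins over which coin was picked
    return -np.sum(logsumexp(log_pk - np.log(num_coins), axis=1))

def estimate(heads, num_coins=4, restarts=5, seed=0):
    """MLE of the per-coin heads probabilities via multi-start BFGS."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(restarts):
        res = minimize(neg_log_lik, rng.normal(size=num_coins),
                       args=(heads,), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    # sort to resolve label switching (coin identities aren't identifiable)
    return np.sort(np.clip(expit(best.x), 1e-9, 1 - 1e-9))
```

Note the multiple restarts: the marginalized likelihood is non-convex (it’s a mixture of binomials), so a single optimizer run can land in a poor local optimum, which may explain why the two solutions above found different estimates.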
Oh, and if I made any stupid errors, please let me know.
Over the past few months, I’ve been exploring Julia, a new open-source programming language that’s been gaining traction in the technical computing community. For the uninitiated, Julia aims to solve the “two-language problem”—having to code in a high-level language (e.g., Python/Matlab) for prototyping and a low-level language (e.g., C/C++) for speed. I found Julia easy to pick up—it’s syntactically similar to MATLAB with some important differences—and I’ve enjoyed writing a few small tools.
Admittedly, I did encounter a few problems along the way. For example, initial installation and setup was a breeze, but I had trouble getting IJulia and Gadfly to play nice. Jiahao kindly helped me resolve this problem; as a tip, installing Anaconda’s Python distribution helps avoid many issues. Although native Julia packages are being built rapidly, I’ve found their quality to be mixed. When I needed a mature library, it was easy to make calls to C and Python, but that’s an additional step and dependency. There are a few additional quirks with regard to package imports and function replacement (something I haven’t quite gotten down yet) and garbage collection, which can take significant CPU time if you aren’t careful. One more thing to note: Julia uses a just-in-time (JIT) compiler, so first-time runs are typically slower.
All in all, I believe Julia is at the point where, personally, the pros outweigh the cons, and I’m starting to port over some of my older code. If you haven’t yet tried Julia, I highly recommend giving it a go. But as with learning anything new, expect to be a little confused at times and to consult the Julia documentation regularly. Oh, and the helpful and growing Julia community.
I’ve been playing around with D3.js the past few weeks and just completed a first-cut visualization of how Singapore’s ethnic demographics changed from 2000 to 2010. Feedback is welcome, particularly on what works and what doesn’t (visually). Some context: the idea for this explorative tool came up after a few conversations with locals about the extent of ethnic integration. Several voiced opinions left me with questions about the extent of racial discrimination in Singaporean society. Although the visualization doesn’t address this question, it was a step towards a better understanding of Singapore’s ethnic demography. Feel free to explore the visualization, and you can learn more at Singstats.
It’s been a great December, ending the year quite nicely! I attended NIPS and bumped into my PhD supervisor Yiannis. We had an enjoyable time at the conference and exploring Montreal (a beautiful city). I also presented a poster at the NIPS Workshop on Networks about how to link node features to eigenvector centrality via a probabilistic model; for example, mapping a person’s attributes to how influential he or she is in a social network:
Abstract: Among the variety of complex network metrics proposed, node importance or centrality has potentially found the most widespread application—from the identification of gene-disease associations to finding relevant pages in web search. In this workshop paper, we present a method that learns mappings from node attributes to latent centralities. We first construct an eigenvector-based Bayesian centrality model, which casts the problem of computing network centrality as one of probabilistic (latent variable) inference. Then, we develop the sparse variational Bayesian centrality Gaussian process (VBC-GP) which simultaneously infers the centralities and learns the mapping. The VBC-GP possesses inherent benefits: it (i) allows a potentially large number of nodes to be represented by the sparse mapping and (ii) permits prediction of centralities on previously unseen nodes. Experiments show that the VBC-GP learns high-quality mappings and compares favorably to a two-step method, i.e., a full-GP trained on the node attributes and network centralities. Finally, we present a case-study using the VBC-GP to distribute a limited number of vaccines to decrease the severity of a viral outbreak.
Download Paper PDF | Download NIPS Networks Spotlight Slides
If you’re ever getting “no such instruction” errors using Theano, e.g.,
no such instruction: `vmovd %rax, %xmm0'
try inserting the following lines into ~/.theanorc (create it if missing).
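The original snippet didn’t survive here; the commonly cited fix for this error (an older assembler choking on AVX instructions emitted by `-march=native`) is to pin Theano’s gcc to a pre-AVX target. Treat the exact architecture flag as an assumption for your machine:

```
[gcc]
cxxflags = -march=corei7
```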
More info on Stack Overflow.
It’s been a while since I last updated this blog. My excuse is that I’ve been busy, but who isn’t? ;) Just submitted a couple of conference papers, with another journal paper in the review stage—I’ll post them up here soon. I’m excited about one of the conference papers, which ties complex network metrics with machine learning (specifically, centrality with a sparse GP) and could have applications in a broad range of areas.
Working at SMART has been quite enjoyable this past year; from hanging out in the (autonomous vehicle) garage to data analytics @ the fishbowl (pictures to come). A few of my fellow postdocs have moved on to greener pastures since I joined, which has gotten me thinking about what lies ahead. As usual, there is some internal tension brewing between choosing academia and industry. I would prefer the former but am open to the latter; given the limited number of tenure-track faculty positions available, I believe this to be the best perspective. Does anyone with experience have any thoughts?
I’ll update more often from now on, given that I’ve mostly settled into a routine. Oh, and I’m engaged, so there’s a wedding next year. Crazy, exciting time in my life!
Well, my first working day at SMART is almost over. 4 minutes and 35 seconds to the stipulated end-of-work-day. But who’s counting? So far, it’s been interesting — met the friendly folks here and saw the cool toys (autonomous vehicles). I’m one of the “early birds” and managed to land a desk with a great view of the NUS campus. That said, I might move down to “The Garage” where all the robots/machines are. Hopefully, I’ll sort out all my administrative stuff soon and get on to playing, er, working with the vehicles and some new learning methods I have in mind.