A Probabilistic Look at MDS

Headed to NYC in July! Will be presenting my paper on a probabilistic variant of multi-dimensional scaling (MDS) at the 2016 International Joint Conference on Artificial Intelligence (IJCAI)! The acceptance rate was below 25%, so it’s certainly satisfying that the paper was accepted.

You can read the pre-print here and slides here.

About the work: we take a fresh Bayesian view of MDS—an old dimensionality reduction method—and find connections to popular machine learning methods such as probabilistic matrix factorization (used in recommender systems) and word embedding (for natural language processing).

The probabilistic viewpoint allows us to connect distance/similarity matching to non-parametric learning methods such as sparse Gaussian processes (GPs), and we derive a novel method called the Variational Bayesian MDS Gaussian Process (VBMDS-GP) [yes, a mouthful!]. As concrete examples, we apply it to multi-sensor localization and, perhaps more interestingly, political unfolding.

[Image: VBMDSGP_PoliticalUnfolding.png]

In the unfolding task, we projected political candidates onto a 2-d plane using their associated Wikipedia articles and ~15,000 voter preference surveys done in 2004 for other candidates. The projection is not perfect since we use very simple Bag-of-Words (BoW) features (I think Sanders is more liberal than the map implies), but it is nevertheless coherent. We see our favorite political candidate, Donald Drumpf, projected to the conservative section and President Obama projected near the Clintons.

The model can be extended in lots of different ways; I’m working on using more recent variational inference techniques, plus maybe some “deep” extensions.
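
For readers who haven’t met MDS before, here is a minimal sketch of the classical, non-probabilistic starting point: place points in a low-dimensional space so that their pairwise distances match a given dissimilarity matrix. This is plain stress minimization by gradient descent in pure Python, not the VBMDS-GP from the paper. The function names (`stress`, `mds_gradient_descent`), the learning rate, the step count and the toy 3-4-5 triangle are all illustrative assumptions.

```python
import math
import random


def stress(coords, target):
    """Sum of squared gaps between embedded and target distances (each pair once)."""
    n = len(coords)
    return sum(
        (math.dist(coords[i], coords[j]) - target[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def mds_gradient_descent(target, dim=2, steps=5000, lr=0.05, seed=0):
    """Embed n items in `dim` dimensions by gradient descent on the raw stress.

    `target` is a symmetric n x n matrix of desired pairwise distances.
    All hyperparameters are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    n = len(target)
    # Random starting layout; fixed seed keeps the run reproducible.
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Guard against division by zero if two points coincide.
                d = math.dist(coords[i], coords[j]) or 1e-12
                # d/dx_i of (d_ij - t_ij)^2 is 2 (d_ij - t_ij) (x_i - x_j) / d_ij.
                scale = 2.0 * (d - target[i][j]) / d
                for k in range(dim):
                    grads[i][k] += scale * (coords[i][k] - coords[j][k])
        for i in range(n):
            for k in range(dim):
                coords[i][k] -= lr * grads[i][k]
    return coords


if __name__ == "__main__":
    # Toy input: three items whose distances form a 3-4-5 triangle.
    target = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
    layout = mds_gradient_descent(target)
    print("final stress:", stress(layout, target))
    for point in layout:
        print(point)
```

Distances are unchanged by rotation, reflection and translation, so the recovered coordinates are only defined up to those transformations. Only the pairwise distances are meaningful to compare against the target.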


Ethnicity in Singapore: A Visualization using D3.js

[Image: Screen-shot of SG Ethnicity Visualization]

I’ve been playing around with D3.js the past few weeks and just completed a first-cut visualization of how Singapore’s ethnic demographic has changed from 2000 to 2010. Feedback is welcome, particularly on what works and what doesn’t (visually). Some context: the idea for this explorative tool came up after a few conversations with locals about the extent of ethnic integration. Several of the opinions voiced left me with questions about the extent of racial discrimination in Singaporean society. Although the visualization doesn’t address this question, it was a step towards a better understanding of Singapore’s ethnic demography. Feel free to explore the visualization, and you can learn more at Singstats.

Predicting Network Centralities from Node Attributes

It’s been a great December, ending the year quite nicely! I attended NIPS, and bumped into my PhD supervisor Yiannis. We had an enjoyable time at the conference and exploring Montreal (a beautiful city). I also presented a poster at the NIPS Workshop on Networks about how to link node features to eigenvector centrality via a probabilistic model; for example, mapping a person’s attributes to how influential he or she is in a social network:

[Image: NetworkCentrality_NIPSWsSpotlight]

Abstract: Among the variety of complex network metrics proposed, node importance or centrality has potentially found the most widespread application, from the identification of gene-disease associations to finding relevant pages in web search. In this workshop paper, we present a method that learns mappings from node attributes to latent centralities. We first construct an eigenvector-based Bayesian centrality model, which casts the problem of computing network centrality as one of probabilistic (latent variable) inference. Then, we develop the sparse variational Bayesian centrality Gaussian process (VBC-GP) which simultaneously infers the centralities and learns the mapping. The VBC-GP possesses inherent benefits: it (i) allows a potentially large number of nodes to be represented by the sparse mapping and (ii) permits prediction of centralities on previously unseen nodes. Experiments show that the VBC-GP learns high-quality mappings and compares favorably to a two-step method, i.e., a full-GP trained on the node attributes and network centralities. Finally, we present a case-study using the VBC-GP to distribute a limited number of vaccines to decrease the severity of a viral outbreak.

Download Paper PDF | Download NIPS Networks Spotlight Slides
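
As a point of reference for the abstract above, ordinary (non-Bayesian) eigenvector centrality can be computed directly from an adjacency matrix by power iteration. The sketch below is that textbook computation only, not the Bayesian centrality model or the VBC-GP. The function name, the tolerance and the small star graph are illustrative assumptions, and the graph is assumed to be undirected and connected.

```python
import math


def eigenvector_centrality(adjacency, iterations=200, tol=1e-10):
    """Power iteration for the leading eigenvector of a symmetric adjacency matrix.

    Each step multiplies by (A + I) rather than A. The shift keeps the same
    eigenvectors but avoids the oscillation plain power iteration can show
    on bipartite graphs.
    """
    n = len(adjacency)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # (A + I) @ scores, written out without external libraries.
        new = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        # Rescale to unit Euclidean length so values stay bounded.
        norm = math.sqrt(sum(v * v for v in new))
        new = [v / norm for v in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


if __name__ == "__main__":
    # Hypothetical toy network: node 0 is a hub connected to three leaves.
    star = [
        [0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
    ]
    for node, score in enumerate(eigenvector_centrality(star)):
        print(node, round(score, 4))
```

In this toy star graph the hub receives the largest score and the three leaves tie, which matches the intuition that a node is important when its neighbours are.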

First Day at SMART.

[Image: SMART Desk view]

Well, my first working day at SMART is almost over. 4 minutes and 35 seconds to the stipulated end-of-work-day. But who’s counting? So far, it’s been interesting: I met the friendly folks here and saw the cool toys (autonomous vehicles). I’m one of the “early birds” and managed to land a desk with a great view of the NUS campus. That said, I might move down to “The Garage” where all the robots/machines are. Hopefully, I’ll sort out all my administration stuff soon and get on to playing (er, working) with the vehicles and some new learning methods I have in mind.

Rebuilding libstdcxx using macports on Mountain Lion

I did the unthinkable and upgraded my OS (in the final year of my PhD!). And surprise-surprise, some of my code wouldn’t compile anymore. I figured I needed to rebuild my macports-installed *nix software but ran into problems with gcc45 and libstdcxx. The issue is an ld64 bug, which was fixed using user adrian’s solution (replicated here):

sudo port uninstall ld64
sudo port -v install ld64
sudo port clean libstdcxx
sudo port -d build libstdcxx build.jobs=1
sudo port install libstdcxx