ML / algorithms / data science / and just hacking around...
A few BigML open-source projects where I've acted as the main contributor. They're all data-oriented, offering techniques for summarizing or sampling large (or streaming) data sources.
Various probabilistic hashing/sketching algorithms in Clojure (Bloom filters, min-hash, HyperLogLog, Count-Min). Useful for building compact, mergeable summaries of streaming data. Handy for distributed systems and even for ML tasks (min-hash in particular).
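
To make "compact and mergeable" concrete, here is a minimal min-hash sketch in Python. It is only an illustration, not the Clojure project's API: the class and method names (`MinHash`, `add`, `merge`, `similarity`), the choice of 128 hash functions, and the use of seeded BLAKE2b as the hash family are all assumptions made for this example.

```python
import hashlib


class MinHash:
    """Toy min-hash signature (hypothetical names, not the library's API)."""

    def __init__(self, num_hashes=128):
        self.num_hashes = num_hashes
        # One running minimum per seeded hash function; inf means "nothing seen yet".
        self.mins = [float("inf")] * num_hashes

    @staticmethod
    def _hash(seed, item):
        # Simulate a family of hash functions by prefixing the item with a seed.
        data = f"{seed}:{item}".encode("utf-8")
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def add(self, item):
        # Streaming update: keep only the smallest hash value per function.
        for i in range(self.num_hashes):
            h = self._hash(i, item)
            if h < self.mins[i]:
                self.mins[i] = h

    def merge(self, other):
        # The element-wise minimum is the signature of the union of both streams.
        merged = MinHash(self.num_hashes)
        merged.mins = [min(a, b) for a, b in zip(self.mins, other.mins)]
        return merged

    def similarity(self, other):
        # The fraction of matching slots estimates the Jaccard similarity.
        matches = sum(1 for a, b in zip(self.mins, other.mins) if a == b)
        return matches / self.num_hashes


# Two "machines" each summarize part of a stream, then the summaries are merged.
left, right = MinHash(), MinHash()
for x in range(0, 100):
    left.add(x)
for x in range(50, 150):
    right.add(x)

combined = left.merge(right)
estimate = left.similarity(right)  # true Jaccard similarity is 50 / 150
```

Each signature stays at a fixed size (128 integers here) no matter how many items are added. Merging two signatures gives exactly the signature that a single pass over the combined stream would have produced, which is what makes this kind of summary convenient in distributed settings.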