Friday, August 20, 2010

Who (and what) I would like to see at DevCamp

Comments, requests and suggestions on my last post are pouring in. Thanks everyone. One of the folks who sent me email asked "Who would *you* like to see speaking at DevCamp, assuming they are in India and willing to deliver a talk, and on what?"

Hmm. Interesting question. I haven't really thought about this very deeply but here is a quick response (very busy day, no time to edit, link to home pages etc., sorry).

In no particular order,

(1) Debasish Ghosh on deep Scala programming. This guy is really good.

(2) Baishampayan Ghose on the technical aspects of paisa.com

(3) Bhasker Kode on Erlang at Hover.in

(4) Peter Thomas - guru on things Wicket-ey, speaking of things Wicket-ey. (Due disclosure: old friend of mine.)

(5) Narayan Raman on *the evolution* of Sahi (and on running a company based around an open source tool he wrote. How cool is that?)

(6) Anyone from c42 or ActiveSphere on the challenges of setting up an n-man (n < 7) consultancy and competing with the big boys (due disclosure: both companies were built by ex-TWers. I know a few of them)

(7) Anyone (technical) from FlipKart. They seem to be doing good things (I am a satisfied customer) and I am interested in how they tackle the huge challenges in building (e.g.) recommendation systems.

(8) Anyone at all in India doing serious work in Haskell (Scala would do).

(9) Anyone building/working in a *technically* challenging startup (Notion Ink, say) on their *technical* challenges.

(10) ThoughtWorkers hacking on stuff, on what they are hacking on. TWers in general have all kinds of side projects going. The two Viveks (Prahalad and Singh) would be a good start.

Tuesday, August 17, 2010

Speaking at DevCamp 2010

I'll be speaking at DevCamp 2010. As in 2008, I have a "menu" of topics that people can vote on and will select the topic at the very last minute. Since I don't use slides (in general) this isn't very hard to pull off. More on this below.

DevCamp is interesting because in India, there aren't any "developer to developer" conferences. Most are either company sponsored events (e.g. Sun/Oracle/Adobe Tech Days) or are overrun by "evangelists" hired by MegaCorps to sell their crapware to developers. DevCamp attempts to head these people off by stating "DevCamp is an annual BarCamp style unconference for hackers, by hackers and of hackers that began in Bangalore in 2008 with code and hacking as its core themes". Some of these "evangelists" are shameless enough to crash the conf anyway, but the Law of Two Feet often takes care of that.

So what do you talk about at DevCamp? (everything that follows is *my* opinion. I have nothing to do with the organizing of DevCamp)

If you are any kind of hacker, you have a pet project running on the side. You are learning or doing something that might be of interest to other developers. For example, at the last DevCamp I attended (in 2008) someone was trying to replace JMeter with an equivalent Erlang tool and he gave a very interesting talk on the advantages and challenges of this approach.

Bring your laptop and show us what you are working on. *Don't* make one of those slide heavy "Introduction to Blah" type talks that are prevalent at most Indian conferences (last year's PyCon India was a good example of this iirc. Hopefully this year is better). Your audience consists of professional developers who are quite comfortable with looking up stuff on the Internet.

As the DevCamp page puts it, "Assume a high level of exposure and knowledge on the part of your audience and tailor your sessions to suit. Avoid 'Hello World' and how-to sessions which can be trivially found on the net. First hand war stories, in-depth analysis of topics and live demos are best." Again, some folks do try to sneak in "Introduction to Blah" where Blah is the latest "hot" topic (Clojure or Android would fit the bill these days), but again "The Law of Two Feet" (mostly) takes care of them.

If you want to talk about Clojure don't do "An Introduction to Clojure". In the days of YouTube, Rich Hickey can do that much better than you could. Talk about "How I built a Text Processing/Web Crawler library in Clojure" or "My startup runs on Clojure" (and show us the code). Tell us what *you* know that few others do ("in-depth analysis") and/or show us interesting code you wrote ("live demos"). If someone were to do a talk on (e.g.) how the Clojure *compiler* works and the tradeoffs in its design, that would be interesting to me. If you are recycling "Clojure has macros, woot!" I don't care.

The other interesting aspect of DevCamp is how lightweight it is. There is none of the stuffiness associated with the usual company conferences. It is an *un*conference, like BarCamp, but without the legion of SEO marketing people, "bloggers", non-tech "founders" trawling for naive developers who'll work for free on their latest "killer idea"s etc. who swarm BarCamp. BarCamp (imo) attracts fringe lunatics. DevCamp attracts (or should attract when it works well) competent developers.

So, these are the things I could talk about at DevCamp. Since I work on Machine Learning and Compilers, the topics reflect that experience. I could talk about how to build a Leasing System in Java but I doubt I'd have anything interesting to say ;-).

Send me email if you have a preference (or leave a comment here). I'll talk about whatever has the highest number of votes on Sep 4. "Customer Development" for sessions woot? Email > comments here > twitter but any and all forms of media are acceptable.

The topics from highest to lowest number of votes registered at the time of writing are

(1) An In Depth Look at the Category Theory Bits in Haskell (expanded version of the old Monad tutorial)

At DevCamp 2008, I presented a talk on "Understanding Monads" where the idea was that someone who knew nothing about Monads should come to the talk and walk out knowing how they work and when to use them. Instead of giving vague analogies ("monads are space stations/containers/elephants..."), you build monads from the ground up using first class functions. The talk included, in its first iteration, the List, Maybe and State monads. Later versions (over the years I have given the talk a few times) broke down the Category Theory behind monads and how it helps in structuring programs.

The latest version encompasses all the hairy Category Theory related bits and pieces (Applicatives, Monoids, Functors, Monad Transformers...) which impede programmers trying to learn Haskell/Scala/ML etc. I don't assume any theory/math background from the audience and introduce the required formalisms as needed. The good news is that this is a very polished and popular topic (and is trending highest in the number of "votes"). The bad news is that I am bored of this talk (but will still use it if it scores the highest number of votes).
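To make "from the ground up using first class functions" concrete, here is a minimal sketch of the Maybe and State monads as nothing more than a pair of functions each. It is written in Python purely for brevity and is not the code from the talk; every name in it (`unit_maybe`, `bind_maybe`, `safe_div`, `unit_state`, `bind_state`, `tick`) is made up for illustration, and it assumes `None` is never a legitimate payload.

```python
# Illustrative sketch only: all names here are hypothetical, not from the talk.

NOTHING = None  # stand-in for Haskell's Nothing (assumes None is never a real value)


def unit_maybe(x):
    """Wrap a plain value (roughly Haskell's `return` / `Just`)."""
    return ("just", x)


def bind_maybe(m, f):
    """Pass the wrapped value to f, short-circuiting if m is NOTHING."""
    if m is NOTHING:
        return NOTHING
    _, x = m
    return f(x)


def safe_div(a, b):
    """Division that reports failure as NOTHING instead of raising."""
    return NOTHING if b == 0 else unit_maybe(a / b)


# State monad: a stateful computation is just a function
#   state -> (value, new_state)

def unit_state(x):
    """A computation that returns x and leaves the state untouched."""
    return lambda s: (x, s)


def bind_state(m, f):
    """Run m, then feed its value to f and run the result on the new state."""
    def run(s):
        value, s2 = m(s)
        return f(value)(s2)
    return run


def tick():
    """Return the current counter value and increment the counter."""
    return lambda n: (n, n + 1)


# Two ticks in sequence, collecting both observed counter values.
two_ticks = bind_state(
    tick(), lambda a: bind_state(tick(), lambda b: unit_state((a, b)))
)

if __name__ == "__main__":
    print(bind_maybe(safe_div(10, 2), lambda x: safe_div(x, 5)))
    print(bind_maybe(safe_div(1, 0), lambda x: safe_div(x, 5)))
    print(two_ticks(0))
```

The point of the exercise is that `bind` is the only place where the "plumbing" (failure propagation, state threading) lives; the functions being chained never mention it.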


(2) Building a Type Inferencer in 45 minutes


Static Type Systems, especially those more powerful than the Java/C# variety, are a mystery to most programmers. This can be seen, for example, in how developers with a Java background write "Java in Scala" rather than idiomatic Scala. The best way (and the Hacker's way) to understand how a Type Inferencer works is to build one. This session builds a Hindley-Milner type checker with a couple of extensions.
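For a feel of how small the core of such an inferencer can be, here is a rough sketch of Hindley-Milner inference (unification plus let-polymorphism) for a tiny lambda calculus. It is not the session's code and includes none of its extensions. The representation is an assumption made for brevity: type variables are strings like `"t0"`, constructed types are tuples like `("int",)` or `("->", arg, result)`, and expressions are tagged tuples (`"lit"`, `"var"`, `"lam"`, `"app"`, `"let"`).

```python
from itertools import count

# Illustrative sketch only: representation and names are assumptions.
INT = ("int",)


def fn(a, b):
    """Function type a -> b."""
    return ("->", a, b)


class InferenceError(Exception):
    pass


class Inferencer:
    def __init__(self):
        self.subst = {}          # type variable name -> type
        self.fresh_ids = count()

    def fresh(self):
        return f"t{next(self.fresh_ids)}"

    def resolve(self, t):
        """Apply the current substitution to t, all the way down."""
        if isinstance(t, str):
            return self.resolve(self.subst[t]) if t in self.subst else t
        return (t[0],) + tuple(self.resolve(a) for a in t[1:])

    def free_vars(self, t):
        t = self.resolve(t)
        if isinstance(t, str):
            return {t}
        return set().union(*[self.free_vars(a) for a in t[1:]])

    def unify(self, a, b):
        a, b = self.resolve(a), self.resolve(b)
        if a == b:
            return
        if isinstance(a, str):
            if a in self.free_vars(b):  # occurs check
                raise InferenceError(f"infinite type: {a} in {b}")
            self.subst[a] = b
        elif isinstance(b, str):
            self.unify(b, a)
        elif a[0] != b[0] or len(a) != len(b):
            raise InferenceError(f"cannot unify {a} with {b}")
        else:
            for x, y in zip(a[1:], b[1:]):
                self.unify(x, y)

    def generalize(self, env, t):
        """Quantify the variables of t that are not free in env."""
        env_vars = set()
        for quantified, body in env.values():
            env_vars |= self.free_vars(body) - set(quantified)
        return (sorted(self.free_vars(t) - env_vars), self.resolve(t))

    def instantiate(self, scheme):
        """Replace quantified variables with fresh ones."""
        quantified, body = scheme
        mapping = {v: self.fresh() for v in quantified}

        def go(t):
            if isinstance(t, str):
                return mapping.get(t, t)
            return (t[0],) + tuple(go(a) for a in t[1:])

        return go(body)

    def infer(self, env, expr):
        kind = expr[0]
        if kind == "lit":
            return INT
        if kind == "var":
            return self.instantiate(env[expr[1]])
        if kind == "lam":
            _, param, body = expr
            tv = self.fresh()
            return fn(tv, self.infer({**env, param: ([], tv)}, body))
        if kind == "app":
            _, f, arg = expr
            f_t, arg_t = self.infer(env, f), self.infer(env, arg)
            result = self.fresh()
            self.unify(f_t, fn(arg_t, result))
            return result
        if kind == "let":
            _, name, bound, body = expr
            scheme = self.generalize(env, self.infer(env, bound))
            return self.infer({**env, name: scheme}, body)
        raise ValueError(f"unknown expression: {expr!r}")


def type_of(expr, env=None):
    inf = Inferencer()
    return inf.resolve(inf.infer(env or {}, expr))
```

With this sketch, `type_of(("lam", "x", ("var", "x")))` comes back as a function type whose argument and result are the same type variable, and `let id = \x -> x in (id id) 3` types as `("int",)` because `id` is generalized at the `let`. Self-application, `\x -> x x`, trips the occurs check and raises `InferenceError`.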

(3) WarStory: How I escaped Enterprise SW and became a Machine Learning Dev

Self explanatory ;-)


(4) Proof Technique for Programmers - A Developer's gateway to Mathematics (and Machine Learning)


This comes out of something I observed in the Bangalore Dev community. A lot of people read "Programming Collective Intelligence" (a terrible book - read my HN "review" here - I am "plinkplonk". See also comments by brent) and fancy themselves "Machine Learning" people ("we aren't experts but we know the basics". Ummm. No, you don't :-P.)

The sad truth is, you can't do any serious machine learning (or Computer Vision, or Robotics, or NLP, or anything algorithm heavy) development without high levels of mathematics. "Pop" AI books like PCI are terrible at teaching you anything useful.

To quote Peter Norvig from his review of Chris Bishop's Neural network book (emphasis mine)

"To the reviewer who said "I was looking forward to a detailed insight into neural networks in this book. Instead, almost every page is plastered up with sigma notation", that's like saying about a book on music theory "Instead, almost every page is plastered with black-and-white ovals (some with sticks on the edge)." Or to the reviewer who complains this book is limited to the mathematical side of neural nets, that's like complaining about a cookbook on beef being limited to the carnivore side. If you want a non-technical overview, you can get that elsewhere (e.g. Michael Arbib's Handbook of Brain Theory and Neural Networks or Andy Clark's Connectionism in Context or Fausett's Fundamentals of Neural Networks), but if you want understanding of the techniques, you have to understand the math. Otherwise, there's no beef. "

The "if you want understanding of the techniques, you have to understand the math" bit is true for all areas of ML, not just Neural networks. The biggest stumbling block (there are many ;-)) for most developers attempting to grok the underlying mathematics is the proof based learning method most higher level Math/Machine Learning books assume.

E.g. here is the *first* exercise of the *second* chapter of "Elements of Statistical Learning", which (unlike PCI) is a book you *should* read if you plan to do Machine Learning-ey things

"Suppose each of K-classes has an associated target t_k, which is a
vector of all zeros, except a one in the k-th position. Show that classifying to
the largest element of y amounts to choosing the closest target, min_k ||t_k − y||, if the elements of y sum to one."


This "Given X, Prove Y" structure is how almost all books in the field teach things. Sure you should code up the algorithms, but doing such problems is how you get *insight* into the field. And algorithms have their own problems (pun intended). Open Cormen et al's "Introduction to Algorithms" and you'll find questions like (randomly opening the third edition)

Problem 20.1 (e) Prove that, under the assumption of simple uniform hashing, your RS-vEB-TREE-INSERT (Note vEB tree == van Emde Boas tree) and RS-vEB-TREE-SUCCESSOR run in O(lg lg u) expected time.

Thus it turns out that for getting into many areas of interest, a knowledge of how to prove things is critical. You will make very slow or zero progress without that understanding. That is the bad news. The good news is, proofs are (relatively) easy for programmers to understand when presented the right way (acquiring skill takes a while). I wasted many years learning this stuff in inefficient ways. Don't make the same mistake.
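As a taste of what such a proof looks like, the ESL exercise quoted above comes down to expanding a squared distance. This is only a sketch, assuming the norm is the usual Euclidean one and writing y_j for the j-th element of y (notation not in the quoted exercise):

```latex
\|t_k - y\|^2
  = \sum_{j} (t_{kj} - y_j)^2
  = (1 - y_k)^2 + \sum_{j \neq k} y_j^2
  = 1 - 2 y_k + \|y\|^2 .
```

The terms 1 and ||y||^2 do not depend on k, so the target that minimises the distance is the one with the largest y_k. In other words, picking the closest target and picking the largest element of y are the same choice.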

Zero math background required. Just bring some paper to write on.


(5) Trika - A Hierarchical Reinforcement Learning framework in Scala

A demo and discussion on an RL framework I built. I haven't yet cleared the paperwork to Open Source this (the process is like pulling teeth, long story), but I can still show it off.

(6) Neuro genetic Algorithms - Theory and Applications


An interesting branch of AI/ML with some elegant applications. Again, a live demo of a couple of interesting algorithms and a discussion of design/performance trade-offs.

(7) Denotational, Operational and Axiomatic Semantics - Designing programming languages with mathematics

This is of interest to people building their own languages. Most language implementations are ad hoc "hacks". They don't have to be.

If you plan to attend, let me know which of these topics strike your fancy. And if you are a reader of this blog, find me and say Hello.

See you at DevCamp!