In my current job, my big data projects are moving forward.  Having had a small cluster to play with, the organization has finally seen the value in going all out.  That’s great, and now the fun begins: who gets to be on the team?  This is a conversation I get to have with many players in the organization, and some feel quite strongly that their group should be in charge.  I have my own opinions on that, some of which surprise people: I am driving the bus right now, but I don’t see my current role as the one that will ultimately drive it.


Clearly, if you want to create a team you need IT people.  You need people who can set up the physical cluster; that’s actually the easy part.  Then you need people who can manage MapReduce and the related systems, people who can talk to you about Hive, HBase, Pig, Cassandra, etc.  These are the tools that help you do the analysis work of big data (assuming you went the Hadoop route).  Finding these people is a little more difficult; it might require a bit of adjusting but shouldn’t delay you that much.


Then comes the analysis part, and that’s where everyone wants a seat at the table.  That is where the phrase “herding cats” can come into play if you do not have a solid plan.  It can also fall apart very easily if you do not have a strong leader in charge.


Most of your support team will be IT; they run the cluster and put the data into the system so you can do the analysis on the back end.  Some of the analysis may be done by an actual person, and some can be handed to machine learning.  Someone who knows machine learning may well sit in IT, but I find the best ones know stats and have a solid understanding of the problem you are trying to solve.  If you don’t have such a person around, the business owner is going to have to be that person and know enough about coding to understand what is possible and what is not.  Without a jack-of-all-trades on your team, you can still find programmers to do the actual coding, but you, as the business owner, need to be solid on your user stories so that the developers can do the best job possible.


Those are the basics you need.  There is more, but I try to keep these posts short.

