4links: Data Scientist’s ‘janitor work’, technologies used by startups & development outsourcing + CloudDiagram

My 3 links for today are:

– For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights: This NYT piece highlights that, while incredibly fancy tech tools exist to master the algorithmic and big-data aspects of a data scientist’s work, the starting point of every project is a step called ‘data wrangling’: preparing data so that it is comparable and computable for processing. My hope is that by applying techniques from the Semantic Web area (ontologies, etc.), people will in future have to spend less time on basic wrangling and can focus more on the interesting parts of projects.

– AngelList is a database for the startup ecosystem, including startups, VCs, locations… (similar to Crunchbase, though Crunchbase focuses more on financial transactions). Happily, AngelList (like Crunchbase) has a fairly open API for retrieving its data… and GeekTime published “Which Technologies Do Startups Use? An Exploration of AngelList Data.” Not very surprisingly, AWS and Heroku are quite dominant in IT infrastructure; more surprising is the clear advantage JavaScript/Node.js has among many startups.

– You might come to the point where your development capacity needs to scale… and you are not the first who wants to tap the enormous dev talent potential in Eastern Europe. Tips for Outsourcing Web Development to Eastern Europe has some good hints on this one.

A shameless self-plug: I have started gathering interest in a solution that creates a large-scale printed diagram of your AWS cloud infrastructure. If you’re interested, feel free to head to http://clouddiagram0.datenprodukt.com/ .
