MapReduce and its discontents

I read this presentation some time ago.

But it wasn't until the course I'm taking this week that it became so clear to me.

Hadoop is undoubtedly useful in many scenarios, but is it the universal solution for today's Big Data landscape? I already have an opinion, but it may be too early to put it in writing. For now I'll stick with some of @deanwampler's opinions, specifically this slide; I felt exactly the same way today! To run even a trivial MapReduce example, you need to have written out all of these steps!

A damning quote!
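To make "the steps" concrete, here is a minimal sketch of Word Count split into the phases a MapReduce job forces you to spell out. It is plain in-memory Python, not the Hadoop API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are my own, chosen for illustration. In real Hadoop each phase would be a separate Mapper class, Reducer class and driver with job configuration, and the shuffle would be done by the framework across machines.

```python
from collections import Counter, defaultdict


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Driver: wire the three phases together (hypothetical helper)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


def word_count_direct(documents):
    """The same result written directly, without the phase boilerplate."""
    return dict(Counter(w for doc in documents for w in doc.lower().split()))
```

Both functions return the same counts for the same input; the difference is only in how much ceremony surrounds one line of actual logic, which is roughly the complaint in the slide.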

“I worked with EJBs a decade ago. Like EJBs, Hadoop has an invasive API that obscures your business logic and reusability. There were too many configuration options in XML files.

The framework “paradigm” is a poor fit for most problems (like soft real time systems and most algorithms beyond Word Count). Internally, EJB implementations were inefficient and hard to optimize, because they relied on poorly considered object boundaries that muddled more natural boundaries and created large-scale, monolithic modules with few abstractions for extension and optimization points.

I’ve also argued in other presentations and my “FP for Java Devs” book that OOP is a poor modularity tool…

The fact is, Hadoop reminds me of EJBs in almost every way. It works okay and people do get stuff done, but just as the Spring Framework brought an essential rethinking to Enterprise Java, I think there is an essential rethink that needs to happen in Big Data. The FP community is well positioned to create the next generation.”

“The “next gen.”, V2.0 Hadoop fixes some problems, but I think this first-generation infrastructure has too many flaws to be dominant for a long time (at least outside large enterprises that always stick with suboptimal solutions sold by big-name players). The Java-OO eccentricities, the overly-large and bloated modules, the premature optimization (and missing optimizations that wouldn’t be premature), lead me to believe that Hadoop will be displaced the same way that the Spring Framework displaced EJBs.”

And in a world that is more familiar to me, Spring Data already includes a project for working with Hadoop: Spring Data Apache Hadoop, still at M2, and it is true that it simplifies the configuration.

But will it be enough? Or has Google got it right again, given that it is already moving from MapReduce to Pregel? We'll keep investigating 😉
