Author archive

DSS Expert Network is a group of professionals with expertise in Big Data and Data Science who belong to the largest community of Data Scientists in Spain and regularly contribute to the Synergic Partners blog, sharing their knowledge of the sector.

17
Feb
ØMQ: High-Performance Asynchronous Messaging
  • Data Science Spain Expert Network
  • 9191 Views
  • 1 Comment
  • No tags

A recurring problem in data engineering projects is connecting different software components so that they can communicate with each other (data flows, synchronization, event creation and consumption, etc.). The problem gets worse when we need to connect code written in different languages or running on different kinds of platforms (especially in distributed systems). ZeroMQ (ØMQ) is a messaging system that addresses these problems. Originally designed for asynchronous, high-speed data transfer systems, ZeroMQ can connect code in almost any language (it has bindings for more than 40 languages) and on any platform (GNU/Linux, Windows, OS X). In fact, its creators maintain that it goes further, offering a true concurrency framework, since…

02
Apr
NEW RULES: HOW BIG DATA SCIENCE IS A DIFFERENT GAME (IV)
  • Data Science Spain Expert Network
  • 14087 Views
  • 2 Comments
  • Big Data . Data Science . Technology .

5. New Technology
In a nutshell, the most important big data technologies fit one of the following categories:
New requirements: NoSQL
Traditional database management systems are very efficient at storing and finding tabular data, and also at basic operations such as aggregating and linking. At least three different causes motivate the need for new paradigms.
- Scalability: The need to distribute the database over many nodes means that the "CAP theorem" (http://en.wikipedia.org/wiki/CAP_theorem) applies. This limitation creates three different classes of databases, depending on which of the three restrictions is relaxed.
- Analytics: For machine learning algorithms, we do not need to find part of the data quickly. Instead, we have to pass all the data to some…

26
mar
NEW RULES: HOW BIG DATA SCIENCE IS A DIFFERENT GAME (III)
  • Data Science Spain Expert Network
  • 14135 Views
  • 1 Comment
  • Big Data . Data Science .

3. Big Data Analytics. Two ideas illustrate how big data analytics differs from small data analytics. Samples vs. large datasets: Before big data, analytics was usually about trying to infer some traits of a population based on the same traits measured on a sample. The question was how significant the factor was; in other words, how sure someone could be that it could not be attributed to a fluke. When we analyze the whole population, there is no inference anymore. The new question is about the relevance of the effect we are now measuring (not inferring). Still, there are further questions (dimensionality, predictive power, etc.) that are far from trivial and do require statistical analysis. Dimensionality and dataset size: Another…
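The contrast between inferring from a sample and measuring the whole population can be illustrated with a minimal Python sketch. The data here are synthetic and purely hypothetical (normally distributed values with an assumed mean of 170 and standard deviation of 10); the 1.96 multiplier is the usual normal approximation for a 95% interval, not something taken from the post.

```python
import random
import statistics

# Hypothetical synthetic population; the mean and spread are illustrative assumptions.
random.seed(0)
population = [random.gauss(170.0, 10.0) for _ in range(100_000)]

# Small-data setting: we only see a sample, so the mean must be inferred
# and reported together with its uncertainty.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5
confidence_interval = (
    sample_mean - 1.96 * standard_error,
    sample_mean + 1.96 * standard_error,
)

# Whole-population setting: the mean is measured directly, with no sampling error.
population_mean = statistics.mean(population)

print(f"sample estimate: {sample_mean:.2f}, approx. 95% CI: "
      f"({confidence_interval[0]:.2f}, {confidence_interval[1]:.2f})")
print(f"measured on the whole population: {population_mean:.2f}")
```

With the sample, the interesting quantity is the width of the interval; with the full population, the interval disappears and what remains is whether the measured value matters in practice.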

18
mar
NEW RULES: HOW BIG DATA SCIENCE IS A DIFFERENT GAME (II)
  • Data Science Spain Expert Network
  • 13334 Views
  • 0 Comments
  • Big Data . Data Science .

2. Think stochastic! Statistics is here to stay. There is also a trend to announce the end of statistics; see, for example, The end of Statistics (1) http://www.datasciencecentral.com/profiles/blogs/data-science-the-end-of-statistics and The end of Statistics (2) http://www.kdnuggets.com/2013/04/data-science-end-statistics-discussion.html. This is understandable, since data science is a recent field that people from different backgrounds try to redefine to match their own skills, and people tend to underestimate their unknown unknowns. Probability theory is the only framework mankind has for giving proper definitions not just to the most important ideas ever defined (including matter and energy), but also to information. Information has different definitions (Shannon's, Fisher's, and Kolmogorov's, to mention just the most important), all of them based on probability theory. Also, knowledge engineering has become more…
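As a small illustration of how one of the definitions of information mentioned above rests on probability, here is a minimal Python sketch of Shannon entropy computed from the empirical distribution of a sequence. The function name `shannon_entropy` and the coin-flip strings are hypothetical choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the Shannon entropy, in bits, of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # H = sum over outcomes of p * log2(1 / p), where p is the observed frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


fair_coin = shannon_entropy("HT")        # two equally likely outcomes
biased_coin = shannon_entropy("HHHT")    # one outcome three times as likely
constant = shannon_entropy("HHHH")       # a single certain outcome

print(fair_coin, biased_coin, constant)
```

The more predictable the source, the lower the entropy: a certain outcome carries no information, while two equally likely outcomes carry exactly one bit.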

12
mar
NEW RULES: HOW BIG DATA SCIENCE IS A DIFFERENT GAME (I)
  • Data Science Spain Expert Network
  • 12803 Views
  • 0 Comments
  • Big Data . Data Science .

In this first post, I introduce some key ideas on big data science to serve as a roadmap for future articles on specific parts. The rules have changed, and some technologies considered today as "the way to go" may not survive, but the causes motivating the changes are here to stay and the change is real. Let's talk about these causes in plain language, avoiding technicalities as much as possible. 1. Big Data is not hype, it's all about scalability. The motivation for big data is not just that data is growing (it is), but rather our inability to build faster computers to process that data. In the last five years, Moore's law, a trend started in the seventies stating that CPU capacity…

20
Feb
The First Data Science Community in Spain
  • Data Science Spain Expert Network
  • 13032 Views
  • 1 Comment
  • Community . Data Science . Meet Up . Spain .

Data Science Spain (or simply DSS) is a community of Data Scientists who work mainly in Spain. It is open to professionals and students from anywhere in the world who are passionate about Data Science and technology. Content can be posted or answered in either English or Spanish. DSS is a meeting point for: exchanging experiences; sharing success stories; discussing methodology and technology; analyzing data-related products; offering or seeking employment or professional growth in Data Science; sharing tutorials and asking for help. Santiago Basaldúa
