Big data models are promising, not perfect

Big data has become vital in the modern age. Trend analysis drives company decisions, while consumer choices are analyzed to tailor advertisements and purchase recommendations to individual users. Social media companies track the pages their users are interested in to personalize the content they see. Medical data can be analyzed to track diseases. Weather data can be used to predict storms. The list goes on and on. Data analysis has become the bread and butter of the information-driven era.

With the ever-increasing amount of available data, it has only become more important to simplify the process of analyzing it.

This is what researchers from the University of Cordoba's Department of Computer Science and Numerical Analysis have been working on. Their work focuses on building models that predict several output variables at once from a shared set of inputs. In other words, they reduce the amount of data a model needs to analyze by having the computer learn which qualities of the input data actually matter.

Sebastian Ventura, one of the researchers, points out that reducing the data is a much cheaper and more effective strategy than simply increasing computing power, and that the reduction brings a few key benefits. First, simplifying the data set can make the model more accurate, since extraneous data that do not affect the predictions are removed. Second, with less data to process, less computing power is required, which in turn increases the efficiency and the rate at which models can be built.
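The article does not detail the researchers' actual algorithms, but the general idea of pruning uninformative inputs before training can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the Cordoba team's method): it drops feature columns whose variance falls below a threshold, on the assumption that near-constant features carry no predictive signal, so a downstream model sees fewer columns.

```python
def reduce_features(rows, min_variance=1e-3):
    """Keep only the feature columns whose variance exceeds min_variance.

    rows: list of equal-length feature lists (one list per sample).
    Returns (reduced_rows, kept_indices).
    Illustrative only -- real pipelines use richer selection criteria.
    """
    n = len(rows)
    n_cols = len(rows[0])
    kept = []
    for j in range(n_cols):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        if var > min_variance:
            kept.append(j)
    reduced = [[r[j] for j in kept] for r in rows]
    return reduced, kept

# The near-constant middle column carries no signal and is removed,
# so each sample shrinks from 3 features to 2.
data = [[1.0, 5.0, 0.2],
        [2.0, 5.0, 0.9],
        [3.0, 5.0, 0.4]]
reduced, kept = reduce_features(data)
# kept == [0, 2]
```

With fewer columns, a model trained on `reduced` fits faster and is not distracted by inputs that never change, which mirrors the two benefits described above.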

With the sheer amount of data the world produces, having more efficient predictive algorithms is key to processing it, which has driven a global trend towards modernizing predictive models. Besides individual researchers working on their own projects, many government organizations are also funding big data projects. For example, the E.U. funded the BigStorage project in 2015 with the goal of developing new approaches to data analytics concepts, providing 3.8 million euros to finance the project.

It is also important to remember that there are many variables computers still cannot account for. Big data is not infallible, and researchers should be careful when making bold claims about its versatility.