It is a fact: the amount of data generated each year is exploding. To handle these large volumes, whether they come from business applications or from external sources, companies must equip themselves with several analytical tools. Some are now turning to HANA and its Big Data tools.
“Big Data can only grow and expand, with ever-growing sources of data, both internal and external,” says a report by Forrester Research (“Ultra-Fast Data Access Is Key to Unleashing Full Big Data Potential”).
Companies must therefore put in place “a modern strategy for data analysis. It must offer a layer of real-time, permanent access to all relevant data, regardless of the source,” notes the report.
To address this, SAP is investing to provide businesses with advanced analytical tools based on HANA, its in-memory columnar database system, says Anne Moxie, Senior Analyst at Nucleus Research.
It is a view shared by Werner Hopf, CEO of Dolphin Enterprise Solutions, an SAP partner located in the United States. “In the last two or three years, SAP has been working to expand the capabilities of HANA, so that the database can be used as a foundation for transactional systems,” he says.
For example, last September, SAP announced HANA Vora, a new in-memory query engine for Hadoop aimed at businesses that need to manage distributed Big Data systems, says Anne Moxie.
On its own, however, HANA is not ideally suited to very large volumes of data, because it is not cost-effective to keep large amounts of data in memory, says John Appleby, managing director of Bluefin Solutions, a British SAP partner. “We are delighted that SAP has moved closer to Hadoop.”
HANA Vora, HANA’s Big Data Tool
HANA Vora, available since March, allows companies to analyze data stored in Hadoop as well as in enterprise systems or other distributed sources, according to SAP. HANA Vora relies on Apache Spark, which SAP has adapted for its platform, to offer interactive analytics on both internal data and data stored in Hadoop, so that more elements can be brought into the analysis.
“To position HANA for Big Data projects, SAP has brought HANA closer to Hadoop data,” said Anne Moxie. “Connecting different data sources and associating company data with data stored in Hadoop gives you a true unified view. With this, data scientists have access to all data for their work.”
CenterPoint Energy, a natural gas distributor based in Houston, Texas, is one of the first SAP customers to implement HANA and HANA Vora to gather data stored in a highly distributed environment.
With Hadoop, CenterPoint Energy can reduce the costs associated with increasing storage capacity, and with Vora, it can leverage analytics to make better decisions, says SAP. It should be noted that CenterPoint Energy collects data from its sensors every 15 minutes and generates reports on energy use levels, which drives up storage costs.
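To give a sense of what a 15-minute collection interval means in volume terms, here is a rough back-of-the-envelope sketch. Only the 15-minute interval comes from SAP's account; the sensor count and the size of each reading are hypothetical placeholders, not CenterPoint Energy figures.

```python
# Rough estimate of reading volume for sensors sampled at a fixed interval.
# Only the 15-minute interval comes from the article; the sensor count and
# record size used in the example call are hypothetical placeholders.

MINUTES_PER_DAY = 24 * 60


def readings_per_day(interval_minutes: int) -> int:
    """Number of readings a single sensor produces per day."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return MINUTES_PER_DAY // interval_minutes


def yearly_volume_bytes(sensors: int, interval_minutes: int, bytes_per_reading: int) -> int:
    """Raw (uncompressed) bytes produced by all sensors over 365 days."""
    return sensors * readings_per_day(interval_minutes) * 365 * bytes_per_reading


# Hypothetical inputs: one million sensors, 100 bytes per reading.
total = yearly_volume_bytes(sensors=1_000_000, interval_minutes=15, bytes_per_reading=100)
print(f"{readings_per_day(15)} readings per sensor per day")
print(f"{total / 1e12:.2f} TB per year under these assumed inputs")
```

Whatever the real inputs are, the volume grows linearly with the number of sensors and with the sampling frequency, which is why keeping the full history in main memory quickly becomes expensive.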
In six weeks, SAP and CenterPoint Energy developed a test environment that processed more than 5 million data points with Hadoop, HANA and HANA Vora, says SAP. The distributor ultimately chose to implement HANA and capitalize on SAP technology. “Our first tests proved that the HANA and HANA Vora tandem was the right way to change our operational management,” said Gary Hayes, CIO and Vice President of CenterPoint Energy, in a statement.
HANA Vora can handle both structured and transactional data, says Irfan Khan, CTO at SAP. “But by deploying Vora on a Spark cluster, and on Hadoop storage, we can run several types of tasks from HANA,” he continues. The result is “a much more coherent vision of ongoing activities”.
HANA Vora is a “first-class citizen” in Spark. This allows SAP to push some very specific analytic workloads into Spark, or to retrieve contextual information from within the transactional core to obtain better performance metrics on customers, says Irfan Khan.
When it comes to Big Data analysis, the main problem with in-memory systems such as HANA is the cost/value ratio, comments Werner Hopf of Dolphin. Main memory is expensive, and companies quickly reach a volume of data at which the costs exceed the gains obtained from analytics.
That’s why Hadoop support was key to making HANA a component of Big Data projects, he notes. Integrating some HANA database modules into the HANA Vora front-end analytics and placing this layer on top of Hadoop and Spark “allows customers to perform high-performance analysis on data stored in large Hadoop data lakes,” he adds.
Hadoop and Spark, essential for HANA in Big Data
According to Anne Moxie, associating HANA Vora with Hadoop and Spark represents a key step that allows companies to access all their data. With the promise of the Internet of Things, Spark will be very useful for distributed processing and data mining. “Spark is very critical for IoT-related applications, and the associated analytics, but HANA Vora has the ability to facilitate many of these projects, allowing companies to analyze their data more easily,” she says.
One example is an SAP customer in the agriculture sector, which relies on sensors installed on its land as well as satellite images to predict its sugar cane yields. The data from these sensors and the satellite images is stored in Hadoop and analyzed by HANA Vora and HANA; the latter is used to run predictive analyses intended to optimize the use of water and fertilizer and to obtain better yields.
According to John Appleby (Bluefin), his customers are very interested in using Vora to manage the information lifecycle. Companies that run SAP ERP or a similar solution want to move their information, in read-only mode, to so-called “cold” storage for legal or business reasons. He expects SAP to clarify its roadmap for information lifecycle management in September.
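The idea of moving older information to “cold” storage can be illustrated with a minimal sketch. This is not SAP's API or Vora's actual mechanism: the `Record` type, the tier names and the one-year threshold are all hypothetical, chosen only to show an age-based tiering rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    """Hypothetical business record; only its age matters in this sketch."""
    record_id: str
    last_modified: date


def assign_tier(record: Record, today: date, hot_days: int = 365) -> str:
    """Return "hot" for recently modified records and "cold" for older ones.

    The 365-day default is an assumed policy, not a value from the article.
    """
    age = today - record.last_modified
    return "hot" if age <= timedelta(days=hot_days) else "cold"


def split_by_tier(records, today: date, hot_days: int = 365) -> dict:
    """Group record ids by the tier each record would be assigned to."""
    tiers = {"hot": [], "cold": []}
    for record in records:
        tiers[assign_tier(record, today, hot_days)].append(record.record_id)
    return tiers


sample = [
    Record("invoice-1", date(2016, 5, 1)),
    Record("invoice-2", date(2012, 1, 1)),
]
print(split_by_tier(sample, today=date(2016, 6, 1)))
```

In this kind of scheme, the “hot” records would stay in the in-memory database while the “cold” ones would be kept read-only in cheaper storage such as Hadoop, yet remain reachable for analysis.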
For Irfan Khan, however, SAP’s Big Data strategy for HANA will also focus on closer integration with open source technologies, as it has with Spark. “This in itself is very relevant to our customers because none of them want to work with siloed data. They want to have a unified and consistent view and that’s exactly what we do,” he concludes.