Easy Big Data Analytics:
ParAccel Hadoop Analytics

Easy Point-and-Click Platform for Analytics and Data Preparation

Data mining and big data analytics will create strategic advantages, but only if you can leverage your data quickly and economically. This means getting the best use possible from inexpensive commodity hardware. It also means making analytics model development accessible to data analysts and data scientists, not just programmers. Big data analytics needs to be easy, but there can be no compromises on scalability, compute time or ability to deploy analytics on multiple platforms.

ParAccel Hadoop Analytics delivers on the promise of no compromises, with a single platform for end-to-end data access, transformation, analysis, visualization and delivery. Hadoop Analytics eliminates memory constraints, as well as the need to move data into specific data stores before analytics are run. Now data scientists and business analysts can transform, cleanse and analyze terabytes of data, turning it into actionable insights at record-breaking speed on commodity hardware. Hadoop Analytics can handle both data streams and data stores of various types, including Hadoop (HDFS and HBase).

We believe that speed to value is the missing link in most big data analytics software. Lots of products advertise fast data crunching, but rapid model prototyping and iteration are even more essential to getting the best answers than final execution speed. Getting the accurate answers you need when you really need them is the key.

Get it faster. Prep it faster. Model it faster. Deploy it faster. Execute it faster.

Actian Analytics Software

Whether your data resides in flat files, SQL, NoSQL, HBase or the Hadoop filesystem (HDFS), we’ve got you covered. Actian's ParAccel Hadoop Analytics can read from and write to any combination of sources simultaneously.

Before analysis, do you need to check aspects of your data against business rules? Check mins, maxes, averages and such?  Hadoop Analytics takes the limits off the number of metrics you can check without bogging down your hardware. 
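
The kind of check described above can be illustrated outside the product. What follows is a minimal Python sketch, not ParAccel Hadoop Analytics code: the column names, sample rows and rule thresholds are hypothetical, and the product's own profiling operators are not shown. It collects the min, max and mean of each column in one pass, then tests them against simple range rules.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Running min, max and mean for one numeric column."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else float("nan")


def profile(rows, columns):
    """Collect stats for every requested column in a single pass over rows."""
    stats = {name: ColumnStats() for name in columns}
    for row in rows:
        for name in columns:
            stats[name].update(float(row[name]))
    return stats


def check_rules(stats, rules):
    """Return readable violations of (column, metric, low, high) rules."""
    violations = []
    for column, metric, low, high in rules:
        value = getattr(stats[column], metric)
        if not (low <= value <= high):
            violations.append(f"{column}.{metric}={value} outside [{low}, {high}]")
    return violations


# Hypothetical sample data and business rules, for illustration only.
ROWS = [
    {"order_total": 20.0, "quantity": 1},
    {"order_total": 35.5, "quantity": 3},
    {"order_total": 410.0, "quantity": 2},
]
RULES = [
    ("order_total", "maximum", 0.0, 250.0),  # no single order above 250
    ("quantity", "mean", 1.0, 5.0),          # average quantity between 1 and 5
]

if __name__ == "__main__":
    for problem in check_rules(profile(ROWS, ["order_total", "quantity"]), RULES):
        print(problem)
```

Because every metric is updated in the same pass, adding another rule costs one more comparison per row rather than another scan of the data.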

Need to find matches in data with similar names, addresses or ID numbers? The DataMatcher operators let you build a high-speed solution for matching inaccurate, inconsistent and duplicate data.
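
To give a feel for what matching inconsistent records involves, here is a short Python sketch using only the standard library. It does not reproduce the DataMatcher operators or their algorithms: the similarity measure (difflib's `SequenceMatcher`), the 0.85 threshold and the sample records are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] computed on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Compare every pair and return (i, j, score) for pairs at or above threshold."""
    matches = []
    for (i, left), (j, right) in combinations(enumerate(records), 2):
        score = similarity(left, right)
        if score >= threshold:
            matches.append((i, j, round(score, 2)))
    return matches


# Hypothetical records, for illustration only.
RECORDS = [
    "John Smith, 12 Main St.",
    "Jon Smith, 12 Main St",
    "Mary Jones, 4 Oak Ave",
]

if __name__ == "__main__":
    for i, j, score in likely_duplicates(RECORDS):
        print(f"{RECORDS[i]!r} ~ {RECORDS[j]!r} (score {score})")
```

Comparing every pair grows quadratically with the number of records, which is one reason matching at scale benefits from a parallel engine.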


ParAccel Hadoop Analytics also lets you fill in missing values, sort, aggregate and apply any other transformations needed to improve data quality and prepare the data you need, whether it's in a data store or streaming by on the fly. Whether you're dealing with big data or a normal analytics data set that might laughably be called "small," you can crunch through it fast, so you can move on to the real work. Analysis.

Download Free Trial of ParAccel Hadoop Analytics

If data preparation is all you need, then read about data preparation for data mining and analytics with ParAccel Hadoop ETL/DQ.

The analytics library in Actian's ParAccel Hadoop Analytics supports data mining for marketing, surveillance, fraud detection, cybersecurity and scientific discovery. Build a recommender system, market basket analysis, risk analysis, customer churn analysis and more using Hadoop Analytics' extensive library of analytics operators.

  • Association Rule Mining, Affinity Analysis
  • Classifiers: Decision tree, K-nearest-neighbors, Naïve Bayes, Support Vector Machine (SVM)
  • Clustering: Recommender learner and predictors, k-means
  • Feature Selection: Principal Component Analysis (PCA)
  • Regression Analysis

Actian's ParAccel Hadoop Analytics integrates directly with existing toolsets to solve problems that require faster analysis. Hadoop Analytics uses the industry-standard Predictive Model Markup Language (PMML) as either an input or an output format. Take models developed in SAS, SPSS and other tools that use PMML and run them using Hadoop Analytics' highly parallelized operators to get answers faster, without buying more hardware. Or use the intuitive design interface and rapid test-and-iterate cycle of ParAccel Hadoop Analytics to generate the models, then visualize them in another tool that uses PMML.
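
To make the PMML exchange more concrete, the Python sketch below reads a linear regression from a small PMML document and scores one record. It is not how Hadoop Analytics consumes PMML: the model, field names and coefficients are invented, and only the simplest case is handled (the first `RegressionTable`, with `NumericPredictor` terms and the default exponent of 1).

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML document encoding y = 1.5 + 2.0 * x1 + 0.5 * x2.
PMML_TEXT = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="illustrative linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def load_regression(pmml_text: str):
    """Return (intercept, {field: coefficient}) from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    for element in root.iter():
        if local_name(element.tag) == "RegressionTable":
            intercept = float(element.get("intercept", "0"))
            coefficients = {
                child.get("name"): float(child.get("coefficient"))
                for child in element
                if local_name(child.tag) == "NumericPredictor"
            }
            return intercept, coefficients
    raise ValueError("no RegressionTable found")


def score(record, intercept, coefficients):
    """Evaluate intercept + sum(coefficient * value) for one record."""
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())


if __name__ == "__main__":
    intercept, coefficients = load_regression(PMML_TEXT)
    print(score({"x1": 2.0, "x2": 4.0}, intercept, coefficients))
```

Matching on local tag names keeps the sketch independent of the PMML version's namespace; a production scorer would validate against the schema instead.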

Working with R for statistical computing? ParAccel Hadoop Analytics can do the heavy lifting on big data preparation and flow the output straight to your R code, vastly reducing overall execution time. You can also run snippets of R code as just another step of the workflow, open R views or even learn models within R. Be sure to compare R analytics operators with similar Hadoop Analytics operators. In many cases, the dataflow-optimized operators will give you an equally accurate answer in far less processing time.

Display data graphically in interactive charts that can be modified on the fly. Scatter plots, line graphs, bar graphs, dashboards and more let you visualize data and spot trends instantly. BIRT and R provide a wide variety of open-source options.

Linear Regression

Output in PMML or through JDBC to visualization tools such as Tableau and Actuate to expand your visualization options.

BIRT 360 dashboard

Actian's ParAccel Hadoop Analytics is built on the patented ParAccel DataFlow Engine to ensure scalability and future-proof your analytics workflows. The ParAccel DataFlow Engine automatically detects and uses all cores and nodes available at runtime, up to a configurable limit. Execution moves seamlessly from desktop to server to cluster, without the need to modify code, redesign models or recompile.

For example, an application written on a 4-core desktop will automatically scale to take full advantage of the additional resources when installed on a 16-core server or a cluster of 100 8-core machines. Every bit of current hardware will be used to its fullest capacity, with nothing wasted. Organizations can simply add more compute resources to keep up with growing data volumes over time.
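
The idea of sizing work to whatever cores are present at runtime, up to a cap, can be sketched in a few lines of Python. This is only an analogy on a single machine and says nothing about how the ParAccel DataFlow Engine is implemented; the function names and the `limit` parameter are hypothetical. Threads keep the example short, although CPU-bound Python code would normally use processes to get real parallelism.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(limit=None):
    """Use every core the runtime reports, capped by an optional limit."""
    available = os.cpu_count() or 1
    return available if limit is None else max(1, min(available, limit))


def partition(items, parts):
    """Split items into `parts` contiguous, near-equal chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, limit=None):
    """Same call on any machine; the worker count is decided at runtime."""
    workers = worker_count(limit)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(v * v for v in chunk),
                            partition(values, workers))
        return sum(partials)


if __name__ == "__main__":
    print(worker_count(), "workers available")
    print(parallel_sum_of_squares(list(range(1000)), limit=8))
```

The calling code never mentions a core count, so the same script runs unchanged on a laptop or a larger server and simply uses more workers where more are available.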

Read more about the ParAccel DataFlow Engine

ParAccel Hadoop Analytics doesn't require Hadoop to run. It can run on any operating system with a JVM, including Windows, Mac, Linux and most varieties of UNIX. If you need to run on Hadoop, though, it runs natively on all major Hadoop distributions:

  • Apache
  • Cloudera
  • Hortonworks
  • MapR
  • IBM BigInsights

For more information, see our Hadoop Solutions page. 

Do you need an analytics or data preparation operator that Actian didn't build?

Actian's ParAccel Hadoop Analytics includes a full set of data preparation and analytics operators that, because they are built directly with the ParAccel DataFlow Engine API, are fully optimized for running on multi-core and distributed systems, providing automatic scaling and extreme levels of performance.

If you need a parallel-optimized dataflow operator that is not included in Hadoop Analytics, you can use the ParAccel DataFlow Engine API to develop custom operators, using either the simple ParAccel DataFlow Engine JavaScript interface or any JVM language, such as Java, JRuby, Groovy, Jython or Scala. Your custom operators will have the same automatic scaling and extreme performance, thanks to the ParAccel DataFlow Engine framework.

Also, keep in mind that standard KNIME operators can be mixed with the Actian parallel-optimized operators in ParAccel Hadoop Analytics to give a tremendous breadth of functionality. Most operators created by KNIME or by KNIME partners are not optimized for multi-core or parallel execution. However, KNIME and Actian partnered to improve the KNIME API so that KNIME operators can be made "flowable" and take advantage of some of the speed of the ParAccel DataFlow Engine's data processing framework.

Pervasive RushAnalytics for KNIME Demo
Watch Hadoop Analytics Demo
(formerly Pervasive RushAnalyzer)

 


Machine Learning Example
 
 

“ParAccel Hadoop Analytics holds a lot of appeal for analytics gurus looking for both design-time and execution-time productivity. The KNIME workflow environment is very accessible and easy to use, and the combined KNIME and Actian ParAccel nodes and operators provide a lot of flexibility with pre-existing building blocks for both ETL and analytics. The highly parallel Actian execution engine now makes it practical to apply the beauty of the KNIME workflow environment to large datasets that would have been difficult or even impossible to tackle otherwise.”

Dean Abbott 
President
Abbott Analytics

 

Accelerating Big Data 2.0™