Big Data is everywhere. Even small to medium-sized businesses are seeking ways to gain more visibility into processes, monetize additional streams, and derive more actionable insights from their data. With data traditionally contained in information silos within applications or databases, taking advantage of Big Data was initially a tedious and complex process. But thanks to Big Data tools, Big Data management can now be streamlined in a comprehensive dashboard.
Sophisticated platforms enable end-to-end data management and business intelligence with solutions for gathering, integrating, analyzing, and even predicting data in ways never before possible. Listed in no particular order of importance, the following Big Data tools for developers offer platforms for rapid deployment of apps, the ability to integrate data collection and analysis from multitudes of sources and applications, and even the integration of offline and online data to put actions and events into context.
@SpliceMachine A real-time SQL-on-Hadoop database, Splice Machine takes Big Data beyond analytics with the ability to derive real-time, actionable insights for rapid decision-making. Not only can Splice Machine process real-time updates, but it offers the ability to utilize standard SQL and is capable of scaling out on commodity hardware. Splice Machine can be used in circumstances where MySQL or Oracle can’t scale.
SQL-99 compliant, with standard ANSI SQL
Easily scales from gigabytes to petabytes using cost-effective, commodity hardware
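Because Splice Machine is described as accepting standard SQL, queries written for a conventional relational database should in principle carry over with little change. The sketch below is only an illustration of that idea: it uses Python's built-in sqlite3 module as a local stand-in (it is not Splice Machine), and the `orders` table and its columns are hypothetical.

```python
import sqlite3

# sqlite3 is a local stand-in only; the `orders` table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region VARCHAR(16), amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 40.0)],
)

# A plain aggregate query that relies only on standard SQL constructs.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The point of the sketch is that the query itself contains nothing vendor-specific, so the same statement is the kind of SQL a standards-compliant engine is expected to accept.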
@PalantirTech Palantir was founded in 2004 by a group of former PayPal employees and Stanford computer scientists. The company has doubled in size every year to date, but strives to maintain its startup culture. Offering a suite of Big Data solutions for integrating, visualizing and analyzing information, Palantir’s product line emphasizes scalability, security, ease of use, and collaboration. Palantir’s solutions are most commonly used in intelligence, defense, financial and law enforcement applications, but the company is quickly growing in other verticals.
Solutions for integrating, visualizing and analyzing data
Serves a multitude of industries with custom solutions
Exploit and analyze data
Extract data from multiple sources
Privacy and data protection policies
Simplify workflows by integrating data into a single dashboard
@Attivio As enterprises are coping with a broader variety of information sources, eliminating information silos is critical to gaining comprehensive insights and identifying key relationships among data. Attivio’s Active Intelligence Engine combines Big Data and Big Content to analyze everything, including human-generated text through advanced text analytics. Combined with universal indexing and automatic ad-hoc JOIN, Attivio is a powerful solution for making valuable connections between all your data.
Combines Big Data and Big Content
Eliminates information silos
Adds context and signals from human-generated information sources
Mortar is a “general purpose platform for high-scale data science” designed to help data scientists spend more time analyzing their data and deriving actionable insights, instead of dedicating valuable time to building infrastructure and re-configuring systems. With Mortar, you can build a custom recommendation engine in days, not months.
Open-source tools for building a recommendation engine
SAP’s HANA platform can be combined with Apache Hadoop for the ability to integrate and analyze massive loads of data in real time. The platform makes it possible to derive actionable insights by making valuable connections between all types of information, from a multitude of sources. Combine SAP HANA with applications that leverage Big Data insights to quickly create additional revenue streams and improve operations.
Flexible data management for all types of data
Discover insights with analytics solutions
Runs processes 1,000 to 100,000 times faster in-memory
SAP IQ analytics holds the Guinness World Record for data loading
Collecting, integrating, and analyzing Big Data doesn’t have to be a major effort. Cambridge Semantics makes it all possible with the Anzo Software Suite, an open platform for building Unified Information Access (UIA) solutions. That means replacing the information silos that leave data isolated and useless with a powerful, seamless data integration machine that streamlines data collection and enables sophisticated analysis for rapid decision making. And you can implement all this within hours or days — not the typical weeks or months required for an initiative at this level.
Combine data from a multitude of sources
Customized, interactive web dashboards for analysis
Share spreadsheets in sync automatically
Useful for CRM, billing, project management and more
MarkLogic is built to support the world’s biggest data loads, bringing all types of relevant content back to users who can turn it into action. With real-time updates and alerts, connections between information make new opportunities immediately obvious. MarkLogic is ideal for enterprises that count on revenue through paid content search. With geographic data combined with content, location relevance is built in, and geographic boundaries make advanced data filtering possible.
Syncsort offers a range of products and solutions to help you tap into Big Data. With solutions for Hadoop, Linux, Unix, Windows, and mainframe environments, Syncsort’s product lineup can meet practically any configuration need. A GUI-based solution, Syncsort enables developers to create solutions for collecting, processing, and distributing more data in less time.
Solutions for Hadoop, Mainframe, Windows, Linux, Unix
Lowers the barriers to Hadoop adoption
Eliminates the need for custom code for Hadoop implementation
DataStax helps companies like Netflix, Healthcare Anytime, eBay, and even Adobe harness the power of Big Data with less effort and at a lower cost than traditional solutions. Tapped as the first alternative to Oracle, DataStax provides the constant uptime and lightning speed required for modern customer-facing applications. When you need the capacity to handle massive data loads at maximum speed for real-time analysis, DataStax packs a major punch with a robust visual query tool for developers.
Visual query tool for developers
Create and run Cassandra Query Language (CQL) queries and commands
Visually navigate and interact with data clusters
Works with DataStax Community and Enterprise editions
Need to create a rich, engaging and meaningful customer experience? Guavus drives better decision making with powerful analytics capabilities combined with advanced data science and the ability to distill data in real time to derive actionable insights at the precise moment of opportunity. Through continuous correlation and ongoing analysis, vast amounts of static and dynamic data are handled with ease, revealing opportunities to generate more revenue, reduce overhead costs, and monetize new streams.
Analyze-First Analytics Architecture
Analyze high-volume data streams in near real time
An open-source document database, MongoDB is ideal for developers who want precise control over the final results and processes for handling Big Data. With full index support, you have flexibility to index any attribute and scale horizontally without compromising functionality. With rich, document-based queries and GridFS for storing files of any size without the risk of compromising your stack, MongoDB is a scalable, flexible, and powerful solution for Big Data.
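To make the idea of indexing an arbitrary document attribute more concrete, here is a minimal sketch in plain Python. It does not use MongoDB or any of its drivers; the `build_index` helper and the sample documents are hypothetical, and the dictionary simply stands in for a secondary index that maps an attribute value to the documents containing it.

```python
from collections import defaultdict

# Hypothetical schemaless documents: fields may differ between documents.
docs = [
    {"_id": 1, "name": "sensor-a", "tags": ["temp"], "site": "nyc"},
    {"_id": 2, "name": "sensor-b", "site": "sfo"},
    {"_id": 3, "name": "sensor-c", "tags": ["temp", "humidity"], "site": "nyc"},
]

def build_index(documents, field):
    """Map each value of `field` to the ids of the documents that contain it."""
    index = defaultdict(list)
    for doc in documents:
        if field not in doc:
            continue  # documents without the field are left out of the index
        raw = doc[field]
        # Treat list-valued fields as multiple index entries, one per element.
        values = raw if isinstance(raw, list) else [raw]
        for value in values:
            index[value].append(doc["_id"])
    return index

site_index = build_index(docs, "site")
tag_index = build_index(docs, "tags")
print(site_index["nyc"], tag_index["humidity"])
```

A lookup such as `site_index["nyc"]` then avoids scanning every document, which is the general benefit an attribute index is meant to provide.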
A cloud service solution for Big Data, Infochimps Cloud makes it possible to deploy Big Data applications rapidly and without the typical time commitment. For applications requiring real-time analysis, multi-source streaming data, a NoSQL database, or a Hadoop cluster, Infochimps Cloud offers a solution that facilitates rapid implementation. Real-time analytics, ad hoc analytics, and batch analytics comprise Infochimps Cloud’s three essential cloud services.
Integrate with any data source – CRM solutions, etc.
Mobile data analytics
Fraud detection and risk analysis
Customer insights via social media sources, website clickstreams and more
Pentaho brings IT and business users together by joining data integration and business analytics for integrating, visualizing, analyzing and blending Big Data in ways never before possible for better business results. When you need the ability to put robust information at your users’ fingertips in real time and at a reasonable cost, Pentaho’s open, embeddable and extensible analytics platform makes it easy to visualize, explore, and predict — turning data into value.
Placed facilitates data collection from offline sources, enabling enterprises to derive actionable insights through a combined analysis of both offline and online behavior and data metrics. Placed targeting and Placed attribution facilitate better results from mobile advertising by using Big Data capabilities to map the relationship between people and places.
Measure visitation trends over time
Measures 100 million locations a day, across more than 100,000 opted-in US smartphones
Inference Pipeline references a place database with nearly 300 million features for the US alone
Largest repository of offline insights into the paths and behaviors of consumers
Audience segmentation by demographics and other data points
Affinity modeling for understanding relationships between data
Monitor and understand how consumer behavior changes over time
Upsight, formerly Kontagent, provides analytics that help developers understand what’s happening with their apps and derive actionable insights from data to impact acquisition, engagement, retention and revenue. The platform also enables the creation of targeted in-app and out-of-app metrics in line with KPIs.
Free, enterprise-grade analytics
Unlimited data storage
Data mining with Hadoop
Measure anything from social apps to games and mobile dating apps
FREE – Analytics and unlimited data storage, 250k push
Core – $500/month - Custom Events up to 100k MAU, 500k push
Pro – $2,000/month - Custom Events up to 250K MAU, 1M push
Enterprise – Starting at $3,000/month - Unlimited Data Storage & Custom Events + Data Mine + Predictive LTV + A/B, unlimited push
Talend Open Studio is “a powerful and versatile set of open source products for developing, testing, deploying and administrating data management and application integration projects.” Providing a unified environment for managing the full lifecycle, even across enterprise boundaries, Talend enables developers to reclaim their productivity with a fully integrated platform for joining data integration, data quality, MDM, application integration and big data.
Connect and visualize data for Hadoop Analytics, MongoDB Analytics, Cassandra Analytics, and other platforms in one central repository. Using Big Data, developers can configure reports, analytics, dashboards, and more, without having to migrate data to multiple databases.
Integrate all your data
Blend data through an innovative data virtualization metadata layer or a traditional data warehouse using ETL
Present integrated visualizations and dashboards within your apps
Create intuitive design tools for non-designers to create visualizations
With powerful APIs for gathering all the data you need and deriving the actionable insights to drive your business forward, Keen IO is a flexible, scalable solution that is easy to implement and puts Big Data at your fingertips.
Send as much data as you want, from any source
Set up event data on any action, such as upgrades, impressions or purchases
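An event of the kind listed above is usually just a small structured record attached to an action. The sketch below builds one as JSON in Python; the `make_event` helper and all field names are hypothetical and are not taken from Keen IO's API, so treat it as an illustration of the data shape rather than a client implementation.

```python
import json
from datetime import datetime, timezone

def make_event(action, user_id, **properties):
    """Build a generic event record for an action such as a purchase."""
    return {
        "action": action,
        "user_id": user_id,
        # Record when the event happened, in UTC, as an ISO 8601 string.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": properties,
    }

# Hypothetical purchase event with two free-form properties.
event = make_event("purchase", "user-42", item="pro-plan", price_usd=29.0)
payload = json.dumps(event, sort_keys=True)
print(payload)
```

Keeping the free-form details under a single `properties` key is one possible design choice; it lets different actions (upgrades, impressions, purchases) share the same outer structure.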
Providing high-performance machine learning on Big Data for advanced analytics, Skytree offers a platform for fully exploiting the opportunities presented by Big Data. With a multitude of industry-focused solutions as well as solutions encompassing everything from predictive analytics to algorithmic pricing, Skytree is a comprehensive Machine Learning platform emphasizing the growing importance of Predictive Analytics in Big Data.
Business Analytics range from value analytics to fraud detection and what-if analytics
Marketing Analytics offer solutions ranging from ad optimization to lead scoring and recommender systems
Only general purpose scalable Machine Learning system on the market
Highest accuracy on the market; unprecedented speed and scale
Power Packs modules are plugged into the Skytree Server Foundation
@tableau Tableau was launched by a computer scientist, an Academy Award-winning professor, and a business leader with a passion for data. This perfect trio created a powerful suite of solutions designed to put more data at users’ fingertips — and help them understand it in more meaningful ways. With an advanced query language for powerful visualizations, the ability to natively query databases, cubes, warehouses, and more, a lightning-fast in-memory analytics database designed to eliminate silos and more, Tableau addresses every corner of Big Data demands.
Platfora hides the complex nature of Hadoop, making it simpler for enterprises to discover and understand facts in their business across events, actions, behaviors and time. Built by Silicon Valley veterans who have built market-leading companies around big ideas, the Platfora team understands the power of Big Data and aims to change customers’ lives with Platfora as they’ve done with companies in the past.
Vizboards for self-service, interactive data visualization
Analytics Engine, In-Memory Accelerator, and Hadoop Processor
Flurry is an end-to-end solution for analyzing consumer behavior, advertising to the right audience, at the right time, and discovering new ways to monetize audiences. Flurry makes use of 3.5 billion app session reports per day totaling more than 3 terabytes to provide valuable insights for app developers, such as a deep understanding of the user base, engagement benchmarks, and other key metrics.
App engagement benchmarks
App category and consumer interests
World’s largest app-audience data set
Reach more than 250 million customers per month
Flurry Analytics – FREE
Flurry AppCircle, FlurryPersonas, Flurry AppSpot – Contact for a quote
GridGain reimagines in-memory computing for a competitive edge in the modern business environment. Nikita Ivanov and Dmitriy Setrakyan share a passion for high-performance computing, a shared vision on which they based the first release of GridGain in 2007. The list of features, functionality and capabilities of these solutions is astounding.
A new type of system to help developers analyze data on a deeper level, DeepDive is an open-source project with a simple four-step process for writing applications on the platform. With calibrated probabilities for every assertion it makes, DeepDive is designed to navigate around the problematic nature of human error in development.
Handles large amounts of data from multiple sources
Write simple rules and offer feedback on prediction accuracy
“Distantly” learns, rather than requiring a tedious machine-learning process for training predictions
Scalable, high-performance inference and learning engine
Orange is an open-source data visualization and analysis tool for both novices and experts. Data mining is conducted either through visual programming or Python scripting, with components for machine learning and add-ons for bioinformatics and text mining.
Remembers choices and makes suggestions
Intelligently chooses communication channels between widgets
Packed with visualization options from bar charts to dendrograms
Integration and data analytics
Combine widgets to design the framework of your choice
OpenDataSoft is a comprehensive discovery tool with maps, charts, and graphs to explore public data sets. A cloud-based platform, OpenDataSoft is designed for seamless and unlimited data publishing, sharing, and reuse.
Reuse data through APIs and apps models
Collect data from any source
Read and understand all formats
Make databases findable and reusable
Standard access formats
Interactive & shareable visualization
Web extensions and open source
Cost (pricing based on Euros):
FREE – Civic initiatives and academic projects
200/month – 100k records, 20K UI/API queries/day
700/month – 10M records, 100K UI/API queries/day
Contact for a quote – Unlimited records, UI/API queries/day
@Angoss A comprehensive marketing analytics solution, Angoss offers real-time Big Data insights for a variety of verticals and business sectors. From credit scoring to opportunity and lead scoring, fraud deterrence and claims management, Angoss is capable of capturing and analyzing data for a multitude of applications.
Mu Sigma is one of the world’s largest Decision Sciences and analytics firms, helping companies to institutionalize data-driven decision making by harnessing Big Data. With a set of proprietary platforms to enable rapid decision-making and comprehensive data collection and integration that eliminates information silos, Mu Sigma is a powerful tool for machine learning, operational research, artificial intelligence, and more.
Hosts Mu Sigma problem DNAs
Real-time analytics and event stream processing
Load models into an enterprise ecosystem for consumption
A collaborative data-monitoring environment, ERwin offers an intuitive, graphical interface with a centralized view of key definitions, enabling the leveraging of data as a strategic business asset. The product comprises a number of editions designed for different stakeholders within an organization, providing a targeted level of information availability and display and configurations for better understanding and usability.
Achieve business agility through model-driven collaboration
Collaborate via web or desktop
Active model templates and naming standards
Display themes, custom data types, macro language and API
Metadata integration tools
CA ERwin Data Modeler Standard Edition r9.5 – Product plus 1 Year Enterprise Maintenance – $4,794
CA ERwin Data Modeler Standard Edition r9.5 – Product plus 3 Years Enterprise Maintenance – $6,392
CA ERwin Data Modeler Workgroup Edition r9.5 – Product plus 1 Year Enterprise Maintenance – $6,708
CA ERwin Data Modeler Workgroup Edition r9.5 – Product plus 3 Years Enterprise Maintenance – $8,944
CA ERwin r9.5 Data Modeler for Microsoft SQL Azure – Product plus 1 Year Enterprise Maintenance – $1,679.94
CA ERwin r9.5 Data Modeler for Microsoft SQL Azure – Product plus 3 Years Enterprise Maintenance – $2,239.92
CA ERwin r9.5 Web Portal Standard Edition 1-5 Users – Product plus 1 Year Enterprise Maintenance – $8,399.70
CA ERwin r9.5 Web Portal Standard Edition 1-5 Users – Product plus 3 Years Enterprise Maintenance – $11,199.60
Offering Big Data and Business Intelligence solutions, pmOne’s cMORE enables users to quickly build, flexibly grow and efficiently administer solutions. It leverages and extends SQL Server functionality, as well as that of Excel, SharePoint, and other components in the Microsoft BI stack.
Simplified standard and ad hoc reporting
Credible alternative to SAP-based data warehouse
Consistent reporting company-wide
Personalize reports; distribute books
Easy access to SAP data and other systems
Based on Microsoft BI
Cost: Contact for a quote
Are Big Data tools changing the way you develop apps? What Big Data tools are changing the way you architect and run your big data projects?