Sam Lightstone

Distinguished Engineer at IBM
BIO: Sam Lightstone is Distinguished Engineer for relational Cloud Data repositories as well as co-founder of IBM's technology incubation initiative. He is the author of several books, papers, and patents on computer science, careers, and data engineering.

On Monday, October 27, in the general session of the IBM Insight conference, to an audience of many thousands, IBM announced several new cloud computing services, including in-memory optimized warehousing in the cloud with dashDB.

dashDB is a managed data warehouse service in the cloud: an analytics powerhouse at your fingertips.

The well-kept secret is that, alongside the work on BLU Acceleration, we have for some time been working in parallel to bring the benefits of this technology and the best qualities of the Pure Data for Analytics (Netezza) product line to the cloud on IBM’s BlueMix platform. dashDB makes that a reality. It is super easy and very low cost, with no need to bother with IT procurement or database management. A dedicated team worked tirelessly to bring this technology to market. What a thrill it was to be sitting in the audience as IBM Vice President Inhi Cho announced dashDB to the world in the general session. The vibe in the room was palpable, and certainly everyone sitting near me was buzzing about the new service. Twitter immediately lit up with #dashDB and #KnowMOREinaDASH.

I want to share with you some of the most important attributes of the new cloud service:

  1. It’s a managed, pay-as-you-go, use-only-what-you-need service, and it’s completely free for the first few gigabytes of data (yes, really).
  2. Load-and-go simplicity is the mantra for dashDB, leveraging technology and philosophy from both Netezza and BLU Acceleration. No indexes and no configuration tuning are required.
  3. It’s in-memory optimized for high performance, with BLU Acceleration in its kernel.
  4. It provides in-database analytics for statistical analytics (R) and spatial analytics (Esri) by leveraging the Netezza technology in Pure Data for Analytics, bringing your analytics to the data, not the data to the analytics.
  5. It’s Cloudant and JSON ready. dashDB is ready to use with just a few clicks in the Cloudant management interface.
  6. It’s a composable service on IBM BlueMix, so you can connect with other powerful services from IBM like IBM DataWorks, Cloudant, and others – like connecting Lego pieces.

It’s a managed service, which means that IBM takes care of deployment, configuration, security, and high availability (HA) for you, along with tuning and regular maintenance tasks like backup and restore.

Always secure and encrypted

When you’re placing your data in the cloud, you need to know it is secure. dashDB always stores and transfers data in encrypted form. Encryption of data at rest applies both to data stored in the database and to data inside backup images. The data in the database and the backups (which are automated) are encrypted with AES 256, the largest key size defined for this industry-standard symmetric cipher. Database activities are continuously monitored, and activity reports are easily available to customers through the dashDB console. The reports include who is connecting to the database and what actions they have performed. The console itself supports HTTPS, so these communications are encrypted too. Add to that the classic access controls that let you decide deliberately who and what has access to your data, and your data, at rest and in transit, is always protected and available only to those who should have it.

Analytics for Mobile Apps

dashDB enables ‘mobile analytics’ with Cloudant. Cloudant is IBM’s new highly available and elastic JSON store in the cloud. This clever connection with Cloudant.com means that, in addition to using dashDB as a regular SQL-based data mart or warehouse with services like Tableau or Cognos, you can also use it as a warehouse for your JSON data from Cloudant. It’s easy: you click on the “Warehouse” button in Cloudant, and you’re off to the races with industry-leading reporting and analytic capabilities on your JSON data. Just listen to Apple: the recently announced IBM & Apple collaboration is leveraging Cloudant extensively as the data layer for new mobile applications. Now you’re able to easily store JSON data in Cloudant, and leverage dashDB to run analytics like reporting, data visualization, or geospatial analytics. Bring it!
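To give a feel for what "warehousing JSON data" involves, here is a minimal Python sketch that flattens nested JSON documents into rows and columns. It is only an illustration of the general idea, not how the Cloudant "Warehouse" button or dashDB actually does it; the function names, the underscore-joined column naming, and the sample documents are all hypothetical.

```python
import json


def flatten(document: dict, prefix: str = "") -> dict:
    """Flatten nested JSON objects into one level, joining keys with underscores."""
    row = {}
    for key, value in document.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested objects become extra columns, e.g. device.os -> device_os.
            row.update(flatten(value, prefix=f"{column}_"))
        elif isinstance(value, list):
            # Assumption: arrays are kept as JSON text; a real warehouse
            # would more likely split them into child tables.
            row[column] = json.dumps(value)
        else:
            row[column] = value
    return row


def to_table(documents):
    """Return (columns, rows), using the union of all flattened keys as columns."""
    flat_rows = [flatten(doc) for doc in documents]
    columns = sorted({column for row in flat_rows for column in row})
    rows = [[row.get(column) for column in columns] for row in flat_rows]
    return columns, rows


# Hypothetical documents of the kind a mobile app might store.
docs = [
    json.loads('{"_id": "a1", "device": {"os": "iOS", "version": 8}, "tags": ["beta"]}'),
    json.loads('{"_id": "b2", "device": {"os": "Android"}, "location": {"lat": 43.7, "lon": -79.4}}'),
]

columns, rows = to_table(docs)
print(columns)
for row in rows:
    print(row)
```

Documents that lack a field simply get an empty (None) value in that column, which is roughly what you would expect from a NULL in a SQL table.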

Why should you care?

Historically, data warehousing was complex stuff. It offered massive value, but piles of skills were required. Only the most expert DBAs dared to dabble in it. But skills were not enough: you had to have hardware, and plenty of it. That meant procurement, which meant negotiations with a boss, an IT department, or a finance department. More importantly, it meant delays. In today’s world, none of us has time for that. We need to do more, faster, for less.

dashDB brings leading-edge warehousing, reporting, and analytics to you without delay, at super low cost. In fact, for small data sets it’s completely free. The result is super easy access to data warehousing, on demand, especially for your “now crowd” in the business.

dashDB is like the Netflix of advanced data warehousing. Always on tap and super easy.

Combining the best of Pure Data for Analytics (Netezza), BLU Acceleration, and IBM’s Cloud (BlueMix and Softlayer)

dashDB combines the best of Netezza simplicity and in-database analytics with the astounding speed and flexibility of BLU Acceleration, and brings it to the cloud at very low cost as a managed service on BlueMix, so it is a composable component alongside many other cloud services. These are extremely powerful technologies being combined into a single user-friendly, simple service on the web. Netezza broke records for appliance-based warehousing, with extreme simplicity, speed, price performance, and in-database analytics. BLU Acceleration broke new ground in simplicity and performance for software-based in-memory warehousing. BlueMix provides a one-stop shop for IBM’s composable cloud services. dashDB is a wildly powerful combination that only IBM could pull off.

In-memory databases sound expensive. (hint, not dashDB)

dashDB is not in-memory; it’s in-memory optimized. As with BLU Acceleration, you generally need only a small amount of RAM (compared to the total uncompressed data size) to run at in-memory speeds. For example, if you have a 100GB data set, dashDB and BLU Acceleration likely require only about 5GB of RAM to run your workload at in-memory speeds. Data can be far larger than RAM and the workload will still run great. It gets better, though: because dashDB is a managed service on the IBM cloud, IBM is able to amortize the costs across huge numbers of users in data centers around the world. The net result is that IBM is able to offer the service for free for your first several gigabytes of data, and at very low cost for terabyte-class big data volumes.
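The 100GB-to-5GB example works out to roughly 5% of the uncompressed data size. As a back-of-the-envelope sketch, here is that arithmetic in Python. The 5% default is an assumption read off that single example, not a published sizing rule, and the function name is hypothetical; real requirements will depend on your workload.

```python
def estimated_ram_gb(uncompressed_data_gb: float, ram_fraction: float = 0.05) -> float:
    """Rough RAM estimate for running at in-memory speeds.

    ram_fraction defaults to 0.05, an assumption taken from the
    "100GB of data needs about 5GB of RAM" example. It is not a guarantee.
    """
    if uncompressed_data_gb < 0:
        raise ValueError("data size must not be negative")
    if not 0 < ram_fraction <= 1:
        raise ValueError("ram_fraction must be in the range (0, 1]")
    return uncompressed_data_gb * ram_fraction


# Apply the same ratio to a few data set sizes.
for size_gb in (100, 1_000, 10_000):
    print(f"{size_gb} GB of data -> roughly {estimated_ram_gb(size_gb):.0f} GB of RAM")
```

Treat the output as a starting point for a conversation, not a capacity plan.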

Moving data easily: Using IBM DataWorks and dashDB together

Enterprises today are rapidly moving to the cloud for performing analytics. In this new cloud-based world, one of the first challenges is efficiently moving data to cloud-based data stores such as dashDB. The IBM DataWorks service solves this problem by allowing application developers to move large amounts of data from different data sources such as SQLDB, dashDB, Softlayer Object storage, IBM Analytics for Hadoop, etc. One important concern when moving data to the cloud is the handling of sensitive data such as social security numbers and credit card numbers. IBM DataWorks provides a unique capability to profile and identify sensitive data like these so they can be acted upon in accordance with your needs and company policies. Finally, DataWorks has support for validating and standardizing US addresses. It also has the capability to enrich partial addresses so that analytics can be performed on clean and standardized data.

In a nutshell, IBM DataWorks allows users to get access to trusted, standardized, and masked data in an efficient manner from different data sources.
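To make "profile and identify sensitive data" a little more concrete, here is a small Python sketch of the general idea: scan a column of values, flag the ones that look like US social security numbers or payment card numbers, and mask them. This is not the DataWorks API or its actual detection logic. The patterns, function names, and sample values are hypothetical and deliberately simplistic.

```python
import re

# Hypothetical, simplified patterns; a real profiler uses far richer rules.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")
CARD_PATTERN = re.compile(r"\d{13,19}")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def classify_value(value: str) -> str:
    """Label a value as 'ssn', 'credit_card', or 'none'."""
    text = value.strip()
    if SSN_PATTERN.fullmatch(text):
        return "ssn"
    compact = re.sub(r"[ -]", "", text)  # allow spaces or dashes between digit groups
    if CARD_PATTERN.fullmatch(compact) and passes_luhn(compact):
        return "credit_card"
    return "none"


def profile_column(values) -> dict:
    """Count how many values in a column look like each sensitive type."""
    counts = {"ssn": 0, "credit_card": 0, "none": 0}
    for value in values:
        counts[classify_value(value)] += 1
    return counts


def mask(value: str) -> str:
    """Replace all but the last four characters with asterisks."""
    text = value.strip()
    if len(text) <= 4:
        return "*" * len(text)
    return "*" * (len(text) - 4) + text[-4:]


# Made-up sample column; 4111 1111 1111 1111 is a well-known test card number.
sample = ["123-45-6789", "4111 1111 1111 1111", "hello", "4111111111111112"]
print(profile_column(sample))
print([mask(v) if classify_value(v) != "none" else v for v in sample])
```

The Luhn check is what keeps an arbitrary 16-digit number from being flagged as a card number, which is the kind of false positive a pattern-only approach would produce.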

Here are a few screen caps to whet your appetite:


If you are creating a dashDB warehouse from a Cloudant JSON store, it’s really easy. The service automatically mines the JSON data, then creates and populates a dashDB instance for you, bringing you to the following screen:


Then you are ready to go, with advanced in-memory optimized analytics and reporting.


The service enables dead-simple connections with major analytics tools and libraries, like Cognos, SPSS (predictive analytics), and R, as well as easy data extract, load and transformation tools and services like DataWorks and InfoSphere DataStage.

… and you can seamlessly work with R scripts and RStudio. Lovely!


Finally, you can leverage the built-in spatial analytics to identify geospatial insights and visualize the results, e.g. with Esri ArcMap:


Whether it is Tableau, SAS, SPSS, Esri, Cognos, or another popular analytics, reporting, or business intelligence application, you’ll find it very easy to start using it with dashDB because of the out-of-the-box guides on how to connect to dashDB:



dashDB: Use only what you need. No IT procurement or negotiation.  It’s all on tap, and managed for you.

Know more in a dash!


[Thank you to my co-authors Manish Bhide, Milind Tamaskar, Torsten Steinbach, Rick Sobiesiak, Walid Rjaibi, and Cindy Russell!]
