Category Archives: Analytics

An AI Data Platform for All Seasons

Pure Storage empowers enterprise AI with advanced data storage technologies and validated reference architectures for emerging generative AI use cases.

Summary

AI devours data. With award-winning AI-ready infrastructure, an AI data platform, and collaboration with NVIDIA, Pure Storage is delivering solutions and services that enable organizations to manage the high-performance data and compute requirements of enterprise AI.

AI Then and AI Now

They (some wise anonymous folks out there) say that there is a time and place for everything. They also say there is a season for every purpose. I believe that the time, place, and season for artificial intelligence (AI) data platforms have arrived. To see this, look no further than Pure Storage, whose core mission is to “empower innovators by simplifying how people consume and interact with data.”

In the past, it was sufficient to bring order to the randomness of enterprise data collection through applications of technology resources (databases and storage devices) that were aimed primarily at organizing, storing, indexing, and managing enterprise information assets for single purposes or single business units. However, this data was still left mostly unexploited for its maximum potential and enterprise-wide business value.

Also in the past, it was sufficient for business automation to consist primarily of rigid rule-based robotic non-adaptive repetition of processes and fixed tasks, requiring very little (if any) new knowledge input (i.e., live data consumption) or real-time adaptation to changing business conditions.

And also in the past, it was sufficient for AI to be relegated to academic researchers or R&D departments of big organizations who mostly produced research reports or journal papers, and not much else.

Fast-forward to 2024 and we see a totally different landscape: massive data sets feeding dynamic cross-enterprise processes, increasing automation and dynamic adaptation of complex multi-step tasks, and ubiquitous value-producing applications of AI. In particular, in the past year, generative AI has played a major role in the explosive development and growth of these transformations within enterprises.

Pure Storage Meets the Demands of Enterprise AI

To support, sustain, and assure the continued success and cost-effectiveness of the enormous data-fueled AI-powered transformations in such a rapidly changing environment, Pure Storage has stepped up their delivery of an incredible array of award-winning AI-ready infrastructure (AIRI//S™) products and services with an AI data platform that provides the fundamental AI environment for enterprise data management (storage, access, orchestration, delivery), hyperscaled AI training, and AI inference on demand (on-prem, in data centers, at edge sites, and in micro edge devices).

One example of Pure Storage’s advantage in meeting AI’s data infrastructure requirements is their DirectFlash® Modules (DFMs), which have an estimated lifespan of 10 years and a flash storage capacity of 75 terabytes (TB) today, with a roadmap that plans for capacities of 150TB, 300TB, and beyond. Another example is Pure Storage’s FlashBlade®, which was invented to help companies handle the rapidly increasing amount of unstructured data coming into greater use, as required in the training of multi-modal AI models. One more example is Pure Storage’s development of non-disruptive upgrades (NDUs), a feature of Pure Storage’s architecture that permits upgrades and expansion of the data infrastructure with no impact on data availability or performance, and with no downtime or data migrations.

Pure Storage’s Announcements at GTC 2024

The preceding examples are industry-leading, and there is still more: at the NVIDIA GTC 2024 conference, Pure Storage made several further announcements. Here are details on some of them.

A Data Platform for AI

Data is the fuel for AI, because AI devours data—finding patterns in data that drive insights, decisions, and action. Ease of data orchestration (ingest, cleaning, transformation, discovery, access, exploration, delivery, training, inference, deployment) is essential for data-devouring AI products and services. A data platform for AI is key to innovation and long-term affordability, scalability, sustainability, and advancement of enterprise AI applications. Anything less than a complete data platform for AI is a deal-breaker for enterprise AI. Pure Storage provides the ideal data platform for AI, as it provides unified storage for structured and unstructured data and provides enterprise data services for Kubernetes, supporting the entire AI data pipeline, because storage matters!

At GTC 2024, Pure demonstrated the features of their data platform for AI, specifically highlighting these benefits and features of the platform: (a) Helps organizations accelerate model training and inference; (b) Improves operational efficiency for AI/IT infrastructure teams, as well as AI/ML developers and engineers; (c) Delivers cost and energy efficiency as an enterprise scales their AI operations; and (d) Provides an AI storage platform that delivers ultimate reliability and is built to handle all future AI storage needs.

Optimizing GenAI Apps with RAG—Pure Storage + NVIDIA for the Win!

One of the most popular techniques associated with generative AI (GenAI) this past year has been retrieval-augmented generation (RAG). RAG is the essential link between two things: (a) the general large language models (LLMs) available in the market, and (b) a specific organization’s local knowledge base. In deep learning applications (including GenAI, LLMs, and computer vision), a data object (e.g., document, image, video, audio clip) is reduced (transformed) to a condensed vector representation using deep neural networks. The knowledge base then becomes the comprehensive collection of these condensed representations of the enterprise business data repositories, stored in vector format in a vector database—Vector DB being another major data technology development finding widespread adoption this past year.
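The retrieval step described above can be illustrated with a small, self-contained sketch. It is not how Pure Storage, NVIDIA, or any particular vector database implements RAG: the `embed` function is a toy bag-of-words stand-in for a deep neural network embedding, and `ToyVectorDB`, `build_prompt`, and the sample documents are hypothetical names and data chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a neural embedding: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorDB:
    """Hypothetical in-memory knowledge base of (vector, document) rows."""

    def __init__(self):
        self.rows = []

    def add(self, doc):
        self.rows.append((embed(doc), doc))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def build_prompt(question, db, k=1):
    """Prepend the retrieved local context to the question sent to an LLM."""
    context = "\n".join(db.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


db = ToyVectorDB()
db.add("Quarterly revenue grew because subscription renewals increased.")
db.add("The data center cooling system was upgraded in March.")
db.add("Employee onboarding now includes security training.")

prompt = build_prompt("Why did revenue grow?", db)
print(prompt)
```

In a real deployment the embedding would come from a trained model and the rows would live in a vector database on shared storage; the flow of embedding the query, ranking by similarity, and adding the best matches to the prompt stays the same.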

As a consequence of these activities, RAG provides the bespoke use case-specific context to an organization’s proprietary GenAI LLM applications. This contextualization of the GenAI LLM is not only enterprise-specific, local, and customized, but it is also proprietary—maintaining the privacy and security of the GenAI LLM application within the security firewalls and policies of that organization. Additionally, RAG ensures the use of an organization’s most recent data while eliminating the need for constant retraining of the LLMs. Pure Storage has worked with NVIDIA (GPU memory and GPU servers) to boost the speed, accuracy, and on-prem power of such enterprise GenAI LLM applications. Here are some specific documented results:

(a) “NVIDIA GPUs are used for compute and Pure Storage FlashBlade//S provides all-flash enterprise storage for a large vector database and its associated raw data. In a specific case [presented at GTC], the raw data consisted of a large collection of public documents, typical of a public or private document repository used for RAG.”

(b) “Document embedding, indexing, [and ingest] were completed 36% more quickly when using the Pure Storage FlashBlade//S with a native S3 interface than when using local SSDs that were inside each server, demonstrating that Pure Storage’s fast networked all-flash storage can help accelerate RAG document embedding.”

Pure Storage’s RAG pipeline, in conjunction with NVIDIA GPUs and NVIDIA’s NeMo Retriever collection of GenAI microservices, ensures accuracy, currency, privacy, and relevance of proprietary enterprise LLMs. Time to insight and time to action in AI applications are faster and better with Pure Storage.

OVX Validated Reference Architecture for AI-ready Infrastructures

First question: What is OVX validation? OVX is NVIDIA’s standard validation paradigm for computing systems that combine high-performance GPU acceleration, graphics, and AI with fast, low-latency networking. These systems are used to design and power complex 3D virtual worlds and digital twins that are transforming how businesses design, simulate, and optimize complex systems and processes. In this fantastic emerging realm of breathtaking technological achievements and innovations, Pure Storage has achieved OVX validation of their reference architecture for AI-ready infrastructures. At this stage, OVX validation applies directly to the increasing business demand for GenAI workloads (including RAG, LLMs, knowledge bases, and Vector DB), full-stack ready-to-run enterprise AI infrastructure, and local proprietary custom data + AI compute, storage, and networking solutions. Note: When you see “full-stack,” read “Pure Storage + NVIDIA working together seamlessly.”

Second question: What about technical debt and the cost of “lift and shift” to these new AI-ready architectures? For Pure Storage, OVX validation also certifies that Pure Storage’s AI-ready infrastructure will run on NVIDIA GPUs and on other vendors’ servers, which is a great savings on technical debt for those organizations that operate diverse server farms. OVX validation complements Pure Storage’s certified reference architecture for NVIDIA DGX BasePOD that was announced last year as well as their FlashStack® for AI Cisco Validated Designs announced here.

Since one of the only certainties about the future is its uncertainty, it is a great benefit that Pure Storage Evergreen//One™ provides storage-as-a-service (STaaS) guarantees and enables future-proof growth with non-disruptive upgrades. That means that Pure Storage owns the hardware (“the end user doesn’t pay for it”), but the end user buys a subscription to the storage with the same agility and flexibility of public cloud storage, and with all the security, proprietary protection, and performance of on-prem all-flash sustainable infrastructure. This is Pure Storage’s SLA-guaranteed cloud-like STaaS!

More Pure Storage Announcements at GTC 2024

Pure Storage’s RAG development (described earlier) is accelerating successful AI adoption across vertical industries. Pure Storage is accomplishing this by creating vertical-specific RAGs in collaboration with NVIDIA. First, “Pure Storage has created a financial services RAG solution to summarize and query massive data sets with higher accuracy than off-the-shelf LLMs. Financial services institutions can now gain faster insight using AI to create instant summaries and analysis from various financial documents and other sources.” Pure Storage will soon release additional RAGs for healthcare and the public sector.

Expanded investment in the AI partner ecosystem: Pure Storage is further investing in its AI partner ecosystem with NVIDIA, engaging in new partnerships with independent software vendors (ISVs). Some of these investments are aimed at optimizing GPU utilization through advanced orchestration and scheduling, and others enable machine learning teams to build, evaluate, and govern their model development lifecycle. Additionally, Pure Storage is working closely with numerous AI-focused resellers and service partners to further operationalize joint customer AI deployments.

Looking at AI Now and at What’s Next

As the award-winning leader in AI-ready (and future-ready) data infrastructure, Pure Storage is collaborating with NVIDIA to empower their global customers with a proven framework to manage the high-performance data and compute requirements that these enterprises need to drive successful AI deployments, both now and into the future. Every technical leader, line of business (LOB) leader, VP of Infrastructure for AI, VP of AI/Data Science, and CDO/CTO/CAIO can benefit right now from these technologies and services.

To put all of Pure Storage’s recent accomplishments, products, services, and solutions into a single statement, I would say that Pure Storage’s primary purpose (their North Star) is to guide and accelerate their customers’ adoption of AI through the Pure Storage platform for AI.

To learn more about all of this, make connections, learn new skills, and get ready for what’s next in this rapidly evolving season of AI, be sure to register and attend the Pure//Accelerate® 2024 live event June 18-21, 2024, at Resorts World Las Vegas. The event will have a special track on “Today’s and Tomorrow’s Applications of AI.” Don’t miss it!

Register Now for Pure//Accelerate 2024

Drive your data success at Pure//Accelerate® at Resorts World Las Vegas from June 18-21. This is the premier event to make connections, learn new skills, and get ready for what’s next. Here’s a sneak peek of what to expect:

Where can you see the latest data storage innovations, hear from visionary thought leaders, and discover the secret to transformation with data? This year, June 18-21, Pure//Accelerate® is lighting up the Vegas Strip at Resorts World Las Vegas.

https://blog.purestorage.com/news-events/register-early-for-pure-accelerate-2024/

Three Types of Actionable Business Analytics Not Called Predictive or Prescriptive

Decades (at least) of business analytics writings have focused on the power, perspicacity, value, and validity in deploying predictive and prescriptive analytics for business forecasting and optimization, respectively. These are primarily forward-looking actionable (proactive) applications.

There are other dimensions of analytics that tend to focus on hindsight for business reporting and causal analysis – these are descriptive and diagnostic analytics, respectively, which are primarily reactive applications, mostly explanatory and investigatory, not necessarily actionable.

In the world of data there are other types of nuanced applications of business analytics that are also actionable – perhaps these are not too different from predictive and prescriptive, but their significance, value, and implementation can be explained and justified differently. Before we dive into these additional types of analytics applications, let us first consider a little pedagogical exercise with two simple evidence-based inferences.

(a) In essentially 100% of cases where an automobile is involved in an accident, the automobile had four wheels prior to the accident.

(b) In 100% of divorce cases, the divorcing couple was married prior to the divorce.

What is the point of those obvious statistical inferences? The point is that the 100% association between the event and the preceding condition has no special predictive or prescriptive power. Hence, prior knowledge of these 100% associations does not offer any actionable value. In statistical terms, the joint probability of event Y and condition X co-occurring, designated P(X,Y), is essentially the probability P(Y) of event Y occurring. The probability of the condition X occurring, P(X), is irrelevant since the existence of the precondition X is implicitly present by default.
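The reasoning above can be written out in two short steps. This is a sketch under two assumptions taken from the examples: the precondition is always present when the event occurs, so P(X|Y) = 1, and the precondition is present almost by default, so P(X) is close to 1.

```latex
\begin{align}
P(X, Y) &= P(X \mid Y)\, P(Y) = 1 \cdot P(Y) = P(Y), \\
P(Y \mid X) &= \frac{P(X, Y)}{P(X)} = \frac{P(Y)}{P(X)} \approx P(Y)
  \quad \text{when } P(X) \approx 1 .
\end{align}
```

The second line is why the association has no predictive value: knowing that X is present leaves the probability of Y essentially unchanged.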

Okay, those examples represent two remarkably uninteresting cases. Even when similar sorts of inferences occur in a business context, they have essentially zero value. How do predictive and prescriptive analytics fit into this statistical framework?

Using the same statistical terminology, the conditional probability P(Y|X) (the probability of Y occurring, given the presence of precondition X) is an expression of predictive analytics. By exploring and analyzing the business data, analysts and data scientists can search for and uncover such predictive relationships. This is predictive power discovery. Another way of saying this is: given observed data X, we can predict some outcome Y. Or more simply: given X, find Y.

Similarly (actually, conversely), we can use the conditional probability P(X|Y) (which is the probability that the precondition X exists, given the existence of outcome Y) as an expression of prescriptive analytics. How does that work in practice? By exploring and analyzing business data, analysts and data scientists can search for and uncover the conditions (causal factors) that have led to different outcomes. So, if the business wants to optimize some outcome Y, then data analysts will be tasked with finding the conditions X that must be implemented to achieve that desired outcome. This is prescriptive power discovery. Another way of saying this is: given some desired optimal outcome Y, what conditions X should we put in place? Or more simply: given Y, find X. Note how this simple mathematical expression of prescriptive analytics is exactly the opposite of our previous expression of predictive analytics (given X, find Y).
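To make the two conditional probabilities concrete, here is a minimal sketch that computes P(Y|X) and P(X|Y) from a small table of counts. The counts are invented for illustration (think of X as "customer received an offer" and Y as "customer renewed"), and the helper names `prob` and `conditional` are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts of observed (X, Y) combinations; not real business data.
counts = {
    (True, True): 30,    # X present, Y occurred
    (True, False): 10,   # X present, Y did not occur
    (False, True): 20,   # X absent,  Y occurred
    (False, False): 40,  # X absent,  Y did not occur
}


def prob(predicate):
    """Probability that predicate(x, y) holds, estimated from the counts."""
    total = sum(counts.values())
    hits = sum(n for (x, y), n in counts.items() if predicate(x, y))
    return Fraction(hits, total)


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda x, y: event(x, y) and given(x, y)) / prob(given)


p_y = prob(lambda x, y: y)
p_y_given_x = conditional(lambda x, y: y, lambda x, y: x)  # given X, find Y
p_x_given_y = conditional(lambda x, y: x, lambda x, y: y)  # given Y, find X

print("P(Y)   =", p_y)
print("P(Y|X) =", p_y_given_x)
print("P(X|Y) =", p_x_given_y)
```

With these made-up counts, P(Y|X) = 3/4 is higher than the baseline P(Y) = 1/2, so X carries predictive information about Y, while P(X|Y) = 3/5 describes how often the condition was in place when the desired outcome occurred.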

Here are a few business examples of this type of prescriptive analytics: Which marketing campaign is most efficient and effective (has best ROI) in optimizing sales? Which environmental factors during manufacturing, packaging, or shipping lead to reduced product returns? Which pricing strategies lead to the best business revenue? What equipment maintenance schedule minimizes failures, downtime (mean time to recovery), and overall maintenance costs?

Now that we have described predictive and prescriptive analytics in detail, what is left? What are the three types of actionable (and valuable) business analytics applications that are not called predictive or prescriptive? They are sentinel, precursor, and cognitive analytics. Let’s define what these are.

Sentinel Analytics – in common usage, the sentinel is the person on the guard station who is charged with watching for significant incoming or emergent activity. In practice, all activity is being observed and a decision is made as to whether any particular activity requires some sort of triage: sounding an alarm, or sending an alert to decision-makers, or doing nothing.
- In the enterprise, sentinel analytics is most timely and beneficial when applied to real-time, dynamic data streams and time-critical decisions. For example, sensors (including internet of things devices and APIs on data networks) can be deployed with logic (analytics, statistical, and/or machine learning algorithms) to monitor and “watch” business systems and processes for emerging patterns, trends, behaviors, unusual operating modes, and anomalies that might be indicators of activities that require business attention, decisions, and/or action.
Precursor Analytics – in common usage, precursors are the early-warning indicators (harbingers, forerunners) of something else more serious or catastrophic that is about to come. We occasionally hear about earthquake precursors (increased levels of radon in groundwater), tidal wave precursors (a deep ocean earthquake), and cyber-attack precursors (phishing incidents). Precursor analytics is related to sentinel analytics. The latter (sentinel) is associated primarily with “watching” the data for interesting patterns that might require action, while precursor analytics is associated primarily with training the business systems to quickly identify those specific “learned” patterns and events that are known to be associated with high-risk events, thus requiring timely attention, intervention, and remediation.
- In these applications, the data science involvement includes both the “learning” of the most significant patterns to alert on and the improvement of their models (logic) to minimize false positives and false negatives. The analytics triage is critical, to avoid alarm fatigue (sending too many unimportant alerts) and to avoid underreporting of important actionable events. One could say that sentinel analytics is more like unsupervised machine learning, while precursor analytics is more like supervised machine learning. That is not a totally clean separation and distinction, but it might help to clarify their different applications of data science.
- The counterexample to the supervised learning explanation of precursor analytics is a “black swan” event – a rare high-impact event that is difficult to predict under normal circumstances – such as the global pandemic, which led to the failure of many predictive models in business. Broken models are definitely disruptive to analytics applications and business operations. Paradoxically, the precursor was actually predictive in a disruptive anti-predictive sort of way, which brings us right back to P(Y|X), or maybe it should be stated as P(“not Y”|X) where X is the black swan event (i.e., the predicted outcome Y from existing models will not occur in this case). As such, the global pandemic serves as a warning (a harbinger of disruption) and consequently as a “training example” to businesses for any future black swans.
Cognitive Analytics – this analytics mindset approach focuses on “surprise” discovery in data, using machine learning and AI to emulate and automate the cognitive abilities of humans. The goal is to discover novel, interesting, unexpected, and potentially valuable signals in the flood of streaming enterprise data. These may not be high-risk discoveries, but they could be high-reward discoveries. How does that resemble human cognitive abilities? Curiosity! Being curious about seeing something “funny” that you didn’t expect, thereby putting a “marker” in the data stream: “Look here! Pay attention! Ask questions about this!”
- Cognitive analytics is basically the opposite of descriptive analytics. In descriptive analytics, the task is to find answers to predetermined business questions (how much, how many, how often, who, where, when), whereas cognitive analytics is tasked with finding the business questions that should be asked. Descriptive: find the right answers in the data. Cognitive: find the right questions in the data. Cognitive analytics can then be viewed as a precursor to diagnostic analytics, which is the investigative stage of analytics that answers the questions raised by cognitive analytics (“Why did this happen?”, “Why are we seeing this pattern in our data?”, “What is the business impact of this trend, anomaly, behavior?”, “What is our next-best action as a result of this?”, “That’s funny! What is that?”).
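The contrast between sentinel analytics (watching a stream without labels) and precursor analytics (learning which patterns precede known incidents) can be sketched in a few lines. This is only an illustration under simple assumptions: the sentinel uses a rolling z-score with two made-up thresholds for its triage, and the precursor learns a single cutoff from a made-up labeled history. The function names, thresholds, and data are all hypothetical and do not represent any product's method.

```python
import statistics
from collections import deque


def sentinel(stream, window=5, alert_z=2.0, alarm_z=4.0):
    """Unsupervised watch: compare each reading with the recent window and triage it."""
    recent = deque(maxlen=window)
    decisions = []
    for value in stream:
        if len(recent) == window:
            mean = statistics.fmean(recent)
            spread = statistics.pstdev(recent) or 1e-9  # avoid division by zero
            z = abs(value - mean) / spread
            if z >= alarm_z:
                decisions.append("alarm")
            elif z >= alert_z:
                decisions.append("alert")
            else:
                decisions.append("nothing")
        else:
            decisions.append("nothing")  # still filling the window
        recent.append(value)
    return decisions


def learn_precursor_threshold(history):
    """Supervised step: pick the cutoff with the fewest false positives plus false negatives."""
    candidates = sorted({signal for signal, _ in history})

    def errors(cut):
        return sum((signal >= cut) != incident for signal, incident in history)

    return min(candidates, key=errors)


# Sentinel: a made-up sensor stream with a mild bump (12) and a spike (30).
decisions = sentinel([10, 11, 9, 10, 10, 12, 30, 10])
print(decisions)

# Precursor: made-up (signal level, was it followed by an incident?) history.
history = [(1, False), (2, False), (3, False), (6, True),
           (4, False), (7, True), (8, True), (5, True)]
threshold = learn_precursor_threshold(history)
print("alert when signal >=", threshold)
```

The sentinel needs no labeled examples and flags whatever departs from recent behavior, while the precursor depends on past incidents to learn what to alert on, which is where tuning for false positives and false negatives comes in.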

None of these descriptions of the 3 “new” analytics applications are meant to declare that these are completely distinct and different from the “big 4” analytics applications that we have known for many years (Descriptive, Diagnostic, Predictive, Prescriptive). But the differences between the “big 4” and the “new 3” are in the nuanced business applications of these analytics in the enterprise and in the types of inferences that the data scientists are asked to derive from the business data.

Deploying these analytics in the cloud further expands their accessibility, democratization, enterprise-wide acceptance, broad advocacy, and ultimate business value. Blending automated analytics products (coming from the sentinel, precursor, and cognitive applications) with human-in-the-loop inquisitiveness, curiosity, creativity, out-of-the-box thinking, idea generation, and persistence can transform any organization into a data analytics powerhouse through an analytic culture revolution. This is more imperative than ever, as a global survey of analytics executives has revealed:

“Companies have been working to become more data-driven for many years, with mixed results.”
“Right now, the biggest challenge for organizations working on their data strategy might not have to do with technology at all.”
“Corporate chief data, information, and analytics executives reported that cultural change is the most critical business imperative.”
“Just 26.5% of organizations report having established a data-driven organization.”
“91.9% of executives cite cultural obstacles as the greatest barrier to becoming data driven.”
Reference: https://hbr.org/2022/02/why-becoming-a-data-driven-organization-is-so-hard

Where do organizations get help to overcome these challenges? Microsoft delivers what its clients need to help them grow their top line with cloud-based analytics. Microsoft’s cloud-based analytics products and services propel business insights, innovation, and value from enterprise data, with all of the dimensions of analytics applications brought into the game. Specifically, cloud analytics (accessing and inferencing on multiple diverse business datasets across business units) for a wide variety of enterprise applications can sharpen the workforce’s focus on value and growth, including: forward-looking insights through predictive, sentinel, and precursor analytics; novel recommendations; rich customer engagement; analytic product innovation; resilience through prescriptive analytics; surprise discovery in data, asking the right questions, and exploring the most insightful lines of inquiry through cognitive analytics; and more.

Microsoft Azure Cloud extends ease-of-access analytics to all, delivers increased speed to deployment, and provides leading security, compliance, and governance – with price performance for any organization. Whether organizations are seeking scalability in their enterprise data systems, advanced analytics capabilities (including the “big 4” and the “new 3”), real-time analytics (essential value-drivers from streaming data, including IoT, network logs, online customer interactions, supply chain, etc.), or the best in machine learning model-building and deployment services, Microsoft Azure Cloud has you covered. To learn more about it, go to https://azure.microsoft.com/en-us/solutions/cloud-scale-analytics and bring actionable business analytics to higher levels of proficiency and productivity across your organization.

My top learning moments at Splunk .conf23

https://www.splunk.com/en_us/products.html

I recently attended the Splunk .conf23 conference in Las Vegas. Well, the conference was in Vegas, while I was far away in my home office watching the live conference keynote sessions online. Despite the thousands of miles (and kilometers) of separation, I could feel the excitement in the room as numerous announcements were made, individuals were honored, customer success stories were presented, and new solutions and product features were revealed. I will summarize many of those here (specifically my major learning moments), though this report cannot provide a complete review of all that was said at .conf23, primarily because I attended only the two main keynote sessions, but also because the phenomenal number of remarkable things to hear and learn in those sessions exceeded my capacity to record them all in one succinct report.

When I reviewed highlights from last year’s Splunk .conf22 conference in my summary report at that time, I focused a lot on the Splunk Observability Cloud and its incredible suite of Observability and Monitoring products and services. This reflected my strong interest in observability at that time. My strong interest hasn’t diminished, and neither have Splunk’s developments and product releases in that space, as seen in observability’s prominent mention within many of Splunk’s announcements at this year’s .conf23 event. For a detailed report on the current state of observability this year, you can access and download “The State of Observability 2023” report from Splunk. Here are four specific metrics from the report, highlighting the potentially huge enterprise system benefits coming from implementing Splunk’s observability and monitoring products and services:

Four times as many leaders who implement observability strategies resolve unplanned downtime in just minutes, not hours or days.
Leaders report one-third the number of outages per year, on average, compared to those organizations who do not implement observability and monitoring.
Leaders are 7.9x as likely to say that their ROI on observability tools far exceeded expectations.
89% of leaders are completely confident in their ability to meet their application availability and performance requirements, versus just 43% of others.

Here are my top learnings from .conf23:

Splunk announced a new strategic partnership with Microsoft Azure, thereby adding another major cloud provider to their other cloud provider partnerships, bringing Splunk products and services into more enterprises through the Azure Marketplace. This partnership also specifically extends hybrid cloud capabilities that will enhance organizations’ digital resilience, while enabling transformation, modernization, migration, and growth in all enterprise digital systems with confidence, trust, and security.
Digital resilience was a major common theme across all of the Splunk announcements this week. As I heard someone say in the keynote session, “You had me at resilience!” By providing real-time data insights into all aspects of business and IT operations, Splunk’s comprehensive visibility and observability offerings enhance digital resilience across the full enterprise. Organizations are able to monitor integrity, quality drift, performance trends, real-time demand, SLA (service level agreement) compliance metrics, and anomalous behaviors (in devices, applications, and networks) to provide timely alerting, early warnings, and other confidence measures. From these data streams, real-time actionable insights can feed decision-making and risk mitigations at the moment of need. Such prescriptive capabilities can be more proactive, automated, and optimized, making digital resilience an objective fact for businesses, not just a business objective. I call that “digital resilience for the win!”
Several Splunk customer success stories were presented, with interesting details of their enterprise systems, the “back stories” that led them to Splunk, the transformations that have occurred since Splunk integration, and the metrics to back up the success stories. Customers presenting at .conf23 included FedEx, Carnival Corporation & plc, Inter IKEA, and VMware. Here are a few of the customer performance metrics presented (measuring performance changes following the Splunk integration into the customers’ enterprise systems): 3X Faster Response Time, 90% Faster Mean Time to Remediation, and 60X Faster Insights.
Splunk has brought greater integration and customer ease-of-use of their offerings through a Unified Security and Observability Platform. This unified operations center (Splunk Mission Control) amplifies the efficiency (time to problem resolution) and effectiveness (number of time-critical problems resolved) of ITOps and DevOps teams, requiring fewer manual steps in correlating data streams from multiple systems in order to determine the root cause of an incident. Further enhancing the user experience, the unified platform provides end-to-end enterprise systems visibility and federated search across those systems.
Splunk Mission Control (just mentioned above) – Splunk describes it best: “Splunk Mission Control brings together Splunk’s industry-leading security technologies that help customers take control of their detection, investigation and response processes. Splunk’s security offerings include security analytics (Splunk Enterprise Security), automation and orchestration (Splunk SOAR), and threat intelligence capabilities. In addition, Splunk Mission Control offers simplified security workflows with processes codified into response templates. With Splunk Mission Control, security teams can focus on mission-critical objectives, and adopt more proactive, nimble security operations.”
Model-Assisted Threat Hunts, also known as Splunk M-ATH, is Splunk’s brand name for machine learning-assisted threat hunting and mitigation. M-ATH is part of the PEAK (Prepare, Execute, and Act with Knowledge) Framework, which consists of three types of hunts: (1) Hypothesis-driven (i.e., testing for hypothesized threats, behaviors, and activities), (2) Baseline (i.e., search for deviations from normal behaviors through EDA: Exploratory Data Analysis), and (3) M-ATH (i.e., automation of the first two types of hunts, using AI and machine learning). M-ATH includes ML-assisted adaptive thresholding and outlier handling, for improved alerts (i.e., faster alerting with fewer false positives and false negatives).
“Don’t be a SOAR loser!” Okay, that’s what one of the Splunk speakers said at .conf23. By that, he was referring to being a winner with Splunk SOAR: Security Orchestration, Automation and Response. SOAR orchestrates, prioritizes, and automates security team (SecOps) workflows and tasks, enabling more accurate, more complete, smarter, and faster response to security incidents. As Splunk says, “Automate so you can innovate.” Isn’t that always a business truth? If you can free your analyst teams to think outside the box, hypothesize, innovate, and test new methods and solutions, then that is the sure path to being a SAFE (Security Analytics For the Enterprise) winner: soar with SOAR! While SIEM (Security Information and Event Management) aims to manage the data flows, logging, audits, alerted events, and incident responses, SOAR automates these security activities (using machine learning and AI), monitors the data and events for anomalous behaviors, classifies (prioritizes) the events, and then orchestrates optimized security operations and incident responses (using playbooks).
Saving my best two .conf23 learning moments for last, first up is Splunk Edge Hub. This is a physical device, in the IoT (Internet of Things) family of sensors, that collects and streams data from the edge (i.e., from edge devices, cameras, streaming data sources, monitoring systems, and sensors of all types) into Splunk systems that go to work on those data: security operations, anomaly detection, event classification, trend detection, drift detection, behavior detection, and any other edge application that requires monitoring and observability, with an injection of machine learning and AI for intelligent data understanding, classification, prioritization, optimization, and automation. Since business thrives at the edge (through insights discovery and actionable analytics at the point—time and place—of data collection), an edge hub is just what a business needs to mitigate risk, ensure visibility, escalate incidents for review, optimize the operational response, and monitor the associated activities (causes and effects).
Splunk AI Assistant – Boom! This is the brilliant and innovative introduction of an AI assistant into Splunk products, services, and user workflows. This includes the latest and best of AI — generative AI and natural language interfaces integrated within the Splunk platform. This product release most definitely enables and “catalyzes digital resilience in cybersecurity and observability.” This is not just a product release. It is a “way of life” and “a way of doing business” with Splunk products and services. AI is not just a tacked-on feature, but it is a fundamental characteristic and property of those products’ features. Splunk AI increases productivity, efficiency, effectiveness, accuracy, completeness, reliability, and (yes!) resilience across all enterprise SecOps, ITOps, and AIOps functions, tasks, and workflows that are powered by Splunk. Generative AI enables the Splunk SecOps and ITOps tasks, workflows, processes, insights, alerts, and recommended actions to be domain-specific and customer-specific. It automatically detects anomalies and focuses attention where it’s needed most, for that business in that domain, while providing full control and transparency on which data and how data are used to train the AI, and how much control is assigned to the AI (by maintaining “human in the loop” functionality). With regard to the natural language features, Splunk AI Assistant leverages generative AI to provide an interactive chat experience and helps users create SPL (Splunk Processing Language) queries using natural language. This feature not only improves time-to-value, but it “helps make SPL more accessible, further democratizing an organization’s access to, and insights from, its data” – and that includes automated recommendations to the user for “next best action”, which is a great learning prompt for new Splunk users and SecOps beginners.
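To make the "adaptive thresholding and outlier handling" idea from the M-ATH item above a little more concrete, here is a minimal sketch in plain Python. It is not Splunk's implementation: the function name `adaptive_outliers`, the rolling window, and the cutoff `k` are all my own illustrative assumptions. The idea is simply that the alert threshold follows the recent median and spread of the data instead of being a fixed number.

```python
from statistics import median

def adaptive_outliers(values, window=20, k=3.5):
    """Flag points far from the rolling median, measured in robust sigmas.

    Hypothetical helper for illustration only (not a Splunk API).
    """
    flags = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 5:
            # Not enough context yet to set a threshold.
            flags.append(False)
            continue
        med = median(history)
        # Median absolute deviation: a spread estimate that outliers barely move.
        mad = median(abs(h - med) for h in history)
        # 1.4826 rescales MAD to match a standard deviation for normal data;
        # the fallback avoids dividing by zero on a perfectly flat history.
        scale = 1.4826 * mad or 1e-9
        flags.append(abs(x - med) / scale > k)
    return flags

readings = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 50, 11]
flags = adaptive_outliers(readings)
print([i for i, flagged in enumerate(flags) if flagged])
```

Because the threshold is recomputed from recent history, a metric that is normally noisy needs a bigger jump to trigger an alert than a metric that is normally quiet, which is one way to cut down on false positives.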

For a peek into my peak real-time experiences at .conf23, see my #splunkconf23 social thread on Twitter at https://bit.ly/3DjI5NU. Actually, go there and explore, because there is so much more to see there than I could cover in this one report.

Closing thoughts – AI (particularly generative AI) has been the hottest tech topic of the year, and Splunk .conf23 did not disappoint in their coverage of this topic. The agendas for some events are filled with generic descriptions that sing the praises of generative AI. This Splunk event .conf23 provided something far more beneficial and practical: they presented demonstrably valuable business applications of generative AI embedded in Splunk products, which deliver a convincing Splunk-specific productivity enhancer for new and existing users of Splunk products. When the tech hype train is moving as fast as it has been this year, it is hard for a business to quickly innovate, incorporate, and deliver substantially new features that use the new tech within their legacy products and services, but Splunk has done so, with top marks for those achievements.

Disclaimer: I was compensated as an independent freelance media influencer for my participation at the conference and for this article. The opinions expressed here are entirely my own and do not represent those of Splunk or of any Splunk partners. Any misrepresentations of the products and services mentioned in my statements are entirely my own responsibility. Nothing here should be construed as an offer to sell or as financial advice of any kind. My comments are entirely of a technical nature, focused on the technical capabilities of the items mentioned in the article.

https://twitter.com/KirkDBorne/status/1681108696812617730

Three Emerging Analytics Products Derived from Value-driven Data Innovation and Insights Discovery in the Enterprise

I recently saw an informal online survey that asked users which types of data (tabular, text, images, or “other”) are being used in their organization’s analytics applications. This was not a scientific or statistically robust survey, so the results are not necessarily reliable, but they are interesting and provocative. The results showed that (among those surveyed) approximately 90% of enterprise analytics applications are being built on tabular data. The ease with which such structured data can be stored, understood, indexed, searched, accessed, and incorporated into business models could explain this high percentage. A similarly high percentage of tabular data usage among data scientists was mentioned here.

If my explanation above is the correct interpretation of the high percentage, and if the statement refers to successfully deployed applications (i.e., analytics products, in contrast to non-deployed training experiments, demos, and internal validations of the applications), then maybe we would not be surprised if a new survey (not yet conducted) were to reveal that a similar percentage of value-producing enterprise data innovation and analytics/ML/AI applications (hereafter, “analytics products”) are based on on-premises (on-prem) data sources. Why? … because the same productivity benefits mentioned above for tabular data sources (fast and easy data access) would also be applicable in these cases (on-prem data sources). And no one could deny that these benefits would be substantial. What could be faster and easier than on-prem enterprise data sources?

Accompanying the massive growth in sensor data (from ubiquitous IoT devices, including location-based and time-based streaming data), there have emerged some special analytics products that are growing in significance, especially in the context of innovation and insights discovery from on-prem enterprise data sources. These enterprise analytics products are related to traditional predictive and prescriptive analytics, but these emergent products may specifically require low-latency (on-prem) data delivery to support enterprise requirements for timely, low-latency analytics product delivery. These three emergent analytics products are:

(a) Sentinel Analytics – focused on monitoring (“keeping an eye on”) multiple enterprise systems and business processes, as part of an observability strategy for time-critical business insights discovery and value creation from enterprise data sources. For example, sensors can monitor and “watch” systems and processes for emergent trends, patterns, anomalies, behaviors, and early warning signs that require interventions. Monitoring of data sources can include online web usage actions, streaming IT system patterns, system-generated log files, customer behaviors, environmental (ESG) factors, energy usage, supply chain, logistics, social and news trends, and social media sentiment. Observability represents the business strategy behind the monitoring activities. The strategy addresses the “what, when, where, why, and how” questions from business leaders concerning the placement of “sensors” that are used to collect the essential data that power the sentinel analytics product, in order to generate timely insights and thereby enable better data-informed “just in time” business decisions.

(b) Precursor Analytics – the use of AI and machine learning to identify, evaluate, and generate critical early-warning alerts in enterprise systems and business processes, using high-variety data sources to minimize false alarms (i.e., using high-dimensional data feature space to disambiguate events that seem to be similar, but are not). Precursor analytics is related to sentinel analytics. The latter is associated primarily with “watching” the data for interesting patterns, while precursor analytics is associated primarily with training the business systems to quickly identify those specific patterns and events that could be associated with high-risk events, thus requiring timely attention, intervention, and remediation. One could say that sentinel analytics is more like unsupervised machine learning, while precursor analytics is more like supervised machine learning. That is not a totally clear separation and distinction, but it might help to clarify their different applications of data science. Data scientists work with business users to define and learn the rules by which precursor analytics models produce high-accuracy early warnings. For example, an exploration of historical data may reveal that an increase in customer satisfaction (or dissatisfaction) with one particular product is correlated with some other satisfaction (or dissatisfaction) metric downstream at a later date. Consequently, based on this learning, deploying a precursor analytics product to detect the initial trigger event early can thus enable a timely response to the situation, which can produce a positive business outcome and prevent an otherwise certain negative outcome.

(c) Cognitive Analytics – focused on “surprise” discovery in diverse data streams across numerous enterprise systems and business processes, using machine learning and data science to emulate and automate the curiosity and cognitive abilities of humans – enabling the discovery of novel, interesting, unexpected, and potentially business-relevant signals across all enterprise data streams. These may not be high risk. They might actually be high-reward discoveries. For example, in one company, an employee noticed that it was the customer’s birthday during their interaction and offered a small gift to the customer at that moment—a gift that was pre-authorized by upper management because they understood that their employees are customer-facing and they anticipated that their employees would need to have the authority to take such customer-pleasing actions “in the moment”. The outcome was very positive indeed, as this customer reported the delightful experience on their social media account, thereby spreading positive sentiment about the business to a wide audience. Instead of relying on employees to catch all surprises in the data streams, the enterprise analytics applications can be trained to automatically watch for, identify, and act on these surprises. In the customer birthday example, the cognitive analytics product can be set up for automated detection and response, which can occur without the employee in the loop at all, such as in a customer’s online shopping experience or in a chat with the customer call center bot.

These three analytics products are derived from business value-driven data innovation and insights discovery in the enterprise. Investigating and deploying these are a worthy strategic move for any organization that is swimming in a sea (or lake or ocean) of on-prem enterprise data sources.
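The precursor analytics example above (one satisfaction metric that is correlated with another metric "downstream at a later date") can be explored with a simple lagged correlation. The sketch below is only an illustration under my own assumptions: the function names `pearson` and `best_lead` and the toy data are hypothetical, and a real precursor model would need far more validation than a single correlation score.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def best_lead(leading, lagging, max_lag):
    """Return (lag, correlation) for the lag at which leading[t]
    is most strongly correlated with lagging[t + lag]."""
    scores = {}
    for lag in range(1, max_lag + 1):
        # Pair each early value with the value `lag` steps later.
        scores[lag] = pearson(leading[:-lag], lagging[lag:])
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]

# Toy data: the downstream metric repeats the early metric 3 periods later.
early_metric = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10, 12, 11]
downstream_metric = [0, 0, 0] + early_metric[:-3]
print(best_lead(early_metric, downstream_metric, max_lag=5))
```

If a stable lead time shows up in historical data, the earlier metric becomes a candidate trigger for an early-warning alert. Correlation alone does not prove that the trigger causes the later event, so the business users still need to confirm that the relationship makes sense.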

In closing, let us look at some non-enterprise examples of these three types of analytics:

Sentinel – the sentinel on the guard station at a military post is charged with watching for incoming activity. They are assigned this duty just in case something occurs during the night or when everyone else is busy with other operational things. That “something” might be an enemy approaching or a wild bear in the forest. In either case, keeping an eye on the situation is critical for the success of the operation. Another example of a sentinel is a marked increase in the volatility of stock market prices, indicating that there may be a lot of FUD (fear, uncertainty, and doubt) in the market that could lead to wild swings or downturns. In fact, anytime that any streaming data monitoring metric shows higher than usual volatility, this may be an indicator that the monitored thing requires some attention, an investigation, and possibly an intervention.
Precursor – prior to large earthquakes, it has been found that increased levels of radon are detected in soil, in groundwater, and even in the air in people’s home basements. This precursor is presumed to be caused by the radon being released from cavities within the Earth’s crust as the crust is being strained prior to the sudden slippage (the earthquake). Earthquakes themselves can be precursors to serious events – specifically, a large earthquake detected at the bottom of the ocean can produce a massive tsunami (popularly called a tidal wave), which can travel across the ocean and have drastic consequences on distant shores. In some cases, the precursor can occur sufficiently in advance of the tsunami’s predicted arrival at inhabited shores, thereby enabling early warnings to be broadcast. In both of these cases, the precursor (radon release or ocean-based earthquake) is not the biggest problem, though it may be seen as a sentinel of an ongoing event. Rather, the precursor is an early warning sign of a potentially bigger catastrophe that’s coming (a major land-based earthquake or a tsunami hitting major population centers along coastlines, respectively).
Cognitive – a cognitive person walking into an intense group meeting (perhaps a family or board meeting) can probably tell the mood of the room fairly quickly. The signals are there, though mostly contextual, thus probably missed by a cognitively impaired person. A cognitive person is curious about odd things that they see and hear: things or circumstances or behaviors that seem out of context, unusual, and surprising. The thing itself (or the data about the thing) may not be surprising (though it could be), but the context (the “metadata”, which is “other data about the primary data”) provides a signal that something needs attention here. Perhaps the simplest expression of being cognitive in this data-drenched world comes from a quote attributed to famous science writer Isaac Asimov: “The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ (I found it!) but ‘That’s funny…’.”
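The sentinel example above says that higher-than-usual volatility in any monitored metric may be a signal that needs attention. Here is one minimal way to sketch that in Python. The function name `volatility_alerts`, the window size, and the 2x ratio are my own assumptions for illustration, not a recommended production setting.

```python
from statistics import pstdev

def volatility_alerts(series, window=5, ratio=2.0):
    """Return indices where the volatility of the most recent window
    exceeds `ratio` times the volatility of the window just before it."""
    alerts = []
    for end in range(2 * window, len(series) + 1):
        baseline = pstdev(series[end - 2 * window:end - window])
        recent = pstdev(series[end - window:end])
        # The floor on the baseline avoids a meaningless comparison
        # when the earlier window was perfectly flat.
        if recent > ratio * max(baseline, 1e-9):
            alerts.append(end - 1)
    return alerts

calm = [100, 101, 100, 99, 100, 101, 100, 99, 100, 101]
wild = [110, 90, 115, 85, 120]
print(volatility_alerts(calm + wild))
```

The sentinel here does not know why the metric became volatile. Its only job is to notice the change and raise a flag so that someone (or some downstream model) can investigate.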

The cognitive enterprise versus the cognitively impaired enterprise – which of these would your organization prefer to be? Get moving now with sentinel, precursor, and cognitive analytics through data innovation and insights discovery with your on-prem enterprise data sources.

Read more about analytics innovation from on-prem enterprise data sources in this 3-part blog series:

Source: https://www.azquotes.com/author/605-Isaac_Asimov

Low-Latency Data Delivery and Analytics Product Delivery for Business Innovation and Enterprise AI Readiness

This article has been divided into 2 parts now:

Read other articles in this series on the importance of low-latency enterprise data infrastructure for business analytics:

Other related articles on the importance of data infrastructure for enterprise AI initiatives:

https://blog.purestorage.com/solutions/ai-breakthrough-awards-best-ai-solution-airi-s/

The Data Space-Time Continuum for Analytics Innovation and Business Growth

We discussed in another article the key role of enterprise data infrastructure in enabling a culture of data democratization, data analytics at the speed of business questions, analytics innovation, and business value creation from those innovative data analytics solutions. Now, we drill down into some of the special characteristics of data and enterprise data infrastructure that ignite analytics innovation.

First, a little history – years ago, at the dawn of the big data age, there was frequent talk of the three V’s of big data (data’s three biggest challenges): volume, velocity, and variety. Though those discussions are now considered “ancient history” in the current AI-dominated era, the challenges have not vanished. In fact, they have grown in importance and impact.

While massive data volumes appear less frequently now in strategic discussions and are being tamed with excellent data infrastructure solutions from Pure Storage, the data velocity and data variety challenges remain in their own unique “sweet spot” of business data strategy conversations. We addressed the data velocity challenges and solutions in our previous article: “Solving the Data Daze – Analytics at the Speed of Business Questions”. We will now take a look at the data variety challenge, and then we will return to modern enterprise data infrastructure solutions for handling all big data challenges.

Okay, data variety—what is there about data variety that makes it such a big analytics challenge? This challenge often manifests itself when business executives ask a question like this: “what value and advantages will all that diversity in data sources, venues, platforms, modalities, and dimensions actually deliver for us in order to outweigh the immense challenges that high data variety brings to our enterprise data team?”

Because nearly all organizations collect many types of data from many different sources for many business use cases, applications, apps, and development activities, nearly every organization is facing this dilemma.

[continue reading the full article here]

Solving the Data Daze – Analytics at the Speed of Business Questions

Data is more than just another digital asset of the modern enterprise. It is an essential asset. And data is now a fundamental feature of any successful organization. Beyond the early days of data collection, where data was acquired primarily to measure what had happened (descriptive) or why something is happening (diagnostic), data collection now drives predictive models (forecasting the future) and prescriptive models (optimizing for “a better future”). Business leaders need more than backward-looking reports, though those are still required for some stakeholders and regulators. Leaders now require forward-looking insights for competitive market advantage and advancement.

So, what happens when the data flows are not quarterly, or monthly, or even daily, but streaming in real-time? The business challenges then become manifold: talent and technologies now must be harnessed, choreographed, and synchronized to keep up with the data flows that carry and encode essential insights flowing through business processes at light speed. Insights discovery (powered by analytics, data science, and machine learning) drives next-best decisions, next-best actions, and business process automation.

In the early days of the current data analytics revolution, one would often hear business owners say that they need their data to move at the speed of business. Well, it soon became clear that the real problem was the reverse: how can we have our business move at the speed of our data? Fortunately, countless innovative products and services in the data analytics world have helped organizations in that regard, through an explosion in innovation around data analytics, data science, data storytelling, data-driven decision support, talent development, automation, and AI (including the technologies associated with machine learning, deep learning, generative AI, and ChatGPT).

[continue reading the full article here]

Source: https://www.purestorage.com/knowledge/what-is-hybrid-cloud.html

SAP Datasphere Powers Business at the Speed of Data

We live in a data-rich, insights-rich, and content-rich world. Data collections are the ones and zeroes that encode the actionable insights (patterns, trends, relationships) that we seek to extract from our data through machine learning and data science. The insights are used to produce informative content for stakeholders (decision-makers, business users, and clients). Content includes reports, documents, articles, presentations, visualizations, video, and audio representations of the insights and knowledge that have been extracted from data.

We could further refine our opening statement to say that our business users are too often in a state of being data-rich, but insights-poor, and content-hungry. With all the data in and around the enterprise, users would say that they have a lot of information but need more insights to assist them in producing better and more informative content. This is where we dispel an old “big data” notion (heard a decade ago) that was expressed like this: “we need our data to run at the speed of business.” Instead, what we really need is for our business to run at the speed of data. It is a major digital transformation challenge for businesses to keep up with data flows coming from a multitude of diverse sources, in different formats, at different cadences, on different dimensions of the enterprise, captured “safely” in different business silos. Transforming data to actionable insights and informative content needs some help!

AI is now helping in all these steps – not simply because it is “Artificial” intelligence, but primarily because AI is a tool for assisted, amplified, and augmented intelligence (the “new AI”) and because AI powers accelerated and automated intelligence, in order to deliver actionable intelligence. It appears that it’s AI everywhere all the time.

AI can help business users extract and produce (i.e., generate) informative content from insights. Plus, AI can also help find key insights encoded in data. And AI can help users find the appropriate data that they need from across the enterprise. In the language of Information Retrieval, AI delivers both high Recall (“did I get all the data that I need?”) and high Precision (“did I get only the data that I need?”).
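For readers who want the Recall and Precision questions above as numbers, here is a small worked sketch in Python. The data set names and the function name `precision_recall` are hypothetical, chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Precision: what fraction of the retrieved items were needed?
    Recall: what fraction of the needed items were retrieved?"""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# The search returned four data sets, but the user needed three specific ones.
retrieved = {"sales_q1", "sales_q2", "hr_roster", "web_logs"}
relevant = {"sales_q1", "sales_q2", "sales_q3"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case, two of the four returned data sets were needed (precision of 0.5), and two of the three needed data sets were found (recall of about 0.67). A data discovery tool that improves both numbers saves the user from wading through irrelevant data and from missing essential data.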

Discover the essential data – that’s AI.

Extract the essential insights from the data – that’s AI.

Produce essential content from the insights – that’s AI.

Well, okay, we can slap the “AI” label on everything, but what good is that? How are we helped when we board the AI hype train? In fact, by putting a single label like AI on all the steps of a data-driven business process, we have effectively not only blurred the process, but we have also blurred the particular characteristics that make each step separately distinct, uniquely critical, and ultimately dependent on specialized, specific technologies and business domain expertise at each step. This is where SAP Datasphere (the next generation of SAP Data Warehouse Cloud) comes in.

The new SAP Datasphere comprehensive data service provides powerful, seamless, and scalable access across the enterprise, across business departments, and across business silos to the specific mission-critical business data collection(s) that are needed for each unique business use case: access to external insights from data on the marketplace and competitors, access to internal insights from data on business processes and enterprise resources, and access to insights on customer-facing business products and services at the intersection of internal and external data sources.

So, if your business users don’t have access to the right data in the right context at the right time for the right business questions, then the whole business data workflow breaks down. SAP Datasphere has arrived to address those pain points, by enabling discovery, access, and integration of the heterogeneous data distributed across the enterprise. And so begins the process of insight discovery and content creation that meets the most significant, timely, and case-specific needs of decision-makers, business knowledge workers, and other stakeholders.

SAP Datasphere was launched and announced globally on March 8, 2023. Live online presentations, demos, and customer testimonials were complemented with new content posted at sap.com/datasphere. Here are just 10 of the many key features of Datasphere that were covered during the launch day announcements:

Datasphere works with the SAP Analytics Cloud and runs on the existing SAP BTP (Business Technology Platform), with all the essential features: security, access control, high availability.
Datasphere accesses and integrates both SAP and non-SAP data sources into end-users’ data flows, including on-prem data warehouses, cloud data warehouses and lakehouses, relational databases, virtual data products, in-memory data, and applications that generate data (such as external API data loads).
Datasphere manages and integrates structured, semi-structured, and unstructured data types.
Datasphere empowers data democratization, by providing all business users with self-service data access, including virtual data products that can be stored, re-used, and shared.
Datasphere is a data discovery tool with essential functionalities: recommendations, data marketplace, and business content (i.e., incorporates the business context of the data and data products that are being recommended and delivered).
Datasphere goes beyond the “big three” data usage end-user requirements (ease of discovery, access, and delivery) to include data orchestration (data ops and data transformations) and business data contextualization (semantics, metadata, catalog services).
Datasphere is an enhanced data warehousing service that includes business semantics (through both analytic and relational models) and a knowledge graph (linking business content with business context).
Datasphere provides full-spectrum data governance: metadata management, data catalogs, data privacy, data quality, and data lineage (provenance) tracking.
Datasphere provides all the outgoing data orchestration functions and incoming data ingestion functions, including replication, federation, real-time stream processing, and application integration.
Datasphere is not just for data managers. It thrives with data consumers, who are doing planning, analytics, data science, and developing intelligent data applications – by providing those users with an end-to-end view of their data landscape in a trusted, secure, and actionable data environment.

Source: https://news.sap.com/2023/03/sap-datasphere-business-data-fabric/

SAP also announced key partners that further enhance Datasphere as a powerful business data fabric. These partners are:

Collibra – providing data governance and discovery (metadata, catalogs) across the entire data landscape.
Confluent – providing access and discovery across real-time event data and streaming data. This emphatically addresses the “data in motion” challenge of enabling “business to run at the speed of data.”
Databricks – providing the complete business context across the evolved Data Warehouse Cloud – the new Data Lakehouse platform.
DataRobot – providing the AI, machine learning (ML), and AutoML capabilities that address the augmented intelligence requirements described at the beginning of this article.

I will finish with three quotes. The first is an SAP customer testimonial from Mr. David Johnston, the Chief Information Officer at Messer Americas (a leading provider of industrial and medical gases for over 120 years):

“The [Datasphere] business data fabric architecture enables us to bring SAP and non-SAP data together in the seamless and self-service way we’ve been envisioning. SAP Datasphere provided us with a solution to build a harmonized layer, or business data fabric, across SAP and non-SAP, cloud or on-premise data sources, making the best use of our existing investments [in both SAP and non-SAP data services].”

The second quote is from independent analyst Tony Baer:

“SAP’s goal is not simply pairing a data transformation factory with a data warehouse, but instead delivering a service that preserves the context of source data. As you would guess, maintaining context relies on metadata. The challenge is that when you use existing tools for replicating, moving and transforming data, the metadata typically does not usually go along with it. … SAP’s applications are a rich treasure store for business data and the process semantics that go with them. So, it’s logical that SAP has expanded on the business semantic layer of its Data Warehouse Cloud to deliver a data fabric that surfaces the metadata in business terms.”

The third quote is from Juergen Mueller, SAP Chief Technology Officer and member of the Executive Board of SAP SE:

“With SAP customers generating 87% of total global commerce, SAP data is among a company’s most valuable business assets and is contained in the most important functions of an organization, from manufacturing to supply chains, finance, human resources and more. We want to help our customers take the next step to easily and confidently integrate SAP data with non-SAP data from third-party applications and platforms, unlocking entirely new insights and knowledge to bring digital transformation to another level.”

Read, take a tour, try the free tier, deep dive, and learn more about SAP Datasphere here:

https://news.sap.com/2023/03/sap-datasphere-power-of-business-data/

Isn’t it now the time to accelerate your digital transformation efforts and get your business moving at the speed of data?

——————

This article has also been published at https://www.linkedin.com/pulse/sap-datasphere-powers-business-speed-data-kirk-borne-ph-d-/

Follow me on Twitter at @KirkDBorne

My top learning and pondering moments at Splunk .conf22

I recently attended the Splunk .conf22 conference. While the event was live in-person in Las Vegas, I attended virtually from my home office. Consequently, I missed the incredible in-person experience of the brilliant speakers on the main stage, the technodazzle of hundreds of exhibitors’ offerings in the exhibit arena, and the smooth hip hop sounds from the special guest entertainer (guess who?).

What I missed in-person was more than compensated for by the incredible online presentations by Splunk leaders, developers, and customers. If you have ever attended a major expo at one of the major Vegas hotels, you know that there is a lot of walking between different sessions — literally, miles of walking per day. That’s good for you, but it often means that you don’t attend all of the sessions that you would like because of the requisite rushing from venue to venue. None of that was necessary on the Splunk .conf22 virtual conference platform. I was able to see a lot, learn a lot, be impressed a lot, and ponder a lot about all of the wonderful features, functionalities, and future plans for the Splunk platform.

One of the first major attractions of this event for me is found in the primary descriptor of the Splunk Platform: it is appropriately called the Splunk Observability Cloud, which includes an impressive suite of Observability and Monitoring products and services. I have written and spoken frequently and passionately about Observability in the past couple of years. For example, I wrote this in 2021:

“Observability emerged as one of the hottest and (for me) most exciting developments of the year. Do not confuse observability with monitoring (specifically, with IT monitoring). The key difference is this: monitoring is what you do, and observability is why you do it. Observability is a business strategy: what you monitor, why you monitor it, what you intend to learn from it, how it will be used, and how it will contribute to business objectives and mission success. But the power, value, and imperative of observability does not stop there. Observability meets AI – it is part of the complete AIOps package: ‘keeping an eye on the AI.’ Observability delivers actionable insights, context-enriched data sets, early warning alert generation, root cause visibility, active performance monitoring, predictive and prescriptive incident management, real-time operational deviation detection (6-Sigma never had it so good!), tight coupling of cyber-physical systems, digital twinning of almost anything in the enterprise, and more. And the goodness doesn’t stop there.”
Continue reading my thoughts on Observability at http://rocketdatascience.org/?p=1589

The dominant references everywhere to Observability were just the start of the awesome brain food offered at Splunk’s .conf22 event. Here is a list of my top moments, learnings, and musings from this year’s Splunk .conf:

Observability for Unified Security with AI (Artificial Intelligence) and Machine Learning on the Splunk platform empowers enterprises to operationalize data for use-case-specific functionality across shared datasets. (Reference)
The latest updates to the Splunk platform address the complexities of multi-cloud and hybrid environments, enabling cybersecurity and network big data functions (e.g., log analytics and anomaly detection) across distributed data sources and diverse enterprise IT infrastructure resources. (Reference)
Splunk Enterprise 9.0 is here, now! Explore and test-drive it (with a free trial) here.
The new Splunk Enterprise 9.0 release enables DevSecOps users to gain more insights from Observability data with Federated Search, with the ability to correlate ops with security alerts, and with Edge Management, all in one platform. (Reference)
Security information and event management (SIEM) on the Splunk platform is enhanced with end-to-end visibility and platform extensibility, with machine learning and automation (AIOps), with risk-based alerting, and with Federated Search (i.e., Observability on-demand). (Reference)
Customer success story: As a customer-obsessed bank with ultra-rapid growth, Nubank turned to Splunk to optimize data flows, analytics applications, customer support functions, and insights-obsessed IT monitoring. (Reference)
The key characteristics of the Splunk Observability Cloud are Resilience, Security, Scalability, and EXTENSIBILITY. The latter specifically refers to the ease with which developers can extend Splunk’s capabilities to other apps, applying their AIOps and DevSecOps best practices and principles! Developers can start here.
The Splunk Observability Cloud has many functions for data-intensive IT, Security, and Network operations, including Anomaly Detection Service, Federated Search, Synthetic Monitoring, Incident Intelligence, and much more. Synthetic monitoring is essentially digital twinning of your network and IT environment, providing insights through simulated risks, attacks, and anomalies via predictive and prescriptive modeling. (Reference)
Splunk Observability Cloud’s Federated Search capability activates search and analytics regardless of where your data lives — on-site, in the cloud, or from a third party. (Reference)
The new release of the Splunk Data Manager provides a simple, modern, automated experience of data ingest for Splunk Cloud admins, which reduces the time it takes to configure data collection (from hours/days to minutes). (Reference)
Splunk works on data, data, data, but the focus is always on customer, customer, customer — because delivering best outcomes for customers is job #1. Explore Splunk’s amazing Partner ecosystem (Partnerverse) and the impressive catalog of partners’ solutions here.
Splunk .conf22 Invites Organizations to Unlock Innovation With Data.

In summary, here is my list of key words and topics that illustrate the diverse capabilities and value-packed features of the Splunk Observability Cloud Platform that I learned about at the .conf22 event:

– Anomaly Detection Assistant
– Risk-based Alerting (powered by AI and Machine Learning scoring algorithms)
– Federated Search (Observability on-demand)
– End-to-End Visibility
– Platform Extensibility
– Massive(!) Scalability of the Splunk Observability Cloud (to billions of transactions per day)
– Insights-obsessed Monitoring (“We don’t need more information. We need more insights.”)
– APIs in Action (to Turn Data into Doing™)
– Splunk Incident Intelligence
– Synthetic Monitoring (Digital Twin of Network/IT infrastructure)
– Splunk Data Manager
– The Splunk Partner Universe (Partnerverse)

My closing thought — Cybersecurity is basically Data Analytics: detection, prediction, prescription, and optimizing for unpredictability. This is what Splunk lives for!

Follow me on LinkedIn here and on Twitter at @KirkDBorne.

Splunk Cloud Platform Dashboard.
Source: https://www.splunk.com/en_us/products/splunk-cloud-platform.html

Data Insights for Everyone — The Semantic Layer to the Rescue

What is a semantic layer? That’s a good question, but let’s first explain semantics. The way that I explained it to my data science students years ago was like this. In the early days of web search engines, those engines were primarily keyword search engines. If you knew the right keywords to search and if the content providers also used the same keywords on their website, then you could type the words into your favorite search engine and find the content you needed. So, I asked my students what results they would expect from such a search engine if I typed the following words into the search box: “How many cows are there in Texas?” My students were smart. They realized that the search results would probably not provide an answer to my question, but the results would simply list websites that included my words on the page or in the metadata tags: “Texas”, “Cows”, “How”, etc. Then, I explained to my students that a semantic-enabled search engine (with a semantic meta-layer, including ontologies and similar semantic tools) would be able to interpret my question’s meaning and then map that meaning to websites that can answer the question.

This was a good opening for my students to the wonderful world of semantics. I brought them deeper into the world by pointing out how much more effective and efficient the data professionals’ life would be if our data repositories had a similar semantic meta-layer. We would be able to go far beyond searching for correctly spelled column headings in databases or specific keywords in data documentation, to find the data we needed (assuming we even knew the correct labels, metatags, and keywords used by the dataset creators). We could search for data with common business terminology, regardless of the specific choice or spelling of the data descriptors in the dataset. Even more than that, we could easily start discovering and integrating, on-the-fly, data from totally different datasets that used different descriptors. For example, if I am searching for customer sales numbers, different datasets may label that “sales”, or “revenue”, or “customer_sales”, or “Cust_sales”, or any number of other such unique identifiers. What a nightmare that would be! But what a dream the semantic layer becomes!
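To make the labeling problem concrete, here is a minimal Python sketch of the idea, not a description of any particular semantic layer product. The alias table `BUSINESS_TERMS`, the helper `resolve_column`, and the two sample datasets are all hypothetical and exist only for illustration. The sketch looks up a column by its business term rather than by the exact label that a dataset creator happened to choose.

```python
# Hypothetical alias table: each business term maps to the labels that
# different dataset creators might have used for the same concept.
BUSINESS_TERMS = {
    "customer sales": {"sales", "revenue", "customer_sales", "cust_sales"},
}


def resolve_column(business_term, columns):
    """Return the first column whose label is a known alias of the term."""
    aliases = BUSINESS_TERMS.get(business_term.lower(), set())
    for column in columns:
        if column.lower() in aliases:
            return column
    return None


# Two made-up datasets that label the same concept differently.
dataset_a = {"region": ["TX", "CA"], "Cust_sales": [120, 340]}
dataset_b = {"region": ["TX", "CA"], "revenue": [95, 410]}

for name, dataset in [("A", dataset_a), ("B", dataset_b)]:
    column = resolve_column("customer sales", dataset)
    print(name, column, dataset[column])
```

A real semantic layer does far more than a dictionary lookup (ontologies, governance, query translation), but the sketch shows why a shared vocabulary removes the need to know each dataset's spelling in advance.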

When I was teaching those students so many years ago, the semantic layer itself was just a dream. Now it is a reality. We can now achieve the benefits, efficiencies, and data superhero powers that we previously could only imagine. But wait! There’s more.

Perhaps the greatest achievement of the semantic layer is to provide different data professionals with easy access to the data needed for their specific roles and tasks. The semantic layer is the representation of data that helps different business end-users discover and access the right data efficiently, effectively, and effortlessly using common business terms. The data scientists need to find the right data as inputs for their models — they also need a place to write-back the outputs of their models to the data repository for other users to access. The BI (business intelligence) analysts need to find the right data for their visualization packages, business questions, and decision support tools — they also need the outputs from the data scientists’ models, such as forecasts, alerts, classifications, and more. The semantic layer achieves this by mapping heterogeneously labeled data into familiar business terms, providing a unified, consolidated view of data across the enterprise.

The semantic layer delivers data insights discovery and usability across the whole enterprise, with each business user empowered to use the terminology and tools that are specific to their role. How data are stored, labeled, and meta-tagged in the data cloud is no longer a bottleneck to discovery and access. The decision-makers and data science modelers can fluidly share inputs and outputs with one another, to inform their role-specific tasks and improve their effectiveness. The semantic layer takes the user-specific results out of being a “one-off” solution on that user’s laptop to becoming an enterprise analytics accelerant, enabling business answer discovery at the speed of business questions.

Insights discovery for everyone is achieved. The semantic layer becomes the arbiter (multi-lingual data translator) for insights discovery between and among all business users of data, within the tools that they are already using. The data science team may be focused on feature importance metrics, feature engineering, predictive modeling, model explainability, and model monitoring. The BI team may be focused on KPIs, forecasts, trends, and decision-support insights. The data science team needs to know and to use the data that the BI team considers most important. The BI team needs to know and to use the trends, patterns, segments, and anomalies that the data science team is finding in those data. Sharing and integrating such important data streams has never been such a dream.

The semantic layer bridges the gaps between the data cloud, the decision-makers, and the data science modelers. The key results from the data science modelers can be written back to the semantic layer, to be sent directly to consumers of those results in the executive suite and on the BI team. Data scientists can focus on their tools; the BI users and executives can focus on their tools; and the data engineers can focus on their tools. The enterprise data science, analytics, and BI functions have never been so enterprisey. (Is “enterprisey” a word? I don’t know, but I’m sure you get my semantic meaning.)
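The write-back flow described above can be sketched with a toy in-memory stand-in. Everything here is an assumption made for illustration: the `SemanticLayer` class, its `read` and `write_back` methods, the business terms, and the naive average-growth forecast are hypothetical and do not represent a vendor API or a recommended model.

```python
class SemanticLayer:
    """Toy in-memory stand-in for a semantic layer (hypothetical API)."""

    def __init__(self):
        self._store = {}  # business term -> list of values

    def write_back(self, business_term, values):
        """Publish data or model results under a shared business term."""
        self._store[business_term] = list(values)

    def read(self, business_term):
        """Fetch data by business term, regardless of who produced it."""
        return self._store[business_term]


layer = SemanticLayer()

# Data engineering publishes source data under a business term.
layer.write_back("monthly sales", [100, 110, 125])

# Data science reads the inputs, applies a naive average-growth
# forecast, and writes the result back for other users.
history = layer.read("monthly sales")
average_growth = (history[-1] - history[0]) / (len(history) - 1)
layer.write_back("sales forecast", [history[-1] + average_growth])

# BI reads the forecast by name, without knowing how it was computed.
print(layer.read("sales forecast"))
```

The point of the design is that each team touches only business terms: the BI user never needs to know which model, notebook, or table produced the forecast.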

That’s empowering. That’s data democratization. That’s insights democratization. That’s data fluency/literacy-building across the enterprise. That’s enterprise-wide agile curiosity, question-asking, hypothesizing, testing/experimenting, and continuous learning. That’s data insights for everyone.

Are you ready to learn more about how you can bring these advantages to your organization? Be sure to watch the AtScale webinar “How to Bridge Data Science and Business Intelligence,” where I join a panel in a multi-industry discussion on how a semantic layer can help organizations make smarter data-driven decisions at scale. I will be speaking about “Model Monitoring in the Enterprise — Filling the Gaps”, specifically focused on “Filling the Communication Gaps Between BI and Data Science Teams With a Semantic Data Layer.”

Rocket-Powered Data Science

Data Reflections by Dr. Kirk Borne @KirkDBorne
