Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export. The project's goal is to simplify the development of big data applications.
Building upon Spring Boot and Spring Cloud capabilities, Spring XD has been redesigned as Spring Cloud Data Flow, a cloud-native orchestration service for composable microservice applications on modern runtimes. For more information and the reasons behind this redesign, please refer to Spring Cloud Data Flow's launch blog.
Spring XD 1.3 GA will be the last release in the 1.x line. With this release, 1.x officially enters maintenance mode and will receive bug fixes only. Spring XD is scheduled for End-of-Life, End-of-Availability, and End-of-Support by July 2017.
For any new feature requests or improvements, please submit them on GitHub issues. We welcome your help and feedback!
Spring XD is a unified platform for a fragmented Hadoop ecosystem. It’s built on top of battle-tested open source projects, and dramatically simplifies orchestration of Big Data workloads and data pipelines.
Spring XD is built from the ground up to be adapted to your enterprise’s unique needs, not to dictate your technology choices. Extend it in any direction through open plug-in points for your existing technology investments, implemented with simple Java classes.
Developers new to Big Data can use a no-coding, configuration-driven tool to develop Spring XD applications. Java developers can also easily extend the platform or the DSL with familiar extensibility, testing, and automation tools inherited directly from Spring Batch and Spring Integration.
Data-driven apps require refined and consolidated data at scale. Spring XD’s stream and batch workflow lets you build pipelines to consume data from various endpoints and consolidate them in Hadoop, in-memory data grids such as Redis or GemFire, and virtually any data store.
Flexibility in distributing workloads across your existing cloud or on-prem hardware is key to maximizing ROI on hardware or IaaS spend. That’s why the Spring XD runtime is distributed, scalable, fault-tolerant, and highly available. It is instrumented to recover from failures, balance load, and scale dynamically on demand, all out of the box.
Spring XD provides PMML model scoring to compute predictions in real time. Apache Spark Streaming is an out-of-the-box processor module in Spring XD, and can be plugged in to perform online machine learning with the help of MLlib algorithms.
It’s easy to integrate data with Hadoop and other data stores such as Greenplum Database, HAWQ, or GemFire. The DSL (Domain Specific Language) requires no coding, and clients written in any programming language can interact with the server over REST.
Remote monitoring and management of the runtime components are supported via JMX endpoints. A built-in Admin UI allows visualization and remote management of containers in the distributed setup.
Spring XD runs anywhere Java does: on-prem, Pivotal Cloud Foundry, YARN, EC2, Mesos, Docker, and more. A plug-in based architecture lets Java and Hadoop experts extend the runtime components, and DSL users can leverage those extensions immediately.
Spring XD orchestrates the entire analytics loop - gathering data from any source, triggering actions, handling feedback loops from machine learning models, and computing real-time predictions.
Enable predictive analytics over large amounts of machine data in real time, driving business and operational improvements. Spring XD’s data-integration adapters connect with various data-producing devices and can be extended to support any unique device or protocol.
Traditional enterprise “Big Data” was often done with batch processing. Get productive by using out-of-the-box jobs as templates, avoiding the need to write code. Spring XD handles the infrastructure, environment specifics, and automation, allowing the enterprise to focus solely on business logic.
Spring XD provides integration with Project Reactor Streams, RxJava Observables, and Spark Streaming. Creating a data stream processor in XD allows you to use a functional programming model to filter, transform and aggregate data in a very concise and performant way. By working with events as you would with collections, Spring XD’s reactive-stream integration allows you to build complex event processors to respond to events in real-time.
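To make the "events as collections" idea concrete, here is a minimal sketch that uses plain Java streams rather than the Reactor, RxJava, or Spark integrations named above. The `Reading` type, the plausibility thresholds, and the method name are hypothetical and exist only for illustration. A real XD processor module would apply the same filter, transform, and aggregate steps to an unbounded stream instead of a finite list.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class EventPipelineSketch {

    // Hypothetical event type for illustration; not a Spring XD class.
    static final class Reading {
        final String sensor;
        final double value;

        Reading(String sensor, double value) {
            this.sensor = sensor;
            this.value = value;
        }
    }

    // Filter implausible Celsius readings, transform to Fahrenheit,
    // then aggregate into an average per sensor.
    static Map<String, Double> averageFahrenheitBySensor(List<Reading> celsiusReadings) {
        return celsiusReadings.stream()
                .filter(r -> r.value > -50.0 && r.value < 60.0)
                .map(r -> new Reading(r.sensor, r.value * 9.0 / 5.0 + 32.0))
                .collect(Collectors.groupingBy(
                        r -> r.sensor,
                        TreeMap::new,
                        Collectors.averagingDouble(r -> r.value)));
    }

    public static void main(String[] args) {
        List<Reading> batch = List.of(
                new Reading("a", 20.0),
                new Reading("a", 30.0),
                new Reading("b", 999.0), // dropped by the filter
                new Reading("b", 10.0));
        System.out.println(averageFahrenheitBySensor(batch)); // {a=77.0, b=50.0}
    }
}
```

The reactive libraries expose operators with similar names for the same three steps, so the shape of the pipeline carries over even though the types differ.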
$ cd spring-xd-1.3.1.RELEASE
<root-install-dir>\spring-xd\xd
xd/bin>$ ./xd-singlenode
shell/bin>$ ./xd-shell
xd:> stream create --definition "time | log" --name ticktock --deploy
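Because the server is driven over REST, the same stream could also be created from code. The sketch below only builds the HTTP request and does not send it, so it runs without a server. The admin address `http://localhost:9393`, the `/streams/definitions` path, and the `name`, `definition`, and `deploy` form fields are assumptions about a default single-node setup. Check them against the REST API section of the reference guide for your release before relying on them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class StreamRequestSketch {

    // Assumed admin server address for a local single-node install.
    static final String ADMIN_URL = "http://localhost:9393";

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Form-encode the stream name and DSL definition (field names are assumptions).
    static String formBody(String name, String definition, boolean deploy) {
        return "name=" + enc(name)
                + "&definition=" + enc(definition)
                + "&deploy=" + deploy;
    }

    // Build, but do not send, a POST to the assumed stream-definitions endpoint.
    static HttpRequest createStreamRequest(String name, String definition, boolean deploy) {
        return HttpRequest.newBuilder(URI.create(ADMIN_URL + "/streams/definitions"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(name, definition, deploy)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createStreamRequest("ticktock", "time | log", true);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(formBody("ticktock", "time | log", true));
    }
}
```

To actually submit the request, pass it to `java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` while the single-node server is running.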