For years, builds had the simple requirements of compiling and packaging software. But the landscape of modern software development has changed, and so have the needs for build automation. Today, projects involve large and diverse software stacks, incorporate multiple programming languages, and apply a broad spectrum of testing strategies. With the rise of agile practices, builds must support early integration of code as well as frequent and easy delivery to both test and production environments.
Established build tools regularly fall short in meeting these goals. How many times have your eyes glazed over while looking at XML to figure out how a build works? And why can't it be easier to add custom logic to your build? All too often, when adding on to a build script, you can't shake the feeling that you're implementing a workaround or hack. I feel your pain. There has to be a better way of doing these things in an expressive and maintainable way. There is — it's called Gradle.
Gradle is the next evolutionary step in JVM-based build tools. It draws on lessons learned from established tools such as Ant and Maven and takes their best ideas to the next level. Following a build-by-convention approach, Gradle allows for declaratively modeling your problem domain using a powerful and expressive domain-specific language (DSL) implemented in Groovy instead of XML. Because Gradle is a JVM native, it allows you to write custom logic in the language you're most comfortable with, be it Java or Groovy.
In the Java world, a remarkably large number of libraries and frameworks are used. Dependency management is employed to automatically download these artifacts from a repository and make them available to your application. Having learned from the shortcomings of existing dependency management solutions, Gradle provides its own implementation. Not only is it highly configurable, it also strives to be as compatible as possible with existing dependency management infrastructures (such as Maven and Ivy). Gradle's ability to manage dependencies isn't limited to external libraries. As your project grows in size and complexity, you'll want to organize the code into modules with clearly defined responsibilities. Gradle provides powerful support for defining and organizing multiproject builds, as well as modeling dependencies between projects.
To get started with Gradle, all you need to bring to the table is a good understanding of the Java programming language. If you're new to project automation or haven't used a build tool before, my book Gradle in Action is a good place to start.
In this two-part series on using Gradle, I first compare existing JVM-language build tools with the features Gradle has to offer and describe Gradle's compelling feature set. In the next installment, I cover installing Gradle and writing and executing a simple Gradle script.
Why Gradle? Why Now?
If you've ever dealt with build systems, frustration may be one of the feelings that comes up when thinking about the challenges you've faced. Shouldn't the build tool naturally help you accomplish the goal of automating your project? Instead, you had to compromise on maintainability, usability, flexibility, extendibility, or performance.
Let's say you want to copy a file to a specific location when you're building the release version of your project. To identify the version, you check a string in the metadata describing your project. If it matches a specific numbering scheme (for example, 1.0-RELEASE), you copy the file from point A to point B. From an outside perspective, this may sound like a trivial task. If you have to rely on XML, the build language of many traditional tools, expressing this simple logic becomes fairly difficult. The build tool's response is to add scripting functionality through nonstandard extension mechanisms. You end up mixing scripting code with XML or invoking external scripts from your build logic. It's easy to imagine that you'll need to add more and more custom code over time. As a result, you inevitably introduce accidental complexity, and maintainability goes out the window. Wouldn't it make sense to use an expressive language to define your build logic in the first place?
Here's another example. Maven follows the paradigm of convention-over-configuration by introducing a standardized project layout and build lifecycle for Java projects. That's a great approach if you want to ensure a unified application structure for a green-field project — a project that lacks any constraints imposed by prior work. However, you may be the lucky one who needs to work on one of the many legacy projects that use different conventions. One of the conventions Maven is very strict about is that one project needs to produce one artifact, such as a JAR file. But how do you create two different JAR files from one source tree without having to change your project structure? Just for this purpose, you'd have to create two separate projects. Again, even though you can make this happen with a workaround, you can't shake off the feeling that your build process will need to adapt to the tool, not the tool to your build process.
These are only some of the issues you may have encountered with existing solutions. Often you've had to sacrifice nonfunctional requirements to model your enterprise's automation domain. But enough with the negativity — let's see how Gradle fits into the build tool landscape.
Evolution of Java Build Tools
Let's look at how build tools have evolved over the years. Two tools have dominated building Java projects: Ant and Maven. Over the course of years, both tools improved significantly and extended their feature set. But even though both are highly popular and have become industry standards, they have one weak point: build logic has to be described in XML. XML is great for describing hierarchical data, but falls short on expressing program flow and conditional logic. As a build script grows in complexity, maintaining the build code becomes a nightmare.
Ant's first official version was released in 2000. Each element of work (a target in Ant's lingo) can be combined and reused. Multiple targets can be chained to combine single units of work into full workflows. For example, you might have one target for compiling Java source code and another one for creating a JAR file that packages the class files. Building a JAR file only makes sense if you first compiled the source code. In Ant, you make the JAR target depend on the compile target. Ant doesn't give any guidance on how to structure your project. Though it allows for maximum flexibility, Ant makes each build script unique and hard to understand. External libraries required by your project are usually checked into version control, because there is no automated mechanism to pull them from a central location. Early versions of Ant required a lot of discipline to avoid repetitive code. Its extension mechanism was simply too weak. As a result, the bad coding practice of copying and pasting code was the only viable option. To unify project layouts, enterprises needed to impose standards.
Maven 1, released in July 2004, tried to ease that process. It provided a standardized project and directory structure, as well as dependency management. Unfortunately, custom logic is hard to implement. If you want to break out of Maven's conventions, writing a plugin, called a Mojo, is usually the only solution. The name Mojo might imply a straightforward, easy, and sexy way to extend Maven; in reality, writing a plugin in Maven is cumbersome and overly complex.
Later, Ant caught up with Maven by introducing dependency management through the Apache library Ivy, which can be fully integrated with Ant to declaratively specify dependencies needed for your project's compilation and packaging process. Maven's dependency manager, as well as Ivy, support resolving transitive dependencies. When I speak of transitive dependencies, I mean the graph of libraries required by your specified dependencies. A typical example of a transitive dependency would be the XML parser library Xerces that requires the XML APIs library to function correctly. Maven 2, released in October 2005, took the idea of convention over configuration even further. Projects consisting of multiple modules could define their dependencies on each other.
These days a lot of people are looking for alternatives to established build tools. We see a shift from using XML to a more expressive and readable language to define builds. A build tool that carries on this idea isGant, a DSL on top of Ant written in Groovy. Using Gant, users can now combine Groovy features with their existing knowledge of Ant without having to write XML. Even though it wasn't part of the core Maven project, a similar approach was proposed by the project Maven Polyglot that enables you to write your build definition logic, which is the project object model (POM) file, in Groovy, Ruby, Scala, or Clojure.
Gradle fits right into that generation of build tools and satisfies many requirements of modern build tools (Figure 1). It provides an expressive DSL, a convention over configuration approach, and powerful dependency management. It makes the right move to abandon XML and introduce the dynamic language Groovy to define your build logic. Sounds compelling, doesn't it?
Figure 1: Gradle combines the best features from other build tools.
Why Choose Gradle?
If you're a developer, automating your project is part of your day-to-day business. Don't you want to treat your build code like any other piece of software that can be extended, tested, and maintained? Let's put software engineering back into the build. Gradle build scripts are declarative, readable, and clearly express their intention. Writing code in Groovy instead of XML, sprinkled with Gradle's build-by-convention philosophy, significantly cuts down the size of a build script and is far more readable (see Figure 2).
Figure 2: Comparing build script size and readability between Maven and Gradle.
It's impressive to see how much less code you need to write in Gradle to achieve the same goal. With Gradle you don't have to make compromises. Where other build tools like Maven propose project layouts that are "my way or the highway," Gradle's DSL allows for flexibility by adapting to nonconventional project structures.
Never change a running system, you say? Your team already spent a lot of time on establishing your project's build code infrastructure. Gradle doesn't force you to fully migrate all of your existing build logic. Good integration with other tools like Ant and Maven is at the top of Gradle's priority list.
The market seems to be taking notice of Gradle. Popular open source projects like Groovy and Hibernate completely switched to Gradle as the backbone for their builds. Every Android project ships with Gradle as the default build system. Gradle also had an impact on the commercial market. Companies like Orbitz, EADS, and Software AG embraced Gradle as well, to name just a few. VMware, the company behind Spring and Grails, made significant investments in choosing Gradle. Many of their software products, such as the Spring framework and Grails, are literally built on the trust that Gradle can deliver.
Gradle's Compelling Feature Set
Let's take a closer look at what sets Gradle apart from its competitors: its compelling feature set (see Figure 3). To summarize, Gradle is an enterprise-ready build system, powered by a declarative and expressive Groovy DSL. It combines flexibility and effortless extendibility with the idea of convention over configuration and support for traditional dependency management. Backed by a professional services company (Gradleware) and strong community involvement, Gradle is becoming the number-one choice build solution for many open source projects and enterprises.
Figure 3: Gradle's compelling feature set.
Expressive Build Language and Deep API
The key to unlocking Gradle's power features within your build script lies in discovering and applying its domain model, as shown in Figure 4.
Figure 4: Build scripts apply the Gradle DSL and have access to its deep API.
As you can see in the figure, a build script directly maps to an instance of type
Projectin Gradle's API. In turn, the dependencies configuration block in the build script invokes the method
dependencies()of the project instance. Like most APIs in the Java world, it's available as HTML Javadoc documentation on Gradle's website. Who would have known? You're actually dealing with code. Without knowing it, you generate an object representation of your build logic in memory.
Each element in a Gradle script has a one-to-one representation with a Java class; however, some of the elements have been sugar-coated with a sprinkle of Groovy syntax. Having a "Groovy-fied" version of a class in many cases makes the code more compact than its Java counterpart and allows for using new language features such as closures.
Gradle can't know all the requirements specific to your enterprise build. By exposing hooks into lifecycle phases, Gradle allows for monitoring and configuring the build script's execution behavior. Let's assume you have the very unique requirement of sending out an email to the development team whenever a unit test failure occurs. The way you want to send an email (for example, via SMTP or a third-party email service provider) and the list of recipients are very specific to your build. Other builds using Gradle may not be interested in this feature at all. By writing a custom test listener that's notified after the test execution lifecycle event, you can easily incorporate this feature for your build.
Gradle establishes a vocabulary for its model by exposing a DSL implemented in Groovy. When dealing with a complex problem domain, in this case the task of building software, being able to use a common language to express your logic can be a powerful tool. Let's look at some examples. Most common to builds is the notation of a unit of work that you want to get executed. Gradle describes this unit of work as a task. Part of Gradle's standard DSL is the ability to define tasks very specific to compiling and packaging Java source code. It's a language for building Java projects with its own vocabulary that doesn't need to be relevant to other contexts.
Another example is the way you can express dependencies to external libraries, a very common problem solved by build tools. Out-of-the-box Gradle provides you with two configuration blocks for your build script that allow you to define the dependencies and repositories that you want to retrieve them from. If the standard DSL elements don't fit your needs, you can even introduce your own vocabulary through Gradle's extension mechanism.
This may sound a little nebulous at first, but once you're past the initial hurdle of learning the build language, creating maintainable and declarative builds comes easy. A good place to start is the Gradle Build Language Reference Guide. Gradle's DSL can be extended. You may want to change the behavior of an existing task or add your own idioms for describing your business domain. Gradle offers you plenty of options to do so.
Gradle is Groovy
Prominent build tools like Ant and Maven define their build logic through XML. As we all know, XML is easy to read and write, but can become a maintenance nightmare if used in large quantities. XML isn't very expressive. It makes it hard to define complex custom logic. Gradle takes a different approach. Under the hood, Gradle's DSL is written with Groovy providing syntactic sugar on top of Java. The result is a readable and expressive build language. All your scripts are written in Groovy as well. Being able to use a programming language to express your build needs is a major plus. You don't need to be a Groovy expert to get started. Because Groovy is written on top of Java, you can migrate gradually by trying out its language features. You could even write your custom logic in plain Java — Gradle couldn't care less. Groovy veterans will assure you that using Groovy instead of Java will boost your productivity significantly. A great reference guide is the bookGroovy in Action, Second Edition by Dirk Koenig et al. (Manning, 2009).
One of Gradle's big ideas is to give you guidelines and sensible defaults for your projects. Every Java project in Gradle knows exactly where source and test class file are supposed to live, and how to compile your code, run unit tests, generate Javadoc reports, and create a distribution of your code. All of these tasks are fully integrated into the build lifecycle. If you stick to the convention, there's only minimal configuration effort on your part. In fact, your build script is a one-liner. Seriously! Figure 5 illustrates how Gradle introduces conventions and lifecycle tasks for Java projects.
Figure 5: In Gradle, Java projects are build by convention with sensible defaults. Changing the defaults is easy and achieved through convention properties.
Default tasks are provided that make sense in the context of a Java project. For example, you can compile your Java production source code, run tests, and assemble a JAR file. Every Java project starts with a standard directory layout. It defines where to find production source code, resource files, and test code. Convention properties are used to change the defaults.
The same concept applies to other project archetypes like Scala, Groovy, web projects, and many more. Gradle calls this concept build by convention. The build script developer doesn't need to know how this works under the hood. Instead, you can concentrate on what needs to be configured. Gradle's conventions are similar those in Maven, but they don't leave you feeling boxed in. Maven is very opinionated; it proposes that a project only contains one Java source directory and only produces one single JAR file. This is not necessarily reality for many enterprise projects. Gradle allows you easily to break out of the conventions. On the opposite end of the spectrum, Ant does not give you a lot of guidance on how to structure your build script, allowing for a maximum level of flexibility. Gradle takes the middle ground by offering conventions combined with the ability to change them easily. Szczepan Faber, one of Gradle's core engineers, put it this way on his blog: "Gradle is an opinionated framework on top of an unopinionated toolkit."
Robust and Powerful Dependency Management
Software projects are usually not self-contained. All too often, your application code uses a third-party library providing existing functionality to solve a specific problem. Why would you want to reinvent the wheel by implementing a persistence framework if Hibernate already exists? Within an organization, you may be the consumer of a component or module implemented by a different team. External dependencies are accessible through repositories, and the type of repository is highly dependent on what your company prefers. Options range from a plain file system to a full-fledged enterprise repository. External dependencies may have a reference to other libraries or resources. We call these transitive dependencies.
Gradle provides an infrastructure to manage the complexity of resolving, retrieving, and storing dependencies. Once they're downloaded and put in your local cache, they're made available to your project. A key requirement of enterprise builds is reproducibility. Do you remember the last time a coworker said, "But it works on my box"? Builds have to produce the same result on different machines, independent of the contents of your local cache. Dependency managers like Ivy and Maven in their current implementation cannot fully guarantee reproducibility. Why is that? Whenever a dependency is downloaded and stored in the local cache, it doesn't take into account the artifact's origin. In situations where the repository is changed for a project, the cached dependency is considered resolved, even though the artifact's content may be slightly different. At worst, this will cause a failing build that's extremely hard to debug. Another common complaint specific to Ivy is the fact that dependency snapshot versions, artifacts currently under development with the naming convention
–SNAPSHOT, aren't updated correctly in the local cache, even though it changed on the repository and is marked as changing. There are many more scenarios where current solutions fall short. Gradle provides its own configurable, reliable, and efficient dependency-management solution.
Large enterprise projects usually consist of multiple modules to separate functionality. In the Gradle world, each of the submodules is considered a project that can define dependencies to external libraries or other modules. Additionally, each subproject can be run individually. Gradle figures out for you which of the subproject dependencies need to be rebuilt, without having to store a subproject's artifact in the local cache.
For some companies, a large project with hundreds of modules is reality. Building and testing minor code changes can consume a lot of time. You may know from personal experience that deleting old classes and resources by running a cleanup task is a natural reflex. All too often, you get burned by your build tool not picking up the changes and their dependencies. What you need is a tool that's smart enough to only rebuild the parts of your software that actually changed. Gradle supports incremental builds by specifying task inputs and outputs. It reliably figures out for you which tasks need to be skipped, built, or partially rebuilt. The same concept translates to multimodule projects, called partial builds. Because your build clearly defines the dependencies between submodules, Gradle takes care of rebuilding only the necessary parts. No more running
Automated unit, integration, and functional tests are part of the build process. It makes sense to separate short-running types of tests from the ones that require setting up resources or external dependencies to be run. Gradle supports parallel test execution. This feature is fully configurable and ensures that you're actually taking advantage of your processor's cores. The buck doesn't stop here. Gradle is going to support distributing test execution to multiple machines in a future version. The days of reading your Twitter feed between long builds are gone.
Developers run builds many times during development. That means starting a new Gradle process each time, loading all its internal dependencies, and running the build logic. You'll notice that it usually takes a couple of seconds before your script actually starts to execute. To improve the startup performance, Gradle can be run in daemon mode. In practice, the Gradle command forks a daemon process, which not only executes your build, but also keeps running in the background. Subsequent build invocations will piggyback on the existing daemon process to avoid the startup costs. As a result, you'll notice a far snappier initial build execution.
Most enterprise builds are not alike, nor do they solve the same problems. Once you're past the initial phase of setting up your basic scripts, you'll want to implement custom logic. Gradle is not opinionated about the way you implement that code. Instead, it gives you various choices to pick from, depending on your specific use case. The easiest way to implement custom logic is by writing a task. Tasks can be defined directly in your build script without special ceremony. If you feel like complexity takes over, you may want to explore the option of a custom task that allows for writing your logic within a class definition, making structuring your code easy and maintainable. If you want to share reusable code among builds and projects, plugins are your best friend. Representing Gradle's most powerful extension mechanism, plugins give you full access to Gradle's API and can be written, tested, and distributed like any other piece of software. Writing a plugin is surprisingly easy and doesn't require a lot of additional descriptors.
Integration with Other Build Tools
Gradle plays well with its predecessors Ant, Maven, and Ivy, as shown in Figure 6.
Figure 6: Gradle provides deep integration with other build tools and opens the door to gradually migrate your existing Ant or Maven build.
If you're coming from Ant, Gradle doesn't force you to migrate your full build infrastructure. Instead, it allows you to import existing build logic and reuse standard Ant tasks. Gradle builds are 100% compatible with Maven and Ivy repositories. You can retrieve dependencies and publish your own artifacts. Gradle provides a converter for existing Maven builds that can translate the build logic into a Gradle build script.
Existing Ant scripts can be imported into a Gradle build seamlessly and used as you'd use any other external Gradle script. Ant targets directly map to Gradle tasks at runtime. Gradle ships with the Ant libraries and exposes a helper class to your scripts called AntBuilder, which fully blends into Gradle's DSL. It still looks and feels like Ant's XML, but without the pointy brackets. Ant users will feel right at home, because they don't have to transition to Gradle syntax right away. Migrating from Ant to Gradle is also a no-brainer. You can take baby steps by reusing your existing Ant logic while using Gradle's benefits at the same time.
Gradle aims to reach a similar depth of integration with Maven. At the time of writing, this hasn't been realized yet. In the long run, Maven POMs and plugins will be treated as Gradle natives. Maven and Ivy repositories have become an important part of today's build infrastructure. Imagine a world without Maven Central to help access specific versions of your favorite project dependencies. Retrieving dependencies from a repository is only one part of the story; publishing to them is just as important. With a little configuration, Gradle can upload your project's artifact for company-wide or public consumption.
Community-driven and Company-backed
Gradle is free to use and ships with the Apache License 2.0. After its first release in April 2008, a vibrant community quickly started to form around it. Over the past five years, open source developers have made major contributions to Gradle's core code base. Being hosted on GitHub turned out to be very beneficial to Gradle. Code changes can be submitted as pull requests and undergo a close review process by the core committers before making it into the code base. If you're coming from other build tools like Maven, you may be used to a wide range of reusable plugins. Apart from the standard plugins shipped with the runtime, the Gradle community releases new functionality frequently. Every community-driven software project needs a forum to get immediate questions answered. Gradle connects with the community through the Gradle forum. In addition, there's a commercial company, Gradeleware, that supports the product.
Additional Useful Features
Gradle is equipped with a rich command-line interface. Using command-line options, you can control everything from specifying the log level, to excluding tests, to displaying help messages. This is nothing special; other tools provide that, too. Some of the features stand out, though. Gradle allows for running commands in an abbreviated, camel-cased form. In practice, a command named
runMyAwesomeTaskwould be callable with the abbreviation
rMAT. Even though I present most of next week's examples by them running commands in a shell, bear in mind that Gradle provides an out-of-the-box graphical user interface.
The Bigger Picture: Continuous Delivery
Being able to build your source code is only one aspect of the software delivery process. More importantly, you want to release your product to a production environment to deliver business value. Along the way, you want to run tests, build the distribution, analyze the code for quality-control purposes, potentially provision a target environment, and deploy to it.
There are many benefits to automating the whole process. First and foremost, delivering software manually is slow, error-prone, and nerve-wracking. I'm sure every one of us hates the long nights due to a deployment gone wrong. With the rise of agile methodologies, development teams are able to deliver software faster. Release cycles of two or three weeks have become the norm. Many organizations now even ship code to production several times a day! Optimally, you want to be able to release software by selecting the target environment simply by pressing a button. Practices like automated testing, CI, and deployment feed into the general concept of continuous delivery.
Let's look at how Gradle can help get your project from build to deployment. It can enable you to automate many of the tasks required to implement continuous delivery, be they compiling your source code, deploying a deliverable, or calling external tools that help you with implementing the process. For a deep dive on continuous delivery and all of its aspects, I recommend the Jolt Award-winning Continuous Delivery by Jez Humble and David Farley.
Automating Your Project from Build to Deployment
Continuous delivery introduces the concept of a deployment pipeline, also referred to as the build pipeline. A deployment pipeline represents the technical implementation of the process for getting software from version control into your production environment. The process consists of multiple stages, as shown in Figure 7.
Figure 7: Stages of a deployment pipeline.
- Commit stage: Reports on the technical health level of your project. The main stakeholder of this phase is the development team as it provides feedback about broken code and finds "code smells." The job of this stage is to compile the code, run tests, perform code analysis, and prepare the distribution.
- Automated acceptance test stage: Asserts that functional and nonfunctional requirements are met by running automated tests.
- Manual test stage: Verifies that the system is actually usable in a test environment. Usually, this stage involves QA personnel to verify requirements on the level of user stories or use cases.
- Release stage: Either delivers the software to the end user as a packaged distribution or deploys it to the production environment.
Let's see what stages of the deployment pipeline can benefit from project automation. It's obvious that the manual test stage can be excluded from further discussion, because it only involves manual tasks. The concrete tasks we're interested in are:
- Compiling the code
- Running unit and integration tests
- Performing static code analysis and generating test coverage
- Creating the distribution
- Provisioning the target environment
- Deploying the deliverable
- Performing smoke and automated functional tests
Figure 8 shows the order of tasks within each of the stages. While there are no hard rules that prevent you from skipping specific tasks, it's recommended that you follow the order. For example, you could decide to compile your code, create the distribution, and deploy it to your target environment without running any tests or static code analysis. However, doing so increases the risk of undetected code defects and poor code quality.
Figure 8: Tasks performed in stages of build pipeline.
Topics such as infrastructure provisioning, automated deployment, and smoke testing can also be applied to the release stage. In practice, applying these techniques to a production environment is more complex than in a controlled test environment. In a production environment, you may have to deal with clustered and distributed server infrastructures, zero-downtime release rollouts, and automated rollbacks to the previous release.
Covering these advanced topics would go beyond the scope of this series. However, there are some great examples of deployment management tools in the wild that you may want to check out, such as Asgard, a Web-based cloud management and deployment tool built and used by Netflix. But enough pure theory — in the follow-up article next week, I'll discuss installing Gradle on your machine and building your first project.