Java @Scheduled Tasks in Kubernetes: A Rethink

Historically, most scheduled tasks in Java applications I’ve worked on have used Spring’s scheduling feature. Spring handles methods that you annotate with @Scheduled in the background of the application. This works fine if only one instance of the application is running.

However, applications are increasingly becoming containerized and are being run in container orchestration platforms, such as Kubernetes, to take advantage of horizontal scaling so that multiple instances of an application are running. This creates a problem in the way scheduled tasks have been used historically: Because scheduled tasks are run in the background of the application, we have duplicated (and possibly competing) scheduled tasks as we horizontally scale the application.

To address this problem of scaling Java scheduled tasks in Kubernetes, I’ve created a new pattern that works with three popular open source dependency injection frameworks: Spring Boot, Micronaut, and Guice with Java Spark. Let’s walk through the scenario below to understand the pattern.

The Scenario

Let’s say we have a requirement to run some business logic that lives in the service layer of a Spring Boot API as a scheduled task. For the purposes of this article, let’s say the service looks like this:

    import org.springframework.stereotype.Service;

    @Service
    public class HelloService {

        public String sayHello() {
            return "Hello World!";
        }
    }

Historically, we would accomplish this by writing a class in the Spring Boot API that calls the service logic and annotating one of its methods with @Scheduled, like so:

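    import lombok.extern.slf4j.Slf4j;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    @Slf4j
    public class ScheduledTasks {

        private final HelloService helloService;

        @Autowired
        public ScheduledTasks(HelloService helloService) {
            this.helloService = helloService;
        }

        // Spring cron expressions have six fields, starting with seconds: 08:00:00, Monday to Friday.
        @Scheduled(cron = "0 0 8 * * MON-FRI")
        public void runHelloService() {
            String hello = this.helloService.sayHello();
            log.info(hello);
        }
    }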

While this solution is straightforward, it limits our ability to scale the application horizontally in a modern container orchestration platform like Kubernetes. As this API scales horizontally to 2, 3, 4 … n pods, we’ll have 2, 3, 4 … n copies of the same scheduled task firing at the same time, which could cause duplicated work, race conditions and inefficient use of resources.

There are solutions like ShedLock and Quartz that address this problem. Both coordinate through an external shared store (typically a database) so that only one of the scheduled tasks in the n pods executes at a given time. While this approach works, it requires that external store. Also, an instance of the scheduled task still runs in each pod, which consumes application/pod memory, even though only one of them will execute its business logic. We can improve on these solutions by eliminating the multiple scheduled task instances altogether.
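To make the locking idea concrete, here is a minimal sketch in plain Java. It is not the ShedLock or Quartz API: the class name, the method name, and the in-memory map standing in for the shared lock store are all assumptions made for illustration. It shows that every pod's scheduler still wakes up on each tick, but only the pod that acquires the lock runs the business logic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; this is not the ShedLock or Quartz API.
class SharedLockSketch {

    // Stand-in for the external lock store that every pod can reach (lock name -> holder).
    static final ConcurrentMap<String, String> LOCK_STORE = new ConcurrentHashMap<>();

    /** Simulates one cron tick on every pod and returns how many pods ran the business logic. */
    static int executionsForOneTick(int pods, String lockName) {
        AtomicInteger executions = new AtomicInteger();
        for (int i = 0; i < pods; i++) {
            String podName = "pod-" + i;
            // Every pod's scheduler wakes up, but putIfAbsent lets only the first caller win.
            boolean acquired = LOCK_STORE.putIfAbsent(lockName, podName) == null;
            if (acquired) {
                executions.incrementAndGet(); // the business logic would run here
            }
        }
        LOCK_STORE.remove(lockName); // release the lock once the tick is over
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println("Executions across 4 pods: " + executionsForOneTick(4, "sayHello"));
    }
}
```

Note that the loop still runs once per pod: the scheduler in each replica stays resident even though only one of them does any work.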

Is There a Better Way to Schedule Tasks in Kubernetes?

Yes, with Kubernetes CronJob. We can overcome these disadvantages by separating the concerns of running the scheduled task and serving the application. This requires us to expose the service logic as an API endpoint by writing a controller that calls the service logic, like this:

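    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class MyController {

        private final HelloService helloService;

        @Autowired
        public MyController(HelloService helloService) {
            this.helloService = helloService;
        }

        @PostMapping("/hello")
        public ResponseEntity<String> sayHello() {
            String hello = this.helloService.sayHello();
            return ResponseEntity.ok(hello);
        }
    }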

Next, we create a CronJob resource that will call this new endpoint on a set schedule:

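    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: hello
    spec:
      schedule: "0 8 * * MON-FRI"
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: hello
                # busybox does not ship curl, so use an image that does (this image and tag are an example).
                image: curlimages/curl:8.4.0
                imagePullPolicy: IfNotPresent
                command:
                - /bin/sh
                - -c
                # --fail makes curl exit non-zero on an HTTP error so the Job is marked as failed.
                - curl --fail -X POST http://path.to.the.java.api/hello
              restartPolicy: OnFailure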

Now we have a horizontally scalable solution.

However, what if we have a regulation that prevents us from exposing HelloService as an API endpoint? Or what if the security team said that we need to retrieve a JSON Web Token (JWT) and put it in the curl request’s Authorization header before calling the API endpoint? At best, it would require more time and shell expertise than the team might have and, at worst, this would make the above solution infeasible.
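To give a sense of what the JWT requirement adds, here is a rough sketch of the authenticated call written in Java rather than shell. The class name, method name, and token value are assumptions for illustration, and the step of actually obtaining the token from an identity provider is left out. It only builds the request; it does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch of the request a scheduled caller would have to build.
class AuthenticatedCallSketch {

    /** Builds the POST request with a bearer token; fetching the token is out of scope here. */
    static HttpRequest buildRequest(String endpoint, String jwt) {
        return HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .header("Authorization", "Bearer " + jwt)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        // "example-token" is a placeholder, not a real JWT.
        HttpRequest request = buildRequest("http://path.to.the.java.api/hello", "example-token");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Even in this simplified form, the caller now needs token retrieval, error handling, and secret management, all of which would otherwise have to live in the CronJob's shell command.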

Is There an Even Better Way to Schedule Tasks in Kubernetes?

Yes. We can alleviate these concerns by using Java’s multiple entry points feature.

However, the unique challenge in our case is that the service logic lives in a Spring Boot API, so certain Spring dependency injection logic needs to execute so that the service layer and all its dependencies are instantiated before an alternative entry point is executed.

How can we give Spring Boot the time it needs to configure the application before we run the alternative entry point? I found that the code below accomplishes this:

    import java.util.Optional;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.context.ConfigurableApplicationContext;

    @SpringBootApplication
    public class SpringBootEntryPoint {

        public static void main(String[] args) {
            ConfigurableApplicationContext applicationContext =
                    SpringApplication.run(SpringBootEntryPoint.class, args);

            /*
             * If an alternative entry point environment variable exists, then determine if there is
             * business logic that is mapped to that property. If so, run the logic and exit. If an
             * alternative entry point property does not exist, then allow the application to run as normal.
             */
            Optional.ofNullable(System.getenv("alternativeEntryPoint"))
                    .ifPresent(arg -> {
                        int exitCode = 0;
                        // try-with-resources closes the context whether or not the task succeeds.
                        try (applicationContext) {
                            if (arg.equals("sayHello")) {
                                String hello = applicationContext.getBean(HelloService.class).sayHello();
                                System.out.println(hello);
                            } else {
                                throw new IllegalArgumentException(
                                        String.format("Did not recognize alternativeEntryPoint, %s", arg));
                            }
                        } catch (Exception e) {
                            exitCode = 1;
                            e.printStackTrace();
                        } finally {
                            System.out.println("Closing application context");
                        }

                        /*
                         * If there is an alternative entry point listed, then we always want to exit the
                         * JVM so the Spring app does not throw an exception after we close the
                         * applicationContext. Both the applicationContext and JVM should be closed/exited
                         * to prevent exceptions.
                         */
                        System.out.println("Exiting JVM");
                        System.exit(exitCode);
                    });
        }
    }

This pattern also works with other Java frameworks such as Micronaut and Guice with Java Spark, so it is relatively framework agnostic. Below is the same pattern using Micronaut:

    import java.util.Optional;

    import io.micronaut.context.ApplicationContext;
    import io.micronaut.runtime.Micronaut;

    public class MicronautEntryPoint {

        public static void main(String[] args) {
            ApplicationContext applicationContext = Micronaut.run(MicronautEntryPoint.class, args);

            /*
             * If an alternative entry point environment variable exists, then determine if there is
             * business logic that is mapped to that property. If so, run the logic and exit. If an
             * alternative entry point property does not exist, then allow the application to run as normal.
             */
            Optional.ofNullable(System.getenv("alternativeEntryPoint"))
                    .ifPresent(arg -> {
                        int exitCode = 0;
                        // try-with-resources closes the context whether or not the task succeeds.
                        try (applicationContext) {
                            if (arg.equals("sayHello")) {
                                String hello = applicationContext.getBean(HelloService.class).sayHello();
                                System.out.println(hello);
                            } else {
                                throw new IllegalArgumentException(
                                        String.format("Did not recognize alternativeEntryPoint, %s", arg));
                            }
                        } catch (Exception e) {
                            exitCode = 1;
                            e.printStackTrace();
                        } finally {
                            System.out.println("Closing application context");
                        }

                        /*
                         * If there is an alternative entry point listed, then we always want to exit the
                         * JVM so the application does not throw an exception after we close the
                         * applicationContext. Both the applicationContext and JVM should be closed/exited
                         * to prevent exceptions.
                         */
                        System.out.println("Exiting JVM");
                        System.exit(exitCode);
                    });
        }
    }

The only major differences are that the class does not need an annotation and that the Micronaut equivalents of the Spring methods are used (e.g., Micronaut#run).

Here is the same pattern using Guice and Java Spark:

    import static spark.Spark.get;
    import static spark.Spark.port;

    import java.util.Optional;

    import com.google.inject.Guice;
    import com.google.inject.Injector;

    public class GuiceEntryPoint {

        private static Injector injector;

        public static void main(String[] args) {
            GuiceEntryPoint.injector = Guice.createInjector(new GuiceModule());

            /*
             * If an alternative entry point environment variable exists, then determine if there is
             * business logic that is mapped to that property. If so, run the logic and exit. If an
             * alternative entry point property does not exist, then allow the application to run as normal.
             */
            Optional.ofNullable(System.getenv("alternativeEntryPoint"))
                    .ifPresent(arg -> {
                        int exitCode = 0;
                        try {
                            if (arg.equals("sayHello")) {
                                String hello = injector.getInstance(HelloService.class).sayHello();
                                System.out.println(hello);
                            } else {
                                throw new IllegalArgumentException(
                                        String.format("Did not recognize alternativeEntryPoint, %s", arg));
                            }
                        } catch (Exception e) {
                            exitCode = 1;
                            e.printStackTrace();
                        } finally {
                            System.out.println("Closing application context");
                        }

                        /*
                         * If there is an alternative entry point listed, then we always want to exit the
                         * JVM here so that the Spark API below is never started.
                         */
                        System.out.println("Exiting JVM");
                        System.exit(exitCode);
                    });

            // Run the Java Spark RESTful API.
            injector.getInstance(GuiceEntryPoint.class)
                    .run(8080);
        }

        void run(final int port) {
            final GoodByeService goodByeService = GuiceEntryPoint.injector.getInstance(GoodByeService.class);
            port(port);
            get("/", (req, res) -> {
                return goodByeService.sayHello();
            });
        }
    }

The main differences are that you retrieve the beans from the Guice Injector rather than from an ApplicationContext object like in Spring and Micronaut, and that there is a run method that contains all the controller endpoints rather than there being a controller class.

You can see these code samples and run them by following the directions in this repo’s README.

In each of these examples, you’ll notice that I control whether the alternative entry point’s logic is invoked by checking if an environment variable exists and, if it does exist, what its value is. If the environment variable does not exist or its value is not what we expect, then the HelloService bean will not be retrieved from the ApplicationContext or the Injector (depending on the framework being used) and will not be executed. While this is not exactly an alternative entry point, it functions in a similar way. Instead of using multiple main methods like traditional alternative entry points, this pattern uses a single main method and uses environment variables to control the logic that is executed.
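Stripped of any framework, the routing step described above can be sketched as follows. The class name, the `route` method, and the map of task names to suppliers are hypothetical; in the article's examples the supplier would instead fetch HelloService from the ApplicationContext or Injector. Unlike the examples above, this sketch returns the exit code rather than throwing for an unknown name, which keeps the routing decision easy to test.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical, framework-free sketch of the entry point routing step.
class EntryPointRouter {

    /**
     * Returns empty when no alternative entry point is requested (the app should keep serving),
     * otherwise the exit code the JVM should terminate with.
     */
    static Optional<Integer> route(String entryPoint, Map<String, Supplier<String>> tasks) {
        if (entryPoint == null) {
            return Optional.empty();
        }
        Supplier<String> task = tasks.get(entryPoint);
        if (task == null) {
            System.err.println("Did not recognize alternativeEntryPoint, " + entryPoint);
            return Optional.of(1);
        }
        try {
            System.out.println(task.get());
            return Optional.of(0);
        } catch (RuntimeException e) {
            e.printStackTrace();
            return Optional.of(1);
        }
    }

    public static void main(String[] args) {
        // The supplier stands in for a call such as context.getBean(HelloService.class).sayHello().
        Map<String, Supplier<String>> tasks = Map.of("sayHello", () -> "Hello World!");
        route(System.getenv("alternativeEntryPoint"), tasks).ifPresent(System::exit);
    }
}
```

Adding another scheduled task under this design is a matter of registering one more name in the map and setting that name in the CronJob's environment.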

Note that when using Spring and Micronaut, the applicationContext is closed using try-with-resources, regardless of whether the service method call executes successfully or throws an Exception. This guarantees that specifying an alternative entry point always results in the application exiting, which prevents the application from continuing to run and serve HTTP requests through the controller API endpoints.
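The ordering that try-with-resources provides can be checked with a small stand-alone example. The class below is hypothetical and uses a plain AutoCloseable as a stand-in for the application context; it records the order of events when the task fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a plain AutoCloseable stands in for the application context.
class CloseOrderDemo {

    /** Runs a failing task inside try-with-resources and records the order of events. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        AutoCloseable context = () -> events.add("context closed");
        try (context) {
            events.add("task started");
            throw new IllegalStateException("task failed");
        } catch (Exception e) {
            events.add("caught: " + e.getMessage());
        } finally {
            events.add("finally");
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [task started, context closed, caught: task failed, finally]
        System.out.println(run());
    }
}
```

The resource is closed before the catch and finally blocks run, so by the time the examples above print "Closing application context" the context has in fact already been closed.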

Last, we always exit the JVM if an alternative entry point environment variable is detected. This prevents Spring Boot from throwing an Exception because the ApplicationContext is closed but the JVM is still running.

Effectively, this solution allows dependency injection to occur before the entry point routing logic occurs.

This solution allows us to write a Kubernetes CronJob resource that uses the same Docker image that we would use to run the Spring Boot application as an API; we simply add an environment variable in the spec, as seen below.

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: my-service
    spec:
      schedule: "0 8 * * MON-FRI"
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: hello-service
                image: hello-image:1.0.0 # This is the Java API image with the second entry point (image names must be lowercase).
                imagePullPolicy: IfNotPresent
                env:
                - name: alternativeEntryPoint
                  value: "sayHello" # Must match the value checked in the entry point code.
              restartPolicy: OnFailure

By using a Kubernetes CronJob, we can ensure that only one scheduled task is running at any given time, provided that the task is scheduled with sufficient time between invocations (by default a CronJob allows overlapping runs; setting concurrencyPolicy: Forbid in the spec prevents them). In addition, we did not expose HelloService through an API endpoint or need to use shell scripting; everything was implemented in Java. We also eliminated duplicated scheduled tasks instead of managing them.

I like to visualize this pattern as making a jar act like a Swiss Army knife: Each entry point is like a tool in the Swiss Army knife that runs the jar’s logic in a different way. Just as a Swiss Army knife has different tools, like a screwdriver, knife, scissors, etc., so does this pattern make a jar act on its embedded business logic as a RESTful API, scheduled task, etc.

FAQs

Question:

Wouldn’t it be easier to write a @Scheduled method and disable it based on some configuration property?

Answer:

First, it’s worth considering that other frameworks like Micronaut do not have the ability to disable a @Scheduled method. Moreover, Java Spark has no task scheduling of its own. The pattern described in this article (I’ll call it the Swiss Army knife pattern), on the other hand, works across more frameworks than just Spring.

But even if your project does use Spring, one of the main disadvantages I see in using @Scheduled in general is that we’re requiring the Spring app to run 24/7 in order for the Spring task scheduler to run and invoke the @Scheduled task based on the cron schedule. This would require a Kubernetes pod that’s running 24/7 with the Spring app running inside it. I see this use of resources (and probably money) as unnecessary because Kubernetes provides its own task scheduler that we can take advantage of by creating a CronJob resource. Kubernetes resources will only be used for the life of the CronJob rather than having a pod running at all times with the @Scheduled task inside it.

In other words, I liken the @Scheduled and CronJob options to this: We wouldn’t spin up an EC2 instance and create a cron job on it that invokes a Lambda function, because we can invoke a Lambda function with a CloudWatch cron rule. One reason we don’t do this is that the EC2 instance would be more expensive than the free CloudWatch rule. Like the EC2 instance in this example, I see a @Scheduled pod as an unnecessary provisioning of resources because we already have a scheduling tool available in Kubernetes’ CronJob (which is like CloudWatch cron rules).

Question:

Does this pattern work in a multicluster environment?

Answer:

This pattern has not been tested in a multicluster environment, and it likely would not work because this pattern does not include a way for a scheduled task running in Cluster A to be aware of another instance of the scheduled task running in Cluster B. Quartz and ShedLock use an external, centralized database to orchestrate these multicluster scheduled tasks. This pattern does not include an external database.

Article information

Author: Madison Riggs

Last Updated: 1699696682
