Process Monitoring with BPMN
Business Process Model and Notation (BPMN) is a standard for business process modeling that provides a graphical notation for specifying business processes. BPMN makes it far easier to collaborate in a cross-functional manner, as it bridges the gap between developers and business specialists: Business specialists work on the process definition, while developers enrich the BPMN with technical bindings. A BPMN-compliant workflow engine directly instantiates new process instances based on a provided definition.
The Project
At 3ap, we recently undertook a project using BPMN: a digital hospitality platform, still under active development, that we built for Stay KooooK. Its distributed architecture combines best-in-class cloud services with a custom-developed, fully digitalized guest journey in which guests do everything themselves on their mobile device. This allows Stay KooooK to operate a property with two to three hosts and to move booking, check-in, stay, and checkout to an app, thereby decreasing overhead costs.
However, where digital guest experiences are concerned, it’s important to be aware of potential process failures. To do that, it’s necessary to monitor each stage of the guest journey and detect potential failures before they happen or cause guests to be inconvenienced, thereby ensuring guest satisfaction.
This project perfectly fit within our overarching architecture principles at 3ap. More specifically:
- We develop cloud-native/cloud-agnostic digital solutions
- We follow the Twelve-Factor App methodology
- For backing services, we use managed services from a best-in-class provider that fits our principles
- For functional services (components), we adhere to a use-over-buy-over-build philosophy
This project not only fit perfectly with our overarching principles; it took them to the next level. 😮 This is because the digital solution we created is an orchestration of cloud services, which resulted in fast time to market.
However, some weaknesses are inherent to such a digital solution and cannot be designed away, so we needed to address them for the project to be successful.
The Problem Statement
Our aim for this project was to detect guest journey process failures before guests encountered issues that prevented them from accessing the property they booked. For illustrative purposes, we’ll use a simplified guest journey to showcase how we addressed this problem.
Simplified Guest Journey
Here’s a simplified example of a guest journey:
- Guest creates online booking
- Guest receives onboarding notification
- Guest completes online check-in
- Guest receives checked-in notification with the digital key to access the hospitality property
Potential Process Failures
Below is a list of potential ways the guest journey could have issues:
- Guest data on online booking is incomplete
- Guest didn’t receive an onboarding notification because the provided email or phone number was invalid
- Guest didn’t finish the online check-in because they forgot to
- Guest can’t access the hospitality property because the digital key wasn’t issued on time
The Solution
Before doing anything else, the first goal was to achieve a common understanding between the business team and the IT team about what the guest journey entails. BPMN is ideal for this, because it helps describe the business process in a way that both parties (developers and non-developers) can understand.
For the sake of simplicity, we only talk about BPMN on a reduced scale in this post, so as to more easily explain how we achieved process monitoring based on it.
Unlike process orchestration, process monitoring is passive. That’s why we decided to use state-related terms for the process task names; these names don’t correspond to the common BPMN naming conventions.
Receive Task
The BPMN model observes the running guest journey rather than driving it. For this, BPMN defines a concept for listening to messages or events: the receive task.
A receive task indicates that a process has to wait for a message to arrive before continuing. The task is completed once the message has been received.
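The waiting behavior of a receive task can be illustrated in plain Java. This is only a minimal sketch of the semantics, not the workflow engine's API: the class `ReceiveTaskSketch`, its methods, and the key format are hypothetical names chosen for illustration, and a real engine may also buffer messages that arrive before the task is reached.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative model of receive-task semantics; hypothetical, not a workflow engine API. */
final class ReceiveTaskSketch {

  // One pending receive task per message name and correlation key.
  private final Map<String, CompletableFuture<Void>> waiting = new ConcurrentHashMap<>();

  /** The process reaches a receive task and starts waiting for the named message. */
  CompletableFuture<Void> await(String messageName, String correlationKey) {
    return waiting.computeIfAbsent(
        key(messageName, correlationKey), k -> new CompletableFuture<>());
  }

  /** A message arrives; returns true if it completed a receive task that was still waiting. */
  boolean publish(String messageName, String correlationKey) {
    CompletableFuture<Void> task = waiting.get(key(messageName, correlationKey));
    return task != null && task.complete(null);
  }

  // Assumed key format for this sketch only.
  private static String key(String messageName, String correlationKey) {
    return messageName + ":" + correlationKey;
  }
}
```

In this model, `await("ONBOARDING", "booking-001")` returns a future that stays incomplete until `publish` is called with the same message name and correlation key, which mirrors a task that is completed once its message has been received.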
Subscription to Guest Journey Events
To observe the event-driven communication between the cloud services, we introduced a new component: the monitoring service.
Webhooks are used as an integration pattern for event sourcing. According to an article on The New Stack, “a webhook programmatically provides a notification when something happens on a platform or service.”
Today, most cloud services provide integration capabilities for synchronous integration as REST APIs and for event-driven integration as webhooks. We strongly recommend carefully studying the integration capabilities of the planned cloud services so that the selected services at least support those two integration patterns. Real-time integration with cloud services that don’t support an event-driven integration would result in non-negligible restrictions in terms of scalability.
The following image shows an example of the guest journey:
- Whenever data changes, the cloud service sends an event via HTTP POST to the endpoint that the monitoring service exposes as a webhook.
- The monitoring service returns HTTP 200 when the event can be processed.
The posting service typically expects a response with a 2xx success code. When something goes wrong on the receiver side, a response with a 4xx or 5xx error code tells the posting service that the delivery attempt failed. In that case, the posting service resends the event according to a defined backoff strategy until the post succeeds or the service gives up.
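The retry behavior described above can be sketched from the sender's side. This is a minimal illustration under stated assumptions: exponential backoff is only one possible strategy, each cloud service defines its own retry policy, and `WebhookRetrySketch`, `deliver`, and `backoff` are hypothetical names.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

/** Sketch of how a posting service might retry a webhook delivery; real providers differ. */
final class WebhookRetrySketch {

  /** Any 2xx status counts as a successful delivery. */
  static boolean isSuccess(int status) {
    return status >= 200 && status < 300;
  }

  /** Delay before retry number {@code retry} (1-based): base, 2 x base, 4 x base, ... */
  static Duration backoff(Duration base, int retry) {
    return base.multipliedBy(1L << (retry - 1));
  }

  /**
   * Calls {@code post} (which returns the HTTP status of one delivery attempt) until it
   * succeeds or {@code maxAttempts} is reached. Returns the number of attempts used,
   * or -1 if the sender gave up or was interrupted.
   */
  static int deliver(IntSupplier post, int maxAttempts, Duration base) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (isSuccess(post.getAsInt())) {
        return attempt;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(backoff(base, attempt).toMillis());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt for the caller
          return -1;
        }
      }
    }
    return -1;
  }
}
```

For the receiving monitoring service, the practical consequence is that an error response does not necessarily lose the event: the same event may be delivered again later, so handling it should tolerate repeated deliveries.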
Connecting Guest Journey Events with the Workflow Engine
Next, we looked for a workflow engine that was BPMN capable.
For backing services, our search led us to Camunda Cloud, a provider of BPMN solutions. Camunda Cloud 1.0 was released on 11 May 2021, so it's fairly new.
The following image outlines how the guest journey components interact with Camunda Cloud:
We recommend checking out Drafting Your Camunda Cloud Architecture – Part 1: Connecting the Workflow Engine with Your World to get some insight into different integration patterns.
For the process monitor use case we implemented, the consumed webhook events are passed as messages to the workflow engine.
Starting the Workflow Service
When a booking is initially created, we receive a webhook event named booking created and start a new workflow instance. The code below demonstrates this:
package ch.aaap.bpmn.monitoring.service;

import ch.aaap.bpmn.monitoring.MonitoringApplication;
import ch.aaap.bpmn.monitoring.domain.Booking;
import io.zeebe.client.api.response.WorkflowInstanceEvent;
import io.zeebe.spring.client.ZeebeClientLifecycle;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
@RequiredArgsConstructor
public class StartWorkflowService {

  private final ZeebeClientLifecycle client;

  // Starts a new instance of the monitoring process for the given booking.
  public Mono<WorkflowInstanceEvent> start(Booking booking) {
    // join() blocks until the engine responds, so the call is wrapped in
    // fromCallable and only runs once the Mono is subscribed.
    return Mono.fromCallable(
        () ->
            client
                .newCreateInstanceCommand()
                .bpmnProcessId(MonitoringApplication.PROCESS_ID)
                .latestVersion()
                .variables(booking) // the booking becomes the process variables
                .send()
                .join());
  }
}
Sending Guest Journey Events
After a booking is created, the guest journey continues with further events:
- The Property Management System webhook receives a booking change event.
- The Message Carrier webhook receives a message sent event.
- The Door Access webhook receives an access issued event.
Each of these events is published to the workflow engine as a message, with the message ID as the correlation key:
package ch.aaap.bpmn.monitoring.service;

import ch.aaap.bpmn.monitoring.domain.Message;
import io.zeebe.client.api.response.PublishMessageResponse;
import io.zeebe.spring.client.ZeebeClientLifecycle;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
@RequiredArgsConstructor
public class MessageSenderService {

  private final ZeebeClientLifecycle client;

  // Publishes a guest journey event to the workflow engine.
  public Mono<PublishMessageResponse> send(Message message) {
    return Mono.fromCallable(
        () ->
            client
                .newPublishMessageCommand()
                // The message name selects the receive task that is waiting for it.
                .messageName(message.getType().name())
                // The correlation key selects the process instance.
                .correlationKey(message.getId())
                .send()
                .join());
  }
}
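The `Message` domain class itself is not shown in this post. As a rough idea of what it might carry, here is a hypothetical stand-in: the class name `GuestJourneyMessage`, the `fromPath` helper, and the assumed route layout `/guest-journey/{id}/{type}` are illustrative guesses, while the type names are taken from the example requests in this post.

```java
import java.util.Objects;

/** Hypothetical stand-in for the Message domain class; the real class is not shown. */
final class GuestJourneyMessage {

  /** Message names as they appear in the example requests of this post. */
  enum Type { BOOKING, ONBOARDING, CHECKIN, PROPERTY_ACCESS, CHECKOUT }

  private final String id;
  private final Type type;

  GuestJourneyMessage(String id, Type type) {
    this.id = Objects.requireNonNull(id, "id");
    this.type = Objects.requireNonNull(type, "type");
  }

  /** Used as the correlation key when publishing the message. */
  String getId() {
    return id;
  }

  /** Its name() is used as the message name when publishing. */
  Type getType() {
    return type;
  }

  /** Parses a path such as "/guest-journey/booking-001/CHECKIN" (assumed route layout). */
  static GuestJourneyMessage fromPath(String path) {
    String[] parts = path.split("/");
    if (parts.length != 4 || !"guest-journey".equals(parts[1])) {
      throw new IllegalArgumentException("Unexpected path: " + path);
    }
    // Type.valueOf throws IllegalArgumentException for unknown event types.
    return new GuestJourneyMessage(parts[2], Type.valueOf(parts[3]));
  }
}
```

Modeling the event type as an enum means an unknown event type is rejected before anything is published to the workflow engine.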
Notifying Camunda Cloud
Next, Camunda Cloud is notified of these events:
- Start Booking
  curl -X 'POST' \
    'http://localhost:9085/guest-journey' \
    -H 'accept: */*' \
    -H 'Content-Type: application/json' \
    -d '{
      "arrival": "2021-10-19T14:00:00.000Z",
      "departure": "2021-10-21T11:00:00.00Z",
      "id": "booking-001"
    }'
- Booking confirmed
  curl -X 'POST' \
    'http://localhost:9085/guest-journey/booking-001/BOOKING' \
    -H 'accept: */*' \
    -d ''
- Guest onboarding notification sent
  curl -X 'POST' \
    'http://localhost:9085/guest-journey/booking-001/ONBOARDING' \
    -H 'accept: */*' \
    -d ''
- Guest check-in done
  curl -X 'POST' \
    'http://localhost:9085/guest-journey/booking-001/CHECKIN' \
    -H 'accept: */*' \
    -d ''
- Property access issued
  curl -X 'POST' \
    'http://localhost:9085/guest-journey/booking-001/PROPERTY_ACCESS' \
    -H 'accept: */*' \
    -d ''
- Guest online checkout done
  curl -X 'POST' \
    'http://localhost:9085/guest-journey/booking-001/CHECKOUT' \
    -H 'accept: */*' \
    -d ''
- Workflow in action
Alerting Process Failures
The above scenario assumes everything worked as it should. To test alerting on process failures, we need a failure scenario. Here, we use the case in which the guest data in an online booking is incomplete.
In a production system, process failures are sent to various channels. Some are pushed to Slack, and others are created as tickets in a ticketing system. In our example, the alert is logged to the console as an error.
The extended BPMN looks like this:
- The message receive task, Booking Confirmed, must receive the message BOOKING.
- A boundary timer event waits one minute for the message to be received.
- When the message isn’t received on time, the Alert Booking service task is triggered and a process incident is created.
- The Alert Booking service task has a boundary message receive event, which allows it to listen for the BOOKING message, which may be received later. If it’s received, the process can continue.
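The interplay of the receive task, the boundary timer, and the late message listed above can be illustrated in plain Java. This is only a sketch of the control flow, not how the workflow engine implements it: `BoundaryTimerSketch` and `awaitWithAlert` are hypothetical names, and the message is modeled as a `CompletableFuture` that completes when it arrives.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative control flow of a receive task with a boundary timer; hypothetical names. */
final class BoundaryTimerSketch {

  /**
   * Waits for {@code message} for at most {@code timeout}. If the timer fires first,
   * {@code alert} runs and the method keeps waiting, so a late message still lets the
   * process continue. Returns true if an alert was raised.
   */
  static boolean awaitWithAlert(CompletableFuture<Void> message, Duration timeout, Runnable alert) {
    try {
      message.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
      return false; // message arrived on time, no alert needed
    } catch (TimeoutException e) {
      alert.run(); // corresponds to triggering the alert task
      message.join(); // corresponds to listening for the late message
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for the message", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Waiting for the message failed", e);
    }
  }
}
```

Calling `awaitWithAlert(bookingMessage, Duration.ofMinutes(1), raiseAlert)` would match the one-minute timer from the list above, where `bookingMessage` and `raiseAlert` are placeholders for whatever delivers the message and raises the alert.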
Alert Booking Incomplete
The following code shows a service task implementation of an incomplete booking alert:
package ch.aaap.bpmn.monitoring.service;

import ch.aaap.bpmn.monitoring.MonitoringApplication;
import io.zeebe.client.api.response.ActivatedJob;
import io.zeebe.client.api.worker.JobClient;
import io.zeebe.spring.client.annotation.ZeebeWorker;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
@Slf4j
public class AlertService {

  // Handles jobs of the Alert Booking service task.
  @ZeebeWorker(type = MonitoringApplication.ALERT_BOOKING)
  public void alertBooking(final JobClient client, final ActivatedJob job) {
    // Failing the job with no retries left raises an incident for the process instance.
    client
        .newFailCommand(job.getKey())
        .retries(0)
        .errorMessage("Booking data incomplete")
        .send()
        .join();
    log.error("Booking data incomplete for {}", job.getVariables());
  }

  // Logs jobs of the default worker type.
  @ZeebeWorker(type = MonitoringApplication.DEFAULT_WORKER)
  public void defaultWorker(final JobClient client, final ActivatedJob job) {
    log.error("Default worker for {}", job.getVariables());
  }
}
- Workflow in action with alert
In the example above, an incomplete booking triggers an alert that is logged to the console. In the production application, however, it creates a ticket in the property's ticketing app, and the support team responds to the incident.
Sample Project on GitHub
If you want to learn more about how this works, check out the sample project on GitHub.
Conclusion
The use of BPMN to bridge the gap between developers and business specialists works perfectly for this scenario. Both parties share a common definition of the guest journey, and as a result, new requirements and the evolution of the guest journey can be designed and discussed with the same level of understanding.
As a reader, you may have the feeling that the solution only partially uses the power of the Camunda Cloud BPMN solution. 🤔
We fully agree. 👍 This post addresses the first evolution stage of a digital hospitality platform, where the scope was process monitoring and the detection of process failures. The next evolution stage addresses process orchestration, where we extend the BPMN model with service tasks to actively orchestrate the guest journey. Let us know if you like this post, and we may write a follow-up that addresses the topic of process orchestration.