MCPA MuleSoft Certified Platform Architect Level 1 – Designing Effective APIs Part 2
3. API Data Models
Hi. In this lecture, let us discuss API data modeling. So, what is an API data model? The API data model is nothing but the data structure that is exchanged between the parties. When I say parties, I mean the data structures exchanged between API clients and API providers (the API implementations), or between API implementations themselves.
For example, one example of a client-to-implementation exchange is a call from Postman to an Experience layer API implementation. And examples of implementation-to-implementation exchanges are the Experience layer calling a Process layer API, and the Process layer calling a System layer API. These data structures exchanged between the parties are what make up a data model. So, all the data types that appear in an API's interface together form the API data model for that particular API.
So the data model should be specified in the RAML definition of the API. API data models are clearly visible across the application network because they form an important part of the interface contract: the moment we put them in the RAML and publish, they are visible in Exchange, Design Center, the public portal, everywhere. So they play a very important part.
The API data model is conceptually separate from similar models that may be used inside the API implementation. The implementation might be written in an object-oriented way: it could be Java, .NET/C#, or a Mule application.
Internal to the implementation, there might be different model representations — a class in Java, or a different structure in a functional language. That internal model is clearly different from the API data model. The API data model is the piece that belongs to the interface of the API, the part that is in the RAML, and only this API data model is visible to the API clients. That is what is visible in Exchange, the public portal, and the application network. The other forms of models are not discussed in this lecture, because they are completely internal to the way the APIs are implemented.
For example, a JSON document coming in as a request to an API is part of our API data model; internally, that JSON might be converted or mapped to a class structure in your back-end language and used to implement the functionality. That part is not in the scope of this discussion. So now let us see what types of data models we have. As explained in the introduction of this section, there are two types: the enterprise data model and the bounded context data model.
Let's see these in detail. The enterprise data model is also popularly called the canonical data model; you might have heard that term before, canonicals and canonical documents. Here there is exactly one canonical definition for each data type, and this definition is reused in all the APIs across your organization that require that particular data type. For example, take our logistics or supply chain organization scenario, and say we have a data type called Item. We have seen Item in our Create Sales Order scenario: it has a quantity and an item ID.
Now, in an enterprise data model, that Item data type must be defined in such a way that it can be used across all lines of business (LOBs). Not only Create Sales Order: it should also satisfy the purchasing team with purchase orders, the sales team with sales orders, the returns team with return sales orders, and the buyback trading team with buyback orders. That same Item should cater for all these order types.
In our sales order, the Item has an item ID and a quantity ordered (plus a shipped quantity in the response). For a purchase order it could be an item ID and a quantity purchased; similarly, a return quantity for a return order. So the Item definition should be enriched accordingly: it should carry all those fields, with most of them optional, and each LOB will use the fields relevant to its context and ignore the others. But there will be only one Item. This is the enterprise data model: exactly one fully enriched, reusable data type defined per concept, reused among all the APIs. Now let's go to the bounded context data model. This may be a new term for you; it is also the current trend. In a bounded context data model, several bounded contexts are identified within your organization, based on their usage of common terminology and concepts.
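Before moving on to bounded contexts, the enterprise-wide Item described above could be sketched as a shared RAML library. This is only an illustration; the file name and field names are assumptions for our logistics example, not an official model:

```raml
#%RAML 1.0 Library
# enterprise-types.raml -- one canonical Item reused by ALL APIs in the organization
types:
  Item:
    type: object
    properties:
      itemId: string               # common to every LOB
      quantityOrdered?: integer    # sales and purchase orders
      quantityShipped?: integer    # sales order responses
      quantityPurchased?: integer  # purchasing team
      quantityReturned?: integer   # returns team
      quantityBoughtBack?: integer # buyback trading team
```

Notice how every LOB-specific field must be marked optional (`?`) so that one definition can serve all order types; each team populates only the fields relevant to its context and ignores the rest.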
Each bounded context then has its own set of data type definitions: every bounded context will have its own data types, its own RAML definitions, created within that bounded context. Such data models are called bounded context data models, meaning those data types belong to that particular bounded context only. To map this onto our previous example: one bounded context is purchasing, another is sales orders, returns is a third, and buyback trading is a fourth. Now the Item data type can be created individually within each bounded context. For sales orders, we can define Item with an item ID, a quantity ordered, and a quantity shipped.
That's it. And over in purchase orders, we can define Item with an item ID, a quantity ordered, and a quantity purchased. Similarly for the returns and buyback trading orders. That is how the trend changes: every bounded context may have its own definition of the same concept; according to the context, the fields may differ a little or some may match, but each definition belongs only to its particular bounded context. The bounded context data models of separate bounded contexts are unrelated, although they may share some fields, as we discussed.
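As a contrast to the single enterprise-wide type, here is a hypothetical sketch of two leaner, context-local Item types — file names and fields are illustrative only:

```raml
#%RAML 1.0 Library
# sales-order-types.raml -- visible only inside the Sales Order bounded context
types:
  Item:
    type: object
    properties:
      itemId: string
      quantityOrdered: integer
      quantityShipped?: integer   # present only in responses
```

```raml
#%RAML 1.0 Library
# purchase-order-types.raml -- the Purchase Order context defines its OWN Item
types:
  Item:
    type: object
    properties:
      itemId: string
      quantityOrdered: integer
      quantityPurchased: integer
```

Each type can now make its fields mandatory where they always apply in that context, instead of everything being optional to please every LOB — that is the practical payoff of bounded contexts.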
All APIs in a bounded context reuse only that bounded context's data model. For example, whatever APIs are created in the purchasing bounded context — create purchase order, retrieve purchase order, delete purchase order — and similarly create, update, delete in the sales order context: whatever APIs are created within a particular bounded context have to use the same data models. They cannot go and pick a data model from another bounded context.
To avoid all this confusion and not make things more complicated, most companies follow a simple thumb rule: one API is one bounded context. Which means, just like what we built earlier, they create only one API, say a Sales Order API, and declare that this API is the bounded context. All the RAML data type definitions — request, response, et cetera —
for that API, meaning for that bounded context, are created for that API itself, are not shared with other APIs, and are bound to that particular API project only. This keeps it simple, with no confusion about which bounded context an API or scenario falls under: one API is one bounded context. All right, so now let us see how to select between the enterprise data model and the bounded context data model. First, a little background. The coordination of API data models between multiple APIs adds lots of overhead, because APIs are built by different teams among multiple LOBs. If we have to build a proper data type that is common to many APIs or many LOBs, it is an extra overhead because it has to be well coordinated.
We have to talk to all the LOB teams, their managers or the business, get an understanding of the scenario, and then compose the definitions so that they cater to all of them. So it needs a lot of coordination effort. And this coordination effort applies not only during the initial data modeling, but also when changing something in that data model in the middle of the project or in the future, and also while rolling out those changes to all the APIs. Coordination is required in all three stages. Initially, during modeling, it has to be coordinated to build a shared understanding. Once done, every change also needs coordination, because if the same data type is used in ten APIs, even a small change creates friction among the teams, and everyone hesitates.
Everyone worries that the change may break their functionality; even if it is only an optional field, the hesitation is always there. The same goes for rollout: deploying to higher environments and to production must also be coordinated. This is one of the reasons why enterprise data models, although a seemingly attractive idea, are often not successful. They need a lot of coordination effort, and the same data type must be used everywhere, which is difficult — and the bigger the organization, the harder the coordination becomes. That is why it is often not successful. So if there is no successful enterprise data model already in your organization, then it is better to go for the bounded context data model.
Who is the enterprise data model a best fit for? Very big companies may already have well-established, well-defined enterprise data models from five or ten years back, because this data model concept is very much technology-agnostic: if they knew their business and their organization's working model well, the data model would have been created with whatever technology was current then. Maybe they are XSD files, XML schemas. That is fine — what matters here is the model, not whether it is XML or JSON. So if the enterprise has all its data models well defined, even in XSD, agreed among the LOBs and working with the legacy systems today, then the same enterprise data model can be taken; it just needs to be converted to a JSON structure (or whatever is agreed), which is an easy task any developer can do, and then it can be used in the API modeling as well.
This is good for big organizations that already have a well-established enterprise data model. But if your organization is starting fresh, with no existing data models, then it is better to go for bounded context data models. So: if there is a successful enterprise data model, then all Process APIs and System APIs should reuse it as much as possible. The API data model of an Experience API, on the other hand, is determined by the needs of the top-level API clients — the API consumers — because it is very unlikely that the enterprise data model will match your Experience API's request and response data models; the consumers may have different needs.
The enterprise data model fits the data structures exchanged internally within your organization, but externally the needs may differ. So for an Experience API, the data model depends on the API consumers' experience, while Process and System APIs can reuse the enterprise data model if you already have a successful one. If you do not, the bounded context model is the best fit. Now let us see the steps required to identify the bounded contexts and the bounded context data models. What MuleSoft, and IT architects in general, recommend for establishing a bounded context data model properly is the following.
If your organization is starting with this model, you have to first, number one, identify the bounded contexts. Then, once the bounded contexts are identified, number two, assign each API to exactly one bounded context, based on the defining data types of that API. One thing to watch out for when assigning: if an API has no clear set of defining data types, or if those data types are used in significantly different variations across the different operations or resources of the API, that is an indication that the API design is a bit wrong — the API is likely too coarse-grained. Otherwise, why would your data type change between operations or resources? If your retrieve has a data type, then your create should have a similar one; if it varies drastically, the API as planned is too coarse-grained. So break up that API into different APIs and assign them to different bounded contexts.
At the end of the day, each API must be assigned to exactly one bounded context — which, as we discussed, is the best practice many companies follow. Then, third, define a bounded context data model for each identified bounded context (or API): now you start your RAML definitions and create your data types, request and response, for each bounded context separately, as per the needs of that particular API. And fourth, always reuse the bounded context data model only in the APIs of that bounded context. Technically speaking: always use the data types created for a particular API, after assigning it to a bounded context, within that API only — among its different operations or resources — but do not share them with other APIs. So, how do you identify these bounded contexts?
We said: identify, assign, define, reuse. Assigning, defining, and reusing are easy; but in the first place, how do you identify the bounded contexts, and how many do you have? Again, there is a recommendation. Number one, start with your organization structure — as always, start there — and aim for the structural units where important business concepts are used. Like we discussed: purchase orders, sales orders, return orders, buybacks — these different kinds of business concepts should be targeted first. Then check whether those can be broken down further. If purchase order is the lowest, finest business scenario, you can stop there and mark it as one bounded context. But if you are in doubt, prefer even smaller bounded contexts. If you still have doubt, go to the lowest level we discussed: put each API in its own bounded context.
So if you still doubt — if you do not know how far down to go, or what exactly the business concepts under the organization structure are, functionally or domain-wise — and you cannot confidently identify the bounded contexts, simply go with a bottom-up approach and mark each API you develop as one bounded context. Meaning, for every API of yours, create separate data types and use them inside that API only; don't share them with others, so each API has its own home and its own things to keep clean, with no mess spread everywhere. This last way is a bit crude, but people still do it: if you assign each API to one bounded context, then you need not coordinate the API data models between APIs at all, so it stays clean.
That way, from a project perspective, it stays clean. Now, a bounded context data model, like any other RAML asset, should still be published. Just because one API is one bounded context does not mean we can skip publishing it as RAML fragments — "there is no reusability, we are not sharing with other APIs, why publish to Exchange?" No: we still have to publish the types as RAML fragments (RAML types and libraries) in Anypoint Design Center and Anypoint Exchange. Point number one: even if you are not sharing it with other APIs, it has to be visible in Exchange so that everybody knows it exists. Point number two: it still has to be reused among the API's own operations and resources — retrieve, create, update, delete; I am naming only four, but there can be many more. For them to reuse it, it should still be on Exchange.
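Assuming the sales-order types library has been published to Anypoint Exchange (the dependency path and names below are illustrative, following the usual `exchange_modules` layout), the API's main RAML pulls it in with a `uses:` statement and reuses the same type across its resources:

```raml
#%RAML 1.0
title: Sales Order API
version: v1

uses:
  # reference to the published fragment; org ID and asset coordinates are placeholders
  salesTypes: exchange_modules/my-org-id/sales-order-types/1.0.0/sales-order-types.raml

/salesOrders:
  post:
    body:
      application/json:
        type: salesTypes.Item      # same type reused for create...
  /{orderId}:
    get:
      responses:
        200:
          body:
            application/json:
              type: salesTypes.Item  # ...and for retrieve
```

The reuse happens between the API's own operations, while the type stays private to this bounded context.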
So they have to be published. Generally, the C4E in your organization owns this activity; the harvesting of the API types will be done by that team, and they have to sign off on and drive all these activities. Now, one last concept in these data models: how to map between bounded context data models. Let's see how we can map the different data models across bounded contexts. We keep saying that in the bounded context data model concept, each bounded context has its own set of data types, and no API shares them with other APIs. Building them that way is fine — say we built it. But at the end of the day, one API may have to call another API, correct? Say API X, the purchase order API, has an Item data type defined with particular fields A, B, C.
Now, for whatever reason, this API calls another API, say the sales order API, which also has an Item data type, but with different fields X, Y, Z. Obviously, once we call an API, either we pass it a request which may contain an Item, or, if it is a retrieve call, the sales order response may contain an Item. Now there is a conflict, right? There is an Item in that API and an Item in this one — one in this bounded context and one in that bounded context — so how are they reconciled? Distinct bounded context data models generally have, or at least there is a good chance they have, similar data models with minor variations in the field names.
In this situation, there is nothing else we can do; there has to be pure cooperation between the APIs. You may wonder what I mean by cooperation — humans can cooperate, but what about APIs? Well, there is a concept of cooperation for APIs as well. Like I said, if API A is calling API B, and both of them have a similar data type, then the convention to follow is: the calling API always cooperates and introduces something called an anti-corruption layer in the middle. When I say "in the middle", it is not between A and B; it is inside A only. Just before calling API B — after API A does whatever it has to do, gathers its information and so on — that place is what we call the anti-corruption layer. What it has to do is convert A's data model into the type that API B is expecting.
It might be a JSON format; A has to convert its structure into that format and send it. Similarly for responses: if API A calls API B and B gives back a response containing its Item, then A has to take the details from the response and map them into its own Item structure. If it is a request, A has to convert from its Item structure to B's Item structure. This approach to mapping between bounded context data models is called the anti-corruption layer. There are other variants of mapping between bounded context data models, depending on where the transformation occurs, but those fall into the domain of detailed design and are not visible at the enterprise architecture level, so we will not discuss them in this course. In most projects, what people do is this: the caller always cooperates and converts into the target API's format on the way out, and converts from it back into the local format on the way in.
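In a Mule application, such an anti-corruption layer is often nothing more than a transformation placed immediately before the outbound call (and another one after it, for the response). A minimal DataWeave sketch of the request-side mapping, using the illustrative field names from our example:

```dataweave
%dw 2.0
output application/json
// Anti-corruption layer inside API A (the caller):
// translate A's purchase-order Item into the Item shape API B (sales order) expects.
// Field names are assumptions for this example, not a standard model.
---
{
    itemId: payload.itemId,
    quantityOrdered: payload.quantityPurchased default 0
}
```

The mirror-image transform on the response path maps B's Item back into A's structure, so B's field names never leak into A's bounded context — that containment is exactly what the "anti-corruption" name refers to.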
That way the exposing API, API B, need not worry: API B always receives requests in the format it expects and sends responses in the format it defined. A is the one who needs the information from B, so A cooperates and translates. If, for some reason, B has to call A, then B does the same thing, and in that case A does nothing extra. This is how the cooperation should work. The best thing we as architects can do here is to identify — where possible; I know everything is not always possible — which such cross-context API interactions may come up, for example by drawing an interaction diagram, seeing where the APIs may intersect, and minimizing those crossings as much as possible, so that fewer anti-corruption layers are needed. That is how we can help.
But if such an interaction is demanded by the business, there is no option but to introduce the anti-corruption layer in the calling service: whoever the caller is, it has to be implemented on the caller's side. This is how you have to understand the relationships between bounded contexts. More generally, there are power relationships at play in your organization. Not all LOBs are equal; there will always be some powerful ones. You might have seen this in companies: in some, the IT team is the most powerful because of the CTO; in others it is the business, HR, or sales team. Every company has some powerful teams, and when such teams come into play, they try to use their influence: "no, our data model should not have to adapt to theirs; they should introduce the anti-corruption layer on their side" — and that kind of firefighting will happen.
So whenever an API implementation in one bounded context invokes an API from another bounded context, domain-driven design terminology describes the relationship in three ways. First, partnership: the caller and the callee coordinate as partners in terms of features and timelines — API A and API B are the partners, and the partnership is the coordination between caller and callee. Second, customer/supplier: the caller requests features from the callee, who may have to coordinate many callers' feature requests. And third, conformist: the caller must simply work with whatever the callee provides — meaning whatever API B provides, the caller takes it and converts it.
The conformist approach backed by an anti-corruption layer is usually the cleanest principle, so that ego and influence factors do not come into play. The simple rule is: if you are calling me, you want something from me, so you take care of the translation inside your layer. That is the anti-corruption layer. All right, so that is what the data models are about: the enterprise data model and the bounded context data model. To summarize again: if your organization is well established and already has an enterprise data model — even if it is in XSD or another format — then it is better to go with that model, just converting it to the new structures like JSON types or RAML fragments, and publishing them.
That best fits very large organizations. But if you are starting new, or you do not have a well-defined enterprise data model, then bounded context data models are the best fit. The rules to remember: it is better to map one bounded context to one API, and each data model created in a particular bounded context should be reused within that bounded context only — meaning within that API — and not shared across.
And whenever there has to be a call across different bounded contexts, the caller — whoever is calling — should introduce the anti-corruption layer and sort it out inside their own bounded context by converting to and from the callee's format. Okay, so let's move on to the next sections. There are some exercises given, just to brainstorm on your side about bounded contexts and data models; please try to do the assignments and exercises. And let's move on to the next lecture, which is on how to design system APIs by abstracting them from the back-end systems. Happy learning.