Isolating your Data Layer

In a truly layered architecture (MVVM, Viper, etc.), the data layer should be confined to its own layer, because every other piece just wants the data. Unfortunately, with Core Data and other technologies such as Realm, the implementation details of this layer (threads, contexts) tend to leak into other layers or into view logic. This makes for an architecture that is harder to scale. The problem can be solved with plain old Swift objects (POSOs), paired with an understanding of what you lose and how to overcome it. We will discuss how to move to POSOs while maintaining performance.


Intro (0:00)

In America, there is a type of cheese we call Swiss cheese. As a kid, my dad took us around the world, including a brief stop in Switzerland. While there my dad took us to a market, and suggested that we ask for some Swiss cheese. With a big smile, the deli worker announced, “It’s all Swiss cheese.” While technically true, our understanding of Swiss cheese was different. I should have asked for Emmentaler.

I’m Jon Bott, and I want to talk about how we can clear up these types of misunderstandings within our code, specifically in the data layer. I’ll make the assumption that we’ve all heard of MVP type architectures.

Which is best?

Well, I would argue that it doesn’t matter as much as the decision to separate the layers into their different responsibilities. I’m more interested in the general idea of an MVA, that is, a Minimum Viable Architecture. An MVA includes the same separation of:

  1. The view layer
  2. The business logic layer
  3. The model or data layer, including its network and database pieces

Each of these may have other classes to help keep things organized using SOLID principles.

Isolating the Data Layer (1:36)

With this layering concept comes the responsibility of isolating the data layer, that is, hiding the implementation details of that layer.

Why?

Have you ever experienced a Core Data or Realm error up by the view layer? Would you agree that few developers understand the nuances of a complex data layer, regardless of the technology? A good question to ask is, are we ignoring the warning signs of leaking our data layer details?

These types of concerns shouldn’t go beyond the data layer:

  • What thread were these objects created on?
  • What thread should they be saved on?
  • Does my context exist anymore?
  • What about transactions and deletion rules?
  • Is my API erroring out?

I’m not arguing against using ORM-like technologies, but just like with my Swiss cheese story, let’s remove the confusion about what these objects are and how they should be used outside of the model layer.

Outside the Model Layer (2:47)

Objects beyond this layer should be simple and easy to work with. In the view layer, when I ask a person’s address, I don’t want to worry about threads, faults, or complex object graphs. I just want the address.

Let’s think of this in terms of a “toll free” object, with no strings attached. On this, Robert Martin has said:

“Typically the data that crosses the boundaries is simple data structures. You can use basic structs or simple Data Transfer objects if you like… The important thing is that isolated, simple, data structures are passed across the boundaries.”

These data transfer objects are separate from the concrete implementation details of the model layer.
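As a rough sketch of what such a boundary might look like, the snippet below puts a small protocol between the data layer and everything above it. The names `PersonDTO`, `PersonProviding`, `InMemoryPersonProvider`, and `addressLine` are assumptions made up for this illustration, and the in-memory dictionary only stands in for a real Core Data or Realm store.

```swift
// Hypothetical DTO: only the data the upper layers need, trimmed for the example.
struct PersonDTO {
    let name: String
    let address: String?
}

// Hypothetical boundary: the only thing the upper layers know about the data layer.
protocol PersonProviding {
    func person(withID id: Int) -> PersonDTO?
}

// Stand-in implementation. A real one would wrap Core Data or Realm and
// translate its managed objects into PersonDTO values before returning them.
struct InMemoryPersonProvider: PersonProviding {
    let people: [Int: PersonDTO]

    func person(withID id: Int) -> PersonDTO? {
        return people[id]
    }
}

// View-layer code: it asks for the address and gets the address,
// with no knowledge of threads, contexts, or faults.
func addressLine(forPersonID id: Int, using provider: PersonProviding) -> String {
    return provider.person(withID: id)?.address ?? "No address on file"
}

let provider = InMemoryPersonProvider(
    people: [1: PersonDTO(name: "Jon", address: "1 Main St")]
)
let line = addressLine(forPersonID: 1, using: provider)
```

Because the view-layer function depends only on the protocol and a plain struct, the storage technology behind it could be swapped without touching the caller.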

DTOs (3:36)

What are these DTOs? Today, we’ll just call them POSOs, Plain Old Swift Objects. They’re simple objects with little to no logic in them. They contain all the data needed from the model layer objects, and they’re not tied to any single context or thread; they’re simple and clear to work with.

How do we create these DTOs? We translate the model layer objects into POSOs, and the result is a toll free object that can be passed anywhere.

Example (4:22)

Here’s a DTO alongside its corresponding Core Data definition. All the properties are the same.

	class EventDTO {
		var type: EventType
		var date: Date?
		var info: String?
		var location: LocationDTO?
		var targetPerson: PersonDTO?

		init(type: EventType,
			 date: Date? = nil,
			 info: String? = nil,
			 location: LocationDTO? = nil,
			 targetPerson: PersonDTO? = nil)
		{
			self.type = type
			self.date = date
			self.info = info
			self.location = location
			self.targetPerson = targetPerson
		}
	}

Here’s an example of the translation methods. This one converts from Core Data to the DTO, and it uses other translators to translate the other types of entities or objects underneath.

	func translate(event: Event?) -> EventDTO? {
		guard let event = event else { return nil }

		let date = event.date as Date?

		let location	= locationTranslator.translate(location: event.location)
		let targetPerson = personTranslator.translate(person: event.targetPerson)

		let eventDTO = EventDTO(type: event.type,
								date: date,
								info: event.info,
								location: location,
								targetPerson: targetPerson)
                
		return eventDTO
	}

Here’s an example of using the DTO back to Core Data. Again we’re using those translators and we have a context coming in.

	func translate(dto: EventDTO?, context: NSManagedObjectContext) -> Event? {
		guard let dto = dto else { return nil }

		// Bridge the Swift Date to NSDate for the managed object.
		let date = dto.date as NSDate?

		let location	= locationTranslator.translate(dto: dto.location, context: context)
		let targetPerson = personTranslator.translate(dto: dto.targetPerson, context: context)

		// Create the managed object in the context supplied by the caller.
		let event = Event(context: context)
		event.type = dto.type
		event.date = date
		event.info = dto.info
		event.location = location
		event.targetPerson = targetPerson
		
		return event
	}

Pros and Cons (4:59)

What are the pros? We’ll have:

  1. No more context or threading issues
  2. No faulted data
  3. No model complexities exposed to the upper layers

In general, we’ll have fewer bugs because this data is easier to work with. Most importantly, this scales well for large or multiple teams, as developers can specialize in the areas where they work best, in the UI or in the back end. Simply put, there will be fewer headaches.

There are cons, though. Basically, there’s no magic. We have to manage the object graph ourselves, meaning we need to determine how many generations of objects to load, how many of their dependent objects to load, and when to clear out unused objects.
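One possible way to keep the object graph bounded is to give the translator an explicit generation limit. The sketch below is only an illustration under assumptions: `PersonRecord` is a made-up stand-in for a managed model object, and this `PersonDTO` shape and the `generations` parameter are hypothetical rather than taken from the talk's code.

```swift
// Hypothetical stand-in for a managed model object with relationships.
final class PersonRecord {
    let name: String
    var parents: [PersonRecord] = []

    init(name: String) {
        self.name = name
    }
}

// Hypothetical DTO. `parents` is left empty once the generation limit is reached.
struct PersonDTO {
    let name: String
    let parents: [PersonDTO]
}

// Translates a record plus at most `generations` levels of its ancestors.
func translate(person: PersonRecord, generations: Int) -> PersonDTO {
    let parents: [PersonDTO]
    if generations > 0 {
        parents = person.parents.map {
            translate(person: $0, generations: generations - 1)
        }
    } else {
        parents = []
    }
    return PersonDTO(name: person.name, parents: parents)
}

// A three-generation graph: child -> mother -> grandmother.
let grandmother = PersonRecord(name: "Grandmother")
let mother = PersonRecord(name: "Mother")
mother.parents = [grandmother]
let child = PersonRecord(name: "Child")
child.parents = [mother]

// Load the child and one generation of ancestors only.
let shallow = translate(person: child, generations: 1)
```

Here `shallow` contains the child and the mother, while the grandmother is never translated. The caller decides the depth, which makes the cost of each request visible instead of relying on faulting.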

We also have to handle the translation to and from POSOs, which takes more time to write and to process, though with small data sets the runtime impact is negligible. I would argue that we spend more time chasing down threading bugs than the few days it would take to write these translators.

Summary (6:09)

By using DTOs, we’re effectively isolating the details and complexities of our data layer to where they belong. The view and the business logic layers will only be responsible for how to use that data.

When the developer asks for the Swiss cheese of data objects, let’s give them Emmentaler, or DTOs, or as I like to say, “the deets.”

Q&A (6:41)

Q: (asked in Japanese)

Bott: While I was preparing this presentation, I built an example. I work at Ancestry.com, a genealogy company, and we basically work off of the fact that a person has two parents, and those people each have two parents, and so on. It’s an exponential data problem. People have events in their lives, their births, their deaths, and those events are tied to other people, and they’re tied to locations.

The data set that you load in quickly gets very large, and so the question to me sounds like: where do we draw the line between having this loose coupling and the overly burdensome process of translating objects?

Imagine a person, for example, me: I’m married, and I have kids. I have events that are tied to other people. The object graph gets very large, and as soon as we start moving to these DTOs or these POSOs we have to decide how big that object graph gets. It’s a very difficult question.

I think in normal data sets that we’re presenting, though, we don’t need to show all 10 million of my ancestors or my family. We only need to show nine to twenty rows of data.
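As a minimal sketch of that idea, the data layer could translate only the window of rows the UI is about to show. The `PersonRowDTO` type and the `rows(from:offset:limit:)` function are hypothetical names for this example, and the array of names only stands in for the backing store.

```swift
// Hypothetical DTO for one row in a list.
struct PersonRowDTO: Equatable {
    let id: Int
    let name: String
}

// Hypothetical data-layer function. `allNames` stands in for the backing store;
// a real implementation would push the window into the fetch itself
// (for example with NSFetchRequest's fetchOffset and fetchLimit).
func rows(from allNames: [String], offset: Int, limit: Int) -> [PersonRowDTO] {
    guard offset >= 0, limit > 0, offset < allNames.count else { return [] }
    let end = min(offset + limit, allNames.count)
    // Only the requested window is translated into DTOs.
    return (offset..<end).map { PersonRowDTO(id: $0, name: allNames[$0]) }
}

let names = (0..<1_000).map { "Person \($0)" }
let visible = rows(from: names, offset: 40, limit: 20)
```

With this shape, the translation cost tracks the number of visible rows rather than the size of the whole data set.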

What I’m arguing for is that as your team grows, you want to have objects so that people who work in the UI layer or the upper layers don’t have to understand threads in order to get this process to happen.

You would have to ask yourself and your team: “At what point do we want to make it easier to create this data, and store the data?”

If your team is comfortable with many people working in the database and you’re on a single-threaded Core Data or Realm app, you’re fine. If you are dealing with 24 or 30 team members, and only two developers really understand the data layer, then it might be worth it to move in this direction.

If you use MVP, MVVM, or Viper, all of those have a built-in separate object that moves between the layers. So how much do you want to adhere to MVP, MVVM, Viper, those types of architectures? Are you a 100% MVP-type architecture, or is it more of a fluid boundary between the layers? You would have to answer that as a team.

Jon Bott

Jon Bott is a senior iOS developer at Ancestry.com and a training consultant. He has a large range of experience, from front-end development (iOS, Android, and web) to back-end programming, both in large-scale commercial apps and educational apps. Jon is an aspiring photographer and has also worked with media creation (video and audio), as well as developed on platforms for streaming and consuming that media.

Transcribed by Joseph Buelow
Edited by Curtis Chen