Encapsulation

What is encapsulation?

Encapsulation is one of the core concepts of Object Oriented Programming. Some people call those concepts ‘the pillars of OOP’. The other core concepts are Inheritance, Polymorphism and Abstraction. It is quite hard to really define encapsulation, because there are so many different opinions.

Generally, in the context of Object Oriented Programming, people always seem to mention two points when talking about encapsulation: bundling data with the operations on that data, and data hiding. Opinions are divided. Some say encapsulation is one of those two points, some say it is both, and some groups even say it differs per language (and thus see encapsulation as a language mechanism). I will not take sides in that debate. I am interested in software design, so I am looking at encapsulation from a software design point of view.

According to the dictionary, encapsulation is the act of enclosing something in or as if in a capsule. I really like that wording. I imagine a capsule to be something like a time capsule: you can’t look inside it and there are limited ways of ‘interacting with it’.

In the context of software design, I would say that encapsulation is a design technique that uses both bundling data with operations and hiding that data. We try to mimic the time capsule. The main idea is that you bundle data and the operations on that data into a class or any other code container. What data and operations are we talking about here? Truth is, it can be anything, but in order to have a decent design, you should encapsulate data and operations that belong together, for example because they form a process. Then there is the part where we ‘hide’ that data, and the operations as well! In fact, you want to ‘hide’ as much of the data and the operations as possible. Hide them from whom? From the users of your code: other developers who plan on using it.
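To make the two ingredients concrete, here is a minimal sketch. The `Thermostat` class and all of its names are invented for this illustration; they are not part of the application discussed in this post.

```java
// Hypothetical example: bundles a target temperature with the operations on it.
class Thermostat {

    // The data is private: other classes cannot read or write it directly.
    private double targetCelsius = 20.0;

    // The limited ways of 'interacting with the capsule'.
    void warmer() {
        targetCelsius = clamp(targetCelsius + 0.5);
    }

    void cooler() {
        targetCelsius = clamp(targetCelsius - 0.5);
    }

    double target() {
        return targetCelsius;
    }

    // A hidden helper: an implementation detail that clients never see.
    private static double clamp(double celsius) {
        return Math.max(5.0, Math.min(30.0, celsius));
    }
}
```

The data (`targetCelsius`) and the operations live together, and a client can only nudge the temperature up or down. The allowed range and the way it is enforced stay inside the class.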

An example

Maybe we need an example to better explain this. The following class comes from a small application which is based on a real application I created a couple of years ago. I have adjusted it a tiny bit for the sake of making an example. I will have to explain the context of the application a bit in order for it to make sense. 

The application parsed csv files and checked them for errors before they were automatically processed. The csv files were database exports and they came without a header. The header (usually the first line of a csv file), containing the column names, came separately. So a possible error in a line of one of those csv files was that the line contained either more or fewer columns than the header did.
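The column check described above could be sketched as follows. The class and method names are made up for this illustration and are not taken from the application. Two details of `String.split` matter when counting columns: without a negative limit it drops trailing empty strings, and it treats the delimiter as a regular expression. The sketch uses a limit of `-1` and `Pattern.quote` to avoid both surprises.

```java
import java.util.regex.Pattern;

// Hypothetical helper that compares the column count of a data line with the header.
class ColumnCount {

    // Counts fields, keeping trailing empty ones and treating the delimiter literally.
    static int countFields(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1).length;
    }

    // Returns a short verdict for one data line.
    static String check(String headerLine, String dataLine, String delimiter) {
        int expected = countFields(headerLine, delimiter);
        int actual = countFields(dataLine, delimiter);
        if (actual < expected) {
            return "not enough fields";
        }
        if (actual > expected) {
            return "more fields than the header";
        }
        return "ok";
    }
}
```

With this sketch, a line such as `1|Widget|` checked against the header `id|name|price` counts as three fields, because the empty last column is kept.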

These csv data files sometimes contained values that were unknown in our system. Therefore we had ‘translation files’. These were just files containing all the possible unknown values, each followed by the matching value on our end. Take product codes, for example: a product in a data file could have the code ‘ProductA’ while in our system the code was simply ‘A’. Each column could have its own set of translations, which means that each csv file could have an entire set of translation files. So another possible error was a value that was not in the translation files.

Here is a reworked version of the class that did the parsing of the csv files. This class is a good example of encapsulation. Don’t let the size of the class scare you. You don’t have to understand the implementation in order to understand encapsulation :). Do mind, the class still has its design defects. I left them in on purpose, so don’t worry.

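import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// FileValidation and DataFileType are types from the application that are not shown in this post.
public class ValidationComponent implements FileValidation {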

    private Path processDirectory = Paths.get("/filevalidation/processing");
    private Path headerFiles = Paths.get("/filevalidation/headerfiles");
    private Path translationFiles = Paths.get("/filevalidation/translationfiles");
    private Path fileToProcess;
    private String delimiter;
    private List<String> errors = new LinkedList<>();
    private int currentLineNumber;
    private DataFileType type;


    private String[] currentHeaders;
    private Map<String, Map<String, String>> currentTranslations;

    public ValidationComponent(DataFileType type, Path dataFileToValidate, String delimiter) {
        this.delimiter = delimiter;
        this.type = type;
        this.fileToProcess = dataFileToValidate;
    }

    @Override
    public List<String> checkFile() {
        Path dataFileInProcessing = copyDataFileToProcessDirectory();
        this.currentHeaders = readHeaders();
        this.currentTranslations = readTranslationFiles();
        processFile(dataFileInProcessing);
        return this.errors;
    }

    private Path copyDataFileToProcessDirectory() {
        File fileToCopy = fileToProcess.toFile();
        File copiedFile = Paths.get(processDirectory.toString(), fileToCopy.getName()).toFile();

        try (FileInputStream fis = new FileInputStream(fileToCopy);
             FileOutputStream fos = new FileOutputStream(copiedFile)) {

            int read;
            byte[] buffer = new byte[512];

            while ((read = fis.read(buffer)) != -1) {
                fos.write(buffer, 0, read);
            }

            return Paths.get(copiedFile.toString());
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return null;
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    private String[] readHeaders() {
        Path headerFile = Paths.get(headerFiles.toString(), type.getHeaderFileName());
        try (BufferedReader reader = new BufferedReader(new FileReader(headerFile.toString()))) {
            String line = reader.readLine();
            return line.split(delimiter);
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    private Map<String, Map<String, String>> readTranslationFiles() {
        Map<String, Map<String, String>> allTranslations = new HashMap<>();
        Path translationDirectory = Paths.get(translationFiles.toString(), type.getTranslationDirectoryName());
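        // Named differently from the translationFiles field to avoid shadowing it.
        File[] filesInDirectory = translationDirectory.toFile().listFiles();
        if (filesInDirectory == null) {
            // listFiles() returns null when the directory is missing or unreadable.
            return allTranslations;
        }

        for (File translationFile : filesInDirectory) {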
            Map<String, String> translation = new HashMap<>();
            String translationName = translationFile.getName().split("\\.")[0];
            allTranslations.put(translationName, translation);

            try (BufferedReader reader = new BufferedReader(new FileReader(translationFile.getAbsolutePath()))) {
                String translationLine = reader.readLine();
                while (translationLine != null) {
                    String[] splitTranslation = translationLine.split(":");
                    translation.put(splitTranslation[0], splitTranslation[1]);
                    translationLine = reader.readLine();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return allTranslations;
    }

    private void processFile(Path dataFileInProcessing) {
        File toProcess = dataFileInProcessing.toFile();
        try (BufferedReader reader = new BufferedReader(new FileReader(toProcess.getAbsolutePath()))) {
            String dataLine = reader.readLine();
            currentLineNumber = 1;
            while (dataLine != null) {
                String[] splitDataLine = dataLine.split(delimiter);

                checkLineLength(splitDataLine);
                checkTranslationsForLine(splitDataLine);

                dataLine = reader.readLine();
                currentLineNumber++;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void checkLineLength(String[] splitLine) {
        if (splitLine.length < currentHeaders.length) {
            errors.add(String.format("Error on line %s: this line does not contain enough fields", currentLineNumber));
        } else if (splitLine.length > currentHeaders.length) {
            errors.add(String.format("Error on line %s: this line contains more fields than the header", currentLineNumber));
        }
    }

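    private void checkTranslationsForLine(String[] splitLine) {
        // Only check columns that have a header; checkLineLength() already reports extra fields.
        int columns = Math.min(splitLine.length, currentHeaders.length);
        for (int i = 0; i < columns; i++) {
            String currentHeader = currentHeaders[i];
            Map<String, String> translations = currentTranslations.get(currentHeader);
            if (translations == null) {
                // Assumption: a column without a translation file needs no translation check.
                continue;
            }

            String fieldTranslation = translations.get(splitLine[i]);
            if (fieldTranslation == null) {
                errors.add(String.format("Error on line %s : no translation found for field %s", currentLineNumber, splitLine[i]));
            }
        }
    }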
}

I am not completely happy with the presentation of the above code. I am glad I got syntax highlighting to work, but I guess this blog template (which has a fairly small column width) is not the best for showing code. Here is the link to GitHub where you can view the file in a better format.

As you can see, this class has a lot of fields, which are all private. They can be considered the data. It also has quite a number of methods, all private except for one. These can be considered the operations. So this class bundles all of these together in one unit of code. Because all of the fields and most of the methods are private, they are ‘invisible’ to other classes: the fields cannot be accessed and the methods cannot be invoked. Other classes can only call the ‘checkFile()’ method and thus are not aware of the complexity that lies behind it. And because of that, this class is easy to use.

These complex details are hidden from the users of this code. You could say these details are ‘removed’, stripped away, because they do not matter for users of the code. Where have we seen this before? Details that do not matter are stripped (or hidden in this case). That sounds a lot like abstraction! When a process is encapsulated in a class and made easy to use by hiding complex details from the user, then you have made an abstraction of that process. 

Why is encapsulation important?

The bundling of data and operations and then hiding those behind a public interface makes users or clients of the code independent of implementation details. And if clients are not dependent on details, you have the power to change those details without forcing changes in client code. As change is inevitable in software, being able to change code without forcing change in other code is awesome! So we could say that encapsulation protects users of the code (other classes that depend on the encapsulated class) from changes. This is very important!
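As a rough illustration of that independence, the sketch below has two interchangeable implementations behind one interface. The `FileValidation` interface here is only a stand-in (the real definition is not shown in this post), and `SplittingValidation`, `ScanningValidation` and `Report` are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the interface used in this post; its real definition is not shown.
interface FileValidation {
    List<String> checkFile();
}

// First hypothetical implementation: counts fields by splitting each line.
class SplittingValidation implements FileValidation {
    private final List<String> lines;
    private final int expectedFields;

    SplittingValidation(List<String> lines, int expectedFields) {
        this.lines = lines;
        this.expectedFields = expectedFields;
    }

    @Override
    public List<String> checkFile() {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            int fields = lines.get(i).split(";", -1).length;
            if (fields != expectedFields) {
                errors.add("Error on line " + (i + 1));
            }
        }
        return errors;
    }
}

// Second hypothetical implementation: counts delimiter characters instead.
class ScanningValidation implements FileValidation {
    private final List<String> lines;
    private final int expectedFields;

    ScanningValidation(List<String> lines, int expectedFields) {
        this.lines = lines;
        this.expectedFields = expectedFields;
    }

    @Override
    public List<String> checkFile() {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            long delimiters = lines.get(i).chars().filter(c -> c == ';').count();
            if (delimiters + 1 != expectedFields) {
                errors.add("Error on line " + (i + 1));
            }
        }
        return errors;
    }
}

// Client code: it only knows checkFile(), so it works with either implementation.
class Report {
    static String summarize(FileValidation validation) {
        List<String> errors = validation.checkFile();
        return errors.isEmpty() ? "OK" : errors.size() + " error(s)";
    }
}
```

Swapping one implementation for the other requires no change in `Report`, because `Report` never depended on how the fields are counted.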

Encapsulation also hides complexity. By bundling both data and operations behind a simple public interface, you can shield other developers from the complexity of processes. They don’t need to understand the hows and whys of the process, but they can use it anyway.

Hiding the internal state of an object also protects that state. Imagine that ‘ValidationComponent’ is a long-lived object and that other objects could manipulate the location of the translation files or the delimiter that is currently in use. They could cause the ‘ValidationComponent’ to behave in unexpected ways! Maybe an error will be thrown, maybe not. Maybe all is fine as far as the system knows and no error is thrown, but the result is just plain wrong. This can cause strange behaviour and ‘bugs’ that are very hard to find. These head-scratchers cost time and cause a ton of frustration for whoever needs to fix them.
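The ‘no error, but a wrong result’ scenario is easy to reproduce. The two splitter classes below are hypothetical, heavily simplified stand-ins and not code from the application. One exposes its delimiter, the other fixes it at construction time.

```java
// Hypothetical splitter with an exposed field.
class ExposedSplitter {
    public String delimiter = ";"; // any class can reassign this at any time

    int countFields(String line) {
        return line.split(delimiter, -1).length;
    }
}

// Same behaviour, but the state is set once and hidden.
class GuardedSplitter {
    private final String delimiter;

    GuardedSplitter(String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        this.delimiter = delimiter;
    }

    int countFields(String line) {
        return line.split(delimiter, -1).length;
    }
}

class StateDemo {
    public static void main(String[] args) {
        ExposedSplitter exposed = new ExposedSplitter();
        System.out.println(exposed.countFields("a;b;c")); // prints 3

        // Some other object changes the delimiter halfway through.
        exposed.delimiter = ",";
        System.out.println(exposed.countFields("a;b;c")); // prints 1, no exception

        GuardedSplitter guarded = new GuardedSplitter(";");
        System.out.println(guarded.countFields("a;b;c")); // prints 3
    }
}
```

Nothing fails in the exposed version. The second count is simply wrong, which is exactly the kind of bug that is hard to trace back to its cause.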

On the other hand, you do have to realise that encapsulation is not watertight. It is not the same as immutability. Immutable objects (you know, instances of immutable classes) are read-only objects. All the variables are set when the object is created and they can never be changed again. If you want to know more about immutable objects, I happily direct you to a post made by Marcus Biel. The aim of encapsulation is not to create immutable objects. It makes sense for a process to have input that can be changed by other objects. It does not make sense, however, for a process to expose its internal constants to users.
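The difference can be sketched with two small classes. Both are hypothetical and only loosely inspired by the translation and error data in this post: `Translation` is immutable, while `ErrorLog` is encapsulated but deliberately mutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Immutable: every field is final and set once in the constructor.
final class Translation {
    private final String externalCode;
    private final String internalCode;

    Translation(String externalCode, String internalCode) {
        this.externalCode = externalCode;
        this.internalCode = internalCode;
    }

    String externalCode() {
        return externalCode;
    }

    String internalCode() {
        return internalCode;
    }
}

// Encapsulated but mutable: the state changes, yet only through methods the class controls.
class ErrorLog {
    private final List<String> errors = new ArrayList<>();

    void report(int lineNumber, String message) {
        errors.add("Error on line " + lineNumber + ": " + message);
    }

    int size() {
        return errors.size();
    }

    // Hands out a read-only copy, so callers cannot edit the internal list.
    List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(errors));
    }
}
```

`ErrorLog` changes over time, so it is not immutable, but every change goes through `report`. That is the kind of controlled mutability encapsulation aims for.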

Getters and Setters

It has become a standard in Java development to give fields private access and then write accessors and mutators for them. It happens almost automatically. When you ask why these getters and setters are there, the standard answer is “it’s for encapsulation”. And it usually goes quiet when you ask why or how this helps encapsulation. Let’s be critical here. Imagine we keep all the fields of the “ValidationComponent” class private and provide getters and setters for them. Would that be an improvement?

private String[] currentHeaders;

public String[] getCurrentHeaders() {
    return currentHeaders;
}

public void setCurrentHeaders(String[] currentHeaders) {
    this.currentHeaders = currentHeaders;
}

A client can’t directly access the field “currentHeaders”, right? This is nonsense of course. There is still the setter method, which can be used to set the field to anything the client likes. Other objects have just as much access to the fields through getters and setters as they would have to public fields. So making fields private, adding mutators and accessors and calling it encapsulation is pretty naive. Instead you should ask yourself if you actually need getters and setters. Asking yourself this question for every field in your class is a lot closer to encapsulation than just adding getters and setters.
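A short demonstration of how little the private modifier protects here. `HeaderHolder` is a hypothetical class that only mirrors the getter and setter pair shown above.

```java
// Hypothetical holder with a plain getter and setter for a private array field.
class HeaderHolder {
    private String[] currentHeaders = {"id", "product", "price"};

    public String[] getCurrentHeaders() {
        return currentHeaders; // hands out the internal array itself
    }

    public void setCurrentHeaders(String[] currentHeaders) {
        this.currentHeaders = currentHeaders; // accepts anything, including null
    }
}

class LeakDemo {
    public static void main(String[] args) {
        HeaderHolder holder = new HeaderHolder();

        // The field is private, yet a client can overwrite its contents via the getter.
        holder.getCurrentHeaders()[0] = "oops";
        System.out.println(holder.getCurrentHeaders()[0]); // prints "oops"

        // The setter does not validate, so the headers can be wiped out entirely.
        holder.setCurrentHeaders(null);
        System.out.println(holder.getCurrentHeaders() == null); // prints "true"
    }
}
```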

So are getters and setters useless and unnecessary? Nope. Without getters and setters, fields need to be publicly accessible for clients in order to use them (e.g. validationComponent.delimiter = “&”). This way of accessing fields is really easy, but it also makes it very easy to make mistakes.

Getters and setters are a form of encapsulation after all. They can help to shield clients and users from changes. Another benefit of getters and setters is that they can hide some necessary functionality that runs before a field is returned or set, such as validation or cloning of referenced objects. So getters and setters are useful, but as a developer, you should ask yourself if you really need them for your fields. Maybe you don’t, or maybe you only need a getter for a particular field.
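Validation and cloning inside accessors could look roughly like this. `SafeHeaderHolder` is a hypothetical class, and the rule that at least one header is required is an assumption made for the example.

```java
import java.util.Objects;

// Hypothetical holder whose accessors validate input and copy the array.
class SafeHeaderHolder {
    private String[] currentHeaders = new String[0];

    public String[] getCurrentHeaders() {
        // Return a copy so callers cannot modify the internal array.
        return currentHeaders.clone();
    }

    public void setCurrentHeaders(String[] headers) {
        Objects.requireNonNull(headers, "headers");
        if (headers.length == 0) {
            // Assumed rule for this sketch: an empty header is never valid.
            throw new IllegalArgumentException("at least one header is required");
        }
        // Store a copy so later changes to the caller's array do not leak in.
        this.currentHeaders = headers.clone();
    }
}
```

A client can still replace the headers, but only with a valid value, and it never gets a reference to the array the object actually uses.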

Important to remember

  • Encapsulation is bundling data with the operations on that data and hiding both behind a public interface.

  • Encapsulation can be considered a form of abstraction because it ‘hides’ the details from objects that use it.

  • Encapsulation helps in dealing with change because the public interface shields clients from implementation details.