Architects Need a Pragmatic Software Development Process

I have worked continuously as a software architect since 2006. Over that time, I have realized that it’s really hard to perform the role of architect in an organization that doesn’t have a software development process or has an oversimplified one. When development is not reasonably organized, project managers don’t find room in their schedule to implement architectural recommendations. They probably have time, people and resources, but since they don’t have a precise idea of the team’s productivity, they are afraid of accepting new non-functional requirements or changing existing ones. I’ve noticed that, in a chaotic environment, people become excessively pragmatic and averse to change.

Architects expect the organization to adopt a more predictable and transparent software process. This way, it’s possible to visualize the impact of recommendations and negotiate when they are going to be implemented. At a minimum, they expect a process with iterations inspired by the classical PDCA (Plan, Do, Check and Act) cycle, because its feedback loops are the foundation for continuous improvement.

The figure below depicts what could be considered as a pragmatic software process.

Iterations overlap in time in order to optimize the allocation of people and the use of resources, and to guarantee feedback from previous iterations. Each iteration is performed in a fixed period of time. This period depends on the context and tends to shrink as the organization gains maturity. An iteration is composed of 4 phases (plan, do, check and act) and 5 events that occur according to the planning. They are:

  • T1: It represents the beginning of the iteration, starting with its planning. The scope of the planning covers only the period of the current iteration. It should not be confused with the general project planning, which is produced in one of the initial iterations to plan all the others. All members of the team participate in the planning.
  • T2: The execution of what was planned for the iteration starts. All members of the team must have something to do within the scope of the iteration. Nothing planned for future iterations should be done in the current iteration. People may produce all sorts of output, such as documents, code, reports, meeting minutes, etc.
  • T3: Everything that is produced should be checked. Documents should be reviewed, code should be tested, user interfaces and integrations with other systems should be tested, etc. All issues found must be recorded to be solved in due time.
  • T4: Solve all issues found during the check phase and release all planned deliverables. Everybody should deliver something. If there is not enough time to solve some of the issues, they must be included in the planning of the next iteration with the highest priority. Statistics should be produced during this phase in order to compare the planning with the execution. The planning of the next iteration also starts at this point, taking advantage of the experience from the current iteration.
  • T5: Once everything is released, the current iteration finishes. T2 of the next iteration starts immediately, because most of the people and resources are already available.

T1 to T5 repeat several times, each within a fixed period of time, until the end of the project. This suggestion is process-agnostic, so it can be implemented no matter what software process we claim to have in place, or alongside any other modern process we can think of.

In addition to the process, there are also some good practices:

  1. Consider everything that describes how the system implements business needs as use cases. They could equally be functional and non-functional requirements, user stories, scenarios, etc., but there must be only one kind of artefact to describe business needs.
  2. Write use cases in a way that the text can be reused to: a) help business people to visualize how their needs will work; b) guide testers in the exploratory tests; and c) help the support team to prepare the user manual.
  3. Avoid technical terms in use cases. If really needed, technical details may be documented in another artefact, such as use case realizations.
  4. If needed, create use case realizations using UML models only. Representing use case realizations as documents implies a huge overhead. Any necessary textual information can be added in the comments area of the UML elements.
  5. Fix the size of use cases according to the effort needed to realize them. For example, we can define that the maximum size of a use case is 1 week. If the estimate is higher than that, the use case must be split in two. If the estimate is far lower than that, the use case must be merged with another closely related use case. By simply counting the number of use cases, we immediately know the effort and the resources required to execute the project. This fixed size is a parameter for comparing the planning with the execution. By performing this comparison after every iteration, we gradually learn how precise our estimates are becoming.
  6. Use a wiki to document use cases and other required documentation, such as test cases, release notes, etc. Create a wiki page for each use case and use markers to indicate what is still pending release. The advantages of the wiki are: a) use cases are immediately available to all stakeholders as they are gathered; b) stakeholders can follow the evolution of the use cases by following updates to the page; c) it’s possible to know everyone who contributed to the use case and what exactly they did; and d) it’s possible to add comments to use cases, preserving all the discussion around them.
  7. If the organization has business processes, which is another thing that architects love, then put references in the business process activities pointing to the use cases that implement them. A reference is a link to the page where the use case is published on the wiki.
  8. Follow up use cases using an issue tracking system, such as Jira. Each use case must have a corresponding Jira ticket, and every detail of the use case’s planning, execution, checking and delivery must be recorded in that ticket. The advantages of linking Jira tickets with use cases are: a) Jira tickets represent the execution of the planning and their figures can be compared with the planning, generating statistics on which managers can rely; b) we know exactly who contributed to the use case, what they did, and for how long; and c) it’s an important source of lessons learned.
  9. Test, test, test! It must be an obsessive-compulsive behaviour. Nothing goes out without passing through an extensive test session.
  10. Constantly train the team on the technologies in use and provide all the reading material they need. The more technical knowledge we have inside the team, the higher our capability to solve problems and increase productivity.
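
The arithmetic behind practice 5 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name, the 5-day cap (1 week) and the sample figures are hypothetical, and in practice the actual figures would come from the issue tracker.

```java
import java.util.Arrays;
import java.util.List;

/** Compares planned and actual effort when every use case is capped at a fixed size. */
class IterationEstimate {
    // Assumption: the maximum size of a use case is fixed at 5 working days (1 week).
    static final int MAX_DAYS_PER_USE_CASE = 5;

    /** Upper bound of the planned effort, in days, obtained just by counting use cases. */
    static int plannedDays(int useCaseCount) {
        return useCaseCount * MAX_DAYS_PER_USE_CASE;
    }

    /** Ratio between actual and planned effort; 1.0 means the plan was met exactly. */
    static double deviation(int useCaseCount, List<Integer> actualDaysPerUseCase) {
        int actual = 0;
        for (int days : actualDaysPerUseCase) {
            actual += days;
        }
        return (double) actual / plannedDays(useCaseCount);
    }

    public static void main(String[] args) {
        // Hypothetical figures recorded for the four use cases of one iteration.
        List<Integer> actual = Arrays.asList(5, 4, 6, 5);
        System.out.println("Planned days: " + plannedDays(actual.size()));
        System.out.println("Deviation: " + deviation(actual.size(), actual));
    }
}
```

Tracking the deviation after every iteration gives a simple indicator of how the precision of the estimates evolves.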

Working this way, everything becomes quantifiable, predictable, comparable and traceable.

From the practices above we can extract the traceability flow from business to the lowest IT level, as depicted in the figure below.

Business process elements such as swimlanes and activities may inspire actors and use cases. Use cases and actors are documented on the wiki: each one has its own page, whose unique URL can be used to refer to the element in email messages, documents, Jira tickets and so on. A Jira ticket is created for each use case and contains a link to the use case’s wiki page. The wiki page can also link back to the ticket, since the ticket has a unique URL as well. Jira tickets can be automatically linked to the source code through the version control system (SVN) and declaratively linked to the system’s features and user interfaces. Since it’s possible to create mock-ups in wiki pages, we also link those pages to user interfaces in order to compare the mock-ups with the final user interface. Finally, we have actors linked to security roles.

I admit that architects are not qualified to define and implement a software development process in the organization (they actually believe more in the Programming Motherfucker philosophy :D), but they are always willing to help put one in place. Just as they have instruments to monitor servers, releases, tests, performance and so on, they also want project managers to have instruments to estimate effort, predict events, anticipate problems and, therefore, produce better planning and results. Warning: whatever we put in our software processes that is not quantifiable or measurable will become an expensive overhead.

Some Interview Questions to Hire a Java EE Developer

The Internet is full of interview questions for Java developers. The main problem with those questions is that they only prove that the candidate has a good memory, remembering all that syntax, structures, constants, etc. There is no real evaluation of his/her logical reasoning.

I’m listing below some examples of interview questions that check the knowledge of the candidate based on his/her experience. The questions were formulated to verify whether the candidate is capable of fulfilling the role of a Java enterprise applications developer. I’m also including the answers in case anybody wants to discuss the questions.

1. Can you give some examples of improvements in the Java EE 5/6 specifications in comparison to the J2EE specification?

The new specifications favour convention over configuration and introduce annotations to replace the use of XML for configuration. Inheritance is no longer used to define components; they are defined as POJOs instead. To empower those POJOs with enterprise features, dependency injection was put in place, simplifying the use of EJBs. In the persistence layer, entity beans were superseded by the Java Persistence API (JPA).

2. Considering two enterprise systems developed on different platforms, which good options do you propose to exchange data between them?

Nowadays, the main options are web services and message queues, depending on the scenario. For example, when a system needs to send data to another system as soon as they are available, or to make data available to several systems, a message queuing system is recommended. When a system has data to be processed by another system and needs the result of this processing back synchronously, a web service is the most suitable option.

3. What do you suggest to implement asynchronous code in Java EE?

There are several options: post messages to a queue to be consumed by a Message-Driven Bean (MDB); annotate a session bean method with @Asynchronous (Java EE 6) so that the container runs it in a separate thread; create a timer programmatically through the TimerService, with a method annotated with @Timeout as the callback; or annotate a method with @Schedule (Java EE 6) to define the execution time declaratively.

4. Can you illustrate the use of Stateless Session Bean, Stateful Session Bean and Singleton Session Bean?

Stateless Session Beans are used when there is no need to preserve the state of objects between several business transactions. Every transaction has its own instances, and instances of components can be retrieved from pools of objects. They are recommended for most cases, when several operations are performed within a transaction to keep the database consistent.

Stateful Session Beans are used when there is a need to preserve the state of objects between business transactions. Every instance of the component has its own objects. These objects are modified by different transactions and are discarded after reaching a predefined time of inactivity. They can be used to cache intensively used data, such as reference data and long record sets for pagination, in order to reduce the volume of I/O operations with the database.

A singleton session bean is instantiated once per application and exists for the lifecycle of the application. Singleton session beans are designed for circumstances in which a single enterprise bean instance is shared across and concurrently accessed by clients. They maintain their state between client invocations, which requires a careful implementation to avoid conflicts when accessed concurrently. This kind of component can be used, for example, to initialize the application at its start-up and share a specific object across the application.

5. What is the difference between queue and topic in a message queuing system?

A queue implements point-to-point messaging: any number of producers can send messages, but each message is delivered to only one consumer (1 – 1). A topic implements publish/subscribe messaging: each published message is delivered to all subscribers of the topic (1 – N).
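
The difference in delivery semantics can be sketched with plain Java collections. This is an in-memory illustration only: it is not JMS, it is not thread-safe, and the class names SimpleQueue and SimpleTopic are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** In-memory sketch of the two messaging models; not JMS and not thread-safe. */
class MessagingModels {

    /** Point-to-point: a message is removed by the first consumer that receives it. */
    static class SimpleQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        /** Removes and returns the next message, or null when the queue is empty. */
        String receive() {
            return messages.poll();
        }
    }

    /** Publish/subscribe: every subscriber gets its own copy of each message. */
    static class SimpleTopic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(String message) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        SimpleQueue queue = new SimpleQueue();
        queue.send("order-1");
        System.out.println("consumer A got " + queue.receive());
        System.out.println("consumer B got " + queue.receive()); // nothing is left for B

        SimpleTopic topic = new SimpleTopic();
        topic.subscribe(m -> System.out.println("subscriber A got " + m));
        topic.subscribe(m -> System.out.println("subscriber B got " + m));
        topic.publish("price-update");
    }
}
```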

6. Which strategies do you consider to import and export XML content?

If the XML document is formally defined in a schema, we can use JAXB to serialize and deserialize objects to and from XML according to the schema. If the XML document does not have a schema, there are two situations: 1) when the whole XML content should be considered, the document can be read sequentially using SAX or accessed randomly using DOM; 2) when only parts of the XML content should be considered, XPath can be used, or StAX when operations should be executed immediately after each desired part is found in the document.
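
The second situation can be sketched with the XPath and StAX APIs that ship with the JDK. The document, the element names and the class name below are hypothetical and serve only as an illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PartialXmlReading {
    // Hypothetical document used only for illustration.
    static final String XML =
        "<orders><order id=\"1\"><total>10.5</total></order>"
      + "<order id=\"2\"><total>20.0</total></order></orders>";

    /** XPath: select a single value without writing any traversal code. */
    static String totalOfOrder(String xml, int orderId) {
        String expression = "/orders/order[@id='" + orderId + "']/total";
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** StAX: pull events one by one and act as soon as an order element is found. */
    static List<String> orderIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "order".equals(reader.getLocalName())) {
                    ids.add(reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("Total of order 2: " + totalOfOrder(XML, 2));
        System.out.println("Order ids: " + orderIds(XML));
    }
}
```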

7. Can you list some differences between a relational model and an object model?

An object model can be mapped to a relational model, but there are some differences that should be taken into consideration. In the relational model a foreign key has the same type as the target’s primary key, but in the object model an attribute points to the entire related object. In the object model it is possible to have N-N relationships, while in the relational model an intermediary entity is needed. There is no support for inheritance, interfaces or polymorphism in the relational model.
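
A minimal sketch of the first two differences, using plain Java classes. All class and field names are hypothetical, and the "row" classes only mimic what table rows would hold; no database or JPA mapping is involved.

```java
import java.util.ArrayList;
import java.util.List;

class ObjectVersusRelational {

    // Object model: attributes point to entire related objects.
    static class Department {
        final long id;
        Department(long id) { this.id = id; }
    }

    static class Project {
        final long id;
        Project(long id) { this.id = id; }
    }

    static class Employee {
        final long id;
        final Department department;                       // reference to the whole object
        final List<Project> projects = new ArrayList<>();  // N-N side held directly
        Employee(long id, Department department) {
            this.id = id;
            this.department = department;
        }
    }

    // Relational model: rows hold only key values.
    static class EmployeeRow {
        final long id;
        final long departmentId;  // foreign key, same type as the department's primary key
        EmployeeRow(long id, long departmentId) {
            this.id = id;
            this.departmentId = departmentId;
        }
    }

    /** Row of the intermediary table required by an N-N relationship. */
    static class EmployeeProjectRow {
        final long employeeId;
        final long projectId;
        EmployeeProjectRow(long employeeId, long projectId) {
            this.employeeId = employeeId;
            this.projectId = projectId;
        }
    }

    static EmployeeRow toRow(Employee employee) {
        return new EmployeeRow(employee.id, employee.department.id);
    }

    static List<EmployeeProjectRow> toJoinRows(Employee employee) {
        List<EmployeeProjectRow> rows = new ArrayList<>();
        for (Project project : employee.projects) {
            rows.add(new EmployeeProjectRow(employee.id, project.id));
        }
        return rows;
    }

    public static void main(String[] args) {
        Employee employee = new Employee(7, new Department(3));
        employee.projects.add(new Project(100));
        employee.projects.add(new Project(200));
        System.out.println("Foreign key stored: " + toRow(employee).departmentId);
        System.out.println("Join rows needed: " + toJoinRows(employee).size());
    }
}
```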

8. What is the difference between XML Schema, XSLT, WSDL and SOAP?

An XML Schema describes the structure of XML documents and is used to validate them. WSDL (Web Services Description Language) describes the interface of SOAP-based web services. It can refer to XML schemas to define complex types passed as parameters or returned to the caller. SOAP (originally Simple Object Access Protocol) defines the format of the messages exchanged in a web service call. XSLT (Extensible Stylesheet Language Transformations) is used to transform XML documents into other document formats.
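
Two of these roles, validation with an XML Schema and transformation with XSLT, can be sketched with the JAXP APIs included in the JDK. The sample document, schema, stylesheet and class name are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaAndXsltSketch {
    // Hypothetical document, schema and stylesheet used only for illustration.
    static final String XML = "<greeting><name>World</name></greeting>";

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"greeting\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"name\" type=\"xs:string\"/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/greeting\">Hello, <xsl:value-of select=\"name\"/>!</xsl:template>"
      + "</xsl:stylesheet>";

    /** XML Schema: returns true when the document conforms to the schema. */
    static boolean isValid(String xml, String xsd) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or the schema itself) was rejected
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the input", e);
        }
    }

    /** XSLT: transforms the XML document into another format (plain text here). */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter output = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(output));
            return output.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Valid: " + isValid(XML, XSD));
        System.out.println(transform(XML, XSLT));
    }
}
```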

9. How would you configure an environment to maximize productivity of a development team?

Every developer should have a personal environment capable of executing the whole application on his/her local workstation. The project should be synchronized between developers using a version control system. Integration routines must be executed periodically in order to verify the compatibility and communication between all components of the system. Unit and integration tests must be executed frequently.

You can extend this set of questions to cover other subjects like unit testing, dependency injection, version control and so on. Try to formulate the questions in a way that you don’t get a single answer, but a short analysis from the candidate. People can easily find answers on the Internet, but good analysis can be provided only with accumulated experience.

What about my PhD?

My PhD is something that was not planned, but it is happening. The full story of how I got into it is too long and too complicated, but to summarise, it was a consequence of some good results I got at work, which gave my adviser enough confidence to take me on. I couldn’t say “no” because the opportunity to do a PhD at a prestigious university like UCL was really good.

For a guy who had long experience in the corporate world, the decision to do a PhD was really tough. I don’t have the practice and the personality of a researcher, but as an entrepreneur, I like to take risks and face challenges, and a PhD is definitely a challenge. Surprisingly, it is working well. I didn’t expect that, because the way the research environment works is pretty different and I had to get used to it.

A week ago I did my PhD confirmation, which is a kind of acceptance of the work performed so far. I had to present my research to an internal committee composed of local full professors. Despite the feeling of uncertainty, everything went well and they approved my research. In some sense, I was expecting such approval, but I didn’t expect that it would come without any serious remarks. I’m almost sure that it was a consequence of the papers I’ve been publishing. Since other scientific committees had already looked at the work, nothing very surprising was likely to come out of the confirmation committee at that moment.

Of course, positive feedback is a great motivation to continue my research, but nothing compares to the opportunities brought by the PhD student status. Some of these opportunities are a dream for many computer science students (mainly those who don’t live in the US, of course). Last summer, I presented my research at a PhD Consortium hosted by Carnegie Mellon University (CMU). It was definitely a dream for me, because the Java programming language was created and evolved by people who graduated from that university. The prestigious Software Engineering Institute (SEI) is located there, and it was responsible for one of the biggest revolutions in the software engineering field with the creation of CMMI (Capability Maturity Model Integration), a process improvement approach used by several companies worldwide to prove they are ready to perform complex software projects under tight constraints of cost, resources and time. Last but not least, the CMU School of Computer Science was also home to the lab of Randy Pausch, a computer graphics researcher who died of pancreatic cancer, but who left a vast contribution to his field and also created a project called Alice to teach programming to children. His testimony is published in the book The Last Lecture (a must-read).

Fortunately, CMU was not the last big dream to come true during my PhD. Tomorrow, I’m going to Boston to present my work at a conference at MIT (Massachusetts Institute of Technology). Wow! Yeah! Difficult to believe, but it is real. I can’t wait to blog about this experience here. 😉

The Importance of Software Modeling

During my free time I carried out a very big refactoring of the Planexstrategy business layer. I converted static business methods to EJB 3 (Enterprise JavaBeans version 3). All business classes are now stateless session beans, and my entity classes are now mapped using JPA (Java Persistence API) to an Oracle database. Putting it in numbers, I converted 73 business classes to EJBs and mapped 81 entity classes using JPA. I also had to update 270 references to business components in the control layer, fix a lot of bad code and retest the whole application. I spent at least two months doing everything.

Why did I embark on such madness? If I have to give a reason, just one is enough to justify it: the business layer was implemented as static methods. Yes, it is pretty ugly for an object-oriented implementation! Anyway, it worked very well, with very good performance, for at least 2 years. Other good reasons pushed me to persist:

  1. EJB is finally easy to implement, maintain and test. Some people disagree with the adjective “easy”, but I can vouch for it after 73 conversions. I just had to add the necessary annotations and make some changes to my business classes, because they already looked like EJBs, since I had tried to simulate the stateless property using static methods. Now, any business method can be exposed as a web service, the business layer can be accessed by different types of clients (web or desktop, locally or remotely), and a set of integration possibilities is available to exchange data with other systems inside the organization or even outside it.
  2. I hate having to create a bizarre .har package to pack entity classes in JBoss. If you want to use Hibernate in an optimal way under JBoss, it’s necessary to put all your entity classes in a .har file, which is deployed within the .ear package. I definitely decided on JPA because I can still use Hibernate, but in a standard way, without any explicit reference to the Hibernate library, which makes the application more portable.
  3. I love to think about the future, and when I assume the risk of changing entire layers of an application it’s because I see it as part of a bigger scenario, where workflows will be even more automated, companies will not survive without a minimal level of integration with other companies, no programming language will be generic enough to address all technical issues so languages will have to be integrated anyway, and many other things.

This was not my first experience with a big refactoring, but it was the most important in terms of knowledge and convictions. I will share this knowledge on other occasions. Now I will talk about an important conviction that I had in the past and that was weakened in my mind by agile methodologies: the importance of software modeling as a prerequisite to implementing software. I will start the subject by citing a text by Martin Fowler:

“The only checking we can do of UML-like diagrams is peer review. While this is helpful it leads to errors in the design that are often only uncovered during coding and testing. Even skilled designers, such as I consider myself to be, are often surprised when we turn such a design into software.”

Martin is an important agile guru. Anyone who wants to follow an agile methodology should read his ideas about the subject. I did that and I still believe in most of them, but during the Planexstrategy refactoring I learned how important it is to have a global view of the implementation before implementing a new functionality. I will illustrate it using UML component models; however, I have simplified them in order to save time. When you see a direct link between components, you are actually seeing a link to an interface of the component, as illustrated by the figure below:

The text in the figure is in Portuguese, but this should not hinder your understanding of the problem. It isn’t necessary to explain what each element means. The component CotacaoProdutoBsn uses the component ProdutoBsn through its interface ProdutoLocal. This interface is used by the EJB container to make components available to clients running in the same application server. Because this interface should be defined for all components, I will assume it is present in all links between components and hide it from the component diagram.

After extensive work to convert all business classes to EJBs, I finally went on to test the EJBs on the application server. I had to solve a lot of annotation mistakes, and finally all error messages disappeared. However, just a subset of the EJBs was started. No reasonable message was listed in the log to explain why many other EJBs just didn’t start as expected. I spent some days on it until I realized that the cause was not a limit on the number of EJBs, a memory leak or additional annotation mistakes, but a serious and dangerous high coupling between EJBs. The figure below illustrates the situation:

As you can see from the highlighted links, AtividadeBsn and TarefaBsn are interdependent, and AtividadeBsn and UsoRecursoBsn are also interdependent. These were only two cases in a list of 20. Well, how can something this terrible happen in a Java application developed by an expert programmer with years of experience? 😉

I should be humble and honest enough with myself to admit my mistake and publish it in my blog. At the same time, I can list some reasons for it:
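
As a complement to the diagram, interdependencies like these can also be listed mechanically once the dependencies between components are written down. The sketch below is a hypothetical illustration: the dependency map is filled in by hand with the components mentioned in this post, and the check only reports direct mutual dependencies, not longer cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CouplingCheck {

    /** Returns every pair of components that depend directly on each other. */
    static List<String> mutualDependencies(Map<String, Set<String>> dependsOn) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            String from = entry.getKey();
            for (String to : entry.getValue()) {
                Set<String> reverse = dependsOn.getOrDefault(to, Collections.<String>emptySet());
                // Report each pair once, from the alphabetically smaller name.
                if (reverse.contains(from) && from.compareTo(to) < 0) {
                    pairs.add(from + " <-> " + to);
                }
            }
        }
        return pairs;
    }

    /** Hand-written dependency map with the components mentioned in this post. */
    static Map<String, Set<String>> sampleDependencies() {
        Map<String, Set<String>> dependsOn = new TreeMap<>();
        dependsOn.put("AtividadeBsn", new TreeSet<>(Arrays.asList("TarefaBsn", "UsoRecursoBsn")));
        dependsOn.put("TarefaBsn", new TreeSet<>(Arrays.asList("AtividadeBsn")));
        dependsOn.put("UsoRecursoBsn", new TreeSet<>(Arrays.asList("AtividadeBsn")));
        dependsOn.put("CotacaoProdutoBsn", new TreeSet<>(Arrays.asList("ProdutoBsn")));
        return dependsOn;
    }

    public static void main(String[] args) {
        for (String pair : mutualDependencies(sampleDependencies())) {
            System.out.println("Interdependent: " + pair);
        }
    }
}
```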

  • I received a lot of pressure from customers to implement new features in a short period of time.
  • The runtime infrastructure is not robust enough to validate the application before a deployment.
  • I don’t have a complete overview of the software components, and it isn’t easy to detect high coupling without reading the complete source code of other components.

Then, I realized that, if I had had a component diagram like the figure below before starting to implement, that kind of mistake might have been avoided. Don’t you think?

In my opinion: yes. We can see whether it’s necessary to add new components or just extend an existing one, reuse existing implementations from other components, and remember that AtividadeBsn should not invoke a TarefaBsn method and that UsoRecursoBsn should not be aware of the existence of AtividadeBsn.

Finally, contrary to some agile practices, UML diagrams should be strongly considered, and they should be drawn using a software tool instead of being sketched on a piece of paper, since we have to update such diagrams all the time. If you disagree with my position, please tell me how I can avoid high coupling mistakes without UML diagrams.