ATOR: Basic Principles of 3D Computer Graphics Applied to Health Sciences

Dear friends,

This post is introductory material created for our online and classroom course, "Basic Principles of 3D Computer Graphics Applied to Health Sciences". The training is the result of a partnership that began in 2014 with the renowned Brazilian orthognathic surgeon Dr. Everton da Rosa.

Initially the objective was to develop a surgical planning methodology using only free software and freeware. The work was successful and we decided to share the results with the orthognathic surgery community. As soon as we posted the first content related to this research on our social media, the demand was great, and it was not limited to dental professionals but extended to all fields of human health as well as veterinary medicine.

In view of this demand, we decided to make public the introductory, theoretical content of the topics covered in our course (which is largely practical). In this way, those interested will be able to learn a little about the concepts involved in the training, while those who work in computer graphics will have at hand material that introduces them to the field of modeling and digitization in the health sciences.

In this first post we will cover the concepts related to the visualization of 3D objects and scenes.

We hope you enjoy it. Happy reading!

Chapter 1 - Scene Visualization

You already know much of what you need

Cicero Moraes
Arc-Team Brazil

Everton da Rosa
Hospital de Base, Brasília, Brazil

What does it take to learn how to work with 3D?

If you know how to operate a computer and have at least edited a text document, the answer is: very little.

When editing a text we use the keyboard to enter the information, that is, the words. The keyboard also helps us with shortcuts, the most popular being CTRL + C and CTRL + V for copy and paste. Note that we do not use the system menu to trigger these commands for a very simple reason: it is much faster and more convenient to use the shortcut keys.

When writing a text we do not limit ourselves to typing a sentence or a page. We almost always format the letters, making them bold, setting them as a title, or italicizing them, and we import images or graphics. This last action is an example of what is called interoperability.

The name is complex, but the concept is simple. Interoperability is, roughly speaking, the ability of programs to exchange information with one another. That is, you take the photo from a camera, save it on the PC, maybe use an image editor to increase the contrast, then import that image into your document. Well, the image was created and edited elsewhere! This is interoperability! The same is true of a table, which can be made in a spreadsheet editor and later imported into the text editor.

This amount of knowledge is not trivial. We could say that you already have 75% of all the computational skills needed to work with 3D modeling.

Now, if you are one of those who play or have already played a first-person shooter game, you can be sure that you have 95% of everything you need to model in 3D.

How is this possible?

Very simple. In addition to all the knowledge common to most computer programs, as already mentioned, the player also develops other skills inherent to the field of 3D computer graphics.

When playing on these platforms it is necessary, first of all, to analyze the scene with which one is going to interact. After studying the field of action, the player moves around the scene, and if someone appears in the line of fire, the chance of that individual taking a shot is quite high. This ability to move and interact in a 3D environment is the starting point for working with a modeling and animation program.

Observation of the scene

When we get to an unknown location, the first thing we do is observe. Imagine that you will take a course in a certain space. Hardly anyone "rushes into" an environment. First of all we observe the scene, we make a general survey of the number of people, and we even study the escape routes in case of a very serious unforeseen event. Then we move through the studied scene, going to the place where we will wait for the activities to begin. In a third moment, we interact with the scenario, both by using course equipment such as a notebook and pen and by talking to other students and/or teachers.

Notice that this event was marked by three phases:

1) Observation

2) Displacement

3) Interaction

In the virtual world of computer graphics the sequence is almost the same. The first part of the process consists of observing the scene to get an idea of what it is like. The command for this is known as orbit. That is, the observer orbits (Orbit) the scene while watching it, like an artificial satellite around the Earth. The observer maintains a fixed distance and can see the scene from every possible angle.

But one cannot live by orbiting alone: one must get closer to see the details of a specific point. For this we use the zoom commands, already well known to most computer users. Besides zooming in and out (+ and - zoom), you also need to walk through the scene or move horizontally (a movement known as Pan).

A curious fact about these scene-observation commands is that they almost always rely on the mouse buttons. See the table below:

Above we have a comparison of three programs that will be discussed later. The important thing for now is to know that all three basic observation commands (orbit, zoom and pan) directly involve the mouse. This makes it very clear that if you come across an open 3D scene and use these combinations of commands, you will at least move the observer.

The phrase "move the observer" has been spelled out so that you are aware of one situation. So far we are only dealing with observation commands. Because of the way it works, orbiting can very easily be confused with the command that rotates the object. As some would say, "Slow down. It's not quite that way. This is this, and that is that." It is very common for beginners in this area to confuse one with the other.

To illustrate the difference between them, look at the figure above, where the scene in the center (Original) is the initial reference. On the left we see the orbit command in action (Orbit). Notice that the grid (in light gray), which serves as the floor of the scene, accompanies the cube. This is because the one who moves in the scene is the observer, not the elements. On the right (Rotate) we see the grid in the same position as in the center scene, that is, the observer remained at the same point and only the cube underwent rotation.
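The distinction between orbiting and rotating can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the scene dictionary, the coordinates and the function names `orbit` and `rotate_object` are hypothetical and do not correspond to commands in any of the programs discussed.

```python
import math

def rotate_z(point, degrees):
    """Rotate a 3D point around the vertical (Z) axis by the given angle."""
    a = math.radians(degrees)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

# Hypothetical scene: one cube corner, one grid corner, and the observer.
scene = {
    "cube": (1.0, 0.0, 0.0),
    "grid": (5.0, 0.0, 0.0),
    "observer": (0.0, -10.0, 0.0),
}

def orbit(scene, degrees):
    """Orbit: only the observer moves; cube and grid keep their coordinates."""
    return {**scene, "observer": rotate_z(scene["observer"], degrees)}

def rotate_object(scene, degrees):
    """Rotate: only the cube moves; observer and grid keep their coordinates."""
    return {**scene, "cube": rotate_z(scene["cube"], degrees)}

orbited = orbit(scene, 90)
rotated = rotate_object(scene, 90)

print("orbit  ->", orbited)
print("rotate ->", rotated)
```

After the orbit, the cube and the grid have exactly the same coordinates as before, which is why they appear to move together on screen. After the rotation, the grid and the observer are unchanged and only the cube has new coordinates.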

Why does this seem confusing?

In the real world, the one we live in, the observer is ... you. You use your eyes to see space with all the three-dimensional depth that this natural binocular system offers. When you work with 3D modeling and animation software, your eyes become the 3D View, that is, the working window where the scene is presented.

In the real world, when we walk through a space, we have the ground to move on. It is our reference. In a 3D scene this initial ground is usually represented by the grid that we saw in the example figure. It is always important to have a reference to work from; otherwise it is almost impossible, especially for those who are starting out, to do anything on the computer.

Display Type

"Television makes you look fatter."

Surely you have already heard this phrase in some interview, or from an acquaintance or someone who had been filmed and saw the result on the screen. It can indeed happen that the person seems more robust than "normal", but the answer is that, in fact, we are all more full-bodied than the image our eyes present to us when we look at ourselves in the mirror.

In order for you to have a clear idea of what this means, you need to understand some simple concepts that involve viewing from an observer in a 3D modeling and animation program.

The observer in this case is represented by a camera.

Interestingly, one of the most used representations for the camera within a 3D scene is an icon of a pyramid. See the figure above, where three examples are presented. Both Blender 3D software and MeshLab have a pyramid icon to represent the camera in space. The simplest way to represent this structure can be a triangle, like the one on the right side (Icon).

All this is not for nothing. This representation holds in itself the basic principles of photography.

You may have heard of the pinhole camera (dark chamber). The name says it all: it is a camera that uses a small hole instead of a lens. The mechanism is very simple: it is an archaic camera made from a small box or can. On one side it has a very small hole and on the other side a sheet of photographic paper is placed. The hole is covered with dark adhesive tape until the photographer positions the camera at the chosen spot. Once the camera is positioned and still, the tape is removed and the paper receives the external light for a while. Then the hole is covered again, the camera is taken to a studio, and the paper is developed, presenting the scene in negative. All simple and functional.

For us, what matters are a few small details. Imagine that we have an object to be photographed (A). The light coming from outside enters the camera through a hole made in the front (B) and projects the inverted image inside the box (C). Anything outside this capture area will be invisible (illustration on the right).

At this point we already have the answer to why the camera icons are similar in different programs. The pyramid represents the projection of the visible area of the camera. Notice that the projection of the visible area is not the same as the entire visible area, that is, we have a small representation of how the camera receives the external scene.

Anything outside this projection simply will not appear in the scene, as in the case of the above sphere, which is partially hidden.
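The pinhole geometry described above can be sketched numerically. The following Python example is a minimal illustration, assuming a camera whose hole is at the origin and which looks along the positive Z axis; the function names and the paper dimensions are hypothetical choices made for this example.

```python
def project_pinhole(point, focal_length):
    """Project a 3D point through a pinhole placed at the origin.

    The paper sits focal_length units behind the hole, so the image is
    inverted: the signs of x and y are flipped.
    Returns None for points that are not in front of the hole.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (-focal_length * x / z, -focal_length * y / z)

def is_visible(point, focal_length, paper_width, paper_height):
    """A point is captured only if its projection lands on the paper."""
    image = project_pinhole(point, focal_length)
    if image is None:
        return False
    u, v = image
    return abs(u) <= paper_width / 2 and abs(v) <= paper_height / 2

# Hypothetical values: focal length 50, paper of 36 x 24 (same units).
inside = (1.0, 2.0, 10.0)
outside = (5.0, 0.0, 10.0)

print(project_pinhole(inside, 50.0))        # projection has flipped signs
print(is_visible(inside, 50.0, 36.0, 24.0))
print(is_visible(outside, 50.0, 36.0, 24.0))
```

The second point is in front of the camera but its projection falls beyond the edge of the paper, so it stays outside the pyramid and does not appear in the image.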

But there's still one piece left in this puzzle, which is why we seem more robust to TV cameras.

Note the two figures above. Comparing them, we can identify some characteristics that differentiate them. The image on the left seems to be a structure that is being squeezed, especially when we look at the eyes, which seem to jump sideways. On the right, we have a structure that, in relation to the other, seems to have the eyes more centered, the nose smaller, and the mouth more open and a little higher; we see the ears showing, and the upper part of the head is noticeably bigger.

The two images have a lot of visual differences ... but they both show the same 3D object!

The difference lies in the way the photographs were made. In this case, two different focal lengths were used.

Above we see the two pinhole cameras from the top. The image on the left indicates a focal length of 15 and on the right we see a focal length of 50. On one side we see a more compact structure (15), where the back is very close to the front, and on the other a more stretched structure with a narrower capture angle (50).

But why, with a focal length of 15, do the ears not appear in the scene?

The explanation is simple and can be approached geometrically. Note that in order to frame the structure in the photo it was necessary to bring it close enough to the light inlet. In doing so, the captured volume (BB) only picks up the front of the face (Visible), hiding the ears (Invisible). In the end, we have a limited projection (CC) that suffers from a certain deformation, giving the impression that the eyes are slightly spread apart.

With a focal length of 50 the visible area of the face is wider. We can verify this from the projection of the visible region, as we did previously.
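The relation between focal length, capture angle and camera distance can be put into numbers. The sketch below assumes that the values 15 and 50 are millimetres and that the capture area is 36 mm wide; neither is stated in the text, so treat both as assumptions made only for this example. The object width of 180 mm is likewise a hypothetical figure.

```python
import math

SENSOR_WIDTH = 36.0  # assumed width of the capture area, in millimetres

def angle_of_view(focal_length, sensor_width=SENSOR_WIDTH):
    """Horizontal capture angle in degrees for a pinhole-style camera."""
    return math.degrees(2 * math.atan(sensor_width / (2 * focal_length)))

def framing_distance(object_width, focal_length, sensor_width=SENSOR_WIDTH):
    """Distance at which an object of the given width exactly fills the frame."""
    return object_width * focal_length / sensor_width

for f in (15.0, 50.0):
    print(f"focal length {f:>5}: "
          f"angle {angle_of_view(f):6.1f} degrees, "
          f"distance to frame a 180 mm wide object {framing_distance(180.0, f):6.1f} mm")
```

Under these assumptions the focal length of 15 gives a capture angle of roughly 100 degrees and the focal length of 50 roughly 40 degrees. To fill the frame with the same object, the camera with focal length 15 has to sit much closer, which is the situation described for the hidden ears.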

In this example we chose to frame the structure very close to the camera's capture limits in order to highlight the differences in capture. Since the framing is kept the same, a longer focal length means the camera sits farther from the subject, and we clearly see how a larger focal length value implies a wider capture of the photographed structure. For example, with a value of 15 we see only the lower tips of the ears, very discreetly; at 35 the ears are already showing; at 50 the visible area is almost doubled; and at 100 we have an almost complete view of the ears. Note also that at 100 the marginal region of the eyes crosses the outline of the head, while in orthogonal (Ortho) the marginal region of the eyes is aligned with that outline.

But what is an orthogonal view?

For a more complete understanding, let us take it step by step.

If we isolate the outlines of all the views, align the eyebrows and the base of the chin, and superimpose the shapes, we will see that the smaller the focal length, the smaller the structural area visualized. Of all the shapes, the one that stands out the most is the orthogonal view. It simply has more area than all the others. We see this on the far right, where the blue color appears in the marginal regions of the overlap.

But how does orthogonal projection work?

The best example is the facade of a house. Above, on the left, we have a view with focal length 15 (Perspective) and, on the right, the orthogonal view.
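Why a closer camera sees less of a rounded structure can be illustrated with a very rough model. The sketch below treats the head as a sphere, which is an assumption made only for this example, and computes how far around the sphere the observer can see from a given distance. The function name `visible_half_angle` is hypothetical.

```python
import math

def visible_half_angle(radius, distance=None):
    """Angle in degrees, measured from the point facing the camera, up to
    which the surface of a sphere is visible.

    distance is measured from the camera to the centre of the sphere.
    distance=None stands for the orthogonal view, where the projection
    lines are parallel and exactly half of the sphere is visible.
    """
    if distance is None:
        return 90.0
    if distance <= radius:
        raise ValueError("the camera must be outside the sphere")
    # The line of sight touches the sphere where cos(angle) = radius / distance.
    return math.degrees(math.acos(radius / distance))

radius = 1.0
for distance in (1.5, 2.0, 5.0, 20.0):
    print(f"distance {distance:>5}: visible up to "
          f"{visible_half_angle(radius, distance):5.1f} degrees")
print(f"orthogonal     : visible up to {visible_half_angle(radius):5.1f} degrees")
```

In this model the visible portion grows as the camera moves away and approaches, but never exceeds, the 90 degrees of the orthogonal view. A feature sitting on the side of the sphere, as the ears roughly do, only comes into view when the camera is far enough away.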

Analyzing the capture with focal length 15, we have the lines in blue, as usual, representing the boundary of the visible area (limit of the image generated) and in the other lines the projection of some key parts of the structure.

The orthogonal view, in turn, does not suffer from perspective deformation. It simply receives the structural information directly, generating an image consistent with the measurements of the original, that is, it shows the house "as it is." The process is very reminiscent of x-ray projection, which represents the x-rayed structure without (or almost without) perspective deformation.

Looking at the images side by side, from another point of view, it is possible to see a marked difference between them. The bottom and top of the side walls are parallel, but if you draw a line along each of these parts in the perspective view, those lines will end up at an intersection known as the vanishing point (A and B). In the case of the orthogonal view, the lines do not meet, because ... they are parallel! That is, we again see that the orthogonal projection respects the actual structure of the object.

So does that mean the orthogonal view is always the best option?

No, it is not always the best option, because it all depends on what you are doing. Take as an example the front views discussed earlier. Even though the orthogonal view offers a larger capture area (D), if we compare the regions exclusive to the orthogonal view (E) with the regions exclusive to the perspective view with focal length 15 (F), we will see that, even covering a smaller area of pixels, the view with perspective deformation captured regions that were hidden in the orthogonal view.
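The difference between the two kinds of projection can be sketched as follows. This is a simplified illustration, not the method used by any particular program: the camera looks along the positive Z axis, the image plane is placed in front of the camera so that the image is not inverted, and the two posts of equal height are hypothetical objects.

```python
def perspective_project(point, focal_length=1.0):
    """Perspective: image coordinates shrink as the depth z grows."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def orthographic_project(point):
    """Orthogonal: the depth is simply discarded; sizes are preserved."""
    x, y, _ = point
    return (x, y)

# Two hypothetical posts of the same height (3 units), one twice as far away.
near_post_top = (0.0, 3.0, 5.0)
far_post_top = (0.0, 3.0, 10.0)

print("perspective :", perspective_project(near_post_top),
      perspective_project(far_post_top))
print("orthogonal  :", orthographic_project(near_post_top),
      orthographic_project(far_post_top))
```

In the perspective projection the farther post appears half as tall as the nearer one, while in the orthogonal projection both keep the same height, consistent with the measurements of the original.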
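The vanishing point can also be checked numerically. In the sketch below, two parallel edges of a hypothetical wall recede from the camera; the coordinates, the focal length of 1 and the function names are assumptions made for this example. Under perspective projection the far ends of both edges approach the same image point, while under orthogonal projection they stay the same distance apart.

```python
def perspective_project(point, focal_length=1.0):
    """Perspective projection with the camera looking along +Z."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def point_on_line(start, direction, t):
    """Point reached after travelling t units of 'direction' from 'start'."""
    return tuple(s + t * d for s, d in zip(start, direction))

def vanishing_point(direction, focal_length=1.0):
    """Image point that every line with this direction converges to.

    Lines parallel to the image plane (dz == 0) have no vanishing point.
    """
    dx, dy, dz = direction
    if dz == 0:
        return None
    return (focal_length * dx / dz, focal_length * dy / dz)

# Bottom and top edges of a hypothetical side wall: same direction, 3 units apart.
direction = (1.0, 0.0, 1.0)
bottom_start = (-2.0, 0.0, 4.0)
top_start = (-2.0, 3.0, 4.0)

for t in (0.0, 10.0, 1000.0):
    bottom = perspective_project(point_on_line(bottom_start, direction, t))
    top = perspective_project(point_on_line(top_start, direction, t))
    print(f"t = {t:>6}: bottom {bottom}, top {top}")

print("vanishing point:", vanishing_point(direction))
```

As t grows, both projected edges close in on the same image point. In an orthogonal projection the vertical gap between the two edges would remain 3 units at every t, so the lines never meet.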

Moraes & Salazar-Gamarra (2016)

That answers the question of whether or not television makes people look fatter. The longer the focal length, the more robust the face looks. This is not a matter of gaining weight, though, but of showing the actual structure: the orthogonal image presents the individual with the measurements most consistent with the real volume.

The interesting thing about this is that it shows that the eyes deceive us. The image we see of people does not correspond to what they actually are, structurally speaking. Neither does what we see in the mirror.

Professional photographers, for example, are experts at exploiting this reality to extract the maximum quality in their work.

3D Vision

Have you ever wondered why you have two eyes and not just one? Most of the time we forget that we have two eyes, because we see only one image when we observe things around us.

Take this quick test.

Look for a small object to look at (A), about a meter away. Hold your index finger (B) pointing up, 15 cm in front of your eyes (C) and aligned with your nose.

When looking at the object you will see one object and two fingers.

When looking at the finger, you will see one finger and two objects.

If you observe with just one eye at a time, you will notice that each eye has a distinct view of the scene.

This is a very simple way to test the limits of the binocular visualization system characteristic of humans. It also makes clear why classic painters close one eye when measuring the proportions of an object with the paintbrush in order to replicate it on the canvas (see the bibliography link for more details). If they used both eyes it just would not work!

You must be wondering how we can see only one image with both eyes. To understand this mechanism a little better, let's take 3D cinema as an example.

What happens if you look at a 3D movie screen without the polarized glasses?

Something like the figure above, a distortion well known to those who have already overdone alcoholic beverages. However, even though it seems the opposite, there is nothing wrong with this image.

When you put on the glasses, each lens lets through only the information meant for its eye. We then have two distinct images, as when we close one eye to see with only the other.

Let's reflect a little. If the blurred image enters through the glasses and becomes part of the scenery, transporting us into the movie to the point of being frightened by debris from explosions that seems to fly toward us ... it may be that the information we receive from the world is blurred as well. Except that, in the brain, something "magical" happens: instead of showing this blur, the two images come together and form only one.

But why two pictures, why two eyes?

The answer lies precisely in the debris from the explosion coming toward us. If you watch the same scene with just one eye, the objects do not "jump" out at you. This is because stereoscopic vision (with both eyes) gives you the power to perceive the depth of the environment. That is, the notion of space that we have is due to our binocular vision. Without it, although we still have some notion of the environment because of perspective, we lose much of the ability to gauge its volume.
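The depth cue provided by two eyes can be sketched with the same pinhole idea used for cameras. The example below treats each eye as a pinhole and measures how much the image of a point shifts between the left and the right eye. The eye separation of 0.065 m and the focal length of 1 are assumptions chosen for illustration, and the function names are hypothetical.

```python
EYE_SEPARATION = 0.065  # assumed distance between the eyes, in metres

def image_x(point_x, depth, eye_x, focal_length=1.0):
    """Horizontal image position of a point as seen by an eye located at eye_x."""
    return focal_length * (point_x - eye_x) / depth

def disparity(depth, eye_separation=EYE_SEPARATION, focal_length=1.0):
    """Difference between the left-eye and right-eye image positions
    of a point straight ahead at the given depth."""
    left = image_x(0.0, depth, -eye_separation / 2, focal_length)
    right = image_x(0.0, depth, +eye_separation / 2, focal_length)
    return left - right

finger = 0.15   # finger held 15 cm from the eyes
target = 1.00   # small object about one metre away

print("finger disparity:", disparity(finger))
print("object disparity:", disparity(target))
print("one eye only    :", disparity(target, eye_separation=0.0))
```

The nearby finger produces a much larger shift between the two eyes than the distant object, and that difference is what the brain can use to judge depth. With a single eye the shift is zero, so this cue disappears.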

To better understand the depth of the scene, see the following image.

If a group of individuals were asked which of the two objects is closer to the front of the scene, it is almost certain that most respondents would say it is the object on the left.

However, not everything is what it seems. The object on the left is further away. This example illustrates how we can be deceived by monocular vision, even when the view is in perspective.

Wouldn't it be easier if modeling and animation programs supported stereoscopic visualization?

In fact it could be, but the most popular programs still do not offer this possibility. With the popularization of virtual reality glasses and the convergence of graphic interfaces, it is possible that this niche will gain full support for stereoscopic visualization in the production phase. For now, however, this is more a future projection than a present reality, and today's interfaces still rely on many elements that go back decades.

It is for these and other reasons that we need the help of an orthogonal view when working with 3D software.

If on the one hand we do not yet have affordable 3D visualization solutions with depth, on the other hand we have robust tools tested and approved over years and years of development. In 1963, for example, the Sketchpad graphic editor was developed at MIT. Since then the way of approaching 3D objects on a digital screen has not changed much.

Most important of all, the technique works very well, and with a little training you adapt comfortably to the methodology, to the point of forgetting that you once had difficulties with it.

Almost all modeling programs, similar to Sketchpad, offer the possibility of dividing the workspace into four views: Perspective, Front, Right, and Top.

Even though the perspective view does not give us a true notion of depth, and the other views are only a sort of "facade" of the scene, what we have in the end is a very clear idea of the structure of the scene and the positioning of the objects.

If, on the one hand, dividing the scene into four parts reduces the visual area of each view, on the other hand the specialist can choose to expand any of those views to fill the whole monitor.

Over time, the user will become skilled at changing the point of view using the shortcut keys, in order to gather the necessary information and avoid mistakes in the composition of the scene.

A sample of the versatility of 3D orientation from orthogonal views is the exercise of the "hat on the little monkey" given to beginner students of three-dimensional modeling. This exercise involves asking the students to put a hat (a cone) on the Monkey primitive. When they try to use only the perspective view the difficulties are many, because it is very hard for those who are starting out to find their bearings in a 3D scene. They are then taught how to use the orthogonal views (front, right, top, etc.). The tendency is for the students to position the "hat" taking only one view as a reference, in this case the front (Front). However, when they switch to the perspective view, the hat appears out of place. When the scene is viewed from another point of view, such as the right (Right), they realize that the object is far from where it should be. Over time the students "get the hang of it" and change the point of view when working with object positioning.

If we look at the axes shown to the left of the figures, we see that in the case of Front we have the information for X and Z, but Y is missing (precisely the depth where the hat was lost), and in the case of Right we have Y and Z, but X is missing. The secret is always to orbit the scene or to alternate the viewpoints, so as to have a clear notion of the structure of the scene and a sound basis for future interventions.
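The "hat on the monkey" problem can be reproduced with two tiny functions. This sketch assumes the axis convention described above (Front shows X and Z, Right shows Y and Z); the coordinates of the head and the hat are hypothetical.

```python
def front_view(point):
    """Front view: keeps X and Z, discards the depth Y."""
    x, _, z = point
    return (x, z)

def right_view(point):
    """Right view: keeps Y and Z, discards X."""
    _, y, z = point
    return (y, z)

# Hypothetical positions: the hat was placed using the front view only.
monkey_head_top = (0.0, 0.0, 1.0)
hat_base = (0.0, 2.5, 1.0)   # wrong in Y, which the front view cannot show

print("front:", front_view(monkey_head_top), front_view(hat_base))
print("right:", right_view(monkey_head_top), right_view(hat_base))
```

In the front view both points land on the same spot, so the hat looks perfectly placed. Only the right view, which keeps Y, reveals that the hat is 2.5 units away from the head.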

Conclusion

That's it for now. We will soon return with more content on the basic principles of 3D graphics applied to health sciences. If you want to receive more news, point out a correction, make a suggestion, or get to know the work of the professionals involved in this material, please send us a message or like the authors' pages on Facebook:

3D designer Cícero Moraes's Page

Dr. Everton da Rosa's Page

We thank you for your attention and send you a big hug.

See you next time!